
Andrew likes Agents

Andrew Ng's The Batch reports that GPT-3.5 in an iterative agent loop scores 95.1% on HumanEval, exceeding GPT-4 zero-shot at 67.0%, highlighting agent workflows as a key performance lever.

Mar 26 · 02:11:50 · primary fetch · 1 source · updated Mar 26 · 02:11:50

Andrew Ng's The Batch writeup on agents highlights how much an iterative agent workflow improves coding benchmark performance: GPT-3.5 wrapped in an agent loop reaches up to 95.1% correctness on HumanEval, surpassing zero-shot GPT-4 at 67.0%. The report also covers new Stable Diffusion models such as Cyberrealistic_v40, Platypus XL, and SDXL Lightning for Naruto-style image generation, alongside innovations in LoRA and upscaling techniques.
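To make "wrapped in an agent loop" concrete, here is a minimal sketch of one common pattern: generate a candidate, run it against tests, and feed any failure back into the next attempt. It is an illustration under stated assumptions, not the setup behind the figures above. The names `agent_loop`, `run_tests`, and `stub_model` are hypothetical, and `stub_model` is a hard-coded stand-in for a real LLM call.

```python
from typing import Callable, Optional, Tuple


def run_tests(source: str, tests: str) -> Optional[str]:
    """Run candidate code and its assertions; return an error string, or None on success.

    Note: exec() on model output is unsafe outside a sandbox. It is used here
    only to keep the sketch self-contained.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)  # define the candidate function
        exec(tests, namespace)   # run the assertions against it
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def agent_loop(
    generate: Callable[[str, Optional[str]], str],
    task: str,
    tests: str,
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Generate, test, and revise until the tests pass or the round budget runs out."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        candidate = generate(task, feedback)  # model sees the previous failure, if any
        feedback = run_tests(candidate, tests)
        if feedback is None:
            return candidate, round_no
    return None, max_rounds


def stub_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM: wrong on the first try, fixed after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


TESTS = "assert add(2, 3) == 5, 'add(2, 3) should be 5'\n"
solution, rounds = agent_loop(stub_model, "Write add(a, b).", TESTS)
print(rounds)
print(solution)
```

A zero-shot setup corresponds to a single call to `generate` with no feedback. The loop differs only in letting the model see test failures and retry within a fixed budget.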

Discussions on local LLM deployment and optimization focus on hardware setups and fine-tuning strategies for efficient inference and multi-user serving. Emad Mostaque's departure from Stability AI and new Sora videos from OpenAI were also noted.


§ sources · 1 publication · timeline below

news.smol.ai · Andrew likes Agents · primary · 02:11:50