Evals-based AI Engineering
Hamel Husain outlines an evals-based AI engineering workflow while new models Jamba, Bamboo, Qwen1.5-MoE, and Grok 1.5 debut alongside quantization advances for Llama2-7B.
Hamel Husain argues that comprehensive evals are central to AI product development, framing the work as an iterative loop of evaluation, debugging, and behavior change. OpenAI released a Voice Engine demo showing voice cloning from short audio samples, which raised safety concerns. Reddit discussions covered newly released models: Jamba (a hybrid Transformer-SSM architecture with MoE), Bamboo (a 7B LLM with high activation sparsity, based on Mistral), Qwen1.5-MoE (which activates only a fraction of its parameters per token), and Grok 1.5 (128k context length, reportedly surpassing GPT-4 in code generation).
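The evaluate, debug, change-behavior loop attributed to Husain above can be illustrated with a very small harness. This is only a sketch under stated assumptions, not Husain's actual tooling: `EvalCase`, `run_evals`, and `toy_model` are hypothetical names, and `toy_model` stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One test case: a prompt plus a check on the model output (hypothetical type)."""
    prompt: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def run_evals(
    model: Callable[[str], str], cases: list[EvalCase]
) -> tuple[float, list[tuple[str, str]]]:
    """Evaluate: run every case and collect the failures for debugging."""
    failures = []
    for case in cases:
        output = model(case.prompt)
        if not case.check(output):
            failures.append((case.prompt, output))
    pass_rate = 1 - len(failures) / len(cases) if cases else 0.0
    return pass_rate, failures


def toy_model(prompt: str) -> str:
    """Assumed stand-in for an LLM call; it simply upper-cases the prompt."""
    return prompt.upper()


CASES = [
    EvalCase("hello", lambda out: out == "HELLO"),
    EvalCase("abc", lambda out: out.islower()),  # fails on purpose
]

# Evaluate, then inspect failures (debug) before changing the prompt or model.
pass_rate, failures = run_evals(toy_model, CASES)
for prompt, output in failures:
    print(f"FAILED prompt={prompt!r} output={output!r}")
print(f"pass rate: {pass_rate:.0%}")
```

After each change to the prompt, model, or fine-tuning data, the same cases are rerun, so the pass rate and the failure list show whether the change helped or caused a regression.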
Advances in quantization include 1-bit Llama2-7B models that are reported to outperform full precision, and the QLLM quantization toolbox, which supports the GPTQ, AWQ, and HQQ methods.
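For readers unfamiliar with what "1-bit" means here, the following is a minimal sketch of generic sign-and-scale binarization, where each weight is stored as a single sign bit plus one shared floating-point scale. It is an illustrative assumption only: it is not the method behind the reported Llama2-7B results and does not use the QLLM API. The names `binarize` and `dequantize` are hypothetical.

```python
def binarize(weights: list[float]) -> tuple[list[int], float]:
    """Quantize weights to signs (+1 / -1) with one shared scale.

    The scale is the mean absolute weight, which minimizes the squared
    error of approximating each weight by scale * sign(weight).
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale


def dequantize(signs: list[int], scale: float) -> list[float]:
    """Reconstruct approximate weights from the sign bits and the scale."""
    return [s * scale for s in signs]


weights = [0.5, -1.5, 1.0, -1.0]  # made-up example values
signs, scale = binarize(weights)
approx = dequantize(signs, scale)

print(signs)   # sign bit per weight
print(scale)   # single shared scale
print(approx)  # coarse reconstruction of the original weights
```

The reconstruction keeps only the sign of each weight, which is why practical low-bit schemes typically add per-group scales, calibration, or fine-tuning on top of this basic idea.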
- Source: news.smol.ai, "Evals-based AI Engineering" (primary)