§ feed · storyline

Skyfall

Google DeepMind released Gemini 1.5 Pro and Flash with sparse multimodal MoE architecture, up to 10M token context, and a dense decoder variant that is 3x faster and 10x cheaper.

May 21 · 01:02:42 · primary fetch1 sourceupdated May 21 · 01:02:42

Between 5/17 and 5/20/2024, key AI updates include Google DeepMind's Gemini 1.5 Pro and Flash models, featuring sparse multimodal MoE architecture with up to 10M context and a dense Transformer decoder that is 3x faster and 10x cheaper. Yi AI released Yi-1.5 models with extended context windows of 32K and 16K tokens. Other notable releases include Kosmos 2.5 (Microsoft), PaliGemma (Google), Falcon 2, DeepSeek v2 lite, and HunyuanDiT diffusion model. Research highlights feature an Observational Scaling Laws paper predicting model performance across families, a Layer-Condensed KV Cache technique boosting inference throughput by up to 26×, and the SUPRA method converting LLMs into RNNs for reduced compute costs.

Hugging Face expanded local AI capabilities enabling on-device AI without cloud dependency. LangChain updated its v0.2 release with improved documentation. The community also welcomed a new LLM Finetuning Discord by Hamel Husain and Dan Becker for Maven course users. "Hugging Face is profitable, or close to profitable," enabling $10 million in free shared GPUs for developers.

read full article on news.smol.ai ↗

§ sources1 publication · timeline below

news.smol.aiSkyfallprimary01:02:42