shipfeed · AI news, curated daily


Jamba: Mixture of Architectures dethrones Mixtral

AI21 Labs releases Jamba, a 52B-parameter hybrid transformer-Mamba MoE model with 256K context length and Apache 2.0 open weights, optimised to run on a single A100 GPU.

Mar 29 · primary fetch · 1 source · updated Mar 29

AI21 Labs released Jamba, a 52B-parameter MoE model with a 256K context length and open weights under the Apache 2.0 license, optimized to run on a single A100 GPU. Its blocks-and-layers architecture combines transformer, Mamba, and MoE layers, and it competes with models such as Mixtral. Meanwhile, Databricks introduced DBRX, an MoE model with 36B active parameters trained on 12T tokens, described as a new standard for open LLMs.
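The blocks-and-layers idea behind a hybrid transformer-Mamba MoE model can be sketched as a layer layout. This is a minimal illustration, not AI21's implementation: the names `Layer`, `build_block`, and `build_model` are hypothetical, and the default ratios (one attention layer per eight layers, MoE on every second layer, four blocks) are assumptions chosen for the example rather than figures stated in this digest.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    mixer: str  # "attention" or "mamba": how tokens exchange information
    mlp: str    # "dense" or "moe": which feed-forward variant follows the mixer


def build_block(layers_per_block: int = 8,
                attention_every: int = 8,
                moe_every: int = 2) -> List[Layer]:
    """Lay out one hybrid block (all defaults are illustrative assumptions)."""
    layers = []
    for i in range(layers_per_block):
        # Most layers use a Mamba mixer; an attention mixer appears periodically.
        mixer = "attention" if i % attention_every == 0 else "mamba"
        # Replace the dense MLP with a mixture-of-experts MLP on a fixed cadence.
        mlp = "moe" if i % moe_every == 1 else "dense"
        layers.append(Layer(mixer=mixer, mlp=mlp))
    return layers


def build_model(n_blocks: int = 4, **block_kwargs) -> List[Layer]:
    """Stack identical blocks into a full layer list."""
    model = []
    for _ in range(n_blocks):
        model.extend(build_block(**block_kwargs))
    return model


if __name__ == "__main__":
    for idx, layer in enumerate(build_block()):
        print(idx, layer.mixer, layer.mlp)
```

Keeping the mixer and the MLP as separate choices per layer shows why such a design is described as a mixture of architectures: the attention-to-Mamba ratio and the MoE cadence can be tuned independently.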

In image generation, highlights include AnimateDiff for video-quality image generation and FastSD CPU v1.0.0 beta 28, which enables ultra-fast image generation on CPUs. Other work covers style-content separation with B-LoRA and improved high-resolution image upscaling with SUPIR.

read full article on news.smol.ai
§ sources · 1 publication
  1. news.smol.ai · Jamba: Mixture of Architectures dethrones Mixtral · primary