
Adept Fuyu-Heavy: Multimodal model for Agents

Adept launches Fuyu-Heavy, a multimodal model targeting UI understanding and visual QA that outperforms Gemini Pro on the MMMU benchmark, with parameter count estimated between 20B and 170B.

Jan 25 · 22:30:23 · primary fetch · 1 source · updated Jan 25 · 22:30:23

Adept launched Fuyu-Heavy, a multimodal model focused on UI understanding and visual QA that outperforms Gemini Pro on the MMMU benchmark. The model uses DPO (Direct Preference Optimization), which is gaining attention as a leading tuning method. Fuyu-Heavy's size is undisclosed but estimated at 20B to 170B parameters, smaller than the rumored sizes of frontier models such as Claude 2, GPT-4V, and Gemini Ultra. Meanwhile, Mamba was rejected at ICLR over quality concerns. In Discord discussions, DeepSeek Coder 33B was claimed to outperform GPT-4 on coding tasks, and participants explored deployment strategies for large models such as Yi-34B-200K and Goliath-120B.

Quantization debates showed mixed views on Q8 and EXL2 quants. Participants discussed fine-tuning and instruct-tuning of Mistral 7B Instruct v0.2, along with RMS optimization and heterogeneous AI architectures that combine Transformers with Selective SSM (Mamba). Discussions also noted the potential of recurrent LLMs such as RWKV and of techniques such as Contrastive Preference Optimization (CPO).


§ sources · 1 publication · timeline below

news.smol.ai · Adept Fuyu-Heavy: Multimodal model for Agents · primary · 22:30:23