§ feed · storyline

Parcae: Doing more with fewer parameters using stable looped models

Parcae is a looped language model that matches 1.3B-parameter Transformer quality at 770M parameters. It is accompanied by new scaling laws showing that increased recurrence improves compute efficiency.

Apr 15 · 02:00:00 · primary fetch · 1 source · updated Apr 15 · 02:00:00

Parcae is a stable looped language model that matches the quality of a Transformer twice its size — a 770M model reaching 1.3B-level performance.

We introduce the first scaling laws for looping and show that increasing recurrence, not just data, is a compute-efficient path to bet


§ sources · 1 publication · timeline below

together.ai · Parcae: Doing more with fewer parameters using stable looped models · primary · 02:00:00