§ feed · storyline
Parcae: Doing more with fewer parameters using stable looped models
Parcae is a newly released looped language model that matches 1.3B-parameter Transformer quality with 770M parameters; its new scaling laws show that increasing recurrence improves compute efficiency.
Parcae is a stable looped language model that matches the quality of a Transformer twice its size — a 770M model reaching 1.3B-level performance.
We introduce the first scaling laws for looping and show that increasing recurrence, not just data, is a compute-efficient path to bet
§ sources · 1 publication · timeline below