shipfeed · AI news, curated daily


Nvidia introduces 4-bit pretraining method for large language models

Nvidia introduces a 4-bit pretraining method built on the NVFP4 format, validated on a 12B-parameter hybrid Mamba-Transformer model over a 10T-token horizon, described as the longest publicly documented 4-bit training run.

May 18 · primary fetch · 1 source · updated May 18


read full article on marktechpost.com
§ sources · 1 publication
  1. marktechpost.com: NVIDIA Introduces a 4-Bit Pretraining Methodology Using NVFP4, Validated on a 12B Hybrid Mamba-Transformer at 10T Token Horizon (primary)