§ feed · storyline

How AI training scales

OpenAI publishes research showing the gradient noise scale predicts optimal batch sizes for neural network training, suggesting large batches will enable further scaling of AI systems.

Dec 14 · 09:00:00 · primary fetch1 sourceupdated Dec 14 · 09:00:00

We’ve discovered that the gradient noise scale, a simple statistical metric, predicts the parallelizability of neural network training on a wide range of tasks. Since complex tasks tend to have noisier gradients, increasingly large batch sizes are likely to become useful in the future, removing one potential limit to further growth of AI systems.

More broadly, these results show that neural network training need not be considered a mysterious art, but can be rigorized and systematized.

read full article on openai.com ↗

§ sources1 publication · timeline below

openai.comHow AI training scalesprimary09:00:00