Batch Inference API gains 3000× rate limit boost and expanded models

The updated Batch Inference API brings 3000× higher rate limits (up to 30B tokens), universal model support, and half the cost of real-time APIs for large-scale workloads.

Sep 15 · 02:00:00 · primary fetch · 1 source · updated Sep 15 · 02:00:00

Our new Batch Inference API makes large-scale AI workloads simpler, faster, and cheaper. With a streamlined UI, universal model support, and 3000× higher rate limits (now up to 30B tokens), you can process massive datasets at half the cost of real-time APIs.
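To make the "half the cost" claim concrete, here is a minimal arithmetic sketch. The per-token price is a hypothetical placeholder, not a published rate, and the function names are illustrative only. The sketch also treats the quoted 30B-token figure as the size of a single workload, which is an assumption made purely for the calculation.

```python
# Hypothetical real-time price, for illustration only (not a published rate).
REALTIME_PRICE_PER_MILLION_TOKENS = 1.00  # USD, assumed

# The announcement states batch costs half as much as real-time APIs.
BATCH_COST_FACTOR = 0.5


def realtime_cost(tokens: int, price_per_million: float = REALTIME_PRICE_PER_MILLION_TOKENS) -> float:
    """Cost in USD of processing `tokens` through a real-time API."""
    return tokens / 1_000_000 * price_per_million


def batch_cost(tokens: int, price_per_million: float = REALTIME_PRICE_PER_MILLION_TOKENS) -> float:
    """Cost in USD of the same workload at the stated batch discount."""
    return realtime_cost(tokens, price_per_million) * BATCH_COST_FACTOR


if __name__ == "__main__":
    workload = 30_000_000_000  # 30B tokens, the figure quoted in the announcement
    print(f"real-time: ${realtime_cost(workload):,.2f}")
    print(f"batch:     ${batch_cost(workload):,.2f}")
```

Whatever the actual price turns out to be, the saving scales linearly with token count, so the absolute difference matters most for the largest jobs.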


§ sources · 1 publication · timeline below

together.ai · Improved Batch Inference API: Enhanced UI, Expanded Model Support, and 3000× Rate Limit Increase · primary · 02:00:00