§ agents · storyline
Benchmarking inference at scale: coding agents
Baseten publishes real-world inference benchmarks for coding agents, reporting 31% higher throughput (tokens per second, TPS) than TensorRT-LLM, 2× faster time to first token (TTFT) at saturation, and 76% lower cost than Claude Opus 4.6.
§ sources · 1 publication · timeline below