§ feed · storyline

CuTeDSL at Perplexity

Perplexity details its use of CuTeDSL within its Runtime-Optimized Serving Engine to improve inference performance for AI models on NVIDIA GPUs.

May 5 · 20:20:15 · primary fetch · 1 source · updated May 5 · 20:20:15

Perplexity details how it uses CuTeDSL in its Runtime-Optimized Serving Engine (ROSE) for high-performance model serving, optimizing inference for a range of AI models on NVIDIA GPUs.

Read the full article on research.perplexity.ai

§ sources · 1 publication · timeline below

research.perplexity.ai · CuTeDSL at Perplexity · primary · 20:20:15