§ feed · storyline
CuTeDSL at Perplexity
Perplexity details its use of CuTeDSL in its Runtime-Optimized Serving Engine (ROSE) for high-performance model serving, optimizing inference for various AI models on NVIDIA GPUs.
§ sources · 1 publication · timeline below
- research.perplexity.ai · CuTeDSL at Perplexity · primary