shipfeed · AI news, curated daily

§ local-llm · storyline

vLLM v0.18.1

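vLLM has released v0.18.1, a patch on top of v0.18.0 that restores TRT-LLM as the default SM100 MLA prefill backend, fixes a mock.patch resolution failure on Python <= 3.10, disables monolithic TRTLLM MoE for Renormalize routing, pre-downloads missing FlashInfer headers in the Docker build, and fixes a DeepGemm E8M0 accuracy degradation for Qwen3.5 FP8 on Blackwell.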

Mar 31 · primary fetch · 1 source · updated Mar 31

This is a patch release on top of v0.18.0 to address a few issues:

  - Change default SM100 MLA prefill backend back to TRT-LLM (#38562)
  - Fix mock.patch resolution failure for standalone_compile.FakeTensorMode on Python <= 3.10 (#37158)
  - Disable monolithic TRTLLM MoE for Renormalize routing (#37605)
  - Pre-download missing FlashInfer headers in Docker build (#38391)
  - Fix DeepGemm E8M0 accuracy degradation for Qwen3.5 FP8 on Blackwell (#38083)
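To see whether a local environment already carries these fixes, one option is to compare the installed version against 0.18.1. The sketch below is an illustration, not part of the release: the helper names (`release_tuple`, `includes_patch`, `installed_vllm_version`) are hypothetical, and it assumes the package is installed under the distribution name `vllm`. It only compares the leading major.minor.patch numbers, so pre-release or local suffixes are ignored.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# v0.18.1 is the patch release described above.
PATCH_RELEASE = (0, 18, 1)


def release_tuple(version: str) -> Tuple[int, int, int]:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def includes_patch(version: str) -> bool:
    """True if the version is at or after 0.18.1 (suffixes are ignored)."""
    return release_tuple(version) >= PATCH_RELEASE


def installed_vllm_version() -> Optional[str]:
    """Return the installed version of the vllm distribution, or None."""
    try:
        return metadata.version("vllm")  # assumes the distribution name is "vllm"
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_vllm_version()
    if found is None:
        print("vllm is not installed")
    else:
        print(found, "includes the v0.18.1 fixes:", includes_patch(found))
```

Comparing integer tuples rather than raw strings avoids the lexicographic trap where "0.9.2" would sort after "0.18.1".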

§ sources · 1 publication · timeline below
  1. github.com · vLLM v0.18.1 · primary