shipfeed · AI news, curated daily

§ local-llm · storyline

vLLM v0.17.1

vLLM releases v0.17.1, a patch fixing MoE fusion issues, re-enabling expert parallelism for TRT-LLM FP8, and adding Nemotron 3 Super model support.

Mar 11 · primary fetch · 1 source · updated Mar 11

This is a patch release on top of `v0.17.0` to address a few issues:

- New Model: Nemotron 3 Super
- Fix passing of activation_type to trtllm fused MoE NVFP4 and FP8 (#36017)
- Fix/resupport nongated fused moe triton (#36412)
- Re-enable EP for trtllm MoE FP8 backend (#36494)
- [Mamba][Qwen3.5] Zero freed SSM cache blocks on GPU (#35219)
- Fix TRTLLM Block FP8 MoE Monolithic (#36296)
- [DSV3.2][MTP] Optimize Indexer MTP handling (#36723)
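For readers who want to confirm that an environment already carries these fixes, the following is a minimal sketch of a version check, not something taken from the release notes. It assumes vLLM is installed under the distribution name `vllm` and that its version string starts with dotted numeric components. The helper names `parse_release`, `includes_patch`, and `installed_vllm_includes_patch` are hypothetical and exist only for this illustration.

```python
from __future__ import annotations

import re
from importlib import metadata

# The patch release described above.
PATCH_RELEASE = (0, 17, 1)


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric components, e.g. '0.17.1+cu128' -> (0, 17, 1)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def includes_patch(version: str, minimum: tuple[int, ...] = PATCH_RELEASE) -> bool:
    """True if `version` is at least `minimum`.

    Simplification: pre-release and local suffixes (rc1, +cu128) are ignored,
    so '0.17.1rc1' is treated the same as '0.17.1'.
    """
    parts = parse_release(version)
    width = max(len(parts), len(minimum))
    padded = parts + (0,) * (width - len(parts))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded >= padded_minimum


def installed_vllm_includes_patch() -> bool | None:
    """Check the installed `vllm` distribution; None if it is not installed."""
    try:
        return includes_patch(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_vllm_includes_patch())
```

The comparison is done on plain integer tuples to avoid a third-party dependency. A project that already depends on `packaging` may prefer its version parser, which handles pre-release ordering properly.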

§ sources · 1 publication
  1. github.com · vllm v0.17.1 · primary