shipfeed · AI news, curated daily

§ local-llm · storyline

vLLM v0.11.2

vLLM releases v0.11.2 with four bug fixes addressing multi-node Ray support, speculative decoding assertions, async scheduling with FlashAttn MLA, and SM100 CUTLASS MoE macro guards.

Nov 20 · primary fetch · 1 source · updated Nov 20

This release includes 4 bug fixes on top of `v0.11.1`:

  1. [BugFix] Ray with multiple nodes (https://github.com/vllm-project/vllm/pull/28873)
  2. [BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (https://github.com/vllm-project/vllm/pull/29036)
  3. [BugFix] Fix async-scheduling + FlashAttn MLA (https://github.com/vllm-project/vllm/pull/28990)
  4. [NVIDIA] Guard SM100 CUTLASS MoE macro to SM100 builds v2 (https://github.com/vllm-project/vllm/pull/28938)
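For readers who want to confirm whether a local environment already carries these fixes, here is a minimal sketch that compares the installed vLLM version against `0.11.2`. It assumes the package is installed under the distribution name `vllm`. The helper names (`release_tuple`, `has_v0112_fixes`, `installed_vllm_version`) are hypothetical and not part of vLLM. The comparison looks only at the numeric release segment, so pre-release tags such as `rc1` are ignored; a full PEP 440 comparison would need a dedicated version parser.

```python
from __future__ import annotations

import re
from importlib import metadata

# First release that contains the four fixes listed in this item.
FIXED_IN = (0, 11, 2)


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment of a version string.

    Simplification (assumption): suffixes such as '.post1', 'rc1' or
    '+cu128' are dropped rather than interpreted.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def has_v0112_fixes(version: str) -> bool:
    """True if the numeric release segment is at least 0.11.2."""
    return release_tuple(version) >= FIXED_IN


def installed_vllm_version() -> str | None:
    """Return the installed vLLM version, or None if it is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_vllm_version()
    if found is None:
        print("vllm is not installed in this environment")
    elif has_v0112_fixes(found):
        print(f"vllm {found} includes the v0.11.2 fixes")
    else:
        print(f"vllm {found} predates v0.11.2")
```

The check reads package metadata instead of importing `vllm`, so it runs quickly and works on machines without a GPU.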

§ sources · 1 publication · timeline below
  1. github.com · vllm v0.11.2 · primary