§ tools · storyline

llama.cpp b9075

llama.cpp build b9075 adds fused CUDA snake activation support for audio decoder models such as BigVGAN and Vocos, across F32, F16, and BF16 precisions.

May 8 · 19:37:29 · primary fetch · 1 source · updated May 11 · 23:27:23

cuda: fuse snake activation (mul, sin, sqr, mul, add) (#22667)

Add ggml_cuda_op_snake_fused with F32 / F16 / BF16 templates. The matcher recognizes the naive 5-op decomposition emitted by audio decoders (BigVGAN, Vocos) for the snake activation y = x + sin(a*x)^2 * inv_b and rewrites it to a single elementwise kernel.

Add test_snake_fuse comparing CPU naive vs CUDA fused across F32 / F16 / BF16.

cuda: address review feedback from @am17an. Use ggml_cuda_cast for F32/F16/BF16 conversions and rename kernel_snake to snake_kernel to match upstream conventions.
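To make the fusion concrete, here is a minimal CPU-side sketch in plain Python, not the CUDA kernel from the commit. It evaluates the snake activation two ways: as the five separate elementwise passes named in the commit title, and as one fused pass. The function names `snake_naive` and `snake_fused` are hypothetical, and `a` and `inv_b` are treated as scalars for simplicity; in a real decoder they would typically be tensors.

```python
import math

def snake_naive(x, a, inv_b):
    """Five separate elementwise passes, each producing an intermediate list."""
    t0 = [a * v for v in x]                      # mul: a * x
    t1 = [math.sin(v) for v in t0]               # sin
    t2 = [v * v for v in t1]                     # sqr
    t3 = [v * inv_b for v in t2]                 # mul: sin(a*x)^2 * inv_b
    return [xv + tv for xv, tv in zip(x, t3)]    # add: x + ...

def snake_fused(x, a, inv_b):
    """One pass per element, with no intermediate lists."""
    out = []
    for v in x:
        s = math.sin(a * v)
        out.append(v + s * s * inv_b)
    return out

# Illustrative inputs (assumed values, not taken from the release).
x = [-1.5, -0.25, 0.0, 0.5, 2.0]
a, inv_b = 0.75, 1.0 / 0.75

y_naive = snake_naive(x, a, inv_b)
y_fused = snake_fused(x, a, inv_b)
max_diff = max(abs(p - q) for p, q in zip(y_naive, y_fused))
print("max abs difference:", max_diff)
```

The naive form reads and writes the whole buffer once per op, while the fused form touches each element once, which is the general motivation for fusing elementwise chains on a GPU. Comparing the two outputs mirrors, in spirit, what the commit describes for test_snake_fuse.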

cuda: snake fusion fastdiv on T_len. Suggested-by: @am17an

Update tests/test-backend-ops.cpp. Co-authored-by: Aman Gupta

cuda: snake fusion check add->type matches x->type. Address review feedback from @am17an

cuda: snake fusion check add->type matches x->type. Moved for readability (equivalent). Address review feedback from @am17an

Co-authored-by: Aman Gupta

macOS/iOS: macOS Apple Silicon (arm64), macOS Apple Silicon (arm64, KleidiAI enabled), macOS Intel (x64), iOS XCFramework
Linux: Ubuntu x64 (CPU), Ubuntu arm64 (CPU), Ubuntu s390x (CPU), Ubuntu x64 (Vulkan), Ubuntu arm64 (Vulkan)…


§ sources · 23 publications · timeline below

github.com · llama.cpp b9075 · primary · 19:37:29
github.com · llama.cpp b9110 · 23:27:23
github.com · llama.cpp b9106 · 15:26:49
github.com · llama.cpp b9105 · 15:24:04
github.com · llama.cpp b9103 · 13:46:00
github.com · llama.cpp b9102 · 08:15:51
github.com · llama.cpp b9100 · 22:06:59
github.com · llama.cpp b9099 · 21:33:19
github.com · llama.cpp b9097 · 19:14:52
github.com · llama.cpp b9095 · 11:43:20
github.com · llama.cpp b9094 · 10:48:11
github.com · llama.cpp b9093 · 23:02:12
github.com · llama.cpp b9090 · 14:45:23
github.com · llama.cpp b9089 · 13:03:51
github.com · llama.cpp b9088 · 12:42:53
github.com · llama.cpp b9087 · 12:18:13
github.com · llama.cpp b9085 · 07:18:04
github.com · llama.cpp b9084 · 05:27:11
github.com · llama.cpp b9082 · 00:21:07
github.com · llama.cpp b9080 · 23:05:08
github.com · llama.cpp b9079 · 22:23:42
github.com · llama.cpp b9077 · 21:29:16
github.com · llama.cpp b9076 · 20:53:58

§ how this story moved

19:37:29 · primary · llama.cpp Releases publishes the launch post.
20:53:58 · llama.cpp Releases picks up coverage.
21:29:16 · llama.cpp Releases picks up coverage.
22:23:42 · llama.cpp Releases picks up coverage.
23:05:08 · llama.cpp Releases picks up coverage.
00:21:07 · llama.cpp Releases picks up coverage.
05:27:11 · llama.cpp Releases picks up coverage.
07:18:04 · llama.cpp Releases picks up coverage.
12:18:13 · llama.cpp Releases picks up coverage.
12:42:53 · llama.cpp Releases picks up coverage.
13:03:51 · llama.cpp Releases picks up coverage.
14:45:23 · llama.cpp Releases picks up coverage.
23:02:12 · llama.cpp Releases picks up coverage.
10:48:11 · llama.cpp Releases picks up coverage.
11:43:20 · llama.cpp Releases picks up coverage.
19:14:52 · llama.cpp Releases picks up coverage.
21:33:19 · llama.cpp Releases picks up coverage.
22:06:59 · llama.cpp Releases picks up coverage.
08:15:51 · llama.cpp Releases picks up coverage.
13:46:00 · llama.cpp Releases picks up coverage.
15:24:04 · llama.cpp Releases picks up coverage.
15:26:49 · llama.cpp Releases picks up coverage.
23:27:23 · llama.cpp Releases picks up coverage.