llama.cpp b9075
llama.cpp build b9075 adds a fused CUDA kernel for the snake activation used by audio decoder models such as BigVGAN and Vocos, with support for F32, F16, and BF16 precisions.
cuda: fuse snake activation (mul, sin, sqr, mul, add) (#22667)

cuda: fuse snake activation (mul, sin, sqr, mul, add)
Add ggml_cuda_op_snake_fused with F32 / F16 / BF16 templates. The matcher recognizes the naive 5-op decomposition emitted by audio decoders (BigVGAN, Vocos) for the snake activation y = x + sin(a*x)^2 * inv_b and rewrites it to a single elementwise kernel. Add test_snake_fuse comparing CPU naive vs CUDA fused across F32 / F16 / BF16.

cuda: address review feedback from @am17an
Use ggml_cuda_cast for F32/F16/BF16 conversions and rename kernel_snake to snake_kernel to match upstream conventions.
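To make the five-op decomposition concrete, here is a minimal CPU-side sketch in C++ that computes the snake activation both ways: as five separate elementwise passes (mul, sin, sqr, mul, add) and as a single fused loop. It is not the ggml or CUDA code from the PR. The function names `snake_naive` and `snake_fused`, the per-channel layout of `a` and `inv_b`, and the channel index `i / T_len` are assumptions made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference code, not taken from llama.cpp.
// Assumed layout: x holds C channels of T_len contiguous samples each;
// a and inv_b hold one value per channel.

// Naive path: five elementwise passes, mirroring mul, sin, sqr, mul, add.
std::vector<float> snake_naive(const std::vector<float>& x,
                               const std::vector<float>& a,
                               const std::vector<float>& inv_b,
                               std::size_t T_len) {
    const std::size_t n = x.size();
    std::vector<float> t(n);
    for (std::size_t i = 0; i < n; ++i) t[i] = x[i] * a[i / T_len];      // mul
    for (std::size_t i = 0; i < n; ++i) t[i] = std::sin(t[i]);           // sin
    for (std::size_t i = 0; i < n; ++i) t[i] = t[i] * t[i];              // sqr
    for (std::size_t i = 0; i < n; ++i) t[i] = t[i] * inv_b[i / T_len];  // mul
    for (std::size_t i = 0; i < n; ++i) t[i] = x[i] + t[i];              // add
    return t;
}

// Fused path: one pass, each element is read once and written once.
std::vector<float> snake_fused(const std::vector<float>& x,
                               const std::vector<float>& a,
                               const std::vector<float>& inv_b,
                               std::size_t T_len) {
    const std::size_t n = x.size();
    std::vector<float> y(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t c = i / T_len;       // channel index (assumed layout)
        const float s = std::sin(a[c] * x[i]);
        y[i] = x[i] + s * s * inv_b[c];        // y = x + sin(a*x)^2 * inv_b
    }
    return y;
}
```

The two functions should agree to within floating-point rounding. The fused form avoids the intermediate buffers that the naive form passes over between steps, which is the usual motivation for collapsing a chain of elementwise ops into one kernel.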
cuda: snake fusion fastdiv on T_len
Suggested-by: @am17an

Update tests/test-backend-ops.cpp
Co-authored-by: Aman Gupta

cuda: snake fusion check add->type matches x->type
Address review feedback from @am17an

cuda: snake fusion check add->type matches x->type
Moved for readability (equivalent). Address review feedback from @am17an

---------

Co-authored-by: Aman Gupta

macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Apple Silicon (arm64, KleidiAI enabled)
- macOS Intel (x64)
- iOS XCFramework

Linux:
- Ubuntu x64 (CPU)
- Ubuntu arm64 (CPU)
- Ubuntu s390x (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu arm64 (Vulkan)
…
- github.com: llama.cpp b9075 (primary)