Ollama v0.21.1
Ollama v0.21.1 adds Kimi CLI integration and MLX logprobs support, speeds up MLX sampling, and fixes Gemma 4 structured outputs and the macOS model picker.
What's Changed

Kimi CLI

You can now install and run the Kimi CLI through Ollama.

```
ollama launch kimi --model kimi-k2.6:cloud
```

Kimi CLI with Kimi K2.6 excels at long horizon agentic execution tasks through a multi-agent system.

- MLX runner adds logprobs support for compatible models
- Faster MLX sampling with fused top-P and top-K in a single sort pass, plus repeat penalties applied in the sampler
- Improved MLX prompt tokenization by moving tokenization into request handler goroutines
- Better MLX thread safety for array management
- GLM4 MoE Lite performance improvement with a fused sigmoid router head
- Fixed model picker showing stale model after switching chats in the macOS app
- Fixed structured outputs for Gemma 4 when `think=false`

Full Changelog: https://github.com/ollama/ollama/compare/v0.21.0...v0.21.1
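To illustrate the "fused top-P and top-K in a single sort pass" item above, here is a minimal pure-Python sketch of the general idea: sort the logits once, then apply both the top-K cut and the top-P (nucleus) cut against that same ordering. This is not Ollama's MLX code. The function names `filter_top_k_top_p` and `sample_token` are hypothetical, repeat penalties are left out, and the logits list is assumed to be non-empty.

```python
import math
import random


def filter_top_k_top_p(logits, top_k, top_p):
    """Return (token_ids, probs) kept after top-K and top-P, using one sort.

    Hypothetical helper for illustration only; assumes a non-empty list.
    """
    # Single sort pass: token ids ordered by descending logit.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)

    # Top-K: keep only the first top_k entries of the sorted order.
    if top_k > 0:
        order = order[:top_k]

    # Softmax over the survivors (subtract the max for numerical stability).
    max_logit = logits[order[0]]
    weights = [math.exp(logits[i] - max_logit) for i in order]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-P: walk the same sorted order and stop once the cumulative
    # probability mass reaches top_p. No second sort is needed.
    cumulative = 0.0
    cutoff = len(order)
    for count, p in enumerate(probs, start=1):
        cumulative += p
        if cumulative >= top_p:
            cutoff = count
            break

    return order[:cutoff], probs[:cutoff]


def sample_token(logits, top_k=40, top_p=0.9, rng=random):
    """Draw one token id from the filtered distribution."""
    ids, probs = filter_top_k_top_p(logits, top_k, top_p)
    # random.choices accepts unnormalised weights, so no renormalisation step.
    return rng.choices(ids, weights=probs, k=1)[0]


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.5, -1.0]
    print(filter_top_k_top_p(example_logits, top_k=3, top_p=0.8)[0])
    print(sample_token(example_logits, top_k=3, top_p=0.8))
```

Because the candidates are already sorted for top-K, the nucleus cut only needs a running sum over the same list, which is where a fused approach saves a second sort.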