Ollama v0.19.0
## Ollama is now powered by MLX on Apple silicon (preview)

Ollama on Apple silicon is now built on top of Apple's machine learning framework, MLX, to take advantage of its unified memory architecture.

https://github.com/user-attachments/assets/600297b0-3167-46a5-8e3a-fefda3a51b84

Read more: https://ollama.com/blog/mlx

## What's Changed

- Ollama's app no longer incorrectly shows "model is out of date"
- `ollama launch pi` now includes a web search plugin that uses Ollama's web search
- Improved KV cache hit rate when using the Anthropic-compatible API
- Fixed a tool call parsing issue with Qwen3.5 where tool calls would be output in thinking
- The MLX runner now creates periodic snapshots during prompt processing
- Fixed a KV cache snapshot memory leak in the MLX runner
- Fixed an issue where flash attention would be incorrectly enabled for `grok` models
- Fixed `qwen3-next:80b` not loading in Ollama

## New Contributors

- @amatas made their first contribution in https://github.com/ollama/ollama/pull/15022

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.18.3...v0.19.0