ollama
Ollama Switches to MLX, Betting Big on Apple Silicon Local Inference
Ollama announces MLX-powered inference on Apple Silicon, targeting faster local performance for personal assistants and coding agents.
Sentdex: I've Fully Replaced Claude Code + Opus with a Local LLM — $0 API Cost
Sentdex replaced Claude Code (Opus 4.5) with a local stack: Ollama running Qwen3-Coder-Next, quantized to 4-bit in ~50GB of RAM. He reports 30-40 t/s on CPU and 100 t/s on GPU, cutting his API costs to zero. He is among the first prominent developers to claim local coding agents are usable for daily work.
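For readers who want to sanity-check throughput claims like these on their own hardware, here is a minimal sketch using the official ollama Python client (pip install ollama; attribute access assumes ollama-python >= 0.4). The model tag qwen3-coder is an assumption, not from the article: substitute whatever tag `ollama list` shows for your local build.

```python
import ollama

# Hypothetical model tag; replace with your local tag from `ollama list`.
MODEL = "qwen3-coder"

response = ollama.chat(
    model=MODEL,
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a linked list."}
    ],
)

print(response.message.content)

# The Ollama server reports generated-token counts and timings (nanoseconds),
# which is enough to reproduce a tokens-per-second figure.
if response.eval_count and response.eval_duration:
    print(f"{response.eval_count / (response.eval_duration / 1e9):.1f} t/s")
```

Computing t/s from the server-reported eval_count and eval_duration, rather than wall-clock time, excludes model load and prompt-processing overhead, which is the figure local-inference benchmarks usually quote.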