MoE
4 articles
Paweł Huryn Claims: Holo3 with 3B Active Parameters Beats GPT-5.4 and Opus 4.6 at Computer Use
Paweł Huryn posted on X claiming H Company's Holo3 beat GPT-5.4 and Opus 4.6 at computer use tasks with just 3B active parameters. He says it's a sparse MoE fine-tuned from Qwen3.5 and could theoretically run on a single GPU.
Why Programmers Love Codex While Vibe Coders Can't Quit Claude: Dense vs MoE Is Really a Story About Two Coding Philosophies
Berryxia uses the dense-vs-MoE split to explain something many developers already feel: Codex often shines at bug fixing, refactors, and long-running engineering tasks, while Claude keeps winning over vibe coders. That framing captures part of the truth, but the real divide goes beyond architecture: it spans training philosophy, product design, and whether you treat coding as precise delegation or as interactive creation.
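The architectural half of that split comes down to sparse routing. A minimal, purely illustrative sketch of top-k MoE routing (not any vendor's actual implementation; all dimensions and names here are made up) shows why only a fraction of a model's parameters are "active" per token:

```python
# Illustrative top-k sparse MoE routing: a router scores E experts per
# token and only the top K are evaluated, so the active parameter count
# is K/E of the total expert parameters.
import numpy as np

rng = np.random.default_rng(0)

D, E, K = 8, 16, 2                    # hidden dim, experts, experts used per token
router_w = rng.normal(size=(D, E))    # router projection
experts = rng.normal(size=(E, D, D))  # one tiny "FFN" weight matrix per expert

def moe_forward(x):
    """Route token x through its top-K experts, mixed by softmax gate weights."""
    logits = x @ router_w
    topk = np.argsort(logits)[-K:]            # indices of the K highest-scoring experts
    w = np.exp(logits[topk] - logits[topk].max())
    w /= w.sum()                              # normalized gate weights
    return sum(wi * (experts[i] @ x) for wi, i in zip(w, topk))

x = rng.normal(size=D)
y = moe_forward(x)

total_params = experts.size    # all expert parameters
active_params = K * D * D      # parameters actually touched for this token
print(active_params / total_params)  # → 0.125 (2 of 16 experts)
```

A dense model is the K = E case: every parameter participates in every token, which is the "precise delegation" workhorse trade-off the article gestures at.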
Running a Trillion-Parameter Model on a MacBook? The Wild SSD Streaming Experiment
Simon Willison shared a new trend in running massive MoE models on Macs: streaming expert weights from SSD instead of cramming everything into RAM. Even a trillion-parameter Kimi K2.5 runs on a 96GB MacBook Pro.
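The trick is that a sparse MoE only touches a few experts per token, so the rest of the weights can stay on disk. A hedged sketch of the idea using a memory-mapped file (the file layout and names here are invented for illustration; real runtimes like llama.cpp's mmap loading are far more involved):

```python
# Sketch of SSD expert streaming: keep all expert weights in one file and
# memory-map it, so the OS pages in only the experts a token routes to.
import os
import tempfile
import numpy as np

D, E = 64, 32
path = os.path.join(tempfile.mkdtemp(), "experts.bin")  # stand-in for SSD-resident weights

# Write E expert matrices to disk up front.
np.random.default_rng(0).normal(size=(E, D, D)).astype(np.float32).tofile(path)

# Memory-map instead of loading: no expert is read until its slice is touched.
experts = np.memmap(path, dtype=np.float32, mode="r", shape=(E, D, D))

def apply_expert(i, x):
    """Touching experts[i] faults in only that expert's pages from disk."""
    return np.asarray(experts[i]) @ x

x = np.ones(D, dtype=np.float32)
y = apply_expert(3, x)  # only expert 3's 16 KB slice needs to be resident
```

With a fast NVMe SSD and a model where, say, 32B of 1T parameters are active per token, paging experts on demand is what makes the "trillion parameters on a laptop" headline plausible, at the cost of latency whenever routing hits a cold expert.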
NVIDIA Nemotron 3 Super: A 120B Open-Source Model That Only Uses 12B at a Time
NVIDIA released Nemotron 3 Super, a 120B-parameter open-source reasoning model with only 12B active parameters per token. It combines Mamba and Transformer layers in a hybrid MoE architecture, scores 36 on the Intelligence Index, and runs at a blistering 484 tok/s.