Diffusion Language Models Explained — How Mercury Generates 1,000 Tokens Per Second
Mercury uses diffusion instead of autoregressive decoding to generate all tokens in parallel, hitting 1,000+ tokens/sec. We break down how it works.
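Mercury's internals are not public, but the core idea of diffusion decoding can be sketched conceptually: start from a fully masked sequence and run a few denoising steps, where each step scores *all* masked positions at once and commits the most confident predictions. The snippet below is a minimal toy illustration of that parallel-unmasking loop; `toy_denoiser` is a hypothetical stand-in for the real denoising network, not Mercury's actual model.

```python
import random

MASK = "<mask>"

def toy_denoiser(tokens):
    """Hypothetical stand-in for a diffusion denoising network: for each
    masked slot it proposes a token plus a confidence score. A real model
    would compute these from a transformer over the whole sequence."""
    target = ["Diffusion", "models", "generate", "every", "token", "in", "parallel"]
    return [(i, target[i], random.random())
            for i, tok in enumerate(tokens) if tok == MASK]

def diffusion_decode(length=7, steps=3, seed=0):
    """Iterative parallel unmasking: each step, the denoiser scores ALL
    masked positions simultaneously, and we commit the most confident
    fraction — so the number of model calls is `steps`, not `length`."""
    random.seed(seed)
    tokens = [MASK] * length
    for step in range(steps):
        proposals = toy_denoiser(tokens)
        if not proposals:
            break
        # keep the highest-confidence proposals this round
        proposals.sort(key=lambda p: p[2], reverse=True)
        k = max(1, len(proposals) // (steps - step))
        for i, tok, _ in proposals[:k]:
            tokens[i] = tok
    # final pass: fill any slots still masked
    for i, tok, _ in toy_denoiser(tokens):
        tokens[i] = tok
    return tokens

print(" ".join(diffusion_decode()))
```

The speed argument falls out of the loop structure: an autoregressive model needs one forward pass per token, while this loop needs one pass per denoising step regardless of sequence length.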