DiffusionGemma generates up to 256 tokens per step, up to 4x faster locally
Apache 2.0, built on Gemma 4 26B MoE, day-zero support in Hugging Face Transformers, vLLM, and Unsloth.
Google DeepMind released DiffusionGemma today, a 26B-parameter mixture-of-experts model that generates up to 256 tokens per step in parallel rather than sequentially, delivering up to 4x faster local inference. Anthropic separately launched Claude Fable 5 for general use and expanded Mythos 5 access to hundreds of organizations across 15 countries, though Fable's broad cybersecurity guardrails are drawing complaints from security researchers.
NVIDIA ships day-zero DiffusionGemma support across RTX, RTX PRO, and DGX Spark hardware platforms.
Researchers including IBM X-Force's Palmiotti report Fable blocks innocuous security tasks like reading blog posts.
Gemini 3.5 Live Translate covers 70+ languages in real-time speech-to-speech; Google backstops Anthropic's $35B chip lease.
Two Writer papers show stored user preferences cause models to return user-biased answers to unrelated factual queries.
AI agents in Kiro and Claude can now author, debug, and profile NKI kernels on Trainium and Inferentia.
Framework addresses loss of statistical power in two-sample testing as model scale increases, enabling black-box unlearning audits.
Ramp AI Index data: top 10% spend $611/employee/month; median is $11.38. Top-tier spend grew 14.1% last month.
Founded by two early Datadog engineers; Greylock led the round; pitch targets enterprises wary of OpenAI/Anthropic lock-in.
Amodei argues congressional timelines are too slow relative to AI capability gains; outlines specific policy recommendations.