Models and Research

Highlights advances in core systems, technical breakthroughs, experiments, and academic work driving progress in AI.

Speculative Decoding’s Ceiling Just Moved With DFlash
By Max Dvornik | April 8, 2026 | June 25, 2026

A serving engineer watches tokens arrive in that familiar trickle: fast enough to demo, slow enough to feel like the…

Reduce LLM Hallucinations? Why ‘Make-No-Mistakes’ Fails
By Max Dvornik | April 7, 2026 | June 16, 2026

The first time you see it, it’s kind of perfect: a tiny folder in your Cursor skills called make-no-mistakes. One…

Neuro-symbolic AI Cuts Energy 100×: Change the Problem
By Geoff Dyers | April 7, 2026 | June 25, 2026

If you tried to rebuild the Tufts experiment yourself, the first thing you’d notice is boring: the neuro-symbolic AI system…

Chinese AI Model Delays End Casual Open-Weight Era
By Priscilla Li | April 6, 2026 | June 25, 2026

Everyone on Reddit sees the same thing: a bunch of Chinese labs promising new open‑weight models… and then quietly missing…

GLM-5 vs Claude Opus: Why Cheap Models Win for Agents
By James McCallef | April 5, 2026 | June 23, 2026

YC‑Bench just produced the sort of result that usually launches a thousand hot takes: GLM‑5 vs Claude Opus on a…

AI Model Collapse Is Happening: Treat Data as Code Now
By Priscilla Li | April 3, 2026 | June 25, 2026

If you’ve asked an LLM for a simple command lately and watched it flail through three wrong answers, you’ve already…

RBF Attention Reveals Dot‑Product’s Hidden Norm Bias
By Geoff Dyers | April 2, 2026 | June 16, 2026

Swapping dot‑product attention for RBF attention sounds like an architectural revolution. In Raphael Pisoni’s experiment, it turned out to be…

Claude vs ChatGPT: Why Claude Feels More Honest and Accurate
By Priscilla Li | March 30, 2026 | July 12, 2026

A 100‑question “bullshit benchmark” sounds like a joke until you see the chart. In BullshitBench v2, Anthropic’s Claude models sit…

Rebuttal Experiments Are Breaking Peer Review Right Now
By Max Dvornik | March 29, 2026 | June 25, 2026

A lot of people in AI quietly agree on one thing about rebuttal experiments: they make their papers better. More…

