Models and Research

Highlights advances in core systems, technical breakthroughs, experiments, and academic work driving progress.

FutureSim Exposes Polymarket AI’s Narrow Wins and Failures
By Priscilla Li | May 17, 2026 | June 28, 2026

Max Planck Institute researchers recently released FutureSim, a benchmark for Polymarket AI-style forecasting that tests whether agents can predict real-world…

10^7-Dimensional LLM Memory, but Only If it Stays Sparse
By Max Dvornik | May 12, 2026 | June 25, 2026

A BDH seminar summary circulating in recent technical discussion frames LLM memory as a tradeoff between the familiar transformer KV…

DeepSeek Forces Visual Reasoning Through Points and Boxes
By Max Dvornik | May 1, 2026 | June 23, 2026

DeepSeek has released an open-source visual reasoning framework called Thinking with Visual Primitives. According to 36Kr, the system changes how…

A Formula From Another Field Opened Erdős Problem
By James McCallef | April 27, 2026 | June 23, 2026

Erdős problem #1196 now has a serious claimed solution, and the evidence ladder is unusually visible. Liam Price posted GPT-5.4…

302 Designs, 16 Hits: AI-Designed Viruses in the Lab
By Max Dvornik | April 27, 2026 | June 23, 2026

AI-designed viruses are now a lab result, but not in the way the viral posts made it sound. Researchers affiliated…

A 14-Author Paper Tries to Make Deep Learning Theory a Science
By James McCallef | April 26, 2026 | June 23, 2026

A 14-author perspective paper posted to arXiv on April 23 argues that deep learning theory is starting to look less…

LLM Failure Modes Start in the Stack, Not the Chat
By James McCallef | April 24, 2026 | June 25, 2026

LLM failure modes are easiest to understand if you stop treating them as personality flaws, “the model lied,” “the chatbot…

66 Tokens Make a Diffusion Language Model Look Easy
By Sarah Fraser | April 23, 2026 | June 16, 2026

A diffusion language model generates text by starting from masked or otherwise corrupted tokens and iteratively restoring them. In this…

Language-Agnostic Representations Show a Shared Semantic Workspace
By Sarah Fraser | April 20, 2026 | June 23, 2026

The standard story is that LLMs work in words. They predict the next token, so surely their internal reasoning is…
