Erdős problem #1196 now has a serious claimed solution, and the evidence ladder is unusually visible. Liam Price posted GPT-5.4 Pro output to erdosproblems.com; Scientific American reports that Terence Tao and Jared Lichtman said the opening move looked new for this problem; an 8-page note organizing the argument now exists; and a Lean formalization repository claims a machine-checked proof. The theorem claim and the proof artifacts are public. The novelty of the opening is still best described as an expert assessment, not a settled historical fact.
That is the interesting part. Not that an amateur suddenly outran the field, but that a general-purpose model may have made move one differently. Tao’s description, in Scientific American, is the load-bearing fact: researchers had converged on a standard opening for this Erdős problem, and the model instead reached for a formula from a related area of math that nobody had been trying here.
What AI Actually Solved in Erdős Problem #1196
The problem is about primitive sets and a weighted sum Erdős defined for them. A primitive set is a set of positive integers where no element divides any other. All primes form a primitive set, but many non-prime examples exist too.
Erdős Problem #1196 asks whether every primitive set made only of sufficiently large numbers obeys a universal upper bound on its Erdős sum. More concretely, the claimed result bounds the Erdős sum, the sum of 1/(n log n) over the set's elements, for any primitive set whose elements are all at least x. When the later Lean repository says the set is supported on [x, ∞), that just means every number in the set is at least x.
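Both definitions are concrete enough to sketch in a few lines of code. This is an illustration only; the function names are ours, not from any of the sources:

```python
import math
from itertools import combinations

def is_primitive(s):
    """True if no element of s divides another, i.e. s is a primitive set."""
    return not any(b % a == 0 for a, b in combinations(sorted(s), 2))

def erdos_sum(s):
    """The Erdős sum: the sum of 1/(n * log n) over the set's elements."""
    return sum(1 / (n * math.log(n)) for n in s)

primes = {2, 3, 5, 7, 11, 13}
assert is_primitive(primes)          # no prime divides another prime

semiprimes = {4, 6, 9, 10, 15}       # products of two primes: also primitive
assert is_primitive(semiprimes)

assert not is_primitive({3, 6, 7})   # 3 divides 6, so not primitive
```

The claimed theorem is about the behavior of `erdos_sum` over every primitive set whose smallest element is at least x, which is what makes it a universal bound rather than a computation about any particular set.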
That is more specific than “AI solved a math problem.” The theorem is a quantitative statement about all primitive sets above a threshold, not a one-off construction or a numerical experiment.
The problem was not ignored. Scientific American reports that it had eluded prominent mathematicians, and Tao’s quote there is tighter: “people did look at it.” That matters because it rules out the easiest fake-AI-breakthrough story, where a model stumbles into a neglected exercise nobody serious cared about.
The current status is stronger than a forum post. There is:
– a public thread on erdosproblems.com where the proof was posted and refined,
– an 8-page write-up organizing the argument,
– and a Lean repository claiming a formalization of the result.
Formalization does not mean the model wrote a correct proof end to end. It means the final theorem was translated into Lean, a proof assistant that checks each logical step once humans make those steps explicit enough.
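For a sense of what "explicit enough" means, here is a toy Lean 4 statement. It is unrelated to the repository's actual theorem; it only illustrates the kind of fully justified step a proof assistant accepts:

```lean
-- A toy example: Lean accepts this theorem only because every step
-- is justified; `Nat.add_comm` is an existing lemma in Lean's library.
theorem toy_comm (a b : Nat) : a + b = b + a := Nat.add_comm a b
```

Scaling that discipline up to an 8-page number-theory argument is the labor-intensive part of formalization, which is why the Lean artifact came after the human cleanup, not before.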
Why the Amateur Mattered Less Than the Model's First Move
Liam Price’s role was prompting, posting, and surfacing the result. The potentially novel mathematical step is what experts attributed to the model.
This gets blurred in headlines. According to Scientific American, Price is a 23-year-old with no advanced math training, and the claimed solution began with a single prompt to GPT-5.4 Pro. If the story were simply “an amateur solved a hard open problem,” the right default reaction would be skepticism.
Instead, Tao and Lichtman focused on something narrower. Tao said previous researchers had a standard sequence of moves they usually started with. The model did not follow that sequence. It applied a formula already known in a related part of mathematics to this primitive-sets question.
That difference is the whole story. The important claim is not that ChatGPT became a mathematician. It is that a general-purpose model may have proposed a first step specialists had systematically not been trying on this Erdős problem.
Tao’s public wiki on AI contributions to Erdős problems is useful context here because it is cautious, not promotional. It notes selection bias, provisional assessments, and cases where AI-assisted work later turned out to be wrong. So this result got attention despite a skeptical backdrop, not because mathematicians have started lowering the bar for AI math headlines.
How the Proof Moved Across Subfields
The mechanism was not “the model reasoned perfectly.” It was “the model tried a different route.” As the reporting describes it, that route was to apply a formula already known in a related area of mathematics to this primitive-sets problem, rather than following the standard sequence of moves earlier researchers used. The available sources do not identify the formula precisely enough to name it here, so the description has to stay at this level.
From there, the process was procedural, not magical:
| Stage | Who did it | What happened |
|---|---|---|
| Initial prompt | Liam Price | Submitted the problem to GPT-5.4 Pro |
| First proof attempt | ChatGPT | Produced a rough proof containing the nonstandard opening move |
| Expert evaluation | Jared Lichtman, Terence Tao, others | Checked whether that move could actually support the theorem |
| Proof cleanup | Human mathematicians | Rewrote, shortened, and organized the argument |
| Formal verification | Math, Inc. Lean repo | Encoded the theorem as a machine-checked proof artifact |
That middle phase is the part most AI headlines skip. Lichtman told Scientific American that the raw ChatGPT output was “actually quite poor” and that experts had to “sift through and actually understand what it was trying to say.”
So the result was not a polished theorem-proof package dropped out of a chatbot. It was a messy draft with one promising move inside it, followed by human interpretation, proof repair, and later formalization.
That chronology also explains why this looks different from previous AI math breakthrough claims around Erdős problems. The public record here includes the original posting, mathematician commentary, a cleaned-up note, and a Lean artifact. You can watch the proof becoming legible.
Breadth-Stuck Problems and Cross-Subfield Search
The evidence here supports a narrower conclusion than “LLMs can do frontier math.” It supports the claim that a model can sometimes help when a problem is stuck because everyone keeps opening the same way.
Tao’s wiki is the reason to keep that conclusion narrow. It explicitly says the list is not a benchmark, warns that assessments are provisional, and tracks incorrect claims too. So Erdős problem #1196 is not proof that general-purpose models are now reliable theorem provers.
What it does show is a workflow that looks plausible and testable:
- a model proposes an off-path opening,
- experts decide whether that opening contains a real idea,
- humans rebuild the argument into mathematical form,
- and a proof assistant can later verify the final structure.
That is a very specific capability: broad analogical search across subfields, followed by expert cleanup and formal verification. On this evidence, that is the part worth taking seriously.
Key Takeaways
- Erdős problem #1196 concerns primitive sets and a weighted sum bound, not a generic “AI solved math” stunt.
- The visible evidence chain is public: Price’s post, mathematician commentary, an 8-page note, and a Lean formalization repository.
- Liam Price surfaced the result, but Tao and Lichtman’s reported view is that the important step was the model’s nonstandard opening move.
- The raw ChatGPT output was, in Lichtman’s words, “actually quite poor,” which makes this a story about expert cleanup and verification, not autonomous theorem proving.
- This case shows that a model can contribute a novel opening move, but only after expert interpretation and later formal verification does that become a legitimate mathematical result.
Further Reading
- Amateur armed with ChatGPT ‘vibe-maths’ a 60-year-old problem. Primary reporting on Liam Price, Tao, Lichtman, and why mathematicians took the proof route seriously.
- AI contributions to Erdős problems. Tao’s tracking wiki, with explicit caveats about provisional assessments, selection bias, and incorrect AI-assisted claims.
- Primitive Sets Above x in Lean. Repository claiming a Lean formalization of the theorem for Erdős Problem #1196.
- Erdős Problem #1196 discussion thread. The forum thread where the proof was posted, discussed, and refined.
