About

Epoch AI is a research nonprofit investigating the trajectory and future of artificial intelligence. We are the creators of FrontierMath, a recognized AI benchmark made of research-level math problems.

We are now seeking problem contributors for FrontierMath: Open Problems, a new kind of AI benchmark consisting of unsolved math problems that have resisted serious attempts by professional mathematicians. An AI solution to any of these problems would meaningfully advance the state of human mathematical knowledge.

We have already launched the pilot and maintain a live scoreboard of open problems across fields like Combinatorics, Number Theory, and Topology, including notability ratings and "warm-up" versions for AI testing.

We think the 3b1b audience will be able to come up with some great problems!

Propose a problem

The Challenge

We are commissioning math problems that satisfy the following criteria:

Next Steps

After you propose a problem, next steps typically consist of a brief back-and-forth about problem details, followed by us offering you a compensated contract to produce the full problem package, consisting of the following:

Compensation varies with how much work the package takes to produce, which is usually driven by the complexity of the verifier.

Watch Greg Burnham, lead of the project, present FrontierMath: Open Problems and explain why it matters. This was an internal chat held before the pilot was launched.

Solutions can be verified automatically

Evaluating AI solutions to unsolved math problems is a major logistical challenge. Math research typically proceeds via natural-language papers, and evaluating such papers is labor-intensive and error-prone even for humans. While AI systems have made progress at evaluating natural-language mathematics, we cannot rely on the accuracy of their evaluations for advanced material (see, e.g., here for work on AI systems grading prose proofs). Our approach is to find problems where, even though no solution is currently known, a proposed solution can be checked by a relatively straightforward computer program running on a typical computer. It is not obvious that such verifiable problems exist, but they do:
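To make the idea of a verifier concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the benchmark: the choice of problem (a Ramsey-type lower bound shown by an explicit graph) and the names `verify_ramsey_witness`, `has_uniform_subset`, and `cycle_graph` are all assumptions made for this example. The point is that finding a good graph may be very hard, while checking a submitted graph is a short brute-force loop.

```python
from itertools import combinations

def has_uniform_subset(adj, k, want_edge):
    """True if some k vertices are pairwise adjacent (want_edge=True)
    or pairwise non-adjacent (want_edge=False)."""
    for subset in combinations(range(len(adj)), k):
        if all(adj[u][v] == want_edge for u, v in combinations(subset, 2)):
            return True
    return False

def verify_ramsey_witness(adj, s, t):
    """Hypothetical verifier: accept an adjacency matrix only if it is a
    simple graph with no clique of size s and no independent set of size t.
    A valid graph on n vertices would show R(s, t) > n."""
    n = len(adj)
    # Reject malformed input: non-square, self-loops, or asymmetric entries.
    for i in range(n):
        if len(adj[i]) != n or adj[i][i]:
            return False
        for j in range(n):
            if adj[i][j] != adj[j][i]:
                return False
    return (not has_uniform_subset(adj, s, True)
            and not has_uniform_subset(adj, t, False))

def cycle_graph(n):
    """Adjacency matrix of the n-cycle, used here as a toy submission."""
    return [[abs(i - j) in (1, n - 1) for j in range(n)] for i in range(n)]

# The 5-cycle has no triangle and no independent set of size 3.
print(verify_ramsey_witness(cycle_graph(5), 3, 3))
# The 6-cycle contains the independent set {0, 2, 4}, so it is rejected.
print(verify_ramsey_witness(cycle_graph(6), 3, 3))
```

The brute-force check is exponential in the subset size, which is acceptable only because the sizes in this toy case are tiny. A verifier for a real problem would need to be designed so that checking stays cheap on a typical computer.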

The downside is that this approach limits what we can ask about, but we have been pleasantly surprised by how readily mathematicians have come up with a diverse range of mathematically meaningful problems that satisfy this verifiability constraint. Problems in the pilot span a range of notability, estimated time to solve, number of mathematicians who have attempted them, and topic areas:

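[Figure: pilot problems by notability, estimated time to solve, number of mathematicians who have attempted them, and topic area]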

For more context

IEEE Spectrum recently interviewed project lead Greg Burnham, situating this new benchmark in relation to Epoch's prior FrontierMath Tiers 1-4, as well as Aletheia and First Proof.

For a deeper dive into why this project matters, listen to Greg Burnham and Prof. Daniel Litt discuss the jagged frontier of AI math capabilities, including the value of "expert-level" problems that resist standard optimization: