SafeBench

Up to $500,000 in prizes for ML Safety benchmark ideas.

Get started

Open Oct 2022–August 2023

View example ideas

Benchmarks should relate to Monitoring, Robustness, Alignment, or Safety Applications. You could concretize one of the example directions we've provided or propose a novel idea.

Ideas

Develop a benchmark

Submit a research paper or formalize your idea in a write-up using our guidelines. You are not required to provide a dataset with your submission.

Guidelines

Submit

Oct 2022–August 31, 2023

Early submissions will be eligible for feedback and resubmission.

Submit

Judging

Oct 2022–Oct 2023

We will award $100,000 for outstanding benchmark ideas and $50,000 for good benchmark ideas. We will award at least $100,000 and up to $500,000 in total prizes, depending on the quality of the ideas submitted. For especially promising proposals, we may offer funding for data collection and engineering expertise so that teams can build their benchmark. Submissions will be judged by Dan Hendrycks, Paul Christiano, and Collin Burns.

Proposals and Winners Made Public

~Oct 2023

Authors can choose to keep their proposals private during the competition (for example, if they are preparing a research paper); however, for transparency, all proposals, winning or not, and the names of the winners will eventually be made public. By default, we will release proposals several months after the competition ends, though this can be negotiated on a case-by-case basis.

FAQ

What information should I include in my proposal? How will submissions be evaluated?
Please refer to the guidelines page.

Why will submissions be made public?
Benchmark ideas cannot be evaluated in a ‘clear-cut’ manner. So, to keep the awarding of prizes transparent, we will eventually make all final submissions public and release the names of the winning authors.

Can I submit more than one entry?
Yes. Submissions will be evaluated independently.

Can I participate in a team?
Yes. For winning teams, prizes will be divided evenly among the authors.

May I submit a paper that I’ve already published?
Yes; however, our aim is to encourage the development of new benchmarks, so we are unlikely to award prizes for papers that predate this competition.

What qualifies as a ‘safety’-related benchmark?
We are ultimately looking for benchmarks that help to reduce high-consequence risks from advanced AI. To provide further guidance, we’ve listed categories (Robustness, Monitoring, Alignment, and Safety Applications) and example ideas.

Why focus on risks from more ‘advanced’ AI systems?
First, much of current safety research will probably be relevant to making more advanced systems safe. Current risks may simply become more extreme with more capable and autonomous systems. For example, ensuring that language models don’t produce harmful text could transition to ensuring AI virtual assistants don’t take harmful actions. We think current safety issues are important, but we focus on risks from more advanced AI systems because we expect them to be even more consequential.