Put your AI forecaster to the test on ForecastBench
Our LLM forecasting accuracy benchmark is now open to public submissions. Here’s how to submit your model to ForecastBench.
(Post written by Houtan Bastani, Simas Kučinskas and Matt Reynolds)
We recently relaunched ForecastBench, our live, contamination-free benchmark of large language model (LLM) forecasting accuracy. ForecastBench is now open to public submissions. The next submission date is November 9, with additional rounds every two weeks thereafter. For more information about ForecastBench, take a look at our launch blog.
Why participate in ForecastBench?
ForecastBench features two leaderboards. The Baseline leaderboard shows out-of-the-box model performance without extra tools or scaffolding, while the Tournament leaderboard allows individuals and teams to enhance models in any way they choose.
Here’s why we think you should submit to ForecastBench:
Demonstrate the practical value of your model
Forecasting is a practical, general-purpose task with applications from geopolitical strategy to high-stakes business planning. LLMs with high forecasting accuracy are valuable in many domains.
Benchmark LLMs without data leakage
Other LLM benchmarks are susceptible to data leakage, where benchmark questions appear in model training datasets. Since forecasting requires LLMs to predict future events with unknown outcomes, accurate predictions on ForecastBench are a signal of genuine capability rather than data leakage.
Compare your model to expert human forecasters
ForecastBench compares LLMs to superforecasters—the very best human forecasters. So far, no model has matched superforecaster performance, but LLM-superforecaster parity may only be a year away. Will your model be the first to cross the Rubicon?
Use any forecasting method
Any LLM-based forecasting method is accepted, from fine-tuned models to bespoke scaffolding.
Submit your model easily
Submitting your model to ForecastBench is straightforward. Simply follow the instructions on our wiki, summarized below, and start forecasting.
Gain public recognition
All qualifying submissions will be added to our public Tournament leaderboard, where your model’s performance will be visible to everyone.
The forecasting timeline
The timeline below shows the period between question set generation and the moment forecasts are due, 23:59:59 UTC on day D. This process repeats on a fortnightly basis.
D - 10: Questions are sampled from the ForecastBench question bank to form the question set.
D at 00:00 UTC: The question set is published to our datasets repository.
D at 23:59:59 UTC: Forecasts are due. Teams have the full day to generate and submit their forecasts to the benchmark.
Forecasts can be submitted every two weeks.
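If you want to sanity-check your own schedule, here is a minimal sketch of the date arithmetic in Python. The 23:59:59 UTC deadline and two-week cadence come from the timeline above; the round date in the example is hypothetical, so confirm yours with us when you register.

```python
from datetime import date, datetime, timedelta, timezone

def submission_deadline(round_date: date) -> datetime:
    """Forecasts are due at 23:59:59 UTC on the round date (day D)."""
    return datetime(round_date.year, round_date.month, round_date.day,
                    23, 59, 59, tzinfo=timezone.utc)

def next_round(round_date: date) -> date:
    """Submission rounds repeat every two weeks."""
    return round_date + timedelta(weeks=2)

# Example with a hypothetical round date -- confirm your due date with the organizers.
d = date(2025, 11, 9)
print(submission_deadline(d).isoformat())  # 2025-11-09T23:59:59+00:00
print(next_round(d))                       # 2025-11-23
```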
How to submit to ForecastBench
We provide detailed instructions on our wiki. Here’s the high-level flow:
Request access. Email forecastbench@forecastingresearch.org with the email addresses that should be authorized to upload your team’s forecasts. We’ll provision a secure GCP folder for your team and confirm your next forecast due date.
Prepare your code. Before the due date, build your pipeline to read the JSON question set (see the data dictionary on the wiki), prompt your model(s) with whatever information you choose to provide, and output forecasts; a minimal sketch of this flow appears after this list.
Test your pipeline. Do a dry run using a previous question set.
Download the question set. At 00:00 UTC on the submission date, you can download the file and begin forecasting.
Generate forecasts. Each question set contains 500 questions in total: 250 market questions and 250 dataset questions.
Upload. Submit your forecast file to your GCP folder by 23:59:59 UTC on the due date.
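To make the flow in step 2 concrete, here is a minimal sketch in Python. The JSON keys ("questions", "id", "question"), the output format, and the placeholder model call are illustrative assumptions, not the official schema; consult the data dictionary on the wiki for the actual question set and forecast file formats.

```python
"""Minimal submission pipeline sketch.

The JSON keys and output format below are illustrative assumptions, not the
official ForecastBench schema -- see the data dictionary on the wiki.
"""
import json


def forecast(question_text: str) -> float:
    """Placeholder: call your LLM or scaffolding here; return a probability in [0, 1]."""
    return 0.5


def run(question_set_path: str, output_path: str) -> None:
    # Load the question set downloaded at 00:00 UTC on the submission date.
    with open(question_set_path) as f:
        question_set = json.load(f)

    # Produce one probability per question (assumed keys: "questions", "id", "question").
    forecasts = [
        {"id": q["id"], "forecast": forecast(q["question"])}
        for q in question_set["questions"]
    ]

    # Write the file you will upload to your team's GCP folder before 23:59:59 UTC.
    with open(output_path, "w") as f:
        json.dump({"forecasts": forecasts}, f, indent=2)


if __name__ == "__main__":
    run("question_set.json", "my_team_forecasts.json")
```

However you structure your pipeline, the key points are to keep each question ID attached to its probability and to match the file format described in the data dictionary before uploading.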
What to expect after submitting
After 1 day: We publish your forecast set for public download.
After 14 days: We begin resolving your forecasts and make the resolved forecasts available for public download.
After 50 days: Your model enters the Tournament leaderboard, ranked by overall score.
Community
Forum: We’ve opened a GitHub discussion forum. If you have any questions or run into problems, you can reach out to us there.
Open source: Submissions need not be open source. However, if you choose to share your code, we’ll feature it on the wiki to help future participants.
Showcase: We plan to invite teams with innovative or high-performing setups to describe their approach in blog posts on the Forecasting Research Institute Substack.
Ready to participate?
Email forecastbench@forecastingresearch.org to get access and submit in the next round on November 9. We’re excited to benchmark your ideas alongside the latest models and human baselines!