How well can large language models predict…

Oct 8

We’ve just released an updated version of ForecastBench, our LLM forecasting benchmark. Here’s what the new results reveal about the accuracy of state-of-the-art models.

3 Comments

Tom Coyne

Oct 9

So happy to see you’ve started a Substack to more widely share your great work!

Tom Coyne

Oct 9

Questions: what happens when you extremize the public and superforecaster forecasts, rather than using the median? And what happens when you do the same with forecasts from multiple LLMs?

Chris

Oct 9

"LLMs have surpassed the general public"

If a finance analyst ends up informally leaning on a “Claude” take, not realising that so many other analysts are doing the same, it’s an AI herd. And herds crash.

A dodgy AI owner may even game the system :-)
