DApp Store | Web3 Hub for Events & Games

Trending topics

How well do modern LLMs predict the future? An evaluation tested them on ~300 Kalshi prediction markets. Claude Opus 4.5 performed best. Its Brier score (the mean squared error of its predicted probabilities, lower is better) of ~0.23 still trails human superforecasters (0.15-0.2) but is approaching that range.

The evaluation covered Oct-Nov 2025. Gemini 3 Pro wasn't included in the comparison, and GPT 5.2 XHigh disappointed. Source:

(ForecastBench is a similar effort, but it is stale and doesn't include the new models.)
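
For anyone unfamiliar with the Brier score mentioned above, here is a minimal sketch of how it is computed, assuming binary markets that resolve to 0 or 1. The function name and the sample numbers are made up for illustration and are not taken from the evaluation.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Hypothetical forecasts and resolutions, for illustration only.
forecasts = [0.9, 0.2, 0.6, 0.5]
resolved = [1, 0, 0, 1]
score = brier_score(forecasts, resolved)

# Baseline: always answering 50% scores 0.25 no matter how markets resolve.
coin_flip = brier_score([0.5] * len(resolved), resolved)
```

A perfect forecaster scores 0, and always answering 50% scores 0.25, which is why a score around 0.23 reads as only modestly better than an uninformed guess.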
