mirai-llm.github.io - MIRAI: Evaluating LLM Agents for Event Forecasting

(Overview of LLM agent forecasting with the MIRAI benchmark)

MIRAI is a benchmark for evaluating LLM agents on temporal forecasting of international events, a task that calls for tool use and complex reasoning. We finalized a collection of 991,759 GDELT event records, corresponding to 59,161 unique events and 296,630 unique news articles. Our test set contains 705 query-answer pairs, each asking for a forecast of an event between two countries at a given timestamp, along with a balanced test subset of 100 pairs.
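To make the shape of a forecasting query and the scale of the data more concrete, here is a minimal Python sketch. The `ForecastQuery` class, its field names, and the example values are hypothetical illustrations and do not come from the MIRAI codebase; only the counts are taken from the paragraph above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ForecastQuery:
    """Hypothetical shape of one test query (field names are assumptions)."""
    query_date: date      # timestamp of the event to forecast
    country_a: str        # first country in the pair
    country_b: str        # second country in the pair


# Counts quoted in the overview paragraph.
NUM_EVENT_RECORDS = 991_759
NUM_UNIQUE_EVENTS = 59_161
NUM_NEWS_ARTICLES = 296_630
NUM_TEST_QUERIES = 705


def records_per_unique_event() -> float:
    """Average number of GDELT records that map to one unique event."""
    return NUM_EVENT_RECORDS / NUM_UNIQUE_EVENTS


def articles_per_unique_event() -> float:
    """Average number of unique news articles per unique event."""
    return NUM_NEWS_ARTICLES / NUM_UNIQUE_EVENTS


if __name__ == "__main__":
    # Made-up example query, for illustration only.
    example = ForecastQuery(date(2024, 1, 1), "Country A", "Country B")
    print(example)
    print(f"records per unique event:  {records_per_unique_event():.2f}")
    print(f"articles per unique event: {articles_per_unique_event():.2f}")
```

The ratios are simple derived averages (roughly 17 records and 5 articles per unique event); they say nothing about how records or articles are actually distributed across events.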

MIRAI comprehensively covers global event data. The circular chart shows the relation hierarchy and distribution in MIRAI.
