From 030a14dfda27146c6baf2fc8c5ac5e12039b4e25 Mon Sep 17 00:00:00 2001
From: Jasper
Date: Tue, 17 Mar 2026 09:58:17 +0000
Subject: [PATCH] Add MathArena evaluation result for aime/aime_2026

This PR adds a new MathArena evaluation result so it can be indexed on the model leaderboard page.

Model: Qwen/Qwen3.5-9B
Competition dataset id: MathArena/aime_2026
Score: 92.50
Result file: .eval_results/MathArena--aime_2026.yaml

The results are the same as the ones displayed on [our webpage](https://matharena.ai/?view=problem&comp=aime--aime_2026).

Note: this is an experimental feature; we are still working to make it run as smoothly as possible.
---
 .eval_results/MathArena--aime_2026.yaml | 8 ++++++++
 1 file changed, 8 insertions(+)
 create mode 100644 .eval_results/MathArena--aime_2026.yaml

diff --git a/.eval_results/MathArena--aime_2026.yaml b/.eval_results/MathArena--aime_2026.yaml
new file mode 100644
index 0000000..2750d84
--- /dev/null
+++ b/.eval_results/MathArena--aime_2026.yaml
@@ -0,0 +1,8 @@
+- dataset:
+    id: MathArena/aime_2026
+    task_id: MathArena/aime_2026
+  value: 92.5
+  date: '2026-03-17'
+  source:
+    url: https://matharena.ai/?comp=aime--aime_2026
+    name: Official MathArena Evaluation