This PR adds a new MathArena evaluation result so that it can be indexed on the model leaderboard page.

- Model: Qwen/Qwen3.5-9B
- Competition dataset id: MathArena/hmmt_feb_2026
- Score: 71.21
- Result file: .eval_results/MathArena--hmmt_feb_2026.yaml

The results match those displayed on [our webpage](https://matharena.ai/?view=problem&comp=hmmt--hmmt_feb_2026).

Note: this is an experimental feature, and we are still working to make it run as smoothly as possible.
9 lines
222 B
YAML
- dataset:
    id: MathArena/hmmt_feb_2026
    task_id: MathArena/hmmt_feb_2026
  value: 71.21
  date: '2026-03-17'
  source:
    url: https://matharena.ai/?comp=hmmt--hmmt_feb_2026
    name: Official MathArena Evaluation
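A result entry like the one in this file can be sanity-checked before merging. The sketch below is only an illustration: `check_entry` and its rules are hypothetical, not the leaderboard's actual schema, and it assumes the score is a percentage between 0 and 100. The entry is written as a plain Python dict so that no YAML parser is needed.

```python
from datetime import date
from urllib.parse import urlparse

# The entry from this PR, mirrored as a plain Python structure.
entry = {
    "dataset": {
        "id": "MathArena/hmmt_feb_2026",
        "task_id": "MathArena/hmmt_feb_2026",
    },
    "value": 71.21,
    "date": "2026-03-17",
    "source": {
        "url": "https://matharena.ai/?comp=hmmt--hmmt_feb_2026",
        "name": "Official MathArena Evaluation",
    },
}


def check_entry(entry: dict) -> list[str]:
    """Return a list of problems found (hypothetical checks, not the official schema)."""
    problems = []

    # Both dataset identifiers should be present and non-empty.
    dataset = entry.get("dataset", {})
    for key in ("id", "task_id"):
        if not dataset.get(key):
            problems.append(f"dataset.{key} is missing")

    # Assumption: the score is a percentage, so it must lie in [0, 100].
    value = entry.get("value")
    if not isinstance(value, (int, float)) or not 0 <= value <= 100:
        problems.append("value should be a number between 0 and 100")

    # The date should parse as an ISO calendar date such as 2026-03-17.
    try:
        date.fromisoformat(str(entry.get("date")))
    except ValueError:
        problems.append("date is not a valid ISO date")

    # The source URL should be an https URL with a host.
    url = urlparse(entry.get("source", {}).get("url", ""))
    if url.scheme != "https" or not url.netloc:
        problems.append("source.url is not an https URL")

    return problems


print(check_entry(entry))  # prints [] for the entry above
```

Returning a list of messages rather than raising on the first failure lets a reviewer see every issue with an entry at once.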