# Ko-PIQA: Korean Physical Commonsense Reasoning Dataset

[![arXiv](https://img.shields.io/badge/arXiv-2509.11303-b31b1b.svg)](https://arxiv.org/abs/2509.11303)

## 📖 Dataset Overview

Ko-PIQA is a **Korean Physical Commonsense Reasoning** dataset designed to complement English-centric benchmarks like PIQA and to include culturally-grounded physical reasoning questions.

- **Total items:** 441
- **Culturally-grounded items:** 87 (19.7%) (e.g., kimchi storage, hanbok care, ondol heating)
- **Format:** PIQA-style binary choice (`solution0` / `solution1`)
- **Goal:** Evaluate the physical reasoning capabilities of Korean LLMs

---

## 📊 Data Fields

| Field       | Type     | Description |
|-------------|----------|-------------|
| `prompt`    | string   | The goal or question |
| `solution0` | string   | Candidate answer A |
| `solution1` | string   | Candidate answer B |
| `label`     | int      | Correct answer index (`0` or `1`) |
| `cultural`  | int/null | `1` if culturally-grounded, otherwise `null` |

---

## 🔎 Source & Filtering Pipeline

- **Source:** 3.01M Korean Q&A pairs from Naver Knowledge iN (collected until May 2025)
- **Step 1:** Filtered PIQA-style questions using Qwen3-4B, Qwen3-32B, and HCX-14B → 11,553 candidates
- **Step 2:** Sampled 600 general and 158 cultural questions
- **Step 3:** Refined and generated distractors using GPT-4o
- **Step 4:** Two native Korean speakers validated and filtered questions → 471 items
- **Step 5:** Deduplicated using KoSentenceBERT (cosine similarity > 0.85) → **final 441 items**

---

## 💡 Example

```json
{
  "prompt": "김치찌개를 끓일 때 묵은지의 신맛을 중화시키면서도 깊은 맛을 내려면?",
  "solution0": "설탕을 한 스푼 넣고 물을 부은 후 중불에서 5분간 끓인다.",
  "solution1": "설탕을 한 스푼 넣고 중불에서 5분간 먼저 볶은 후 물을 붓는다.",
  "label": 1,
  "cultural": 1
}
```

In English, the prompt asks how to neutralize the sourness of aged kimchi while still bringing out a deep flavor when making kimchi stew. `solution0` adds a spoonful of sugar, pours in water, and boils over medium heat for 5 minutes; `solution1` adds a spoonful of sugar, stir-fries over medium heat for 5 minutes first, and then pours in water. The correct answer is `solution1` (`label` is `1`).

## 💻 Usage

```python
from datasets import load_dataset

ds = load_dataset("HAERAE-HUB/Ko-PIQA")
print(ds['train'][0])
```

## 📌 Citation

```
@misc{choi2025kopiqakoreanphysicalcommonsense,
      title={Ko-PIQA: A Korean Physical Commonsense Reasoning Dataset with Cultural Context},
      author={Dasol Choi and Jungwhan Kim and Guijin Son},
      year={2025},
      eprint={2509.11303},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.11303},
}
```
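
## 🧪 Evaluation Sketch

As one possible way to use the data fields, the sketch below scores binary-choice predictions against `label` and reports accuracy separately for the culturally-grounded subset (`cultural == 1`). The `accuracy_report` helper, the `always_pick_second` baseline, and the toy rows are hypothetical illustrations, not part of the dataset or an official evaluation script; with the real data, rows loaded through `load_dataset` would take the place of `toy_rows`.

```python
from typing import Callable, Dict, Iterable, Optional

Row = Dict[str, object]


def accuracy_report(
    rows: Iterable[Row], predict: Callable[[Row], int]
) -> Dict[str, Optional[float]]:
    """Return accuracy overall and on the culturally-grounded subset."""
    total = correct = 0
    cultural_total = cultural_correct = 0
    for row in rows:
        # A prediction is a hit when it matches the gold index (0 or 1).
        hit = int(predict(row) == row["label"])
        total += 1
        correct += hit
        # `cultural` is 1 for culturally-grounded items and null/None otherwise.
        if row.get("cultural") == 1:
            cultural_total += 1
            cultural_correct += hit
    return {
        "overall": correct / total if total else None,
        "cultural": cultural_correct / cultural_total if cultural_total else None,
    }


# Toy rows that mimic the dataset schema (hypothetical, not real items).
toy_rows = [
    {"prompt": "q1", "solution0": "a", "solution1": "b", "label": 1, "cultural": 1},
    {"prompt": "q2", "solution0": "a", "solution1": "b", "label": 0, "cultural": None},
    {"prompt": "q3", "solution0": "a", "solution1": "b", "label": 1, "cultural": None},
    {"prompt": "q4", "solution0": "a", "solution1": "b", "label": 0, "cultural": 1},
]


def always_pick_second(row: Row) -> int:
    """Trivial baseline standing in for a model: always choose solution1."""
    return 1


report = accuracy_report(toy_rows, always_pick_second)
print(report)
```

A real `predict` function would typically format `prompt`, `solution0`, and `solution1` into a model query and map the model's answer back to `0` or `1`.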
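
## 🧹 Deduplication Sketch

Step 5 of the filtering pipeline removes near-duplicate questions whose embedding cosine similarity exceeds 0.85. The following is a minimal sketch of how such a filter could work, assuming a greedy keep-first strategy and precomputed embeddings. The README does not state which item of a duplicate pair was kept, so the strategy is an assumption; `dedupe_by_cosine` is a hypothetical helper name, and the toy 2-D vectors stand in for KoSentenceBERT sentence embeddings.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dedupe_by_cosine(
    embeddings: Sequence[Sequence[float]], threshold: float = 0.85
) -> List[int]:
    """Greedily keep an item only if it is not too similar to any kept item."""
    kept: List[int] = []
    for i, emb in enumerate(embeddings):
        # Assumption: the earlier item wins when two items exceed the threshold.
        if all(cosine(emb, embeddings[j]) <= threshold for j in kept):
            kept.append(i)
    return kept


# Toy embeddings (hypothetical): item 1 is a near-duplicate of item 0.
toy_embeddings = [
    [1.0, 0.0],
    [0.99, 0.1],
    [0.0, 1.0],
]

kept_indices = dedupe_by_cosine(toy_embeddings)
print(kept_indices)
```

This pairwise loop is quadratic in the number of items, which is manageable at the scale of a few hundred validated questions.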