Update README.md

Maxime Labonne 2024-07-27 18:55:42 +00:00 committed by system
parent fce49a43bf
commit 164a00608a

@@ -23,3 +23,9 @@ configs:
- split: train
path: data/train-*
---
# FineTome-100k
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/75I3ffI4XnRlheOQ7kNJ3.jpeg)
The FineTome dataset is a subset of [arcee-ai/The-Tome](https://huggingface.co/datasets/arcee-ai/The-Tome) (without arcee-ai/qwen2-72b-magpie-en), re-filtered using [HuggingFaceFW/fineweb-edu-classifier](https://huggingface.co/HuggingFaceFW/fineweb-edu-classifier).
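
The README does not state how classifier scores were turned into a selection rule, so the following is only a minimal sketch of score-based re-filtering, assuming the highest-scoring samples are kept. The `text` field, the `toy_score` function, and the value of `k` are hypothetical stand-ins; in practice the score would come from a quality classifier rather than a word count.

```python
import heapq
from typing import Callable


def refilter_top_k(samples: list[dict], score_fn: Callable[[dict], float], k: int) -> list[dict]:
    """Return the k highest-scoring samples, best first."""
    return heapq.nlargest(k, samples, key=score_fn)


def toy_score(sample: dict) -> float:
    # Hypothetical stand-in for a classifier score:
    # counts distinct lowercase words in the sample text.
    return float(len(set(sample["text"].lower().split())))


# Hypothetical toy records; the real dataset schema may differ.
samples = [
    {"id": 0, "text": "hi"},
    {"id": 1, "text": "Photosynthesis converts light energy into chemical energy"},
    {"id": 2, "text": "ok ok ok"},
    {"id": 3, "text": "Gradient descent updates parameters along the negative gradient"},
]

subset = refilter_top_k(samples, toy_score, k=2)
print([s["id"] for s in subset])
```

Swapping `toy_score` for a real scoring function leaves the selection logic unchanged, which is why the scorer is passed in as a parameter.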