Maxime Labonne c2343c1372

2024-07-29 09:52:30 +00:00

dataset_info

configs

features

splits

download_size

dataset_size

name

list

conversations

name	dtype
from	string

name	dtype
value	string

name	dtype
source	string

name	dtype
score	float64

name	num_bytes	num_examples
train	239650960.7474458	100000

116531415

239650960.7474458

config_name

data_files

default

split	path
train	data/train-*

FineTome-100k

The FineTome dataset is a subset of arcee-ai/The-Tome (without arcee-ai/qwen2-72b-magpie-en), re-filtered using HuggingFaceFW/fineweb-edu-classifier.