Go to file
2024-07-29 09:52:30 +00:00
data Upload dataset 2024-07-27 18:34:53 +00:00
.gitattributes initial commit 2024-07-27 18:34:47 +00:00
README.md Update README.md 2024-07-29 09:52:30 +00:00

dataset_info configs
features splits download_size dataset_size
name list
conversations
name dtype
from string
name dtype
value string
name dtype
source string
name dtype
score float64
name num_bytes num_examples
train 239650960.7474458 100000
116531415 239650960.7474458
config_name data_files
default
split path
train data/train-*

FineTome-100k

image/jpeg

The FineTome dataset is a subset of arcee-ai/The-Tome (without arcee-ai/qwen2-72b-magpie-en), re-filtered using HuggingFaceFW/fineweb-edu-classifier.

It was made for my article "Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth".