Upload tokenizer (#1)

- Upload tokenizer (78f12abcb79c7c69e0c42f9dda00e1371419e08a)


Co-authored-by: Arthur Zucker <ArthurZ@users.noreply.huggingface.co>
Patrick von Platen 2024-05-22 15:17:17 +00:00 committed by system
parent 73be8467cd
commit a3cd77bcd2
No known key found for this signature in database
GPG Key ID: 6A528E38E0733467
4 changed files with 105005 additions and 0 deletions

23
special_tokens_map.json Normal file

@@ -0,0 +1,23 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
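
For readers who want to inspect the map above without loading the full tokenizer, here is a minimal sketch using only the Python standard library. The JSON is copied inline from the diff so the snippet is self-contained; the helper name `special_token_strings` is hypothetical and not part of this commit or of any library.

```python
import json

# Contents of special_tokens_map.json, copied from the diff above.
SPECIAL_TOKENS_MAP = """
{
  "bos_token": {"content": "<s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "eos_token": {"content": "</s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "unk_token": {"content": "<unk>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false}
}
"""


def special_token_strings(raw: str) -> dict[str, str]:
    """Map each role (e.g. "bos_token") to its literal token string.

    Hypothetical helper for illustration only.
    """
    entries = json.loads(raw)
    return {role: spec["content"] for role, spec in entries.items()}


tokens = special_token_strings(SPECIAL_TOKENS_MAP)
print(tokens)
```

In a checkout of the repository, the same helper could presumably be fed the text of `special_tokens_map.json` read from disk instead of the inline string.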

98793
tokenizer.json Normal file

File diff suppressed because it is too large

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

6186
tokenizer_config.json Normal file

File diff suppressed because it is too large