diff --git a/README.md b/README.md
new file mode 100644
index 0000000..3d6b6e0
--- /dev/null
+++ b/README.md
@@ -0,0 +1,46 @@
+---
+tags:
+- generated_from_trainer
+model-index:
+- name: mistral-1L-tiny
+  results: []
+---
+
+
+
+# mistral-1L-tiny
+
+This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.0006
+- train_batch_size: 64
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- num_epochs: 3.0
+
+### Framework versions
+
+- Transformers 4.38.1
+- Pytorch 2.2.0+cu121
+- Datasets 2.17.1
+- Tokenizers 0.15.2
diff --git a/generation_config.json b/generation_config.json
new file mode 100644
index 0000000..0b1a873
--- /dev/null
+++ b/generation_config.json
@@ -0,0 +1,6 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "transformers_version": "4.38.1"
+}
diff --git a/model.safetensors b/model.safetensors
index d6ea94f..b5884cd 100644
--- a/model.safetensors
+++ b/model.safetensors
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:dd5756bf16cbd7a5bbfd3677e2bf34a3458373f674a252d7d930dfbeec77a5d6
+oid sha256:9a565a3989b85cc64b7e0013120b3252fcff093b749f49485e005151bad8462f
 size 140516640