Gustavo de Rosa
8e9ebfb9bf
Delete configuration_mixformer_sequential.py
2023-11-16 18:10:30 +00:00
Gustavo de Rosa
271c3397ab
Update to new model interface.
2023-11-16 17:28:06 +00:00
Gustavo de Rosa
92557d03bb
Improves type hinting on configuration arguments.
2023-11-01 23:40:19 +00:00
Gustavo de Rosa
45f4b21525
Enables toggling fused_dense, flash_rotary and attn_pdrop in the configuration.
2023-11-01 23:33:57 +00:00
Gustavo de Rosa
0254d42a95
Fixes flash-attn import with a try/except statement
2023-11-01 23:32:35 +00:00
Gustavo de Rosa
0bbd68a176
Adds support for flash-attn rotary embedding and fused dense layers.
2023-11-01 20:40:12 +00:00
Gustavo de Rosa
de35f900d3
Adds support for MQA/GQA and attention mask during training.
2023-10-30 16:59:12 +00:00
Gustavo de Rosa
d38e6f954e
Update modeling_mixformer_sequential.py
Removes print regarding attention_mask to prevent excessive information from being logged.
2023-10-26 20:01:15 +00:00
Gustavo de Rosa
8091327f9e
Adding _set_gradient_checkpointing for compatibility (#22)
- Adding _set_gradient_checkpointing for compatibility (a30a931294ac0f344a0c1547877c692ceb17123c)
Co-authored-by: Vicente Rivera <vriveras@users.noreply.huggingface.co>
2023-10-17 12:11:30 +00:00
Gustavo de Rosa
b6a7e2fe15
Upload modeling_mixformer_sequential.py
2023-09-27 15:22:44 +00:00
Gustavo de Rosa
8ab0f29ff6
Add more precise license metadata (UI will be cleaner!) (#35)
- Add more precise license metadata (UI will be cleaner!) (2c182742af8c7c93f0f4ee1180232a5d0c114958)
Co-authored-by: Julien Chaumond <julien-c@users.noreply.huggingface.co>
2023-09-27 15:20:42 +00:00
Gustavo de Rosa
bc09a085e7
Upload README.md
2023-09-27 14:04:07 +00:00
Gustavo de Rosa
f9f2ac7c45
fix(phi-1_5): Checks length of attention_mask if it is passed as a direct tensor.
2023-09-26 21:21:45 +00:00
Gustavo de Rosa
3128bb636a
Support for attention_mask in forward pass.
This commit implements the following:
- Cleans up unused arguments and definitions.
- Adds support for `attention_mask`.
- Adds support for cached inference.
2023-09-26 18:17:08 +00:00
Gustavo de Rosa
4a426d8015
add _no_split_modules property (#17)
- add _no_split_modules property (7e925ddfdf2d1bb29fc26db755aafd77fb8f565e)
Co-authored-by: wing lian <winglian@users.noreply.huggingface.co>
2023-09-15 22:57:07 +00:00
Gunasekar
7d482ddf93
Update README.md
2023-09-14 00:44:40 +00:00
Gunasekar
c8f6ad8189
Update README.md
2023-09-12 18:40:56 +00:00
Gustavo de Rosa
762a3110be
Link paper to arXiv (#5)
- Link paper to arXiv (c30653547e6bbdc00a068e538a7f84ed568d1918)
Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.huggingface.co>
2023-09-12 16:01:41 +00:00
Gunasekar
ea95720a35
Update README.md
2023-09-12 01:38:42 +00:00
Gunasekar
4bba51c9b5
Update README.md
2023-09-11 21:45:49 +00:00
Gunasekar
52e294acfe
Update README.md
2023-09-11 21:44:15 +00:00
Gunasekar
9efbcafbe4
Upload tokenizer
2023-09-11 21:30:53 +00:00
Gunasekar
d655135ca1
Upload MixFormerSequentialForCausalLM
2023-09-11 21:30:53 +00:00
Gunasekar
07a048efa7
Update README.md
2023-09-11 07:57:24 +00:00
Gunasekar
b63051536f
Update README.md
2023-09-11 07:56:12 +00:00
Gunasekar
40b496f7e0
Update README.md
2023-09-11 07:50:39 +00:00
Gunasekar
d9c7521001
Update README.md
2023-09-11 07:46:06 +00:00
Gunasekar
6ddac37bb9
Update README.md
2023-09-11 07:35:26 +00:00
Gunasekar
cd4510ca85
Update README.md
2023-09-11 07:33:48 +00:00
Gunasekar
34046b03b7
Update README.md
2023-09-11 07:32:34 +00:00
Gunasekar
24ad69c3c0
Update README.md
2023-09-11 02:12:39 +00:00
Gunasekar
b3d67f3c44
Update README.md
2023-09-11 01:01:17 +00:00
Gunasekar
14be6562c1
Upload Research License.docx
2023-09-11 01:00:01 +00:00
Gunasekar
6157c47c1f
Upload tokenizer
2023-09-10 06:28:52 +00:00
Gunasekar
e656142af4
Upload MixFormerSequentialForCausalLM
2023-09-10 06:28:51 +00:00
Gunasekar
4b752e7b2d
Upload tokenizer
2023-09-10 06:16:33 +00:00
Gunasekar
2bfd6ef82c
Upload MixFormerSequentialForCausalLM
2023-09-10 06:16:29 +00:00
Gunasekar
67f350b99d
Upload tokenizer
2023-09-10 06:15:56 +00:00
Gunasekar
ba44a904e2
Upload MixFormerSequentialForCausalLM
2023-09-10 06:15:55 +00:00
Gunasekar
67a43eb1b5
Upload tokenizer
2023-09-10 05:42:14 +00:00
Gunasekar
16982066f0
Upload MixFormerSequentialForCausalLM
2023-09-10 05:42:14 +00:00
Gunasekar
98416e6398
initial commit
2023-09-10 04:03:46 +00:00