titaiwang
2066613d0b
Prototype of unblocking ONNX export
2023-11-02 21:19:36 +00:00
Gustavo de Rosa
0254d42a95
Fixes flash-attn import with a try/except statement
2023-11-01 23:32:35 +00:00
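The guarded import described above can be sketched as follows. This is an illustrative pattern, not the repo's exact code; the imported name `flash_attn_func` is an assumption about what `flash_attn` exposes.

```python
# Sketch of a try/except guard around an optional dependency.
# If flash-attn is not installed, the model falls back gracefully
# instead of failing at import time.
try:
    from flash_attn import flash_attn_func  # name is an assumption
    FLASH_ATTN_AVAILABLE = True
except ImportError:
    flash_attn_func = None
    FLASH_ATTN_AVAILABLE = False
```

Call sites can then branch on `FLASH_ATTN_AVAILABLE` (or check `flash_attn_func is not None`) to pick between the fused kernel and a plain PyTorch attention path.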
Gustavo de Rosa
0bbd68a176
Adds support for flash-attn rotary embedding and fused dense layers.
2023-11-01 20:40:12 +00:00
Gustavo de Rosa
de35f900d3
Adds support for MQA/GQA and attention mask during training.
2023-10-30 16:59:12 +00:00
Gustavo de Rosa
d38e6f954e
Update modeling_mixformer_sequential.py
...
Removes print regarding attention_mask to prevent excessive information from being logged.
2023-10-26 20:01:15 +00:00
Gustavo de Rosa
8091327f9e
Adding _set_gradient_checkpointing for compatibility (#22)
...
- Adding _set_gradient_checkpointing for compatibility (a30a931294ac0f344a0c1547877c692ceb17123c)
Co-authored-by: Vicente Rivera <vriveras@users.noreply.huggingface.co>
2023-10-17 12:11:30 +00:00
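A minimal sketch of the hook this commit adds, following the `transformers` convention of the time (real models subclass `PreTrainedModel`; the class and attribute names here are illustrative):

```python
class PreTrainedSketch:
    """Minimal stand-in for a transformers PreTrainedModel subclass."""

    supports_gradient_checkpointing = True

    def _set_gradient_checkpointing(self, module, value=False):
        # Toggle gradient checkpointing on any submodule that exposes the flag;
        # transformers calls this for each module when the user enables/disables
        # checkpointing on the model.
        if hasattr(module, "gradient_checkpointing"):
            module.gradient_checkpointing = value
```

With this in place, `model.gradient_checkpointing_enable()` can walk the module tree and flip the flag on each transformer block.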
Gustavo de Rosa
b6a7e2fe15
Upload modeling_mixformer_sequential.py
2023-09-27 15:22:44 +00:00
Gustavo de Rosa
f9f2ac7c45
fix(phi-1_5): Checks length of attention_mask if it is passed as a direct tensor.
2023-09-26 21:21:45 +00:00
Gustavo de Rosa
3128bb636a
Support for attention_mask in forward pass.
...
This commit implements the following:
- Cleans up unused arguments and definitions.
- Adds support for `attention_mask`.
- Adds support for cached inference.
2023-09-26 18:17:08 +00:00
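One common way to wire `attention_mask` into a causal forward pass is to fold the padding mask and the causal mask into a single additive bias. A hypothetical helper, not the repo's implementation:

```python
import torch


def build_causal_bias(attention_mask: torch.Tensor) -> torch.Tensor:
    """Combine a padding mask of shape (batch, seq) with a causal mask
    into an additive attention bias of shape (batch, 1, seq, seq)."""
    bsz, seq_len = attention_mask.shape
    # Lower-triangular causal mask: position i may attend to positions <= i.
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    # Broadcast: (1, 1, seq, seq) AND (batch, 1, 1, seq) -> (batch, 1, seq, seq).
    keep = causal[None, None] & attention_mask[:, None, None, :].bool()
    # 0 where attention is allowed, -inf where it is masked out.
    return torch.where(keep, 0.0, float("-inf"))
```

The resulting bias is added to the raw attention scores before softmax; for cached inference, the same idea applies with the query length shorter than the key length.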
Gustavo de Rosa
4a426d8015
add _no_split_modules property (#17)
...
- add _no_split_modules property (7e925ddfdf2d1bb29fc26db755aafd77fb8f565e)
Co-authored-by: wing lian <winglian@users.noreply.huggingface.co>
2023-09-15 22:57:07 +00:00
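`_no_split_modules` is the transformers/accelerate convention for listing module classes that must not be sharded across devices under `device_map="auto"`. A sketch, where the block name `"ParallelBlock"` is an assumption about this model's layer class:

```python
class MixFormerSequentialPreTrainedModel:
    # Module classes that accelerate must keep whole on a single device
    # when dispatching the model across GPUs/CPU with device_map="auto".
    _no_split_modules = ["ParallelBlock"]  # layer name is an assumption
```

Listing the whole transformer block (rather than its sublayers) keeps each block's attention and MLP on the same device, avoiding cross-device transfers inside a layer.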
Gunasekar
d655135ca1
Upload MixFormerSequentialForCausalLM
2023-09-11 21:30:53 +00:00
Gunasekar
16982066f0
Upload MixFormerSequentialForCausalLM
2023-09-10 05:42:14 +00:00