Gustavo de Rosa
92557d03bb
Improves type hinting on configuration arguments.
2023-11-01 23:40:19 +00:00
Gustavo de Rosa
45f4b21525
Enables toggling fused_dense, flash_rotary and attn_pdrop in the configuration.
2023-11-01 23:33:57 +00:00
Gustavo de Rosa
de35f900d3
Adds support for MQA/GQA and attention mask during training.
2023-10-30 16:59:12 +00:00
Gustavo de Rosa
3128bb636a
Support for attention_mask in forward pass.
...
This commit implements the following:
- Cleans up unused arguments and definitions.
- Adds support for `attention_mask`.
- Adds support for cached inference.
2023-09-26 18:17:08 +00:00
Gunasekar
16982066f0
Upload MixFormerSequentialForCausalLM
2023-09-10 05:42:14 +00:00