phi-1_5

Author	SHA1	Message	Date
Gustavo de Rosa	bffd3b29c4	Update LICENSE	2024-02-06 12:36:39 +00:00
Gustavo de Rosa	349cf8b5e8	Update README.md	2024-01-24 13:34:13 +00:00
Gustavo de Rosa	83b9c52637	Update README.md	2024-01-22 12:25:40 +00:00
Gustavo de Rosa	675e8c1bae	Update config.json	2024-01-22 12:25:27 +00:00
Gustavo de Rosa	34a1490e06	Update modeling_phi.py	2024-01-16 16:05:38 +00:00
Gustavo de Rosa	59e722d14e	Update README.md	2024-01-16 14:56:49 +00:00
Gustavo de Rosa	426ea900b0	Update modeling_phi.py	2024-01-15 14:26:10 +00:00
Gustavo de Rosa	3edb5e62c4	Update modeling_phi.py	2024-01-12 00:44:23 +00:00
Gustavo de Rosa	e0f03c4877	Update modeling_phi.py	2024-01-11 16:40:17 +00:00
Gustavo de Rosa	051d15f1e7	Update config.json	2024-01-11 11:22:42 +00:00
Gustavo de Rosa	914c8fb3c6	Upload modeling_phi.py	2024-01-10 13:54:40 +00:00
Gustavo de Rosa	3a705a2d6b	Delete Research License.docx	2024-01-10 13:16:00 +00:00
Gustavo de Rosa	341a17a8f2	Upload 5 files	2024-01-10 13:15:50 +00:00
Gustavo de Rosa	1dc35eb2f5	Update README.md (#69) - Update README.md (8584061b4d9f189aea26e170cb1c285a22fe731d) Co-authored-by: Mojan Javaheripi <mojanjp@users.noreply.huggingface.co>	2024-01-10 11:29:00 +00:00
Gustavo de Rosa	41217aafb5	Update config.json	2024-01-08 17:13:22 +00:00
Gustavo de Rosa	d3ba318b78	chore(root): Updates files to internal transformers implementation.	2024-01-08 13:12:24 +00:00
Gustavo de Rosa	24f9ea14df	Update README.md	2023-12-13 23:24:09 +00:00
Gustavo de Rosa	d262514668	Upload 4 files	2023-12-13 23:19:24 +00:00
Gustavo de Rosa	f27cd936bd	Update README.md	2023-12-13 23:01:12 +00:00
Gustavo de Rosa	80c0ba9f8e	Update README.md	2023-12-13 22:44:59 +00:00
Gustavo de Rosa	a286f5c1de	Disables inference API to prevent mismatch with HF implementation.	2023-12-13 21:54:41 +00:00
Gustavo de Rosa	ca573e3fa3	fix(modeling_phi): Fixes initial generation with length larger than context length.	2023-12-08 17:40:16 +00:00
Gustavo de Rosa	37527ba0b8	fix(modeling_phi): Fixes cached generation when above maximum context length.	2023-12-05 21:09:53 +00:00
Gustavo de Rosa	5fd430c7bc	Fixes exceeding maximum sequence length when using generate().	2023-11-20 18:11:04 +00:00
Gustavo de Rosa	d212a78962	Delete modeling_mixformer_sequential.py	2023-11-16 18:10:37 +00:00
Gustavo de Rosa	8e9ebfb9bf	Delete configuration_mixformer_sequential.py	2023-11-16 18:10:30 +00:00
Gustavo de Rosa	271c3397ab	Update to new model interface.	2023-11-16 17:28:06 +00:00
Gustavo de Rosa	92557d03bb	Improves type hinting on configuration arguments.	2023-11-01 23:40:19 +00:00
Gustavo de Rosa	45f4b21525	Enables to toggle fused_dense, flash_rotary and attn_pdrop in the configuration.	2023-11-01 23:33:57 +00:00
Gustavo de Rosa	0254d42a95	Fixes flash-attn import with a try/except statement	2023-11-01 23:32:35 +00:00
Gustavo de Rosa	0bbd68a176	Adds support for flash-attn rotary embedding and fused dense layers.	2023-11-01 20:40:12 +00:00
Gustavo de Rosa	de35f900d3	Adds support for MQA/GQA and attention mask during training.	2023-10-30 16:59:12 +00:00
Gustavo de Rosa	d38e6f954e	Update modeling_mixformer_sequential.py Removes print regarding attention_mask to prevent excessive information from being logged.	2023-10-26 20:01:15 +00:00
Gustavo de Rosa	8091327f9e	Adding _set_gradient_checkpointing for compatibility (#22) - Adding _set_gradient_checkpointing for compatibility (a30a931294ac0f344a0c1547877c692ceb17123c) Co-authored-by: Vicente Rivera <vriveras@users.noreply.huggingface.co>	2023-10-17 12:11:30 +00:00
Gustavo de Rosa	b6a7e2fe15	Upload modeling_mixformer_sequential.py	2023-09-27 15:22:44 +00:00
Gustavo de Rosa	8ab0f29ff6	Add more precise license metadata (UI will be cleaner!) (#35) - Add more precise license metadata (UI will be cleaner!) (2c182742af8c7c93f0f4ee1180232a5d0c114958) Co-authored-by: Julien Chaumond <julien-c@users.noreply.huggingface.co>	2023-09-27 15:20:42 +00:00
Gustavo de Rosa	bc09a085e7	Upload README.md	2023-09-27 14:04:07 +00:00
Gustavo de Rosa	f9f2ac7c45	fix(phi-1_5): Checks length of `attention_mask` if it is passed as direct tensor.	2023-09-26 21:21:45 +00:00
Gustavo de Rosa	3128bb636a	Support for `attention_mask` in forward pass. This commit implements the following: - Cleans up unused arguments and definitions. - Adds support for `attention_mask`. - Adds support for cached inference.	2023-09-26 18:17:08 +00:00
Gustavo de Rosa	4a426d8015	add _no_split_modules property (#17) - add _no_split_modules property (7e925ddfdf2d1bb29fc26db755aafd77fb8f565e) Co-authored-by: wing lian <winglian@users.noreply.huggingface.co>	2023-09-15 22:57:07 +00:00
Gunasekar	7d482ddf93	Update README.md	2023-09-14 00:44:40 +00:00
Gunasekar	c8f6ad8189	Update README.md	2023-09-12 18:40:56 +00:00
Gustavo de Rosa	762a3110be	Link paper to arXiv (#5) - Link paper to arXiv (c30653547e6bbdc00a068e538a7f84ed568d1918) Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.huggingface.co>	2023-09-12 16:01:41 +00:00
Gunasekar	ea95720a35	Update README.md	2023-09-12 01:38:42 +00:00
Gunasekar	4bba51c9b5	Update README.md	2023-09-11 21:45:49 +00:00
Gunasekar	52e294acfe	Update README.md	2023-09-11 21:44:15 +00:00
Gunasekar	9efbcafbe4	Upload tokenizer	2023-09-11 21:30:53 +00:00
Gunasekar	d655135ca1	Upload MixFormerSequentialForCausalLM	2023-09-11 21:30:53 +00:00
Gunasekar	07a048efa7	Update README.md	2023-09-11 07:57:24 +00:00
Gunasekar	b63051536f	Update README.md	2023-09-11 07:56:12 +00:00


67 Commits