attention_mask
This commit implements the following:

- Cleans up unused arguments and definitions.
- Adds support for `attention_mask`.
- Adds support for cached inference.
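
For reviewers unfamiliar with the two features, here is a minimal pure-Python sketch of how an attention mask and a key/value cache typically interact during step-by-step decoding. It is not the code from this commit: `masked_attention`, `KVCache`, and `decode_step` are hypothetical names, and the sketch assumes a single attention head, one query per step, and a mask convention of 1 = attend, 0 = ignore.

```python
import math


def masked_attention(q, keys, values, mask):
    """Scaled dot-product attention for one query, skipping masked positions.

    Assumed convention: mask[i] == 1 means position i is attended to,
    mask[i] == 0 means it is ignored (for example, padding).
    """
    if not any(mask):
        raise ValueError("at least one position must be unmasked")
    scale = 1.0 / math.sqrt(len(q))
    # Masked positions get a score of -inf so their softmax weight is 0.
    scores = [
        sum(a * b for a, b in zip(q, k)) * scale if m else float("-inf")
        for k, m in zip(keys, mask)
    ]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * v[i] for w, v in zip(weights, values)) / total
        for i in range(dim)
    ]


class KVCache:
    """Hypothetical cache holding the keys, values, and mask seen so far."""

    def __init__(self):
        self.keys = []
        self.values = []
        self.mask = []

    def append(self, k, v, m=1):
        self.keys.append(k)
        self.values.append(v)
        self.mask.append(m)


def decode_step(cache, q, k, v, m=1):
    """Append the new token's key/value to the cache, then attend over it.

    Earlier keys and values are reused from the cache instead of being
    recomputed, which is the point of cached inference.
    """
    cache.append(k, v, m)
    return masked_attention(q, cache.keys, cache.values, cache.mask)
```

In this sketch the mask grows together with the cache, so a padded position stays excluded on every later step without any recomputation of earlier keys or values.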