Proposal: new chat template arg `enable_history_reasoning` for reusing the prompt cache across queries within agents.

Currently, reasoning contents before the last user query message are ignored.
This can cause prompt cache misses, especially in agents (e.g. coding agents / deep agents) that call tools many times before the last user query message.
So here I propose a new chat template arg `enable_history_reasoning` to (optionally) keep the historical reasoning contents in the final prompt, improving prompt cache reuse in such cases.
Sun Yongyue 2026-03-23 16:55:54 +00:00 committed by system
parent c202236235
commit 3f2adb652a

@@ -97,7 +97,7 @@
 {%- endif %}
 {%- endif %}
 {%- set reasoning_content = reasoning_content|trim %}
-{%- if loop.index0 > ns.last_query_index %}
+{%- if loop.index0 > ns.last_query_index or enable_history_reasoning is defined and enable_history_reasoning is true %}
 {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n\n' + content }}
 {%- else %}
 {{- '<|im_start|>' + message.role + '\n' + content }}
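To illustrate the effect of the change, here is a minimal plain-Python sketch of the template logic above (not the actual template engine; the `render` helper and message shape are hypothetical). With the flag off, historical reasoning is dropped, so the rendered prefix changes from turn to turn; with the flag on, past turns render identically, so the prompt cache can be reused.

```python
def render(messages, enable_history_reasoning=False):
    # Index of the last user message; reasoning before it is normally dropped,
    # mirroring ns.last_query_index in the Jinja template.
    last_query_index = max(
        (i for i, m in enumerate(messages) if m["role"] == "user"), default=-1
    )
    parts = []
    for i, m in enumerate(messages):
        reasoning = (m.get("reasoning_content") or "").strip()
        # Keep reasoning only after the last user query, unless the new
        # enable_history_reasoning flag is set.
        if reasoning and (i > last_query_index or enable_history_reasoning):
            parts.append(
                f"<|im_start|>{m['role']}\n<think>\n{reasoning}\n</think>\n\n{m['content']}"
            )
        else:
            parts.append(f"<|im_start|>{m['role']}\n{m['content']}")
    return "<|im_end|>\n".join(parts)

msgs = [
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1", "reasoning_content": "r1"},
    {"role": "user", "content": "q2"},
]
# Flag off: r1 is dropped, so this prompt's prefix differs from the one
# rendered during the previous turn (cache miss).
assert "r1" not in render(msgs)
# Flag on: r1 is kept, so the previous turn's prefix is preserved verbatim.
assert "r1" in render(msgs, enable_history_reasoning=True)
```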