Proposal: new chat template arg `enable_history_reasoning` for reusing the prompt cache across queries within agents.

Currently, reasoning contents before the last user query message are ignored.
This can cause prompt cache misses, especially in agents (e.g. coding agents / deep agents) that call tools many times before the last user query message.
So here I propose a new chat template arg `enable_history_reasoning` to (optionally) keep the historical reasoning contents in the final prompt, improving prompt cache reuse in such cases.
Sun Yongyue 2026-03-23 16:55:54 +00:00 committed by system
parent c202236235
commit 3f2adb652a

@@ -97,7 +97,7 @@
 {%- endif %}
 {%- endif %}
 {%- set reasoning_content = reasoning_content|trim %}
-{%- if loop.index0 > ns.last_query_index %}
+{%- if loop.index0 > ns.last_query_index or enable_history_reasoning is defined and enable_history_reasoning is true %}
 {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n\n' + content }}
 {%- else %}
 {{- '<|im_start|>' + message.role + '\n' + content }}
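To illustrate the effect of the change, here is a minimal plain-Python sketch of the template logic above (not the actual template engine; the `render` helper and message shape are hypothetical). With the flag off, historical reasoning is dropped, so the rendered prefix changes from turn to turn; with the flag on, past turns render identically, so the prompt cache can be reused.

```python
def render(messages, enable_history_reasoning=False):
    # Index of the last user message; reasoning before it is normally dropped,
    # mirroring ns.last_query_index in the Jinja template.
    last_query_index = max(
        (i for i, m in enumerate(messages) if m["role"] == "user"), default=-1
    )
    parts = []
    for i, m in enumerate(messages):
        reasoning = (m.get("reasoning_content") or "").strip()
        # Keep reasoning only after the last user query, unless the new
        # enable_history_reasoning flag is set.
        if reasoning and (i > last_query_index or enable_history_reasoning):
            parts.append(
                f"<|im_start|>{m['role']}\n<think>\n{reasoning}\n</think>\n\n{m['content']}"
            )
        else:
            parts.append(f"<|im_start|>{m['role']}\n{m['content']}")
    return "<|im_end|>\n".join(parts)

msgs = [
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1", "reasoning_content": "r1"},
    {"role": "user", "content": "q2"},
]
# Flag off: r1 is dropped, so this prompt's prefix differs from the one
# rendered during the previous turn (cache miss).
assert "r1" not in render(msgs)
# Flag on: r1 is kept, so the previous turn's prefix is preserved verbatim.
assert "r1" in render(msgs, enable_history_reasoning=True)
```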