From 3f2adb652a90b8a002310a11ef37ec0e6a64499a Mon Sep 17 00:00:00 2001
From: Sun Yongyue
Date: Mon, 23 Mar 2026 16:55:54 +0000
Subject: [PATCH] Proposal: new chat_template_arg `enable_history_reasoning`
 for reusing prompt cache among queries within Agents

Currently, reasoning content that precedes the last user query message
is dropped from the rendered prompt. An assistant turn that was first
rendered with its reasoning is therefore rendered without it once a
newer user query arrives, so the prompt prefix changes. This can cause
prompt cache misses, especially in agents (e.g. coding agents or deep
agents) that call tools many times before the last user query message.

I therefore propose a new chat template argument,
`enable_history_reasoning`, that optionally keeps the historical
reasoning content in the final prompt so that the prompt cache can be
reused more effectively in such cases.
---
 chat_template.jinja | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chat_template.jinja b/chat_template.jinja
index a585dec..d243fb2 100644
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -97,7 +97,7 @@
 {%- endif %}
 {%- endif %}
 {%- set reasoning_content = reasoning_content|trim %}
-{%- if loop.index0 > ns.last_query_index %}
+{%- if loop.index0 > ns.last_query_index or enable_history_reasoning is defined and enable_history_reasoning is true %}
 {{- '<|im_start|>' + message.role + '\n\n' + reasoning_content + '\n\n\n' + content }}
 {%- else %}
 {{- '<|im_start|>' + message.role + '\n' + content }}
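
To illustrate the effect of the changed condition, here is a minimal Python sketch that mimics only the branch touched by this patch. It is not the real template: the `render` helper, the sample messages, the rule that the last query is simply the last `user` message, and the trailing newline after each message are all simplifying assumptions made for illustration.

```python
def render(messages, enable_history_reasoning=None):
    """Simplified stand-in for the patched branch (hypothetical helper)."""
    # Assumption: the last query is the last message whose role is "user".
    last_query_index = max(
        i for i, m in enumerate(messages) if m["role"] == "user"
    )
    parts = []
    for i, m in enumerate(messages):
        # Mirrors: loop.index0 > ns.last_query_index
        #          or (enable_history_reasoning is defined and ... is true)
        keep = i > last_query_index or enable_history_reasoning is True
        if m["role"] == "assistant" and keep:
            reasoning = m.get("reasoning_content", "").strip()
            body = "\n\n" + reasoning + "\n\n\n" + m["content"]
        else:
            body = "\n" + m["content"]
        # Assumption: a bare newline separates messages in this sketch.
        parts.append("<|im_start|>" + m["role"] + body + "\n")
    return "".join(parts)


# Hypothetical two-step conversation: the second request appends a user query.
turn1 = [
    {"role": "user", "content": "List the files."},
    {
        "role": "assistant",
        "reasoning_content": "I should call ls.",
        "content": "Calling ls.",
    },
]
turn2 = turn1 + [{"role": "user", "content": "Now open README."}]

default_first = render(turn1)
default_second = render(turn2)
kept_first = render(turn1, enable_history_reasoning=True)
kept_second = render(turn2, enable_history_reasoning=True)

# Without the flag, the earlier prompt is no longer a prefix of the later one.
print(default_second.startswith(default_first))
# With the flag, the earlier prompt stays a prefix, so a prefix cache can hit.
print(kept_second.startswith(kept_first))
```

In this sketch the first request renders identically with or without the flag. Only the second request differs: by default the earlier assistant turn loses its reasoning, which changes the prefix that a prefix-matching cache would otherwise reuse.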