• Stefy Lanza (nextime / spora )'s avatar
    text: surface model reasoning as a separate field (think/thinking/thought) · 0a7d343a
    Stefy Lanza (nextime / spora ) authored
    Qwen-style chat templates pre-fill the opening <think> in the prompt, so the
    model emits only the reasoning body + a bare closing </think> — and they think
    by DEFAULT regardless of the API enable_thinking flag. The old paired-tag
    reasoning extractor missed the bare close, leaking the whole thought (and the
    </think>) into content and conversation history.
    
    - extract_reasoning_content: handle a bare </think|/thinking|/thought> with no
      opening tag (treat the prefix as reasoning).
    - streaming: a chunk-safe reasoning gate routes the thought into
      delta.reasoning / reasoning_content until </think>, then flips to content;
      tool extraction runs on the post-</think> answer only.
    - non-streaming: extract reasoning, set message.reasoning(+_content), clean
      content; tools parsed from the answer.
    - activate whenever the model auto-thinks (qwen3/qwq/deepseek-r1/… name) OR
      reasoning is explicitly enabled — not just on the API flag.
    - configurable suppression: per-model `suppress_reasoning`, or per-request via
      the standard reasoning:{exclude:true} / reasoning_effort:"none" /
      suppress_reasoning fields. Emits both `reasoning` and DeepSeek-style
      `reasoning_content` for client compatibility.
    Co-Authored-By: 's avatarClaude Opus 4.8 <noreply@anthropic.com>
    0a7d343a
Name
Last commit
Last update
..
static Loading commit data...
templates Loading commit data...
__init__.py Loading commit data...
auth.py Loading commit data...
download_worker.py Loading commit data...
routes.py Loading commit data...