    fix: tool-call streaming/format robustness + clear over-context error · 3834ecf5
    Stefy Lanza (nextime / spora ) authored
    - Streaming tool gate now withholds the gemma/qwen native `<|tool_call>` marker
      (and partials) too, not just `<tool_call>`/`call:NAME{` — so the raw marker no
      longer leaks to the client mid-stream (Kilo was executing partial calls).
    - Normalize tool-call function.arguments from JSON string → dict before applying
      the chat template, so templates that render `arguments|items` (Qwen) don't
      raise "Can only get item pairs from a mapping".
    - Context-window overflow now returns a meaningful error: a structured SSE error
      event (code context_length_exceeded) when streaming, or HTTP 400 with a clear
      message for non-streaming — instead of injecting "[Generation error: …]" as
      assistant content (which polluted chat history).
    - Models page: unconfigured GGUF files now expose the "Free disk" button (records
      them as "to download" before deleting), matching HF models.
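
The streaming gate from the first bullet can be sketched roughly as below. This is an illustration only, not the repository's code: the class name `ToolCallGate` and its methods are hypothetical, and only the two literal markers are handled (the `call:NAME{` form would need a pattern match). The idea is to hold back any trailing text that could still grow into a marker, and to stop forwarding entirely once a full marker has been seen.

```python
# Hypothetical sketch; names are illustrative and not taken from the repository.
MARKERS = ("<|tool_call>", "<tool_call>")


class ToolCallGate:
    def __init__(self, markers=MARKERS):
        self.markers = tuple(markers)
        self.pending = ""          # text received but not yet forwarded
        self.in_tool_call = False  # True once a complete marker was seen

    def feed(self, chunk: str) -> str:
        """Return the part of the stream that is safe to forward now."""
        self.pending += chunk
        if self.in_tool_call:
            return ""  # keep buffering the raw call for the parser
        # A complete marker: forward what precedes it, withhold the rest.
        hits = [i for i in (self.pending.find(m) for m in self.markers) if i != -1]
        if hits:
            cut = min(hits)
            self.in_tool_call = True
            safe, self.pending = self.pending[:cut], self.pending[cut:]
            return safe
        # A partial marker at the end: hold back the longest matching prefix.
        hold = 0
        for m in self.markers:
            for k in range(min(len(m) - 1, len(self.pending)), 0, -1):
                if self.pending.endswith(m[:k]):
                    hold = max(hold, k)
                    break
        cut = len(self.pending) - hold
        safe, self.pending = self.pending[:cut], self.pending[cut:]
        return safe

    def flush(self) -> str:
        """At end of stream, release held text that never became a marker."""
        if self.in_tool_call:
            return ""
        safe, self.pending = self.pending, ""
        return safe
```

With this shape, a marker split across chunks (for example `<|tool` followed by `_call>`) never reaches the client, while a lone `<` that turns out to be ordinary text is released one chunk later.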
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
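
The argument normalization described in the second bullet could look something like the following. It is a minimal sketch assuming OpenAI-style message dicts; the function name `normalize_tool_call_arguments` is hypothetical and not taken from the repository. Arguments that are not valid JSON objects are deliberately left untouched here, which is a design choice of this sketch rather than documented behavior.

```python
import json
from typing import Any


def normalize_tool_call_arguments(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the messages with tool-call arguments parsed into dicts."""
    normalized = []
    for message in messages:
        message = dict(message)  # shallow copy so the caller's data is not mutated
        calls = message.get("tool_calls")
        if calls:
            new_calls = []
            for call in calls:
                call = dict(call)
                function = call.get("function")
                if isinstance(function, dict):
                    function = dict(function)
                    args = function.get("arguments")
                    if isinstance(args, str):
                        try:
                            # Treat an empty string as "no arguments".
                            parsed = json.loads(args) if args.strip() else {}
                        except json.JSONDecodeError:
                            parsed = None
                        if isinstance(parsed, dict):
                            function["arguments"] = parsed
                    call["function"] = function
                new_calls.append(call)
            message["tool_calls"] = new_calls
        normalized.append(message)
    return normalized
```

Once the arguments are a mapping, a template expression such as `arguments|items` has key/value pairs to iterate over instead of a plain string.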
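
For the context-window overflow handling in the third bullet, the two delivery paths might be shaped as follows. Only the code `context_length_exceeded` and the HTTP 400 status come from the commit message; the payload layout (an OpenAI-style `error` object), the `event: error` line, the message wording, and all function names are assumptions made for illustration.

```python
import json


def overflow_error(prompt_tokens: int, context_window: int) -> dict:
    """Build a structured error payload (layout is an assumption of this sketch)."""
    return {
        "error": {
            "message": (
                f"Prompt is {prompt_tokens} tokens, which exceeds the model's "
                f"context window of {context_window} tokens."
            ),
            "type": "invalid_request_error",
            "code": "context_length_exceeded",
        }
    }


def as_sse_event(payload: dict) -> str:
    """Streaming path: one SSE frame, terminated by a blank line."""
    return f"event: error\ndata: {json.dumps(payload)}\n\n"


def as_http_response(payload: dict) -> tuple[int, dict]:
    """Non-streaming path: HTTP 400 with the same payload as the body."""
    return 400, payload
```

Either way the error travels out of band, so nothing is appended to the assistant message and the chat history stays clean.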
Name
admin
api
backends
broker
frontproxy
models
openai
pydantic
queue
tasks
__init__.py
cli.py
config.py
main.py
platform_paths.py