    text: make auto-compaction actually fire — fix config lookup + max_tokens-aware layered trimming · 913e283a
    Stefy Lanza (nextime / spora ) authored
    Auto-compaction never triggered: multi_model_manager.config stores the
    whitelisted build_runtime_kwargs() dict, which drops the per-model
    auto_compact* keys (they survive only under _raw_cfg), so _resolve_compaction
    always read the global default (False) and returned None. Read the keys via a
    _raw_cfg fallback so per-model compaction config is honoured.
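
A minimal sketch of the fallback lookup described above, not the project's actual code. The helper name `lookup_model_key` and the exact dict layout (a whitelisted dict carrying the original config under a `_raw_cfg` key) are assumptions made for illustration.

```python
from typing import Any

# Sentinel so that an explicit False/None in the config is not mistaken for "absent".
_MISSING = object()


def lookup_model_key(model_cfg: dict[str, Any], key: str, default: Any = None) -> Any:
    """Read `key` from the whitelisted dict, then from `_raw_cfg`, else `default`."""
    value = model_cfg.get(key, _MISSING)
    if value is _MISSING:
        raw = model_cfg.get("_raw_cfg") or {}
        value = raw.get(key, _MISSING)
    return default if value is _MISSING else value


# Hypothetical per-model entry: the whitelist dropped auto_compact,
# but the key is still present under _raw_cfg.
model_cfg = {"n_ctx": 8192, "_raw_cfg": {"n_ctx": 8192, "auto_compact": True}}
enabled = lookup_model_key(model_cfg, "auto_compact", default=False)
```

With a lookup that only consulted the top-level dict, `enabled` would fall through to the global default, which is the failure mode the commit message describes.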
    
    Also rework the over-context handling to count the reply reservation, since the
    reply is generated into the same window (prompt + max_tokens <= n_ctx). Four
    layers, cheapest first:
      1. fits as-is              -> nothing
      2. overflow within tol     -> trim max_tokens to fit (lossless)
      3. beyond tol & big prompt -> compact history (drop/summarize)
      4. single message too big  -> slice it (summarize its middle, keep head/tail)
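
The four layers can be sketched as a single decision function. This is an illustration under stated assumptions, not the committed implementation: the names `Action` and `choose_action` are hypothetical, the tolerance is assumed to be a percentage of n_ctx, and min_output is assumed to be the smallest reply budget worth keeping after a trim.

```python
from enum import Enum


class Action(Enum):
    FIT = "fit"
    TRIM_MAX_TOKENS = "trim_max_tokens"
    COMPACT_HISTORY = "compact_history"
    SLICE_MESSAGE = "slice_message"


def choose_action(prompt_tokens: int, largest_message_tokens: int, max_tokens: int,
                  n_ctx: int, tolerance_pct: float = 20.0, min_output: int = 512):
    """Return (action, effective_max_tokens) for one request."""
    needed = prompt_tokens + max_tokens
    # Layer 1: prompt plus reply reservation already fits.
    if needed <= n_ctx:
        return Action.FIT, max_tokens
    overflow = needed - n_ctx
    room = n_ctx - prompt_tokens
    # Layer 2: small overflow, shrink the reply reservation (history untouched).
    if overflow <= n_ctx * tolerance_pct / 100 and room >= min_output:
        return Action.TRIM_MAX_TOKENS, room
    # Layer 3: every message could fit on its own, so compacting history can help.
    if largest_message_tokens + min_output <= n_ctx:
        return Action.COMPACT_HISTORY, max_tokens
    # Layer 4: one message alone is too large, so it has to be sliced.
    return Action.SLICE_MESSAGE, max_tokens


fit = choose_action(3000, 500, 1000, n_ctx=4096)
trim = choose_action(3000, 500, 1500, n_ctx=4096)
compact = choose_action(3900, 500, 1000, n_ctx=4096)
slice_ = choose_action(6000, 5000, 1000, n_ctx=4096)
```

The token counts passed in are expected to be the safety-inflated estimates, so that a "fits" verdict leaves headroom for estimation error.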
    
    The chars/4 estimate undercounts token-dense code/JSON, so trimming to the exact
    n_ctx edge could still overflow; inflate the estimate by a configurable
    estimate_safety (default 1.15) for all physical-fit decisions.
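
A rough sketch of the inflated estimate, assuming the estimator is simply the character count divided by four and that the safety factor is applied multiplicatively. The function names `estimate_tokens` and `fits` are hypothetical.

```python
import math


def estimate_tokens(text: str, estimate_safety: float = 1.15) -> int:
    """chars/4 heuristic, padded by a safety factor and rounded up."""
    raw = len(text) / 4
    return math.ceil(raw * estimate_safety)


def fits(prompt: str, max_tokens: int, n_ctx: int, estimate_safety: float = 1.15) -> bool:
    """Physical-fit check: padded prompt estimate plus reply reservation within n_ctx."""
    return estimate_tokens(prompt, estimate_safety) + max_tokens <= n_ctx


prompt = "x" * 12000          # roughly 3000 tokens by the raw chars/4 rule
unpadded_ok = fits(prompt, 1000, 4096, estimate_safety=1.0)
padded_ok = fits(prompt, 1000, 4096)
```

Here the raw estimate says the request fits while the padded one says it does not, which is the margin the safety factor is meant to provide for token-dense input.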
    
    New CompactionConfig knobs (per-model overridable): tolerance_pct (20),
    min_output (512), estimate_safety (1.15). Effective max_tokens is threaded back
    to both the streaming and non-streaming generation paths.
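
For illustration only, the three knobs and their defaults could be modelled as below. This is a stand-in for the real CompactionConfig, which likely has more fields; the helper `with_model_overrides` and the shape of the per-model override dict are assumptions.

```python
from dataclasses import dataclass, fields, replace
from typing import Any


@dataclass(frozen=True)
class CompactionConfig:
    tolerance_pct: float = 20.0     # overflow tolerated before compaction is considered
    min_output: int = 512           # smallest reply budget kept after trimming max_tokens
    estimate_safety: float = 1.15   # multiplier applied to the chars/4 token estimate


def with_model_overrides(base: CompactionConfig, overrides: dict[str, Any]) -> CompactionConfig:
    """Apply per-model values for known knobs, ignoring unrelated keys."""
    known = {f.name for f in fields(base)}
    picked = {k: v for k, v in overrides.items() if k in known}
    return replace(base, **picked)


# Hypothetical per-model config: only min_output is overridden; n_ctx is ignored here.
cfg = with_model_overrides(CompactionConfig(), {"min_output": 256, "n_ctx": 8192})
```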
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Name
admin
api
backends
broker
frontproxy
models
openai
pydantic
queue
tasks
__init__.py
cli.py
config.py
main.py
platform_paths.py