• Stefy Lanza (nextime / spora )'s avatar
    feat: per-model auto-compact of the conversation context (off by default) · a019905f
    Stefy Lanza (nextime / spora ) authored
    When enabled for a model, if the prompt would exceed auto_compact_pct% of the
    model's context window, the conversation is shrunk to ~65% before generation
    instead of erroring on overflow. Per-model config (auto_compact / auto_compact_pct
    / auto_compact_strategy) with three strategies:
      - drop_oldest    : keep system messages + the most recent turns that fit.
      - keep_head_tail : also keep the first user turn as an anchor + a count note.
      - summarize      : replace the dropped middle with a best-effort LLM summary
                         (generated by the loaded model; falls back to a count note).
    
    Token size is a cheap chars/4 estimate; membership uses object identity so
    value-equal turns don't collide. Wired into the chat path (codai/api/text.py),
    the model-configure whitelist, and the model config modal UI.
    Co-Authored-By: 's avatarClaude Opus 4.8 <noreply@anthropic.com>
    a019905f
models.html 203 KB