codai/admin/templates · a019905fd3c05ab19857cd76520532e4fae6cdb5 · nexlab / coderai

feat: per-model auto-compact of the conversation context (off by default) · a019905f

Stefy Lanza (nextime / spora ) authored Jun 19, 2026

When enabled for a model, if the prompt would exceed auto_compact_pct% of the
model's context window, the conversation is shrunk to ~65% before generation
instead of erroring on overflow. Per-model config (auto_compact / auto_compact_pct
/ auto_compact_strategy) with three strategies:
  - drop_oldest    : keep system messages + the most recent turns that fit.
  - keep_head_tail : also keep the first user turn as an anchor + a count note.
  - summarize      : replace the dropped middle with a best-effort LLM summary
                     (generated by the loaded model; falls back to a count note).

Token size is a cheap chars/4 estimate; membership uses object identity so
value-equal turns don't collide. Wired into the chat path (codai/api/text.py),
the model-configure whitelist, and the model config modal UI.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

a019905f

Name	Last commit	Last update
..
archive.html		Loading commit data...
base.html		Loading commit data...
change_password.html		Loading commit data...
chat.html		Loading commit data...
dashboard.html		Loading commit data...
login.html		Loading commit data...
models.html		Loading commit data...
settings.html		Loading commit data...
tasks.html		Loading commit data...
tokens.html		Loading commit data...
users.html		Loading commit data...