Stefy Lanza (nextime / spora) authored
- ds4 kv janitor: a checkpoint is deleted only when ALL of the following hold: it has been untouched, judged by max(mtime, atime), for at least the age threshold (so a checkpoint that ds4 merely READS, which bumps atime but not mtime, is spared); it is not currently open (fd/mmap) by a ds4-server; and ds4 is not serving any request. A new in-flight counter on Ds4Backend (any_request_active) gates the sweep.
- settings: "Download a default DeepSeek V4 model", a select + button backed by the new /admin/api/ds4/default-models catalog (q2-imatrix / q2-q4 / q4 / mtp from antirez/deepseek-v4-gguf). It reuses the normal downloader, which flattens the gguf into the cache and surfaces it in the model list, with live progress.
- parser: rescue the degraded plaintext <tool>name arg: value</tool> form that heavy quants (ds4 q2-imatrix) emit when they can't reproduce DSML. Scoped to DeepSeekParser only (never the shared ToolCallParser, so other families are untouched). It requires a DECLARED tool name, a plaintext-only inner body, and the block(s) to be the message's trailing action, so a <tool> example inside a prose reply is not misread as a call.
- settings: corrected ds4 perf note (i-quants/Q2_K fail CUDA prefill; use Q4_K+).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
e23dd2a7
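
For illustration only, the janitor's three-part deletion rule could be sketched as below. This is not the code from the commit: `should_delete`, `is_stale`, and the `is_open_by_server` callback are hypothetical names, and `any_request_active` is modelled here as a plain callable even though the commit only says it is an in-flight counter on Ds4Backend. How the real janitor detects an open fd/mmap is not stated, so that check is left as an injected callback.

```python
import os
import time
from typing import Callable, Optional


def is_stale(path: str, max_age_s: float, now: Optional[float] = None) -> bool:
    """True when neither a write (mtime) nor a read (atime) touched the file recently."""
    st = os.stat(path)
    # A read bumps atime but not mtime, so take the later of the two timestamps.
    last_touched = max(st.st_mtime, st.st_atime)
    if now is None:
        now = time.time()
    return (now - last_touched) >= max_age_s


def should_delete(
    path: str,
    max_age_s: float,
    is_open_by_server: Callable[[str], bool],  # hypothetical fd/mmap check
    any_request_active: Callable[[], bool],    # assumed shape of the in-flight gate
    now: Optional[float] = None,
) -> bool:
    """All three conditions must hold before a checkpoint is removed."""
    if any_request_active():
        return False  # never sweep while a request is being served
    if is_open_by_server(path):
        return False  # a server still holds the file open
    return is_stale(path, max_age_s, now)
```

Checking the cheap in-memory gate first means the sweep does no filesystem work while a request is in flight. Note that on filesystems mounted with noatime the atime does not advance on reads, so the "merely read" protection depends on mount options.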