codai/admin/templates · 6a153c581b008192f6bc53aedbb77736de8d5a4a · nexlab / coderai

feat(ds4): auto-route deepseek4 GGUFs by architecture; serve the requested file · 6a153c58

Stefy Lanza (nextime / spora) authored Jun 19, 2026

- Route to ds4 by GGUF ARCHITECTURE (general.architecture == "deepseek4"), read
  from the file header (cached) — not by filename. Mainline deepseek/2/3/32 GGUFs
  stay on llama.cpp; the model_id alias still routes for the download case.
- ds4-server now serves the REQUESTED GGUF: Ds4Backend resolves the model to a
  local .gguf and launches `ds4-server -m <file>` (resolve_service_key keys the
  managed service per file). No fixed-variant assumption.
- Honour the model's per-entry n_ctx for ds4-server --ctx (over the global ctx).
- New config.ds4 options + settings UI: ssd_streaming (--ssd-streaming, stream
  MoE experts from SSD/disk), model_path (explicit -m override), and
  auto_download (OFF by default — only serve GGUFs already present; error clearly
  instead of silently pulling tens of GB; opt in to fetch model_variant).
- AI.PROMPT: document that DeepSeek-V4 support is pending upstream llama.cpp PRs
  (it needs new ggml ops), so it is served by ds4 for now; also document the ds4
  routing, offload, and text-only specifics.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
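
The architecture-based routing described in the first bullet could look roughly like the sketch below. It is not the code from this commit: the function names (`gguf_architecture`, `pick_backend`) are hypothetical, and it assumes a little-endian GGUF file of version 2 or later, where the header counts are 64-bit. It walks the metadata key/value section until it finds `general.architecture`, and caches the result per path with `functools.lru_cache` (an assumed caching strategy that does not notice a file being replaced on disk).

```python
import struct
from functools import lru_cache

GGUF_MAGIC = b"GGUF"
# Byte sizes of the fixed-width GGUF metadata value types (type id -> size).
_FIXED = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read_string(f):
    """Read a GGUF string: uint64 length followed by UTF-8 bytes."""
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")


def _skip_value(f, vtype):
    """Advance the file position past one metadata value of the given type."""
    if vtype in _FIXED:
        f.seek(_FIXED[vtype], 1)
    elif vtype == _STRING:
        (n,) = struct.unpack("<Q", f.read(8))
        f.seek(n, 1)
    elif vtype == _ARRAY:
        elem_type, count = struct.unpack("<IQ", f.read(12))
        if elem_type in _FIXED:
            f.seek(_FIXED[elem_type] * count, 1)
        else:
            for _ in range(count):
                _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


@lru_cache(maxsize=None)
def gguf_architecture(path):
    """Return general.architecture from a GGUF header, or None if absent."""
    with open(path, "rb") as f:
        if f.read(4) != GGUF_MAGIC:
            return None  # not a GGUF file
        # version (uint32), tensor count (uint64), metadata KV count (uint64)
        _version, _tensors, kv_count = struct.unpack("<IQQ", f.read(20))
        for _ in range(kv_count):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            if key == "general.architecture" and vtype == _STRING:
                return _read_string(f)
            _skip_value(f, vtype)
    return None


def pick_backend(path):
    """Route deepseek4 GGUFs to ds4; everything else stays on llama.cpp."""
    return "ds4" if gguf_architecture(path) == "deepseek4" else "llama.cpp"
```

Because only the header is read, the check stays cheap even for files that are tens of GB, and a file whose name mentions DeepSeek but whose architecture is something else is still routed to llama.cpp.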


Name
archive.html
base.html
change_password.html
chat.html
dashboard.html
login.html
models.html
settings.html
tasks.html
tokens.html
users.html