Files · ef106ba16d01884ae65e503144acd77f97b0f720 · nexlab / coderai

ds4: auto-downloaded weights land in coderai GGUF cache + show on models page · ef106ba1

Stefy Lanza (nextime / spora) authored Jun 19, 2026

When ds4.auto_download is enabled and a deepseek4 request resolves no local
GGUF, the downloaded weight variant is now relocated into coderai's GGUF cache
(get_model_cache_dir; moved when on the same filesystem, symlinked across
devices) and registered in models.json as a text_models entry that mirrors the
config of the requested ("failed") model: backend auto, on-request, enabled
and visible (removed from unloaded/to_download). model_name is threaded
through ds4 backend → ensure_service → ensure_model so the registration
mirrors the right entry.
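
The relocate-then-register flow described above could look roughly like the following. This is a minimal sketch, not the code from this commit: the function names relocate_into_cache and register_text_model, and the models.json field names ("name", "path", "load", and the "unloaded"/"to_download" lists of names) are assumptions made for illustration. Only the move-or-symlink behaviour and the backend auto / on-request / enabled / visible settings come from the commit message.

```python
import errno
import json
import os
from pathlib import Path


def relocate_into_cache(downloaded: Path, cache_dir: Path) -> Path:
    """Put `downloaded` under `cache_dir`: rename on the same filesystem,
    fall back to a symlink when the rename would cross devices."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / downloaded.name
    if target.exists() or target.is_symlink():
        return target  # already cached; leave the existing entry alone
    try:
        os.rename(downloaded, target)
    except OSError as exc:
        if exc.errno != errno.EXDEV:  # only handle the cross-device case
            raise
        os.symlink(downloaded.resolve(), target)
    return target


def register_text_model(models_json: Path, requested_name: str, gguf: Path) -> dict:
    """Add a text_models entry for `gguf`, copying the requested model's config.

    The schema used here is hypothetical, not coderai's actual models.json layout.
    """
    data = json.loads(models_json.read_text()) if models_json.exists() else {}
    entries = data.setdefault("text_models", [])
    # Start from the config of the model whose request failed to resolve.
    template = next((e for e in entries if e.get("name") == requested_name), {})
    entry = {
        **template,
        "name": gguf.stem,
        "path": str(gguf),
        "backend": "auto",
        "load": "on-request",
        "enabled": True,
        "visible": True,
    }
    # Replace any stale entry with the same name, then append the new one.
    entries[:] = [e for e in entries if e.get("name") != entry["name"]]
    entries.append(entry)
    # A registered model should no longer be listed as unloaded or pending.
    for key in ("unloaded", "to_download"):
        data[key] = [n for n in data.get(key, []) if n != entry["name"]]
    models_json.write_text(json.dumps(data, indent=2))
    return entry
```

Trying the rename first and falling back only on EXDEV avoids copying a file of 100GB or more when the download directory and the cache sit on different devices, at the cost of the cache entry depending on the original file staying in place.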

Also: settings "Extra ds4-server args" hint/placeholder updated to reflect the
auto --kv-disk-dir and SSD-streaming expert-cache sizing
(--ssd-streaming-cache-experts), noting Q2_K can fail ds4's CUDA prefill.

Diagnosis (no code change): ds4-server's "cuda prefill failed" on the 93GB
Q2_K variant is a quant-specific ds4 CUDA bug — the 154GB Q4_K completes
prefill fine (verified: "prompt done 434s" vs Q2_K instant failure), with
15.8GB VRAM free either way (not OOM, not cache budget, not coderai).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Name
codai
docs
packaging
samples
tests
tools
.dockerignore
.gitignore
AI.PROMPT
CODERAI_API_DOCUMENTATION.md
CoderAI.gif
DISTRIBUTION.md
LICENSE.md
MULTIMODAL_CAPABILITIES.md
MULTIMODAL_UI_EXAMPLES.md
README.md
build-oci.sh
build.ps1
build.sh
coderai
coderai-broker-implementation-reference.md
coderai-integration.md
commands
osxbuild.sh
package-oci.sh
package-tarball.sh
requirements-nvidia.txt
requirements-vulkan.txt
requirements.txt
run-oci.sh
smoke-test-oci.sh
todo.md
video_editor.config.json

README.md