    ds4: auto-downloaded weights land in coderai GGUF cache + show on models page · ef106ba1
    Stefy Lanza (nextime / spora ) authored
    When ds4.auto_download is enabled and a deepseek4 request resolves no local
    GGUF, the downloaded weight variant is now relocated into coderai's GGUF cache
    (get_model_cache_dir; move on same FS, symlink across devices) and registered
    in models.json as a text_models entry that mimics the requested ("failed")
    model's config — backend auto, on-request, enabled and visible (removed from
    unloaded/to_download). model_name is threaded ds4 backend → ensure_service →
    ensure_model so the registration mirrors the right entry.
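
The "move on same FS, symlink across devices" step above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not coderai's actual code: the function name `relocate_into_cache` is hypothetical, and the cache directory is passed in directly rather than obtained from `get_model_cache_dir`.

```python
import os
import shutil
from pathlib import Path


def relocate_into_cache(downloaded: Path, cache_dir: Path) -> Path:
    """Place a downloaded GGUF file under cache_dir and return its new path.

    Hypothetical helper, for illustration only.
    Same filesystem: move the file (effectively a rename).
    Different filesystem: leave the file where it is and symlink to it,
    so a very large weight file is not copied across devices.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / downloaded.name

    # Already relocated on an earlier request (file or symlink present).
    if target.exists() or target.is_symlink():
        return target

    # st_dev identifies the device a path lives on.
    if os.stat(downloaded).st_dev == os.stat(cache_dir).st_dev:
        shutil.move(str(downloaded), str(target))
    else:
        os.symlink(downloaded.resolve(), target)
    return target
```

Comparing `st_dev` up front avoids `shutil.move` falling back to a full copy when source and destination sit on different devices, which matters for weight files in the 93GB to 154GB range mentioned in this commit.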
    
    Also: settings "Extra ds4-server args" hint/placeholder updated to reflect the
    auto --kv-disk-dir and SSD-streaming expert-cache sizing
    (--ssd-streaming-cache-experts), noting Q2_K can fail ds4's CUDA prefill.
    
    Diagnosis (no code change): ds4-server's "cuda prefill failed" on the 93GB
    Q2_K variant is a quant-specific ds4 CUDA bug — the 154GB Q4_K completes
    prefill fine (verified: "prompt done 434s" vs Q2_K instant failure), with
    15.8GB VRAM free either way (not OOM, not cache budget, not coderai).
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Name
codai
docs
packaging
samples
tests
tools
.dockerignore
.gitignore
AI.PROMPT
CODERAI_API_DOCUMENTATION.md
CoderAI.gif
DISTRIBUTION.md
LICENSE.md
MULTIMODAL_CAPABILITIES.md
MULTIMODAL_UI_EXAMPLES.md
README.md
build-oci.sh
build.ps1
build.sh
coderai
coderai-broker-implementation-reference.md
coderai-integration.md
commands
osxbuild.sh
package-oci.sh
package-tarball.sh
requirements-nvidia.txt
requirements-vulkan.txt
requirements.txt
run-oci.sh
smoke-test-oci.sh
todo.md
video_editor.config.json