    ds4: configurable CUDA env knobs (expert-cache reserve + free-form extra_env) · 7fc393d4
    Stefy Lanza (nextime / spora ) authored
    ds4-server exposes several CUDA tunables only via environment variables, not
    CLI flags. By default ds4 reserves half the card's memory for non-cache use and
    allocates the model weight arena in 1792 MiB chunks; both defaults starve or
    OOM the streaming expert cache on small-weight MoE models served from SSD.
    
    Pass an explicit env to ds4-server (Popen now sets env=) with:
      - expert_cache_reserve_gb: typed knob -> DS4_CUDA_STREAMING_EXPERT_CACHE_RESERVE_GB
        (0 = leave ds4's default).
      - extra_env: free-form KEY=VALUE passthrough for the rest, e.g.
        DS4_CUDA_WEIGHT_ARENA_CHUNK_MB=512 to shrink the weight-arena chunk so it
        fits a heap fragmented by the expert cache.
    
    Both are surfaced in Settings (config + admin GET/POST + UI) and default to a
    no-op, so behaviour is unchanged unless they are set.
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
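
The env handling described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the project's code: the helper name `build_ds4_env`, its signature, and the newline-separated `KEY=VALUE` format for `extra_env` are assumptions. Only the two `DS4_CUDA_*` variable names and the "0 = leave ds4's default" rule come from the commit message.

```python
import os

# Variable name taken from the commit message.
RESERVE_VAR = "DS4_CUDA_STREAMING_EXPERT_CACHE_RESERVE_GB"


def build_ds4_env(base_env, expert_cache_reserve_gb=0, extra_env=""):
    """Return a copy of base_env with the ds4 CUDA knobs applied.

    Hypothetical helper: name, signature, and the newline-separated
    KEY=VALUE format of extra_env are assumptions for illustration.
    """
    env = dict(base_env)  # copy, so the caller's mapping is not mutated

    # 0 (or any non-positive value) means "leave ds4's default": set nothing.
    if expert_cache_reserve_gb and expert_cache_reserve_gb > 0:
        env[RESERVE_VAR] = str(expert_cache_reserve_gb)

    # Free-form passthrough: one KEY=VALUE per line; blank lines, '#' comments,
    # and lines without '=' are skipped.
    for raw in extra_env.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key:
            env[key] = value.strip()
    return env


# With both knobs at their defaults the environment is passed through unchanged.
env = build_ds4_env(
    os.environ,
    expert_cache_reserve_gb=2,
    extra_env="DS4_CUDA_WEIGHT_ARENA_CHUNK_MB=512",
)
# The result would then be handed to the child process, e.g.
# subprocess.Popen(cmd, env=env), where cmd is the argv that launches ds4-server.
```

In this sketch `extra_env` is applied after the typed knob, so a passthrough entry with the same key overrides it. Whether the actual change resolves that conflict the same way is not stated in the commit message.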
Name
codai
docs
packaging
samples
tests
tools
.dockerignore
.gitignore
AI.PROMPT
CODERAI_API_DOCUMENTATION.md
CoderAI.gif
DISTRIBUTION.md
LICENSE.md
MULTIMODAL_CAPABILITIES.md
MULTIMODAL_UI_EXAMPLES.md
README.md
build-oci.sh
build.ps1
build.sh
coderai
coderai-broker-implementation-reference.md
coderai-integration.md
commands
osxbuild.sh
package-oci.sh
package-tarball.sh
requirements-nvidia.txt
requirements-vulkan.txt
requirements.txt
run-oci.sh
smoke-test-oci.sh
todo.md
video_editor.config.json