Files · 7fc393d404bf25f61d3c67d412c17073b0b942b9 · nexlab / coderai

ds4: configurable CUDA env knobs (expert-cache reserve + free-form extra_env) · 7fc393d4

Stefy Lanza (nextime / spora) authored Jun 20, 2026

ds4-server exposes several CUDA tunables only via environment variables, not
CLI flags. By default ds4 reserves half the card for non-cache use and
allocates the model weight arena in 1792 MiB chunks. Both defaults can starve
the streaming expert cache, or OOM it, on small-weight MoE models served from
SSD.

Pass an explicit env to ds4-server (Popen now sets env=) with:
  - expert_cache_reserve_gb: typed knob -> DS4_CUDA_STREAMING_EXPERT_CACHE_RESERVE_GB
    (0 = leave ds4's default).
  - extra_env: free-form KEY=VALUE passthrough for the rest, e.g.
    DS4_CUDA_WEIGHT_ARENA_CHUNK_MB=512 to shrink the weight-arena chunk so it
    fits a heap fragmented by the expert cache.
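
A minimal sketch of how the two knobs above could be merged into the env handed
to Popen. The function name build_ds4_env, the list-of-strings form of
extra_env, and the number formatting are assumptions for illustration, not the
code in this commit; only the two DS4_CUDA_* variable names come from the
message above.

```python
import os
from typing import Iterable, Mapping, Optional

# Variable name taken from the commit message.
RESERVE_VAR = "DS4_CUDA_STREAMING_EXPERT_CACHE_RESERVE_GB"


def build_ds4_env(
    expert_cache_reserve_gb: float = 0,
    extra_env: Iterable[str] = (),
    base_env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return a copy of the base environment with the ds4 knobs applied.

    Hypothetical helper: with the defaults (0 and no extra entries) the
    result equals the base environment, so behaviour is unchanged.
    """
    env = dict(os.environ if base_env is None else base_env)

    # Typed knob: 0 means "leave ds4's default", so the variable is not set.
    if expert_cache_reserve_gb > 0:
        env[RESERVE_VAR] = f"{expert_cache_reserve_gb:g}"

    # Free-form passthrough: one KEY=VALUE string per entry (assumed format).
    for entry in extra_env:
        entry = entry.strip()
        if not entry or "=" not in entry:
            continue  # skip blank or malformed entries
        key, _, value = entry.partition("=")
        key = key.strip()
        if key:
            env[key] = value.strip()
    return env


# Usage (not executed here; the command line is a placeholder):
#   import subprocess
#   env = build_ds4_env(4, ["DS4_CUDA_WEIGHT_ARENA_CHUNK_MB=512"])
#   subprocess.Popen(["ds4-server"], env=env)
```

In this sketch extra_env entries are applied after the typed knob, so a
passthrough entry with the same key would override it; the actual precedence
in the commit is not stated in the message.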

Both are surfaced in Settings (config + admin GET/POST + UI) and default to a
no-op, so behaviour is unchanged unless they are set.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>


Name
codai
docs
packaging
samples
tests
tools
.dockerignore
.gitignore
AI.PROMPT
CODERAI_API_DOCUMENTATION.md
CoderAI.gif
DISTRIBUTION.md
LICENSE.md
MULTIMODAL_CAPABILITIES.md
MULTIMODAL_UI_EXAMPLES.md
README.md
build-oci.sh
build.ps1
build.sh
coderai
coderai-broker-implementation-reference.md
coderai-integration.md
commands
osxbuild.sh
package-oci.sh
package-tarball.sh
requirements-nvidia.txt
requirements-vulkan.txt
requirements.txt
run-oci.sh
smoke-test-oci.sh
todo.md
video_editor.config.json

README.md