codai/api · 1d457be748a3269a0d9b54cfb2cb7904acf0f8aa · nexlab / coderai

Add --no-ram option to maximize VRAM usage · b782a092

Your Name authored Mar 20, 2026

- Add --no-ram CLI option to force model loading without CPU RAM spilling
- Implement --no-ram behavior for:
  - llama-cpp-python: n_gpu_layers=-1, use_mmap=False, ignore --n-ctx
  - HuggingFace transformers: device_map='cuda:0', low_cpu_mem_usage=True
  - Diffusers: force full GPU loading
  - sd.cpp: maximize GPU usage
- Propagate flag through model manager
- Add startup banner message
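
A minimal sketch of how the per-backend settings listed above could be derived from a single flag. This is not the code from the commit: the helper names `build_parser`, `llama_cpp_kwargs`, and `transformers_kwargs` are hypothetical, and "ignore --n-ctx" is read here as "do not forward the user's context size". Only the keyword names (`n_gpu_layers`, `use_mmap`, `n_ctx`, `device_map`, `low_cpu_mem_usage`) come from the commit message and the respective libraries.

```python
import argparse
from typing import Any, Dict, Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI exposing the two flags named in the commit message."""
    parser = argparse.ArgumentParser(prog="coderai")
    parser.add_argument(
        "--no-ram",
        action="store_true",
        help="load models fully on the GPU, without spilling into CPU RAM",
    )
    parser.add_argument("--n-ctx", type=int, default=None, help="context size")
    return parser


def llama_cpp_kwargs(no_ram: bool, n_ctx: Optional[int]) -> Dict[str, Any]:
    """Keyword arguments intended for llama_cpp.Llama(...)."""
    kwargs: Dict[str, Any] = {}
    if no_ram:
        kwargs["n_gpu_layers"] = -1   # -1 offloads every layer to the GPU
        kwargs["use_mmap"] = False    # do not memory-map the model file
        # Assumption: --n-ctx is ignored by simply not forwarding it.
    elif n_ctx is not None:
        kwargs["n_ctx"] = n_ctx
    return kwargs


def transformers_kwargs(no_ram: bool) -> Dict[str, Any]:
    """Keyword arguments intended for a transformers from_pretrained(...) call."""
    if no_ram:
        return {"device_map": "cuda:0", "low_cpu_mem_usage": True}
    return {}


if __name__ == "__main__":
    args = build_parser().parse_args(["--no-ram", "--n-ctx", "4096"])
    print(llama_cpp_kwargs(args.no_ram, args.n_ctx))
    print(transformers_kwargs(args.no_ram))
```

Returning plain dictionaries keeps the flag handling testable without a GPU or the heavy libraries installed; a model manager could pass them through as `Llama(model_path, **kwargs)` or `from_pretrained(name, **kwargs)`.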

Name
__init__.py
app.py
images.py
log.py
state.py
text.py
transcriptions.py
tts.py