Add --no-ram option to maximize VRAM usage · b782a092
Authored by Your Name
    - Add --no-ram CLI option to force model loading without CPU RAM spilling
    - Implement --no-ram behavior for:
      - llama-cpp-python: n_gpu_layers=-1, use_mmap=False, ignore --n-ctx
      - HuggingFace transformers: device_map='cuda:0', low_cpu_mem_usage=True
      - Diffusers: force full GPU loading
      - sd.cpp: maximize GPU usage
    - Propagate flag through model manager
    - Add startup banner message
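The flag-propagation described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `build_parser`, `llama_cpp_kwargs`, and `transformers_kwargs` are hypothetical helper names, and the non-`--no-ram` defaults are assumptions; only the `--no-ram` settings (`n_gpu_layers=-1`, `use_mmap=False`, ignoring `--n-ctx`, `device_map='cuda:0'`, `low_cpu_mem_usage=True`) come from the commit message.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI wiring for the --no-ram flag.
    parser = argparse.ArgumentParser(prog="server")
    parser.add_argument(
        "--no-ram",
        action="store_true",
        help="force full-GPU model loading without CPU RAM spilling",
    )
    parser.add_argument("--n-ctx", type=int, default=2048)
    return parser


def llama_cpp_kwargs(args: argparse.Namespace) -> dict:
    # --no-ram path per the commit: offload all layers, disable mmap,
    # and ignore --n-ctx. The else-branch defaults are assumptions.
    if args.no_ram:
        return {"n_gpu_layers": -1, "use_mmap": False}
    return {"n_ctx": args.n_ctx}


def transformers_kwargs(args: argparse.Namespace) -> dict:
    # --no-ram path per the commit for HuggingFace transformers.
    if args.no_ram:
        return {"device_map": "cuda:0", "low_cpu_mem_usage": True}
    return {}


if __name__ == "__main__":
    args = build_parser().parse_args(["--no-ram"])
    print(llama_cpp_kwargs(args))
    print(transformers_kwargs(args))
```

The kwargs dicts would then be passed through the model manager into each backend's loader (e.g. `Llama(**llama_cpp_kwargs(args))`), which matches the commit's "propagate flag through model manager" note.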
Files in this commit's tree:
- api
- backends
- models
- pydantic
- queue
- __init__.py
- cli.py
- main.py