- Add `--no-ram` CLI option to force model loading without CPU RAM spilling
- Implement `--no-ram` behavior for each backend (sketched below):
  - llama-cpp-python: `n_gpu_layers=-1`, `use_mmap=False`, ignore `--n-ctx`
  - HuggingFace transformers: `device_map='cuda:0'`, `low_cpu_mem_usage=True`
  - Diffusers: force full GPU loading
  - sd.cpp: maximize GPU usage
- Propagate flag through model manager
- Add startup banner message
b782a092
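The commit message names the per-backend settings but not the wiring around them. The sketch below shows one plausible way a `--no-ram` flag could map onto loader keyword arguments; the CLI setup and function names are hypothetical, and only the backend parameters (`n_gpu_layers`, `use_mmap`, `device_map`, `low_cpu_mem_usage`) come from the commit message itself.

```python
# Hypothetical sketch of --no-ram flag handling; function names and CLI
# wiring are illustrative, not this repository's actual API.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--no-ram",
        action="store_true",
        help="Load the model entirely on the GPU, without spilling to CPU RAM",
    )
    parser.add_argument("--n-ctx", type=int, default=2048)
    return parser


def llama_cpp_kwargs(no_ram: bool, n_ctx: int) -> dict:
    """Keyword arguments for llama_cpp.Llama()."""
    if no_ram:
        # Offload every layer to the GPU and disable mmap so weights are
        # not paged through host memory; --n-ctx is ignored in this mode.
        return {"n_gpu_layers": -1, "use_mmap": False}
    return {"n_ctx": n_ctx}


def transformers_kwargs(no_ram: bool) -> dict:
    """Keyword arguments for AutoModelForCausalLM.from_pretrained()."""
    if no_ram:
        # Pin the whole model to a single GPU and skip the full-size
        # CPU-RAM staging copy transformers would otherwise build.
        return {"device_map": "cuda:0", "low_cpu_mem_usage": True}
    return {"device_map": "auto"}


def diffusers_device(no_ram: bool) -> str:
    # For Diffusers, "full GPU loading" would mean moving the pipeline
    # to CUDA and not enabling any CPU-offload hooks.
    return "cuda" if no_ram else "cpu"
```

Under this sketch, the model manager would carry the parsed `no_ram` value down to each backend's loader and print a one-line banner at startup when the flag is active, matching the propagation and banner items in the commit message.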
| Name | Last commit | Last update |
|---|---|---|
| cache | | |
| __init__.py | | |
| capabilities.py | | |
| grammar.py | | |
| manager.py | | |
| parser.py | | |
| templates.py | | |
| tool_call_grammar.gbnf | | |
| utils.py | | |