codai · c08a5b4f27907e64f2a61d32270651a903af6263 · nexlab / coderai

Implement proper loadswap/loadall/ondemand model management modes · c08a5b4f

Your Name authored Mar 19, 2026

- Default mode changed to ondemand (pre-load first model, unload/load on switch)
- loadswap: load first model in VRAM, others in CPU RAM, swap on switch
- loadall: try to load all models in VRAM, offload to CPU RAM if OOM
- --nopreload: skip pre-loading in any mode, load on first request
- request_model() now properly handles all three modes
- Added _move_model_to_cpu() and _move_model_to_vram() for loadswap
- Fixed NameError in request_model(): it referenced the global model_manager singleton instead of self
- Updated CLI help text for --loadall, --loadswap, --nopreload
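
The three modes listed above can be sketched as a small placement model. This is an illustrative sketch only, not the code from this commit: ModelManagerSketch, Mode, the preload and vram_slots parameters, and the location dict are hypothetical names, and no real weights or GPU calls are involved. Only request_model(), _move_model_to_cpu() and _move_model_to_vram() take their names from the commit message. The vram_slots counter stands in for a real out-of-memory check, and the loadall behaviour when switching to a CPU-resident model is an assumption.

```python
from enum import Enum


class Mode(Enum):
    ONDEMAND = "ondemand"
    LOADSWAP = "loadswap"
    LOADALL = "loadall"


class ModelManagerSketch:
    """Hypothetical stand-in that only tracks where each model lives."""

    def __init__(self, names, mode=Mode.ONDEMAND, preload=True, vram_slots=1):
        self.names = list(names)
        self.mode = mode
        self.vram_slots = vram_slots  # assumption: crude substitute for an OOM check
        self.location = {}            # model name -> "vram" or "cpu"; absent = unloaded
        self.active = None
        if preload and self.names:    # preload=False mimics --nopreload
            self._preload()

    def _vram_used(self):
        return sum(1 for loc in self.location.values() if loc == "vram")

    def _preload(self):
        first, rest = self.names[0], self.names[1:]
        self.location[first] = "vram"  # every mode pre-loads the first model
        self.active = first
        if self.mode is Mode.LOADSWAP:
            for name in rest:          # remaining models wait in CPU RAM
                self.location[name] = "cpu"
        elif self.mode is Mode.LOADALL:
            for name in rest:          # fill VRAM, fall back to CPU RAM when full
                fits = self._vram_used() < self.vram_slots
                self.location[name] = "vram" if fits else "cpu"

    def _move_model_to_cpu(self, name):
        self.location[name] = "cpu"

    def _move_model_to_vram(self, name):
        self.location[name] = "vram"

    def request_model(self, name):
        if name not in self.names:
            raise KeyError(name)
        if self.location.get(name) != "vram":
            if self.mode is Mode.ONDEMAND:
                # unload the active model entirely, then load the requested one
                if self.active is not None:
                    self.location.pop(self.active, None)
            elif self.mode is Mode.LOADSWAP:
                # park the active model in CPU RAM instead of unloading it
                if self.active is not None:
                    self._move_model_to_cpu(self.active)
            else:
                # LOADALL (assumed): evict the active model only if VRAM is full
                full = self._vram_used() >= self.vram_slots
                if full and self.active is not None:
                    self._move_model_to_cpu(self.active)
            self._move_model_to_vram(name)
        self.active = name
        return name


if __name__ == "__main__":
    manager = ModelManagerSketch(["a", "b", "c"], mode=Mode.LOADSWAP)
    manager.request_model("b")
    print(manager.location)
```

In this sketch the only difference between ondemand and loadswap is what happens to the previously active model on a switch: it is dropped in the first case and kept in CPU RAM in the second, which trades host memory for a faster switch back.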

Name
api
backends
models
pydantic
queue
__init__.py
cli.py
main.py