Files · 00775972af0105acd35d03d665a211685d9d84c0 · nexlab / coderai

Fix: Centralize model unloading - properly handle all model types in ondemand mode · 00775972

Your Name authored Mar 19, 2026

- Added unload_all_models() to MultiModelManager that handles ALL model types:
  ModelManager, diffusers pipelines, sd.cpp StableDiffusion, and any other objects
- Text endpoints now properly unload image models before loading text models
- Image endpoints now properly unload text models before loading image models
- The rule: in ondemand mode, if the model in VRAM differs from the requested
  model (regardless of type), fully unload before loading the new one
- The unload path runs gc.collect() and torch.cuda.empty_cache(), then waits
  1s for memory to settle
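
The rule above can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the class name `MultiModelManagerSketch`, the `ensure_loaded()` helper, the `settle_seconds` parameter, and the assumption that some model objects expose an `unload()` method are all hypothetical. Only `unload_all_models()`, `gc.collect()`, `torch.cuda.empty_cache()`, and the 1s delay come from the commit message.

```python
import gc
import time

try:
    import torch  # optional here so the sketch also runs on machines without PyTorch
except ImportError:
    torch = None


class MultiModelManagerSketch:
    """Hypothetical stand-in for MultiModelManager; not the project's class."""

    def __init__(self, settle_seconds: float = 1.0):
        self.models = {}  # name -> loaded model object of any type
        self.settle_seconds = settle_seconds

    def unload_all_models(self) -> int:
        """Drop every loaded model, whatever its type, and return how many were dropped."""
        count = len(self.models)
        for name in list(self.models):
            model = self.models.pop(name)
            # Assumption: some model wrappers expose their own unload() hook.
            if callable(getattr(model, "unload", None)):
                model.unload()
            del model
        gc.collect()
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()
        if count:
            # Settle delay so freed memory is visible before the next load.
            time.sleep(self.settle_seconds)
        return count

    def ensure_loaded(self, name, loader):
        """Ondemand rule: if the requested model is not the resident one, unload first."""
        if name in self.models:
            return self.models[name]
        self.unload_all_models()
        self.models[name] = loader()
        return self.models[name]
```

Under these assumptions, a text endpoint would call something like `manager.ensure_loaded("some-text-model", load_text_model)` and an image endpoint would make the same call with its own loader, so both go through a single unload path regardless of which model type currently occupies VRAM.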

Name
.vscode
codai
.gitignore
LICENSE.md
README.md
build.sh
coder
coderai
requirements-nvidia.txt
requirements-vulkan.txt
requirements.txt

README.md