codai · 00775972af0105acd35d03d665a211685d9d84c0 · nexlab / coderai

Fix: Centralize model unloading - properly handle all model types in ondemand mode · 00775972

Your Name authored Mar 19, 2026

- Added unload_all_models() to MultiModelManager that handles ALL model types:
  ModelManager, diffusers pipelines, sd.cpp StableDiffusion, and any other objects
- Text endpoints now properly unload image models before loading text models
- Image endpoints now properly unload text models before loading image models
- The rule: in ondemand mode, if the model in VRAM differs from the requested
  model (regardless of type), fully unload before loading the new one
- Unloading includes gc.collect(), torch.cuda.empty_cache(), and a 1s settle delay
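
The ondemand rule described above could look roughly like the sketch below. It is not the code from this commit. Only the names MultiModelManager and unload_all_models() come from the commit message. The models dict, the settle_delay parameter, the ensure_loaded() helper, and the optional unload() hook on model objects are assumptions made for illustration. torch is imported optionally so the sketch also runs on machines without it.

```python
import gc
import time

try:  # optional: the sketch still runs where torch is not installed
    import torch
except ImportError:
    torch = None


class MultiModelManager:
    """Illustrative stand-in; the real class in the repository may differ."""

    def __init__(self, settle_delay=1.0):
        self.models = {}  # hypothetical registry: name -> loaded model object
        self.settle_delay = settle_delay  # seconds to wait after unloading

    def unload_all_models(self):
        """Drop every loaded model, whatever its type, then release memory."""
        unloaded = []
        model = None
        while self.models:
            name, model = self.models.popitem()
            # Assumption: some wrappers expose their own unload() hook.
            if callable(getattr(model, "unload", None)):
                model.unload()
            unloaded.append(name)
        model = None  # drop the last local reference before collecting
        gc.collect()
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()  # return cached CUDA blocks to the driver
        if unloaded:
            time.sleep(self.settle_delay)  # settle delay from the commit message
        return unloaded

    def ensure_loaded(self, name, loader):
        """Ondemand rule: if a different model is resident, unload it first."""
        if name in self.models:
            return self.models[name]
        self.unload_all_models()
        self.models[name] = loader()
        return self.models[name]
```

A text endpoint and an image endpoint would both call ensure_loaded() with their own loader, so whichever model type is currently resident is cleared before the new one is loaded.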


Name
api
backends
models
pydantic
queue
__init__.py
cli.py
main.py