    Fix: Centralize model unloading - properly handle all model types in ondemand mode · 00775972
    Your Name authored
    - Added unload_all_models() to MultiModelManager that handles ALL model types:
      ModelManager, diffusers pipelines, sd.cpp StableDiffusion, and any other objects
    - Text endpoints now properly unload image models before loading text models
    - Image endpoints now properly unload text models before loading image models
    - The rule: in ondemand mode, if the model in VRAM differs from the requested
      model (regardless of type), fully unload before loading the new one
    - Includes gc.collect(), torch.cuda.empty_cache(), and 1s settle delay
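
The unloading rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name `MultiModelManagerSketch`, the `loaded` dict, `ensure_loaded()`, the `settle_seconds` parameter, and the `unload`/`close` hooks are all assumptions made for the example. Only `unload_all_models()`, `gc.collect()`, `torch.cuda.empty_cache()`, and the 1s delay come from the message itself.

```python
import gc
import time

try:
    import torch  # optional here; the sketch also runs without PyTorch
except ImportError:
    torch = None


class MultiModelManagerSketch:
    """Hypothetical stand-in for MultiModelManager in ondemand mode."""

    def __init__(self, settle_seconds=1.0):
        # name -> loaded object of any type (ModelManager, diffusers
        # pipeline, sd.cpp StableDiffusion, ...). Assumed layout.
        self.loaded = {}
        self.settle_seconds = settle_seconds

    def unload_all_models(self):
        """Drop every loaded model regardless of type, then free memory."""
        for name in list(self.loaded):
            model = self.loaded.pop(name)
            # Call a cleanup hook if the object has one (hypothetical names).
            for hook in ("unload", "close"):
                fn = getattr(model, hook, None)
                if callable(fn):
                    fn()
                    break
            del model
        gc.collect()
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()
        # Give the driver a moment to release memory before the next load.
        time.sleep(self.settle_seconds)

    def ensure_loaded(self, name, loader):
        """Return the requested model, unloading any different one first."""
        if name in self.loaded:
            return self.loaded[name]
        if self.loaded:
            self.unload_all_models()
        self.loaded[name] = loader()
        return self.loaded[name]
```

Under these assumptions, a text endpoint and an image endpoint would both call `ensure_loaded()` with their own loader, so whichever model currently occupies VRAM is fully released before a different one is loaded.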
Name
.vscode
codai
.gitignore
LICENSE.md
README.md
build.sh
coder
coderai
requirements-nvidia.txt
requirements-vulkan.txt
requirements.txt