codai/api · c08a5b4f27907e64f2a61d32270651a903af6263 · nexlab / coderai

Centralize model resolution and VRAM management in MultiModelManager.request_model() · e004541a

Your Name authored Mar 19, 2026

- Added a request_model() method to MultiModelManager that handles:
  1. Alias resolution (image, audio, tts, vision, default, custom aliases)
  2. VRAM management (unloading previous models in ondemand mode)
  3. Checking whether the model is already loaded
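
  The three responsibilities listed above can be sketched roughly as follows.
  This is an illustration only, not the code from this commit: the fields
  aliases, loader, ondemand and loaded, the kind parameter, and the order of
  the steps are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class MultiModelManager:
    """Illustrative stand-in; not the project's real class."""

    aliases: Dict[str, str]        # hypothetical: alias -> concrete model name
    loader: Callable[[str], Any]   # hypothetical: callable that loads a model
    ondemand: bool = True          # hypothetical flag for ondemand mode
    loaded: Dict[str, Any] = field(default_factory=dict)

    def request_model(self, name: Optional[str], kind: str = "default") -> Any:
        # Alias resolution: a missing name falls back to the alias for
        # this kind; unknown names are treated as concrete model names.
        key = name or kind
        resolved = self.aliases.get(key, key)

        # Already loaded: return the cached model without touching VRAM.
        if resolved in self.loaded:
            return self.loaded[resolved]

        # VRAM management: in ondemand mode, drop previous models first.
        if self.ondemand:
            self.loaded.clear()

        model = self.loader(resolved)
        self.loaded[resolved] = model
        return model
```

  With a manager like this, each endpoint would make a single call such as
  manager.request_model(requested_name, kind="image") instead of carrying
  its own copy of the unloading logic.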

- Simplified codai/api/images.py:
  - Uses request_model() for model resolution and VRAM management
  - Extracted helper functions: _is_gguf_model(), _load_diffusers_pipeline(),
    _generate_with_diffusers(), _generate_with_sdcpp(), _load_sdcpp_model()
  - Removed duplicated sd.cpp generation code
  - Fixed semaphore scope (all generation now inside semaphore block)
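
  The semaphore-scope fix can be pictured with a minimal sketch. It assumes
  an asyncio.Semaphore guards generation, which the commit message does not
  state; generate_image(), _fake_generate() and demo() are hypothetical
  names, and the backends are simulated with a short sleep.

```python
import asyncio


async def _fake_generate(backend: str, prompt: str) -> str:
    # Stand-in for the real diffusers / sd.cpp generation work.
    await asyncio.sleep(0.01)
    return f"{backend}:{prompt}"


async def generate_image(sem: asyncio.Semaphore, backend: str,
                         prompt: str, log: list) -> str:
    # Both backends run inside the semaphore block, so neither path
    # can bypass the concurrency limit.
    async with sem:
        log.append(("start", prompt))
        if backend == "sdcpp":
            result = await _fake_generate("sd.cpp", prompt)
        else:
            result = await _fake_generate("diffusers", prompt)
        log.append(("end", prompt))
    return result


async def demo() -> list:
    sem = asyncio.Semaphore(1)  # hypothetical limit of one generation at a time
    log: list = []
    await asyncio.gather(
        generate_image(sem, "sdcpp", "a", log),
        generate_image(sem, "diffusers", "b", log),
    )
    return log
```

  Calling asyncio.run(demo()) shows the second request starting only after
  the first has finished, whichever backend each request uses.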

- Simplified codai/api/tts.py:
  - Uses request_model() instead of duplicated VRAM management code
  - Removed duplicate get_cached_model_path() and get_model_cache_dir() wrappers

- Simplified codai/api/transcriptions.py:
  - Uses request_model() instead of duplicated VRAM management code

- Simplified codai/api/text.py:
  - Both /v1/chat/completions and /v1/completions use request_model()
  - Removed duplicated VRAM management blocks

Name
__init__.py
app.py
images.py
log.py
state.py
text.py
transcriptions.py
tts.py