Your Name authored
- Added request_model() method to MultiModelManager that handles:
  1. Alias resolution (image, audio, tts, vision, default, custom aliases)
  2. VRAM management (unloading previous models in ondemand mode)
  3. Checking if model is already loaded
- Simplified codai/api/images.py:
  - Uses request_model() for model resolution and VRAM management
  - Extracted helper functions: _is_gguf_model(), _load_diffusers_pipeline(), _generate_with_diffusers(), _generate_with_sdcpp(), _load_sdcpp_model()
  - Removed duplicated sd.cpp generation code
  - Fixed semaphore scope (all generation now inside semaphore block)
- Simplified codai/api/tts.py:
  - Uses request_model() instead of duplicated VRAM management code
  - Removed duplicate get_cached_model_path() and get_model_cache_dir() wrappers
- Simplified codai/api/transcriptions.py:
  - Uses request_model() instead of duplicated VRAM management code
- Simplified codai/api/text.py:
  - Both /v1/chat/completions and /v1/completions use request_model()
  - Removed duplicated VRAM management blocks

e004541a
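The three responsibilities listed for request_model() can be illustrated with a minimal sketch. This is not the project's code: the class name ModelManagerSketch, the constructor parameters (aliases, loader, unloader, load_mode), and the order of the steps are all assumptions made for illustration. Only the method name request_model() and the "ondemand" mode come from the commit message.

```python
from typing import Any, Callable, Dict, Optional


class ModelManagerSketch:
    """Hypothetical stand-in for MultiModelManager (illustration only)."""

    def __init__(
        self,
        aliases: Dict[str, str],
        loader: Callable[[str], Any],
        unloader: Optional[Callable[[str], None]] = None,
        load_mode: str = "ondemand",
    ) -> None:
        self.aliases = dict(aliases)      # e.g. {"image": "...", "tts": "..."}
        self.loader = loader              # assumed callback that loads a model
        self.unloader = unloader or (lambda name: None)  # assumed VRAM release hook
        self.load_mode = load_mode
        self.loaded: Dict[str, Any] = {}  # resolved name -> loaded model

    def request_model(self, name: str) -> Any:
        # 1. Alias resolution: unknown names pass through unchanged.
        resolved = self.aliases.get(name, name)

        # 2. Already loaded? Return it without touching other models.
        if resolved in self.loaded:
            return self.loaded[resolved]

        # 3. VRAM management: in ondemand mode, unload everything else first.
        if self.load_mode == "ondemand":
            for other in list(self.loaded):
                self.unloader(other)
                del self.loaded[other]

        model = self.loader(resolved)
        self.loaded[resolved] = model
        return model
```

With a single entry point like this, each endpoint (images, TTS, transcriptions, text) can call request_model() with whatever name or alias the client sent, rather than repeating the unload-then-load logic itself.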
| Name | Last commit | Last update |
|---|---|---|
| __init__.py | ||
| app.py | ||
| images.py | ||
| log.py | ||
| state.py | ||
| text.py | ||
| transcriptions.py | ||
| tts.py | ||