Stefy Lanza (nextime / spora) authored
model-load/model-unload were proxied to the primary engine, so unloading (or loading) a model that lives on a secondary engine hit the wrong process and silently no-op'd (was_loaded=False).

Add front-proxy interceptors:
- unload: find the engine whose loaded_models matches the path and forward the request there; fall back to the primary.
- load: reuse an engine already serving the model, else the model's engine pin from models.json, else the primary.

Registered before the catch-all proxy, mirroring /admin/api/engines.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
8abd66c7
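
The engine-selection order described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the `Engine` dataclass, the `pick_unload_engine` and `pick_load_engine` helpers, and the `pins` mapping (standing in for the per-model engine pin read from models.json) are all hypothetical names, and "matches the path" is assumed here to mean simple set membership.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Engine:
    # Hypothetical stand-in for an engine record; only the fields
    # needed for routing are modelled here.
    name: str
    loaded_models: Set[str] = field(default_factory=set)


def find_serving_engine(engines: List[Engine], model_path: str) -> Optional[Engine]:
    """Return the first engine that currently has model_path loaded, if any."""
    for engine in engines:
        if model_path in engine.loaded_models:
            return engine
    return None


def pick_unload_engine(engines: List[Engine], primary: Engine, model_path: str) -> Engine:
    """Unload: forward to the engine holding the model, else fall back to the primary."""
    return find_serving_engine(engines, model_path) or primary


def pick_load_engine(
    engines: List[Engine],
    primary: Engine,
    model_path: str,
    pins: Dict[str, str],
) -> Engine:
    """Load: reuse a serving engine, else the pinned engine, else the primary."""
    serving = find_serving_engine(engines, model_path)
    if serving is not None:
        return serving
    # `pins` maps model path -> engine name (an assumed shape for the pin data).
    pinned_name = pins.get(model_path)
    for engine in engines:
        if engine.name == pinned_name:
            return engine
    return primary
```

Under these assumptions, an unload request for a model loaded only on a secondary engine resolves to that engine instead of the primary, which is the case the commit message says previously no-op'd.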
| Name | Last commit | Last update |
|---|---|---|
| __init__.py | | |
| app.py | | |
| assignment.py | | |
| engine_supervisor.py | | |
| gpu_detect.py | | |
| registry.py | | |
| router.py | | |