Implement on-demand model swapping for multiple models (362b8452) · Commits · nexlab / coderai

Commit 362b8452 authored Mar 15, 2026 by Your Name

Implement on-demand model swapping for multiple models

- Add model_backend_types dict to track backend for each model
- Update set_default_model to accept backend_type parameter
- Modify get_model_for_request to swap models on demand when running in ondemand mode
- When a request arrives for a different model, unload the current model from VRAM and load the requested one
- Respect the --backend flag when loading models on demand
- On-demand swapping activates only when neither --loadall nor --loadswap is specified
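
The diff itself is not reproduced here, so the following is only a minimal sketch of the swapping behavior the message describes, not the committed code. Only the names model_backend_types, set_default_model, and get_model_for_request come from the commit message. The ModelManager class, the loader/unloader callables, the mode string, and the default_backend value are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ModelManager:
    """Hypothetical container; the real project layout is not shown in the commit."""

    # Hypothetical callables: loader(model_name, backend_type) returns a model
    # handle, unloader(model) releases the VRAM held by that handle.
    loader: Callable[[str, str], object]
    unloader: Callable[[object], None]
    # Assumption: the mode is "ondemand" when neither --loadall nor --loadswap is given.
    mode: str = "ondemand"
    # Assumption: stands in for the value passed via --backend.
    default_backend: str = "auto"
    # Named in the commit message: tracks the backend used for each model.
    model_backend_types: Dict[str, str] = field(default_factory=dict)
    default_model: Optional[str] = None
    current_name: Optional[str] = None
    current_model: Optional[object] = None

    def set_default_model(self, name: str, backend_type: Optional[str] = None) -> None:
        """Record the default model and the backend it should be loaded with."""
        self.default_model = name
        self.model_backend_types[name] = backend_type or self.default_backend

    def get_model_for_request(self, requested: Optional[str] = None) -> object:
        """Return a loaded model, swapping it in first when in ondemand mode."""
        name = requested or self.default_model
        if name is None:
            raise ValueError("no model requested and no default model set")

        # The requested model is already resident: nothing to do.
        if self.current_model is not None and name == self.current_name:
            return self.current_model

        # Other modes are out of scope for this sketch; keep what is loaded.
        if self.current_model is not None and self.mode != "ondemand":
            return self.current_model

        # Unload the current model before loading the new one, so that only
        # one model occupies VRAM at a time.
        if self.current_model is not None:
            self.unloader(self.current_model)
            self.current_model = None
            self.current_name = None

        # Use the backend recorded for this model, else the --backend stand-in.
        backend = self.model_backend_types.get(name, self.default_backend)
        self.current_model = self.loader(name, backend)
        self.current_name = name
        self.model_backend_types[name] = backend
        return self.current_model
```

In this sketch, repeated requests for the same model reuse the resident handle, and a request for a different model triggers exactly one unload followed by one load. Unloading first trades swap latency for a lower peak VRAM footprint, which appears to be the intent of the commit.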

parent ebfa6892
