Stefy Lanza (nextime / spora) authored
pick_engine honours the front's assignment (radeon) only if the engine can_serve the request's required capability. But _required_cap derived that capability from the bare alias 'coe-…-q4_k_m', which contains no literal 'gguf', so required_capability returned 'transformers' (CUDA-only). radeon is gguf-only, so it failed can_serve and the request fell through to the default engine (nvidia), even though compute_assignment had correctly placed the model on radeon (it sees the full '…-q4_k_m.gguf' path).

Resolve the model's configured path in _load_pins (now indexed by the .gguf-stripped stem too) and, when the name heuristic yields 'transformers' but that path is a .gguf, correct the capability to 'gguf'. whisper/ds4 precedence is unchanged.

Combined with the registry stem-matching, a bare-alias request now lands on the owning Vulkan/AMD engine.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
482e47cf
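The capability correction described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `name_heuristic`, `build_pin_index`, and the `demo-model-q4_k_m` alias are hypothetical stand-ins, and how whisper/ds4 models are actually detected is an assumption made only to show that their precedence is preserved.

```python
from pathlib import Path
from typing import Dict, Iterable


def name_heuristic(model: str) -> str:
    """Hypothetical name-based guess; the real heuristic may differ."""
    lowered = model.lower()
    if "whisper" in lowered:   # assumed detection rule
        return "whisper"
    if "ds4" in lowered:       # assumed detection rule
        return "ds4"
    if "gguf" in lowered:
        return "gguf"
    return "transformers"


def build_pin_index(configured_paths: Iterable[str]) -> Dict[str, str]:
    """Index configured paths by file name and by the .gguf-stripped stem."""
    index: Dict[str, str] = {}
    for path in configured_paths:
        name = Path(path).name
        index[name] = path
        if name.lower().endswith(".gguf"):
            index[name[: -len(".gguf")]] = path
    return index


def required_capability(model: str, pin_index: Dict[str, str]) -> str:
    """Apply the name heuristic, then correct it using the configured path."""
    cap = name_heuristic(model)
    path = pin_index.get(model)
    # Only a 'transformers' guess is overridden, so whisper/ds4 keep precedence.
    if cap == "transformers" and path is not None and path.lower().endswith(".gguf"):
        return "gguf"
    return cap
```

With an index built from a hypothetical path such as `/models/demo-model-q4_k_m.gguf`, a request for the bare alias `demo-model-q4_k_m` resolves to `gguf` instead of `transformers`, so a gguf-only engine would no longer be rejected on capability grounds.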
| Name | Last commit | Last update |
|---|---|---|
| .. | ||
| __init__.py | ||
| app.py | ||
| assignment.py | ||
| engine_supervisor.py | ||
| gpu_detect.py | ||
| registry.py | ||
| router.py |