    front: route gguf bare alias by capability to its real engine, not nvidia · 482e47cf
    Stefy Lanza (nextime / spora ) authored
    pick_engine honours the front's assignment (radeon) only if the engine
    can_serve the request's required capability. But _required_cap derived
    that capability from the bare alias 'coe-…-q4_k_m' — no literal 'gguf' —
    so required_capability returned 'transformers' (CUDA-only). radeon is
    gguf-only, failed can_serve, and the request fell through to the default
    engine (nvidia), even though compute_assignment had correctly placed the
    model on radeon (it sees the full '…-q4_k_m.gguf' path).
    
    Resolve the model's configured path in _load_pins (now indexed by the
    .gguf-stripped stem too) and, when the name heuristic yields
    'transformers' but that path is a .gguf, correct the capability to
    'gguf'. whisper/ds4 precedence is unchanged. Combined with the registry
    stem-matching, a bare-alias request now lands on the owning Vulkan/AMD
    engine.
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
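The capability correction described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `name_capability`, `index_pins`, and `required_cap` are hypothetical stand-ins for `required_capability`, `_load_pins`, and `_required_cap`, the keyword checks inside the name heuristic are assumptions, and the model name and path are made-up examples.

```python
from pathlib import PurePosixPath


def name_capability(model: str) -> str:
    """Hypothetical name heuristic; the real keyword rules are assumed."""
    lowered = model.lower()
    if "whisper" in lowered:
        return "whisper"
    if "ds4" in lowered:
        return "ds4"
    if "gguf" in lowered:
        return "gguf"
    # A bare alias with no literal 'gguf' ends up here.
    return "transformers"


def index_pins(paths: list[str]) -> dict[str, str]:
    """Index configured paths by file name and by the .gguf-stripped stem."""
    pins: dict[str, str] = {}
    for path in paths:
        filename = PurePosixPath(path).name
        pins[filename] = path
        if filename.lower().endswith(".gguf"):
            pins[filename[: -len(".gguf")]] = path
    return pins


def required_cap(model: str, pins: dict[str, str]) -> str:
    """Correct 'transformers' to 'gguf' when the configured path is a .gguf."""
    cap = name_capability(model)
    path = pins.get(model)
    if cap == "transformers" and path is not None and path.lower().endswith(".gguf"):
        return "gguf"
    # whisper/ds4 (and anything else) keep the heuristic's answer.
    return cap


# Made-up model path for illustration only.
pins = index_pins(["/models/example-model-q4_k_m.gguf"])
print(name_capability("example-model-q4_k_m"))
print(required_cap("example-model-q4_k_m", pins))
```

Because the correction only fires when the heuristic returns 'transformers', names that already resolve to whisper or ds4 are unaffected, which matches the precedence note in the commit message.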
Name
__init__.py
app.py
assignment.py
engine_supervisor.py
gpu_detect.py
registry.py
router.py