    multi-engine: live /v1/models on config change + accept gguf-stem ids · 79c2e44d
    Stefy Lanza (nextime / spora ) authored
    On a multi-engine node, two bugs made a freshly configured model unusable
    until a full restart:
    
    1. Name mismatch: list_models advertises a gguf's filename WITHOUT .gguf as an
       id, but get_all_allowed_identifiers only allowed the name WITH .gguf, so a
       request using the id from /v1/models was 404'd as "not an allowed model".
       Now the .gguf-stripped stem is allowed too.
    
    2. Stale per-engine assignment: each engine's /v1/models is filtered by the
       assignment set fixed at startup, and secondary engines never re-read
       models.json — so an added/removed model didn't show up or route until
       restart. The front now watches models.json mtime, recomputes the
       assignment, updates its router, and pushes it to every engine via a new
       internal POST /internal/reload-config (re-reads models.json +
       set_assigned_models). /v1/models and routing now reflect add/remove live.
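
A minimal sketch of the two fixes described above, not the code from this commit. The names allowed_identifiers, ConfigWatcher, and reload_if_changed are hypothetical, as are the recompute and push callables, which stand in for the real assignment logic and the POST /internal/reload-config call. It assumes models.json is valid JSON and makes no assumption about its schema.

```python
import json
from pathlib import Path


def allowed_identifiers(model_files):
    """Return every id a request may use: each filename, plus its .gguf-stripped stem."""
    allowed = set()
    for name in model_files:
        allowed.add(name)
        if name.lower().endswith(".gguf"):
            allowed.add(name[: -len(".gguf")])
    return allowed


class ConfigWatcher:
    """Polls a config file's mtime and reports when it has changed."""

    def __init__(self, path):
        self.path = Path(path)
        self._last_mtime = self._mtime()

    def _mtime(self):
        # A missing file is reported as None instead of raising.
        try:
            return self.path.stat().st_mtime_ns
        except FileNotFoundError:
            return None

    def poll(self):
        """Return the parsed config if the file changed since the last poll, else None."""
        current = self._mtime()
        if current == self._last_mtime:
            return None
        self._last_mtime = current
        if current is None:
            return None
        with self.path.open() as fh:
            return json.load(fh)


def reload_if_changed(watcher, engines, recompute, push):
    """On a config change, recompute the assignment and push it to every engine.

    recompute(config, engines) -> dict mapping engine to a list of model ids
    push(engine, models)       -> delivers the new list to that engine
    Both callables are placeholders supplied by the caller.
    """
    config = watcher.poll()
    if config is None:
        return False
    assignment = recompute(config, engines)
    for engine in engines:
        push(engine, assignment.get(engine, []))
    return True
```

Calling reload_if_changed periodically from the front's main loop would give the live behaviour described above. Polling mtime keeps the sketch dependency-free; a filesystem notification API would be an alternative.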
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
engine_supervisor.py 29.6 KB