    whisper: account a running runner as a loaded model for VRAM eviction · 2a214215
    Stefy Lanza (nextime / spora ) authored
    Starting a whisper-server runner loads the gguf onto the GPU, but the runner
    was invisible to the VRAM-eviction logic: it never evicted other models to make
    room, recorded no footprint, and (lacking a cleanup()) couldn't itself be evicted.
    
    - WhisperServerManager.cleanup() -> stop(), so _evict_one/unload_model can
      free its VRAM like any other model.
    - MultiModelManager.start_whisper_server(): estimate the gguf footprint, evict
      other models if free VRAM is short, start the subprocess, and register it in
      models/models_in_vram/_measured_vram_gb (active_in_vram). It's now both a
      trigger for eviction and an eviction candidate.
    - stop_whisper_server(): stop + clear all that accounting (frees VRAM).
    - Routed every start/stop through these: on-request transcription, engine
      startup pre-load, admin model-load (Load button) and model-unload/disable.
    
    So: starting a runner = a model load (evicts as needed); unloading = frees VRAM.
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
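
The flow described in the commit message (estimate the footprint, evict if free VRAM is short, start, register, and clear the accounting on stop) can be sketched as a toy simulation. This is not the repository's code: the Toy* classes, the oldest-first eviction policy, the overhead factor in the estimate, and the fixed VRAM budget are all assumptions for illustration, and no subprocess or GPU is involved. Only the names cleanup, stop, _evict_one, unload_model, start_whisper_server, stop_whisper_server, models, models_in_vram and _measured_vram_gb are taken from the message.

```python
def estimate_gguf_footprint_gb(size_bytes, overhead_factor=1.2):
    """Hypothetical heuristic: file size in GiB times an assumed overhead."""
    return size_bytes / 1024**3 * overhead_factor


class ToyWhisperServer:
    """Stand-in for a whisper-server runner; nothing is actually spawned."""

    def __init__(self, name):
        self.name = name
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def cleanup(self):
        # As in the commit: cleanup() delegates to stop(), so the generic
        # eviction path can free this runner like any other model.
        self.stop()


class ToyModelManager:
    """Toy accounting with a fixed VRAM budget (assumption, no real GPU query)."""

    def __init__(self, total_vram_gb):
        self.total_vram_gb = total_vram_gb
        self.models = {}             # name -> object exposing cleanup()
        self.models_in_vram = []     # load order, oldest first
        self._measured_vram_gb = {}  # name -> recorded footprint in GB

    def free_vram_gb(self):
        return self.total_vram_gb - sum(self._measured_vram_gb.values())

    def unload_model(self, name):
        self.models.pop(name).cleanup()
        self.models_in_vram.remove(name)
        del self._measured_vram_gb[name]

    def _evict_one(self):
        # Assumed policy: evict the oldest-loaded model first.
        if not self.models_in_vram:
            return False
        self.unload_model(self.models_in_vram[0])
        return True

    def start_whisper_server(self, name, footprint_gb):
        if footprint_gb > self.total_vram_gb:
            raise MemoryError(f"{name} needs more VRAM than the device has")
        # Starting a runner is a model load: evict until it fits.
        while self.free_vram_gb() < footprint_gb:
            self._evict_one()
        server = ToyWhisperServer(name)
        server.start()
        # Register it so it is itself an eviction candidate from now on.
        self.models[name] = server
        self.models_in_vram.append(name)
        self._measured_vram_gb[name] = footprint_gb
        return server

    def stop_whisper_server(self, name):
        # Stop the runner and clear all of its accounting.
        if name in self.models:
            self.unload_model(name)


if __name__ == "__main__":
    mgr = ToyModelManager(total_vram_gb=8.0)
    mgr.start_whisper_server("whisper-small", 3.0)
    mgr.start_whisper_server("whisper-medium", 4.0)
    mgr.start_whisper_server("whisper-large", 6.0)  # evicts both earlier runners
    print(mgr.models_in_vram, mgr.free_vram_gb())
```

In this sketch a runner goes through the same unload_model path as any other entry, which is the point of giving it a cleanup() method: the manager does not need a special case to evict it.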
Name
__init__.py
_film_net.py
_rife_ifnet.py
app.py
archive.py
audio_backends.py
audio_clean.py
audio_gen.py
audio_stems.py
characters.py
custom_pipelines.py
ds4_worker.py
embeddings.py
environments.py
faceswap.py
images.py
log.py
loras.py
parler_worker.py
pipelines.py
prompt_cache.py
ratelimit.py
spatial.py
state.py
text.py
transcriptions.py
tts.py
tts_backends.py
urlutils.py
video.py
voice_clone.py
voice_convert.py