    Add task management, quantization, and hardware telemetry · 8ad15128
    Stefy Lanza (nextime / spora ) authored
    Tasks / queue management:
    - Central in-memory task registry with cooperative cancel, pause/resume,
      and step progress across image/video/audio/text generation + LoRA training
    - Tasks admin page (live 2s poll): cancel, interrupt, pause/resume, restart,
      remove; done jobs auto-drop from the list; bounded persisted job history
    - Disable interrupted-training recovery via --no-resume-jobs + settings toggle
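The cooperative cancel and pause/resume behaviour listed above can be illustrated with a small in-memory sketch. This is not the code from this commit: `Task`, `TaskRegistry`, `TaskCancelled`, `checkpoint` and `run_steps` are hypothetical names, and the sketch assumes each worker calls a checkpoint between steps, which is the only point where it can be paused or cancelled.

```python
import threading
from dataclasses import dataclass, field


class TaskCancelled(Exception):
    """Raised inside a worker when its task has been cancelled."""


@dataclass
class Task:
    task_id: str
    total_steps: int
    step: int = 0
    state: str = "running"
    _cancel: threading.Event = field(default_factory=threading.Event, repr=False)
    _resume: threading.Event = field(default_factory=threading.Event, repr=False)

    def __post_init__(self):
        self._resume.set()  # tasks start unpaused

    def checkpoint(self, step):
        """Called by the worker between steps; records progress, then yields control."""
        self.step = step
        self._resume.wait()  # blocks here while the task is paused
        if self._cancel.is_set():
            self.state = "cancelled"
            raise TaskCancelled(self.task_id)


class TaskRegistry:
    """Hypothetical central registry; a lock guards the task table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tasks = {}

    def create(self, task_id, total_steps):
        task = Task(task_id, total_steps)
        with self._lock:
            self._tasks[task_id] = task
        return task

    def get(self, task_id):
        with self._lock:
            return self._tasks[task_id]

    def pause(self, task_id):
        task = self.get(task_id)
        task.state = "paused"
        task._resume.clear()

    def resume(self, task_id):
        task = self.get(task_id)
        task.state = "running"
        task._resume.set()

    def cancel(self, task_id):
        task = self.get(task_id)
        task._cancel.set()
        task._resume.set()  # wake a paused worker so it can observe the cancel

    def remove(self, task_id):
        with self._lock:
            self._tasks.pop(task_id, None)


def run_steps(task):
    """Toy worker: advances one step at a time, honouring pause and cancel."""
    try:
        for step in range(1, task.total_steps + 1):
            task.checkpoint(step)
        task.state = "done"
    except TaskCancelled:
        pass
    return task.state
```

Cancellation here is cooperative: `cancel()` only sets a flag, and the worker stops at its next checkpoint rather than being killed mid-step, so a long step delays the cancel until it finishes.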
    
    Quantization / acceleration:
    - TurboQuant embedding vector quantization (data-free, inner-product
      preserving): built-in NumPy backend + optional turboquant-py library,
      selectable per embedding model; /v1/embeddings `quantization` param
    - llama.cpp KV-cache quantization (cache_type_k/v) for GGUF text models,
      configurable in the Models UI
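To show what "inner-product preserving" means for quantized embeddings, here is a minimal sketch. It is not TurboQuant and not the backend added in this commit: it swaps in plain symmetric per-vector int8 scalar quantization, uses only the standard library, and all function names are hypothetical. Like the approach described above it is data-free, in that no calibration set is needed.

```python
import math
import random


def quantize_int8(vec):
    """Symmetric per-vector quantization: signed 8-bit codes plus one float scale."""
    peak = max(abs(x) for x in vec)
    scale = peak / 127.0 if peak > 0 else 1.0
    codes = [max(-127, min(127, round(x / scale))) for x in vec]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from codes and scale."""
    return [c * scale for c in codes]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(vec):
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


# Two random unit vectors standing in for embeddings (assumption: 256 dims).
rng = random.Random(0)
a = normalize([rng.gauss(0, 1) for _ in range(256)])
b = normalize([rng.gauss(0, 1) for _ in range(256)])

qa, sa = quantize_int8(a)
qb, sb = quantize_int8(b)

exact = dot(a, b)
approx = dot(qa, qb) * sa * sb  # integer dot product, rescaled once
error = abs(exact - approx)
```

The similarity score is computed on the integer codes and rescaled with the two stored scales, so the float vectors never have to be reconstructed at query time.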
    
    Hardware telemetry:
    - Thermal cooldown state surfaced on the Tasks page (banner + per-task badge)
    - Live CPU/GPU/RAM/VRAM usage + temperature panel via /admin/api/system-stats
    
    Docs: API documentation pass for gaps and accuracy, Swagger overhaul, and a
    DISTRIBUTION.md implementation spec.

    Also fixes a channel mismatch in I2V LoRA training.
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Name
audiogenrequest.py
embedrequest.py
imagerequest.py
textrequest.py
transcriptionrequest.py
videorequest.py