codai/admin/templates/models.html · f21c61855bb7bcaba8be93a17e661a025e29555f · nexlab / coderai

LoRA training base cache + thermal averaging + scoped debug flags; live model config · 8d1136c4

Stefy Lanza (nextime / spora) authored Jun 09, 2026

- LoRA trainer: cache the SD/SDXL base on CPU between jobs so back-to-back
  trainings against the same base skip the disk reload, and the base holds no
  VRAM between jobs (moved to GPU only while training). Fixes the post-training
  eviction failure that forced the next image request into CPU/disk offload.
- Model manager: add register_external_vram_releaser() + a last-resort eviction
  pass so a generation can reclaim the trainer's cached base when needed (the
  pass is skipped while a training job is running).
- Thermal: the resume/cooldown decision now averages 3 CPU samples spread
  across a 3s budget (CPU sensors swing +/-10C); the pause decision stays
  single-read to react fast. Sampling is bounded so it never blocks for more
  than 3s of the poll interval.
- Debug flags: --debug-web (uvicorn access lines), --debug-thermal ([thermal]
  [debug] checks), --debug-lora (per-step training loss to terminal); all off by
  default and independent of --debug.
- Admin: lora_train_base_model field on the Models page; saves apply live to the
  running server (build_runtime_kwargs/apply_model_entry_live) with no restart.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
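
The thermal change above can be illustrated with a minimal sketch. It is not the
code from this commit: averaged_cpu_temp, should_resume, should_pause, the
read_temp callable and the threshold arguments are hypothetical names chosen for
illustration, and spacing the samples evenly across the budget is an assumption.
Only the "3 samples, 3s budget, single-read pause" behaviour comes from the
commit message.

```python
import time
from typing import Callable


def averaged_cpu_temp(
    read_temp: Callable[[], float],
    samples: int = 3,
    budget_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Average up to `samples` readings without exceeding `budget_s` of waiting.

    Hypothetical helper: the gap between samples is budget_s / samples
    (an assumption), and every wait is clipped to the time left in the budget.
    """
    deadline = clock() + budget_s
    gap = budget_s / samples
    readings = []
    for i in range(samples):
        readings.append(read_temp())
        if i == samples - 1:
            break  # no need to wait after the final sample
        remaining = deadline - clock()
        if remaining <= 0:
            break  # budget exhausted: average what we have
        sleep(min(gap, remaining))
    return sum(readings) / len(readings)


def should_resume(read_temp: Callable[[], float], resume_below_c: float, **kw) -> bool:
    """Resume/cooldown decision: uses the averaged reading to smooth sensor swings."""
    return averaged_cpu_temp(read_temp, **kw) < resume_below_c


def should_pause(read_temp: Callable[[], float], pause_above_c: float) -> bool:
    """Pause decision: a single read, so an overheat is acted on immediately."""
    return read_temp() >= pause_above_c
```

Injecting sleep and clock keeps the helper testable without real delays. In a
poll loop, should_pause would run on every tick, while should_resume would run
only when work is already paused, so the extra wait of up to 3s is paid only on
the cooldown path.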


models.html 155 KB
