codai · 8d1136c4e60f4f467feecbce55051cfda3f138c0 · nexlab / coderai

LoRA training base cache + thermal averaging + scoped debug flags; live model config · 8d1136c4

Stefy Lanza (nextime / spora) authored Jun 09, 2026

- LoRA trainer: cache the SD/SDXL base on CPU between jobs so back-to-back
  trainings against the same base skip the disk reload, and the base holds no
  VRAM between jobs (moved to GPU only while training). Fixes the post-training
  eviction failure that forced the next image request into CPU/disk offload.
- Model manager: add register_external_vram_releaser() + last-resort eviction
  pass so a generation can reclaim the trainer's cached base when needed (skips
  while a job runs).
- Thermal: average 3 CPU samples spread across a 3s budget for the
  resume/cooldown decision (CPU sensor readings swing +/-10C); the pause
  decision stays single-read so it reacts fast. The averaging is bounded so it
  never blocks for more than 3s of the poll interval.
- Debug flags: --debug-web (uvicorn access lines), --debug-thermal ([thermal]
  [debug] checks), --debug-lora (per-step training loss to terminal); all off by
  default and independent of --debug.
- Admin: lora_train_base_model field on the Models page; saves apply live to the
  running server (build_runtime_kwargs/apply_model_entry_live) with no restart.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
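
The model-manager bullet above describes a callback hook plus a last-resort eviction pass. A minimal sketch of that pattern follows. Only the name register_external_vram_releaser() comes from the commit message; its signature, the run_last_resort_eviction() helper, and the TrainerBaseCache class are assumptions made for illustration and may not match the real code.

```python
import threading
from typing import Callable, List

# Assumed contract: a releaser returns True if it actually freed something.
Releaser = Callable[[], bool]

_releasers: List[Releaser] = []
_lock = threading.Lock()


def register_external_vram_releaser(fn: Releaser) -> None:
    """Register a callback the model manager may call to reclaim memory."""
    with _lock:
        _releasers.append(fn)


def run_last_resort_eviction() -> int:
    """Hypothetical last-resort pass: call every releaser, count the successes."""
    with _lock:
        callbacks = list(_releasers)  # copy so callbacks run outside the lock
    return sum(1 for cb in callbacks if cb())


class TrainerBaseCache:
    """Hypothetical stand-in for the trainer's cached SD/SDXL base."""

    def __init__(self) -> None:
        self.base = None          # placeholder for the cached base pipeline
        self.job_running = False  # True while a training job is active

    def release(self) -> bool:
        # Skip while a job runs, as the commit message describes.
        if self.job_running or self.base is None:
            return False
        self.base = None
        return True
```

With this shape, the trainer would register cache.release once at startup, and a generation request that still lacks memory after normal eviction would call the last-resort pass before falling back to CPU/disk offload.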

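
The thermal bullet in the commit message (3 averaged samples within a 3s budget for resume, a single read for pause) can be sketched as below. This is an illustration under stated assumptions, not the project's implementation: the function names, the injected read_temp/sleep/clock callables, and the threshold parameters are all hypothetical.

```python
import time
from typing import Callable


def averaged_cpu_temp(
    read_temp: Callable[[], float],
    samples: int = 3,
    budget_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Average up to `samples` readings without sleeping past `budget_s`."""
    if samples < 1:
        raise ValueError("samples must be >= 1")
    deadline = clock() + budget_s
    gap = budget_s / samples  # spacing chosen so total sleep stays under budget
    readings = []
    for i in range(samples):
        readings.append(read_temp())
        if i == samples - 1:
            break  # no need to wait after the final sample
        remaining = deadline - clock()
        if remaining <= 0:
            break  # budget exhausted: average what we have
        sleep(min(gap, remaining))
    return sum(readings) / len(readings)


def should_pause(read_temp: Callable[[], float], pause_at_c: float) -> bool:
    """Pause decision: one read, so overheating is caught immediately."""
    return read_temp() >= pause_at_c


def should_resume(
    read_temp: Callable[[], float], resume_at_c: float, **kwargs
) -> bool:
    """Resume decision: averaged, so one noisy low reading cannot end cooldown."""
    return averaged_cpu_temp(read_temp, **kwargs) <= resume_at_c
```

Injecting sleep and clock keeps the sketch testable without real waits; a real poller would likely pass its sensor reader and keep the defaults.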

Name
admin
api
backends
broker
models
openai
pydantic
queue
__init__.py
cli.py
config.py
main.py
platform_paths.py