Stefy Lanza (nextime / spora ) authored
Adds a target="video" path that trains a LoRA directly against the configured video model so it loads on the video pipeline (image LoRAs can't apply to a Wan DiT).

_train_wan:
- encodes stills as 1-frame latents via the Wan 3D VAE (latents_mean/std normalized)
- encodes the prompt via UMT5
- loads the transformer expert(s) in 4-bit (QLoRA) with gradient checkpointing
- adds PEFT LoRA to the attention projections
- trains a rectified-flow loss

Handles Wan2.2's dual experts (transformer + transformer_2) via boundary_ratio routing, and saves both expert LoRA layers (falls back to high-noise only on older diffusers). Reuses the queue, eviction, thermal checkpoints and progress.

LoraTrainRequest gains target/quantize_4bit/num_frames; base-path resolution gains a "video" category so it resolves the video model entry.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
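For readers unfamiliar with the two ideas named above, here is a minimal pure-Python sketch of boundary_ratio routing and the rectified-flow training target. It is not the code from this commit. The helper names (`pick_expert`, `rectified_flow_pair`, `mse`), the 1000-step timestep range, and the 0.875 boundary value are assumptions chosen for illustration, as is the convention that `transformer` is the high-noise expert and `transformer_2` the low-noise one.

```python
# Illustrative sketch only; names and constants are assumptions, not this commit's code.

NUM_TRAIN_TIMESTEPS = 1000  # assumed scheduler range


def pick_expert(timestep, boundary_ratio, num_train_timesteps=NUM_TRAIN_TIMESTEPS):
    """Route a timestep to one of the two experts.

    Assumption: timesteps at or above the boundary go to the high-noise
    expert ("transformer"), the rest to the low-noise one ("transformer_2").
    """
    boundary = boundary_ratio * num_train_timesteps
    return "transformer" if timestep >= boundary else "transformer_2"


def rectified_flow_pair(x0, noise, sigma):
    """Build the noisy input and the velocity target for one sample.

    x_t    = (1 - sigma) * x0 + sigma * noise   (straight-line interpolation)
    target = noise - x0                         (constant velocity along the line)
    """
    x_t = [(1.0 - sigma) * a + sigma * n for a, n in zip(x0, noise)]
    target = [n - a for a, n in zip(x0, noise)]
    return x_t, target


def mse(pred, target):
    """Mean squared error between a predicted and a target velocity."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


if __name__ == "__main__":
    x0 = [1.0, -2.0]      # stand-in for a flattened clean latent
    noise = [3.0, 0.0]    # stand-in for sampled Gaussian noise
    sigma = 0.9
    expert = pick_expert(sigma * NUM_TRAIN_TIMESTEPS, boundary_ratio=0.875)
    x_t, target = rectified_flow_pair(x0, noise, sigma)
    print(expert, x_t, target)
```

In an actual training step, the selected expert would receive `x_t`, the timestep, and the prompt embedding, and the loss would be the MSE between its output and `target`; only the LoRA weights of that expert would receive gradients.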
9071e839