- 15 Jun, 2026 1 commit
-
-
Stefy Lanza (nextime / spora ) authored
Township tool (tools/gen_township_fighters.py):
  - Outcome videos now generate TWO keyframes per outcome (finish + victory), each anchoring its own clip; victory clip uses a dedicated referee shot.
  - Referee characters: new role on create form, kept out of fighter pools, dressed as officials, attachable per-match and used in victory keyframes.
  - Per-match referee selection (new-match form + match editor, persisted).
  - Autogenerate buttons on character/referee, environment and new-match forms (LLM-filled, editable before create) via /profile/autogen + /matches/autogen.
  - Single-worker generation queue: all coderai-bound jobs (create/regen/train/match/process) are serialised and surfaced as "queued", with one persistent match-detail monitor replacing the competing per-job pollers (fixes the blinking progress when two jobs were launched at once).
coderai: favicon.ico served at /favicon.ico + linked in admin/login templates; bundled township favicon served at /favicon.ico.
Also gitignore large packaging/runtime artifact dirs (.packaging-cache/, tmp/).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
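A minimal sketch of the single-worker queue idea, assuming a plain thread plus `queue.Queue`; the class name `SingleWorkerQueue` and the status strings other than "queued" are hypothetical, not taken from the tool.

```python
import queue
import threading


class SingleWorkerQueue:
    """Run submitted jobs strictly one at a time on one background thread."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._lock = threading.Lock()
        self.status = {}  # job name -> "queued" | "running" | "done" | "failed"
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, name, fn):
        # A job is visible as "queued" from the moment it is submitted.
        with self._lock:
            self.status[name] = "queued"
        self._jobs.put((name, fn))

    def _worker(self):
        while True:
            name, fn = self._jobs.get()
            with self._lock:
                self.status[name] = "running"
            try:
                fn()
                outcome = "done"
            except Exception:
                outcome = "failed"
            with self._lock:
                self.status[name] = outcome
            self._jobs.task_done()

    def join(self):
        """Block until every submitted job has finished."""
        self._jobs.join()
```

Because only one thread ever pulls from the queue, a single monitor can read `status` without competing pollers disagreeing about which job is active.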
-
- 14 Jun, 2026 2 commits
-
-
Stefy Lanza (nextime / spora ) authored
rife-ncnn-vulkan and the ffmpeg frame extract/encode were grabbing all cores and ran with no ongoing thermal control. Now:
  - _cpu_thread_limit() mirrors coderai's half-the-cores cap (honours the OMP_NUM_THREADS set at import). All ffmpeg calls in the upscale + interpolate paths pass -threads N and are CPU-pinned via a sched_setaffinity preexec_fn; rife gets -j capped and the same affinity pin, so neither can saturate 24 cores.
  - RIFE is one opaque subprocess, so it now runs under a watcher thread that SIGSTOPs it when the GPU/CPU exceeds the configured thermal-high threshold and SIGCONTs it once cooled (the subprocess analogue of the upscaler's per-frame thermal gate), and terminates it on task cancel.
Per-frame progress preserved.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
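The thread cap and affinity pin described above could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, pinning to the first N allowed CPUs is a guess at the policy, and `os.sched_setaffinity` exists only on Linux.

```python
import os


def cpu_thread_limit(env=None, cpu_count=None):
    """Return OMP_NUM_THREADS when it is a positive integer, else half the cores."""
    env = os.environ if env is None else env
    raw = str(env.get("OMP_NUM_THREADS", "")).strip()
    if raw.isdigit() and int(raw) > 0:
        return int(raw)
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, cores // 2)


def affinity_preexec(n):
    """Build a preexec_fn that pins the child to the first n allowed CPUs.

    Returns None where sched_setaffinity is unavailable (non-Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None

    def pin():
        allowed = sorted(os.sched_getaffinity(0))
        os.sched_setaffinity(0, set(allowed[:n]))

    return pin


def ffmpeg_command(args, n):
    """Prefix an ffmpeg argument list with an explicit thread cap."""
    return ["ffmpeg", "-threads", str(n), *args]
```

A caller would pass both pieces to `subprocess.Popen(ffmpeg_command([...], n), preexec_fn=affinity_preexec(n))`, so the thread cap and the CPU pin apply to the same child process.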
-
Stefy Lanza (nextime / spora ) authored
Make video enhancement fully AI-on-CoderAI and rework township outcomes.
Upscaling (Real-ESRGAN / SD upscalers):
  - Support diffusers-style .safetensors weights + config.json (e.g. hlky/RealESRGAN_*), not just classic .pth; infer RRDBNet arch/scale from config. fp16 + tiling for performance.
  - AI-or-fail: no ffmpeg fallback. Auto-select a configured upscaler when the request omits a model (find_capable_model).
  - Fix a registry-pollution bug: cache upscalers in a private dict, never under a synthetic 'upscale:<id>' key in multi_model_manager.models (which made a later request_model() resolve/reload the bogus key -> 400).
  - Per-frame progress + a first-class "upscale" task (pause/cancel/thermal), with a periodic thermal re-check through the frame loop.
Interpolation (RIFE):
  - AI-or-fail: removed the ffmpeg minterpolate fallback. Resolve the rife-ncnn-vulkan binary + bundled model robustly, pass exact -n frame count, and pin -g to the SAME GPU CoderAI uses (matched by CUDA device name, not a hardcoded index). Progress + "interpolate" task + thermal guard.
Township generator:
  - One draw per match (not per fighter); longer, configurable outcome videos built as a finish -> victory two-shot sequence; richer, more brutal, camera-aware prompts (finish/victory templates editable on the Prompts page).
  - Stream large results via response_format=url instead of base64-in-JSON; per-frame progress for both upscale and interpolate.
Configurable temp dir:
  - New --tmp CLI flag and config.tmp_dir (+ admin Settings field, applied live). Sets tempfile.tempdir and TMPDIR/TMP/TEMP so all scratch (frame extraction, upscaling, interpolation) and child processes use it; this fixes "[Errno 28] No space left on device" when /tmp is small.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
- 13 Jun, 2026 7 commits
-
-
Stefy Lanza (nextime / spora ) authored
The 'full' match-regen scope now (1) removes this match's existing keyframes, clip videos, outcome videos and finals up front, so a re-plan that changes the clip count can't leave orphaned files that would get globbed into the reassembled finals; and (2) runs strictly in order — prompts -> keyframes -> clips + outcomes (assemble_finals=False) -> assemble finals as the explicit last phase (4/4) via _reassemble_finals. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Adds a `full` scope to the per-match action handler that rebuilds one match in order: re-plan all fight-clip + outcome prompts (text model) → regenerate every keyframe (image model) → re-render all clips + outcomes and reassemble finals (video model), with live per-phase progress. Other matches are untouched. Wires the confirm dialog, the match-detail button, and the /matches/render scope allowlist. Fix: the `full` confirm label used an apostrophe (match's) inside the single-quoted JS string of the plain triple-quoted _match_js block, which collapsed to a real quote and broke the whole script (reMatch undefined). Reworded to avoid it; verified the rendered JS parses with node --check. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
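The quoting failure is easy to reproduce. The sketch below shows how a backslash-escaped apostrophe inside a plain (non-raw) triple-quoted Python string collapses to a bare quote, and one alternative way to embed a label safely with `json.dumps`. The label text is invented for illustration, and the commit itself reworded the label rather than using this approach.

```python
import json

# In a plain triple-quoted string Python consumes the backslash, so the
# emitted JavaScript contains an unescaped apostrophe inside '...'.
broken_js = """confirm('Rebuild this match\'s files?')"""

# json.dumps emits a double-quoted, escaped literal, which is also a valid
# JavaScript string literal for ordinary text like this.
label = "Rebuild this match's files?"
safe_js = "confirm(%s)" % json.dumps(label)

print(broken_js)  # the apostrophe ends the JS string early
print(safe_js)
```

Text embedded inside an HTML `<script>` element has further rules (for example a literal `</script>` sequence), which this sketch does not cover.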
-
Stefy Lanza (nextime / spora ) authored
Downloads: run each model download in a clean `python -m codai.admin.download_worker` subprocess streaming JSON progress, so the Stop button reliably cancels by terminating the process (HF parallel/Xet chunk transfers ignore in-thread flags). Adds download-cancel-all. Avoids multiprocessing spawn, which re-imports the server launcher as __main__. VACE extension: detect WanVACEPipeline; new 'extend' mode + cond_frames request field condition on the previous chained part's frame tail (real motion -> forward continuation, fixing the single-frame boomerang). _build_vace_conditioning builds the (video, mask) pair; _snap_wan_frames enforces 4k+1; only the freshly generated frames are returned. VACE also serves keyframe i2v / t2v via masking; i2v/t2v fallbacks skipped for it. Township auto-uses extend for chained parts when the model is VACE. Fight prompts: full-MMA system prompt + rotating per-clip action focus (kicks/knees/elbows/takedowns/ground/submissions) and occasional blood, rebalanced fallback templates, keyframe wardrobe enforcement. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
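The 4k+1 frame-count constraint can be illustrated with a small helper. This is a sketch only: the name is hypothetical and rounding down is an assumption, since the commit does not say in which direction `_snap_wan_frames` rounds.

```python
def snap_to_4k_plus_1(num_frames):
    """Round a frame count down to the nearest value of the form 4k + 1.

    Assumption: rounding down with a floor of 1 frame.
    """
    if num_frames < 1:
        return 1
    return ((num_frames - 1) // 4) * 4 + 1


for requested in (50, 81, 5, 4):
    print(requested, "->", snap_to_4k_plus_1(requested))
```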
-
Stefy Lanza (nextime / spora ) authored
  - township: new playback_fps (0 = same as generation fps). coderai uses fps only for the mp4 encode (Wan generates a fixed frame count), so a higher playback fps plays the same frames faster (less slow-motion). The planner counts clip duration as nf/playback_fps so the finals reach their target length at the real play speed. Wired through config/CLI (--playback-fps)/web form/all call sites.
  - main.py: suppress /v1/{video,images,audio}/progress access-log lines unless --debug-web is set (matching the existing /v1/loras/progress filter).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
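The duration arithmetic is simple enough to show directly. A minimal sketch, assuming hypothetical function names and a planner that simply divides the target length by the per-clip duration:

```python
import math


def clip_duration_s(num_frames, gen_fps, playback_fps=0):
    """Seconds a clip lasts on screen; playback_fps == 0 means use gen_fps."""
    fps = playback_fps if playback_fps > 0 else gen_fps
    return num_frames / fps


def clips_needed(target_s, num_frames, gen_fps, playback_fps=0):
    """How many equal-length clips are needed to reach target_s seconds."""
    return math.ceil(target_s / clip_duration_s(num_frames, gen_fps, playback_fps))


# The same 48 frames last 3 s at 16 fps but only 2 s when played at 24 fps,
# so a 60 s cut needs more clips at the faster playback rate.
print(clips_needed(60, 48, 16), clips_needed(60, 48, 16, playback_fps=24))
```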
-
Stefy Lanza (nextime / spora ) authored
  - ratelimit.py: exempt /v1/video, /v1/audio and /v1/loras progress polls from BOTH auth and rate limiting (shared _PROGRESS_PATHS), matching /v1/images. The township script polls /v1/video/progress ~1/s during a clip; being rate-limited, those polls ate the budget so the generation POST got 429'd (clip failed) and the polls themselves 429'd (stuck step bar).
  - township _render_once: a 429 now backs off and retries the same render (up to 40 attempts, capped 60s) instead of abandoning the clip; covers clips, chained parts and outcomes. Genuine errors still fail fast.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
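A sketch of the retry-on-429 behaviour, using the 40-attempt and 60-second limits stated above. The exponential doubling from 1 second, the `RateLimited` exception and the function name are assumptions for illustration; the commit only gives the attempt count and the cap.

```python
import time


class RateLimited(Exception):
    """Stand-in for an HTTP 429 response from the server."""


def render_with_backoff(render, max_attempts=40, max_delay=60.0, sleep=time.sleep):
    """Retry render() on RateLimited; let every other exception propagate."""
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        try:
            return render()
        except RateLimited:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)  # assumed doubling, capped
```

Only the rate-limit exception is caught, so a genuine error (bad request, server failure) leaves the loop on the first attempt.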
-
Stefy Lanza (nextime / spora ) authored
coderai:
  - Thermal: configurable proactive CPU soft-throttle (engage temp + max per-step sleep) that gently slows generation in a warm band so it rarely hits the hard pause; CPU-only, hard pause always takes precedence. Tasks page shows a soft-throttle banner + per-task badge (live, gated on a running task).
  - Acceleration hot-swap: toggling/changing a model's acceleration now evicts the loaded model (manager.unload_model) so the next request reloads with the new setting, with no restart. (Acceleration is fused at load time.)
  - Models UI: cascading distill-LoRA pickers: new /admin/api/accel-loras scans the cache for distill repos; pick the distill model, then its high/low (or single) LoRA from dropdowns. Presets now also fill the high/low fields.
  - Tasks queue summary now reflects ALL model activity (derived from the unified task list), not just queue-manager requests; fixes the stuck "0 active".
  - images.py: proactive eviction no longer skipped by a NameError (model_key).
township (tools/gen_township_fighters.py):
  - Per-clip/outcome/keyframe progress now shows real diffusion-step progress (polls /v1/{images,video}/progress) on the CLI spinner and the web step bars, including "shot part N/total" for chained single shots.
  - Chained-shot concat re-encodes (CFR) instead of stream-copy, fixing the "first half is a static image" freeze at the part boundary.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Township fight-video generator (tools/gen_township_fighters.py):
  - 16:9 native resolution: default 832x480 video + matching keyframes (configurable video_size); square 512 was off-distribution for Wan2.2.
  - Split-and-chain rendering: single-render cap (default 50f); clips/outcomes longer than the cap render as chained sub-renders (last frame seeds the next) concatenated into one continuous shot, parts discarded; Matches page unchanged. Planned-clip ceiling raised to 480f.
  - Separate outcome min/max frames (default 40/70), same split-chain path.
  - Configurable short/long final-assembly intervals; clip count derives from the long target + fps so the long cut always fills.
  - Prompt continuity: deterministic wardrobe+environment clause on every clip, replan clip and outcome; stronger LLM system prompts; updated default suffix.
  - Run page: configurable fighter/environment counts + reference-image counts; moved "Include female fighters" into the Characters card; suggested steps/rank/weight guide table; per-profile LoRA train defaults now mirror the run-page config (lora_* for characters, env_lora_* for environments).
  - Matches: "Remove match completely" (files + keyframes + prompts.json entry).
  - Renamed the prompts step to "Generate matches prompts"; removed the gallery page.
coderai:
  - images.py: fix NameError ('model_key' undefined) that silently skipped proactive VRAM eviction before every image load.
  - thermal.py: cross-worker cooldown: when one generation pauses for heat, all parallel generations now back off until the resume threshold; add process-tree CPU% reader (100%/core).
  - video.py/manager.py/main.py: offload ref-leak fix, offloaded-load VRAM guard, wire --pipeline-cache flags.
  - Tasks page CPU tile shows process-tree CPU% scaled to cores.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
- 12 Jun, 2026 3 commits
-
-
Stefy Lanza (nextime / spora ) authored
VRAM estimation (manager.py):
  - Weight the effective quant multiplier by REAL per-component parameter shares (new _component_param_shares scans safetensors by component folder) instead of a blind 70/30 split. Wan2.2 is 99.6% quantizable (two 14B experts + text encoder 4-bit, only the 0.13B VAE dense), so the old 0.475x multiplier inflated ~25.8 GB -> 42.7 GB and forced needless offload. Now ~0.28x -> ~25.8 GB. VAE forced dense (conv-only, bnb can't quantize).
Auto offload decision (video.py):
  - 'auto': when peak footprint exceeds free VRAM, go straight to `model` CPU offload (active component on GPU, near full-GPU speed), with no full-GPU gamble and no slow balanced+disk path.
  - 'auto-borderline' (new mode): same, except a marginal overshoot (<=3 GB) tries full-GPU first to keep both experts resident and use free VRAM, falling back to model offload on OOM.
Acceleration LoRA (acceleration.py + video.py):
  - Keep the distill/Lightning LoRA as an ACTIVE RUNTIME ADAPTER instead of fusing. Fusing into CPU-offloaded bitsandbytes 4-bit weights triggers a dequant->merge->requant per Linear on the CPU, taking minutes/hours per expert and appearing to hang (high CPU, empty VRAM). Runtime adapters apply at forward time on the GPU at negligible cost and natively cover transformer_2.
  - _sync_video_loras preserves the accel adapters across per-request LoRA swaps and re-includes them in every set_adapters; _unload_video_loras deletes only per-request adapters, keeping accel.
UI (models.html):
  - Add "Auto borderline-aware" offload strategy option + updated hint.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Merge feature/tasks-quant-thermal: task mgmt, quantization, Wan2.2 video fixes, pipeline cache, smarter offload Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Wan2.2 A14B (dual-expert) generation fixes:
  - Fuse the Lightning distill LoRA into BOTH experts (transformer + transformer_2); diffusers' fuse_lora defaults to ["transformer"] only, which left the low-noise expert undistilled → 4-step clips collapsed to a solid colour. Also load per-request fighter/env LoRAs into both experts.
  - Pre-configure the wan22_lightning_4step preset with the local high/low-noise LoRAs (lora_high/lora_low), used when acceleration is enabled, ignored when not; surfaced in the Acceleration UI.
  - Safety net: only apply the preset's low step count when the distill LoRA actually fused, else fall back to safe steps.
  - Skip bitsandbytes/quanto quant for the VAE (conv-only → "no linear modules").
VRAM / offload:
  - Strategy auto-selection actually fires now ('auto' is normalised, not passed through as a no-op) and no longer double-counts the runtime/accel reserve.
  - Graceful OOM degrade ladder: full-GPU → balanced @ configured% → 80 → 60 → 40 → sequential → disk, respecting the model's balanced_gpu_percent as the starting cap. Expose 'balanced' as a selectable offload strategy.
Pipeline disk cache (--pipeline-cache / --rebuild-pipeline-cache):
  - Cache the quantized base pipeline to disk and reload it on later starts, skipping re-download/re-quantization; accel LoRA re-fused per load. Fail-safe with self-healing invalidate-and-rebuild.
Tasks / misc:
  - Show model loading as a (non-cancellable, non-pausable) Tasks entry.
  - Filter the Tasks-page pollers from the access log unless --debug-web.
  - Township gen script: per-image keyframe progress (no longer all-or-nothing).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
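One way to read the OOM degrade ladder is as an ordered list of strategies tried until one loads. The sketch below is an interpretation, not the server's code: the function names are hypothetical, `MemoryError` stands in for a CUDA out-of-memory error, and skipping fixed rungs at or above the configured percentage is an assumption about what "starting cap" means.

```python
def degrade_ladder(balanced_gpu_percent=80):
    """Ordered (strategy, gpu_percent) rungs from fastest to most conservative."""
    rungs = [("full-gpu", None), ("balanced", balanced_gpu_percent)]
    for pct in (80, 60, 40):
        if pct < balanced_gpu_percent:  # never go above the configured cap
            rungs.append(("balanced", pct))
    rungs += [("sequential", None), ("disk", None)]
    return rungs


def load_with_degrade(load, balanced_gpu_percent=80):
    """Call load(strategy, pct) down the ladder until one does not run out of memory."""
    last_error = None
    for strategy, pct in degrade_ladder(balanced_gpu_percent):
        try:
            return strategy, pct, load(strategy, pct)
        except MemoryError as exc:  # stand-in for a CUDA OOM error
            last_error = exc
    raise last_error
```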
-
- 11 Jun, 2026 5 commits
-
-
Stefy Lanza (nextime / spora ) authored
Tasks / queue management:
  - Central in-memory task registry with cooperative cancel, pause/resume, and step progress across image/video/audio/text generation + LoRA training
  - Tasks admin page (live 2s poll): cancel, interrupt, pause/resume, restart, remove; done jobs auto-drop from the list; bounded persisted job history
  - Disable interrupted-training recovery via --no-resume-jobs + settings toggle
Quantization / acceleration:
  - TurboQuant embedding vector quantization (data-free, inner-product preserving): built-in NumPy backend + optional turboquant-py library, selectable per embedding model; /v1/embeddings `quantization` param
  - llama.cpp KV-cache quantization (cache_type_k/v) for GGUF text models, configurable in the Models UI
Hardware telemetry:
  - Thermal cooldown state surfaced on the Tasks page (banner + per-task badge)
  - Live CPU/GPU/RAM/VRAM usage + temperature panel via /admin/api/system-stats
Docs: API documentation gaps/accuracy pass + Swagger overhaul; DISTRIBUTION.md implementation spec.
Plus I2V LoRA training channel-mismatch fix.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
The pipeline class is selected from the request mode, which can disagree with the model's real capability (transformer input channels), causing a hard channel-mismatch crash. Detect and degrade gracefully for Wan: - ti2v/i2v request on a t2v model (transformer in_channels=16): rebuild as a plain WanPipeline and run t2v with the keyframe dropped. - t2v request on an i2v model (in_channels=36): rebuild as WanImageToVideoPipeline (image_encoder/processor are optional) and seed a neutral gray frame so the prompt still drives the clip. Both rebuild a sibling pipeline reusing the SAME components, so fused acceleration and per-request LoRAs on the shared transformer carry over with no reload; the view is cached on the pipe so repeated clips reuse it and _sync_video_loras' adapter dedup stays intact. Helpers: _wan_in_channels(), _maybe_t2v_fallback(), _maybe_i2v_fallback(). Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
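The fallback decision reduces to comparing the requested mode with the transformer's input channel count. A minimal sketch using the two channel values named above (16 for t2v, 36 for i2v); the function name and return convention are hypothetical and no diffusers pipeline is built here.

```python
def effective_wan_mode(requested_mode, in_channels):
    """Pick the mode a Wan model can actually run, given its input channels."""
    if requested_mode in ("i2v", "ti2v") and in_channels == 16:
        return "t2v"  # t2v-only model: run text-to-video and drop the keyframe
    if requested_mode == "t2v" and in_channels == 36:
        return "i2v"  # i2v model: seed a neutral gray frame, prompt drives the clip
    return requested_mode


for mode, channels in (("i2v", 16), ("t2v", 36), ("t2v", 16)):
    print(mode, channels, "->", effective_wan_mode(mode, channels))
```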
-
Stefy Lanza (nextime / spora ) authored
Previously a per-request LoRA could only be a local path or HF id, which assumed the client shared the server's filesystem. Add a content-addressed store so remote clients can supply LoRAs by value or handle.
Request `loras` spec now accepts (resolved server-side, in priority):
  id "name:<registered>"  -> a LoRA trained on this server (path-independent)
  id "sha256:<hex>"       -> a previously uploaded blob
  file/data (base64)      -> inline weights, cached in the blob store
  url                     -> server downloads (cached by content hash)
  model/path              -> legacy local path / HF id (unchanged)
  - loras.py: blob store (save_lora_blob / lora_blob_exists / _lora_blob_path), resolve_lora_ref(), resolve_request_loras() (in-place -> clean 400 on a missing blob / unknown name). New POST /v1/loras/upload (multipart / JSON base64 / raw, dedup) and GET /v1/loras/blob/{hash} existence check.
  - LoraConfig / VideoLoraConfig: model now optional; add id/url/file/data/path.
  - image + video handlers resolve_request_loras() before model work, so signature dedup / VRAM reserve / load_lora_weights read lora.model as before.
  - gen_township_fighters.py: reference trained LoRAs by id "name:<registered>" (derived from the server path) with the raw path kept as a co-located fallback, so the script works client/server-split.
Also harden video load: float(cfg.get('balanced_gpu_percent', 80)) crashed on an explicit null (admin UI writes null for blank fields); use `or 80`.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
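A content-addressed store of this kind can be sketched in a few lines. The helper names, the `.safetensors` file suffix and the on-disk layout below are assumptions for illustration and are not the functions listed in the commit.

```python
import hashlib
import os


def save_blob(store_dir, data):
    """Store bytes under their SHA-256 digest and return a 'sha256:<hex>' handle."""
    digest = hashlib.sha256(data).hexdigest()
    path = os.path.join(store_dir, digest + ".safetensors")
    if not os.path.exists(path):  # identical uploads deduplicate to one file
        tmp_path = path + ".tmp"
        with open(tmp_path, "wb") as handle:
            handle.write(data)
        os.replace(tmp_path, path)  # atomic rename, so readers never see a partial file
    return "sha256:" + digest


def resolve_ref(store_dir, ref):
    """Map a 'sha256:<hex>' handle back to a file path, or raise if it is unknown."""
    prefix, _, digest = ref.partition(":")
    is_hex = len(digest) == 64 and all(c in "0123456789abcdef" for c in digest)
    if prefix != "sha256" or not is_hex:
        raise ValueError("not a sha256 handle: %r" % ref)
    path = os.path.join(store_dir, digest + ".safetensors")
    if not os.path.isfile(path):
        raise FileNotFoundError(ref)
    return path
```

Validating the digest before building a path keeps a client-supplied handle from escaping the store directory.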
-
Stefy Lanza (nextime / spora ) authored
  - log.py: _redact_blobs() recursively truncates data-URI / base64 fields (init_image, image, mask, character_references, …) to their first 48 chars in the FULL REQUEST DEBUG dump, so a clip request no longer prints tens of KB of base64. Prompts and normal fields are left intact (base64-charset check excludes anything with spaces/punctuation).
  - requirements.txt: add ftfy (required by the diffusers Wan/T5 prompt_clean path; its absence surfaced as "name 'ftfy' is not defined" at generation).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
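A recursive redaction pass along these lines might look as follows. It is a sketch, not the code in log.py: the 48-character prefix comes from the commit, while the 256-character minimum length, the suffix format and the function name are assumptions.

```python
import re

# Characters that can appear in standard or URL-safe base64 (no spaces or punctuation).
_BASE64_CHARS = re.compile(r"[A-Za-z0-9+/=_-]+")


def redact_blobs(value, keep=48, min_len=256):
    """Return a copy of value with long base64 / data-URI strings truncated."""
    if isinstance(value, dict):
        return {key: redact_blobs(item, keep, min_len) for key, item in value.items()}
    if isinstance(value, list):
        return [redact_blobs(item, keep, min_len) for item in value]
    if isinstance(value, str) and len(value) > min_len:
        if value.startswith("data:") or _BASE64_CHARS.fullmatch(value):
            return "%s...[%d chars]" % (value[:keep], len(value))
    return value
```

A long prompt contains spaces, so it fails the character-set check and is logged unchanged.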
-
Stefy Lanza (nextime / spora ) authored
Per-model `acceleration` config block fuses a distillation LoRA into the pipeline at load and supplies low step-count / guidance defaults at generation time, for a 5-10x speedup. Covers video (Wan), image diffusers (SD/SDXL), and sd.cpp (step/cfg defaults + <lora:> prompt injection).
  - New codai/models/acceleration.py: preset catalog (ACCEL_PRESETS), resolve_acceleration(), apply_accel_to_pipeline() (load->fuse->unload so it stays orthogonal to per-request character/env LoRAs), accel_call_defaults().
  - video.py: fuse accel LoRA after load; _generate_video / _generate_sdcpp_video use preset steps/guidance (request always wins).
  - images.py: _apply_image_acceleration on both diffusers load paths; _generate_image and _generate_with_sdcpp honour preset steps/guidance.
  - main.py: surface `acceleration` as a first-class runtime kwarg.
  - admin: persist `acceleration`; new GET /admin/api/accel-presets; models.html Acceleration/Distillation card (preset dropdown + manual override).
Also fix a latent null-trap: float(cfg.get('balanced_gpu_percent', 80)) crashed when the config stored an explicit null (written by the admin UI for blank fields) since .get(key, default) returns the stored None. Use `or 80`.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
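The null-trap is easy to reproduce in isolation. In this sketch the two wrapper function names are hypothetical; the expressions inside them are the ones quoted in the commit.

```python
def balanced_percent_naive(cfg):
    # .get() only falls back to 80 when the key is absent, not when it is None.
    return float(cfg.get("balanced_gpu_percent", 80))


def balanced_percent(cfg):
    # `or 80` also covers a stored None (and any other falsy value).
    return float(cfg.get("balanced_gpu_percent") or 80)


cfg = {"balanced_gpu_percent": None}  # what a blank admin field is stored as
try:
    balanced_percent_naive(cfg)
except TypeError as exc:
    print("naive version fails:", exc)
print("fixed version:", balanced_percent(cfg))
```

One side effect worth knowing: because `or` tests truthiness, an explicit 0 is also replaced by 80.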
-
- 10 Jun, 2026 11 commits
-
-
Stefy Lanza (nextime / spora ) authored
Outcome scenes belong to a match, so their keyframe (image model) and video clip (video model) now attach the environment + BOTH match fighters' LoRAs, matching the fight clips — previously they sent only the single named fighter. Resolves the match pair from the in-memory fight_plan, falling back to the saved prompts.json so a single-outcome regen (fight_plan == []) still gets both. Legacy outcomes with no resolvable match keep the single fighter. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
New keyframes-missing render scope fills in only the keyframes that don't exist yet for a whole match (clips + outcomes) — existing ones are kept and nothing is re-rendered. Buttons added on the keyframes page and the match detail action row; finishes immediately when none are missing. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
  - New /match/keyframes page (🖼 Keyframes ▸ from the match page): thumbnail grid of every clip + planned-outcome keyframe, each with per-tile regenerate (image model) and delete, plus match-level Regenerate all / Clear all. Live progress bars; reloads on completion with mtime cache-bust.
  - Regenerate all (whole match) now also covers this match's OUTCOME keyframes, not just clips.
  - Clear all now removes planned outcome keyframes too (not only ones with a rendered video), keeping it symmetric with Regenerate all.
  - Fix outcome keyframe stem on the page: use the plan entry's actual match_name (None → legacy "<fighter>_<outcome>") so it matches the file the generator writes, instead of a "<match>_<fighter>_<outcome>" that is never created (outcome keyframes were showing "no keyframe" after a successful regen).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
The render endpoint's scope allowlist rejected the new keyframe-regen scopes with a 400 before reaching the handler. Add them so the "Regenerate keyframes" / per-clip kf↻ buttons work. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Server (codai/api/loras.py):
  - /v1/loras/train gains wait (default True) + session; wait=false detaches the job and returns a job_id, avoiding HTTP read-timeouts on multi-hour video trainings.
  - Disk-persisted job registry keyed by job_id (carries session). Progress endpoint serves ?job=<id> / ?session=<tok> so a client only ever sees its own job, with no cross-user spillover. Jobs left mid-flight at startup are marked interrupted.
  - Mid-training PEFT checkpoints (SD1.5/SDXL/Wan) + train_state.json; a resubmit resumes from the last step when base/target/rank (and session) match, so a reboot no longer throws away hours of Wan training.
Township (tools/gen_township_fighters.py):
  - Async training: per-run session token + persisted per-LoRA job_id; polls by job_id, re-attaches to a running server job after a restart, resubmits an interrupted one (server resumes from checkpoint).
  - Dedicated train timeouts (24h video / 4h image).
  - Match page: regenerate/clear keyframes (match-level + per-clip/outcome) via new keyframes/keyframe render + delete scopes.
tools/videogen.py: mirror the session-token + job-id recovery helpers.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
  - cuda/vulkan backend improvements and config plumbing
  - API updates across characters, text, environments, audio, embeddings, tts
  - admin chat/settings template updates
  - add hf_loading helper, video request fields, platform paths
  - new docs (CODERAI_API_DOCUMENTATION.md) and tools (review_outputs, video_dubber)
  - ignore generated township_output/
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
The server exposes one global training progress (jobs run one at a time via the queue), so every queued card was mirroring the active job's progress. Restore the name match: a card shows real progress only when the global progress reports ITS LoRA; otherwise it shows "queued — '<other>' training first… (elapsed)". Keeps the elapsed-timer handling so a long model load still looks alive rather than frozen. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Extend the cross-job cache from just the transformer expert(s) to the full Wan stack: VAE, tokenizer and text encoder are kept on CPU between jobs (moved to GPU only while encoding), experts stay on GPU. A back-to-back training against the same base now reloads nothing from disk — previously the small VAE/text-encoder still reloaded each job. The releaser and error path clear all cached components. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
  - Match detail page: Fighter 1/2 and Environment are now dropdowns of saved profiles + the built-in pools, so a match's fighters/location can be switched before re-rendering (Save match persists; env_desc updated with env).
  - Video LoRA status on profile cards: resolve the active video model the same way training/generation do (configured, else auto-picked, cached), and show "trained for: <slug>" even when the active model can't be resolved so a trained LoRA never looks untrained.
  - Video LoRA training progress: show a ticking elapsed timer during the long model-load/preparing phase (the progress endpoint is starved while a large model loads), drop the strict name filter that could freeze the bar at 6%, and use step-based % once training begins.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Cache the Wan transformer expert(s) between consecutive trainings against the same base (keyed by base_path+quantize) so a back-to-back job skips the very slow reload (tens of minutes for A14B). Only this job's adapter + gradient-checkpoint hooks are removed at teardown; the base transformer(s) stay resident. Since 4-bit weights can't move to CPU, they hold GPU VRAM between jobs — so the external VRAM releaser now drops the Wan cache too when a generation needs the GPU, and the error path clears both caches. Also report training progress every step (cheap dict update) instead of every 10, so the web UI bar advances smoothly once steps begin. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
- 09 Jun, 2026 11 commits
-
-
Stefy Lanza (nextime / spora ) authored
torch.rand defaults to fp32, so the rectified-flow interpolation promoted x_t to fp32 while the patch-embedding Conv3d stays bf16 (bitsandbytes 4-bit quantizes only Linear layers), raising "Input type (float) and bias type (BFloat16) should be the same". Compute the interpolation in fp32 then cast x_t/target back to the model compute dtype, and pass timestep as fp32 (Wan casts it internally). Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Suppressing the whole uvicorn.access logger hid every request. Instead add a logging.Filter that drops only /v1/loras/progress (polled ~1.5s by the web UI); all other API request lines keep logging. A logger filter also survives uvicorn's configure_logging(), which resets the access logger LEVEL at startup (that reset is what defeated the earlier setLevel(WARNING)). --debug-web shows everything. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
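A filter of this shape can be written with the standard `logging.Filter` class. The class name and the default path tuple are illustrative; only the logger name `uvicorn.access` and the `/v1/loras/progress` path come from the commit.

```python
import logging


class DropProgressPolls(logging.Filter):
    """Drop access-log records for high-frequency progress polls only."""

    def __init__(self, paths=("/v1/loras/progress",)):
        super().__init__()
        self.paths = tuple(paths)

    def filter(self, record):
        # getMessage() renders the final access line, including the request path.
        message = record.getMessage()
        return not any(path in message for path in self.paths)


# A filter attached to the logger is independent of the logger's level,
# so a later level reset does not remove it.
logging.getLogger("uvicorn.access").addFilter(DropProgressPolls())
```

Returning False from `filter` discards just that record, so every other request line is still logged.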
-
Stefy Lanza (nextime / spora ) authored
Several related changes accumulated in this session:
Wan video LoRAs (additive; image LoRAs kept for keyframes):
  - New per-model maps video_loras.json/env_video_loras.json keyed name -> {model_slug: path}; on-disk names tagged with the video model slug.
  - Video requests attach the video LoRA matching the current video model's slug; image LoRAs stay on the keyframe path only.
  - Per-profile "Train video LoRA" button + step button + full-run checkbox + --video-loras/--only-video-loras; batch + CLI wiring; client target="video".
Final/outcome enhance (upscale 2x/4x + raise FPS):
  - _enhance_video_file + Phase C stage; --upscale-factor/--fps-multiplier and Run-page selects; match-page Enhance card with live progress bars.
Match page UX:
  - Video previews enlarge + center on play (video lightbox).
  - Match render shows global + per-clip progress bars, surviving reload.
Outcome fixes:
  - Re-rendering a match's outcomes now resolves legacy per-fighter outcomes (no match_name) by fighter membership, and forces them into the match's environment so a match stays in one location.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Adds a target="video" path that trains a LoRA directly against the configured video model so it loads on the video pipeline (image LoRAs can't apply to a Wan DiT). _train_wan: encodes stills as 1-frame latents via the Wan 3D VAE (latents_mean/std normalized), encodes the prompt via UMT5, loads the transformer expert(s) in 4-bit (QLoRA) with gradient checkpointing, adds PEFT LoRA to the attention projections, and trains a rectified-flow loss. Handles Wan2.2's dual experts (transformer + transformer_2) via boundary_ratio routing, and saves both expert LoRA layers (falls back to high-noise only on older diffusers). Reuses the queue, eviction, thermal checkpoints and progress. LoraTrainRequest gains target/quantize_4bit/num_frames; base-path resolution gains a "video" category so it resolves the video model entry. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
An SD/SDXL image LoRA loaded onto a Wan video transformer matches no keys, so set_adapters() raised "not in the list of present adapters: set()" and aborted the whole request. Now each adapter is checked against the pipe's PEFT-capable components after load; ones that registered nothing (wrong architecture) are skipped with a clear message and generation proceeds with whatever is compatible (or no LoRA). The request signature is still cached so the futile load isn't retried every clip. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
The video model stays cached across clips, but LoRAs were loaded from disk and fully unloaded on every clip — wasted I/O and fusion latency, since consecutive clips of a match request the identical fighter+env LoRAs. Cache the active LoRA signature on the pipe (_coderai_active_loras) and only swap when it changes: a request with the same set reuses the loaded adapters, a different set (or none) triggers a clean unload + reload. Replaces the apply-then-unload-every-clip path. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
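The swap-only-on-change logic can be sketched as below. The attribute name `_coderai_active_loras` is from the commit; the signature format (sorted path and weight pairs), the function names and the `load`/`unload` callbacks are assumptions standing in for the real pipeline calls.

```python
def lora_signature(loras):
    """Order-independent, hashable description of a requested LoRA set."""
    return tuple(sorted((item["path"], float(item.get("weight", 1.0))) for item in loras))


def sync_loras(pipe, loras, load, unload):
    """Swap adapters only when the requested set differs from the loaded one.

    Returns True when a swap happened and False when the loaded set was reused.
    """
    wanted = lora_signature(loras)
    active = getattr(pipe, "_coderai_active_loras", ())
    if wanted == active:
        return False
    if active:
        unload(pipe)
    if wanted:
        load(pipe, loras)
    pipe._coderai_active_loras = wanted
    return True
```

Consecutive clips of one match request the same fighter and environment LoRAs, so after the first clip every call takes the early-return path.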
-
Stefy Lanza (nextime / spora ) authored
Match re-render previously showed only a frozen text status. Now: - _stage_videos_render reports per-clip start/end via clip_cb and overall done/total via progress_cb. - _run_match_job seeds an items list and maps clip state into the job record. - The match detail page renders a global progress bar plus one bar per clip/output (pending/rendering/done/failed), and these survive a reload via /active-jobs + resumeMatchJobs. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Route /v1/loras/train through queue_manager.acquire/release with a constant "lora-train" model key. Concurrent training requests now queue and run one after another (serialized by the scheduler, protecting the shared base cache) and participate in the same scheduling/metrics as every other model request, instead of being rejected with 409. _train_lock is kept as the in-flight signal that _release_base_cache checks. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Tag regen/train jobs with kind/name/jtype and add a /active-jobs endpoint listing running ones. On Characters/Environments page load, resumeActiveJobs() re-attaches the live progress display to any in-flight job for the matching card, so a reload (or reopening the tab) keeps showing progress until completion. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
Add a shared image lightbox (injected into every page via _page) and make the reference-image thumbnails on the Characters/Environments pages clickable to view them full-size. Click the backdrop or press Escape to close. Delete button still works independently. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
-
Stefy Lanza (nextime / spora ) authored
  - LoRA trainer: cache the SD/SDXL base on CPU between jobs so back-to-back trainings against the same base skip the disk reload, and the base holds no VRAM between jobs (moved to GPU only while training). Fixes the post-training eviction failure that forced the next image request into CPU/disk offload.
  - Model manager: add register_external_vram_releaser() + last-resort eviction pass so a generation can reclaim the trainer's cached base when needed (skips while a job runs).
  - Thermal: average 3 CPU samples spread across a 3s budget for the resume/cooldown decision (CPU sensors swing +/-10C); pause stays single-read to react fast. Bounded so it never blocks past 3s of the poll interval.
  - Debug flags: --debug-web (uvicorn access lines), --debug-thermal ([thermal] [debug] checks), --debug-lora (per-step training loss to terminal); all off by default and independent of --debug.
  - Admin: lora_train_base_model field on the Models page; saves apply live to the running server (build_runtime_kwargs/apply_model_entry_live) with no restart.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
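The asymmetric thermal reads (averaged for resume, single for pause) can be sketched like this. The function names and the even spacing of samples are assumptions; the sample count and the 3-second budget are the values stated above.

```python
import time


def averaged_cpu_temp(read_temp, samples=3, budget_s=3.0, sleep=time.sleep):
    """Average several readings spread over at most budget_s seconds."""
    gap = budget_s / samples
    readings = []
    for index in range(samples):
        readings.append(read_temp())
        if index < samples - 1:  # no sleep after the last sample
            sleep(gap)
    return sum(readings) / len(readings)


def should_pause(read_temp, high_c):
    """Pause on a single hot reading so the reaction stays fast."""
    return read_temp() >= high_c


def should_resume(read_temp, resume_c, sleep=time.sleep):
    """Resume only when the smoothed temperature is at or below the threshold."""
    return averaged_cpu_temp(read_temp, sleep=sleep) <= resume_c
```

With three samples the helper sleeps twice, so it blocks for about two thirds of the budget and never past it.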
-