- 25 Feb, 2026 15 commits
-
Stefy Lanza (nextime / spora ) authored
When a model has component folders (transformer, vae, etc.) but no model_index.json at the root level, the loading would fail. This fix adds:

1. Base model fallback strategy:
   - Detect model type from model ID (ltx, wan, svd, cogvideo, mochi)
   - Load the known base model first
   - Then attempt to load fine-tuned components from the target model

2. Component detection and loading:
   - List files in the repo to find component folders
   - Load transformer and VAE components from the fine-tuned model
   - Fall back to base model if component loading fails

3. Better error messages:
   - Clear indication of what went wrong
   - Suggestions for alternative models

This fixes loading of models like Muinez/ltxvideo-2b-nsfw, which have all component folders but are missing the model_index.json file.
-
Stefy Lanza (nextime / spora ) authored
System Load Detection:
- Added get_system_load() method to detect CPU, memory, and GPU utilization
- CPU load >80% adds 50% slowdown, >50% adds 20% slowdown
- Memory >90% adds 80% slowdown, >75% adds 40% slowdown
- GPU utilization >80% adds 60% slowdown, >50% adds 30% slowdown
- Warning displayed when system is under heavy load

More Conservative Base Estimates:
- WanPipeline: 3.0s → 5.0s/frame
- MochiPipeline: 5.0s → 8.0s/frame
- SVD: 1.5s → 2.5s/frame
- CogVideoX: 4.0s → 6.0s/frame
- LTXVideo: 4.0s → 6.0s/frame
- Flux: 8.0s → 12.0s/frame
- Allegro: 8.0s → 12.0s/frame
- Hunyuan: 10.0s → 15.0s/frame
- OpenSora: 6.0s → 10.0s/frame

More Conservative GPU Tier Multipliers:
- extreme: 1.0x → 1.2x
- high: 1.5x → 2.0x
- medium: 2.5x → 3.5x
- low: 4.0x → 5.0x
- very_low: 8.0x → 10.0x

More Conservative Model Loading Times:
- Huge (>50GB): 10min → 15min
- Large (30-50GB): 5min → 8min
- Medium (16-30GB): 3min → 5min
- Small (<16GB): 1.5min → 3min
- Download estimate: 15s/GB → 30s/GB

Additional Safety Margins:
- Overhead increased from 30% to 50%
- I2V processing overhead increased from 20% to 30%
- Added 20% safety margin for unpredictable factors
- Load factor applied to model loading time as well
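The slowdown thresholds above can be recombined into a single factor. The commit does not say whether the slowdowns compose additively or multiplicatively; this sketch multiplies them, and it takes utilization percentages as arguments rather than probing the system.

```python
# Illustrative recombination of the load thresholds listed above.
# System probing (psutil / nvidia-smi) is omitted; utilization values
# are passed in directly as percentages.

def load_slowdown(cpu: float, mem: float, gpu: float) -> float:
    """Return a multiplicative slowdown factor from utilization percentages."""
    factor = 1.0
    if cpu > 80:
        factor *= 1.5          # +50% slowdown
    elif cpu > 50:
        factor *= 1.2          # +20% slowdown
    if mem > 90:
        factor *= 1.8          # +80% slowdown
    elif mem > 75:
        factor *= 1.4          # +40% slowdown
    if gpu > 80:
        factor *= 1.6          # +60% slowdown
    elif gpu > 50:
        factor *= 1.3          # +30% slowdown
    return factor
```

The base per-frame estimate would then be multiplied by this factor before being shown to the user.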
-
Stefy Lanza (nextime / spora ) authored
- Increased base time per frame for all models (2-4x more realistic)
- Added LTXVideoPipeline-specific estimate (4.0s/frame)
- Increased model loading times (90s-10min based on model size)
- Added realistic image model loading times for I2V mode
- Added image generation time based on model type (Flux, SDXL, SD3)
- Added 30% overhead for I/O and memory operations
- Added 20% extra time for I2V processing
- Increased resolution scaling factor to 1.3 (quadratic relationship)
- Increased download time estimate to 15s/GB with 2min cap

The previous estimates were too optimistic and didn't account for:
- Full diffusion process (multiple denoising steps)
- Model loading from disk/download
- Memory management overhead
- I2V-specific processing time
- Image model loading for I2V mode
-
Stefy Lanza (nextime / spora ) authored
Features:
- Modern web UI with all generation modes (T2V, I2V, T2I, I2I, V2V, Dub, Subtitles, Upscale)
- Real-time progress updates via WebSocket
- File upload for input images/videos/audio
- File download for generated content
- Background job processing with progress tracking
- Job management (cancel, retry, delete)
- Gallery for browsing generated files
- REST API for programmatic access
- Responsive design for desktop and mobile

Backend (webapp.py):
- Flask + Flask-SocketIO for real-time updates
- Background job processing with threading
- File upload/download handling
- Job state persistence
- REST API endpoints

Frontend:
- Modern dark theme UI
- Mode selection with visual cards
- Form with all options and settings
- Real-time progress modal with log streaming
- Toast notifications
- Keyboard shortcuts (Ctrl+Enter to submit, Escape to close)

Documentation:
- Updated README.md with web interface section
- Updated EXAMPLES.md with web interface usage
- Updated requirements.txt with web dependencies
-
Stefy Lanza (nextime / spora ) authored
- Apply same 404 fallback strategy to deferred I2V model loading
- Try DiffusionPipeline as fallback when model_index.json not found
- Ensures all model loading paths have consistent error handling
-
Stefy Lanza (nextime / spora ) authored
Model Loading Fixes:
- Add fallback loading when model_index.json returns 404
- Try alternative paths (diffusers/, diffusion_model/, pipeline/)
- Try generic DiffusionPipeline as fallback
- Check HuggingFace API for actual file structure
- Load from subdirectories if model_index.json found there
- Apply same fallback to I2V image model loading

Time Estimation Improvements:
- Add hardware detection (GPU model, VRAM, RAM, CPU cores)
- Detect GPU tier (extreme/high/medium/low/very_low)
- Calculate realistic time estimates based on GPU performance
- Account for VRAM constraints and offloading penalty
- Consider distributed/multi-GPU setups
- More accurate model loading times (minutes, not seconds)
- Account for resolution impact (quadratic relationship)
- Add 20% overhead for memory management
- Print hardware info for transparency

GPU Tier Performance Multipliers:
- Extreme (RTX 4090, A100, H100): 1.0x
- High (RTX 4080, RTX 3090, V100): 1.5x
- Medium (RTX 4070, RTX 3080, T4): 2.5x
- Low (RTX 3060, RTX 2070): 4.0x
- Very Low (GTX 1060, etc.): 8.0x
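The tier table above lends itself to a simple substring match on the GPU name string. Only the tier lists and multipliers come from the commit message; the matching logic here is an assumption about how the detection might work.

```python
# Sketch of GPU tier detection from the device name; tier membership and
# multipliers are taken from the commit message, the matching is assumed.

GPU_TIERS = {
    "extreme": (["4090", "a100", "h100"], 1.0),
    "high": (["4080", "3090", "v100"], 1.5),
    "medium": (["4070", "3080", "t4"], 2.5),
    "low": (["3060", "2070"], 4.0),
}

def gpu_tier_multiplier(gpu_name: str) -> float:
    """Map a GPU name to its performance multiplier; unknown GPUs are very_low."""
    lowered = gpu_name.lower()
    for patterns, mult in GPU_TIERS.values():
        if any(p in lowered for p in patterns):
            return mult
    return 8.0  # very_low fallback for unrecognized hardware
```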
-
Stefy Lanza (nextime / spora ) authored
Features Added:
- Video dubbing with voice preservation (--dub-video)
- Automatic subtitle generation (--create-subtitles)
- Subtitle translation (--translate-subtitles)
- Burn subtitles into video (--burn-subtitles)
- Audio transcription using Whisper (--transcribe)
- Text translation using MarianMT models

New Command-Line Arguments:
- --transcribe: Transcribe audio from video
- --whisper-model: Select Whisper model size (tiny/base/small/medium/large)
- --source-lang: Source language code
- --target-lang: Target language code for translation
- --create-subtitles: Create SRT subtitles from video
- --translate-subtitles: Translate subtitles to target language
- --burn-subtitles: Burn subtitles into video
- --subtitle-style: Customize subtitle appearance
- --dub-video: Translate and dub video with voice preservation
- --voice-clone/--no-voice-clone: Enable/disable voice cloning

MCP Server Updates:
- Added videogen_transcribe_video tool
- Added videogen_create_subtitles tool
- Added videogen_dub_video tool
- Added videogen_translate_text tool

Documentation Updates:
- Updated SKILL.md with dubbing/translation section
- Updated EXAMPLES.md with comprehensive examples
- Updated requirements.txt with openai-whisper dependency

Supported Languages: English, Spanish, French, German, Italian, Portuguese, Russian, Chinese, Japanese, Korean, Arabic, Hindi, Dutch, Polish, Turkish, Vietnamese, Thai, Indonesian, Swedish, Ukrainian
-
Stefy Lanza (nextime / spora ) authored
Features Added:
- Model type filters: --t2i-only, --v2v-only, --v2i-only, --3d-only, --tts-only, --audio-only
- Enhanced model list table with new capability columns (V2V, V2I, 3D, TTS)
- Updated detect_model_type() to detect all model capabilities

MCP Server Updates:
- Added videogen_video_to_video tool for V2V style transfer
- Added videogen_apply_video_filter tool for video filters
- Added videogen_extract_frames tool for frame extraction
- Added videogen_create_collage tool for thumbnail grids
- Added videogen_upscale_video tool for AI upscaling
- Added videogen_convert_3d tool for 2D-to-3D conversion
- Added videogen_concat_videos tool for video concatenation
- Updated model list filter to support all new types

SKILL.md Updates:
- Added V2V, V2I, 3D to generation types table
- Added model filter examples
- Added 8 new use cases for V2V, filters, frames, collage, upscale, 3D, concat
-
Stefy Lanza (nextime / spora ) authored
Features Added:
- Video-to-Video (V2V): Style transfer, filters, concatenation
- Video-to-Image (V2I): Frame extraction, keyframes, collages
- 2D-to-3D Conversion: SBS, anaglyph, VR 360 formats
- Video upscaling with AI (ESRGAN, Real-ESRGAN, SwinIR)
- Video filters (grayscale, sepia, blur, speed, slow-mo, etc.)

Command-line Arguments:
- --video: Input video file for V2V/V2I operations
- --video-to-video: Enable V2V style transfer
- --video-filter: Apply video filters
- --extract-frame, --extract-keyframes, --extract-frames
- --convert-3d-sbs, --convert-3d-anaglyph, --convert-vr
- --upscale-video, --upscale-method

Model Discovery:
- Added depth estimation models to --update-models
- Added 2D-to-3D model searches
- Added V2V style transfer models

Documentation:
- Updated README.md with new features
- Added comprehensive V2V/V2I/2D-to-3D examples
- Added multi-node cluster setup guide
- Added NFS shared storage configuration
-
Stefy Lanza (nextime / spora ) authored
- Add video frame extraction (extract_video_frames, extract_keyframes)
- Add video info retrieval (get_video_info)
- Add frames to video conversion (frames_to_video)
- Add video upscaling with AI support (upscale_video)
- Add video-to-video style transfer (video_to_video_style_transfer)
- Add video-to-image extraction (video_to_image)
- Add video collage creation (create_video_collage)
- Add video filters (apply_video_filter - grayscale, sepia, blur, etc.)
- Add video concatenation (concat_videos)
- Add image upscaling (upscale_image)

Features:
- Extract frames at specific FPS or timestamps
- AI upscaling with ESRGAN/SwinIR support
- Scene detection for keyframe extraction
- Multiple video filters and effects
- Video concatenation with re-encoding or stream copy
-
Stefy Lanza (nextime / spora ) authored
- Add IP-Adapter integration for character consistency using reference images
- Add InstantID support for superior face identity preservation
- Add Character Profile System to store reference images and face embeddings
- Add LoRA Training Workflow for perfect character consistency
- Add command-line arguments for all character consistency features
- Update EXAMPLES.md with comprehensive character consistency documentation
- Update requirements.txt with optional dependencies (insightface, onnxruntime)

New commands:
- --character: Use saved character profile
- --create-character: Create new character profile from reference images
- --list-characters: List all saved profiles
- --show-character: Show profile details
- --ipadapter: Enable IP-Adapter for consistency
- --instantid: Enable InstantID for face identity
- --train-lora: Train custom LoRA for character
-
Stefy Lanza (nextime / spora ) authored
- When --update-models detects a LoRA adapter, validate that the base model exists on HuggingFace before adding it to the model list
- Skip LoRAs whose base models are not found on HuggingFace
- Added support for flux and sdxl base model detection
- Print informative messages when skipping LoRAs with missing base models
-
Stefy Lanza (nextime / spora ) authored
PEFT (Parameter-Efficient Fine-Tuning) is required for loading LoRA adapters with pipe.load_lora_weights(). Without it, LoRA loading fails with: 'PEFT backend is required for this method.'
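One way to surface this dependency early is to check that the package is importable before attempting `pipe.load_lora_weights()`. This guard is an illustrative assumption, not the project's actual handling; it assumes the standard `peft` package name.

```python
# Minimal up-front check for the PEFT backend required by LoRA loading.
# find_spec() reports importability without actually importing the package.

import importlib.util

def peft_available() -> bool:
    """True when the peft package is importable (needed for load_lora_weights)."""
    return importlib.util.find_spec("peft") is not None
```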
-
Stefy Lanza (nextime / spora ) authored
- Add update_model_pipeline_class() function to update model config
- Call function when main model pipeline mismatch is corrected
- Call function when image model pipeline mismatch is corrected
- Ensures future runs use the correct pipeline class automatically
-
Stefy Lanza (nextime / spora ) authored
-
- 24 Feb, 2026 16 commits
-
-
Stefy Lanza (nextime / spora ) authored
When defer_i2v_loading=True (I2V mode without a provided image), the code set pipe=None but then tried to call pipe.load_lora_weights() and pipe.enable_model_cpu_offload() on None, causing an AttributeError. This fix wraps the LoRA loading and offloading configuration blocks inside an 'if not defer_i2v_loading:' condition so they are skipped when I2V model loading is deferred until after image generation.
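The guard described above can be sketched as follows. The pipeline method names are the diffusers ones mentioned in the commit; the surrounding loading code and the `configure_pipeline` wrapper are invented for illustration.

```python
# Sketch of the fix: configuration steps that touch `pipe` are skipped
# when I2V loading is deferred and `pipe` is still None.

def configure_pipeline(pipe, defer_i2v_loading: bool, lora_path=None):
    """Apply LoRA weights and CPU offload only when a pipeline is loaded."""
    if not defer_i2v_loading:
        if lora_path is not None:
            pipe.load_lora_weights(lora_path)
        pipe.enable_model_cpu_offload()
    return pipe
```

With the guard in place, `configure_pipeline(None, True)` simply returns None instead of raising AttributeError.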
-
Stefy Lanza (nextime / spora ) authored
- Defer I2V model loading when in I2V mode without provided image
- Generate image first with T2I model
- Unload T2I model completely (del, empty_cache, gc.collect)
- Then load I2V model and generate video
- This ensures only one model is in memory at a time
- Fixes Linux OOM killer issue when loading multiple models
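The unload step above is a common pattern; a minimal sketch, with a CPU-only fallback added so it also runs without torch or CUDA installed:

```python
# Illustrative unload sequence: drop the reference, force garbage
# collection, then release cached CUDA memory when available.

import gc

def unload_model(pipe):
    """Free a pipeline so the next model can be loaded into the same memory."""
    del pipe                      # drop the local reference
    gc.collect()                  # reclaim Python-side objects
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # release cached GPU allocations
    except ImportError:
        pass  # torch not installed: nothing GPU-side to free
    return None
```

Note that `del` only removes the name; the memory is freed once no other references to the pipeline remain.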
-
Stefy Lanza (nextime / spora ) authored
- Add auto_disable.json to track failure counts and disabled status
- Models that fail 3 times in auto mode are automatically disabled
- Disabled models are skipped during auto model selection
- Manual selection of a disabled model re-enables it for auto mode
- Model list now shows 'Auto' column with status (Yes, OFF, or X/3)
- Disabled models shown with 🚫 indicator in model list
- New functions: load_auto_disable_data(), save_auto_disable_data(), record_model_failure(), is_model_disabled(), re_enable_model(), get_model_fail_count()
-
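The auto_disable.json bookkeeping described in the commit above might look like this. The function names follow the commit; the field names and dict layout are assumptions about the file format.

```python
# Hypothetical shape of the auto-disable tracking; JSON (de)serialization
# of the `data` dict is omitted. Field names are assumptions.

MAX_FAILS = 3  # failures in auto mode before a model is disabled

def record_model_failure(data: dict, model_id: str) -> dict:
    """Increment the fail count; disable the model at MAX_FAILS."""
    entry = data.setdefault(model_id, {"fails": 0, "disabled": False})
    entry["fails"] += 1
    if entry["fails"] >= MAX_FAILS:
        entry["disabled"] = True
    return data

def is_model_disabled(data: dict, model_id: str) -> bool:
    return data.get(model_id, {}).get("disabled", False)

def re_enable_model(data: dict, model_id: str) -> dict:
    """Manual selection clears the disabled flag and the fail counter."""
    if model_id in data:
        data[model_id] = {"fails": 0, "disabled": False}
    return data
```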
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Detect LoRA adapters from tags (lora, LoRA) and files (*.safetensors)
- Extract base model from tags (format: base_model:org/model-name)
- Skip model_index.json fetch for LoRA-only repos
- Determine pipeline class from base model for LoRA adapters
- Improves handling of models like enhanceaiteam/Flux-Uncensored-V2
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Replace locals().get('goto_after_loading', False) with properly initialized boolean flag
- The locals() approach failed because locals() returns a copy, not a reference
- Now the fallback correctly skips error handling when pipeline loads successfully via detected class
-
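The fix in the commit above amounts to initializing the flag as a real boolean before the fallback loop. The loader plumbing here is invented for illustration; only the flag name comes from the commit.

```python
# Sketch of the fixed control flow: the flag is a properly initialized
# local, so the error-handling branch is skipped once a loader succeeds.

def load_with_fallback(loaders):
    """Try loaders in order; raise only if every one of them fails."""
    goto_after_loading = False      # initialized up front, not via locals()
    result = None
    for load in loaders:
        try:
            result = load()
            goto_after_loading = True   # pipeline loaded via detected class
            break
        except Exception:
            continue                    # try the next candidate class
    if not goto_after_loading:
        raise RuntimeError("all pipeline classes failed to load")
    return result
```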
Stefy Lanza (nextime / spora ) authored
- Add fallback mechanism for models with incorrect model_index.json
- Detect pipeline class from model ID patterns when component mismatch occurs
- Fix indentation error in auto mode retry logic block
- Properly handle Wan2.2-I2V models with misconfigured pipeline class
-
Stefy Lanza (nextime / spora ) authored
- Track if user explicitly specified --model before auto mode runs
- Skip retry with alternative models when user's model fails
- Show clear error message explaining user's choice is preserved
- Only auto-selected models can be retried with alternatives
-
Stefy Lanza (nextime / spora ) authored
- Track failed base models in _failed_base_models set
- Skip LoRA adapters that depend on failed base models during retry
- Try non-LoRA alternatives when all LoRAs with same base fail
- Improve error detection for 'Repository Not Found' errors
- Show skipped LoRA count during retry process
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Skip models not found on HuggingFace instead of adding with defaults
- Add deep search for model variants from known organizations
- Search organizations: Alpha-VLLM, stepvideo, hpcai-tech, tencent, rhymes-ai, THUDM, genmo, Wan-AI, stabilityai, black-forest-labs
- Remove non-existent models from known_large_models list
- Better error handling for model validation
-
Stefy Lanza (nextime / spora ) authored
- Add HF_TOKEN support to main pipeline loading (pipe_kwargs)
- Add HF_TOKEN support to VAE loading for Wan models
- Add HF_TOKEN support to image model loading for I2V mode
- Enhanced pipeline detection with multiple strategies
- Improved error messages for authentication errors (401, gated models)
- Added debug output for HF token status
-
Stefy Lanza (nextime / spora ) authored
- Fix retry logic bug: only run auto mode once (check for _auto_mode flag)
- Prevent infinite retry loops by preserving retry count across recursive calls
- Add better error handling for pipeline compatibility issues (FrozenDict, scale_factor errors)
- Add helpful troubleshooting messages for diffusers version incompatibilities
- Show retry exhaustion message when all alternative models fail
-
Stefy Lanza (nextime / spora ) authored
- Add DiffusionPipeline to PIPELINE_CLASS_MAP for generic model loading
- Add fallback to DiffusionPipeline for unknown pipeline classes
- Add return_all parameter to select_best_model() for getting all candidates
- Store alternative models in auto mode for retry support
- Implement retry logic when model loading fails in auto mode
- Retry up to 3 times with alternative models before failing
- Add debug output for model loading troubleshooting
- Improve error messages with troubleshooting hints
-
Stefy Lanza (nextime / spora ) authored
Features:
- Audio generation: TTS via Bark/Edge-TTS, music via MusicGen
- Audio sync: stretch, trim, pad, loop modes
- Lip sync: Wav2Lip and SadTalker integration
- Auto mode: automatic model selection with NSFW detection
- MCP server: AI agent integration via Model Context Protocol
- Model management: external config, search, validation
- T2I/I2I support: static image and image-to-image generation
- Time estimation: detailed timing breakdown for each step

Documentation:
- README.md: comprehensive installation and usage guide
- EXAMPLES.md: 100+ command-line examples
- SKILL.md: AI agent integration guide
- LICENSE.md: GPLv3 license

Copyleft © 2026 Stefy <stefy@nexlab.net>
-