Add audio generation, auto mode, MCP server, and comprehensive documentation

Features:
- Audio generation: TTS via Bark/Edge-TTS, music via MusicGen
- Audio sync: stretch, trim, pad, loop modes
- Lip sync: Wav2Lip and SadTalker integration
- Auto mode: automatic model selection with NSFW detection
- MCP server: AI agent integration via Model Context Protocol
- Model management: external config, search, validation
- T2I/I2I support: static image and image-to-image generation
- Time estimation: detailed timing breakdown for each step

Documentation:
- README.md: comprehensive installation and usage guide
- EXAMPLES.md: 100+ command-line examples
- SKILL.md: AI agent integration guide
- LICENSE.md: GPLv3 license

Copyleft © 2026 Stefy <stefy@nexlab.net>
# VideoGen - Universal Video Generation Toolkit
**Copyleft © 2026 Stefy <stefy@nexlab.net>**
A comprehensive, GPU-accelerated video generation toolkit supporting Text-to-Video (T2V), Image-to-Video (I2V), Text-to-Image (T2I), and Image-to-Image (I2I) generation with audio synthesis, synchronization, and lip-sync capabilities.
---
## Features
### Video Generation
- **Text-to-Video (T2V)**: Generate videos from text prompts
- **Image-to-Video (I2V)**: Animate static images
- **Text-to-Image (T2I)**: Generate high-quality images
- **Image-to-Image (I2I)**: Transform existing images
### Audio Capabilities
- **Text-to-Speech (TTS)**: Multiple voices via Bark and Edge-TTS
- **Music Generation**: MusicGen integration for background music
- **Audio Sync**: Match audio duration to video (stretch, trim, pad, loop)
- **Lip Sync**: Wav2Lip and SadTalker integration
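The four sync modes above can be sketched in plain Python. This is an illustrative model of the length logic only: a real implementation would operate on waveform arrays (e.g. via librosa/soundfile), and `stretch` would time-stretch without changing pitch rather than naively resampling.

```python
def sync_audio(samples, target_len, mode="pad"):
    """Return a sample list whose length matches target_len.

    Illustrative sketch of the four sync modes; not videogen's internals.
    """
    n = len(samples)
    if n == target_len:
        return list(samples)
    if mode == "trim":
        # cut the tail to fit the video duration
        return list(samples[:target_len])
    if mode == "pad":
        # extend with silence (zeros) up to the target
        return list(samples[:target_len]) + [0.0] * max(0, target_len - n)
    if mode == "loop":
        # repeat the clip until the target length is reached
        out = []
        while len(out) < target_len:
            out.extend(samples)
        return out[:target_len]
    if mode == "stretch":
        # naive nearest-neighbour resample; real code would time-stretch
        # without changing pitch (e.g. librosa.effects.time_stretch)
        return [samples[int(i * n / target_len)] for i in range(target_len)]
    raise ValueError(f"unknown mode: {mode}")
```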
### Model Support
- **Small Models** (<16GB VRAM): Wan 1.3B, Zeroscope, ModelScope
- **Medium Models** (16-30GB VRAM): Wan 14B, CogVideoX, Mochi
- **Large Models** (30-50GB VRAM): Allegro, HunyuanVideo
- **Huge Models** (50GB+ VRAM): Open-Sora, Step-Video, Lumina
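A minimal sketch of how an available-VRAM figure maps onto these tiers. The thresholds are copied from the list above; the function itself is illustrative, not videogen's actual selection logic.

```python
def tier_for_vram(vram_gb):
    """Map available VRAM (GB) to the model tiers listed above.

    Thresholds follow the README table; this is a sketch, not
    videogen's internal model-selection code.
    """
    if vram_gb >= 50:
        return "huge"    # Open-Sora, Step-Video, Lumina
    if vram_gb >= 30:
        return "large"   # Allegro, HunyuanVideo
    if vram_gb >= 16:
        return "medium"  # Wan 14B, CogVideoX, Mochi
    return "small"       # Wan 1.3B, Zeroscope, ModelScope
```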
### Smart Features
- **Auto Mode**: Automatic model selection and configuration
- **NSFW Detection**: Automatic content classification
- **Prompt Splitting**: Intelligent I2V prompt separation
- **Time Estimation**: Predict generation time before starting
- **Multi-GPU**: Distributed generation across multiple GPUs
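As a toy model of the time-estimation feature: generation time scales roughly with frame count times denoising steps. The rate constant below is a made-up placeholder, not a measured figure from videogen.

```python
def estimate_seconds(num_frames, steps, sec_per_step_per_frame=0.05):
    """Rough generation-time estimate: frames x steps x per-unit cost.

    The default rate is a placeholder for illustration; real estimates
    depend on the model, resolution, and GPU.
    """
    return num_frames * steps * sec_per_step_per_frame
```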
### AI Integration
- **MCP Server**: Model Context Protocol wrapper for AI agents
- **Skill Documentation**: Comprehensive AI agent integration guide
---
## Installation
### Core Dependencies
```bash
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121 --break-system-packages
pip install git+https://github.com/huggingface/diffusers.git --break-system-packages
pip install git+https://github.com/huggingface/transformers.git --break-system-packages
pip install --upgrade accelerate xformers spandrel psutil ffmpeg-python ftfy --break-system-packages
```
### Audio Features (Optional)
```bash
pip install scipy soundfile librosa --break-system-packages
pip install git+https://github.com/suno-ai/bark.git --break-system-packages
pip install edge-tts --break-system-packages
pip install audiocraft
```
### Lip Sync (Optional)
```bash
pip install opencv-python face-recognition dlib --break-system-packages
git clone https://github.com/Rudrabha/Wav2Lip.git
```
### MCP Server (For AI Agents)
```bash
pip install mcp
```
---
## Quick Start
### First-Time Setup
**IMPORTANT**: Before using VideoGen, update the model database:
```bash
python3 videogen --update-models
```
This fetches the latest model list from HuggingFace and populates the local database.
### Basic Usage
```bash
# Simple video generation
python3 videogen --model wan_1.3b_t2v --prompt "a cat playing piano" --output cat_piano
# Auto mode - let the script decide everything
python3 videogen --auto --prompt "a beautiful sunset over the ocean"
# Generate with audio
python3 videogen --model wan_14b_t2v --prompt "epic battle scene" \
    --generate_audio --audio_type music --sync_audio --output battle
```
### Image-to-Video
```bash
# Animate an existing image
python3 videogen --model svd_xt_1.1 --image my_photo.jpg \
    --prompt "add subtle motion" --output animated
# I2V with auto-generated image
python3 videogen --image_to_video --model svd_xt_1.1 \
    --image_model flux_dev --prompt "cinematic portrait" \
    --prompt_animation "gentle head movement" --output portrait
```
### With Lip Sync
```bash
python3 videogen --image_to_video --model svd_xt_1.1 \
    --image_model flux_dev --prompt "person speaking" \
    --generate_audio --audio_type tts \
    --audio_text "Hello, welcome to my channel" \
    --lip_sync --output speaker
```
---
## AI Agent Integration
### MCP Server
VideoGen includes an MCP (Model Context Protocol) server for seamless integration with AI agents like Claude:
```bash
# Start the MCP server
python3 videogen_mcp_server.py
```
Add to Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):
```json
{
  "mcpServers": {
    "videogen": {
      "command": "python3",
      "args": ["/path/to/videogen_mcp_server.py"]
    }
  }
}
```
### Available MCP Tools
| Tool | Description |
|------|-------------|
| `videogen_generate` | Generate video with auto mode |
| `videogen_generate_video` | Text-to-Video generation |
| `videogen_generate_image` | Text-to-Image generation |
| `videogen_animate_image` | Image-to-Video animation |
| `videogen_transform_image` | Image-to-Image transformation |
| `videogen_generate_with_audio` | Video with TTS or music |
| `videogen_list_models` | List available models |
| `videogen_show_model` | Show model details |
| `videogen_update_models` | Update model database |
| `videogen_search_models` | Search HuggingFace |
| `videogen_add_model` | Add model to database |
| `videogen_list_tts_voices` | List TTS voices |
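Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 `tools/call` messages. A sketch of such a request follows; the tool name comes from the table above, but the `prompt` argument name is illustrative and the server's actual argument schema may differ.

```python
import json

# Sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
# The tool name is from the table above; the argument name "prompt" is an
# assumption about the schema, shown for illustration only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "videogen_generate",
        "arguments": {
            "prompt": "a cat playing piano",  # hypothetical parameter name
        },
    },
}

wire = json.dumps(request)  # what actually travels over stdio
```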
### Skill Documentation
See [SKILL.md](SKILL.md) for a comprehensive AI agent integration guide, including:
- Quick reference commands
- Common use cases
- Model selection guide
- Error handling
- Programmatic usage examples
---
## Documentation
- **[EXAMPLES.md](EXAMPLES.md)**: Comprehensive command-line examples for all features
- **[SKILL.md](SKILL.md)**: AI agent integration guide
- **Built-in help**: `python3 videogen --help`
- **Model list**: `python3 videogen --model-list`
- **TTS voices**: `python3 videogen --tts-list`
---
## Model Management
```bash
# Update model database (run this first!)
python3 videogen --update-models
# List available models
python3 videogen --model-list
# List models by VRAM requirement
python3 videogen --model-list --low-vram   # <16GB
python3 videogen --model-list --high-vram  # >30GB
python3 videogen --model-list --huge-vram  # >50GB
# Search HuggingFace for models
python3 videogen --search-models "video generation"
# Add a model
python3 videogen --add-model stabilityai/stable-video-diffusion-img2vid-xt-1.1 --name svd_xt
# Show model details
python3 videogen --show-model 1
```
---
## VRAM Management
```bash
# Limit VRAM usage
python3 videogen --model wan_14b_t2v --prompt "test" --vram_limit 16
# Offloading strategies
python3 videogen --model wan_14b_t2v --prompt "test" --offload_strategy sequential
# Low RAM mode
python3 videogen --model wan_14b_t2v --prompt "test" --low_ram_mode
```
---
## Distributed Generation
```bash
# Multi-GPU distributed generation
python3 videogen --model hunyuanvideo --prompt "epic scene" \
    --length 30 --distribute --vram_limit 20
```
---
## Configuration
Models are stored in `~/.config/videogen/models.json`
Set environment variables:
```bash
export HF_TOKEN=your_token_here # For gated models
export HF_HOME=/path/to/cache # Custom cache directory
export CUDA_VISIBLE_DEVICES=0,1 # GPU selection
```
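A hedged sketch of reading the model database from its documented location and filtering by a VRAM budget. The `vram_gb` field name is an assumption about the `models.json` schema, not a documented key.

```python
import json
from pathlib import Path

def load_models(path=None, vram_limit=None):
    """Load the model database and optionally filter by VRAM budget.

    Assumes models.json holds a list of objects with a "vram_gb" field;
    that schema is an illustration, not videogen's documented format.
    """
    path = Path(path or Path.home() / ".config" / "videogen" / "models.json")
    models = json.loads(path.read_text())
    if vram_limit is not None:
        models = [m for m in models if m.get("vram_gb", 0) <= vram_limit]
    return models
```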
---
## Project Structure
```
videogen/
├── videogen # Main script
├── videogen_mcp_server.py # MCP server for AI agents
├── README.md # This file
├── EXAMPLES.md # Comprehensive examples
├── SKILL.md # AI agent integration guide
├── LICENSE.md # GPLv3 License
└── requirements.txt # Python dependencies
```
---
## License
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
See [LICENSE.md](LICENSE.md) for the full license text.
---
## Copyleft
**VideoGen - Universal Video Generation Toolkit**
Copyright © 2026 Stefy <stefy@nexlab.net>
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
---
## Contributing
Contributions are welcome! Please feel free to submit pull requests.
---
## Support
For issues and questions, please open an issue on the repository or contact stefy@nexlab.net.
# VideoGen - Universal Video Generation Toolkit
# Copyleft © 2026 Stefy <stefy@nexlab.net>
# Core Dependencies (Required)
torch>=2.0.0
torchvision>=0.15.0
torchaudio>=2.0.0
diffusers>=0.30.0
transformers>=4.35.0
accelerate>=0.24.0
xformers>=0.0.22
spandrel>=0.1.0
psutil>=5.9.0
ffmpeg-python>=0.2.0
ftfy>=6.1.0
Pillow>=10.0.0
safetensors>=0.4.0
huggingface-hub>=0.19.0
# Audio Dependencies (Optional - for TTS and music generation)
scipy>=1.11.0
soundfile>=0.12.0
librosa>=0.10.0
edge-tts>=6.1.0
# bark # Install with: pip install git+https://github.com/suno-ai/bark.git
# audiocraft # Install with: pip install audiocraft
# Lip Sync Dependencies (Optional)
opencv-python>=4.8.0
face-recognition>=1.3.0
# dlib # Install with: pip install dlib (requires cmake)
# Model Management
requests>=2.31.0
urllib3>=2.0.0
# Progress and UI
tqdm>=4.66.0
rich>=13.0.0
# Configuration
pydantic>=2.0.0
# Distributed Processing
# accelerate # Already listed above
# Optional: NSFW Classification
# onnxruntime>=1.16.0