Add web interface for VideoGen

Features:
- Modern web UI with all generation modes (T2V, I2V, T2I, I2I, V2V, Dub, Subtitles, Upscale)
- Real-time progress updates via WebSocket
- File upload for input images/videos/audio
- File download for generated content
- Background job processing with progress tracking
- Job management (cancel, retry, delete)
- Gallery for browsing generated files
- REST API for programmatic access
- Responsive design for desktop and mobile

Backend (webapp.py):
- Flask + Flask-SocketIO for real-time updates
- Background job processing with threading
- File upload/download handling
- Job state persistence
- REST API endpoints

Frontend:
- Modern dark theme UI
- Mode selection with visual cards
- Form with all options and settings
- Real-time progress modal with log streaming
- Toast notifications
- Keyboard shortcuts (Ctrl+Enter to submit, Escape to close)

Documentation:
- Updated README.md with web interface section
- Updated EXAMPLES.md with web interface usage
- Updated requirements.txt with web dependencies
parent 344cd12a
@@ -8,24 +8,25 @@ This document contains comprehensive examples for using the VideoGen toolkit, co
1. [Basic Usage](#basic-usage)
2. [Auto Mode](#auto-mode)
3. [Web Interface](#web-interface)
4. [Text-to-Video (T2V)](#text-to-video-t2v)
5. [Image-to-Video (I2V)](#image-to-video-i2v)
6. [Text-to-Image (T2I)](#text-to-image-t2i)
7. [Image-to-Image (I2I)](#image-to-image-i2i)
8. [Video-to-Video (V2V)](#video-to-video-v2v)
9. [Video-to-Image (V2I)](#video-to-image-v2i)
10. [2D-to-3D Conversion](#2d-to-3d-conversion)
11. [Audio Generation](#audio-generation)
12. [Lip Sync](#lip-sync)
13. [Video Dubbing & Translation](#video-dubbing--translation)
14. [Subtitle Generation](#subtitle-generation)
15. [Character Consistency](#character-consistency)
16. [Distributed Multi-GPU](#distributed-multi-gpu)
17. [Model Management](#model-management)
18. [VRAM Management](#vram-management)
19. [Upscaling](#upscaling)
20. [NSFW Content](#nsfw-content)
21. [Advanced Combinations](#advanced-combinations)
---
@@ -59,6 +60,145 @@ python3 videogen --auto --prefer-speed --prompt "quick animation test"
---
## Web Interface
VideoGen includes a modern web interface for easy access to all features without using the command line.
### Starting the Web Server
```bash
# Start on default port (5000)
python3 webapp.py
# Start on custom port
python3 webapp.py --port 8080
# Start accessible from network
python3 webapp.py --host 0.0.0.0 --port 5000
# Start with debug mode
python3 webapp.py --debug
```
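The flags above could be parsed with `argparse` along the following lines. This is only a sketch of how such options are commonly wired up, not the parser in `webapp.py`; the `build_parser` name and the `127.0.0.1` default host are assumptions, while the default port of 5000 comes from the examples above.

```python
import argparse


def build_parser():
    """Build a parser for the --host, --port and --debug flags (illustrative)."""
    parser = argparse.ArgumentParser(description="VideoGen web interface (sketch)")
    # Assumed default: listen on loopback only unless --host 0.0.0.0 is given.
    parser.add_argument("--host", default="127.0.0.1",
                        help="interface to bind; 0.0.0.0 exposes the server to the network")
    # Default port taken from the documentation.
    parser.add_argument("--port", type=int, default=5000,
                        help="TCP port to listen on")
    parser.add_argument("--debug", action="store_true",
                        help="enable debug mode")
    return parser
```

Calling `build_parser().parse_args(["--port", "8080"])` yields a namespace with `port == 8080` and the other options at their defaults.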
### Web Interface Features
The web interface provides access to all VideoGen features:
1. **Generate Tab**
- Mode selection (T2V, I2V, T2I, I2I, V2V, Dub, Subtitles, Upscale)
- Prompt input with quick hints
- Model selection with auto-mode option
- File upload for input images/videos/audio
- Resolution, FPS, duration settings
- Audio options (TTS, music, sync, lip-sync)
- Translation/dubbing options
- Advanced settings (offloading, VRAM limit, debug)
2. **Jobs Tab**
- Real-time job monitoring
- Progress bars with status text
- Output log streaming
- Cancel/retry/delete jobs
- Download generated files
3. **Gallery Tab**
- Browse all generated files
- Preview videos and images
- Download files
- Delete files
4. **Settings Tab**
- Server configuration
- Default values
- About information
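The job handling behind the Jobs tab (background execution, progress tracking, cancellation) might be structured roughly as in the sketch below. It is not the code in `webapp.py`: the `Job` class, `run_job`, `start_job`, and the status names are hypothetical and chosen only to illustrate a thread-per-job design.

```python
import threading
import uuid
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical in-memory job record; field names are illustrative."""
    id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])
    status: str = "queued"   # assumed states: queued, running, completed, cancelled, failed
    progress: int = 0        # percent, 0-100
    cancel_event: threading.Event = field(default_factory=threading.Event)


def run_job(job, steps, work):
    """Call work(step) for each step, updating progress and honouring cancellation."""
    job.status = "running"
    try:
        for step in range(steps):
            if job.cancel_event.is_set():
                job.status = "cancelled"
                return
            work(step)
            job.progress = int(100 * (step + 1) / steps)
        job.status = "completed"
    except Exception:
        # Any error in the work function marks the job as failed.
        job.status = "failed"


def start_job(steps, work):
    """Run a job on a daemon thread and return (job, thread)."""
    job = Job()
    thread = threading.Thread(target=run_job, args=(job, steps, work), daemon=True)
    thread.start()
    return job, thread
```

A cancel request then only needs to call `job.cancel_event.set()`; the worker notices it before the next step and stops.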
### Using the Web Interface
1. **Start the server:**
```bash
python3 webapp.py
```
2. **Open in browser:**
Navigate to `http://localhost:5000`
3. **Select a mode:**
Click on the desired generation mode (T2V, I2V, etc.)
4. **Enter prompt:**
Type your prompt or use the quick hints
5. **Configure settings:**
Select model, resolution, duration, etc.
6. **Upload files (if needed):**
For I2I, V2V, dubbing - upload input files
7. **Click Generate:**
Watch real-time progress in the modal
8. **Download result:**
From the modal, jobs list, or gallery
### Web Interface API
The web interface also provides a REST API:
```bash
# Get model list
curl http://localhost:5000/api/models
# Get TTS voices
curl http://localhost:5000/api/tts-voices
# Get languages
curl http://localhost:5000/api/languages
# Upload a file
curl -X POST -F "file=@image.png" -F "type=image" http://localhost:5000/api/upload
# Create a job
curl -X POST -H "Content-Type: application/json" \
-d '{"mode":"t2v","prompt":"a cat playing","model":"wan_1.3b_t2v"}' \
http://localhost:5000/api/jobs
# Get job status
curl http://localhost:5000/api/jobs/JOB_ID
# Cancel a job
curl -X POST http://localhost:5000/api/jobs/JOB_ID/cancel
# List outputs
curl http://localhost:5000/api/outputs
# Download a file
curl http://localhost:5000/api/download/filename.mp4 -o output.mp4
```
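The same create-then-poll flow can be scripted from Python with only the standard library. The sketch below reuses the endpoints from the `curl` examples; the helper names, the terminal status values (`completed`, `failed`, `cancelled`), and the assumption that the job JSON carries a `status` field on this endpoint are illustrative guesses, not documented behaviour.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:5000"  # default address used in the examples above

# Assumed terminal states; check the actual API responses before relying on them.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def build_job_request(payload, base_url=BASE_URL):
    """Build the POST /api/jobs request for a job description dict."""
    return urllib.request.Request(
        f"{base_url}/api/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_job(job_id, base_url=BASE_URL):
    """GET /api/jobs/<job_id> and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/jobs/{job_id}") as resp:
        return json.load(resp)


def wait_for_job(job_id, fetch=fetch_job, interval=2.0, max_polls=300):
    """Poll until the job reports a terminal status, then return its last state."""
    for _ in range(max_polls):
        job = fetch(job_id)
        if job.get("status") in TERMINAL_STATES:
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

To submit a job, pass the request to `urllib.request.urlopen(build_job_request({...}))` and read the job identifier from the response; the name of that field is not shown in the source, so inspect the returned JSON first.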
### WebSocket Events
For real-time updates, connect via Socket.IO:
```javascript
const socket = io('http://localhost:5000');
// Subscribe to job updates
socket.emit('subscribe_job', 'JOB_ID');
// Receive job updates
socket.on('job_update', (job) => {
console.log('Progress:', job.progress);
console.log('Status:', job.status);
});
// Receive log lines
socket.on('job_log', (data) => {
console.log('Log:', data.line);
});
```
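A Python client can follow the same events with the `python-socketio` package. The sketch below assumes that `job.progress` is a percentage between 0 and 100 and that `job.status` is a short string; neither is confirmed by the source. `render_progress` and `watch_job` are hypothetical helper names.

```python
def render_progress(job, width=20):
    """Turn a job_update payload into a one-line text progress bar."""
    # Clamp to 0-100 in case the server reports an out-of-range value.
    percent = max(0, min(100, int(job.get("progress", 0))))
    filled = width * percent // 100
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] {percent:3d}% {job.get('status', 'unknown')}"


def watch_job(job_id, url="http://localhost:5000"):
    """Subscribe to one job and print its updates until the connection closes."""
    # Third-party dependency, imported lazily: pip install "python-socketio[client]"
    import socketio

    sio = socketio.Client()

    @sio.on("job_update")
    def on_update(job):
        print(render_progress(job))

    @sio.on("job_log")
    def on_log(data):
        print("Log:", data["line"])

    sio.connect(url)
    sio.emit("subscribe_job", job_id)
    sio.wait()  # block until the client disconnects
```

Keeping the formatting in a separate pure function makes it easy to test without a running server.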
---
## Auto Mode
Auto mode analyzes your prompt and automatically:
...
@@ -2,7 +2,7 @@
**Copyleft © 2026 Stefy <stefy@nexlab.net>**
A comprehensive, GPU-accelerated video generation toolkit supporting Text-to-Video (T2V), Image-to-Video (I2V), Text-to-Image (T2I), Image-to-Image (I2I), Video-to-Video (V2V), Video-to-Image (V2I), and 2D-to-3D conversion with audio synthesis, synchronization, lip-sync, dubbing, and translation capabilities.
---
@@ -34,6 +34,13 @@ A comprehensive, GPU-accelerated video generation toolkit supporting Text-to-Vid
- **Audio Sync**: Match audio duration to video (stretch, trim, pad, loop)
- **Lip Sync**: Wav2Lip and SadTalker integration
### Video Dubbing & Translation
- **Video Dubbing**: Translate and dub videos while preserving original voice
- **Voice Cloning**: Preserve speaker's voice in translated video
- **Automatic Subtitles**: Generate subtitles using Whisper
- **Subtitle Translation**: Translate subtitles to 20+ languages
- **Subtitle Burning**: Burn subtitles directly into video
### Model Support
- **Small Models** (<16GB VRAM): Wan 1.3B, Zeroscope, ModelScope
- **Medium Models** (16-30GB VRAM): Wan 14B, CogVideoX, Mochi
@@ -44,12 +51,14 @@ A comprehensive, GPU-accelerated video generation toolkit supporting Text-to-Vid
- **Auto Mode**: Automatic model selection and configuration
- **NSFW Detection**: Automatic content classification
- **Prompt Splitting**: Intelligent I2V prompt separation
- **Time Estimation**: Hardware-aware generation time prediction
- **Multi-GPU**: Distributed generation across multiple GPUs
- **Auto-Disable**: Models that fail 3 times are auto-disabled
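The auto-disable rule could be tracked with a simple failure counter, as in the sketch below. The source only states the threshold of three failures; whether failures are counted in total or consecutively is not specified, so this version counts total failures. `ModelHealth` and its methods are hypothetical names.

```python
from collections import Counter

MAX_FAILURES = 3  # threshold stated in the feature list


class ModelHealth:
    """Track failures per model and disable a model once it hits the threshold."""

    def __init__(self, max_failures=MAX_FAILURES):
        self.max_failures = max_failures
        self.failures = Counter()
        self.disabled = set()

    def record_failure(self, model):
        """Count one failure; disable the model when the limit is reached."""
        self.failures[model] += 1
        if self.failures[model] >= self.max_failures:
            self.disabled.add(model)

    def is_enabled(self, model):
        return model not in self.disabled
```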
### User Interfaces
- **Command Line**: Full-featured CLI with all options
- **Web Interface**: Modern web UI with real-time progress updates
- **MCP Server**: Model Context Protocol wrapper for AI agents
- **Skill Documentation**: Comprehensive AI agent integration guide
---
@@ -77,6 +86,11 @@ pip install opencv-python face-recognition dlib --break-system-packages
```
git clone https://github.com/Rudrabha/Wav2Lip.git
```
### Web Interface
```bash
pip install flask flask-cors flask-socketio eventlet
```
### MCP Server (For AI Agents)
```bash
pip install mcp
```
@@ -187,6 +201,38 @@ See [SKILL.md](SKILL.md) for comprehensive AI agent integration guide including:
---
## Web Interface
VideoGen includes a modern web interface for easy access to all features:
### Starting the Web Server
```bash
python3 webapp.py --port 5000 --host 0.0.0.0
```
Then open `http://localhost:5000` in your browser.
### Web Interface Features
- **All Generation Modes**: T2V, I2V, T2I, I2I, V2V, Upscale, Dubbing, Subtitles
- **Real-time Progress**: Live progress updates with output log streaming
- **File Upload/Download**: Upload images, videos, audio; download generated content
- **Model Selection**: Browse and select from all available models
- **Job Management**: View, cancel, retry, and track all generation jobs
- **Gallery**: Browse and download all generated files
- **Responsive Design**: Works on desktop and mobile devices
### Web Interface Tabs
The web interface is organized into four tabs:
1. **Generate Tab**: Main generation form with all options
2. **Jobs Tab**: Real-time job monitoring with progress bars
3. **Gallery Tab**: Browse and download generated content
4. **Settings Tab**: Configuration and about information
---
## Documentation
- **[EXAMPLES.md](EXAMPLES.md)**: Comprehensive command-line examples for all features
@@ -266,6 +312,12 @@ export CUDA_VISIBLE_DEVICES=0,1 # GPU selection
```
videogen/
├── videogen # Main script
├── webapp.py # Web interface server
├── templates/ # HTML templates
│ └── index.html # Main web UI
├── static/ # Static assets
│ ├── css/style.css # Styles
│ └── js/app.js # JavaScript
├── videogen_mcp_server.py # MCP server for AI agents
├── README.md # This file
├── EXAMPLES.md # Comprehensive examples
```
...
@@ -55,5 +55,16 @@ pydantic>=2.0.0
# Distributed Processing
# accelerate # Already listed above
# Web Interface Dependencies (Optional - for webapp.py)
flask>=3.0.0
flask-cors>=4.0.0
flask-socketio>=5.3.0
eventlet>=0.33.0
python-socketio>=5.10.0
werkzeug>=3.0.0
# MCP Server Dependencies (Optional - for AI agent integration)
# mcp>=0.9.0 # Install with: pip install mcp
# Optional: NSFW Classification
# onnxruntime>=1.16.0
\ No newline at end of file
This diff is collapsed.