Add character consistency features: IP-Adapter, InstantID, Character Profiles, LoRA Training

- Add IP-Adapter integration for character consistency using reference images
- Add InstantID support for superior face identity preservation
- Add Character Profile System to store reference images and face embeddings
- Add LoRA Training Workflow for perfect character consistency
- Add command-line arguments for all character consistency features
- Update EXAMPLES.md with comprehensive character consistency documentation
- Update requirements.txt with optional dependencies (insightface, onnxruntime)

New commands:
- --character: Use saved character profile
- --create-character: Create new character profile from reference images
- --list-characters: List all saved profiles
- --show-character: Show profile details
- --ipadapter: Enable IP-Adapter for consistency
- --instantid: Enable InstantID for face identity
- --train-lora: Train custom LoRA for character
parent 84d460f6
@@ -14,12 +14,13 @@ This document contains comprehensive examples for using the VideoGen toolkit, co
6. [Image-to-Image (I2I)](#image-to-image-i2i)
7. [Audio Generation](#audio-generation)
8. [Lip Sync](#lip-sync)
9. [Character Consistency](#character-consistency)
10. [Distributed Multi-GPU](#distributed-multi-gpu)
11. [Model Management](#model-management)
12. [VRAM Management](#vram-management)
13. [Upscaling](#upscaling)
14. [NSFW Content](#nsfw-content)
15. [Advanced Combinations](#advanced-combinations)
---
@@ -614,6 +615,227 @@ python3 videogen --image_to_video --model svd_xt_1.1 \
---
## Character Consistency
Character consistency features allow you to maintain the same character appearance across multiple generations using IP-Adapter, InstantID, Character Profiles, and LoRA training.
### Character Profiles
Character profiles store reference images and face embeddings for consistent character generation.
```bash
# Create a character profile from reference images
python3 videogen --create-character alice \
--character-images ref1.jpg ref2.jpg ref3.jpg \
--character-desc "young woman with blue eyes and blonde hair"
# List all saved character profiles
python3 videogen --list-characters
# Show details of a character profile
python3 videogen --show-character alice
# Use a character profile for generation
python3 videogen --model flux_dev \
--character alice \
--prompt "alice walking in a park" \
--output alice_park.png
# Use character profile with I2V
python3 videogen --image_to_video --model svd_xt_1.1 \
--image_model flux_dev \
--character alice \
--prompt "alice smiling at camera" \
--prompt_animation "subtle head movement" \
--output alice_animated
```
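A profile is built from its reference images, so a mistyped path is easiest to catch before the profile is created. The sketch below is a hypothetical pre-flight check and not part of videogen; the `require_images` helper name is an assumption, and it only tests that each file exists.

```bash
# Hypothetical helper (not part of videogen): succeed only if every
# listed reference image exists as a regular file.
require_images() {
  local missing=0 img
  for img in "$@"; do
    if [ ! -f "$img" ]; then
      echo "Missing reference image: $img" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: create the profile only when all references are present
# require_images ref1.jpg ref2.jpg ref3.jpg && \
#   python3 videogen --create-character alice \
#     --character-images ref1.jpg ref2.jpg ref3.jpg \
#     --character-desc "young woman with blue eyes and blonde hair"
```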
### IP-Adapter for Character Consistency
IP-Adapter uses reference images to maintain character identity across generations.
```bash
# Basic IP-Adapter usage
python3 videogen --model flux_dev \
--ipadapter \
--reference-images character_ref.jpg \
--prompt "portrait of the same person in different lighting" \
--output portrait_variant.png
# IP-Adapter with multiple reference images
python3 videogen --model sdxl_base \
--ipadapter \
--reference-images ref1.jpg ref2.jpg ref3.jpg \
--prompt "the person in a business suit" \
--output business.png
# IP-Adapter with custom scale (higher = more similar to reference)
python3 videogen --model flux_dev \
--ipadapter --ipadapter-scale 0.9 \
--reference-images character.jpg \
--prompt "the person in fantasy armor" \
--output fantasy_armor.png
# IP-Adapter with specific model variant
python3 videogen --model sdxl_base \
--ipadapter --ipadapter-model plus_sdxl \
--reference-images ref.jpg \
--prompt "cinematic portrait" \
--output cinematic.png
```
### InstantID for Face Identity
InstantID targets face identity specifically and generally preserves facial features more closely than IP-Adapter alone.
```bash
# Basic InstantID usage
python3 videogen --model flux_dev \
--instantid \
--reference-images face_ref.jpg \
--prompt "portrait in different style" \
--output styled_portrait.png
# InstantID with custom scale
python3 videogen --model sdxl_base \
--instantid --instantid-scale 0.85 \
--reference-images face.jpg \
--prompt "the person as a medieval knight" \
--output knight.png
# Combine IP-Adapter and InstantID for best results
python3 videogen --model flux_dev \
--ipadapter --ipadapter-scale 0.7 \
--instantid --instantid-scale 0.8 \
--reference-images ref1.jpg ref2.jpg \
--prompt "the person in a sci-fi setting" \
--output scifi.png
```
### LoRA Training for Characters
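Train a custom LoRA when you need the most consistent character across generations. `--train-lora` prepares the training setup and writes the training command to a script instead of running the training itself.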
```bash
# Prepare training data (collect 10-50 images of the character)
mkdir -p training_images/alice
# Copy your reference images to the directory
# Generate LoRA training setup
python3 videogen --train-lora alice \
--training-images ./training_images/alice \
--training-epochs 100 \
--lora-rank 4 \
--base-model-for-training runwayml/stable-diffusion-v1-5
# Higher rank LoRA (more detail, larger file)
python3 videogen --train-lora alice_detailed \
--training-images ./training_images/alice \
--training-epochs 200 \
--lora-rank 16
# The training command will be generated in:
# ~/.config/videogen/characters/alice/lora/train_alice.sh
```
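Before generating the training setup, it can help to confirm that the image directory actually holds enough images. The sketch below is a hypothetical pre-flight check, not part of videogen: the helper names, the list of file extensions, and the default minimum of 10 (taken from the comment in the block above) are assumptions.

```bash
# Hypothetical helper (not part of videogen): count image files in a directory.
# Assumes training images use .jpg, .jpeg, or .png extensions.
count_training_images() {
  local dir="$1"
  find "$dir" -maxdepth 1 -type f \
    \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) | wc -l | tr -d ' '
}

# Succeed only if the directory holds at least MIN images (default 10).
check_training_set() {
  local dir="$1" min="${2:-10}" count
  count="$(count_training_images "$dir")"
  if [ "$count" -lt "$min" ]; then
    echo "Only $count images in $dir; collect at least $min before training" >&2
    return 1
  fi
  echo "$count images found in $dir"
}

# Usage: start the LoRA setup only if the check passes
# check_training_set ./training_images/alice && \
#   python3 videogen --train-lora alice --training-images ./training_images/alice
```

The check only counts files; it does not judge image quality or variety.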
### Complete Character Consistency Workflow
```bash
# Step 1: Create character profile
python3 videogen --create-character my_character \
--character-images photo1.jpg photo2.jpg photo3.jpg \
--character-desc "detailed character description"
# Step 2: Generate base image with character
python3 videogen --model flux_dev \
--character my_character \
--ipadapter --ipadapter-scale 0.8 \
--instantid --instantid-scale 0.85 \
--prompt "my_character in casual clothes at a cafe" \
--output base_image.png
# Step 3: Create variations
python3 videogen --model flux_dev \
--character my_character \
--ipadapter --instantid \
--prompt "my_character in formal attire at a gala" \
--output formal.png
# Step 4: Animate with I2V
python3 videogen --model svd_xt_1.1 \
--image base_image.png \
--character my_character \
--prompt "subtle natural movement" \
--output animated
# Step 5: Add audio with lip sync
python3 videogen --model svd_xt_1.1 \
--image base_image.png \
--character my_character \
--prompt "speaking naturally" \
--generate_audio --audio_type tts \
--audio_text "Hello, nice to meet you" \
--lip_sync \
--output speaking
```
### Character Consistency for Video Series
```bash
# Create a character for a video series
python3 videogen --create-character series_protagonist \
--character-images protagonist_*.jpg \
--character-desc "main character for video series"
# Generate multiple scenes with the same character
SCENES=(
"walking through a forest"
"entering a mysterious cave"
"discovering a treasure chest"
"celebrating the discovery"
)
# Iterate over the array indices so each scene gets a numbered output
for i in "${!SCENES[@]}"; do
  python3 videogen --model wan_14b_t2v \
    --character series_protagonist \
    --ipadapter --instantid \
    --prompt "series_protagonist ${SCENES[$i]}" \
    --output "scene_$i"
done
```
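Once a series grows past ten scenes, plain numeric suffixes stop sorting in order (`scene_10` lists before `scene_2`). A minimal sketch of zero-padded, 1-based output names follows; the `scene_output_name` helper is hypothetical and not part of videogen.

```bash
# Hypothetical helper: build a zero-padded, 1-based output name
# from a 0-based scene index passed as $1.
scene_output_name() {
  printf 'scene_%02d\n' "$(( $1 + 1 ))"
}

scene_output_name 0    # prints scene_01
scene_output_name 11   # prints scene_12
```

Inside a loop over scene indices, the result can be passed along as `--output "$(scene_output_name "$i")"`.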
### Character Consistency Flags
| Flag | Description | Example |
|------|-------------|---------|
| `--character` | Use saved character profile | `--character alice` |
| `--create-character` | Create new profile | `--create-character bob` |
| `--character-images` | Reference images for profile | `--character-images img1.jpg img2.jpg` |
| `--character-desc` | Character description | `--character-desc "tall man with beard"` |
| `--list-characters` | List all profiles | `--list-characters` |
| `--show-character` | Show profile details | `--show-character alice` |
| `--ipadapter` | Enable IP-Adapter | `--ipadapter` |
| `--ipadapter-scale` | IP-Adapter influence | `--ipadapter-scale 0.8` |
| `--ipadapter-model` | IP-Adapter variant | `--ipadapter-model plus_sdxl` |
| `--reference-images` | Images for IP-Adapter/InstantID | `--reference-images ref.jpg` |
| `--instantid` | Enable InstantID | `--instantid` |
| `--instantid-scale` | InstantID influence | `--instantid-scale 0.85` |
| `--train-lora` | Train character LoRA | `--train-lora alice` |
| `--training-images` | Training image directory | `--training-images ./images/` |
| `--training-epochs` | Training epochs | `--training-epochs 100` |
| `--lora-rank` | LoRA rank | `--lora-rank 4` |
### Character Consistency Tips
1. **Reference Images**: Use 3-10 high-quality reference images showing different angles and expressions
2. **IP-Adapter Scale**: 0.7-0.9 works best; higher values = more similar to reference
3. **InstantID**: Better for face identity; IP-Adapter better for overall style
4. **Combining Methods**: Use IP-Adapter and InstantID together to cover both overall style (IP-Adapter) and face identity (InstantID)
5. **LoRA Training**: The most consistent option, but it requires 20-50+ training images
6. **Character Profiles**: Store embeddings to avoid re-extracting faces each time
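
Tip 2 gives a range rather than a single value, so one way to choose a scale is to render the same prompt at several settings and compare the results. The sketch below only prints the commands (a dry run) so they can be reviewed before anything is generated. The `print_scale_sweep` helper and the `sweep_<scale>.png` output names are hypothetical; the flags are the ones listed in the table above. Pipe the output to `bash` to run the commands; prompts that contain double quotes would need extra escaping.

```bash
# Hypothetical helper: print one videogen command per IP-Adapter scale.
# It prints instead of running, so nothing is generated by this function.
print_scale_sweep() {
  local ref="$1" prompt="$2" scale
  for scale in 0.7 0.8 0.9; do
    printf 'python3 videogen --model flux_dev --ipadapter --ipadapter-scale %s --reference-images %s --prompt "%s" --output sweep_%s.png\n' \
      "$scale" "$ref" "$prompt" "$scale"
  done
}

print_scale_sweep character.jpg "the person in fantasy armor"
```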
---
## Distributed Multi-GPU
### Basic Distributed Setup
......
@@ -31,6 +31,11 @@ opencv-python>=4.8.0
face-recognition>=1.14.0
# dlib # Install with: pip install dlib (requires cmake)
# Character Consistency Dependencies (Optional - for IP-Adapter, InstantID)
# insightface>=0.7.3 # Install with: pip install insightface
# onnxruntime-gpu>=1.16.0 # Required for insightface GPU acceleration
# or onnxruntime>=1.16.0 # CPU only
# Model Management
requests>=2.31.0
urllib3>=2.0.0
......
This diff is collapsed.