Add character consistency features, fix model loading for non-diffusers models

- Add character profile management (create, list, show, delete)
- Add IP-Adapter and InstantID support for character consistency
- Fix model loading for models with config.json only (no model_index.json)
- Add component-only model detection (fine-tuned weights)
- Update MCP server with character consistency tools
- Update SKILL.md and README.md documentation
- Add memory management for dubbing/translation
- Add chunked processing for Whisper transcription
- Add character consistency options to web interface
parent 627eb38f
@@ -47,6 +47,12 @@ A comprehensive, GPU-accelerated video generation toolkit supporting Text-to-Vid
- **Large Models** (30-50GB VRAM): Allegro, HunyuanVideo
- **Huge Models** (50GB+ VRAM): Open-Sora, Step-Video, Lumina
### Character Consistency
- **Character Profiles**: Save and reuse character references across generations
- **IP-Adapter**: Image prompt adapter for consistent character generation
- **InstantID**: Face identity preservation for consistent faces
- **Reference Images**: Use multiple reference images for character consistency
### Smart Features
- **Auto Mode**: Automatic model selection and configuration
- **NSFW Detection**: Automatic content classification
@@ -54,6 +60,7 @@ A comprehensive, GPU-accelerated video generation toolkit supporting Text-to-Vid
- **Time Estimation**: Hardware-aware generation time prediction
- **Multi-GPU**: Distributed generation across multiple GPUs
- **Auto-Disable**: Models that fail 3 times are auto-disabled
- **Memory Management**: Automatic chunking for long videos and low VRAM
### User Interfaces
- **Command Line**: Full-featured CLI with all options
@@ -147,6 +154,40 @@ python3 videogen --image_to_video --model svd_xt_1.1 \
  --lip_sync --output speaker
```
### Character Consistency
VideoGen supports character consistency across multiple generations using IP-Adapter and InstantID.
```bash
# Create a character profile from reference images
python3 videogen --create-character my_character \
--character-images ref1.jpg ref2.jpg ref3.jpg \
--character-desc "A young woman with red hair"
# List saved character profiles
python3 videogen --list-characters
# Generate with character consistency
python3 videogen --model flux_dev \
--character my_character \
--prompt "my_character walking in a park" \
--output character_park
# Use IP-Adapter directly with reference images
python3 videogen --model sdxl_base \
--ipadapter --ipadapter-scale 0.8 \
--reference-images ref1.jpg ref2.jpg \
--prompt "a person reading a book" \
--output reading
# Use InstantID for face consistency
python3 videogen --model sdxl_base \
--ipadapter --instantid \
--reference-images face_ref.jpg \
--prompt "portrait of a person smiling" \
--output portrait
```
---
## AI Agent Integration
...
@@ -341,6 +341,74 @@ python3 videogen --video input.mp4 --dub-video --target-lang de --tts_voice edge
---
## Character Consistency
VideoGen supports character consistency across multiple generations using IP-Adapter, InstantID, and Character Profiles.
### Create Character Profile
```bash
# Create a character profile from reference images
python3 videogen --create-character alice \
--character-images ref1.jpg ref2.jpg ref3.jpg \
--character-desc "young woman with blue eyes and blonde hair"
# List all saved character profiles
python3 videogen --list-characters
# Show details of a character profile
python3 videogen --show-character alice
# Delete a character profile
python3 videogen --delete-character alice
```
### Generate with Character
```bash
# Generate image with character consistency
python3 videogen --model flux_dev \
--character alice \
--prompt "alice walking in a park" \
--output alice_park.png
# Generate video with character (I2V)
python3 videogen --image_to_video --model svd_xt_1.1 \
--image_model flux_dev \
--character alice \
--prompt "alice smiling at camera" \
--prompt_animation "subtle head movement" \
--output alice_animated
```
### IP-Adapter Direct Usage
```bash
# Use IP-Adapter with reference images directly
python3 videogen --model flux_dev \
--ipadapter --ipadapter-scale 0.8 \
--reference-images ref1.jpg ref2.jpg \
--prompt "the person in a business suit" \
--output business.png
# Use InstantID for face identity
python3 videogen --model flux_dev \
--ipadapter --instantid \
--reference-images face_ref.jpg \
--prompt "portrait of the person smiling" \
--output portrait.png
```
### Character Consistency Tips
1. **Use multiple reference images** (3-5) for better consistency
2. **IP-Adapter scale**: 0.7-0.9 gives a good balance (higher = closer match to the references)
3. **InstantID** is the better choice when face identity matters most
4. **Character profiles** persist across sessions, so create once and reuse
5. **Combine IP-Adapter and InstantID** for the strongest results
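Under the hood, a character profile is a small JSON file, one per character, which the web server stores under `~/.config/videogen/characters/`. A minimal Python sketch of that layout (`base_dir` is parameterized here so the example stays self-contained and never touches real config; real profiles also record a `created` timestamp, omitted for brevity):

```python
import json
from pathlib import Path

def save_profile(base_dir: Path, name: str, images: list, description: str = "") -> Path:
    """Write a profile JSON with the same fields the web UI creates."""
    base_dir.mkdir(parents=True, exist_ok=True)
    profile = {
        "name": name,
        "description": description,
        "reference_images": images[:5],  # profiles are capped at 5 references
        "tags": [],
    }
    path = base_dir / f"{name}.json"
    path.write_text(json.dumps(profile, indent=2))
    return path

def load_profile(base_dir: Path, name: str) -> dict:
    """Read a saved profile back as a dict."""
    return json.loads((base_dir / f"{name}.json").read_text())
```

Because profiles are plain JSON, they can be inspected, edited, or copied between machines with ordinary file tools.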
---
## Output Files
VideoGen creates these output files:
...
@@ -933,4 +933,139 @@ body {
.fa-spin {
animation: spin 1s linear infinite;
}
\ No newline at end of file
/* Character Consistency Styles */
/* File Upload Box */
.file-upload {
position: relative;
border: 2px dashed var(--border-color);
border-radius: var(--border-radius);
padding: 2rem;
text-align: center;
transition: all 0.3s;
cursor: pointer;
}
.file-upload:hover {
border-color: var(--primary);
background: rgba(99, 102, 241, 0.05);
}
.file-upload input[type="file"] {
position: absolute;
top: 0;
left: 0;
width: 100%;
height: 100%;
opacity: 0;
cursor: pointer;
}
.file-upload .file-label {
display: flex;
flex-direction: column;
align-items: center;
gap: 0.5rem;
pointer-events: none;
}
.file-upload .file-label i {
font-size: 2rem;
color: var(--primary);
}
.file-upload .file-label span {
font-weight: 500;
}
.file-upload .file-label small {
color: var(--text-muted);
font-size: 0.8rem;
}
/* Image Preview Grid */
.image-preview-grid {
display: grid;
grid-template-columns: repeat(auto-fill, minmax(100px, 1fr));
gap: 0.75rem;
margin-top: 1rem;
}
.preview-item {
position: relative;
aspect-ratio: 1;
border-radius: var(--border-radius);
overflow: hidden;
background: var(--bg-darker);
}
.preview-item img {
width: 100%;
height: 100%;
object-fit: cover;
}
.preview-item .remove-btn {
position: absolute;
top: 4px;
right: 4px;
width: 24px;
height: 24px;
border-radius: 50%;
background: rgba(0, 0, 0, 0.7);
color: white;
border: none;
cursor: pointer;
display: flex;
align-items: center;
justify-content: center;
font-size: 0.7rem;
opacity: 0;
transition: opacity 0.2s;
}
.preview-item:hover .remove-btn {
opacity: 1;
}
.preview-item .remove-btn:hover {
background: var(--danger);
}
/* Character Section */
#character-section {
border-left: 3px solid var(--primary);
}
#character-section h3 {
color: var(--primary-light);
}
/* IP-Adapter and InstantID Options */
#ipadapter-options,
#instantid-options,
#character-profile-options {
margin-top: 1rem;
padding-top: 1rem;
border-top: 1px solid var(--border-color);
}
/* Range Slider with Value Display */
.form-group input[type="range"] {
width: calc(100% - 50px);
vertical-align: middle;
}
.form-group input[type="range"] + span {
display: inline-block;
width: 40px;
text-align: right;
font-weight: 600;
color: var(--primary);
}
/* Hidden class */
.hidden {
display: none !important;
}
@@ -327,6 +327,85 @@
</div>
</div>
<!-- Character Consistency -->
<div class="form-section" id="character-section">
<h3><i class="fas fa-user-circle"></i> Character Consistency</h3>
<div class="form-row">
<label class="checkbox-label">
<input type="checkbox" id="use_character" name="use_character" onchange="toggleCharacterOptions()">
<span>Use Character Profile</span>
</label>
<label class="checkbox-label">
<input type="checkbox" id="use_ipadapter" name="use_ipadapter" onchange="toggleIPAdapterOptions()">
<span>IP-Adapter</span>
</label>
<label class="checkbox-label">
<input type="checkbox" id="use_instantid" name="use_instantid">
<span>InstantID (Face)</span>
</label>
</div>
<!-- Character Profile Selection -->
<div id="character-profile-options" class="hidden">
<div class="form-row">
<div class="form-group">
<label for="character_profile">Character Profile</label>
<select id="character_profile" name="character_profile">
<option value="">Select a character...</option>
</select>
</div>
<div class="form-group">
<button type="button" class="btn btn-secondary" onclick="showCreateCharacterModal()">
<i class="fas fa-plus"></i> New Character
</button>
</div>
</div>
</div>
<!-- IP-Adapter Options -->
<div id="ipadapter-options" class="hidden">
<div class="form-row">
<div class="form-group">
<label for="ipadapter_scale">IP-Adapter Scale</label>
<input type="range" id="ipadapter_scale" name="ipadapter_scale" value="0.8" min="0.0" max="1.0" step="0.1">
<span id="ipadapter-scale-value">0.8</span>
</div>
<div class="form-group">
<label for="ipadapter_type">IP-Adapter Type</label>
<select id="ipadapter_type" name="ipadapter_type">
<option value="plus_sd15">Plus (SD 1.5)</option>
<option value="plus_sdxl">Plus (SDXL)</option>
<option value="faceid_sd15">FaceID (SD 1.5)</option>
<option value="faceid_sdxl">FaceID (SDXL)</option>
</select>
</div>
</div>
<div class="form-group">
<label>Reference Images for IP-Adapter</label>
<div class="file-upload" id="reference-upload-box">
<input type="file" id="reference_images" name="reference_images" accept="image/*" multiple onchange="handleReferenceUpload(this)">
<label for="reference_images" class="file-label">
<i class="fas fa-images"></i>
<span>Upload Reference Images</span>
<small>Select 1-5 images</small>
</label>
</div>
<div id="reference-preview" class="image-preview-grid"></div>
</div>
</div>
<!-- InstantID Options -->
<div id="instantid-options" class="hidden">
<div class="form-row">
<div class="form-group">
<label for="instantid_scale">InstantID Scale</label>
<input type="range" id="instantid_scale" name="instantid_scale" value="0.8" min="0.0" max="1.0" step="0.1">
<span id="instantid-scale-value">0.8</span>
</div>
</div>
</div>
</div>
<!-- Advanced Options -->
<div class="form-section collapsible">
<h3 onclick="toggleSection(this)">
...
@@ -789,6 +789,160 @@ async def list_tools() -> list:
"required": ["text", "source_lang", "target_lang"]
}
),
# Character Consistency Tools
Tool(
name="videogen_create_character",
description="Create a character profile from reference images for consistent character generation across multiple images/videos.",
inputSchema={
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Character name (alphanumeric, underscores, hyphens only)"
},
"reference_images": {
"type": "array",
"items": {"type": "string"},
"description": "List of paths to reference images (1-5 images)"
},
"description": {
"type": "string",
"description": "Optional description of the character"
}
},
"required": ["name", "reference_images"]
}
),
Tool(
name="videogen_list_characters",
description="List all saved character profiles.",
inputSchema={
"type": "object",
"properties": {}
}
),
Tool(
name="videogen_show_character",
description="Show details of a specific character profile.",
inputSchema={
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Character profile name"
}
},
"required": ["name"]
}
),
Tool(
name="videogen_delete_character",
description="Delete a character profile.",
inputSchema={
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Character profile name to delete"
}
},
"required": ["name"]
}
),
Tool(
name="videogen_generate_with_character",
description="Generate an image or video with a specific character using IP-Adapter and/or InstantID for consistency.",
inputSchema={
"type": "object",
"properties": {
"prompt": {
"type": "string",
"description": "Description of what to generate with the character"
},
"character": {
"type": "string",
"description": "Character profile name to use"
},
"model": {
"type": "string",
"description": "Model name (e.g., flux_dev, sdxl_base)"
},
"output": {
"type": "string",
"default": "output"
},
"use_ipadapter": {
"type": "boolean",
"description": "Use IP-Adapter for character consistency",
"default": True
},
"use_instantid": {
"type": "boolean",
"description": "Use InstantID for face identity preservation",
"default": False
},
"ipadapter_scale": {
"type": "number",
"description": "IP-Adapter influence scale (0.0-1.0)",
"default": 0.8
},
"instantid_scale": {
"type": "number",
"description": "InstantID influence scale (0.0-1.0)",
"default": 0.8
},
"animate": {
"type": "boolean",
"description": "Generate video instead of image (I2V)",
"default": False
}
},
"required": ["prompt", "character", "model"]
}
),
Tool(
name="videogen_generate_with_reference",
description="Generate an image using reference images directly (without creating a character profile).",
inputSchema={
"type": "object",
"properties": {
"prompt": {
"type": "string",
"description": "Description of what to generate"
},
"reference_images": {
"type": "array",
"items": {"type": "string"},
"description": "List of paths to reference images"
},
"model": {
"type": "string",
"description": "Model name (e.g., flux_dev, sdxl_base)"
},
"output": {
"type": "string",
"default": "output"
},
"ipadapter_scale": {
"type": "number",
"description": "IP-Adapter influence scale (0.0-1.0)",
"default": 0.8
},
"use_instantid": {
"type": "boolean",
"description": "Use InstantID for face identity",
"default": False
}
},
"required": ["prompt", "reference_images", "model"]
}
),
]
@@ -1123,6 +1277,84 @@ async def call_tool(name: str, arguments: dict) -> list:
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
# Character Consistency Tools
elif name == "videogen_create_character":
args = [
"--create-character", arguments["name"],
]
# Add reference images
for img in arguments["reference_images"][:5]: # Max 5 images
args.extend(["--character-images", img])
if arguments.get("description"):
args.extend(["--character-desc", arguments["description"]])
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_list_characters":
args = ["--list-characters"]
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_show_character":
args = ["--show-character", arguments["name"]]
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_delete_character":
args = ["--delete-character", arguments["name"]]
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_generate_with_character":
args = [
"--model", arguments["model"],
"--character", arguments["character"],
"--prompt", arguments["prompt"],
"--output", arguments.get("output", "output"),
]
# IP-Adapter options
if arguments.get("use_ipadapter", True):
args.append("--ipadapter")
if arguments.get("ipadapter_scale"):
args.extend(["--ipadapter-scale", str(arguments["ipadapter_scale"])])
# InstantID options
if arguments.get("use_instantid", False):
args.append("--instantid")
if arguments.get("instantid_scale"):
args.extend(["--instantid-scale", str(arguments["instantid_scale"])])
# Animate for I2V
if arguments.get("animate", False):
args.append("--image_to_video")
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_generate_with_reference":
args = [
"--model", arguments["model"],
"--prompt", arguments["prompt"],
"--output", arguments.get("output", "output"),
]
# Add reference images
for img in arguments["reference_images"]:
args.extend(["--reference-images", img])
# IP-Adapter options
args.append("--ipadapter")
if arguments.get("ipadapter_scale"):
args.extend(["--ipadapter-scale", str(arguments["ipadapter_scale"])])
# InstantID options
if arguments.get("use_instantid", False):
args.append("--instantid")
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
else:
return [TextContent(type="text", text=f"Unknown tool: {name}")]
...
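The `videogen_generate_with_character` handler above translates MCP tool arguments into CLI flags. That mapping can be isolated as a pure function for illustration (a sketch mirroring the handler shown in this diff, not the toolkit's actual API):

```python
def build_character_args(arguments: dict) -> list:
    """Mirror the handler's argument-to-flag mapping, without running anything."""
    args = [
        "--model", arguments["model"],
        "--character", arguments["character"],
        "--prompt", arguments["prompt"],
        "--output", arguments.get("output", "output"),
    ]
    # IP-Adapter is on by default, matching the tool schema's default of True.
    if arguments.get("use_ipadapter", True):
        args.append("--ipadapter")
        # Note: like the handler, a falsy scale (None or 0.0) is skipped,
        # so the CLI's own default scale applies in that case.
        if arguments.get("ipadapter_scale"):
            args.extend(["--ipadapter-scale", str(arguments["ipadapter_scale"])])
    if arguments.get("use_instantid", False):
        args.append("--instantid")
        if arguments.get("instantid_scale"):
            args.extend(["--instantid-scale", str(arguments["instantid_scale"])])
    if arguments.get("animate", False):
        args.append("--image_to_video")
    return args
```

Keeping the mapping pure like this makes it straightforward to unit-test the flag translation separately from command execution.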
{
"models": {
"wan_1.3b_i2v": {
"id": "Wan-AI/Wan2.1-I2V-1.3B-Diffusers",
...
@@ -614,6 +614,136 @@ def delete_output(filename):
return jsonify({'success': True})
return jsonify({'error': 'File not found'}), 404
# Character Profile API endpoints
CHARACTERS_DIR = Path.home() / ".config" / "videogen" / "characters"
@app.route('/api/characters', methods=['GET'])
def api_list_characters():
"""List all character profiles"""
characters = []
if CHARACTERS_DIR.exists():
for profile_file in CHARACTERS_DIR.glob("*.json"):
try:
with open(profile_file, 'r') as f:
profile = json.load(f)
characters.append({
'name': profile.get('name', profile_file.stem),
'description': profile.get('description', ''),
'image_count': len(profile.get('reference_images', [])),
'created': profile.get('created', ''),
'tags': profile.get('tags', [])
})
except Exception as e:
print(f"Error loading character profile {profile_file}: {e}")
return jsonify(characters)
@app.route('/api/characters/<name>', methods=['GET'])
def api_get_character(name):
"""Get a specific character profile"""
profile_path = CHARACTERS_DIR / f"{name}.json"
if profile_path.exists():
try:
with open(profile_path, 'r') as f:
return jsonify(json.load(f))
except Exception as e:
return jsonify({'error': str(e)}), 500
return jsonify({'error': 'Character not found'}), 404
@app.route('/api/characters', methods=['POST'])
def api_create_character():
"""Create a new character profile"""
name = request.form.get('name')
description = request.form.get('description', '')
if not name:
return jsonify({'error': 'Name is required'}), 400
# Sanitize name
name = re.sub(r'[^a-zA-Z0-9_-]', '_', name)
# Handle uploaded images
images = request.files.getlist('images')
if not images or len(images) == 0:
return jsonify({'error': 'At least one reference image is required'}), 400
# Create character directory
CHARACTERS_DIR.mkdir(parents=True, exist_ok=True)
char_image_dir = CHARACTERS_DIR / name
char_image_dir.mkdir(parents=True, exist_ok=True)
# Save images
saved_images = []
for i, img in enumerate(images[:5]): # Max 5 images
if img and img.filename:
ext = img.filename.rsplit('.', 1)[-1].lower()
if ext in ALLOWED_EXTENSIONS['image']:
filename = f"reference_{i+1}.{ext}"
filepath = char_image_dir / filename
img.save(filepath)
saved_images.append(str(filepath))
if not saved_images:
return jsonify({'error': 'No valid images uploaded'}), 400
# Create profile
profile = {
'name': name,
'description': description,
'reference_images': saved_images,
'created': datetime.now().isoformat(),
'tags': []
}
# Save profile
profile_path = CHARACTERS_DIR / f"{name}.json"
with open(profile_path, 'w') as f:
json.dump(profile, f, indent=2)
return jsonify(profile)
@app.route('/api/characters/<name>', methods=['DELETE'])
def api_delete_character(name):
"""Delete a character profile"""
profile_path = CHARACTERS_DIR / f"{name}.json"
char_image_dir = CHARACTERS_DIR / name
if not profile_path.exists():
return jsonify({'error': 'Character not found'}), 404
try:
# Delete profile file
profile_path.unlink()
# Delete images directory
if char_image_dir.exists():
shutil.rmtree(char_image_dir)
return jsonify({'success': True})
except Exception as e:
return jsonify({'error': str(e)}), 500
@app.route('/api/upload-multiple', methods=['POST'])
def upload_multiple_files():
"""Upload multiple files (for reference images)"""
files = request.files.getlist('files')
upload_type = request.form.get('type', 'general')
if not files:
return jsonify({'error': 'No files provided'}), 400
saved_paths = []
for f in files:
if f and f.filename:
ext = f.filename.rsplit('.', 1)[-1].lower()
if ext in ALLOWED_EXTENSIONS['image']:
filename = f"{uuid.uuid4().hex[:8]}_{secure_filename(f.filename)}"
filepath = UPLOAD_FOLDER / filename
f.save(filepath)
saved_paths.append(str(filepath))
return jsonify({'paths': saved_paths})
# WebSocket events
@socketio.on('connect')
def handle_connect():
...
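One detail worth noting in the `/api/characters` POST endpoint above: the profile name is sanitized before it is ever used as a file or directory name. Pulled out as a standalone helper (a sketch of the same one-line `re.sub` the endpoint uses):

```python
import re

def sanitize_character_name(name: str) -> str:
    """Replace anything outside [a-zA-Z0-9_-] with an underscore,
    so profile names are always safe as file and directory names."""
    return re.sub(r'[^a-zA-Z0-9_-]', '_', name)
```

This keeps user-supplied names from escaping the characters directory (no slashes or dots survive), which is why the CLI and MCP tool docs restrict names to alphanumerics, underscores, and hyphens.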