Add model type filters and update MCP server

Features Added:
- Model type filters: --t2i-only, --v2v-only, --v2i-only, --3d-only, --tts-only, --audio-only
- Enhanced model list table with new capability columns (V2V, V2I, 3D, TTS)
- Updated detect_model_type() to detect all model capabilities

MCP Server Updates:
- Added videogen_video_to_video tool for V2V style transfer
- Added videogen_apply_video_filter tool for video filters
- Added videogen_extract_frames tool for frame extraction
- Added videogen_create_collage tool for thumbnail grids
- Added videogen_upscale_video tool for AI upscaling
- Added videogen_convert_3d tool for 2D-to-3D conversion
- Added videogen_concat_videos tool for video concatenation
- Updated model list filter to support all new types

SKILL.md Updates:
- Added V2V, V2I, 3D to generation types table
- Added model filter examples
- Added 7 new use cases for V2V, filters, frames, collage, upscale, 3D, concat
parent e69c2d81
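The model-type filters described above all reduce to checking a boolean in a per-model capability dict. A minimal sketch of that filtering pattern (the `MODELS` dict, its entries, and `filter_models` are illustrative assumptions based on this commit, not the toolkit's actual code):

```python
# Hypothetical capability dicts, shaped like what detect_model_type() returns.
MODELS = {
    "svd_xt_1.1": {"t2v": False, "i2v": True, "tts": False},
    "bark_tts": {"t2v": False, "i2v": False, "tts": True},
    "wan_t2v": {"t2v": True, "i2v": False, "tts": False},
}

def filter_models(models, only=None):
    """Return model names whose capability dict has `only` set to True."""
    if only is None:
        return sorted(models)
    return sorted(name for name, caps in models.items() if caps.get(only))

print(filter_models(MODELS, only="tts"))  # ['bark_tts']
```

Each `--*-only` flag then maps to one `only=` key, which is why adding a new model type only requires extending the capability dict and the flag table.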
@@ -13,6 +13,10 @@ VideoGen is a universal video generation toolkit that supports:
- **Image-to-Video (I2V)**: Animate static images
- **Text-to-Image (T2I)**: Generate images from text
- **Image-to-Image (I2I)**: Transform existing images
- **Video-to-Video (V2V)**: Style transfer and filters for videos
- **Video-to-Image (V2I)**: Extract frames and keyframes from videos
- **2D-to-3D Conversion**: Convert 2D videos to 3D SBS, anaglyph, or VR 360
- **Video Upscaling**: AI-powered video upscaling
- **Audio Generation**: TTS and music generation
- **Lip Sync**: Synchronize lip movements with audio
@@ -64,6 +68,31 @@ python3 videogen --show-model <model_id_or_name>
| I2V | `--image_to_video --model i2v_model --prompt "..."` | Animate an image |
| T2I | `--model t2i_model --prompt "..." --output image.png` | Generate image |
| I2I | `--image-to-image --image input.png --prompt "..."` | Transform image |
| V2V | `--video input.mp4 --video-to-video --prompt "..."` | Style transfer on video |
| V2I | `--video input.mp4 --extract-keyframes` | Extract frames from video |
| 3D | `--video input.mp4 --convert-3d-sbs` | Convert 2D to 3D |
### Model Filters
```bash
# List models by type
python3 videogen --model-list --t2v-only # Text-to-Video models
python3 videogen --model-list --i2v-only # Image-to-Video models
python3 videogen --model-list --t2i-only # Text-to-Image models
python3 videogen --model-list --v2v-only # Video-to-Video models
python3 videogen --model-list --v2i-only # Video-to-Image models
python3 videogen --model-list --3d-only # 2D-to-3D models
python3 videogen --model-list --tts-only # TTS models
python3 videogen --model-list --audio-only # Audio models
# List by VRAM requirement
python3 videogen --model-list --low-vram # ≤16GB VRAM
python3 videogen --model-list --high-vram # >30GB VRAM
python3 videogen --model-list --huge-vram # >55GB VRAM
# List NSFW-friendly models
python3 videogen --model-list --nsfw-friendly
```
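The VRAM filters above bucket models by their estimated requirement (≤16GB, >30GB, >55GB). A minimal sketch of how such bucketing might work; `parse_vram_gb` and `vram_bucket` are illustrative stand-ins, not the toolkit's actual `parse_vram_estimate`:

```python
import re

def parse_vram_gb(estimate):
    """Pull the first number out of a VRAM string like '~24GB' or '16-20GB'.

    Simplified stand-in for the toolkit's parser; the real one may differ.
    """
    m = re.search(r"(\d+(?:\.\d+)?)", estimate)
    return float(m.group(1)) if m else None

def vram_bucket(estimate):
    """Classify a VRAM estimate into the buckets the filter flags use."""
    gb = parse_vram_gb(estimate)
    if gb is None:
        return "unknown"
    if gb <= 16:
        return "low"   # --low-vram: <=16GB
    if gb > 55:
        return "huge"  # --huge-vram: >55GB
    if gb > 30:
        return "high"  # --high-vram: >30GB
    return "mid"

print(vram_bucket("~12GB"), vram_bucket("48GB"), vram_bucket("60GB"))  # low high huge
```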
### Auto Mode
@@ -144,6 +173,84 @@ python3 videogen --image_to_video --model svd_xt_1.1 \
--lip_sync --output speaker
```
### 8. Video-to-Video Style Transfer
```bash
# Apply style transfer to a video
python3 videogen --video input.mp4 --video-to-video \
--prompt "make it look like a watercolor painting" \
--v2v-strength 0.7 --output styled
```
### 9. Apply Video Filter
```bash
# Apply grayscale filter
python3 videogen --video input.mp4 --video-filter grayscale --output gray
# Apply slow motion
python3 videogen --video input.mp4 --video-filter slow \
--filter-params "factor=0.5" --output slowmo
# Apply blur
python3 videogen --video input.mp4 --video-filter blur \
--filter-params "radius=10" --output blurred
```
### 10. Extract Frames from Video
```bash
# Extract keyframes
python3 videogen --video input.mp4 --extract-keyframes --frames-dir keyframes
# Extract single frame at timestamp
python3 videogen --video input.mp4 --extract-frame --timestamp 5.5 --output frame.png
# Extract all frames
python3 videogen --video input.mp4 --extract-frames --v2v-max-frames 100 --frames-dir all_frames
```
### 11. Create Video Collage
```bash
# Create 4x4 thumbnail grid
python3 videogen --video input.mp4 --video-collage \
--collage-grid 4x4 --output collage.png
```
### 12. Upscale Video
```bash
# 2x upscale using ffmpeg
python3 videogen --video input.mp4 --upscale-video \
--upscale-factor 2.0 --output upscaled
# 4x upscale using AI (ESRGAN)
python3 videogen --video input.mp4 --upscale-video \
--upscale-factor 4.0 --upscale-method esrgan --output upscaled_4k
```
### 13. Convert 2D to 3D
```bash
# Convert to side-by-side 3D for VR
python3 videogen --video input.mp4 --convert-3d-sbs \
--depth-method ai --output 3d_sbs
# Convert to anaglyph 3D (red/cyan glasses)
python3 videogen --video input.mp4 --convert-3d-anaglyph --output anaglyph
# Convert to VR 360 format
python3 videogen --video input.mp4 --convert-vr --output vr360
```
### 14. Concatenate Videos
```bash
# Join multiple videos
python3 videogen --concat-videos video1.mp4 video2.mp4 video3.mp4 --output joined
```
---
## Model Selection Guide
...
@@ -2039,9 +2039,28 @@ def detect_model_type(info):
is_img2img_pipeline = "Img2Img" in pipeline_class or "img2img" in model_id
i2i = t2i or is_img2img_pipeline or any(x in tags for x in ["image-to-image", "img2img"])
# V2V detection (video-to-video) - style transfer, video editing
v2v = any(x in model_id or x in desc for x in ["video-to-video", "v2v", "video editing", "video style"])
v2v = v2v or any(x in tags for x in ["video-to-video", "v2v", "video-editing"])
# Video models can generally do V2V with style transfer
v2v = v2v or (t2v or i2v)
# V2I detection (video-to-image) - frame extraction, keyframe detection
v2i = any(x in model_id or x in desc for x in ["video-to-image", "v2i", "frame extraction", "keyframe"])
v2i = v2i or any(x in tags for x in ["video-to-image", "v2i"])
# 2D-to-3D detection - depth estimation, stereo, VR
to_3d = any(x in model_id or x in desc for x in ["depth", "stereo", "3d", "vr", "equirectangular", "midas", "dpt"])
to_3d = to_3d or any(x in tags for x in ["depth-estimation", "stereo", "3d", "vr", "monocular-depth"])
# TTS detection (text-to-speech)
tts = any(x in model_id or x in desc for x in ["tts", "bark", "speech", "voice synthesis", "vits", "xtts"])
tts = tts or any(x in tags for x in ["tts", "text-to-speech", "speech-synthesis"])
# Audio detection (general audio - music, sound effects)
audio = any(x in model_id or x in desc for x in ["musicgen", "audioldm", "audio generation", "music generation"])
audio = audio or any(x in tags for x in ["audio-generation", "music-generation", "audioldm"])
audio = audio or tts # TTS models are also audio models
# NSFW detection
nsfw_keywords = ["nsfw", "adult", "uncensored", "porn", "explicit", "xxx", "erotic", "nude"]
@@ -2056,6 +2075,10 @@ def detect_model_type(info):
"t2v": t2v,
"t2i": t2i,
"i2i": i2i,
"v2v": v2v,
"v2i": v2i,
"to_3d": to_3d,
"tts": tts,
"audio": audio,
"nsfw": nsfw,
"lora": lora
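The keyword-matching pattern used for the new TTS and audio detection can be exercised in isolation. A self-contained sketch whose keyword lists mirror the diff above, but whose helper function is illustrative, not the actual `detect_model_type`:

```python
def detect_caps(model_id, desc, tags):
    """Simplified keyword-based capability detection for TTS/audio."""
    model_id, desc = model_id.lower(), desc.lower()
    # TTS: match model id/description keywords, then tags
    tts = any(x in model_id or x in desc for x in ["tts", "bark", "speech", "vits", "xtts"])
    tts = tts or any(x in tags for x in ["tts", "text-to-speech", "speech-synthesis"])
    # General audio: music/sound generation keywords, then tags
    audio = any(x in model_id or x in desc for x in ["musicgen", "audioldm", "audio generation"])
    audio = audio or any(x in tags for x in ["audio-generation", "music-generation"])
    audio = audio or tts  # TTS models are also audio models
    return {"tts": tts, "audio": audio}

print(detect_caps("suno/bark", "Transformer text-to-audio model", ["text-to-speech"]))
# {'tts': True, 'audio': True}
```

Because every capability is an independent OR over keyword hits, one model can legitimately report several types at once, which is what the multi-column table relies on.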
@@ -2127,10 +2150,14 @@ def show_model_details(model_id_or_name, args):
# Capabilities
caps = detect_model_type(model)
print(f"\n Capabilities:")
print(f" T2V (Text-to-Video): {'✅ Yes' if caps['t2v'] else '❌ No'}")
print(f" I2V (Image-to-Video): {'✅ Yes' if caps['i2v'] else '❌ No'}")
print(f" T2I (Text-to-Image): {'✅ Yes' if caps['t2i'] else '❌ No'}")
print(f" I2I (Image-to-Image): {'✅ Yes' if caps['i2i'] else '❌ No'}")
print(f" V2V (Video-to-Video): {'✅ Yes' if caps['v2v'] else '❌ No'}")
print(f" V2I (Video-to-Image): {'✅ Yes' if caps['v2i'] else '❌ No'}")
print(f" 2D-to-3D: {'✅ Yes' if caps['to_3d'] else '❌ No'}")
print(f" TTS (Text-to-Speech): {'✅ Yes' if caps['tts'] else '❌ No'}")
print(f" Audio: {'✅ Yes' if caps['audio'] else '❌ No'}")
print(f" NSFW-friendly: {'✅ Yes' if caps['nsfw'] else '❌ No'}")
print(f" LoRA Adapter: {'✅ Yes' if caps['lora'] else '❌ No'}")
@@ -2179,12 +2206,26 @@ def print_model_list(args):
auto_disable_data = load_auto_disable_data()
for name, info in sorted(MODELS.items()):
caps = detect_model_type(info)
# Apply filters
if args.i2v_only and not caps["i2v"]:
continue
if args.t2v_only and not caps["t2v"]:
continue
if getattr(args, 't2i_only', False) and not caps["t2i"]:
continue
if getattr(args, 'v2v_only', False) and not caps["v2v"]:
continue
if getattr(args, 'v2i_only', False) and not caps["v2i"]:
continue
if getattr(args, '3d_only', False) and not caps["to_3d"]:
continue
if getattr(args, 'tts_only', False) and not caps["tts"]:
continue
if getattr(args, 'audio_only', False) and not caps["audio"]:
continue
if args.nsfw_friendly and not caps["nsfw"]:
continue
if args.low_vram:
est = parse_vram_estimate(info["vram"])
@@ -2200,7 +2241,6 @@ def print_model_list(args):
continue
shown += 1
# Check if model is disabled for auto mode
model_id = info.get("id", "")
@@ -2212,21 +2252,24 @@ def print_model_list(args):
if shown == 0:
print("No models match the selected filters.")
else:
# Print table header with all capability columns
print(f"{'ID':>4} {'Name':<22} {'VRAM':<9} {'T2V':<3} {'I2V':<3} {'T2I':<3} {'V2V':<3} {'V2I':<3} {'3D':<3} {'TTS':<3} {'NSFW':<4} {'LoRA':<4} {'Auto':<5}")
print("-" * 110)
for idx, (name, info, caps, is_disabled, fail_count) in enumerate(results, 1):
# Truncate name if too long
display_name = name[:20] + ".." if len(name) > 22 else name
vram = info["vram"][:7] if len(info["vram"]) > 7 else info["vram"]
t2v = "✓" if caps["t2v"] else "-"
i2v = "✓" if caps["i2v"] else "-"
t2i = "✓" if caps["t2i"] else "-"
v2v = "✓" if caps["v2v"] else "-"
v2i = "✓" if caps["v2i"] else "-"
to_3d = "✓" if caps["to_3d"] else "-"
tts = "✓" if caps["tts"] else "-"
nsfw = "✓" if caps["nsfw"] else "-"
lora = "✓" if caps["lora"] else "-"
# Show auto status
if is_disabled:
@@ -2234,17 +2277,21 @@ def print_model_list(args):
elif fail_count > 0:
auto_status = f"{fail_count}/3"
else:
auto_status = ""
# Add indicator for disabled models
if is_disabled:
display_name = f"🚫{display_name[:19]}" if len(display_name) < 22 else f"🚫{display_name[:19]}.."
print(f"{idx:>4} {display_name:<22} {vram:<9} {t2v:<3} {i2v:<3} {t2i:<3} {v2v:<3} {v2i:<3} {to_3d:<3} {tts:<3} {nsfw:<4} {lora:<4} {auto_status:<5}")
print("-" * 110)
print(f"Total shown: {shown} / {len(MODELS)} available")
# Show legend
print("\n Columns: T2V=Text-to-Video, I2V=Image-to-Video, T2I=Text-to-Image")
print(" V2V=Video-to-Video, V2I=Video-to-Image, 3D=2D-to-3D, TTS=Text-to-Speech")
# Show legend for auto column
disabled_count = sum(1 for _, _, _, is_disabled, _ in results if is_disabled)
if disabled_count > 0:
@@ -2252,6 +2299,8 @@ def print_model_list(args):
print(f" {disabled_count} model(s) disabled for --auto mode")
print(f" Use --model <name> manually to re-enable a disabled model")
print("\nFilters: --t2v-only, --i2v-only, --t2i-only, --v2v-only, --v2i-only, --3d-only, --tts-only, --audio-only")
print(" --nsfw-friendly, --low-vram, --high-vram, --huge-vram")
print("\nUse --model <name> to select a model.")
print("Use --show-model <ID|name> to see full model details.")
sys.exit(0)
@@ -6738,6 +6787,18 @@ List TTS voices:
help="When using --model-list: only show I2V-capable models")
parser.add_argument("--t2v-only", action="store_true",
help="When using --model-list: only show T2V-only models")
parser.add_argument("--t2i-only", action="store_true",
help="When using --model-list: only show T2I (text-to-image) models")
parser.add_argument("--v2v-only", action="store_true",
help="When using --model-list: only show V2V (video-to-video) models")
parser.add_argument("--v2i-only", action="store_true",
help="When using --model-list: only show V2I (video-to-image) models")
parser.add_argument("--3d-only", action="store_true",
help="When using --model-list: only show 2D-to-3D conversion models")
parser.add_argument("--tts-only", action="store_true",
help="When using --model-list: only show TTS (text-to-speech) models")
parser.add_argument("--audio-only", action="store_true",
help="When using --model-list: only show audio generation models")
parser.add_argument("--nsfw-friendly", action="store_true",
help="When using --model-list: only show uncensored/NSFW-capable models")
parser.add_argument("--low-vram", action="store_true",
...
@@ -367,7 +367,7 @@ async def list_tools() -> list:
"properties": {
"filter": {
"type": "string",
"enum": ["all", "i2v", "t2v", "t2i", "v2v", "v2i", "3d", "tts", "audio", "low_vram", "high_vram", "huge_vram", "nsfw"],
"description": "Filter models by type or VRAM requirement",
"default": "all"
}
@@ -375,6 +375,221 @@ async def list_tools() -> list:
}
),
Tool(
name="videogen_video_to_video",
description="Transform an existing video (Video-to-Video). Apply style transfer or filters to a video.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"prompt": {
"type": "string",
"description": "Description of the desired transformation"
},
"output": {
"type": "string",
"default": "output"
},
"strength": {
"type": "number",
"description": "Transformation strength (0.0-1.0)",
"default": 0.75
},
"fps": {
"type": "integer",
"description": "Processing FPS",
"default": 15
}
},
"required": ["video", "prompt"]
}
),
Tool(
name="videogen_apply_video_filter",
description="Apply a filter to a video. Available filters: grayscale, sepia, blur, sharpen, contrast, saturation, speed, slow, reverse, fade_in, fade_out, denoise, stabilize.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"filter": {
"type": "string",
"enum": ["grayscale", "sepia", "blur", "sharpen", "contrast", "saturation", "speed", "slow", "reverse", "fade_in", "fade_out", "denoise", "stabilize"],
"description": "Filter to apply"
},
"params": {
"type": "string",
"description": "Filter parameters (e.g., 'factor=2.0' for speed, 'radius=10' for blur)"
},
"output": {
"type": "string",
"default": "output"
}
},
"required": ["video", "filter"]
}
),
Tool(
name="videogen_extract_frames",
description="Extract frames from a video. Can extract a single frame, keyframes, or all frames.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"mode": {
"type": "string",
"enum": ["single", "keyframes", "all"],
"description": "Extraction mode: single frame, keyframes, or all frames",
"default": "keyframes"
},
"timestamp": {
"type": "number",
"description": "Timestamp for single frame extraction (seconds)"
},
"frame_number": {
"type": "integer",
"description": "Frame number for single frame extraction"
},
"max_frames": {
"type": "integer",
"description": "Maximum frames to extract",
"default": 100
},
"output_dir": {
"type": "string",
"description": "Output directory for frames",
"default": "frames"
}
},
"required": ["video"]
}
),
Tool(
name="videogen_create_collage",
description="Create a collage/thumbnail grid from a video.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"grid": {
"type": "string",
"description": "Grid size (e.g., '4x4', '3x3')",
"default": "4x4"
},
"method": {
"type": "string",
"enum": ["uniform", "keyframes", "random"],
"description": "Sampling method",
"default": "uniform"
},
"output": {
"type": "string",
"default": "collage.png"
}
},
"required": ["video"]
}
),
Tool(
name="videogen_upscale_video",
description="Upscale a video using AI upscaling models.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"scale": {
"type": "number",
"description": "Upscale factor (2.0 or 4.0)",
"default": 2.0
},
"method": {
"type": "string",
"enum": ["ffmpeg", "esrgan", "real_esrgan", "swinir"],
"description": "Upscaling method",
"default": "ffmpeg"
},
"output": {
"type": "string",
"default": "output"
}
},
"required": ["video"]
}
),
Tool(
name="videogen_convert_3d",
description="Convert 2D video to 3D format (SBS, anaglyph, or VR 360).",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"format": {
"type": "string",
"enum": ["sbs", "anaglyph", "vr"],
"description": "3D output format: sbs (side-by-side), anaglyph (red/cyan), vr (360)"
},
"depth_method": {
"type": "string",
"enum": ["ai", "disparity", "shift"],
"description": "Depth estimation method",
"default": "shift"
},
"disparity_scale": {
"type": "number",
"description": "Disparity scale (0.5-2.0)",
"default": 1.0
},
"output": {
"type": "string",
"default": "output"
}
},
"required": ["video", "format"]
}
),
Tool(
name="videogen_concat_videos",
description="Concatenate multiple videos into one.",
inputSchema={
"type": "object",
"properties": {
"videos": {
"type": "array",
"items": {"type": "string"},
"description": "List of video file paths to concatenate"
},
"output": {
"type": "string",
"default": "output"
}
},
"required": ["videos"]
}
),
Tool(
name="videogen_show_model",
description="Show detailed information about a specific model.",
@@ -572,6 +787,18 @@ async def call_tool(name: str, arguments: dict) -> list:
args.append("--i2v-only")
elif filter_type == "t2v":
args.append("--t2v-only")
elif filter_type == "t2i":
args.append("--t2i-only")
elif filter_type == "v2v":
args.append("--v2v-only")
elif filter_type == "v2i":
args.append("--v2i-only")
elif filter_type == "3d":
args.append("--3d-only")
elif filter_type == "tts":
args.append("--tts-only")
elif filter_type == "audio":
args.append("--audio-only")
elif filter_type == "low_vram":
args.append("--low-vram")
elif filter_type == "high_vram":
@@ -614,6 +841,104 @@ async def call_tool(name: str, arguments: dict) -> list:
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_video_to_video":
args = [
"--video", arguments["video"],
"--video-to-video",
"--prompt", arguments["prompt"],
"--output", arguments.get("output", "output"),
"--v2v-strength", str(arguments.get("strength", 0.75)),
"--v2v-fps", str(arguments.get("fps", 15)),
]
output, code = run_videogen_command(args, timeout=3600)
return [TextContent(type="text", text=output)]
elif name == "videogen_apply_video_filter":
args = [
"--video", arguments["video"],
"--video-filter", arguments["filter"],
"--output", arguments.get("output", "output"),
]
if arguments.get("params"):
args.extend(["--filter-params", arguments["params"]])
output, code = run_videogen_command(args, timeout=1800)
return [TextContent(type="text", text=output)]
elif name == "videogen_extract_frames":
mode = arguments.get("mode", "keyframes")
args = ["--video", arguments["video"]]
if mode == "single":
args.append("--extract-frame")
if arguments.get("timestamp"):
args.extend(["--timestamp", str(arguments["timestamp"])])
if arguments.get("frame_number"):
args.extend(["--frame-number", str(arguments["frame_number"])])
elif mode == "keyframes":
args.append("--extract-keyframes")
if arguments.get("max_frames"):
args.extend(["--max-keyframes", str(arguments["max_frames"])])
else: # all
args.append("--extract-frames")
if arguments.get("max_frames"):
args.extend(["--v2v-max-frames", str(arguments["max_frames"])])
if arguments.get("output_dir"):
args.extend(["--frames-dir", arguments["output_dir"]])
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_create_collage":
args = [
"--video", arguments["video"],
"--video-collage",
"--collage-grid", arguments.get("grid", "4x4"),
"--collage-method", arguments.get("method", "uniform"),
"--output", arguments.get("output", "collage.png"),
]
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_upscale_video":
args = [
"--video", arguments["video"],
"--upscale-video",
"--upscale-factor", str(arguments.get("scale", 2.0)),
"--upscale-method", arguments.get("method", "ffmpeg"),
"--output", arguments.get("output", "output"),
]
output, code = run_videogen_command(args, timeout=3600)
return [TextContent(type="text", text=output)]
elif name == "videogen_convert_3d":
format_type = arguments["format"]
args = [
"--video", arguments["video"],
"--output", arguments.get("output", "output"),
]
if format_type == "sbs":
args.append("--convert-3d-sbs")
elif format_type == "anaglyph":
args.append("--convert-3d-anaglyph")
elif format_type == "vr":
args.append("--convert-vr")
if arguments.get("depth_method"):
args.extend(["--depth-method", arguments["depth_method"]])
if arguments.get("disparity_scale"):
args.extend(["--disparity-scale", str(arguments["disparity_scale"])])
output, code = run_videogen_command(args, timeout=3600)
return [TextContent(type="text", text=output)]
elif name == "videogen_concat_videos":
videos = arguments["videos"]
args = ["--concat-videos"] + videos + ["--output", arguments.get("output", "output")]
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
else:
return [TextContent(type="text", text=f"Unknown tool: {name}")]
...
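Each of the MCP handlers above is a thin adapter that assembles a CLI argument list and hands it to `run_videogen_command`. That assembly can be tested without running the CLI at all; a sketch mirroring the `videogen_convert_3d` handler (the helper name is hypothetical, and the actual handler invokes the command rather than returning the list):

```python
def build_convert_3d_args(arguments):
    """Mirror of the videogen_convert_3d handler's argument assembly."""
    args = ["--video", arguments["video"],
            "--output", arguments.get("output", "output")]
    # Map the MCP 'format' enum to the corresponding CLI flag
    flag = {"sbs": "--convert-3d-sbs",
            "anaglyph": "--convert-3d-anaglyph",
            "vr": "--convert-vr"}[arguments["format"]]
    args.append(flag)
    # Optional parameters are only forwarded when present
    if arguments.get("depth_method"):
        args.extend(["--depth-method", arguments["depth_method"]])
    if arguments.get("disparity_scale"):
        args.extend(["--disparity-scale", str(arguments["disparity_scale"])])
    return args

print(build_convert_3d_args({"video": "in.mp4", "format": "sbs", "depth_method": "ai"}))
# ['--video', 'in.mp4', '--output', 'output', '--convert-3d-sbs', '--depth-method', 'ai']
```

Keeping the enum-to-flag mapping in one dict makes it easy to verify the MCP schema and the CLI flags stay in sync as formats are added.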