Add model type filters and update MCP server

Features Added:
- Model type filters: --t2i-only, --v2v-only, --v2i-only, --3d-only, --tts-only, --audio-only
- Enhanced model list table with new capability columns (V2V, V2I, 3D, TTS)
- Updated detect_model_type() to detect all model capabilities

MCP Server Updates:
- Added videogen_video_to_video tool for V2V style transfer
- Added videogen_apply_video_filter tool for video filters
- Added videogen_extract_frames tool for frame extraction
- Added videogen_create_collage tool for thumbnail grids
- Added videogen_upscale_video tool for AI upscaling
- Added videogen_convert_3d tool for 2D-to-3D conversion
- Added videogen_concat_videos tool for video concatenation
- Updated model list filter to support all new types

SKILL.md Updates:
- Added V2V, V2I, 3D to generation types table
- Added model filter examples
- Added 7 new use cases for V2V, filters, frames, collage, upscale, 3D, concat
parent e69c2d81
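The model-type filters described above all reduce to checking a boolean in a per-model capability dict. A minimal sketch of that filtering pattern (the `MODELS` dict, its entries, and `filter_models` are illustrative assumptions based on this commit, not the toolkit's actual code):

```python
# Hypothetical capability dicts, shaped like what detect_model_type() returns.
MODELS = {
    "svd_xt_1.1": {"t2v": False, "i2v": True, "tts": False},
    "bark_tts": {"t2v": False, "i2v": False, "tts": True},
    "wan_t2v": {"t2v": True, "i2v": False, "tts": False},
}

def filter_models(models, only=None):
    """Return model names whose capability dict has `only` set to True."""
    if only is None:
        return sorted(models)
    return sorted(name for name, caps in models.items() if caps.get(only))

print(filter_models(MODELS, only="tts"))  # ['bark_tts']
```

Each `--*-only` flag then maps to one `only=` key, which is why adding a new model type only requires extending the capability dict and the flag table.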
@@ -13,6 +13,10 @@ VideoGen is a universal video generation toolkit that supports:
- **Image-to-Video (I2V)**: Animate static images
- **Text-to-Image (T2I)**: Generate images from text
- **Image-to-Image (I2I)**: Transform existing images
- **Video-to-Video (V2V)**: Style transfer and filters for videos
- **Video-to-Image (V2I)**: Extract frames and keyframes from videos
- **2D-to-3D Conversion**: Convert 2D videos to 3D SBS, anaglyph, or VR 360
- **Video Upscaling**: AI-powered video upscaling
- **Audio Generation**: TTS and music generation
- **Lip Sync**: Synchronize lip movements with audio
@@ -64,6 +68,31 @@ python3 videogen --show-model <model_id_or_name>
| I2V | `--image_to_video --model i2v_model --prompt "..."` | Animate an image |
| T2I | `--model t2i_model --prompt "..." --output image.png` | Generate image |
| I2I | `--image-to-image --image input.png --prompt "..."` | Transform image |
| V2V | `--video input.mp4 --video-to-video --prompt "..."` | Style transfer on video |
| V2I | `--video input.mp4 --extract-keyframes` | Extract frames from video |
| 3D | `--video input.mp4 --convert-3d-sbs` | Convert 2D to 3D |
### Model Filters
```bash
# List models by type
python3 videogen --model-list --t2v-only # Text-to-Video models
python3 videogen --model-list --i2v-only # Image-to-Video models
python3 videogen --model-list --t2i-only # Text-to-Image models
python3 videogen --model-list --v2v-only # Video-to-Video models
python3 videogen --model-list --v2i-only # Video-to-Image models
python3 videogen --model-list --3d-only # 2D-to-3D models
python3 videogen --model-list --tts-only # TTS models
python3 videogen --model-list --audio-only # Audio models
# List by VRAM requirement
python3 videogen --model-list --low-vram # ≤16GB VRAM
python3 videogen --model-list --high-vram # >30GB VRAM
python3 videogen --model-list --huge-vram # >55GB VRAM
# List NSFW-friendly models
python3 videogen --model-list --nsfw-friendly
```
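The VRAM filters above bucket models by their estimated requirement (≤16GB, >30GB, >55GB). A minimal sketch of how such bucketing might work; `parse_vram_gb` and `vram_bucket` are illustrative stand-ins, not the toolkit's actual `parse_vram_estimate`:

```python
import re

def parse_vram_gb(estimate):
    """Pull the first number out of a VRAM string like '~24GB' or '16-20GB'.

    Simplified stand-in for the toolkit's parser; the real one may differ.
    """
    m = re.search(r"(\d+(?:\.\d+)?)", estimate)
    return float(m.group(1)) if m else None

def vram_bucket(estimate):
    """Classify a VRAM estimate into the buckets the filter flags use."""
    gb = parse_vram_gb(estimate)
    if gb is None:
        return "unknown"
    if gb <= 16:
        return "low"   # --low-vram: <=16GB
    if gb > 55:
        return "huge"  # --huge-vram: >55GB
    if gb > 30:
        return "high"  # --high-vram: >30GB
    return "mid"

print(vram_bucket("~12GB"), vram_bucket("48GB"), vram_bucket("60GB"))  # low high huge
```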
### Auto Mode
@@ -144,6 +173,84 @@ python3 videogen --image_to_video --model svd_xt_1.1 \
--lip_sync --output speaker
```
### 8. Video-to-Video Style Transfer
```bash
# Apply style transfer to a video
python3 videogen --video input.mp4 --video-to-video \
--prompt "make it look like a watercolor painting" \
--v2v-strength 0.7 --output styled
```
### 9. Apply Video Filter
```bash
# Apply grayscale filter
python3 videogen --video input.mp4 --video-filter grayscale --output gray
# Apply slow motion
python3 videogen --video input.mp4 --video-filter slow \
--filter-params "factor=0.5" --output slowmo
# Apply blur
python3 videogen --video input.mp4 --video-filter blur \
--filter-params "radius=10" --output blurred
```
### 10. Extract Frames from Video
```bash
# Extract keyframes
python3 videogen --video input.mp4 --extract-keyframes --frames-dir keyframes
# Extract single frame at timestamp
python3 videogen --video input.mp4 --extract-frame --timestamp 5.5 --output frame.png
# Extract all frames
python3 videogen --video input.mp4 --extract-frames --v2v-max-frames 100 --frames-dir all_frames
```
### 11. Create Video Collage
```bash
# Create 4x4 thumbnail grid
python3 videogen --video input.mp4 --video-collage \
--collage-grid 4x4 --output collage.png
```
### 12. Upscale Video
```bash
# 2x upscale using ffmpeg
python3 videogen --video input.mp4 --upscale-video \
--upscale-factor 2.0 --output upscaled
# 4x upscale using AI (ESRGAN)
python3 videogen --video input.mp4 --upscale-video \
--upscale-factor 4.0 --upscale-method esrgan --output upscaled_4k
```
### 13. Convert 2D to 3D
```bash
# Convert to side-by-side 3D for VR
python3 videogen --video input.mp4 --convert-3d-sbs \
--depth-method ai --output 3d_sbs
# Convert to anaglyph 3D (red/cyan glasses)
python3 videogen --video input.mp4 --convert-3d-anaglyph --output anaglyph
# Convert to VR 360 format
python3 videogen --video input.mp4 --convert-vr --output vr360
```
### 14. Concatenate Videos
```bash
# Join multiple videos
python3 videogen --concat-videos video1.mp4 video2.mp4 video3.mp4 --output joined
```
---
## Model Selection Guide
...
@@ -2039,9 +2039,28 @@ def detect_model_type(info):
is_img2img_pipeline = "Img2Img" in pipeline_class or "img2img" in model_id
i2i = t2i or is_img2img_pipeline or any(x in tags for x in ["image-to-image", "img2img"])
# V2V detection (video-to-video) - style transfer, video editing
v2v = any(x in model_id or x in desc for x in ["video-to-video", "v2v", "video editing", "video style"])
v2v = v2v or any(x in tags for x in ["video-to-video", "v2v", "video-editing"])
# Video models can generally do V2V with style transfer
v2v = v2v or (t2v or i2v)
# V2I detection (video-to-image) - frame extraction, keyframe detection
v2i = any(x in model_id or x in desc for x in ["video-to-image", "v2i", "frame extraction", "keyframe"])
v2i = v2i or any(x in tags for x in ["video-to-image", "v2i"])
# 2D-to-3D detection - depth estimation, stereo, VR
to_3d = any(x in model_id or x in desc for x in ["depth", "stereo", "3d", "vr", "equirectangular", "midas", "dpt"])
to_3d = to_3d or any(x in tags for x in ["depth-estimation", "stereo", "3d", "vr", "monocular-depth"])
# TTS detection (text-to-speech)
tts = any(x in model_id or x in desc for x in ["tts", "bark", "speech", "voice synthesis", "vits", "xtts"])
tts = tts or any(x in tags for x in ["tts", "text-to-speech", "speech-synthesis"])
# Audio detection (general audio - music, sound effects)
audio = any(x in model_id or x in desc for x in ["musicgen", "audioldm", "audio generation", "music generation"])
audio = audio or any(x in tags for x in ["audio-generation", "music-generation", "audioldm"])
audio = audio or tts # TTS models are also audio models
# NSFW detection
nsfw_keywords = ["nsfw", "adult", "uncensored", "porn", "explicit", "xxx", "erotic", "nude"]
@@ -2056,6 +2075,10 @@ def detect_model_type(info):
"t2v": t2v,
"t2i": t2i,
"i2i": i2i,
"v2v": v2v,
"v2i": v2i,
"to_3d": to_3d,
"tts": tts,
"audio": audio,
"nsfw": nsfw,
"lora": lora
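The keyword-matching pattern used for the new TTS and audio detection can be exercised in isolation. A self-contained sketch whose keyword lists mirror the diff above, but whose helper function is illustrative, not the actual `detect_model_type`:

```python
def detect_caps(model_id, desc, tags):
    """Simplified keyword-based capability detection for TTS/audio."""
    model_id, desc = model_id.lower(), desc.lower()
    # TTS: match model id/description keywords, then tags
    tts = any(x in model_id or x in desc for x in ["tts", "bark", "speech", "vits", "xtts"])
    tts = tts or any(x in tags for x in ["tts", "text-to-speech", "speech-synthesis"])
    # General audio: music/sound generation keywords, then tags
    audio = any(x in model_id or x in desc for x in ["musicgen", "audioldm", "audio generation"])
    audio = audio or any(x in tags for x in ["audio-generation", "music-generation"])
    audio = audio or tts  # TTS models are also audio models
    return {"tts": tts, "audio": audio}

print(detect_caps("suno/bark", "Transformer text-to-audio model", ["text-to-speech"]))
# {'tts': True, 'audio': True}
```

Because every capability is an independent OR over keyword hits, one model can legitimately report several types at once, which is what the multi-column table relies on.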
@@ -2127,10 +2150,14 @@ def show_model_details(model_id_or_name, args):
# Capabilities
caps = detect_model_type(model)
print(f"\n Capabilities:")
print(f" T2V (Text-to-Video): {'✅ Yes' if caps['t2v'] else '❌ No'}")
print(f" I2V (Image-to-Video): {'✅ Yes' if caps['i2v'] else '❌ No'}")
print(f" T2I (Text-to-Image): {'✅ Yes' if caps['t2i'] else '❌ No'}")
print(f" I2I (Image-to-Image): {'✅ Yes' if caps['i2i'] else '❌ No'}")
print(f" V2V (Video-to-Video): {'✅ Yes' if caps['v2v'] else '❌ No'}")
print(f" V2I (Video-to-Image): {'✅ Yes' if caps['v2i'] else '❌ No'}")
print(f" 2D-to-3D: {'✅ Yes' if caps['to_3d'] else '❌ No'}")
print(f" TTS (Text-to-Speech): {'✅ Yes' if caps['tts'] else '❌ No'}")
print(f" Audio: {'✅ Yes' if caps['audio'] else '❌ No'}")
print(f" NSFW-friendly: {'✅ Yes' if caps['nsfw'] else '❌ No'}")
print(f" LoRA Adapter: {'✅ Yes' if caps['lora'] else '❌ No'}")
@@ -2179,12 +2206,26 @@ def print_model_list(args):
auto_disable_data = load_auto_disable_data()
for name, info in sorted(MODELS.items()):
caps = detect_model_type(info)
# Apply filters
if args.i2v_only and not caps["i2v"]:
continue
if args.t2v_only and not caps["t2v"]:
continue
if getattr(args, 't2i_only', False) and not caps["t2i"]:
continue
if getattr(args, 'v2v_only', False) and not caps["v2v"]:
continue
if getattr(args, 'v2i_only', False) and not caps["v2i"]:
continue
if getattr(args, '3d_only', False) and not caps["to_3d"]:
continue
if getattr(args, 'tts_only', False) and not caps["tts"]:
continue
if getattr(args, 'audio_only', False) and not caps["audio"]:
continue
if args.nsfw_friendly and not caps["nsfw"]:
continue
if args.low_vram:
est = parse_vram_estimate(info["vram"])
@@ -2200,7 +2241,6 @@ def print_model_list(args):
continue
shown += 1
# Check if model is disabled for auto mode
model_id = info.get("id", "")
@@ -2212,21 +2252,24 @@ def print_model_list(args):
if shown == 0:
print("No models match the selected filters.")
else:
# Print table header with all capability columns
print(f"{'ID':>4} {'Name':<22} {'VRAM':<9} {'T2V':<3} {'I2V':<3} {'T2I':<3} {'V2V':<3} {'V2I':<3} {'3D':<3} {'TTS':<3} {'NSFW':<4} {'LoRA':<4} {'Auto':<5}")
print("-" * 110)
for idx, (name, info, caps, is_disabled, fail_count) in enumerate(results, 1):
# Truncate name if too long
display_name = name[:20] + ".." if len(name) > 22 else name
vram = info["vram"][:7] if len(info["vram"]) > 7 else info["vram"]
t2v = "✓" if caps["t2v"] else "-"
i2v = "✓" if caps["i2v"] else "-"
t2i = "✓" if caps["t2i"] else "-"
v2v = "✓" if caps["v2v"] else "-"
v2i = "✓" if caps["v2i"] else "-"
to_3d = "✓" if caps["to_3d"] else "-"
tts = "✓" if caps["tts"] else "-"
nsfw = "✓" if caps["nsfw"] else "-"
lora = "✓" if caps["lora"] else "-"
# Show auto status
if is_disabled:
@@ -2234,17 +2277,21 @@ def print_model_list(args):
elif fail_count > 0:
auto_status = f"{fail_count}/3"
else:
auto_status = ""
# Add indicator for disabled models
if is_disabled:
display_name = f"🚫{display_name[:19]}" if len(display_name) < 22 else f"🚫{display_name[:19]}.."
print(f"{idx:>4} {display_name:<22} {vram:<9} {t2v:<3} {i2v:<3} {t2i:<3} {v2v:<3} {v2i:<3} {to_3d:<3} {tts:<3} {nsfw:<4} {lora:<4} {auto_status:<5}")
print("-" * 110)
print(f"Total shown: {shown} / {len(MODELS)} available")
# Show legend
print("\n Columns: T2V=Text-to-Video, I2V=Image-to-Video, T2I=Text-to-Image")
print(" V2V=Video-to-Video, V2I=Video-to-Image, 3D=2D-to-3D, TTS=Text-to-Speech")
# Show legend for auto column
disabled_count = sum(1 for _, _, _, is_disabled, _ in results if is_disabled)
if disabled_count > 0:
@@ -2252,6 +2299,8 @@ def print_model_list(args):
print(f" {disabled_count} model(s) disabled for --auto mode")
print(f" Use --model <name> manually to re-enable a disabled model")
print("\nFilters: --t2v-only, --i2v-only, --t2i-only, --v2v-only, --v2i-only, --3d-only, --tts-only, --audio-only")
print(" --nsfw-friendly, --low-vram, --high-vram, --huge-vram")
print("\nUse --model <name> to select a model.")
print("Use --show-model <ID|name> to see full model details.")
sys.exit(0)
@@ -6738,6 +6787,18 @@ List TTS voices:
help="When using --model-list: only show I2V-capable models")
parser.add_argument("--t2v-only", action="store_true",
help="When using --model-list: only show T2V-only models")
parser.add_argument("--t2i-only", action="store_true",
help="When using --model-list: only show T2I (text-to-image) models")
parser.add_argument("--v2v-only", action="store_true",
help="When using --model-list: only show V2V (video-to-video) models")
parser.add_argument("--v2i-only", action="store_true",
help="When using --model-list: only show V2I (video-to-image) models")
parser.add_argument("--3d-only", action="store_true",
help="When using --model-list: only show 2D-to-3D conversion models")
parser.add_argument("--tts-only", action="store_true",
help="When using --model-list: only show TTS (text-to-speech) models")
parser.add_argument("--audio-only", action="store_true",
help="When using --model-list: only show audio generation models")
parser.add_argument("--nsfw-friendly", action="store_true",
help="When using --model-list: only show uncensored/NSFW-capable models")
parser.add_argument("--low-vram", action="store_true",
...
@@ -367,7 +367,7 @@ async def list_tools() -> list:
"properties": {
"filter": {
"type": "string",
"enum": ["all", "i2v", "t2v", "t2i", "v2v", "v2i", "3d", "tts", "audio", "low_vram", "high_vram", "huge_vram", "nsfw"],
"description": "Filter models by type or VRAM requirement",
"default": "all"
}
@@ -375,6 +375,221 @@ async def list_tools() -> list:
}
),
Tool(
name="videogen_video_to_video",
description="Transform an existing video (Video-to-Video). Apply style transfer or filters to a video.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"prompt": {
"type": "string",
"description": "Description of the desired transformation"
},
"output": {
"type": "string",
"default": "output"
},
"strength": {
"type": "number",
"description": "Transformation strength (0.0-1.0)",
"default": 0.75
},
"fps": {
"type": "integer",
"description": "Processing FPS",
"default": 15
}
},
"required": ["video", "prompt"]
}
),
Tool(
name="videogen_apply_video_filter",
description="Apply a filter to a video. Available filters: grayscale, sepia, blur, sharpen, contrast, saturation, speed, slow, reverse, fade_in, fade_out, denoise, stabilize.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"filter": {
"type": "string",
"enum": ["grayscale", "sepia", "blur", "sharpen", "contrast", "saturation", "speed", "slow", "reverse", "fade_in", "fade_out", "denoise", "stabilize"],
"description": "Filter to apply"
},
"params": {
"type": "string",
"description": "Filter parameters (e.g., 'factor=2.0' for speed, 'radius=10' for blur)"
},
"output": {
"type": "string",
"default": "output"
}
},
"required": ["video", "filter"]
}
),
Tool(
name="videogen_extract_frames",
description="Extract frames from a video. Can extract a single frame, keyframes, or all frames.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"mode": {
"type": "string",
"enum": ["single", "keyframes", "all"],
"description": "Extraction mode: single frame, keyframes, or all frames",
"default": "keyframes"
},
"timestamp": {
"type": "number",
"description": "Timestamp for single frame extraction (seconds)"
},
"frame_number": {
"type": "integer",
"description": "Frame number for single frame extraction"
},
"max_frames": {
"type": "integer",
"description": "Maximum frames to extract",
"default": 100
},
"output_dir": {
"type": "string",
"description": "Output directory for frames",
"default": "frames"
}
},
"required": ["video"]
}
),
Tool(
name="videogen_create_collage",
description="Create a collage/thumbnail grid from a video.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"grid": {
"type": "string",
"description": "Grid size (e.g., '4x4', '3x3')",
"default": "4x4"
},
"method": {
"type": "string",
"enum": ["uniform", "keyframes", "random"],
"description": "Sampling method",
"default": "uniform"
},
"output": {
"type": "string",
"default": "collage.png"
}
},
"required": ["video"]
}
),
Tool(
name="videogen_upscale_video",
description="Upscale a video using AI upscaling models.",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"scale": {
"type": "number",
"description": "Upscale factor (2.0 or 4.0)",
"default": 2.0
},
"method": {
"type": "string",
"enum": ["ffmpeg", "esrgan", "real_esrgan", "swinir"],
"description": "Upscaling method",
"default": "ffmpeg"
},
"output": {
"type": "string",
"default": "output"
}
},
"required": ["video"]
}
),
Tool(
name="videogen_convert_3d",
description="Convert 2D video to 3D format (SBS, anaglyph, or VR 360).",
inputSchema={
"type": "object",
"properties": {
"video": {
"type": "string",
"description": "Path to the input video file"
},
"format": {
"type": "string",
"enum": ["sbs", "anaglyph", "vr"],
"description": "3D output format: sbs (side-by-side), anaglyph (red/cyan), vr (360)"
},
"depth_method": {
"type": "string",
"enum": ["ai", "disparity", "shift"],
"description": "Depth estimation method",
"default": "shift"
},
"disparity_scale": {
"type": "number",
"description": "Disparity scale (0.5-2.0)",
"default": 1.0
},
"output": {
"type": "string",
"default": "output"
}
},
"required": ["video", "format"]
}
),
Tool(
name="videogen_concat_videos",
description="Concatenate multiple videos into one.",
inputSchema={
"type": "object",
"properties": {
"videos": {
"type": "array",
"items": {"type": "string"},
"description": "List of video file paths to concatenate"
},
"output": {
"type": "string",
"default": "output"
}
},
"required": ["videos"]
}
),
Tool(
name="videogen_show_model",
description="Show detailed information about a specific model.",
@@ -572,6 +787,18 @@ async def call_tool(name: str, arguments: dict) -> list:
args.append("--i2v-only")
elif filter_type == "t2v":
args.append("--t2v-only")
elif filter_type == "t2i":
args.append("--t2i-only")
elif filter_type == "v2v":
args.append("--v2v-only")
elif filter_type == "v2i":
args.append("--v2i-only")
elif filter_type == "3d":
args.append("--3d-only")
elif filter_type == "tts":
args.append("--tts-only")
elif filter_type == "audio":
args.append("--audio-only")
elif filter_type == "low_vram":
args.append("--low-vram")
elif filter_type == "high_vram":
@@ -614,6 +841,104 @@ async def call_tool(name: str, arguments: dict) -> list:
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_video_to_video":
args = [
"--video", arguments["video"],
"--video-to-video",
"--prompt", arguments["prompt"],
"--output", arguments.get("output", "output"),
"--v2v-strength", str(arguments.get("strength", 0.75)),
"--v2v-fps", str(arguments.get("fps", 15)),
]
output, code = run_videogen_command(args, timeout=3600)
return [TextContent(type="text", text=output)]
elif name == "videogen_apply_video_filter":
args = [
"--video", arguments["video"],
"--video-filter", arguments["filter"],
"--output", arguments.get("output", "output"),
]
if arguments.get("params"):
args.extend(["--filter-params", arguments["params"]])
output, code = run_videogen_command(args, timeout=1800)
return [TextContent(type="text", text=output)]
elif name == "videogen_extract_frames":
mode = arguments.get("mode", "keyframes")
args = ["--video", arguments["video"]]
if mode == "single":
args.append("--extract-frame")
if arguments.get("timestamp"):
args.extend(["--timestamp", str(arguments["timestamp"])])
if arguments.get("frame_number"):
args.extend(["--frame-number", str(arguments["frame_number"])])
elif mode == "keyframes":
args.append("--extract-keyframes")
if arguments.get("max_frames"):
args.extend(["--max-keyframes", str(arguments["max_frames"])])
else: # all
args.append("--extract-frames")
if arguments.get("max_frames"):
args.extend(["--v2v-max-frames", str(arguments["max_frames"])])
if arguments.get("output_dir"):
args.extend(["--frames-dir", arguments["output_dir"]])
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_create_collage":
args = [
"--video", arguments["video"],
"--video-collage",
"--collage-grid", arguments.get("grid", "4x4"),
"--collage-method", arguments.get("method", "uniform"),
"--output", arguments.get("output", "collage.png"),
]
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
elif name == "videogen_upscale_video":
args = [
"--video", arguments["video"],
"--upscale-video",
"--upscale-factor", str(arguments.get("scale", 2.0)),
"--upscale-method", arguments.get("method", "ffmpeg"),
"--output", arguments.get("output", "output"),
]
output, code = run_videogen_command(args, timeout=3600)
return [TextContent(type="text", text=output)]
elif name == "videogen_convert_3d":
format_type = arguments["format"]
args = [
"--video", arguments["video"],
"--output", arguments.get("output", "output"),
]
if format_type == "sbs":
args.append("--convert-3d-sbs")
elif format_type == "anaglyph":
args.append("--convert-3d-anaglyph")
elif format_type == "vr":
args.append("--convert-vr")
if arguments.get("depth_method"):
args.extend(["--depth-method", arguments["depth_method"]])
if arguments.get("disparity_scale"):
args.extend(["--disparity-scale", str(arguments["disparity_scale"])])
output, code = run_videogen_command(args, timeout=3600)
return [TextContent(type="text", text=output)]
elif name == "videogen_concat_videos":
videos = arguments["videos"]
args = ["--concat-videos"] + videos + ["--output", arguments.get("output", "output")]
output, code = run_videogen_command(args)
return [TextContent(type="text", text=output)]
else:
return [TextContent(type="text", text=f"Unknown tool: {name}")]
...
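Each of the MCP handlers above is a thin adapter that assembles a CLI argument list and hands it to `run_videogen_command`. That assembly can be tested without running the CLI at all; a sketch mirroring the `videogen_convert_3d` handler (the helper name is hypothetical, and the actual handler invokes the command rather than returning the list):

```python
def build_convert_3d_args(arguments):
    """Mirror of the videogen_convert_3d handler's argument assembly."""
    args = ["--video", arguments["video"],
            "--output", arguments.get("output", "output")]
    # Map the MCP 'format' enum to the corresponding CLI flag
    flag = {"sbs": "--convert-3d-sbs",
            "anaglyph": "--convert-3d-anaglyph",
            "vr": "--convert-vr"}[arguments["format"]]
    args.append(flag)
    # Optional parameters are only forwarded when present
    if arguments.get("depth_method"):
        args.extend(["--depth-method", arguments["depth_method"]])
    if arguments.get("disparity_scale"):
        args.extend(["--disparity-scale", str(arguments["disparity_scale"])])
    return args

print(build_convert_3d_args({"video": "in.mp4", "format": "sbs", "depth_method": "ai"}))
# ['--video', 'in.mp4', '--output', 'output', '--convert-3d-sbs', '--depth-method', 'ai']
```

Keeping the enum-to-flag mapping in one dict makes it easy to verify the MCP schema and the CLI flags stay in sync as formats are added.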