    Add --audio-chunk option for audio/video chunking strategies · 20db65c1
    Stefy Lanza (nextime / spora ) authored
    Added --audio-chunk argument with 3 modes:
    - overlap (default): overlapping chunks like [0-60], [58-118]
    - word-boundary: uses Whisper timestamps to split at word boundaries
    - vad: uses Voice Activity Detection to skip silence
    
    Also added --audio-chunk-overlap to control overlap duration.
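The overlap mode described above can be sketched as a simple window computation. This is a minimal illustration, not the code from videogen.py: the function name `overlap_chunks`, the 60-second chunk length, and the 2-second default overlap are assumptions inferred from the `[0-60], [58-118]` example in the commit message.

```python
from typing import List, Tuple


def overlap_chunks(
    total: float, chunk_len: float = 60.0, overlap: float = 2.0
) -> List[Tuple[float, float]]:
    """Return (start, end) windows in seconds covering [0, total].

    Hypothetical helper: consecutive windows share `overlap` seconds so that
    words cut at a chunk edge appear whole in at least one chunk.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_len")

    step = chunk_len - overlap  # how far each new window advances
    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < total:
        end = min(start + chunk_len, total)  # clamp the last window
        chunks.append((start, end))
        if end >= total:
            break
        start += step
    return chunks


if __name__ == "__main__":
    for start, end in overlap_chunks(150.0):
        print(f"[{start:g}-{end:g}]")
```

With a 150-second input this yields the windows 0-60, 58-118, and 116-150, matching the pattern shown for the overlap mode. Any deduplication of text transcribed twice inside the shared 2 seconds would be a separate step and is not shown here.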
    
    New functions added:
    - process_video_with_vad(): VAD-based chunking
    - process_video_word_boundary(): Word-boundary chunking using Whisper
    
    Modified:
    - transcribe_video_audio(): accepts audio_chunk_type and audio_chunk_overlap params
    - _transcribe_chunked(): accepts chunk_type and overlap params
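For readers who want to see how the two new options might be declared, here is a minimal argparse sketch. Only the flag names, the three mode names, and the `overlap` default come from the commit message; the helper name `build_parser`, the help strings, the float type, and the 2-second default for `--audio-chunk-overlap` are assumptions, and the actual parser in videogen.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the two options from this commit."""
    parser = argparse.ArgumentParser(description="Audio chunking options (sketch)")
    parser.add_argument(
        "--audio-chunk",
        choices=["overlap", "word-boundary", "vad"],
        default="overlap",  # default mode per the commit message
        help="Chunking strategy used when transcribing audio/video",
    )
    parser.add_argument(
        "--audio-chunk-overlap",
        type=float,
        default=2.0,  # assumed default, inferred from [0-60], [58-118]
        help="Overlap between consecutive chunks, in seconds",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # argparse converts dashes to underscores in attribute names.
    print(args.audio_chunk, args.audio_chunk_overlap)
```

The parsed values would then presumably be forwarded as the `audio_chunk_type` and `audio_chunk_overlap` parameters of `transcribe_video_audio()` listed above.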
Name
static
templates
.gitignore
EXAMPLES.md
LICENSE.md
README.md
SKILL.md
check_model.py
check_pipelines.py
debug_model_select.py
logo.png
requirements.txt
screenshot.png
videogen.py
videogen_mcp_server.py
videogen_models.json
webapp.py