Add --audio-chunk option for audio/video chunking strategies
Added --audio-chunk argument with 3 modes:
- overlap (default): overlapping chunks like [0-60], [58-118]
- word-boundary: uses Whisper timestamps to split at word boundaries
- vad: uses Voice Activity Detection to skip silence

Also added --audio-chunk-overlap to control overlap duration.

New functions added:
- process_video_with_vad(): VAD-based chunking
- process_video_word_boundary(): Word-boundary chunking using Whisper

Modified:
- transcribe_video_audio(): accepts audio_chunk_type and audio_chunk_overlap params
- _transcribe_chunked(): accepts chunk_type and overlap params
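
As a rough illustration of the overlap mode, the sketch below computes chunk boundaries like [0-60], [58-118]. The function name overlapping_chunks and its parameters are hypothetical and not taken from this commit. It assumes times in seconds, a 60-second chunk length, and a 2-second overlap, as the example values suggest.

```python
from typing import List, Tuple


def overlapping_chunks(
    total_duration: float,
    chunk_length: float = 60.0,  # assumed default, inferred from the [0-60] example
    overlap: float = 2.0,        # assumed default, inferred from [58-118]
) -> List[Tuple[float, float]]:
    """Return (start, end) pairs covering [0, total_duration] with overlap."""
    if chunk_length <= 0:
        raise ValueError("chunk_length must be positive")
    if not 0 <= overlap < chunk_length:
        raise ValueError("overlap must be >= 0 and smaller than chunk_length")

    # Each new chunk starts this far after the previous one.
    step = chunk_length - overlap
    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_duration:
        # Clamp the last chunk to the end of the audio.
        end = min(start + chunk_length, total_duration)
        chunks.append((start, end))
        if end >= total_duration:
            break
        start += step
    return chunks


if __name__ == "__main__":
    for start, end in overlapping_chunks(130.0):
        print(f"[{start:g}-{end:g}]")
```

The overlap means each chunk repeats the last couple of seconds of the previous one, so a word cut at one boundary still appears whole in the next chunk. Merging the duplicated text afterwards is a separate step that this sketch does not cover.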