- 09 Mar, 2026 24 commits
-
Your Name authored
Fix remaining occurrences of model_name (singular) being used instead of model_names (list) in main function.
-
Your Name authored
Fixed bug where model_name (singular) was used instead of model_names (list) in several places, causing UnboundLocalError when running without --model.
-
Your Name authored
- CLI (coder): Add --alias argument to create model aliases
- Config: Add model_aliases dict and resolve_model() method
- Server (coderai): Add server-side alias support with --model-alias
- Aliases are resolved in both client and server when making API calls
- Aliases appear in /v1/models endpoint
- Aliases are persisted in config file
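The alias lookup described in this commit could be sketched roughly as follows. Only the names model_aliases and resolve_model() come from the commit message; the dataclass shape, the pass-through behaviour for unknown names, and the example model name are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    # Assumed shape: alias name -> real model name.
    model_aliases: dict = field(default_factory=dict)

    def resolve_model(self, name: str) -> str:
        """Return the alias target, or the name unchanged if it is not an alias."""
        return self.model_aliases.get(name, name)


# "fast" and "example-model-7b" are hypothetical names.
config = Config(model_aliases={"fast": "example-model-7b"})
print(config.resolve_model("fast"))   # alias is replaced by its target
print(config.resolve_model("other"))  # non-alias names pass through
```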
-
Your Name authored
- Add QueueManager class to track waiting requests
- Send 'waiting for model...' frames with time counter at regular intervals
- Send 'Model starting' frame when model begins processing
- Add x_queue_info field to streaming response frames for queue status
- Track queue position and wait time for each client
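A minimal sketch of how queue position and wait time might be tracked per request. The class name QueueManager and the x_queue_info field come from the commit message; the method names, the dictionary keys, and the injectable clock are assumptions, not the project's actual code.

```python
import time
from collections import OrderedDict


class QueueManager:
    """Track waiting requests in arrival order (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._waiting = OrderedDict()  # request_id -> enqueue timestamp

    def enqueue(self, request_id):
        self._waiting[request_id] = self._clock()

    def dequeue(self, request_id):
        self._waiting.pop(request_id, None)

    def queue_info(self, request_id):
        """Build a dict that could be sent as x_queue_info (keys are assumed)."""
        position = list(self._waiting).index(request_id) + 1
        waited = self._clock() - self._waiting[request_id]
        return {"position": position, "wait_seconds": round(waited, 1)}
```

A server loop could call queue_info() at a fixed interval while a request is waiting and attach the result to each 'waiting for model...' frame.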
-
Your Name authored
- Add support for multiple --audio-model arguments (action='append')
- Add support for multiple --image-model arguments (action='append')
- Add 'audio' alias pointing to first audio model
- Add 'vision'/'image' aliases pointing to first image model
- Update MultiModelManager to store audio_models and image_models as lists
- Add audio_model and image_model properties for accessing first model
- Update get_model_for_request to handle aliases
- Update list_models to show all models and aliases
- Fix remaining references in main function to use list-based variables
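The repeated-flag handling can be illustrated with argparse's action='append'. The flag names and alias names follow the commit message; the dest names, the build_aliases helper, and the example model names are hypothetical.

```python
import argparse

parser = argparse.ArgumentParser()
# Each occurrence of the flag appends one entry to the list.
parser.add_argument("--audio-model", action="append", default=[], dest="audio_models")
parser.add_argument("--image-model", action="append", default=[], dest="image_models")


def build_aliases(audio_models, image_models):
    """Point 'audio' and 'vision'/'image' at the first model of each kind."""
    aliases = {}
    if audio_models:
        aliases["audio"] = audio_models[0]
    if image_models:
        aliases["vision"] = aliases["image"] = image_models[0]
    return aliases


args = parser.parse_args(
    ["--audio-model", "base", "--audio-model", "small", "--image-model", "example-image-model"]
)
print(args.audio_models)
print(build_aliases(args.audio_models, args.image_models))
```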
-
Your Name authored
-
Your Name authored
-
Your Name authored
- Changed --model to -m
- Changed --output to -otxt (output as text)
- Changed --device to -dev
- Changed --file to -f for input audio
-
Your Name authored
-
Your Name authored
The args variable was not accessible in the create_transcription function, causing a NameError when using the --whisper-cpp CLI option. This fix adds global_args to store the parsed arguments so that endpoint functions can access them.
-
Your Name authored
-
Your Name authored
-
Your Name authored
-
Your Name authored
- Modified build.sh to build whispercpp with Vulkan support
- Added --audio-vulkan-device argument to specify GPU device for Whisper
- Added Vulkan detection and logging for Whisper transcription
- Set GGML_VULKAN_DEVICE environment variable for GPU selection
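One way the device selection might look in Python. The variable name GGML_VULKAN_DEVICE is the one given in the commit message and is not verified here against the ggml sources; the helper name and its signature are assumptions. The variable would need to be set before the native library initialises.

```python
import os


def select_vulkan_device(device_index, env=os.environ):
    """Export the chosen GPU index; leave the environment alone if none is given."""
    if device_index is not None:
        # Environment values must be strings.
        env["GGML_VULKAN_DEVICE"] = str(device_index)
    return env.get("GGML_VULKAN_DEVICE")
```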
-
Your Name authored
-
Your Name authored
-
Your Name authored
faster-whisper doesn't support the GGUF format (a llama.cpp format). GGUF files are now detected by extension and routed directly to whispercpp.
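Extension-based detection of this kind could be as small as the sketch below. The function names and the backend labels are illustrative assumptions; only the rule "GGUF by extension goes to whispercpp" comes from the commit message.

```python
from pathlib import Path


def is_gguf(model_path: str) -> bool:
    """Treat any path ending in .gguf (case-insensitive) as a GGUF file."""
    return Path(model_path).suffix.lower() == ".gguf"


def pick_backend(model_path: str) -> str:
    # GGUF files skip faster-whisper entirely.
    return "whispercpp" if is_gguf(model_path) else "faster-whisper"
```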
-
Your Name authored
- Add faster_whisper_failed flag to properly track failures
- When faster-whisper throws non-ImportError (e.g., GGUF not supported), now falls back to whispercpp instead of failing
- Applies to both pre-loading and transcription endpoint
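The fallback logic might follow a pattern like this sketch. The flag name faster_whisper_failed is from the commit message; the function name and the two callables standing in for the real backends are hypothetical.

```python
def transcribe_with_fallback(audio_path, primary, fallback):
    """Try the primary backend; on any failure, use the fallback instead."""
    faster_whisper_failed = False
    result = None
    try:
        result = primary(audio_path)
    except ImportError:
        # Backend is not installed at all.
        faster_whisper_failed = True
    except Exception as exc:
        # Backend is installed but rejected the model or the input.
        print(f"primary backend failed: {exc!r}")
        faster_whisper_failed = True
    if faster_whisper_failed:
        result = fallback(audio_path)
    return result
```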
-
Your Name authored
- Add specific detection for 'invalid ELF' / 'Mach-O' architecture mismatch errors
- Improve error messages to mention both options:
  - Install PyTorch + faster-whisper
  - Use built-in whispercpp model (tiny/base/small/medium/large)
- Fix critical bug: now raises HTTPException instead of returning None
-
Your Name authored
- Recognize built-in model names: tiny, base, small, medium, large-v1, large
- Allow pre-loading these models directly without file path
-
Your Name authored
- Add better error detection for 'not a valid preconverted model' errors
- Provide clear guidance to users about whispercpp limitations
- Suggest installing faster-whisper with PyTorch or using built-in model names
- Update both transcription endpoint and pre-loading code
-
Your Name authored
- Update transcription endpoint to try faster-whisper first, then whispercpp
- Update pre-loading code to support both backends
- Add whispercpp to all requirements files (vulkan, nvidia, default)
- Remove broken llama.cpp fallback (llama.cpp cannot transcribe Whisper)
-
Your Name authored
-
Your Name authored
-
- 08 Mar, 2026 16 commits
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
When the audio model is in GGUF format, use llama.cpp instead of faster-whisper for pre-loading. This allows using the Vulkan backend for audio transcription.
-
Stefy Lanza (nextime / spora ) authored
When only one model type is specified (e.g., only --audio-model with no --model), automatically pre-load it even in on-demand mode. This ensures the model is downloaded and ready for use.
-
Stefy Lanza (nextime / spora ) authored
- Add --loadall flag to pre-load all models at startup
- Add --loadswap flag to keep models in RAM, swap active to VRAM
- Fix bug where load_mode was used before being defined in audio model section
- Remove duplicate load_mode determination code
- Improve the 'no main model specified' error message to mention TTS
-
Stefy Lanza (nextime / spora ) authored
-
Stefy Lanza (nextime / spora ) authored
- Add --tts-model option for Kokoro TTS models
- Add /v1/audio/speech endpoint (OpenAI-compatible)
- Add model caching to prevent redundant downloads
- Replace MD5 with SHA-256 for cache keys
- Move hashlib and pathlib imports to module level
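A SHA-256 cache key of the kind mentioned here could be derived as in this sketch. Keying the cache on the source URL, and the cache_path helper itself, are assumptions; the commit only states that SHA-256 replaced MD5.

```python
import hashlib
from pathlib import Path


def cache_path(url: str, cache_dir: str) -> Path:
    """Map a download URL to a stable file path inside cache_dir."""
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return Path(cache_dir) / key


def is_cached(url: str, cache_dir: str) -> bool:
    # A redundant download can be skipped when the keyed file already exists.
    return cache_path(url, cache_dir).exists()
```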
-
Stefy Lanza (nextime / spora ) authored
- --model is now optional if using audio or image models only
- Shows helpful error message with examples if no model specified
- Prints available models at startup
-
Stefy Lanza (nextime / spora ) authored
- Accept full HTTPS URLs for --model (Vulkan/GGUF models)
- Accept full HTTPS URLs for --audio-model (faster-whisper models)
- Downloads file to temp directory before loading
- Shows download progress percentage
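Downloading to a temporary file with a percentage readout might look roughly like this, using only the standard library. The function name, the chunk size, and the report callback are assumptions; the project's actual download code is not shown in the log.

```python
import os
import tempfile
import urllib.request


def download_to_temp(url, chunk_size=1 << 16, report=print):
    """Stream url into a temp file, reporting percent done when the size is known."""
    with urllib.request.urlopen(url) as resp:
        total = int(resp.headers.get("Content-Length") or 0)
        # Keep the file extension so later format detection still works
        # (naive: assumes the URL has no query string).
        suffix = os.path.splitext(url)[1]
        fd, path = tempfile.mkstemp(suffix=suffix)
        done = 0
        with os.fdopen(fd, "wb") as out:
            while chunk := resp.read(chunk_size):
                out.write(chunk)
                done += len(chunk)
                if total:
                    report(f"{done * 100 // total}%")
    return path
```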
-