Commits · 71aa70a0fd6a88963272f515b9e4d80ba3a72a59 · nexlab / coderai

10 Mar, 2026 21 commits

Use diffusion_model_path for sd.cpp and look for additional model files · 71aa70a0
Your Name authored Mar 10, 2026

Fix: Use StableDiffusion class for sd.cpp fallback · 3c682962
Your Name authored Mar 10, 2026

Fix import name: stable_diffusion_cpp · 5b818bc6
Your Name authored Mar 10, 2026


Add stable-diffusion-cpp-python fallback when llama.cpp fails · de678849

Your Name authored Mar 10, 2026

When a GGUF image model fails to load with llama.cpp, try loading it with stable-diffusion-cpp-python (sd.cpp) as a fallback.
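
A fallback of this kind can be sketched as an ordered chain of loaders. This is only an illustration under stated assumptions: `load_with_fallback` and the two stub loaders are hypothetical names, and in the real code the loaders would wrap llama.cpp and stable-diffusion-cpp-python.

```python
from typing import Any, Callable, Sequence


def load_with_fallback(
    path: str,
    loaders: Sequence[tuple[str, Callable[[str], Any]]],
) -> tuple[str, Any]:
    """Try each (name, loader) pair in order and return the first success."""
    errors: list[str] = []
    for name, loader in loaders:
        try:
            return name, loader(path)
        except Exception as exc:  # a backend may fail in many different ways
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Hypothetical stand-ins for the real backends (not the project's code).
def llama_cpp_loader(path: str) -> Any:
    raise ValueError("unsupported architecture")


def sd_cpp_loader(path: str) -> Any:
    return {"backend": "sd.cpp", "path": path}


backend, model = load_with_fallback(
    "model.gguf",
    [("llama.cpp", llama_cpp_loader), ("sd.cpp", sd_cpp_loader)],
)
```

Collecting the per-backend errors keeps the final message useful when every loader fails.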


Simplify error message when GGUF image model fails · 9ad2a7f1
Your Name authored Mar 10, 2026

Fix pip install command quoting · 0fc0ab29
Your Name authored Mar 10, 2026

Improve error message when GGUF loading fails · e35fd018
Your Name authored Mar 10, 2026

Tell user to update llama.cpp instead of falling back to diffusers.

Revert "Add diffusers fallback when GGUF image model fails to load" · 963aa5d5
Your Name authored Mar 10, 2026

This reverts commit 596d1245.


Add diffusers fallback when GGUF image model fails to load · 596d1245

Your Name authored Mar 10, 2026

If llama.cpp fails to load a GGUF image model (e.g., unsupported architecture like lumina2), try loading via diffusers instead.


Fix: Accept GGUF files with any version byte · fe8268fd

Your Name authored Mar 10, 2026

GGUF files can have different version bytes after the 'GGUF' magic (e.g., GGUF\x03 for version 3).
Changed the magic byte check from an exact match to a prefix check.
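
For context, the GGUF header starts with the 4-byte magic `GGUF`, followed by the version as a little-endian unsigned 32-bit integer, so version 3 begins with `GGUF\x03`. A prefix check along these lines might look as follows. The function name `read_gguf_version` is hypothetical and the sketch is not taken from the project.

```python
import struct


def read_gguf_version(header: bytes) -> int:
    """Return the GGUF version, or raise ValueError if the magic is wrong."""
    # Prefix check: only the first four bytes must match, not what follows.
    if len(header) < 8 or not header.startswith(b"GGUF"):
        raise ValueError("not a GGUF file")
    # Bytes 4..8 hold the version as a little-endian unsigned 32-bit integer.
    (version,) = struct.unpack_from("<I", header, 4)
    return version


v3_header = b"GGUF" + struct.pack("<I", 3)
version = read_gguf_version(v3_header)
```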


Fix: Preload image model when configured (even with audio models) · 8ebeafef

Your Name authored Mar 10, 2026

Changed condition from 'only if no audio models' to 'always when image_models is configured'.
This ensures the image model downloads at startup even when audio models are present.


Fix: Remove redundant import os statements causing UnboundLocalError · 0609c4cf

Your Name authored Mar 10, 2026

- Removed redundant 'import os' statements inside functions (lines 4522, 4926, 5005)
- Added back missing 'from llama_cpp import Llama' that was accidentally deleted
- Global 'import os' at line 12 is now the only one

This fixes the UnboundLocalError when running --list-cached-models or other CLI options.
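
The failure mode is easy to reproduce in isolation. An `import os` anywhere inside a function makes `os` a local name for the whole function body, so any use before that line fails. The function names below are hypothetical and only illustrate the mechanism.

```python
import os


def list_cache_broken(path: str) -> list[str]:
    # Because of the `import os` further down, Python treats `os` as a local
    # variable for the whole function, so this line runs before it is bound.
    entries = os.listdir(path)
    import os  # redundant: shadows the module-level import
    return entries


def list_cache_fixed(path: str) -> list[str]:
    # Only the module-level import exists, so `os` resolves globally.
    return os.listdir(path)


try:
    list_cache_broken(".")
except UnboundLocalError as exc:
    error_name = type(exc).__name__
```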


Add cache management CLI options · 496c4e53

Your Name authored Mar 10, 2026

- --list-cached-models: List all cached models with sizes
- --remove-all-models: Remove all cached models
- --remove-model <modelid>: Remove specific model by name/hash (partial match)
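
A minimal `argparse` sketch of these three options, assuming the flag names listed above. The helper `match_cached`, the case-sensitive substring match, and the cache entries are illustrative assumptions, not the project's implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="coderai")
    parser.add_argument("--list-cached-models", action="store_true",
                        help="list all cached models with sizes")
    parser.add_argument("--remove-all-models", action="store_true",
                        help="remove all cached models")
    parser.add_argument("--remove-model", metavar="MODELID",
                        help="remove a model by name or hash (partial match)")
    return parser


def match_cached(cached: list[str], query: str) -> list[str]:
    """Return cached entries whose name contains the query (partial match)."""
    return [name for name in cached if query in name]


args = build_parser().parse_args(["--remove-model", "lumina"])
# Hypothetical cache listing, for illustration only.
cached = ["a1b2c3_lumina2.gguf", "d4e5f6_whisper-base.bin"]
to_remove = match_cached(cached, args.remove_model)
```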


Add GGUF magic bytes validation · 015c6908

Your Name authored Mar 10, 2026

- Check if downloaded file is valid GGUF (magic bytes = 'GGUF')
- If not valid, show clear error that URL is wrong (returns HTML instead)
- Explain that URL must be direct download link ending in .gguf
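
The validation described above could be sketched roughly like this. The function name `check_gguf_download`, the HTML heuristics, and the error wording are assumptions made for illustration.

```python
def check_gguf_download(first_bytes: bytes, url: str) -> None:
    """Raise ValueError with a hint if the payload is not a GGUF file."""
    if first_bytes[:4] == b"GGUF":
        return
    # A landing page instead of a file usually starts with an HTML preamble.
    head = first_bytes.lstrip().lower()
    if head.startswith(b"<!doctype") or head.startswith(b"<html"):
        raise ValueError(
            f"{url} returned an HTML page, not a model; "
            "use a direct download link ending in .gguf"
        )
    raise ValueError(f"{url} is not a valid GGUF file (bad magic bytes)")
```

Reading only the first few bytes of the downloaded file is enough for this check, so it stays cheap even for multi-gigabyte models.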


Add file verification and magic bytes check for GGUF models · 611bfd8f
Your Name authored Mar 10, 2026


Add verbose error handling for GGUF image model loading · 141329bc

Your Name authored Mar 10, 2026

- Enable verbose=True in llama.cpp to see actual error
- Print GGUF model file size for debugging
- Add try/except with traceback to see detailed errors


Improve GGUF image model loading - better URL handling · 9af89755

Your Name authored Mar 10, 2026

- Check if model is URL before any processing
- Use original model name with query params for URL download
- Strip query params only for HuggingFace repo ID parsing
- Added more debug output to trace issues
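
The distinction between the two cases can be sketched with `urllib.parse`. The helper names (`is_url`, `strip_query`, `resolve_source`) and the example inputs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def is_url(model: str) -> bool:
    """Treat only http(s) strings as URLs; everything else is a repo ID."""
    return urlsplit(model).scheme in ("http", "https")


def strip_query(model: str) -> str:
    """Drop the query string and fragment, e.g. '?download=true'."""
    parts = urlsplit(model)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def resolve_source(model: str) -> tuple[str, str]:
    """Return (kind, value): URLs keep their query, repo IDs lose it."""
    if is_url(model):
        return "url", model
    return "repo", model.split("?", 1)[0]


kind, value = resolve_source("https://example.com/m.gguf?download=true")
```

Keeping the query on the download URL matters because some hosts only serve the file when a parameter such as `download=true` is present.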


Fix GGUF image model loading - strip query parameters · 6bc9af36

Your Name authored Mar 10, 2026

- Strip query parameters from model name before processing
- Handle URLs with ?download=true or other query params


Add GGUF image model support in --loadall mode · e848dd47

Your Name authored Mar 10, 2026

- Detect if image model is GGUF (ends with .gguf or contains 'gguf')
- If GGUF, load using llama.cpp (same as text Vulkan models)
- If diffusers model, load using Stable Diffusion pipeline
- Fixed both locations where image model preloading happens
- Now supports both GGUF and diffusers image generation models
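
The detection rule in the first bullet amounts to a name-based heuristic. A minimal sketch follows, with hypothetical function names and model IDs, and assuming the comparison is case-insensitive.

```python
def is_gguf_image_model(model: str) -> bool:
    """Heuristic from the commit text: '.gguf' suffix or 'gguf' in the name."""
    name = model.lower()  # case-insensitivity is an assumption
    return name.endswith(".gguf") or "gguf" in name


def pick_backend(model: str) -> str:
    """Route GGUF models to llama.cpp and everything else to diffusers."""
    return "llama.cpp" if is_gguf_image_model(model) else "diffusers"


backend_a = pick_backend("org/some-image-model-GGUF")
backend_b = pick_backend("org/some-sdxl-model")
```

Because the check only looks at the name, a file that is not really GGUF can still be routed to llama.cpp.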


Fix image model preloading with --loadall flag · 2308d5b0

Your Name authored Mar 10, 2026

- Fixed bug where image model wasn't actually being loaded when --loadall was specified
- The code only printed messages but never loaded the diffusers pipeline
- Now actually loads the Stable Diffusion pipeline using diffusers library
- Tries StableDiffusionXLPipeline first, falls back to generic DiffusionPipeline
- Moves to GPU if CUDA available, enables attention slicing for memory efficiency
- Also fixes second location where image model is the only configured model

- Debug command line output was already implemented
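
The loading sequence in the bullets above might look roughly like the following with the diffusers library. This is a sketch under assumptions, not the commit's code: the function name `load_image_pipeline` is hypothetical, and no dtype or offload options are set.

```python
def load_image_pipeline(model_id: str):
    """Load an image pipeline, preferring SDXL and falling back to generic."""
    # Imported lazily so this module still loads without diffusers installed.
    import torch
    from diffusers import DiffusionPipeline, StableDiffusionXLPipeline

    try:
        pipe = StableDiffusionXLPipeline.from_pretrained(model_id)
    except Exception:
        # Not an SDXL checkpoint: let diffusers pick the pipeline class.
        pipe = DiffusionPipeline.from_pretrained(model_id)

    if torch.cuda.is_available():
        pipe = pipe.to("cuda")
    # Trades some speed for lower peak memory during attention.
    pipe.enable_attention_slicing()
    return pipe
```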


Fix --loadall model preloading and --debug command line output · 9193536a

Your Name authored Mar 10, 2026

- Fixed undefined variable bug where model_name wasn't defined in scope
- Fixed duplicate model loading when using --loadall/--loadswap with multiple models
- First model is now only loaded once (skipped in loop if already loaded)
- Loadall mode now properly preloads all models in VRAM respecting offload strategy
- Loadswap mode properly loads additional models to RAM
- Ondemand mode doesn't reload first model

Feature 1: --debug now shows full command line as first output
Feature 2: --loadall with multiple models now preloads all in VRAM


09 Mar, 2026 19 commits

Add --convert flag to whisper-server command for audio format conversion · b51d08b1
Your Name authored Mar 09, 2026

Add debug logging for whisper-server transcription requests · 3ffd8f3e
Your Name authored Mar 09, 2026


Fix whisper-server: check for whisper-server FIRST in transcription endpoint · 37df61f9

Your Name authored Mar 09, 2026

- Move whisper-server check before audio_model check
- Now whisper-server will be used if available, regardless of audio_model setting
- Also update multi_model_manager.audio_models with cached path


Fix whisper-server: use cached model path for audio_models · bbf12dd4

Your Name authored Mar 09, 2026

- WhisperServerManager.start() now returns actual model path (useful for URL -> cached path)
- Update audio_models[0] with cached path after downloading
- Store actual_model_path in current_model instead of original URL


Fix whisper-server: use different port, check availability, handle URL models · 0e67a9a2

Your Name authored Mar 09, 2026

- Changed default port from 8081 to 8744 (less common)
- Check if port is available before using, auto-find available port if needed
- Download URL models before starting whisper-server (use model cache)
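
A port availability check with automatic fallback can be sketched with the standard `socket` module. The function names are hypothetical. The default of 8744 comes from the commit text, and binding to `127.0.0.1` is an assumption.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if the port can be bound right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_port(preferred: int = 8744, host: str = "127.0.0.1") -> int:
    """Use the preferred port if free, otherwise let the OS choose one."""
    if port_is_free(preferred, host):
        return preferred
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 asks the OS for any free port
        return sock.getsockname()[1]
```

The check is inherently racy: another process can take the port between the probe and the server's own bind, so the caller should still handle a startup failure.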


Add --whisper-server support for audio transcription · 005dfd46

Your Name authored Mar 09, 2026

- Add WhisperServerManager class to manage whisper-server subprocess
- Add --whisper-server argument to specify whisper-server binary path
- Add --whisper-server-port argument for port configuration (default 8081)
- Modify audio transcription endpoint to proxy to whisper-server
- Add cleanup on shutdown to stop whisper-server
- Model can stay loaded in VRAM as long as the server runs


Fix more model_name vs model_names bugs · 1ca724e8

Your Name authored Mar 09, 2026

Fix remaining occurrences of model_name (singular) being used instead
of model_names (list) in main function.


Fix UnboundLocalError: model_name vs model_names · e0530b65

Your Name authored Mar 09, 2026

Fixed bug where model_name (singular) was used instead of model_names (list)
in several places, causing UnboundLocalError when running without --model.


Add --alias support for model names · a3c476ec

Your Name authored Mar 09, 2026

- CLI (coder): Add --alias argument to create model aliases
- Config: Add model_aliases dict and resolve_model() method
- Server (coderai): Add server-side alias support with --model-alias
- Aliases are resolved in both client and server when making API calls
- Aliases appear in /v1/models endpoint
- Aliases are persisted in config file
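
Alias resolution of this kind reduces to a dictionary lookup with a pass-through default. The names `Config`, `model_aliases`, and `resolve_model` come from the commit text. The dataclass layout and the example alias are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    # Maps alias -> real model name; persistence is out of scope here.
    model_aliases: dict[str, str] = field(default_factory=dict)

    def resolve_model(self, name: str) -> str:
        """Return the real model name for an alias, or the name unchanged."""
        return self.model_aliases.get(name, name)


config = Config(model_aliases={"fast": "org/small-model-GGUF"})
resolved = config.resolve_model("fast")
```

Returning unknown names unchanged means callers can pass either an alias or a full model name through the same code path.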


Implement queue notification system for streaming responses · aafd41eb

Your Name authored Mar 09, 2026

- Add QueueManager class to track waiting requests
- Send 'waiting for model...' frames with time counter at regular intervals
- Send 'Model starting' frame when model begins processing
- Add x_queue_info field to streaming response frames for queue status
- Track queue position and wait time for each client
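
The bookkeeping behind such notifications can be sketched as follows. `QueueManager` and `x_queue_info` are named in the commit. The method names and the keys inside the info payload are assumptions made for illustration.

```python
import time
from collections import OrderedDict


class QueueManager:
    """Tracks waiting requests in arrival order (illustrative sketch)."""

    def __init__(self) -> None:
        # request_id -> monotonic timestamp of when the request started waiting
        self._waiting: "OrderedDict[str, float]" = OrderedDict()

    def enqueue(self, request_id: str) -> None:
        self._waiting[request_id] = time.monotonic()

    def dequeue(self, request_id: str) -> None:
        self._waiting.pop(request_id, None)

    def queue_info(self, request_id: str) -> dict:
        """Build a payload suitable for an x_queue_info field."""
        position = list(self._waiting).index(request_id) + 1
        waited = time.monotonic() - self._waiting[request_id]
        return {"position": position, "waiting_seconds": round(waited, 1)}


queue = QueueManager()
queue.enqueue("req-a")
queue.enqueue("req-b")
info = queue.queue_info("req-b")
```

A streaming endpoint could call `queue_info` at a fixed interval and attach the result to each "waiting for model..." frame until the request is dequeued.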


Implement multiple audio/image model support with aliases · 65caf41f

Your Name authored Mar 09, 2026

- Add support for multiple --audio-model arguments (action='append')
- Add support for multiple --image-model arguments (action='append')
- Add 'audio' alias pointing to first audio model
- Add 'vision'/'image' aliases pointing to first image model
- Update MultiModelManager to store audio_models and image_models as lists
- Add audio_model and image_model properties for accessing first model
- Update get_model_for_request to handle aliases
- Update list_models to show all models and aliases
- Fix remaining references in main function to use list-based variables
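
The repeatable arguments and first-model aliases can be sketched with `argparse`. The flag names, `action='append'`, and the alias names come from the commit text. The `dest` names, the helper `build_aliases`, and the model names are assumptions.

```python
import argparse

parser = argparse.ArgumentParser(prog="coderai")
# action="append" collects every occurrence of the flag into a list.
parser.add_argument("--audio-model", action="append", default=[], dest="audio_models")
parser.add_argument("--image-model", action="append", default=[], dest="image_models")


def build_aliases(audio_models: list[str], image_models: list[str]) -> dict[str, str]:
    """Point the generic aliases at the first configured model of each kind."""
    aliases: dict[str, str] = {}
    if audio_models:
        aliases["audio"] = audio_models[0]
    if image_models:
        aliases["vision"] = image_models[0]
        aliases["image"] = image_models[0]
    return aliases


args = parser.parse_args(
    ["--audio-model", "whisper-a", "--audio-model", "whisper-b",
     "--image-model", "sdxl-a"]
)
aliases = build_aliases(args.audio_models, args.image_models)
```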


Fix: Always use configured audio model regardless of request model parameter · c2bd5ffa
Your Name authored Mar 09, 2026

Fix: Download model if not cached when using whisper.cpp CLI · 1cb7f4b3
Your Name authored Mar 09, 2026


Fix: Use correct whisper.cpp CLI arguments · 4af2538e

Your Name authored Mar 09, 2026

- Changed --model to -m
- Changed --output to -otxt (output as text)
- Changed --device to -dev
- Changed --file to -f for input audio
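
Assembling the command with the corrected flags might look like this. Only the four flags are taken from the commit text. The function name, the binary name `whisper-cli`, the file names, and the integer device index are assumptions.

```python
def build_whisper_command(
    binary: str, model: str, audio: str, device: int = 0
) -> list[str]:
    """Assemble the argument list using the short flags named in the commit."""
    return [
        binary,
        "-m", model,          # model file (was --model)
        "-f", audio,          # input audio (was --file)
        "-dev", str(device),  # device selection (was --device)
        "-otxt",              # write the transcript as text (was --output)
    ]


cmd = build_whisper_command("whisper-cli", "ggml-base.bin", "speech.wav")
```

Passing the command as a list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.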


Debug: Show whisper.cpp CLI command in debug mode · d343d706
Your Name authored Mar 09, 2026


Fix: Make args accessible in FastAPI transcription endpoint · a388b95e

Your Name authored Mar 09, 2026

The args variable was not accessible in the create_transcription function,
causing a NameError when using --whisper-cpp CLI option. This fix adds
global_args to store the parsed arguments for access in endpoint functions.


Add --whisper-cpp option to use whisper.cpp CLI directly · 4eaa850f
Your Name authored Mar 09, 2026

Add debug output for whispercpp import errors · 4c24c7b9
Your Name authored Mar 09, 2026

Fix UnboundLocalError for model_path in startup code · 966fad45
Your Name authored Mar 09, 2026
