Commits · c3a5417f187ceb1f9fa8047c11385dd433d15a22 · nexlab / coderai

15 Mar, 2026 17 commits

Don't use GGUF text models as fallback for image generation · c3a5417f
Your Name authored Mar 15, 2026
```
GGUF models are for text/LLM and cannot do image generation.
```
c3a5417f
Revert import name - package imports as stable_diffusion_cpp · 5a1b67fe
Your Name authored Mar 15, 2026

5a1b67fe
Add stable-diffusion-cpp-python to requirements for image generation · 273264bf
Your Name authored Mar 15, 2026

273264bf
Add diffusers and safetensors to requirements for image generation · 0f1c2385
Your Name authored Mar 15, 2026
```
diffusers is required for Stable Diffusion image generation
```
0f1c2385
Fix stable-diffusion-cpp-python import name · c5c74197
Your Name authored Mar 15, 2026
```
The package is named 'stable_diffusion_cpp_python', not 'stable_diffusion_cpp'
```
c5c74197

Fallback to main --model for image generation if --image-model not set · b70ae900

Your Name authored Mar 15, 2026

If --image-model is not specified, fall back to the main --model as
the image model when a request asks for the 'default' model.

b70ae900

Auto-retry download when cached model file is corrupted · 9c681ed1

Your Name authored Mar 15, 2026

If loading a cached GGUF model fails with corruption indicators
(invalid, corrupt, magic, header), delete the corrupted cache and
re-download the model automatically.

9c681ed1
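
The retry behaviour described in 9c681ed1 can be sketched roughly as follows. This is only an illustration: `load_with_retry`, `looks_corrupted`, `loader`, and `downloader` are hypothetical names, and the keyword list is copied from the commit message rather than from the coderai source.

```python
import os
from typing import Callable

# Substrings from the commit message that hint at a corrupted file.
CORRUPTION_HINTS = ("invalid", "corrupt", "magic", "header")


def looks_corrupted(error: Exception) -> bool:
    """Return True if the error text contains a corruption indicator."""
    message = str(error).lower()
    return any(hint in message for hint in CORRUPTION_HINTS)


def load_with_retry(path: str,
                    loader: Callable[[str], object],
                    downloader: Callable[[str], None]) -> object:
    """Load a cached model; on a corruption-like failure, re-download once."""
    try:
        return loader(path)
    except Exception as error:
        if not looks_corrupted(error):
            raise  # unrelated failure: do not touch the cache
        # Drop the bad cache entry, fetch it again, and retry a single time.
        if os.path.exists(path):
            os.remove(path)
        downloader(path)
        return loader(path)
```

Retrying only once keeps a persistently broken upstream file from causing a download loop.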

Fix cached model path lookup to match download format · 102a464b

Your Name authored Mar 15, 2026

The cache filename format was inconsistent:
- get_cached_model_path used: {hash}{ext}
- load_model download used: {hash}_{filename}

This caused cache lookups to always fail. Now both use {hash}_{filename}
format to ensure cached models are properly found.

102a464b
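
A minimal sketch of the fix in 102a464b, where the download and the lookup share one naming helper so they cannot drift apart again. The hash function (truncated SHA-256 of the URL), the `cache_filename` helper, and the signature of `get_cached_model_path` are assumptions for illustration; only the `{hash}_{filename}` format comes from the commit message.

```python
import hashlib
import os
import posixpath
from typing import Optional
from urllib.parse import urlparse


def cache_filename(url: str) -> str:
    """Build the {hash}_{filename} cache name for a model URL."""
    # Assumption: the hash is a truncated SHA-256 of the full URL.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    filename = posixpath.basename(urlparse(url).path)
    return f"{digest}_{filename}"


def get_cached_model_path(cache_dir: str, url: str) -> Optional[str]:
    """Return the cached path if the file exists, otherwise None."""
    path = os.path.join(cache_dir, cache_filename(url))
    return path if os.path.exists(path) else None
```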

Simplify error message for unsupported image generation models · c72d8384
Your Name authored Mar 15, 2026
```
Change from detailed installation instructions to simple message:
'Model does not support image generation'
```
c72d8384

Fix image model routing: use request model, fall back to default · c792d752

Your Name authored Mar 15, 2026

- If request specifies a model, use that
- If request doesn't specify a model (empty or 'image'), use default
- Legacy 'image:' prefix also falls back to default
- Error handling already exists for when no backend is available

c792d752
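
The routing rules listed in c792d752 could be expressed as a small helper like the one below. `resolve_image_model` and its parameters are hypothetical names; the sketch only encodes the three rules from the commit message and leaves backend error handling out.

```python
from typing import Optional


def resolve_image_model(requested: Optional[str], default_model: str) -> str:
    """Pick the image model for a request."""
    name = (requested or "").strip()
    # Empty, the generic 'image' name, or the legacy 'image:' prefix
    # all fall back to the configured default.
    if name in ("", "image") or name.startswith("image:"):
        return default_model
    # Otherwise the request's own model wins.
    return name
```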

Fix debug message to not reference VulkanBackend when using CUDA · fa5c634f

Your Name authored Mar 15, 2026

Change message from 'VulkanBackend will use CUDA backend' to
'GGUF model will use CUDA backend (forced by --backend nvidia)'

fa5c634f

Remove vulkan from available backends when --backend nvidia is used with GGUF models · 88479315

Your Name authored Mar 15, 2026

When user explicitly passes --backend nvidia with a GGUF model,
vulkan is now removed from the available backends list since
llama-cpp-python will use CUDA instead of Vulkan.

88479315

Add note when GGUF model with nvidia backend uses CUDA instead of Vulkan · 273ab8c8

Your Name authored Mar 15, 2026

- When user specifies --backend nvidia with a GGUF model, show a note
  indicating that the vulkan backend will use CUDA
- This clarifies that Vulkan isn't being used in this scenario

273ab8c8

Add 'all' backend option to build.sh for installing all backends at once · a6070221

Your Name authored Mar 15, 2026

- Add 'all' as a valid backend option
- Change default from 'nvidia' to 'all'
- Add comprehensive 'all' backend section that installs:
  - Base requirements
  - PyTorch with CUDA (nvidia backend)
  - llama-cpp-python with CUDA and Vulkan support
  - stable-diffusion-cpp-python with OpenCL
  - Additional requirements
- Detect available hardware (CUDA, Vulkan, OpenCL) and enable accordingly
- Show summary of available backends after installation

a6070221

Force CUDA backend in llama-cpp-python when NVIDIA backend is requested with GGUF models · 1bd92fe1

Your Name authored Mar 15, 2026

- Store original backend before switching to vulkan for GGUF files
- Pass original_backend to VulkanBackend constructor
- Add force_cuda flag that triggers CUDA environment setup
- Set CUDA_VISIBLE_DEVICES when force_cuda is True
- Update success/error messages to reflect actual backend used
- Add debug output for CUDA detection

1bd92fe1

Fix UnboundLocalError for os module in --list-cached-models · d8765ac3

Your Name authored Mar 15, 2026

The local import of os inside the HTTPS block caused Python to treat os as a local variable throughout the main() function, so any use of os reached before that import had run raised UnboundLocalError.

d8765ac3
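
The scoping rule behind d8765ac3 can be reproduced in a few lines. The functions `broken` and `fixed` are illustrative stand-ins, not the actual coderai `main()`.

```python
import os


def broken(use_https: bool) -> str:
    # The import further down makes 'os' a local name for the whole
    # function, so this line fails before the import has run.
    cwd = os.getcwd()
    if use_https:
        import os  # local import: rebinds 'os' in the function scope
    return cwd


def fixed(use_https: bool) -> str:
    # No local import: 'os' resolves to the module-level import.
    return os.getcwd()
```

Calling `broken(False)` raises `UnboundLocalError` even though the local import is never executed, because Python decides which names are local when it compiles the function.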

Add python-multipart to requirements, GGUF support for CUDA backend · dd4dfff4

Your Name authored Mar 15, 2026

- Add python-multipart to requirements.txt, requirements-nvidia.txt, requirements-vulkan.txt
- Add llama-cpp-python to requirements-nvidia.txt for GGUF support
- When using CUDA/nvidia backend with GGUF file, automatically use llama-cpp-python

dd4dfff4

14 Mar, 2026 23 commits
- Add --vision-model, fix --file-path to return URL by default, add HTTPS support · c152ee28
  Your Name authored Mar 14, 2026
```
- Add --vision-model for image/video-to-text models
- When --file-path is set, return URL by default, base64 only if explicitly requested
- Add --https flag with auto-certificate generation
- Add --privkey and --pubkey for custom certificates
```
  c152ee28
- Add --image-cfg-scale and auto-detect VRAM for Vulkan · 29d2ed78
  Your Name authored Mar 14, 2026
```
- Add --image-cfg-scale CLI option (default 1.0)
- Add get_cfg_scale() helper that auto-detects VRAM
- If Vulkan and VRAM < 16GB, use cfg_scale=1.0 automatically
```
  29d2ed78
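
One way to read the rule in 29d2ed78 is sketched below. The real `get_cfg_scale()` detects VRAM itself; here the detected value is passed in as `vram_gb` so the example stays self-contained, and the parameter names are assumptions.

```python
from typing import Optional


def get_cfg_scale(cli_value: float, backend: str,
                  vram_gb: Optional[float]) -> float:
    """Return the cfg_scale to use for image generation."""
    # On Vulkan with less than 16 GB of VRAM, force cfg_scale to 1.0.
    if backend == "vulkan" and vram_gb is not None and vram_gb < 16:
        return 1.0
    # Otherwise honour the --image-cfg-scale value.
    return cli_value
```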
- Add per-model backend selection and OpenCL support · bf2a5318
  Your Name authored Mar 14, 2026
```
- Add --image-backend, --audio-backend, --tts-backend CLI args
- Add opencl to backend choices
- Add OpenCL build target in build.sh
```
  bf2a5318
- Remove vision model alias - use image only · 4ce1f330
  Your Name authored Mar 14, 2026
  
  4ce1f330
- Rename --vision-ctx and --vision-offload to --image-ctx and --image-offload · d5f7b07c
  Your Name authored Mar 14, 2026
  
  d5f7b07c
- Add --debug command line output and --nopreload flag · 2cdd7538
  Your Name authored Mar 14, 2026
```
- When --debug is enabled, show the full command line coderai was called with
- Add --nopreload flag to disable model preloading at startup
- When --nopreload is not specified, skip checking for preloaded sd.cpp models (forces load in worker thread to avoid Vulkan context issues)
- Fix image model preloading to respect --nopreload flag
```
  2cdd7538
- Add time.sleep after sd.cpp generation to let Vulkan driver settle · ac069fe2
  Your Name authored Mar 14, 2026
```
This helps prevent the GPU from staying at 100% utilization after
image generation by allowing the Vulkan driver to transition from
compute state to idle state.
```
  ac069fe2
- Fix sd.cpp parameters and add n_threads · 9fd66d57
  Your Name authored Mar 14, 2026
```
- Use keep_clip_on_cpu instead of clip_on_cpu
- VAE tiling is handled internally in newer builds
- Add n_threads using psutil.cpu_count() for optimal performance
```
  9fd66d57
- Add --vae-tiling and --clip-on-cpu for sd.cpp · d3fab6b1
  Your Name authored Mar 14, 2026
```
- Added --vae-tiling flag to enable VAE tiling for lower VRAM usage
- Added --clip-on-cpu flag to run CLIP on CPU to save VRAM
- Both options work with stable-diffusion-cpp-python
```
  d3fab6b1
- Add --image-seed CLI argument for default image generation seed · a680d5ab
  Your Name authored Mar 14, 2026
```
- Added --image-seed argument to set default seed for image generation
- Updated diffusers and sd.cpp code to use request seed or CLI default seed
- Priority: request seed > CLI default seed > random
```
  a680d5ab
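
The seed priority from a680d5ab (request seed, then CLI default, then random) might be implemented along these lines. `resolve_seed` is a hypothetical name, and the 32-bit range for the random seed is an assumption.

```python
import random
from typing import Optional


def resolve_seed(request_seed: Optional[int],
                 cli_seed: Optional[int]) -> int:
    """Apply the priority: request seed > CLI default seed > random."""
    # Compare against None so that an explicit seed of 0 is respected.
    if request_seed is not None:
        return request_seed
    if cli_seed is not None:
        return cli_seed
    return random.randint(0, 2**32 - 1)
```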
- Fix model_key undefined in on-demand image loading section · 7f60e0df
  Your Name authored Mar 14, 2026
```
- Added model_key initialization before sd.cpp loading in on-demand section
- Added model_key assignment before adding model to manager
```
  7f60e0df
- Fix --loadall model preloading bugs · ffd3932c
  Your Name authored Mar 14, 2026
```
- Fixed model_key variable scope issue in GGUF->sd.cpp fallback
- Fixed model_path undefined in diffusers preloading section
- These fixes prevent startup crashes when using --loadall
```
  ffd3932c
- Implement image generation fallback chain: try torch/diffusers first, then sd.cpp · 7f5bf82d
  Your Name authored Mar 14, 2026
```
- Reordered the image generation backend priority to try torch/diffusers first
- If torch/diffusers fails (ImportError or other error), fall back to stable-diffusion-cpp-python
- If both backends fail, return a helpful error message with installation instructions
- Added dynamic loading of sd.cpp model if not pre-loaded
```
  7f5bf82d
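
The fallback chain in 7f5bf82d follows a common pattern: try each backend in priority order and collect the errors. The sketch below uses a hypothetical `generate_with_fallback` function and plain callables in place of the real diffusers and sd.cpp code paths.

```python
from typing import Callable, List, Sequence, Tuple

Backend = Tuple[str, Callable[[str], bytes]]


def generate_with_fallback(prompt: str, backends: Sequence[Backend]) -> bytes:
    """Try each backend in order and return the first successful result."""
    errors: List[str] = []
    for name, generate in backends:
        try:
            return generate(prompt)
        except Exception as error:  # ImportError is covered as well
            errors.append(f"{name}: {error}")
    # Every backend failed: report all collected errors together.
    raise RuntimeError("No image backend succeeded: " + "; ".join(errors))
```

Passing the backends as an ordered list keeps the priority (torch/diffusers first, then sd.cpp) in one place.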
- Fix /v1/models API to show cached file paths instead of URLs for image models · 3b527c5a
  Your Name authored Mar 14, 2026
  
  3b527c5a
- Fix model_key to use cached path instead of URL · f2d4e4c2
  Your Name authored Mar 14, 2026
  
  f2d4e4c2
- Fix syntax error from GGML_VK_VISIBLE_DEVICES removal · 95fc9855
  Your Name authored Mar 14, 2026
  
  95fc9855
- Remove all GGML_VK_VISIBLE_DEVICES environment variable handling - user sets it externally · 895e94ca
  Your Name authored Mar 14, 2026
  
  895e94ca
- Remove GGML_VK_VISIBLE_DEVICES env var setting, user will set it externally · a4674d60
  Your Name authored Mar 14, 2026
  
  a4674d60
- Add debug for GPU device and environment in whisper-server · 881f2d46
  Your Name authored Mar 14, 2026
  
  881f2d46
- Add debug output for whisper-server startup and audio requests · 73d753e3
  Your Name authored Mar 14, 2026
  
  73d753e3
- Check if whisper-server already running before creating new instance · b037bfa8
  Your Name authored Mar 14, 2026
  
  b037bfa8
- Add debug for models in manager before audio setup · 30412d42
  Your Name authored Mar 14, 2026
  
  30412d42
- Add backend to image_models debug · 24b4e779
  Your Name authored Mar 14, 2026
  
  24b4e779