- 15 Mar, 2026 1 commit
-
-
Your Name authored
- Add python-multipart to requirements.txt, requirements-nvidia.txt, requirements-vulkan.txt
- Add llama-cpp-python to requirements-nvidia.txt for GGUF support
- When using the CUDA/nvidia backend with a GGUF file, automatically use llama-cpp-python
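The loader-selection rule above can be sketched as follows; `pick_llm_loader` and the returned loader names are illustrative stand-ins, not the project's actual identifiers:

```python
from pathlib import Path

def pick_llm_loader(model_path: str, backend: str) -> str:
    """Choose a loader for a model file (hypothetical helper).

    Mirrors the commit's rule: GGUF files on the CUDA/nvidia
    backend go through llama-cpp-python; everything else uses
    the default loader.
    """
    is_gguf = Path(model_path).suffix.lower() == ".gguf"
    if backend in ("cuda", "nvidia") and is_gguf:
        return "llama-cpp-python"
    return "default"
```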
-
- 14 Mar, 2026 39 commits
-
-
Your Name authored
- Add --vision-model for image/video-to-text models
- When --file-path is set, return a URL by default, base64 only if explicitly requested
- Add --https flag with auto-certificate generation
- Add --privkey and --pubkey for custom certificates
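The URL-by-default response rule can be sketched like this; `file_response` and its parameters are hypothetical names used only for illustration:

```python
import base64

def file_response(url: str, data: bytes, want_base64: bool) -> dict:
    """Build a generation response (hypothetical helper).

    Mirrors the commit's rule when --file-path is set: return a
    URL by default, and inline base64 only on explicit request.
    """
    if want_base64:
        return {"b64_json": base64.b64encode(data).decode("ascii")}
    return {"url": url}
```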
-
Your Name authored
- Add --image-cfg-scale CLI option (default 1.0)
- Add get_cfg_scale() helper that auto-detects VRAM
- If the backend is Vulkan and VRAM < 16GB, use cfg_scale=1.0 automatically
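The `get_cfg_scale()` rule can be sketched as below. The real helper auto-detects VRAM; here it is passed in as a parameter, and the signature is an assumption, not the project's actual one:

```python
def get_cfg_scale(backend: str, vram_gb: float, cli_default: float = 1.0) -> float:
    """Pick a cfg_scale (sketch of the commit's rule).

    On Vulkan with under 16 GB of VRAM, force cfg_scale to 1.0;
    otherwise honor the --image-cfg-scale CLI value.
    """
    if backend == "vulkan" and vram_gb < 16:
        return 1.0
    return cli_default
```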
-
Your Name authored
- Add --image-backend, --audio-backend, --tts-backend CLI args
- Add opencl to the backend choices
- Add an OpenCL build target in build.sh
-
Your Name authored
- When --debug is enabled, show the full command line coderai was called with
- Add --nopreload flag to disable model preloading at startup
- When --nopreload is not specified, skip checking for preloaded sd.cpp models (forces the load into a worker thread to avoid Vulkan context issues)
- Fix image model preloading to respect the --nopreload flag
-
Your Name authored
This helps prevent the GPU from staying at 100% utilization after image generation by allowing the Vulkan driver to transition from compute state to idle state.
-
Your Name authored
- Use keep_clip_on_cpu instead of clip_on_cpu
- VAE tiling is handled internally in newer builds
- Add n_threads using psutil.cpu_count() for optimal performance
-
Your Name authored
- Added --vae-tiling flag to enable VAE tiling for lower VRAM usage
- Added --clip-on-cpu flag to run CLIP on the CPU to save VRAM
- Both options work with stable-diffusion-cpp-python
-
Your Name authored
- Added --image-seed argument to set a default seed for image generation
- Updated the diffusers and sd.cpp code to use the request seed or the CLI default seed
- Priority: request seed > CLI default seed > random
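The seed-priority chain can be sketched as follows; `resolve_seed` is a hypothetical name, not the project's actual function:

```python
import random

def resolve_seed(request_seed, cli_seed):
    """Resolve the effective seed (sketch of the commit's rule).

    Priority: request seed > CLI default (--image-seed) > random.
    """
    if request_seed is not None:
        return request_seed
    if cli_seed is not None:
        return cli_seed
    return random.randint(0, 2**32 - 1)
```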
-
Your Name authored
- Added model_key initialization before sd.cpp loading in the on-demand section
- Added model_key assignment before adding the model to the manager
-
Your Name authored
- Fixed a model_key variable scope issue in the GGUF->sd.cpp fallback
- Fixed model_path being undefined in the diffusers preloading section
- These fixes prevent startup crashes when using --loadall
-
Your Name authored
- Reordered the image generation backend priority to try torch/diffusers first
- If torch/diffusers fails (ImportError or another error), fall back to stable-diffusion-cpp-python
- If both backends fail, return a helpful error message with installation instructions
- Added dynamic loading of the sd.cpp model if not pre-loaded
-
Your Name authored
- Add debug output to trace audio model registration at startup
- Add debug output when the audio endpoint checks for audio_model
- Fix the global load_mode to be updated at startup based on the --loadall/--loadswap flags
-
Your Name authored
- Fix missing indentation in the async with semaphore block
- Fix invalid elif syntax in the load_mode determination
- Fix request.steps reference (the field doesn't exist in the request model)
-
Your Name authored
- Without --loadall: serialize all requests (one at a time)
- With --loadall: allow one concurrent request per model
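The concurrency rule above can be sketched with per-model semaphores; `make_semaphores` is a hypothetical helper, not the project's actual code:

```python
import asyncio

def make_semaphores(loadall: bool, model_names):
    """Map each model to a semaphore (sketch of the commit's rule).

    Without --loadall: every model shares ONE global slot, so all
    requests are serialized. With --loadall: each model gets its
    own slot, allowing one concurrent request per model.
    """
    if loadall:
        return {name: asyncio.Semaphore(1) for name in model_names}
    shared = asyncio.Semaphore(1)
    return {name: shared for name in model_names}
```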
-