- 15 Mar, 2026 1 commit
-
-
Your Name authored
- Add python-multipart to requirements.txt, requirements-nvidia.txt, requirements-vulkan.txt
- Add llama-cpp-python to requirements-nvidia.txt for GGUF support
- When using the CUDA/nvidia backend with a GGUF file, automatically use llama-cpp-python
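The loader-selection rule above can be sketched as follows; `pick_llm_loader` and the returned loader names are illustrative stand-ins, not the project's actual identifiers:

```python
from pathlib import Path

def pick_llm_loader(model_path: str, backend: str) -> str:
    """Choose a loader for a model file (hypothetical helper).

    Mirrors the commit's rule: GGUF files on the CUDA/nvidia
    backend go through llama-cpp-python; everything else uses
    the default loader.
    """
    is_gguf = Path(model_path).suffix.lower() == ".gguf"
    if backend in ("cuda", "nvidia") and is_gguf:
        return "llama-cpp-python"
    return "default"
```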
-
- 14 Mar, 2026 39 commits
-
-
Your Name authored
- Add --vision-model for image/video-to-text models
- When --file-path is set, return a URL by default, base64 only if explicitly requested
- Add --https flag with auto-certificate generation
- Add --privkey and --pubkey for custom certificates
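The URL-by-default response rule can be sketched like this; `file_response` and its parameters are hypothetical names used only for illustration:

```python
import base64

def file_response(url: str, data: bytes, want_base64: bool) -> dict:
    """Build a generation response (hypothetical helper).

    Mirrors the commit's rule when --file-path is set: return a
    URL by default, and inline base64 only on explicit request.
    """
    if want_base64:
        return {"b64_json": base64.b64encode(data).decode("ascii")}
    return {"url": url}
```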
-
Your Name authored
- Add --image-cfg-scale CLI option (default 1.0)
- Add get_cfg_scale() helper that auto-detects VRAM
- If the backend is Vulkan and VRAM < 16GB, use cfg_scale=1.0 automatically
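The `get_cfg_scale()` rule can be sketched as below. The real helper auto-detects VRAM; here it is passed in as a parameter, and the signature is an assumption, not the project's actual one:

```python
def get_cfg_scale(backend: str, vram_gb: float, cli_default: float = 1.0) -> float:
    """Pick a cfg_scale (sketch of the commit's rule).

    On Vulkan with under 16 GB of VRAM, force cfg_scale to 1.0;
    otherwise honor the --image-cfg-scale CLI value.
    """
    if backend == "vulkan" and vram_gb < 16:
        return 1.0
    return cli_default
```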
-
Your Name authored
- Add --image-backend, --audio-backend, --tts-backend CLI args
- Add opencl to the backend choices
- Add an OpenCL build target in build.sh
-
Your Name authored
- When --debug is enabled, show the full command line coderai was called with
- Add --nopreload flag to disable model preloading at startup
- When --nopreload is not specified, skip checking for preloaded sd.cpp models (forces the load into a worker thread to avoid Vulkan context issues)
- Fix image model preloading to respect the --nopreload flag
-
Your Name authored
This helps prevent the GPU from staying at 100% utilization after image generation by allowing the Vulkan driver to transition from compute state to idle state.
-
Your Name authored
- Use keep_clip_on_cpu instead of clip_on_cpu
- VAE tiling is handled internally in newer builds
- Add n_threads using psutil.cpu_count() for optimal performance
-
Your Name authored
- Added --vae-tiling flag to enable VAE tiling for lower VRAM usage
- Added --clip-on-cpu flag to run CLIP on the CPU to save VRAM
- Both options work with stable-diffusion-cpp-python
-
Your Name authored
- Added --image-seed argument to set a default seed for image generation
- Updated the diffusers and sd.cpp code to use the request seed or the CLI default seed
- Priority: request seed > CLI default seed > random
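The seed-priority chain can be sketched as follows; `resolve_seed` is a hypothetical name, not the project's actual function:

```python
import random

def resolve_seed(request_seed, cli_seed):
    """Resolve the effective seed (sketch of the commit's rule).

    Priority: request seed > CLI default (--image-seed) > random.
    """
    if request_seed is not None:
        return request_seed
    if cli_seed is not None:
        return cli_seed
    return random.randint(0, 2**32 - 1)
```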
-
Your Name authored
- Added model_key initialization before sd.cpp loading in the on-demand section
- Added model_key assignment before adding the model to the manager
-
Your Name authored
- Fixed a model_key variable scope issue in the GGUF->sd.cpp fallback
- Fixed model_path being undefined in the diffusers preloading section
- These fixes prevent startup crashes when using --loadall
-
Your Name authored
- Reordered the image generation backend priority to try torch/diffusers first
- If torch/diffusers fails (ImportError or another error), fall back to stable-diffusion-cpp-python
- If both backends fail, return a helpful error message with installation instructions
- Added dynamic loading of the sd.cpp model if not pre-loaded
-
Your Name authored
- Add debug output to trace audio model registration at startup
- Add debug output when the audio endpoint checks for audio_model
- Fix the global load_mode to be updated at startup based on the --loadall/--loadswap flags
-
Your Name authored
- Fix missing indentation in the async with semaphore block
- Fix invalid elif syntax in the load_mode determination
- Fix request.steps reference (the field doesn't exist in the request model)
-
Your Name authored
- Without --loadall: serialize all requests (one at a time)
- With --loadall: allow one concurrent request per model
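The concurrency rule above can be sketched with per-model semaphores; `make_semaphores` is a hypothetical helper, not the project's actual code:

```python
import asyncio

def make_semaphores(loadall: bool, model_names):
    """Map each model to a semaphore (sketch of the commit's rule).

    Without --loadall: every model shares ONE global slot, so all
    requests are serialized. With --loadall: each model gets its
    own slot, allowing one concurrent request per model.
    """
    if loadall:
        return {name: asyncio.Semaphore(1) for name in model_names}
    shared = asyncio.Semaphore(1)
    return {name: shared for name in model_names}
```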
-