Files · 7947fb752625cdd68a570f7bcc4cccb0180d33e1 · nexlab / coderai

Use llama.cpp's create_chat_completion for proper chat template handling · 7947fb75

Stefy Lanza (nextime / spora) authored Feb 28, 2026

- Add generate_chat() and generate_chat_stream() methods to VulkanBackend
- These call create_chat_completion(), which applies the model's own chat template
- Fall back to manual formatting if create_chat_completion() fails
- Update API endpoints to pass the messages dict directly instead of a pre-formatted prompt
- Fixes garbled output with Qwen3 and other models that use custom chat templates
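
The approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes an `llm` object exposing llama-cpp-python's `create_chat_completion()` and `create_completion()` methods, and the helper `format_messages_manually()` with its plain `role: content` layout is a hypothetical stand-in for whatever manual formatting the backend really uses.

```python
from typing import Any, Dict, Iterator, List

Message = Dict[str, str]


def format_messages_manually(messages: List[Message]) -> str:
    """Hypothetical fallback formatter: one 'role: content' line per message."""
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    lines.append("assistant:")  # cue the model to answer as the assistant
    return "\n".join(lines)


def generate_chat(llm: Any, messages: List[Message], max_tokens: int = 256) -> str:
    """Prefer the model's own chat template; fall back to a manual prompt."""
    try:
        result = llm.create_chat_completion(messages=messages, max_tokens=max_tokens)
        return result["choices"][0]["message"]["content"]
    except Exception:
        # Assumption: any failure here means the chat template path is unusable.
        prompt = format_messages_manually(messages)
        result = llm.create_completion(prompt=prompt, max_tokens=max_tokens)
        return result["choices"][0]["text"]


def generate_chat_stream(
    llm: Any, messages: List[Message], max_tokens: int = 256
) -> Iterator[str]:
    """Streaming variant: yields text fragments as they arrive."""
    try:
        chunks = llm.create_chat_completion(
            messages=messages, max_tokens=max_tokens, stream=True
        )
    except Exception:
        # Only errors raised when the call is made are caught in this sketch;
        # errors raised later, during iteration, would propagate to the caller.
        prompt = format_messages_manually(messages)
        for chunk in llm.create_completion(
            prompt=prompt, max_tokens=max_tokens, stream=True
        ):
            yield chunk["choices"][0]["text"]
        return
    for chunk in chunks:
        # Chat stream chunks carry a 'delta'; the first one may hold only the role.
        text = chunk["choices"][0].get("delta", {}).get("content")
        if text:
            yield text
```

Passing the message list through unchanged lets the template stored with the model decide the role markers and special tokens, which is why a hand-built prompt can produce garbled output on models with custom templates.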


Name
__pycache__
LICENSE.md
README.md
build.sh
coder
coderai
requirements-nvidia.txt
requirements-vulkan.txt
requirements.txt

README.md