• Stefy Lanza (nextime / spora )'s avatar
    Use llama.cpp's create_chat_completion for proper chat template handling · 7947fb75
    Stefy Lanza (nextime / spora ) authored
    - Add generate_chat() and generate_chat_stream() methods to VulkanBackend
    - These use create_chat_completion() which properly applies model's chat template
    - Fallback to manual formatting if create_chat_completion fails
    - Update API endpoints to pass messages dict directly instead of formatted prompt
    - Fixes garbled output with Qwen3 and other models that use custom chat templates
    7947fb75
Name
Last commit
Last update
__pycache__ Loading commit data...
LICENSE.md Loading commit data...
README.md Loading commit data...
build.sh Loading commit data...
coder Loading commit data...
coderai Loading commit data...
requirements-nvidia.txt Loading commit data...
requirements-vulkan.txt Loading commit data...
requirements.txt Loading commit data...