• Stefy Lanza (nextime / spora )'s avatar
    vulkan: fold system role by template signal, not just architecture · 64eb74b7
    Stefy Lanza (nextime / spora ) authored
    Whether a model rejects the 'system' role is a property of the chat
    template baked into the specific GGUF, not the architecture: the gemma-2
    template and the official gemma template raise "System role not
    supported", while 'heretic' gemma4 quant conversions ship a permissive
    template that accepts system. Detect from the embedded
    tokenizer.chat_template (raise_exception/"system role") and fold only
    when it actually rejects system; fall back to architecture (Gemma) when
    no template is readable. Avoids needlessly folding permissive Gemma
    models while still covering gemma-2-9b and strict non-Gemma templates.
    The runtime "System role not supported" retry remains as a safety net.
    Co-Authored-By: 's avatarClaude Opus 4.8 <noreply@anthropic.com>
    64eb74b7
Name
Last commit
Last update
..
admin Loading commit data...
api Loading commit data...
backends Loading commit data...
broker Loading commit data...
frontproxy Loading commit data...
models Loading commit data...
openai Loading commit data...
pydantic Loading commit data...
queue Loading commit data...
tasks Loading commit data...
__init__.py Loading commit data...
cli.py Loading commit data...
config.py Loading commit data...
main.py Loading commit data...
platform_paths.py Loading commit data...