• Stefy Lanza (nextime / spora )'s avatar
    vulkan: fold system role by template signal, not just architecture · 64eb74b7
    Stefy Lanza (nextime / spora ) authored
    Whether a model rejects the 'system' role is a property of the chat
    template baked into the specific GGUF, not the architecture: the gemma-2
    template and the official gemma template raise "System role not
    supported", while 'heretic' gemma4 quant conversions ship a permissive
    template that accepts system. Detect from the embedded
    tokenizer.chat_template (raise_exception/"system role") and fold only
    when it actually rejects system; fall back to architecture (Gemma) when
    no template is readable. Avoids needlessly folding permissive Gemma
    models while still covering gemma-2-9b and strict non-Gemma templates.
    The runtime "System role not supported" retry remains as a safety net.
    Co-Authored-By: 's avatarClaude Opus 4.8 <noreply@anthropic.com>
    64eb74b7
Name
Last commit
Last update
..
__init__.py Loading commit data...
base.py Loading commit data...
cuda.py Loading commit data...
ds4.py Loading commit data...
vulkan.py Loading commit data...