-
Stefy Lanza (nextime / spora ) authored
Whether a model rejects the 'system' role is a property of the chat template baked into the specific GGUF, not the architecture: the gemma-2 template and the official gemma template raise "System role not supported", while 'heretic' gemma4 quant conversions ship a permissive template that accepts system. Detect from the embedded tokenizer.chat_template (raise_exception/"system role") and fold only when it actually rejects system; fall back to architecture (Gemma) when no template is readable. Avoids needlessly folding permissive Gemma models while still covering gemma-2-9b and strict non-Gemma templates. The runtime "System role not supported" retry remains as a safety net. Co-Authored-By:Claude Opus 4.8 <noreply@anthropic.com>
64eb74b7