vulkan: fold system message into user turn when template rejects it
Gemma's chat template has no 'system' role; llama.cpp raises "System
role not supported" and the generation fails (the Kilo client always
sends a system prompt). On that specific error, retry with the system
message(s) folded into the first user turn — Gemma's own convention,
and a no-op for models that accept system. Handles both streaming and
non-streaming paths and preserves multimodal (list) content.
Co-Authored-By:
Claude Opus 4.8 <noreply@anthropic.com>
Showing
Please
register
or
sign in
to comment