Stefy Lanza (nextime / spora) authored
GPTQModel silently leaves layers it can't map (e.g. gemma-4's fused batched MoE experts) in bf16, producing a near-full-size "checkpoint" that the loader would redirect to and then offload. The worker now scans the saved safetensors and, if <50% of large weight bytes are int-packed, deletes the output and marks the job failed (so it falls back to bitsandbytes) instead of reporting "done".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
c741ff5b
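
A minimal sketch of the kind of scan described above, not the worker's actual code. It reads only the JSON header of each `*.safetensors` shard (an 8-byte little-endian length followed by the header) and sums byte sizes by dtype. The function names, the `min_bytes` cutoff for what counts as a "large" tensor, and the set of dtypes treated as int-packed are assumptions made for illustration; only the 50% threshold comes from the commit message.

```python
import json
import struct
from pathlib import Path

# Assumption: any integer dtype counts as "int-packed" (e.g. packed qweight).
INT_DTYPES = {"I8", "U8", "I16", "U16", "I32", "U32", "I64", "U64"}


def read_header(path):
    """Return the tensor table from a safetensors file without loading data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def packed_fraction(checkpoint_dir, min_bytes=1 << 20):
    """Fraction of large-tensor bytes stored in an integer dtype.

    min_bytes is a hypothetical cutoff that skips small tensors such as
    norms and biases, which are normally left unquantized anyway.
    """
    packed = total = 0
    for shard in sorted(Path(checkpoint_dir).glob("*.safetensors")):
        for info in read_header(shard).values():
            start, end = info["data_offsets"]
            size = end - start
            if size < min_bytes:
                continue
            total += size
            if info["dtype"] in INT_DTYPES:
                packed += size
    return packed / total if total else 0.0


def looks_quantized(checkpoint_dir, threshold=0.5, min_bytes=1 << 20):
    """True if at least `threshold` of large weight bytes are int-packed."""
    return packed_fraction(checkpoint_dir, min_bytes) >= threshold
```

A caller could delete the output directory and mark the job failed when `looks_quantized(...)` returns `False`, which is the behavior the commit message describes.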