Stefy Lanza (nextime / spora) authored
New --offload_strategy balanced option
- Loads the model fully to GPU if it fits (with a 15% buffer)
- Only uses sequential offloading when VRAM is insufficient
- Maximizes GPU utilization while preventing OOM errors
e5c12b7f
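
The fit check described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `choose_placement`, its parameters, and the return values are hypothetical, and it assumes the 15% buffer means the model's size plus 15% headroom must fit in free VRAM (the message could also mean reserving 15% of VRAM).

```python
GIB = 1024 ** 3  # bytes per GiB


def choose_placement(model_bytes: int, free_vram_bytes: int, buffer: float = 0.15) -> str:
    """Pick a placement for the 'balanced' strategy (hypothetical helper).

    Assumption: the model "fits" when its size plus `buffer` headroom
    is no larger than the free VRAM.
    """
    required = model_bytes * (1.0 + buffer)
    if required <= free_vram_bytes:
        return "gpu"         # load the whole model onto the GPU
    return "sequential"      # fall back to sequential offloading


# A 10 GiB model on a GPU with 24 GiB free fits with headroom.
print(choose_placement(10 * GIB, 24 * GIB))
# A 22 GiB model needs about 25.3 GiB with the buffer, so it falls back.
print(choose_placement(22 * GIB, 24 * GIB))
```

In a real script the free VRAM figure would come from the runtime (for example `torch.cuda.mem_get_info()` in PyTorch), and the model size from its parameter count and dtype; both are left as plain integers here to keep the sketch self-contained.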