Add balanced offload strategy for better VRAM utilization
- New --offload_strategy balanced option
- Loads model fully to GPU if it fits (with 15% buffer)
- Only uses sequential offloading when VRAM is insufficient
- Maximizes GPU utilization while preventing OOM errors
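
The decision described above could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function name `choose_offload_mode`, the return labels, and the reading of "15% buffer" as "model size plus 15% must fit in free VRAM" are all assumptions. In practice the free VRAM figure might come from something like `torch.cuda.mem_get_info()`, but the sketch takes plain byte counts so it runs without a GPU.

```python
GIB = 2**30  # bytes per gibibyte


def choose_offload_mode(
    model_bytes: int,
    free_vram_bytes: int,
    buffer_fraction: float = 0.15,
) -> str:
    """Pick a loading mode for the (hypothetical) balanced strategy.

    Returns "full_gpu" when the model plus a safety buffer fits in free
    VRAM, otherwise "sequential". The buffer is interpreted here as a
    fraction of the model size; the actual commit may define it differently.
    """
    if model_bytes < 0 or free_vram_bytes < 0:
        raise ValueError("sizes must be non-negative")
    if buffer_fraction < 0:
        raise ValueError("buffer_fraction must be non-negative")

    # Require headroom on top of the raw model size to reduce OOM risk.
    required_bytes = model_bytes * (1.0 + buffer_fraction)
    if required_bytes <= free_vram_bytes:
        return "full_gpu"
    return "sequential"
```

Under these assumptions, a 10 GiB model needs 11.5 GiB free: `choose_offload_mode(10 * GIB, 12 * GIB)` selects full GPU loading, while `choose_offload_mode(10 * GIB, 11 * GIB)` falls back to sequential offloading.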