Fix offload-strategy parameter passing to CUDA backend

- Add offload_strategy to kwargs in _load_default_model and _load_model_by_name
- Fix parameter name: ram -> manual_ram_gb to match backend expectation
- Also pass load_in_4bit, load_in_8bit, and max_gpu_percent
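
For illustration, a minimal sketch of the kind of kwargs construction this change describes. The helper name build_backend_kwargs, its signature, and the example values are hypothetical and not taken from the codebase; only the keyword names (offload_strategy, manual_ram_gb, load_in_4bit, load_in_8bit, max_gpu_percent) come from this commit.

```python
from typing import Any, Dict, Optional


def build_backend_kwargs(
    offload_strategy: Optional[str] = None,
    ram: Optional[float] = None,
    load_in_4bit: bool = False,
    load_in_8bit: bool = False,
    max_gpu_percent: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: gather the loader options into backend kwargs."""
    kwargs: Dict[str, Any] = {
        "load_in_4bit": load_in_4bit,
        "load_in_8bit": load_in_8bit,
    }
    if offload_strategy is not None:
        kwargs["offload_strategy"] = offload_strategy
    if ram is not None:
        # The caller-side value is named ram, but it is forwarded under
        # manual_ram_gb, the name this commit says the backend expects.
        kwargs["manual_ram_gb"] = ram
    if max_gpu_percent is not None:
        kwargs["max_gpu_percent"] = max_gpu_percent
    return kwargs


# Example values are placeholders, not documented settings.
kwargs = build_backend_kwargs(offload_strategy="manual", ram=16, max_gpu_percent=80)
```

Building the dictionary in one place would let both _load_default_model and _load_model_by_name forward the same set of options, which is the inconsistency this commit addresses.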