    Fix offload-strategy parameter passing to CUDA backend · bf1d3f52
    Your Name authored
    - Add offload_strategy to kwargs in _load_default_model and _load_model_by_name
    - Fix parameter name: ram -> manual_ram_gb to match backend expectation
    - Also pass load_in_4bit, load_in_8bit, and max_gpu_percent
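    A minimal sketch of the kwargs assembly these bullets describe, not the actual code in manager.py. The helper name build_backend_kwargs, the defaults, and the choice to omit unset options are assumptions for illustration; only the key names (offload_strategy, manual_ram_gb, load_in_4bit, load_in_8bit, max_gpu_percent) come from the commit message.

    ```python
    from typing import Any, Dict, Optional


    def build_backend_kwargs(
        offload_strategy: Optional[str] = None,
        ram: Optional[float] = None,
        load_in_4bit: bool = False,
        load_in_8bit: bool = False,
        max_gpu_percent: Optional[float] = None,
    ) -> Dict[str, Any]:
        """Hypothetical helper: collect loader options under backend key names."""
        kwargs: Dict[str, Any] = {
            "load_in_4bit": load_in_4bit,
            "load_in_8bit": load_in_8bit,
        }
        # Forward optional settings only when the caller supplied them
        # (an assumption; the real loaders may always pass them).
        if offload_strategy is not None:
            kwargs["offload_strategy"] = offload_strategy
        if ram is not None:
            # The caller-side value is named ram, but the backend key is
            # manual_ram_gb, which is the rename this commit describes.
            kwargs["manual_ram_gb"] = ram
        if max_gpu_percent is not None:
            kwargs["max_gpu_percent"] = max_gpu_percent
        return kwargs
    ```

    Under this sketch, both _load_default_model and _load_model_by_name would build the dictionary the same way and unpack it into the backend call, so the two code paths cannot drift apart on key names.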
manager.py 63.4 KB