Fix balanced offload strategy VRAM estimation · 15e14a57
Stefy Lanza (nextime / spora) authored
    - Account for LoRA overhead (~4GB) in VRAM calculations
    - Add 30% inference overhead for activation memory
    - Use more conservative 70% threshold (was 85%)
    - Add OOM fallback to model CPU offload if GPU loading fails
    - Switch fallback from sequential to model offload for better performance
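The estimation described above can be sketched as a small helper. This is a hypothetical illustration of the commit's logic, not the actual code from the repository; the constant values come from the bullet points, while the function and variable names are invented for clarity:

```python
# Illustrative sketch of the balanced-offload VRAM check described in the
# commit message. Constants mirror the stated changes; names are hypothetical.
LORA_OVERHEAD_GB = 4.0      # approximate LoRA adapter overhead (~4 GB)
INFERENCE_OVERHEAD = 0.30   # 30% headroom for activation memory
VRAM_THRESHOLD = 0.70       # use at most 70% of total VRAM (was 85%)

def fits_on_gpu(model_size_gb: float, total_vram_gb: float) -> bool:
    """Return True if the model plus overheads fits within the VRAM budget."""
    estimated_gb = (model_size_gb + LORA_OVERHEAD_GB) * (1 + INFERENCE_OVERHEAD)
    return estimated_gb <= total_vram_gb * VRAM_THRESHOLD
```

If the check fails, or if loading still raises an out-of-memory error at runtime, the commit falls back to model-level CPU offload (e.g. diffusers' `enable_model_cpu_offload()`), which moves whole submodels between CPU and GPU and is faster than sequential offload.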
Repository files:

- static/
- templates/
- .gitignore
- EXAMPLES.md
- LICENSE.md
- README.md
- SKILL.md
- check_model.py
- check_pipelines.py
- debug_model_select.py
- logo.png
- requirements.txt
- screenshot.png
- videogen.py
- videogen_mcp_server.py
- videogen_models.json
- webapp.py