    Implement GPU prioritization and weight-based job distribution · 4f6f914d
    Stefy Lanza (nextime / spora ) authored
    - Added --weight parameter to client connections (default: 100)
    - Modified cluster master to prioritize GPU-enabled clients for job distribution
    - GPU clients take precedence over CPU-only clients within the same model-availability tier
    - When no worker has the required model loaded, GPU clients are still preferred as targets for model distribution
    - Client weights are combined with process weights for load balancing
    - Higher weight = more jobs assigned to that client
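
    A minimal sketch of how a `--weight` option with a default of 100 might be parsed on the client side, assuming an `argparse`-based command line. The function name `parse_client_args` and the parser description are hypothetical and are not taken from the repository; only the flag name and its default come from the commit message.

```python
import argparse


def parse_client_args(argv=None):
    """Parse a hypothetical client command line that accepts --weight."""
    parser = argparse.ArgumentParser(description="cluster client (illustrative)")
    parser.add_argument(
        "--weight",
        type=int,
        default=100,  # default stated in the commit message
        help="relative share of jobs this client should receive",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_client_args()
    print(f"client weight: {args.weight}")
```

    Passing `--weight 200` would, under this reading, ask the master to send this client roughly twice as many jobs as a client left at the default.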
    
    Job distribution priority:
    1. GPU clients with required model already loaded
    2. CPU clients with required model already loaded
    3. GPU clients (model will be sent)
    4. CPU clients (model will be sent)
    
    Within each category, clients are selected based on combined weight.
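
    The priority order above can be sketched as a two-step selection: find the best non-empty category, then draw a client from it in proportion to its combined weight. This is an illustration under stated assumptions, not the code from this commit: the `Client` class, `tier`, and `pick_client` are hypothetical names, the combined weight is assumed to be the sum of client and process weights (the commit message does not say how they are combined), and weighted random choice is one possible way to make a higher weight yield more jobs.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Client:
    """Hypothetical view of a connected worker client."""
    name: str
    has_gpu: bool
    models: set = field(default_factory=set)  # models already loaded
    client_weight: int = 100                  # value of --weight
    process_weight: int = 10                  # assumed per-process weight

    def combined_weight(self):
        # Assumption: the two weights are simply added together.
        return self.client_weight + self.process_weight


def tier(client, model):
    """Return the priority category (1 is best) for a client and model."""
    has_model = model in client.models
    if has_model and client.has_gpu:
        return 1  # GPU client with the model already loaded
    if has_model:
        return 2  # CPU client with the model already loaded
    if client.has_gpu:
        return 3  # GPU client, model will be sent
    return 4      # CPU client, model will be sent


def pick_client(clients, model, rng=random):
    """Pick a client from the best available category, weighted by load share."""
    if not clients:
        return None
    best = min(tier(c, model) for c in clients)
    candidates = [c for c in clients if tier(c, model) == best]
    # Weights are assumed positive; random.choices rejects an all-zero total.
    weights = [c.combined_weight() for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

    With one CPU client that already holds the model and one GPU client that does not, `pick_client` returns the CPU client, which matches categories 2 and 3 in the list above.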
Name
docs
templates
vidai
.gitignore
AI.PROMPT
CHANGELOG.md
Dockerfile.runpod
LICENSE
README.md
TODO.md
build.bat
build.sh
clean.bat
clean.sh
create_pod.sh
image.jpg
requirements-cuda.txt
requirements-rocm.txt
requirements.txt
setup.bat
setup.sh
start.bat
test_comm.py
test_runpod.py
vidai.py
vidai.sh