    Fix job execution in cluster: implement proper job assignment and result handling · 50a87bee
    Stefy Lanza (nextime / spora ) authored
    - Fixed process type mapping in queue manager ('analyze' -> 'analysis', 'train' -> 'training')
    - Implemented actual job sending in cluster master assign_job_to_worker()
    - Modified cluster client to forward jobs to local backend and monitor results
    - Added result polling mechanism for cluster jobs
    - Jobs should now execute on connected cluster workers instead of remaining queued
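
The process type fix in the first bullet can be sketched as a small lookup table. This is a minimal illustration, not the project's actual code: `PROCESS_TYPE_MAP` and `map_process_type` are hypothetical names, and only the two pairs listed in the commit message are taken from the source. Passing unknown types through unchanged is an assumption.

```python
# Hypothetical mapping from queued request types to worker process types.
# Only these two pairs are stated in the commit message.
PROCESS_TYPE_MAP = {
    "analyze": "analysis",
    "train": "training",
}


def map_process_type(request_type: str) -> str:
    """Return the worker process type for a queued request type."""
    # Assumption: unknown types are passed through unchanged.
    return PROCESS_TYPE_MAP.get(request_type, request_type)
```

A lookup like this keeps the queue manager and the workers agreeing on one set of names, which is the kind of mismatch that can leave jobs queued with no eligible worker.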
    
    The issue was that jobs were being assigned but never sent to workers. Now:
    1. Queue manager selects worker using VRAM-aware logic
    2. Cluster master assigns job and sends it via websocket
    3. Cluster client receives the job and forwards it to the local backend
    4. Cluster client polls the backend for results and sends them back to the master
    5. Results are properly returned to web interface
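
Step 4 above (polling the backend and returning the result) might look roughly like the following sketch. It is not taken from the repository: `poll_for_result`, `fetch_result`, `send_to_master`, the message fields, and the interval and timeout defaults are all hypothetical, and the real client presumably sends over its websocket connection rather than through a plain callback.

```python
import time
from typing import Callable, Optional


def poll_for_result(
    job_id: str,
    fetch_result: Callable[[str], Optional[dict]],
    send_to_master: Callable[[dict], None],
    interval: float = 1.0,
    timeout: float = 60.0,
) -> bool:
    """Poll the local backend until a result exists, then forward it.

    Returns True if a result was forwarded, False if the timeout expired.
    All names and the message layout here are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = fetch_result(job_id)  # None means "not finished yet"
        if result is not None:
            send_to_master(
                {"type": "job_result", "job_id": job_id, "result": result}
            )
            return True
        time.sleep(interval)
    return False
```

Injecting the two callables keeps the loop independent of the transport, so it can be exercised with a fake backend before it is wired to a live connection.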
Name
docs
templates
vidai
.gitignore
AI.PROMPT
CHANGELOG.md
Dockerfile.runpod
LICENSE
README.md
TODO.md
build.bat
build.sh
clean.bat
clean.sh
create_pod.sh
image.jpg
requirements-cuda.txt
requirements-rocm.txt
requirements.txt
setup.bat
setup.sh
start.bat
test_comm.py
test_runpod.py
vidai.conf.sample
vidai.py
vidai.sh