Improve GPU memory detection fallback chain
- Add nvidia-smi as intermediate fallback before PyTorch in GPU stats collection
- Fallback order: pynvml -> nvidia-smi -> PyTorch
- Applied to api.py, backend.py, and cluster_client.py GPU stats functions
- nvidia-smi provides accurate memory usage and utilization data
- Fix SocketCommunicator.receive_message() timeout parameter error
- Added optional timeout parameter to receive_message method
- Fixes 'unexpected keyword argument timeout' error in api_stats and backend functions
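
For illustration only, the pynvml -> nvidia-smi -> PyTorch order described above could be sketched roughly as follows. This is not the code from api.py, backend.py, or cluster_client.py: the function names (`collect_gpu_stats`, `stats_from_pynvml`, `stats_from_nvidia_smi`, `stats_from_torch`, `parse_nvidia_smi_csv`) and the returned dict keys are hypothetical, and the PyTorch step is assumed to report memory only, with no utilization figure.

```python
import shutil
import subprocess

MIB = 2**20


def parse_nvidia_smi_csv(output):
    """Parse 'memory.used, memory.total, utilization.gpu' CSV rows (no header, no units)."""
    stats = []
    for line in output.strip().splitlines():
        used, total, util = (field.strip() for field in line.split(","))
        # int() raises ValueError on values such as "[N/A]"; the chain treats that as a failure.
        stats.append({"memory_used_mb": int(used),
                      "memory_total_mb": int(total),
                      "utilization_pct": int(util)})
    return stats


def stats_from_pynvml():
    import pynvml  # imported lazily so a missing package only fails this step
    pynvml.nvmlInit()
    try:
        stats = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            stats.append({"memory_used_mb": mem.used // MIB,
                          "memory_total_mb": mem.total // MIB,
                          "utilization_pct": util.gpu})
        return stats
    finally:
        pynvml.nvmlShutdown()


def stats_from_nvidia_smi():
    if shutil.which("nvidia-smi") is None:
        raise RuntimeError("nvidia-smi not found on PATH")
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True, timeout=5,
    )
    return parse_nvidia_smi_csv(result.stdout)


def stats_from_torch():
    import torch  # imported lazily for the same reason as pynvml
    if not torch.cuda.is_available():
        raise RuntimeError("CUDA is not available to PyTorch")
    stats = []
    for index in range(torch.cuda.device_count()):
        free, total = torch.cuda.mem_get_info(index)
        # Assumption: no utilization figure is gathered in this last-resort step.
        stats.append({"memory_used_mb": (total - free) // MIB,
                      "memory_total_mb": total // MIB,
                      "utilization_pct": None})
    return stats


def collect_gpu_stats(providers=None):
    """Try each provider in order; return (source_name, stats) from the first that succeeds."""
    if providers is None:
        providers = [("pynvml", stats_from_pynvml),
                     ("nvidia-smi", stats_from_nvidia_smi),
                     ("pytorch", stats_from_torch)]
    for name, provider in providers:
        try:
            stats = provider()
        except Exception:
            continue  # any failure moves on to the next fallback
        if stats:
            return name, stats
    return None, []
```

A caller would use `source, stats = collect_gpu_stats()`. When no provider succeeds, `source` is `None` and `stats` is an empty list.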