- 07 Oct, 2025 (40 commits)
-
Stefy Lanza (nextime / spora ) authored
- Store connected_at as a proper UTC timestamp in the database using FROM_UNIXTIME/datetime (see the sketch below)
- Update the web interface to handle datetime objects and timestamps correctly
- Ensure uptime starts from the actual connection time, not offset by the timezone
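A minimal sketch of the two storage paths named above, assuming a MySQL/MariaDB table called cluster_clients (the table name is an assumption; only the connected_at column appears in these commits):

```python
import time
from datetime import datetime, timezone

def record_connection(cursor, token: str) -> None:
    # MySQL/MariaDB path: let the server convert the Unix epoch to a DATETIME.
    # FROM_UNIXTIME uses the session time zone, so the session should run in
    # UTC for this to land as a true UTC timestamp.
    cursor.execute(
        "UPDATE cluster_clients SET connected_at = FROM_UNIXTIME(%s) WHERE token = %s",
        (int(time.time()), token),
    )

def utc_now() -> datetime:
    # datetime path: pass an explicit UTC value instead of relying on the
    # database's CURRENT_TIMESTAMP, which may be evaluated in local time.
    return datetime.now(timezone.utc)
```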
-
Stefy Lanza (nextime / spora ) authored
- Fix cluster master to properly detect GPU backends from client capabilities
- Extract available_backends from capabilities list instead of gpu_info
- Ensure clients with GPU workers are correctly identified as GPU-enabled
-
Stefy Lanza (nextime / spora ) authored
- Fix cluster client uptime calculation to start from 0 by using explicit UTC timestamps
- Fix cluster client workers not showing by populating cluster_master.clients dictionary
- Ensure connected_at uses current UTC time instead of database CURRENT_TIMESTAMP
-
Stefy Lanza (nextime / spora ) authored
- Get real worker process information from cluster master instead of placeholder data
- Display correct number of workers and their actual backends
- Improve accuracy of cluster node statistics
-
Stefy Lanza (nextime / spora ) authored
- Update connected_at timestamp on each successful connection
- Uptime now resets to 00:00:00 on reconnections as expected
-
Stefy Lanza (nextime / spora ) authored
- Extract real client IP address from websocket connection (see the sketch below)
- Preserve connected_at timestamp for accurate uptime calculation
- Send full GPU device info from client to master for proper VRAM reporting
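A small sketch of reading the peer address from the websockets connection object; remote_address is a (host, port) tuple on the connection:

```python
def client_ip(websocket) -> str:
    # websockets exposes the TCP peer as a (host, port) tuple on the connection,
    # so the real client address can be stored instead of a placeholder.
    host, _port = websocket.remote_address
    return host
```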
-
Stefy Lanza (nextime / spora ) authored
- Handle timestamp strings in sorting function (see the sketch below)
- Parse ISO timestamp strings to numeric values for proper sorting
- Prevent TypeError when sorting by last_seen timestamp
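A sketch of a tolerant sort key, assuming node records come back either as epoch numbers, datetime objects, or ISO-8601 strings:

```python
from datetime import datetime

def last_seen_key(node: dict) -> float:
    # Normalise last_seen to a float so mixed types never hit a TypeError
    # inside sorted()/list.sort().
    value = node.get("last_seen") or 0
    if isinstance(value, datetime):
        return value.timestamp()
    if isinstance(value, str):
        return datetime.fromisoformat(value).timestamp()
    return float(value)

# usage: nodes.sort(key=last_seen_key, reverse=True)
```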
-
Stefy Lanza (nextime / spora ) authored
- Add ALTER TABLE to create connected_at column in existing databases (see the sketch below)
- Handle migration gracefully for databases created before schema update
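A minimal migration sketch along these lines, assuming MySQL/MariaDB and a cluster_clients table (the table name is an assumption); a duplicate-column error is treated as "already migrated":

```python
def ensure_connected_at_column(cursor) -> None:
    try:
        cursor.execute(
            "ALTER TABLE cluster_clients ADD COLUMN connected_at DATETIME NULL"
        )
    except Exception as exc:  # exact exception class depends on the DB driver
        # Databases created after the schema update already have the column;
        # swallow only the duplicate-column error and re-raise anything else.
        if "duplicate column" not in str(exc).lower():
            raise
```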
-
Stefy Lanza (nextime / spora ) authored
- Move subprocess import to function scope to avoid UnboundLocalError (see the sketch below)
- Ensure subprocess is available for fallback GPU detection
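The underlying issue is a Python scoping rule: an import statement anywhere inside a function makes that name local for the entire function, so any path that reads the name before the import line raises UnboundLocalError even when a module-level import exists. A minimal illustration (the function names are made up):

```python
import subprocess  # module-level import alone does not help below

def detect_gpus_buggy(use_fallback: bool):
    if use_fallback:
        import subprocess  # binds `subprocess` as a local for the WHOLE function
        return subprocess.run(["nvidia-smi", "-L"], capture_output=True)
    # Non-fallback path: the local name was never bound -> UnboundLocalError
    return subprocess.run(["rocm-smi"], capture_output=True)

def detect_gpus_fixed(use_fallback: bool):
    import subprocess  # bound unconditionally at function scope
    cmd = ["nvidia-smi", "-L"] if use_fallback else ["rocm-smi"]
    return subprocess.run(cmd, capture_output=True)
```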
-
Stefy Lanza (nextime / spora ) authored
- Add connected_at timestamp to track when client first connected
- Calculate uptime from connection time instead of last seen time (see the sketch below)
- Update database schema and API to use proper uptime tracking
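A small sketch of the uptime calculation, assuming connected_at is read back as a UTC datetime:

```python
from datetime import datetime, timezone

def uptime_seconds(connected_at: datetime) -> float:
    # Compare a UTC connected_at against a UTC "now" so uptime starts at zero
    # at connection time rather than being offset by the local time zone or
    # tied to the last-seen heartbeat.
    if connected_at.tzinfo is None:
        connected_at = connected_at.replace(tzinfo=timezone.utc)
    return (datetime.now(timezone.utc) - connected_at).total_seconds()
```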
-
Stefy Lanza (nextime / spora ) authored
- Update detect_gpu_backends to collect actual VRAM for each GPU device
- Store device info including VRAM in gpu_info sent to master
- Use real VRAM data in cluster nodes API instead of hardcoded values
- Ensure consistent VRAM reporting between master and clients
-
Stefy Lanza (nextime / spora ) authored
- Use green background color to indicate connected nodes
- Remove status text column for cleaner interface
- Update colspan values for table messages
-
Stefy Lanza (nextime / spora ) authored
- Store cluster client info in database for persistence
- Update API to read connected clients from database
- Maintain compatibility with existing web interface
-
Stefy Lanza (nextime / spora ) authored
- Use _get_client_by_websocket instead of non-existent _get_client_by_socket
- Fixes client connection error during process registration
-
Stefy Lanza (nextime / spora ) authored
- Client connects to wss://host:port instead of wss://host:port/cluster
- Fixes connection loop issue
-
Stefy Lanza (nextime / spora ) authored
- Client now attempts to reconnect if connection is lost (see the sketch below)
- Prevents processes from being restarted on reconnection
- Maintains persistent cluster node operation
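A minimal reconnect-loop sketch; authenticate() and handle_messages() are hypothetical helpers, and worker processes are assumed to be started once before the loop so a dropped connection never restarts them:

```python
import asyncio
import websockets

async def run_client(uri: str) -> None:
    # Workers were already launched; only the websocket link is re-established.
    while True:
        try:
            async with websockets.connect(uri) as ws:
                await authenticate(ws)       # hypothetical helper
                await handle_messages(ws)    # hypothetical helper
        except (websockets.ConnectionClosed, OSError):
            await asyncio.sleep(5)           # back off, then try again
```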
-
Stefy Lanza (nextime / spora ) authored
- Correct the dict comprehension for registering processes with master
- Fix duplicate entries and incorrect model assignment
- Apply same fix to restart workers function
-
Stefy Lanza (nextime / spora ) authored
- Workers need a local backend to connect to even in client mode
- Add backend startup and readiness check for cluster clients
- Ensure proper cleanup on exit
-
Stefy Lanza (nextime / spora ) authored
- Ignore the generated cluster.crt and cluster.key certificates
- Remove committed certificates from the repository
-
Stefy Lanza (nextime / spora ) authored
- Remove 'path' parameter from _handle_client method (see the sketch below)
- Compatible with websockets 12+, which removed the path argument
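A sketch of the single-argument handler shape; the dispatcher is a hypothetical helper, and how (or whether) the request path is still exposed depends on the websockets version in use:

```python
class ClusterMaster:
    async def _handle_client(self, websocket):
        # Newer websockets releases invoke the handler with only the connection
        # object; the old second `path` parameter is no longer passed.
        async for message in websocket:
            await self._handle_message(websocket, message)  # hypothetical dispatcher
```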
-
Stefy Lanza (nextime / spora ) authored
- Modify ClusterMaster to accept host parameter
- Start cluster master in vidai.py when running as master
- Use --cluster-host and --cluster-port for websocket server binding (see the sketch below)
- Default to 0.0.0.0:5003 for cluster master
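A sketch of the binding and CLI flags described above, assuming an asyncio-based ClusterMaster whose constructor takes host and port (the constructor shape and method names are assumptions):

```python
import argparse
import asyncio
import websockets

class ClusterMaster:
    def __init__(self, host: str = "0.0.0.0", port: int = 5003):
        self.host, self.port = host, port

    async def _handle_client(self, websocket):
        await websocket.wait_closed()  # cluster protocol handling elided here

    async def serve_forever(self) -> None:
        # Bind the cluster websocket server to the configured address.
        async with websockets.serve(self._handle_client, self.host, self.port):
            await asyncio.Future()  # run until cancelled

def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--cluster-host", default="0.0.0.0")
    parser.add_argument("--cluster-port", type=int, default=5003)
    args = parser.parse_args()
    asyncio.run(ClusterMaster(args.cluster_host, args.cluster_port).serve_forever())

if __name__ == "__main__":
    main()
```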
-
Stefy Lanza (nextime / spora ) authored
- Rename 'flash' variable to 'flash_enabled' to avoid shadowing the flash() function (see the sketch below)
- Resolve TypeError when saving admin configuration
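A minimal illustration of the shadowing bug in a Flask view; the route and form field names are assumptions:

```python
from flask import Flask, flash, redirect, request

app = Flask(__name__)
app.secret_key = "change-me"  # required for flash()

@app.route("/admin/config", methods=["POST"])  # route path is an assumption
def admin_config():
    # Bug: writing `flash = request.form.get("flash") == "on"` rebinds the name,
    # so the flash(...) call below fails with a TypeError (a bool is not callable).
    flash_enabled = request.form.get("flash") == "on"  # fix: use a distinct name
    # ... persist flash_enabled with the rest of the configuration ...
    flash("Configuration saved", "success")
    return redirect("/admin")  # hypothetical admin index
```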
-
Stefy Lanza (nextime / spora ) authored
- Remove analysis_backend and training_backend fields from /admin/config
- These are now configured per worker in the cluster nodes interface
- Clean up unused imports and form processing
-
Stefy Lanza (nextime / spora ) authored
- Add missing set_* function imports to admin.py config route
- Resolve NameError when saving admin configuration
-
Stefy Lanza (nextime / spora ) authored
- Change container max-width to 95% for better use of screen space
- Maintain centered layout for the cluster nodes table
-
Stefy Lanza (nextime / spora ) authored
- Show individual select forms for every worker in the driver modal
- Update API to handle per-worker driver selection for local nodes
- Maintain compatibility with existing backend switching logic
-
Stefy Lanza (nextime / spora ) authored
- Add --config <file> argument to load config from custom path
- Modify config loader to use custom config file if specified
- Fix cluster nodes interface to only show available GPU backends for workers
- Differentiate between local and remote node driver selection
-
Stefy Lanza (nextime / spora ) authored
- Removed the 'Settings' link from the admin navigation menu
- Settings page route and template still exist but are no longer accessible from navbar
- Admin navbar now shows: Cluster Tokens, Cluster Nodes (no Settings)
-
Stefy Lanza (nextime / spora ) authored
- Added workers array to local node API response for modal population
- Fixed table colspan values to match 13 columns
- Removed debug console.log statements
- Modal should now open and show worker driver selection options
-
Stefy Lanza (nextime / spora ) authored
- Fixed JavaScript template literal issue preventing button clicks from working
- Changed from inline onclick with template variables to data attributes + event delegation
- Added event listener for .set-driver-btn class buttons
- Buttons now properly read hostname and token from data attributes
- Modal should now open when clicking Set Driver buttons
-
Stefy Lanza (nextime / spora ) authored
- Removed brand-specific filtering that only allowed NVIDIA GPUs
- Now detects any GPU that can actually perform CUDA or ROCm operations
- Functional test determines if a GPU should be included, not its brand
- GPUs are shown with correct system indices (Device 0, 1, etc.)
- AMD GPUs that support ROCm will be shown if functional
- CUDA GPUs from any vendor will be shown if functional
-
Stefy Lanza (nextime / spora ) authored
- Updated GPU VRAM detection to use torch.cuda.get_device_properties(i).total_memory / 1024**3 (see the sketch below)
- Same method as used in /api/stats endpoint for consistency
- Still filters out non-NVIDIA and non-functional GPUs
- Now shows correct VRAM amounts (e.g., 24GB for RTX 3090 instead of hardcoded 8GB)
- Fixed both worker-level and node-level GPU detection
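A sketch of the VRAM calculation named in the commit; total_memory is reported in bytes, so dividing by 1024**3 yields GiB:

```python
import torch

def total_vram_gb() -> float:
    # Same per-device property the commit cites for the /api/stats endpoint.
    if not torch.cuda.is_available():
        return 0.0
    return sum(
        torch.cuda.get_device_properties(i).total_memory / 1024**3
        for i in range(torch.cuda.device_count())
    )
```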
-
Stefy Lanza (nextime / spora ) authored
- Added debug output to see what CUDA device names are detected
- Will help identify why the AMD GPU is still being counted as a CUDA device
- Debug output shows device names and functional test results
- User can now see which devices PyTorch is detecting
-
Stefy Lanza (nextime / spora ) authored
- Modified detect_gpu_backends() to perform functional tests on GPUs (see the sketch below)
- CUDA detection now verifies devices can actually perform tensor operations
- ROCm detection now tests device functionality before counting
- Only NVIDIA GPUs are counted for CUDA, and only functional devices
- Prevents counting of non-working GPUs like old AMD cards misreported as CUDA
- Example: System with old AMD GPU (device 0) + working CUDA GPU (device 1) now correctly shows only the functional CUDA GPU
- Total VRAM calculation now reflects only actually usable GPUs
- Both PyTorch and nvidia-smi/rocm-smi detection paths updated
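A sketch of a functional-test pass over the visible CUDA devices, assuming a small tensor operation is enough to separate working GPUs from misreported ones:

```python
import torch

def functional_cuda_devices() -> list[int]:
    """Indices of CUDA devices that can actually execute a tensor operation."""
    working: list[int] = []
    if not torch.cuda.is_available():
        return working
    for i in range(torch.cuda.device_count()):
        try:
            t = torch.ones(8, device=f"cuda:{i}")
            torch.cuda.synchronize(i)
            if float((t * 2).sum()) == 16.0:
                working.append(i)
        except RuntimeError:
            # The device enumerates but cannot run kernels (e.g. an old card
            # misreported as CUDA-capable); leave it out of the count.
            continue
    return working
```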
-
Stefy Lanza (nextime / spora ) authored
- Modified local node GPU memory calculation to only count GPUs that are actually available for supported backends
- Previously counted all GPUs in system, now only counts CUDA GPUs if CUDA is available and ROCm GPUs if ROCm is available
- Fixes issue where unsupported GPUs (like old AMD GPUs without ROCm support) were incorrectly included in VRAM totals
- Example: System with old AMD GPU (8GB, no ROCm) and CUDA GPU (24GB) now correctly shows 24GB total instead of 32GB
- Ensures accurate GPU resource reporting in cluster nodes interface
-
Stefy Lanza (nextime / spora ) authored
- Modified modal to show individual GPU-requiring workers on each node
- Allow granular driver selection (CUDA/ROCm/CPU) for each worker subprocess
- Updated database schema to store driver preferences per worker (hostname + token + worker_name) (see the sketch below)
- Enhanced API to handle per-worker driver setting with form field parsing
- Added restart_client_worker method to cluster master for individual worker restarts
- Frontend now displays worker-specific driver selection controls in modal
- Maintains node-level table view while providing worker-level configuration
- Supports CPU-only nodes and mixed GPU/CPU worker configurations
- Backward compatible with existing single-driver preference system
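A minimal sketch of a per-worker preference table keyed by hostname + token + worker_name; the table name, column names, and MySQL/MariaDB dialect are assumptions:

```python
def ensure_worker_driver_table(cursor) -> None:
    cursor.execute(
        """
        CREATE TABLE IF NOT EXISTS worker_driver_prefs (
            hostname    VARCHAR(255) NOT NULL,
            token       VARCHAR(255) NOT NULL,
            worker_name VARCHAR(255) NOT NULL,
            driver      VARCHAR(16)  NOT NULL,  -- 'cuda', 'rocm' or 'cpu'
            PRIMARY KEY (hostname, token, worker_name)
        )
        """
    )
```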
-
Stefy Lanza (nextime / spora ) authored
- Fixed undefined variable 'local_gpu_backends' in api_cluster_nodes function
- Properly defined local_available_backends, local_gpu_backends, and local_cpu_backends
- Updated local node detection to show nodes with any available backends (GPU or CPU)
- Ensured CPU-only nodes are correctly identified and displayed
- Maintained backward compatibility with existing GPU-only node detection
-
Stefy Lanza (nextime / spora ) authored
- Removed GPU-only requirement for cluster client connections
- CPU-only clients can now join the cluster and run CPU-based workers
- Master accepts all clients regardless of GPU availability
- Nodes are properly marked as CPU-only when no GPUs are detected
- Driver selection modal supports CUDA, ROCm, and CPU backends
- Local and remote workers can use any available backend (GPU or CPU)
- Enhanced cluster flexibility for mixed hardware environments
- CPU nodes contribute to the cluster for CPU-only processing tasks
- Maintains backward compatibility with existing GPU-only workflows
- Clear node type identification in the cluster management interface
-
Stefy Lanza (nextime / spora ) authored
- Cluster clients now refuse to connect without GPU capabilities (CUDA/ROCm)
- Cluster master rejects authentication from clients without GPU backends
- Local master node only appears in the cluster nodes list if GPU backends are available
- Master already prevented launching local worker processes without GPUs
- Systems without GPUs cannot participate in distributed processing
- Clear error messages when GPU requirements are not met
- Maintains cluster integrity by ensuring all nodes contribute computational power
-
Stefy Lanza (nextime / spora ) authored
- Removed CPU option from driver selection (only CUDA/ROCm GPU drivers)
- Set CUDA as default driver selection when available
- Added available_gpu_backends field to node API responses
- Frontend dynamically populates driver options based on node's available GPUs
- API validation rejects non-GPU driver requests (see the sketch below)
- Cluster clients only accept CUDA/ROCm backend restart commands
- Improved user experience by showing only relevant driver options per node
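A minimal sketch of the validation as it stood in this commit (later commits above relax it to allow CPU); the route path, helpers, and field names are assumptions:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/api/cluster/set_driver", methods=["POST"])  # route path is an assumption
def api_set_driver():
    hostname = request.form.get("hostname", "")
    token = request.form.get("token", "")
    driver = request.form.get("driver", "").lower()
    node = lookup_node(hostname, token)  # hypothetical helper returning the node record
    # Only GPU drivers are accepted, and only those the node actually reports.
    if driver not in ("cuda", "rocm"):
        return jsonify({"error": "only CUDA or ROCm drivers are allowed"}), 400
    if driver not in node.get("available_gpu_backends", []):
        return jsonify({"error": f"{driver} is not available on {hostname}"}), 400
    request_worker_restart(hostname, token, driver)  # hypothetical helper
    return jsonify({"status": "ok"})
```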
-