Commit 745909f9 authored by Lisa

Add PROTOCOL.md documentation for WebSocket protocol specification

parent 1d066e88
# Hermes Node Protocol Specification
**Version:** 1.0
**Date:** 2026-04-29
**Purpose:** Reverse-connection node execution with permission model, compatible with existing OpenClaw `sexec.sh` scripts.
---
## Architecture Overview
```
┌────────────────────────────────────────────────────────────┐
│                       Hermes Gateway                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │  WebSocket Server (port 8765)                        │  │
│  │  - Accepts node connections                          │  │
│  │  - Authenticates via token                           │  │
│  │  - Maintains node registry                           │  │
│  │  - HTTP API for command submission                   │  │
│  └───────────────────────────┬──────────────────────────┘  │
│                              │ routes                      │
│  ┌───────────────────────────▼──────────────────────────┐  │
│  │  Command Router + Approval Engine                    │  │
│  │  - Matches commands to nodes                         │  │
│  │  - Handles "ask" list approval via user prompts      │  │
│  │  - Streams output back                               │  │
│  └──────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────┘
         ▲                                 │
         │ WebSocket (node-initiated)      │ HTTP/WS
         │                                 │
┌────────┴─────────────────────────────────┴─────────────────┐
│                     Remote Node Agent                      │
│  ┌──────────────────────────────────────────────────────┐  │
│  │  WebSocket Client → connects to gateway              │  │
│  │  - Auto-reconnect on disconnect                      │  │
│  │  - Heartbeat every 30s                               │  │
│  │  - Handles auth token                                │  │
│  └───────────────────────────┬──────────────────────────┘  │
│                              │ receives                    │
│  ┌───────────────────────────▼──────────────────────────┐  │
│  │  Command Executor (sexec wrapper)                    │  │
│  │  - Runs: /path/to/sexec.sh run --command ...         │  │
│  │  - Streams stdout/stderr back                        │  │
│  │  - Returns exit code                                 │  │
│  └──────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────┘
Permission System (sexec.sh + config.json) - reused exactly
```
---
## Message Protocol
All messages are JSON over WebSocket.
### 1. Node → Gateway: Registration
**On connect:**
```json
{
"type": "register",
"node_name": "sissy",
"version": "1.0",
"capabilities": ["exec", "sysinfo"],
"sexec_path": "/home/openclaw/.openclaw/skills/sexec/sexec.sh"
}
```
**Gateway response:**
```json
{
"type": "register_ack",
"status": "ok",
"node_id": "sissy",
"gateway_version": "1.0"
}
```
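As a rough illustration of the registration exchange, the Python sketch below builds the `register` message and checks the gateway's `register_ack`. The helper names `build_register` and `registration_accepted` are hypothetical and not part of the protocol; only the message fields come from the examples above.

```python
import json


def build_register(node_name, sexec_path, capabilities=("exec", "sysinfo"), version="1.0"):
    """Serialize a `register` message (hypothetical helper, fields per the spec example)."""
    return json.dumps({
        "type": "register",
        "node_name": node_name,
        "version": version,
        "capabilities": list(capabilities),
        "sexec_path": sexec_path,
    })


def registration_accepted(raw_ack):
    """Return True only if the gateway answered with register_ack and status "ok"."""
    ack = json.loads(raw_ack)
    return ack.get("type") == "register_ack" and ack.get("status") == "ok"
```

A node would send the string returned by `build_register` as its first frame and continue only if `registration_accepted` returns `True` for the reply.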
### 2. Node → Gateway: Heartbeat
**Every 30 seconds:**
```json
{
"type": "heartbeat",
"timestamp": 1714392000
}
```
**Gateway response:**
```json
{
"type": "heartbeat_ack",
"timestamp": 1714392000
}
```
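A minimal sketch of the heartbeat bookkeeping, assuming a node treats the link as dead once more than two intervals pass without a `heartbeat_ack`. That threshold and the function names are assumptions; the specification only fixes the 30-second interval.

```python
import time

HEARTBEAT_INTERVAL = 30  # seconds, as stated in the spec


def build_heartbeat(now=None):
    """Build a heartbeat message with a Unix timestamp in whole seconds."""
    ts = int(time.time() if now is None else now)
    return {"type": "heartbeat", "timestamp": ts}


def link_is_stale(last_ack_ts, now, interval=HEARTBEAT_INTERVAL, missed_allowed=2):
    """Assumed policy: stale when no ack arrived for more than `missed_allowed` intervals."""
    return (now - last_ack_ts) > interval * missed_allowed
```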
### 3. Gateway → Node: Execute Command
```json
{
"type": "exec",
"id": "cmd-a1b2c3d4",
"command": ["df", "-h"],
"timeout": 30,
"approved": false
}
```
**Fields:**
- `id`: Unique command ID for tracking
- `command`: Array of command + args (e.g., `["df", "-h"]`)
- `timeout`: Max execution time in seconds
- `approved`: If `true`, bypass "ask" list (user explicitly approved)
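The field rules above could be enforced on the node with a small validator. This is a sketch under assumptions: the function name `parse_exec` is hypothetical, and treating a missing `approved` field as `false` is a choice made here, not something the specification states.

```python
def parse_exec(msg):
    """Validate an `exec` message and return (id, command, timeout, approved)."""
    if msg.get("type") != "exec":
        raise ValueError("not an exec message")
    cmd_id = msg.get("id")
    if not isinstance(cmd_id, str) or not cmd_id:
        raise ValueError("id must be a non-empty string")
    command = msg.get("command")
    if (not isinstance(command, list) or not command
            or not all(isinstance(arg, str) for arg in command)):
        raise ValueError("command must be a non-empty array of strings")
    timeout = msg.get("timeout")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(timeout, bool) or not isinstance(timeout, (int, float)) or timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    approved = msg.get("approved", False)  # assumption: absent means not approved
    if not isinstance(approved, bool):
        raise ValueError("approved must be a boolean")
    return cmd_id, command, timeout, approved
```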
### 4. Node → Gateway: Command Output (streaming)
**Stdout chunk:**
```json
{
"type": "exec_output",
"id": "cmd-a1b2c3d4",
"stream": "stdout",
"data": "Filesystem Size Used Avail Use% Mounted on\n"
}
```
**Stderr chunk:**
```json
{
"type": "exec_output",
"id": "cmd-a1b2c3d4",
"stream": "stderr",
"data": "warning: deprecated option\n"
}
```
### 5. Node → Gateway: Command Complete
```json
{
"type": "exec_complete",
"id": "cmd-a1b2c3d4",
"exit_code": 0,
"duration_ms": 1234
}
```
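On the gateway side, the `exec_output` chunks and the final `exec_complete` can be stitched together per command `id`. The class below is an illustrative sketch; `OutputCollector` and its result layout are hypothetical names, not part of the protocol.

```python
from collections import defaultdict


class OutputCollector:
    """Accumulate streamed chunks per command id until exec_complete arrives."""

    def __init__(self):
        self._chunks = defaultdict(lambda: {"stdout": [], "stderr": []})
        self.results = {}

    def feed(self, msg):
        """Process one decoded message; return the final result when a command completes."""
        if msg["type"] == "exec_output":
            self._chunks[msg["id"]][msg["stream"]].append(msg["data"])
            return None
        if msg["type"] == "exec_complete":
            chunks = self._chunks.pop(msg["id"], {"stdout": [], "stderr": []})
            result = {
                "stdout": "".join(chunks["stdout"]),
                "stderr": "".join(chunks["stderr"]),
                "exit_code": msg["exit_code"],
            }
            self.results[msg["id"]] = result
            return result
        return None
```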
### 6. Node → Gateway: Approval Required
**When command matches "ask" list:**
```json
{
"type": "exec_approval_required",
"id": "cmd-a1b2c3d4",
"command": ["sudo", "gnt-instance", "stop", "prod-db"],
"reason": "Command matches ask pattern: 'sudo gnt-instance stop *'"
}
```
**Gateway forwards to user, then responds:**
```json
{
"type": "exec_approval_response",
"id": "cmd-a1b2c3d4",
"approved": true,
"add_to_allowlist": false
}
```
### 7. Node → Gateway: Command Denied
**When command matches "deny" list:**
```json
{
"type": "exec_denied",
"id": "cmd-a1b2c3d4",
"command": ["rm", "-rf", "/"],
"reason": "Command matches deny pattern: 'rm -rf /*'"
}
```
### 8. Gateway → Node: Disconnect
```json
{
"type": "disconnect",
"reason": "gateway_shutdown"
}
```
---
## Browser Control
**Optional capability** — requires Playwright on the node.
### Capability Registration
The node lists `browser_control` in `capabilities` when it registers:
```json
{
"type": "register",
"node_name": "sissy",
"version": "1.0",
"capabilities": ["exec", "sysinfo", "browser_control"],
"sexec_path": "/home/openclaw/.openclaw/skills/sexec/sexec.sh"
}
```
---
## Authentication
### Token-Based Auth
**Node config file** (`/etc/hermes-node/config.json`):
```json
{
"gateway_url": "ws://192.168.42.115:8765",
"node_name": "sissy",
"token": "node-sissy-secret-token-abc123",
"sexec_path": "/home/openclaw/.openclaw/skills/sexec/sexec.sh",
"reconnect_interval": 5,
"heartbeat_interval": 30
}
```
**WebSocket connection URL:**
```
ws://192.168.42.115:8765/nodes?token=node-sissy-secret-token-abc123
```
The gateway validates the token against its stored registry, which maps node names to tokens:
```json
{
"sissy": "node-sissy-secret-token-abc123",
"zeiss": "node-zeiss-secret-token-def456",
"spank": "node-spank-secret-token-ghi789"
}
```
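A minimal sketch of the token check, assuming the gateway extracts `token` from the connection URL's query string and looks it up in the registry. The function name `authenticate` is hypothetical. `hmac.compare_digest` is used so that comparison time does not depend on how many leading characters match.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def authenticate(url, registry):
    """Return the node name whose registered token matches the URL's token, else None."""
    query = parse_qs(urlsplit(url).query)
    token = query.get("token", [""])[0]
    matched = None
    # Compare against every entry so the loop does not exit early on a match.
    for name, expected in registry.items():
        if hmac.compare_digest(token.encode(), expected.encode()):
            matched = name
    return matched
```

For the example values above, `authenticate("ws://192.168.42.115:8765/nodes?token=node-sissy-secret-token-abc123", registry)` would resolve to `"sissy"`.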
---
## Gateway HTTP API
Used by the Hermes skill to submit commands to nodes.
### POST /nodes/{node_name}/exec
**Request:**
```json
{
"command": ["df", "-h"],
"timeout": 30,
"approved": false
}
```
**Response (streaming, one JSON object per line):**
```json
{"type": "stdout", "data": "Filesystem Size...\n"}
{"type": "stderr", "data": ""}
{"type": "exit", "code": 0}
```
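A client could consume this response as follows. The sketch assumes the body is newline-delimited JSON, as the example suggests, and that the `exit` event is the last one; `collect_exec_response` is a hypothetical helper name.

```python
import json


def collect_exec_response(lines):
    """Fold streamed events into (stdout, stderr, exit_code)."""
    stdout, stderr, exit_code = [], [], None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        event = json.loads(line)
        kind = event.get("type")
        if kind == "stdout":
            stdout.append(event["data"])
        elif kind == "stderr":
            stderr.append(event["data"])
        elif kind == "exit":
            exit_code = event["code"]
            break  # assumption: nothing follows the exit event
    return "".join(stdout), "".join(stderr), exit_code
```

Any iterable of text lines works as input, for example the line iterator of an HTTP client's streaming response.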
### GET /nodes
**Response:**
```json
{
"nodes": [
{
"name": "sissy",
"status": "connected",
"last_seen": 1714392000,
"uptime": 86400
},
{
"name": "zeiss",
"status": "connected",
"last_seen": 1714392005,
"uptime": 172800
}
]
}
```
### GET /nodes/{node_name}/status
**Response:**
```json
{
"name": "sissy",
"status": "connected",
"last_seen": 1714392000,
"uptime": 86400,
"version": "1.0",
"capabilities": ["exec", "sysinfo"]
}
```
---
## Security Model
### 1. No Gateway SSH Keys
- Gateway never stores SSH keys for nodes
- Gateway never initiates connections to nodes
- Nodes connect out to gateway (firewall-friendly)
### 2. Token Authentication
- Each node has a unique pre-shared token
- Tokens stored in `/etc/hermes-node/config.json` on node
- Tokens stored in gateway registry (file or DB)
### 3. Permission System (Reused from sexec)
- Each node keeps existing `sexec.sh` + `config.json`
- `allow` list: auto-execute
- `ask` list: require user approval
- `deny` list: reject immediately
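The three lists can be pictured as a simple classifier. This sketch is not the actual `sexec.sh` logic: the config layout, the shell-style glob matching via `fnmatch`, the precedence order (deny, then ask, then allow), and the fallback for unmatched commands are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def classify(command, config, default="ask"):
    """Return "deny", "ask", or "allow" for a command given as an argument list."""
    line = " ".join(command)
    # Assumed precedence: the most restrictive list wins.
    for decision in ("deny", "ask", "allow"):
        for pattern in config.get(decision, []):
            if fnmatchcase(line, pattern):
                return decision
    return default  # assumption: unmatched commands need approval


# Hypothetical config using the patterns quoted in the protocol examples.
EXAMPLE_CONFIG = {
    "allow": ["df *"],
    "ask": ["sudo gnt-instance stop *"],
    "deny": ["rm -rf /*"],
}
```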
### 4. Command Approval Flow
```
User → Hermes → Gateway → Node
                          sexec checks config.json
                          matches "ask" list
                          sends exec_approval_required
Gateway → User: "Approve 'sudo gnt-instance stop prod-db'?"
User: "yes"
Gateway → Node: exec_approval_response (approved=true)
                sexec executes
```
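The flow above can be sketched as a node-side decision function. It assumes the permission check has already produced one of `"allow"`, `"ask"`, or `"deny"`; the name `next_message` and the wording of the `reason` strings are hypothetical.

```python
def next_message(exec_msg, decision):
    """Return the message the node should send back, or None if it should execute now."""
    base = {"id": exec_msg["id"], "command": exec_msg["command"]}
    if decision == "deny":
        return {"type": "exec_denied", **base, "reason": "Command matches deny list"}
    if decision == "ask" and not exec_msg.get("approved", False):
        return {"type": "exec_approval_required", **base,
                "reason": "Command matches ask list"}
    # "allow", or "ask" with approved=true: run the command.
    return None
```

The second branch shows why `approved: true` matters: the same command that first triggers `exec_approval_required` falls through to execution once the user has approved it.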
---
## Error Handling
### Node Disconnection
- Gateway marks node as "disconnected"
- Queued commands return error: `{"error": "node_offline"}`
- Node auto-reconnects with exponential backoff (5s, 10s, 20s, max 60s)
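The reconnect schedule can be expressed as a generator. This is a sketch assuming the delay doubles from the 5-second `reconnect_interval` and is capped at 60 seconds; `backoff_delays` is a hypothetical name.

```python
from itertools import islice


def backoff_delays(initial=5, factor=2, cap=60):
    """Yield reconnect delays in seconds: initial, then multiplied by factor, capped."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First six delays under these assumptions.
FIRST_DELAYS = list(islice(backoff_delays(), 6))  # [5, 10, 20, 40, 60, 60]
```

A node would typically restart the generator after a successful registration so that the next outage begins again at 5 seconds.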
### Command Timeout
- If the node doesn't respond within `timeout` seconds:
  - Gateway sends `{"type": "exec_cancel", "id": "..."}`
  - Node kills the process
  - Node returns `{"type": "exec_complete", "exit_code": -1, "error": "timeout"}`
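A node could also enforce the limit locally. The sketch below is an assumption layered on the spec: it runs the command directly with `subprocess.run` rather than through `sexec.sh`, buffers output instead of streaming it, and adds the `id` field to the completion message. `run_with_timeout` is a hypothetical name.

```python
import subprocess
import sys
import time


def run_with_timeout(cmd_id, command, timeout):
    """Run a command and return an exec_complete message; exit_code is -1 on timeout."""
    start = time.monotonic()
    try:
        # subprocess.run kills the child process when the timeout expires.
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
        result = {"type": "exec_complete", "id": cmd_id, "exit_code": proc.returncode}
    except subprocess.TimeoutExpired:
        result = {"type": "exec_complete", "id": cmd_id, "exit_code": -1, "error": "timeout"}
    result["duration_ms"] = int((time.monotonic() - start) * 1000)
    return result


if __name__ == "__main__":
    # Uses the current Python interpreter as a portable long-running command.
    print(run_with_timeout("cmd-demo", [sys.executable, "-c", "import time; time.sleep(5)"], 0.5))
```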
### Gateway Restart
- Nodes detect disconnect via heartbeat failure
- Nodes reconnect automatically
- In-flight commands are lost (client must retry)
---
## Deployment
### Gateway Side
1. Install `hermes-node-gateway` service
2. Configure tokens in `/etc/hermes-node-gateway/tokens.json`
3. Start service: `/etc/init.d/hermes-node-gateway start`
4. Install `hermes-node-exec` skill for Hermes
### Node Side
1. Install `hermes-node-agent` package
2. Configure `/etc/hermes-node/config.json` with gateway URL + token
3. Ensure `sexec.sh` is installed and configured
4. Start service: `/etc/init.d/hermes-node-agent start`
5. Verify connection: `tail -f /var/log/hermes-node-agent.log`
---
## Compatibility
### With OpenClaw sexec
- ✅ Reuses exact same `sexec.sh` script
- ✅ Reuses exact same `config.json` format
- ✅ Reuses exact same permission logic
- ✅ No changes needed to existing sexec installations
### Migration from OpenClaw
1. Keep existing sexec on nodes
2. Install node agent alongside
3. Configure gateway URL + token
4. Start agent
5. Disable OpenClaw gateway (optional)
---
## Future Enhancements
- **TLS/WSS support** for encrypted connections
- **Certificate-based auth** instead of tokens
- **Command history** and audit log
- **Multi-gateway failover** (node connects to backup if primary down)
- **Bidirectional file transfer** (upload/download via WebSocket)
- **Real-time log streaming** (tail -f over WebSocket)