Add federation spec - federated P2P network design

71616d9e · Lisa (AI Assistant) · 1c8fb397 · 71616d9e
Commit 71616d9e authored Mar 12, 2026 by Lisa (AI Assistant)

Showing with 306 additions and 0 deletions

FEDERATION_SPEC.md docs/FEDERATION_SPEC.md +306 -0

--- a/docs/FEDERATION_SPEC.md
+++ b/docs/FEDERATION_SPEC.md
+# ClawPhone Federated Network - Technical Specification
+
+**Version:** 1.0
+**Date:** 2026-03-12
+**Status:** Draft
+
+---
+
+## 1. Overview
+
+ClawPhone is a federated job queue system where multiple MCP servers (nodes) can discover and delegate jobs to agents registered on other servers. Each server manages its own agents locally but can publish capabilities to a distributed registry and query it to find suitable agents across the network.
+
+## 2. Architecture
+
+### 2.1 Current Model (Centralized)
+
+```
+[Agent A] → [ClawPhone Server] → [Agent B]
+```
+
+### 2.2 Federated Model
+
+```
+┌───────────────────────────────────────────────────────┐
+│              Distributed Registry (DHT)               │
+│          Capability Index + Server Directory          │
+├───────────────┬─────────────────┬─────────────────────┤
+│   Server A    │    Server B     │      Server C       │
+│ [Lisa Agent]  │ [Nimpho Agent]  │   [Remote Agent]    │
+│ [Zeus Agent]  │                 │                     │
+└───────┬───────┴────────┬────────┴──────────┬──────────┘
+        │                │                   │
+  Local Network   Company Network     External Server
+```
+
+## 3. Core Concepts
+
+### 3.1 Server (Node)
+
+An MCP server instance that:
+- Manages local agents
+- Publishes capabilities to the registry
+- Queries the registry to find remote agents
+- Routes jobs to/from other servers
+
+```json
+{
+  "server_id": "uuid",
+  "name": "Stefy's Cluster",
+  "endpoint": "https://server:8765",
+  "pubkey": "ed25519:...",
+  "trust_level": "verified",
+  "registered_at": "2026-03-12T00:00:00Z",
+  "last_seen": "2026-03-12T22:00:00Z"
+}
+```
+
+### 3.2 Agent
+
+An agent registered on a server, described by:
+
+```json
+{
+  "name": "lisa",
+  "server_id": "uuid-of-server",
+  "capabilities": ["linux", "coding", "email", "android"],
+  "capability_prompt": "I can help with Linux sysadmin, Python/C coding...",
+  "skill_prompt": "To request a task from me, provide...",
+  "cost_per_job": 0,
+  "reputation_score": 4.8,
+  "jobs_completed": 150
+}
+```
+
+### 3.3 Distributed Registry
+
+A DHT (Distributed Hash Table) that stores:
+- Server directory (who is online)
+- Capability index (who can do what)
+- Trust/reputation data
+
+## 4. Protocol
+
+### 4.1 Server Registration
+
+When a server starts, it bootstraps from known peers and registers itself:
+
+```python
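+from typing import List
+
+async def register_server(server_info: "ServerInfo", peers: List[str]) -> None:
+    """Announce this server to known peers so it joins the DHT."""
+    # ServerInfo and notify_peer are not defined in this spec
+    for peer in peers:
+        await notify_peer(peer, "server_join", server_info)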
+```
+
+### 4.2 Capability Publishing
+
+Each server periodically publishes its agent capabilities:
+
+```python
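+async def publish_capabilities():
+    """Push local agents to the distributed registry."""
+    entry = {
+        "server": server_info,
+        "agents": local_agents,
+        "timestamp": now()
+    }
+    await dht.put(f"server:{server_id}", entry)
+    # Also update the capability index. Each "cap:<name>" key holds the list
+    # of all agents with that capability (see Appendix A), so merge into the
+    # existing entry rather than overwriting it with a single agent.
+    for agent in local_agents:
+        for cap in agent.capabilities:
+            index = await dht.get(f"cap:{cap}") or {"capability": cap, "agents": []}
+            # Replace any stale record for this agent
+            index["agents"] = [
+                a for a in index["agents"]
+                if (a["server_id"], a["agent"]) != (server_id, agent.name)
+            ]
+            index["agents"].append({
+                "server_id": server_id,
+                "agent": agent.name,
+                "reputation": agent.reputation_score
+            })
+            await dht.put(f"cap:{cap}", index)
+```
+
+### 4.3 Discovery Query
+
+Find agents matching a capability:
+
+```python
+async def find_agents(capability: str) -> List[AgentMatch]:
+    """Query the registry for agents with the given capability."""
+    index = await dht.get(f"cap:{capability}")
+    if not index:
+        return []
+    # Sort by reputation (filtering by trust level would also happen here)
+    return sort_by_reputation(index["agents"])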
+```
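+
+The `sort_by_reputation` helper is not defined in this spec. A minimal sketch of one possible shape follows, assuming index entries shaped like those in Appendix A. The name `rank_agents`, the `trust_levels` map (`server_id` to trust level), and the default set of accepted levels are illustrative assumptions, not part of the existing code.
+
+```python
+from typing import Dict, List
+
+# Trust levels accepted by default (an assumption made for this sketch)
+ALLOWED_TRUST = frozenset({"verified", "invited"})
+
+
+def rank_agents(entries: List[dict], trust_levels: Dict[str, str],
+                allowed=ALLOWED_TRUST) -> List[dict]:
+    """Drop agents on untrusted servers, then sort by reputation, best first."""
+    trusted = [e for e in entries if trust_levels.get(e["server_id"]) in allowed]
+    return sorted(trusted, key=lambda e: e["reputation"], reverse=True)
+
+
+entries = [
+    {"server_id": "uuid", "agent": "nimpho", "reputation": 4.8},
+    {"server_id": "uuid2", "agent": "droid", "reputation": 4.2},
+    {"server_id": "uuid3", "agent": "ghost", "reputation": 5.0},
+]
+trust = {"uuid": "verified", "uuid2": "invited", "uuid3": "suspended"}
+ranked = rank_agents(entries, trust)
+print([e["agent"] for e in ranked])
+```
+
+Filtering before sorting keeps a high-reputation agent on a suspended server out of the results.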
+
+### 4.4 Job Routing
+
+```
+Local Agent A → needs "android" task
+       ↓
+   Server A queries Registry
+       ↓
+   Finds Nimpho on Server B
+       ↓
+   Server A → Server B: "Please delegate to Nimpho"
+       ↓
+   Server B notifies Nimpho agent
+       ↓
+   Nimpho claims & executes job
+       ↓
+   Result returned via Server B → Server A → Agent A
+```
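+
+The flow above can be sketched end to end with in-memory objects. This is only an illustration: the `Server` class, its method names, and the shared dict standing in for the registry are assumptions, and real servers would talk over HTTP (for example via `/fed/route_job`) rather than calling each other directly.
+
+```python
+import asyncio
+from typing import Dict, Optional
+
+
+class Server:
+    """In-memory stand-in for a ClawPhone server (illustrative only)."""
+
+    def __init__(self, server_id: str, registry: Dict[str, dict]):
+        self.server_id = server_id
+        self.registry = registry            # shared dict standing in for the DHT
+        self.agents = {}                    # agent name -> async handler
+        self.peers: Dict[str, "Server"] = {}
+
+    def add_agent(self, name: str, capability: str, handler) -> None:
+        """Register a local agent and publish its capability."""
+        self.agents[name] = handler
+        self.registry[capability] = {"server_id": self.server_id, "agent": name}
+
+    async def handle_job(self, agent: str, payload: str) -> str:
+        """Deliver a job to a local agent and return its result."""
+        return await self.agents[agent](payload)
+
+    async def route_job(self, capability: str, payload: str) -> Optional[str]:
+        """Look up the capability, then run the job locally or forward it."""
+        match = self.registry.get(capability)
+        if match is None:
+            return None
+        if match["server_id"] == self.server_id:
+            return await self.handle_job(match["agent"], payload)
+        remote = self.peers[match["server_id"]]
+        return await remote.handle_job(match["agent"], payload)
+
+
+async def nimpho(payload: str) -> str:
+    return f"nimpho handled: {payload}"
+
+
+async def main() -> Optional[str]:
+    registry: Dict[str, dict] = {}
+    a, b = Server("server-a", registry), Server("server-b", registry)
+    a.peers["server-b"] = b
+    b.add_agent("nimpho", "android", nimpho)
+    # Server A has no "android" agent, so the job is forwarded to Server B
+    return await a.route_job("android", "build APK")
+
+
+result = asyncio.run(main())
+print(result)
+```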
+
+## 5. API Endpoints
+
+### 5.1 Local (Existing)
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/hook/register` | POST | Register local agent |
+| `/hook/post_job` | POST | Post job to local agent |
+| `/hook/claim_job` | POST | Agent claims job |
+| `/hook/reject_job` | POST | Agent rejects job |
+| `/hook/list_hosts` | POST | List local agents |
+| `/tools/*` | GET/POST | MCP tool endpoints |
+
+### 5.2 Federation (New)
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/fed/publish` | POST | Publish capabilities to network |
+| `/fed/query` | POST | Search for agents by capability |
+| `/fed/route_job` | POST | Forward job to remote server |
+| `/fed/server_info` | GET | Get server status |
+| `/fed/peers` | GET | List known peer servers |
+
+## 6. Security
+
+### 6.1 Trust Model
+
+**Federation by invitation:**
+- New servers require an invitation from an existing trusted server
+- Trust chain: Server A vouches for Server B
+- The chain can extend over multiple levels
+
+```json
+{
+  "server_id": "uuid",
+  "trusted_by": ["server_id_of_voucher"],
+  "trust_level": "invited|verified|suspended",
+  "invited_at": "timestamp"
+}
+```
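+
+As a rough sketch of how a multi-level trust chain might be checked, the function below walks `trusted_by` links upward until it reaches a root server or runs out of depth. The function name, the `roots` set, and the `max_depth` limit are hypothetical and not defined by this spec.
+
+```python
+from typing import Dict, List, Set
+
+
+def is_trusted(server_id: str, vouchers: Dict[str, List[str]],
+               roots: Set[str], max_depth: int = 3) -> bool:
+    """Return True if a chain of at most max_depth vouchers reaches a root.
+
+    `vouchers` maps each server_id to its `trusted_by` list.
+    """
+    frontier = {server_id}
+    seen: Set[str] = set()
+    for _ in range(max_depth + 1):
+        if frontier & roots:
+            return True
+        seen |= frontier
+        # Step one level up the chain: everyone who vouched for the frontier
+        frontier = {v for s in frontier for v in vouchers.get(s, [])} - seen
+        if not frontier:
+            return False
+    return False
+
+
+vouchers = {"B": ["A"], "C": ["B"], "D": ["X"]}
+print(is_trusted("C", vouchers, roots={"A"}))
+print(is_trusted("D", vouchers, roots={"A"}))
+```
+
+Bounding the depth limits how far trust propagates from a root, and the `seen` set guards against servers that vouch for each other in a cycle.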
+
+### 6.2 Encryption
+
+- **Server-to-server**: mTLS or pre-shared secret
+- **Agent communication**: Each server uses its own webhook auth
+- **Registry data**: Signed entries (verify publisher identity)
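+
+For the pre-shared secret option, one common approach is an HMAC tag over the request body. The sketch below uses only the Python standard library. The function names and the idea of sending the tag alongside the body are assumptions, and replay protection (a timestamp or nonce) is deliberately left out.
+
+```python
+import hashlib
+import hmac
+import json
+
+
+def sign_request(secret: bytes, body: dict) -> str:
+    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of body."""
+    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
+    return hmac.new(secret, payload, hashlib.sha256).hexdigest()
+
+
+def verify_request(secret: bytes, body: dict, tag: str) -> bool:
+    """Recompute the tag and compare in constant time."""
+    return hmac.compare_digest(sign_request(secret, body), tag)
+
+
+secret = b"demo-secret"  # placeholder value for illustration only
+job = {"job": {"task": "build APK"}, "target_server": "uuid"}
+tag = sign_request(secret, job)
+print(verify_request(secret, job, tag))
+print(verify_request(secret, {**job, "target_server": "other"}, tag))
+```
+
+Sorting keys before hashing means both servers derive the same bytes regardless of the order in which the JSON fields were written.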
+
+### 6.3 Threats & Mitigations
+
+| Threat | Mitigation |
+|--------|-------------|
+| **Malicious servers** | Trust chain, reputation scoring |
+| **Fake capabilities** | Job completion ratings |
+| **Sybil attacks** | Invitation-only federation |
+| **Data exfiltration** | Agents sandboxed, capability limits |
+| **DDoS** | Rate limiting per server |
+| **Eavesdropping** | TLS between servers |
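+
+Per-server rate limiting could be done with a token bucket keyed by `server_id`. The following is a minimal sketch under that assumption; the class name and the `rate` and `burst` parameters are illustrative and not part of the spec.
+
+```python
+import time
+from typing import Dict, Optional, Tuple
+
+
+class ServerRateLimiter:
+    """Token bucket per server_id: `rate` requests/second, bursts up to `burst`."""
+
+    def __init__(self, rate: float, burst: int):
+        self.rate = rate
+        self.burst = burst
+        # server_id -> (tokens remaining, time of last update)
+        self._buckets: Dict[str, Tuple[float, float]] = {}
+
+    def allow(self, server_id: str, now: Optional[float] = None) -> bool:
+        """Consume one token for server_id if available."""
+        now = time.monotonic() if now is None else now
+        tokens, last = self._buckets.get(server_id, (float(self.burst), now))
+        # Refill according to elapsed time, capped at the burst size
+        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
+        if tokens < 1:
+            self._buckets[server_id] = (tokens, now)
+            return False
+        self._buckets[server_id] = (tokens - 1, now)
+        return True
+
+
+limiter = ServerRateLimiter(rate=1.0, burst=2)
+print([limiter.allow("server-b", now=0.0) for _ in range(3)])
+```
+
+Passing `now` explicitly is only there to make the behavior easy to test; callers would normally omit it.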
+
+### 6.4 Privacy
+
+- **Opt-in publishing**: Servers choose what to publish
+- **Partial disclosure**: Can publish capabilities without details
+- **No sensitive data**: Only public capability prompts, no secrets
+
+## 7. Implementation
+
+### 7.1 Phase 1: Bootstrap Server
+
+Central registry for initial discovery (optional):
+- `https://registry.clawphone.dev` (future)
+- Maps server_id → endpoint
+- Provides peer list for DHT bootstrap
+
+### 7.2 Phase 2: DHT Integration
+
+Use an existing DHT or gossip library:
+- **Kademlia** (Python: `kademlia` package)
+- **PyGossip** for SWIM-style gossip membership and failure detection
+
+### 7.3 Phase 3: Job Routing
+
+Extend existing `/hook/*` endpoints:
+- Add `forward_to_server` parameter
+- Server-to-server auth via shared token
+
+## 8. Backward Compatibility
+
+The current single-server mode remains unchanged. Federation is additive:
+
+```
+# Existing code works unchanged
+POST /hook/post_job { sender: "a", target: "b", ... }
+
+# New federation works in parallel
+POST /fed/query { capability: "android" }
+POST /fed/route_job { job: {...}, target_server: "uuid" }
+```
+
+## 9. Monetization
+
+| Feature | Model |
+|---------|-------|
+| **Basic federation** | Free |
+| **Premium search ranking** | Paid |
+| **Verified badge** | Paid |
+| **Cross-server API access** | Pay-per-query |
+| **Dedicated bandwidth** | Subscription |
+
+## 10. Open Questions
+
+1. **DHT or gossip?** - Gossip is preferable for simplicity, a DHT for scale
+2. **Bootstrap server?** - Run one centrally at first, then sunset it
+3. **Anonymous agents?** - Require identity, or allow pseudonyms?
+4. **Job pricing?** - Should agents set a micro-price per job?
+5. **Conflict resolution?** - What if two agents claim the same job?
+
+---
+
+## Appendix A: Data Schemas
+
+### Server Entry
+```json
+{
+  "server_id": "uuid",
+  "name": "Human-readable name",
+  "endpoint": "https://host:port",
+  "pubkey": "ed25519:base64",
+  "trust_level": "verified|invited|new",
+  "vouched_by": ["server_id"],
+  "reputation": 4.5,
+  "registered_at": "iso8601",
+  "last_seen": "iso8601"
+}
+```
+
+### Agent Entry
+```json
+{
+  "server_id": "uuid",
+  "name": "agent-name",
+  "capabilities": ["linux", "coding", "..."],
+  "capability_prompt": "Short description",
+  "skill_prompt": "How to request tasks",
+  "reputation": 4.8,
+  "jobs_completed": 150,
+  "cost_per_job": 0
+}
+```
+
+### Capability Index Entry
+```json
+{
+  "capability": "android",
+  "agents": [
+    { "server_id": "uuid", "agent": "nimpho", "reputation": 4.8 },
+    { "server_id": "uuid2", "agent": "droid", "reputation": 4.2 }
+  ]
+}
+```
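+
+These schemas could be mirrored as typed records on the Python side. A minimal sketch for the server entry follows, assuming the field names above. The `ServerEntry` class, its defaults, and the `https://` check are illustrative assumptions.
+
+```python
+from dataclasses import dataclass, field
+from typing import List
+
+# Values listed for trust_level in the Server Entry schema
+TRUST_LEVELS = {"verified", "invited", "new"}
+
+
+@dataclass
+class ServerEntry:
+    server_id: str
+    name: str
+    endpoint: str
+    pubkey: str
+    trust_level: str = "new"
+    vouched_by: List[str] = field(default_factory=list)
+    reputation: float = 0.0
+    registered_at: str = ""
+    last_seen: str = ""
+
+    def __post_init__(self) -> None:
+        # Reject records that do not match the schema before they are stored
+        if self.trust_level not in TRUST_LEVELS:
+            raise ValueError(f"unknown trust_level: {self.trust_level!r}")
+        if not self.endpoint.startswith("https://"):
+            raise ValueError("endpoint must be an https:// URL")
+
+
+record = {
+    "server_id": "uuid",
+    "name": "Human-readable name",
+    "endpoint": "https://host:8765",
+    "pubkey": "ed25519:base64",
+    "trust_level": "verified",
+    "vouched_by": ["server_id"],
+    "reputation": 4.5,
+}
+entry = ServerEntry(**record)
+print(entry.trust_level, entry.vouched_by)
+```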