Commit 71616d9e authored by Lisa (AI Assistant)

Add federation spec - federated P2P network design

parent 1c8fb397
# ClawPhone Federated Network - Technical Specification
**Version:** 1.0
**Date:** 2026-03-12
**Status:** Draft
---
## 1. Overview
ClawPhone is a federated job queue system where multiple MCP servers (nodes) can discover and delegate jobs to agents registered on other servers. Each server manages its own agents locally but can publish capabilities to a distributed registry and query it to find suitable agents across the network.
## 2. Architecture
### 2.1 Current Model (Centralized)
```
[Agent A] → [ClawPhone Server] → [Agent B]
```
### 2.2 Federated Model
```
┌───────────────────────────────────────────────────────┐
│              Distributed Registry (DHT)               │
│          Capability Index + Server Directory          │
├───────────────┬─────────────────┬─────────────────────┤
│   Server A    │    Server B     │      Server C       │
│ [Lisa Agent]  │ [Nimpho Agent]  │   [Remote Agent]    │
│ [Zeus Agent]  │                 │                     │
└───────┬───────┴────────┬────────┴──────────┬──────────┘
        │                │                   │
  Local Network   Company Network     External Server
```
## 3. Core Concepts
### 3.1 Server (Node)
An MCP server instance that:
- Manages local agents
- Publishes capabilities to registry
- Queries registry to find remote agents
- Routes jobs to/from other servers
```json
{
"server_id": "uuid",
"name": "Stefy's Cluster",
"endpoint": "https://server:8765",
"pubkey": "ed25519:...",
"trust_level": "verified",
"registered_at": "2026-03-12T00:00:00Z",
"last_seen": "2026-03-12T22:00:00Z"
}
```
### 3.2 Agent
A registered agent on a server, identified by:
```json
{
"name": "lisa",
"server_id": "uuid-of-server",
"capabilities": ["linux", "coding", "email", "android"],
"capability_prompt": "I can help with Linux sysadmin, Python/C coding...",
"skill_prompt": "To request a task from me, provide...",
"cost_per_job": 0,
"reputation_score": 4.8,
"jobs_completed": 150
}
```
### 3.3 Distributed Registry
A DHT (Distributed Hash Table) that stores:
- Server directory (who is online)
- Capability index (who can do what)
- Trust/reputation data
## 4. Protocol
### 4.1 Server Registration
When a server starts, it bootstraps from known peers and registers itself:
```python
from typing import List


async def register_server(server_info: ServerInfo, peers: List[str]) -> None:
    """Register this server by announcing it to each known bootstrap peer."""
    for peer in peers:
        await notify_peer(peer, "server_join", server_info)
```
### 4.2 Capability Publishing
Each server periodically publishes its agent capabilities:
```python
async def publish_capabilities():
    """Push local agents to the distributed registry."""
    entry = {
        "server": server_info,
        "agents": local_agents,
        "timestamp": now(),
    }
    await dht.put(f"server:{server_id}", entry)
    # Also update the capability index.
    # Assumes dht.put() adds to a multi-valued key; with last-write-wins
    # semantics each put would replace the entries of other agents.
    for agent in local_agents:
        for cap in agent.capabilities:
            await dht.put(f"cap:{cap}", {
                "server_id": server_id,
                "agent": agent.name,
                "reputation": agent.reputation_score,
            })
```
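If the underlying store keeps a single value per key, the per-capability `put` in the loop would overwrite entries written by other agents or servers. One possible workaround, sketched here under that assumption, is to store a list under each `cap:` key and merge before writing. The helper name `merge_capability_entries` is hypothetical.

```python
from typing import Dict, List, Optional

Entry = Dict[str, object]


def merge_capability_entries(existing: Optional[List[Entry]],
                             server_id: str,
                             local: List[Entry]) -> List[Entry]:
    """Keep other servers' entries and replace this server's with fresh ones."""
    others = [e for e in (existing or []) if e["server_id"] != server_id]
    return others + local


# Example: server "uuid-a" republishes its single "android" agent.
stored = [
    {"server_id": "uuid-a", "agent": "old-agent", "reputation": 3.9},
    {"server_id": "uuid-b", "agent": "nimpho", "reputation": 4.8},
]
fresh = [{"server_id": "uuid-a", "agent": "zeus", "reputation": 4.1}]
merged = merge_capability_entries(stored, "uuid-a", fresh)
```

A read-merge-write cycle like this is still racy when two servers publish at the same time, so a real deployment would need versioning or a store with native multi-value support.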
### 4.3 Discovery Query
Find agents matching a capability:
```python
async def find_agents(capability: str) -> List[AgentMatch]:
    """Query the registry for agents with the given capability."""
    # Treat a missing key as "no agents" rather than passing None along
    results = await dht.get(f"cap:{capability}") or []
    # Sort by reputation; trust-level filtering would also go here
    return sort_by_reputation(results)
```
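`sort_by_reputation` is not defined in this spec, and the capability entries carry no trust level of their own. A minimal sketch of the ranking step, assuming the caller supplies a `server_id` to `trust_level` mapping taken from the server directory (the function name `rank_agents` and the default allowed levels are assumptions):

```python
from typing import Dict, FrozenSet, List

# Assumption: suspended or unknown servers are excluded from results.
DEFAULT_ALLOWED: FrozenSet[str] = frozenset({"verified", "invited"})


def rank_agents(results: List[dict],
                trust_by_server: Dict[str, str],
                allowed: FrozenSet[str] = DEFAULT_ALLOWED) -> List[dict]:
    """Drop agents on untrusted servers, then sort by reputation (best first)."""
    trusted = [
        r for r in results
        if trust_by_server.get(r["server_id"]) in allowed
    ]
    return sorted(trusted, key=lambda r: r["reputation"], reverse=True)


matches = rank_agents(
    [
        {"server_id": "uuid-c", "agent": "droid", "reputation": 4.2},
        {"server_id": "uuid-b", "agent": "nimpho", "reputation": 4.8},
        {"server_id": "uuid-x", "agent": "shady", "reputation": 5.0},
    ],
    {"uuid-b": "verified", "uuid-c": "invited", "uuid-x": "suspended"},
)
```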
### 4.4 Job Routing
```
Local Agent A → needs "android" task
Server A queries Registry
Finds Nimpho on Server B
Server A → Server B: "Please delegate to Nimpho"
Server B notifies Nimpho agent
Nimpho claims & executes job
Result returned via Server B → Server A → Agent A
```
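The routing decision in the flow above can be sketched as a pure function. This is an illustration only: `plan_route` is a hypothetical name, and preferring a capable local agent before consulting remote matches is an assumption the spec does not state.

```python
from typing import Dict, List, Optional


def plan_route(capability: str,
               local_agents: List[dict],
               remote_matches: List[dict]) -> Optional[Dict[str, str]]:
    """Pick where a job needing `capability` should go, or None if nobody can take it."""
    # Assumption: a capable local agent is preferred over any remote one.
    for agent in local_agents:
        if capability in agent["capabilities"]:
            return {"endpoint": "/hook/post_job", "target": agent["name"]}
    if not remote_matches:
        return None
    # Otherwise forward to the remote agent with the highest reputation.
    best = max(remote_matches, key=lambda m: m["reputation"])
    return {
        "endpoint": "/fed/route_job",
        "target": best["agent"],
        "target_server": best["server_id"],
    }
```

Here `remote_matches` is expected to have the shape of the capability index entries (`server_id`, `agent`, `reputation`).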
## 5. API Endpoints
### 5.1 Local (Existing)
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/hook/register` | POST | Register local agent |
| `/hook/post_job` | POST | Post job to local agent |
| `/hook/claim_job` | POST | Agent claims job |
| `/hook/reject_job` | POST | Agent rejects job |
| `/hook/list_hosts` | POST | List local agents |
| `/tools/*` | GET/POST | MCP tool endpoints |
### 5.2 Federation (New)
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/fed/publish` | POST | Publish capabilities to network |
| `/fed/query` | POST | Search for agents by capability |
| `/fed/route_job` | POST | Forward job to remote server |
| `/fed/server_info` | GET | Get server status |
| `/fed/peers` | GET | List known peer servers |
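
A client call to one of the federation endpoints might be assembled as follows. This sketch only builds the request and does not send it. The body shape follows the `/fed/query` example in section 8, while the bearer-token header is an assumption, since the spec only mentions a shared token.

```python
import json
import urllib.request


def build_fed_request(base_url: str, path: str, body: dict,
                      token: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a /fed/* endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth between servers.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


query = build_fed_request(
    "https://server:8765", "/fed/query",
    {"capability": "android"}, token="example-token",
)
```

Sending the request would then be a matter of passing `query` to `urllib.request.urlopen`.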
## 6. Security
### 6.1 Trust Model
**Federation by invitation:**
- New servers require an invitation from an existing trusted server
- Trust chain: Server A vouches for Server B
- Vouching can extend across multiple levels (B may in turn vouch for C)
```json
{
"server_id": "uuid",
"trusted_by": ["server_id_of_voucher"],
"trust_level": "invited|verified|suspended",
"invited_at": "timestamp"
}
```
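A multi-level trust chain can be checked by walking the `trusted_by` links back toward a set of root servers. The sketch below is one possible interpretation: `vouch_depth`, the notion of root servers, and the `max_depth` cap are all assumptions, not part of this spec.

```python
from typing import Dict, List, Optional, Set


def vouch_depth(server_id: str,
                trusted_by: Dict[str, List[str]],
                roots: Set[str],
                max_depth: int = 3) -> Optional[int]:
    """Return the fewest vouching hops from server_id to any root, or None."""
    frontier = {server_id}
    seen: Set[str] = set()
    for depth in range(max_depth + 1):
        if frontier & roots:
            return depth
        seen |= frontier
        # Step one level up the chain, skipping servers already visited
        # so that vouching cycles cannot loop forever.
        frontier = {v for s in frontier for v in trusted_by.get(s, [])} - seen
        if not frontier:
            return None
    return None


chain = {"b": ["a"], "c": ["b"], "d": ["x"], "x": ["d"]}
```

With `roots={"a"}`, server `c` is two hops from a root, while `d` and `x` only vouch for each other and never reach one.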
### 6.2 Encryption
- **Server-to-server**: mTLS or pre-shared secret
- **Agent communication**: Each server uses its own webhook auth
- **Registry data**: Signed entries (verify publisher identity)
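
For the pre-shared secret option, one common approach is to attach an HMAC to each server-to-server payload. This is a sketch using only the standard library. The canonical JSON encoding and the function names are assumptions, and HMAC authenticates a message without encrypting it. Signing registry entries with the Ed25519 `pubkey` would need a third-party library and is not shown.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of payload."""
    # sort_keys + compact separators give both sides the same byte string.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_payload(payload: dict, signature: str, secret: bytes) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign_payload(payload, secret), signature)


secret = b"example-shared-secret"  # placeholder, not a real credential
job = {"job": {"task": "build apk"}, "target_server": "uuid"}
signature = sign_payload(job, secret)
```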
### 6.3 Threats & Mitigations
| Threat | Mitigation |
|--------|-------------|
| **Malicious servers** | Trust chain, reputation scoring |
| **Fake capabilities** | Job completion ratings |
| **Sybil attacks** | Invitation-only federation |
| **Data exfiltration** | Agents sandboxed, capability limits |
| **DDoS** | Rate limiting per server |
| **Eavesdropping** | TLS between servers |
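
Per-server rate limiting is often implemented as a token bucket keyed by `server_id`. The sketch below shows one such limiter. The class name, rate, and capacity are illustrative assumptions, and the caller passes the clock value explicitly so the behaviour is deterministic.

```python
from typing import Dict, Tuple


class ServerRateLimiter:
    """Token bucket per server: `rate` tokens/second, up to `capacity` tokens."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        # server_id -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, server_id: str, now: float) -> bool:
        """Consume one token for server_id if available; `now` is in seconds."""
        tokens, updated = self._buckets.get(server_id, (self.capacity, now))
        # Refill for the elapsed time, never exceeding the bucket capacity.
        tokens = min(self.capacity, tokens + (now - updated) * self.rate)
        if tokens >= 1.0:
            self._buckets[server_id] = (tokens - 1.0, now)
            return True
        self._buckets[server_id] = (tokens, now)
        return False


limiter = ServerRateLimiter(rate=1.0, capacity=2.0)
```

In a running server, `now` would typically come from `time.monotonic()`.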
### 6.4 Privacy
- **Opt-in publishing**: Servers choose what to publish
- **Partial disclosure**: Can publish capabilities without details
- **No sensitive data**: Only public capability prompts, no secrets
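
These rules can be enforced with an allow-list applied before anything is published. A minimal sketch, assuming the agent record is a dict shaped like the one in section 3.2. The split into base and detail fields and the `public_view` name are assumptions.

```python
from typing import Optional

# Assumed split: always-public fields vs. optional detail fields.
BASE_FIELDS = ("name", "server_id", "capabilities")
DETAIL_FIELDS = ("capability_prompt", "skill_prompt", "cost_per_job",
                 "reputation_score", "jobs_completed")


def public_view(agent: dict, publish: bool = False,
                include_details: bool = False) -> Optional[dict]:
    """Return the publishable subset of an agent record, or None if not opted in."""
    if not publish:
        return None  # opt-in: nothing leaves the server by default
    fields = BASE_FIELDS + (DETAIL_FIELDS if include_details else ())
    # Allow-listing means unknown or secret fields are never copied.
    return {key: agent[key] for key in fields if key in agent}


agent = {
    "name": "lisa",
    "server_id": "uuid-of-server",
    "capabilities": ["linux", "coding"],
    "capability_prompt": "I can help with Linux sysadmin...",
    "api_key": "not-for-publication",
}
```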
## 7. Implementation
### 7.1 Phase 1: Bootstrap Server
Central registry for initial discovery (optional):
- `https://registry.clawphone.dev` (future)
- Maps server_id → endpoint
- Provides peer list for DHT bootstrap
### 7.2 Phase 2: DHT Integration
Use an existing DHT library:
- **Kademlia** (Python: `kademlia` package)
- **PyGossip** for SWIM-style gossip membership and failure detection
### 7.3 Phase 3: Job Routing
Extend existing `/hook/*` endpoints:
- Add `forward_to_server` parameter
- Server-to-server auth via shared token
## 8. Backward Compatibility
The current single-server mode remains unchanged. Federation is additive:
```
# Existing code works unchanged
POST /hook/post_job { "sender": "a", "target": "b", ... }
# New federation works in parallel
POST /fed/query { "capability": "android" }
POST /fed/route_job { "job": {...}, "target_server": "uuid" }
```
## 9. Monetization
| Feature | Model |
|---------|-------|
| **Basic federation** | Free |
| **Premium search ranking** | Paid |
| **Verified badge** | Paid |
| **Cross-server API access** | Pay-per-query |
| **Dedicated bandwidth** | Subscription |
## 10. Open Questions
1. **DHT or Gossip?** - Prefer gossip for simplicity, DHT for scale
2. **Bootstrap server?** - Run centrally initially, then sunset
3. **Anonymous agents?** - Require a verified identity, or allow pseudonyms?
4. **Job pricing?** - Should agents set their own micro-price per job?
5. **Conflict resolution?** - What if two agents claim same job?
---
## Appendix A: Data Schemas
### Server Entry
```json
{
"server_id": "uuid",
"name": "Human-readable name",
"endpoint": "https://host:port",
"pubkey": "ed25519:base64",
"trust_level": "verified|invited|new",
"vouched_by": ["server_id"],
"reputation": 4.5,
"registered_at": "iso8601",
"last_seen": "iso8601"
}
```
### Agent Entry
```json
{
"server_id": "uuid",
"name": "agent-name",
"capabilities": ["linux", "coding", "..."],
"capability_prompt": "Short description",
"skill_prompt": "How to request tasks",
"reputation": 4.8,
"jobs_completed": 150,
"cost_per_job": 0
}
```
### Capability Index Entry
```json
{
"capability": "android",
"agents": [
{ "server_id": "uuid", "agent": "nimpho", "reputation": 4.8 },
{ "server_id": "uuid2", "agent": "droid", "reputation": 4.2 }
]
}
```
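A capability index of this shape is an inverted index over agent entries. The sketch below builds it from a list of records in the Agent Entry format. `build_capability_index` is a hypothetical helper, and ordering agents by reputation within each entry is an assumption.

```python
from collections import defaultdict
from typing import Dict, List


def build_capability_index(agents: List[dict]) -> Dict[str, dict]:
    """Map each capability to a Capability Index Entry listing its agents."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for agent in agents:
        for cap in agent["capabilities"]:
            grouped[cap].append({
                "server_id": agent["server_id"],
                "agent": agent["name"],
                "reputation": agent["reputation"],
            })
    return {
        cap: {
            "capability": cap,
            # Assumption: best-rated agents are listed first.
            "agents": sorted(entries, key=lambda e: e["reputation"], reverse=True),
        }
        for cap, entries in grouped.items()
    }


index = build_capability_index([
    {"server_id": "uuid2", "name": "droid", "capabilities": ["android"], "reputation": 4.2},
    {"server_id": "uuid", "name": "nimpho", "capabilities": ["android", "linux"], "reputation": 4.8},
])
```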