<p class="muted small" style="margin-top:0;margin-bottom:.75rem">VAEs, text encoders (T5-XXL, CLIP), and other standalone files used alongside main models.</p>
<button class="btn btn-ghost btn-sm" id="dl-browse-btn" onclick="browseHfFiles()" style="white-space:nowrap" title="Fetch available files from HuggingFace">Browse</button>
</div>
</div>
<!-- GGUF mode: specific file or pattern -->
<!-- Quant picker: shown when GGUF files are found via Browse -->
<span class="form-hint" id="dl-hint">Exact filename (e.g. <code>model-Q4_K_M.gguf</code>) or pattern (<code>.gguf</code>). Leave blank to download the first .gguf found.</span>
<span class="form-hint">Exact filename or suffix pattern. Leave blank to download all .gguf files.</span>
Will download the full repository using the HuggingFace snapshot API. This is the correct method for safetensors / non-GGUF models. Large repos may take a while.
<button class="btn btn-ghost btn-sm" onclick="minimizeDownload()" title="Close this window — the download continues in the background">Minimize ↓</button>
<button class="btn btn-secondary" id="cfg-save-new-btn" onclick="saveNewModelConfig()" style="display:none" title="Save as a separate configuration for the same model file">Save as new config</button>
<button class="btn btn-danger btn-sm" id="cfg-remove-config-btn" onclick="removeThisConfig()" style="display:none;margin-left:auto" title="Remove only this configuration (keeps other configs and the model file)">Remove this config</button>
const incompleteBadgeHf = m.incomplete ? '<span class="badge" style="background:rgba(255,160,0,.18);color:#b87200;font-size:10px;margin-left:.3rem" title="Download may be incomplete — some files are missing or truncated">⚠ incomplete</span>' : '';
const missingBadge = f.missing ? ' <span class="badge" style="background:rgba(220,50,50,.18);color:#e05555;font-size:10px" title="File not found at configured path — re-download or remove this configuration">✕ file missing</span>' : '';
const incompleteBadge = f.incomplete ? ' <span class="badge" style="background:rgba(255,160,0,.18);color:#b87200;font-size:10px" title="Download may be incomplete">⚠ incomplete</span>' : '';
@@ -89,6 +89,22 @@ The outbound WebSocket connection must include:
- `username`: either `global` or the AISBF username for user-owned providers
- `registration_token`: provider-scoped secret from AISBF provider configuration
### Current server-side resolution order
AISBF resolves broker identity in this exact order when the WebSocket handshake arrives:
- `provider_id`: query param `provider_id`, then header `x-coderai-provider-id`, then default `coderai`
- `client_id`: query param `client_id`, then header `x-coderai-client-id`, then generated fallback `anon-<unix_timestamp>`
- `username`: query param `username`, then header `x-coderai-username`, then the path scope name (`global` or the `/api/u/{username}` path segment)
- `registration_token`: query param `registration_token`, then header `x-coderai-registration-token`
Important constraints:
- the `registration_token` is required for admission
- `Authorization: Bearer ...` is currently not used by the broker WebSocket admission check
- if you omit `client_id`, AISBF generates an `anon-*` client id and future broker routing will only work if AISBF also targets that exact generated value
- the `client_id` used by the CoderAI client must match the `coderai_config.client_id` used by the AISBF provider, or the broker can show the session as connected while requests still fail to route
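The resolution order above can be sketched as a small lookup helper. This is only an illustration of the documented precedence, not AISBF's actual code: the function names are hypothetical, and it assumes header names have already been lower-cased by the web framework.

```python
import time
from typing import Mapping, Optional


def first_present(*values: Optional[str]) -> Optional[str]:
    """Return the first non-empty value, or None if every candidate is missing."""
    for value in values:
        if value:
            return value
    return None


def resolve_broker_identity(
    query: Mapping[str, str],
    headers: Mapping[str, str],  # assumed to use lower-cased header names
    path_scope: str,             # "global" or the /api/u/{username} segment
) -> dict:
    """Apply the documented order: query param, then header, then fallback."""
    return {
        "provider_id": first_present(
            query.get("provider_id"),
            headers.get("x-coderai-provider-id"),
            "coderai",
        ),
        "client_id": first_present(
            query.get("client_id"),
            headers.get("x-coderai-client-id"),
            f"anon-{int(time.time())}",
        ),
        "username": first_present(
            query.get("username"),
            headers.get("x-coderai-username"),
            path_scope,
        ),
        # No fallback here: without a token the connection is not admitted.
        "registration_token": first_present(
            query.get("registration_token"),
            headers.get("x-coderai-registration-token"),
        ),
    }
```

Note how an omitted `client_id` silently becomes a timestamp-based value, which is why the constraints above recommend always sending a stable one.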
## Optional Headers
AISBF also accepts or may expect these headers:
...
...
@@ -109,6 +125,35 @@ Recommended behavior:
Open the outbound WebSocket to the correct scoped AISBF endpoint.
The handshake is a normal WebSocket upgrade request, which starts as an HTTP `GET` carrying query parameters. This is expected.
- send the same identity in both query parameters and headers
- keep `client_id` stable across reconnects
- always reconnect with the same provider scope and owner scope
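A minimal sketch of the recommended handshake preparation follows. It only builds the URL and headers; the base URL shown in the usage note is a placeholder, and `build_handshake` is a hypothetical helper name rather than part of CoderAI.

```python
from urllib.parse import urlencode


def build_handshake(base_ws_url, provider_id, client_id, username, registration_token):
    """Return (url, headers) that carry the same identity in both places."""
    query = urlencode(
        {
            "provider_id": provider_id,
            "client_id": client_id,
            "username": username,
            "registration_token": registration_token,
        }
    )
    headers = {
        "x-coderai-provider-id": provider_id,
        "x-coderai-client-id": client_id,
        "x-coderai-username": username,
        "x-coderai-registration-token": registration_token,
    }
    return f"{base_ws_url}?{query}", headers
```

Calling `build_handshake("wss://aisbf.example.invalid/broker", "coderai", "node-1", "global", token)` on every reconnect with the same arguments keeps `client_id` and both scopes stable. The resulting pair can be passed to whichever WebSocket client library the agent uses.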
### 2. Wait for `registered` event
AISBF immediately sends a registration acknowledgment event on successful admission.
...
...
@@ -135,11 +180,21 @@ Store:
- `client_id`
- `username`
- `scope_name`
- `owner_user_id`
- `expires_at`
Notes:
- this event means the socket is admitted and the session row exists
- it does not yet mean hardware/capabilities metadata has been uploaded
- the client should send the explicit `register` operation immediately after this event
### 3. Send explicit `register` operation
After the `registered` event, CoderAI must send a `register` message describing its capabilities, hardware inventory, and advertised endpoints.
AISBF currently processes `register` as a normal inbound WebSocket message and responds with `status=ok` using the same `request_id`.
### 4. Enter long-lived receive loop
Then keep listening for incoming broker requests from AISBF.
...
...
@@ -233,6 +288,60 @@ CoderAI should send this after receiving the initial AISBF `registered` event.
AISBF replies with a success envelope.
### Fields AISBF currently reads from the `register` message
Top-level:
- `v`
- `op` with value `register`
- `request_id`
- optional top-level `registration_token`
- optional top-level `capabilities`
From `payload`:
- `endpoint`
- `transport`
- `registration_token`
- `studio_endpoints`
- `hardware`
- `gpus`
- `gpu_count`
- `total_vram_mb`
- `available_vram_mb`
- `capabilities`
AISBF behavior:
- if `payload.registration_token` or top-level `registration_token` is present and does not match the handshake token, AISBF replies with an error envelope
- if token matches, AISBF persists the metadata onto the broker session
- `payload.capabilities` takes precedence over missing top-level capability data
- if `gpus`, `gpu_count`, `total_vram_mb`, or `available_vram_mb` are omitted at the top level, AISBF falls back to the values inside `payload.hardware`
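The behavior listed above could look roughly like the following on the receiving side. This is a sketch of one reading of the rules, not AISBF's implementation: `apply_register` and the error code string are hypothetical, "top level" is read as the top level of `payload`, and `payload.capabilities` is tried before the top-level `capabilities`.

```python
HARDWARE_KEYS = ("gpus", "gpu_count", "total_vram_mb", "available_vram_mb")


def apply_register(session: dict, handshake_token: str, message: dict) -> dict:
    """Validate a register message and copy its metadata onto the session dict."""
    payload = message.get("payload") or {}
    request_id = message.get("request_id")

    # A token is optional in the message, but if present it must match the handshake.
    for token in (payload.get("registration_token"), message.get("registration_token")):
        if token is not None and token != handshake_token:
            return {
                "v": 1,
                "request_id": request_id,
                "status": "error",
                "error": "registration token mismatch",
                "code": "invalid_registration_token",  # hypothetical code value
            }

    # Hardware fields: direct payload value first, then payload.hardware.
    hardware = payload.get("hardware") or {}
    for key in HARDWARE_KEYS:
        value = payload.get(key, hardware.get(key))
        if value is not None:
            session[key] = value

    # Capabilities: payload value first, then the top-level field.
    capabilities = payload.get("capabilities")
    if capabilities is None:
        capabilities = message.get("capabilities")
    if capabilities is not None:
        session["capabilities"] = capabilities

    for key in ("endpoint", "transport", "studio_endpoints"):
        if key in payload:
            session[key] = payload[key]

    return {"v": 1, "request_id": request_id, "status": "ok"}
```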
Minimal acceptable `register` message:
```json
{
  "v": 1,
  "op": "register",
  "request_id": "reg-1",
  "payload": {
    "transport": "websocket",
    "registration_token": "<same_registration_token>",
    "capabilities": {}
  }
}
```
Recommended full `register` message:
- include `endpoint`
- include `transport`
- include `registration_token`
- include `hardware.gpus`, `hardware.gpu_count`, `hardware.total_vram_mb`, `hardware.available_vram_mb`
- include `studio_endpoints`
- include `capabilities`
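As an illustration of the recommended full message, the helper below assembles all of the listed fields. `build_register_message` is a hypothetical name, and the contents of each `gpus` entry are left to the caller because their exact shape is not specified here.

```python
def build_register_message(request_id, registration_token, endpoint, hardware,
                           studio_endpoints, capabilities):
    """Assemble a register envelope containing every recommended field."""
    gpus = list(hardware.get("gpus", []))
    return {
        "v": 1,
        "op": "register",
        "request_id": request_id,
        "payload": {
            "endpoint": endpoint,
            "transport": "websocket",
            "registration_token": registration_token,
            "hardware": {
                "gpus": gpus,
                # Default the count to the length of the GPU list when not given.
                "gpu_count": hardware.get("gpu_count", len(gpus)),
                "total_vram_mb": hardware.get("total_vram_mb", 0),
                "available_vram_mb": hardware.get("available_vram_mb", 0),
            },
            "studio_endpoints": list(studio_endpoints),
            "capabilities": dict(capabilities),
        },
    }
```

The result can be serialized with `json.dumps` and sent as a single text frame right after the `registered` event arrives.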
### Hardware Reporting Requirements
The `register` payload should include the best hardware view available to the running CoderAI process.
...
...
@@ -326,6 +435,37 @@ Heartbeat payloads may also refresh dynamic hardware state such as changing free
}
```
Current AISBF note:
- AISBF acknowledges heartbeat messages and merges the heartbeat `payload` into session metadata
- keep heartbeat payloads small and non-blocking
- use heartbeats for lightweight dynamic updates only; do not block the main receive loop on expensive hardware rescans
## Async Client Requirements
The broker WebSocket integration must be fully asynchronous.
CoderAI client requirements:
- the main receive loop must never block on model loading, inference, GPU inspection, or disk/network I/O
- expensive work should run in background tasks or worker executors while the socket remains responsive to incoming frames and ping/pong traffic
- the client should be able to receive broker requests while also sending progress or result frames for earlier requests
- the client must not serialize all work behind registration or heartbeat handling
AISBF broker behavior:
- AISBF now drains queued outbound broker requests in a background async task while independently reading inbound WebSocket messages
- this means the CoderAI client should expect inbound requests to arrive even while it is still sending heartbeat or response messages for unrelated work
- operations are correlated strictly by `request_id`; client implementations must not rely on message ordering alone
Recommended client architecture:
1. one async reader task for inbound WebSocket frames
2. one async writer path or send queue for outbound replies/events
3. per-request async tasks for local execution
4. a lightweight periodic heartbeat task
5. explicit request correlation by `request_id`
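The five-part architecture above can be sketched with plain `asyncio`. The class below is transport-agnostic: `recv` and `send` are hypothetical coroutines that a real client would wire to its WebSocket library, and the heartbeat envelope shape is an assumption made for illustration.

```python
import asyncio
import json


class BrokerClient:
    """Reader loop, single writer queue, per-request tasks, periodic heartbeat."""

    def __init__(self, recv, send, handler, heartbeat_interval=30.0):
        self._recv = recv        # async () -> next text frame, or None when closed
        self._send = send        # async (text frame) -> None
        self._handler = handler  # async (op, payload) -> result payload
        self._interval = heartbeat_interval
        self._outbox = asyncio.Queue()
        self._tasks = set()

    async def run(self):
        writer = asyncio.create_task(self._writer())
        heartbeat = asyncio.create_task(self._heartbeat())
        try:
            await self._reader()
            if self._tasks:  # let in-flight requests finish
                await asyncio.gather(*self._tasks)
            await self._outbox.join()  # flush queued replies
        finally:
            writer.cancel()
            heartbeat.cancel()
            await asyncio.gather(writer, heartbeat, return_exceptions=True)

    async def _reader(self):
        # Never awaits local work: each request runs in its own task.
        while (frame := await self._recv()) is not None:
            task = asyncio.create_task(self._handle(json.loads(frame)))
            self._tasks.add(task)
            task.add_done_callback(self._tasks.discard)

    async def _handle(self, message):
        # Replies are correlated by request_id, not by arrival order.
        reply = {"v": 1, "request_id": message.get("request_id")}
        try:
            result = await self._handler(message.get("op"), message.get("payload") or {})
            reply.update(status="ok", payload=result)
        except Exception as exc:
            reply.update(status="error", error=str(exc))
        await self._outbox.put(reply)

    async def _writer(self):
        # Single writer so frames from concurrent tasks are never interleaved.
        while True:
            reply = await self._outbox.get()
            try:
                await self._send(json.dumps(reply))
            finally:
                self._outbox.task_done()

    async def _heartbeat(self):
        while True:
            await asyncio.sleep(self._interval)
            await self._outbox.put({"v": 1, "op": "heartbeat", "payload": {}})
```

Because every request runs in its own task, a slow inference call does not delay a later `models.list` request, and replies may leave in a different order than the requests arrived. CPU-bound or blocking work inside `handler` would still need `asyncio.to_thread` or an executor to keep the loop responsive.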
AISBF merges those updates into the broker session metadata.
Registration tokens are resolved from the owning provider configuration. This means:
- the global admin configures the token for globally configured `coderai` providers
- each user configures the token for their own user-scoped `coderai` providers
- a broker session is only usable by requests belonging to the same owner principal
Broker registration is now scope-aware:
- global providers register with `username=global`
- user-owned providers register with `username=<aisbf_username>`
- the same scoped path must be used by the CoderAI client when connecting over WebSocket
- deployments behind TLS termination or reverse proxies must connect with the externally visible `wss://...` URL and preserve proxy headers so AISBF can remain scheme-aware
The AISBF dashboard now exposes this token directly inside each `coderai` provider configuration:
- token input is stored in `coderai_config.registration_token`
- global admins edit global provider tokens in the admin providers page
- users edit their own provider tokens in the user providers page
- token rotation is available inline and returns a newly generated provider-scoped secret
- broker session status is shown directly in the provider editor, including owner, client id, transport, last seen, and advertised Studio endpoints
CoderAI can keep a persistent outbound connection open to AISBF, register itself, and then receive routed provider operations over that same socket.
## What AISBF now expects
### Provider type
Use provider type:
```json
{
  "type": "coderai"
}
```
### Provider config shape
```json
{
  "id": "coderai",
  "name": "CoderAI Local Bridge",
  "endpoint": "http://127.0.0.1:11437",
  "type": "coderai",
  "api_key_required": false,
  "coderai_config": {
    "transport": "http",
    "http_enabled": true,
    "websocket_enabled": true,
    "broker_enabled": true,
    "broker_mode": false,
    "broker_preferred": true,
    "discovery_enabled": true,
    "client_id": "aisbf-default",
    "bridge_path": "/coderai/ws",
    "registration_path": "/coderai/register",
    "registration_token": "optional-shared-secret",
    "bridge_token": "optional-bridge-secret",
    "request_timeout": 300,
    "model_timeout": 30
  }
}
```
### AISBF behaviors
- For `transport=http`, AISBF uses the OpenAI Python client against `endpoint + /v1`.
- For `transport=websocket`, AISBF uses a WebSocket bridge and sends framed JSON envelopes.
- `proxy` now supports arbitrary forwarded request headers, query params, multipart form payloads, binary/base64 bodies, progress polling endpoints, and non-chat streaming event envelopes for long-running jobs.
- AISBF treats `coderai` like an OpenAI-style Studio adapter family.
- AISBF can also forward arbitrary Studio-native endpoints through `proxy` when the provider transport is WebSocket.
- AISBF validates that broker-enabled `coderai` providers have a non-empty `registration_token`.
- AISBF persists broker session metadata to `~/.aisbf/coderai_broker_sessions.json` so the dashboard can still show the last known broker session after restart, even while disconnected.
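Two of the behaviors above are easy to check before deploying a provider config. The helpers below are a hedged sketch with hypothetical names; they cover only the token rule and the `endpoint + /v1` base URL, and say nothing about how AISBF performs its own validation.

```python
def validate_coderai_provider(provider: dict) -> list:
    """Return a list of problems; an empty list means the checked rules pass."""
    errors = []
    if provider.get("type") != "coderai":
        errors.append("type must be 'coderai'")
    config = provider.get("coderai_config") or {}
    token = str(config.get("registration_token") or "").strip()
    if config.get("broker_enabled") and not token:
        errors.append("broker-enabled providers need a non-empty registration_token")
    return errors


def openai_base_url(provider: dict) -> str:
    """Base URL an OpenAI-style client would target for transport=http."""
    return provider["endpoint"].rstrip("/") + "/v1"
```

For the sample config shown earlier, `openai_base_url` yields `http://127.0.0.1:11437/v1` and `validate_coderai_provider` returns an empty list.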
## Required CoderAI HTTP endpoints
### 1. OpenAI-compatible endpoints
CoderAI should already expose these when HTTP mode is enabled:
- `GET /v1/models`
- `POST /v1/chat/completions`
- optional additional OpenAI-compatible endpoints that Studio may use directly via generic proxy
The `/v1/models` response should preferably include as much metadata as possible:
or another configured path mirrored in `coderai_config.bridge_path`.
### Headers AISBF sends
- `Authorization: Bearer <bridge_token_or_registration_token_or_api_key>` if available
- `x-coderai-client-id: <client_id>`
- `x-coderai-provider-id: <provider_id>`
### Broker connection query params
When CoderAI dials AISBF broker directly, it should connect using:
- `provider_id=<provider_id>`
- `client_id=<client_id>`
- `username=<username-or-global>`
- `registration_token=<owner-configured-token>`
AISBF broker admission currently resolves connection data in this order:
- `provider_id`: query param, then `x-coderai-provider-id`, then default `coderai`
- `client_id`: query param, then `x-coderai-client-id`, then generated fallback `anon-<timestamp>`
- `username`: query param, then `x-coderai-username`, then the scoped path value
- `registration_token`: query param, then `x-coderai-registration-token`
Important:
- the WebSocket broker flow starts as an HTTP `GET` upgrade request; this is expected
- the broker currently validates `registration_token`, not `Authorization: Bearer ...`, during WebSocket admission
- `client_id` must stay stable and match the `coderai_config.client_id` AISBF uses for that provider, or requests like `models.list` may fail even if the dashboard shows the session as connected
If `payload.registration_token` or top-level `registration_token` is present and does not match the handshake token, AISBF replies with an error envelope.
### Envelope format
AISBF sends one JSON request envelope per operation:
```json
{
  "v": 1,
  "op": "chat.completions",
  "request_id": "coderai-1746960000000",
  "provider_id": "coderai",
  "client_id": "aisbf-default",
  "registration_token": "optional-shared-secret",
  "payload": {
    "model": "llama3.1:8b",
    "messages": [
      {"role": "user", "content": "hello"}
    ],
    "stream": false
  }
}
```
### Non-streaming response envelope
```json
{
  "v": 1,
  "request_id": "coderai-1746960000000",
  "status": "ok",
  "payload": {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1746960000,
    "model": "llama3.1:8b",
    "choices": [
      {
        "index": 0,
        "message": {"role": "assistant", "content": "hello"},
        "finish_reason": "stop"
      }
    ],
    "usage": {
      "prompt_tokens": 10,
      "completion_tokens": 5,
      "total_tokens": 15
    }
  }
}
```
### Error response envelope
```json
{
  "v": 1,
  "request_id": "coderai-1746960000000",
  "status": "error",
  "error": "Model not available",
  "code": "model_not_found",
  "details": {
    "model": "missing-model"
  }
}
```
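On the CoderAI side, the two response shapes above can be produced by a pair of small helpers. The function names are hypothetical; the field names are taken from the envelopes shown above.

```python
def ok_envelope(request_id: str, payload: dict) -> dict:
    """Build a non-streaming success envelope that echoes the request_id."""
    return {"v": 1, "request_id": request_id, "status": "ok", "payload": payload}


def error_envelope(request_id: str, error: str, code: str, details=None) -> dict:
    """Build an error envelope; details is included only when provided."""
    envelope = {
        "v": 1,
        "request_id": request_id,
        "status": "error",
        "error": error,
        "code": code,
    }
    if details is not None:
        envelope["details"] = details
    return envelope
```

For example, `error_envelope("coderai-1746960000000", "Model not available", "model_not_found", {"model": "missing-model"})` reproduces the error envelope shown above.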
### Streaming response envelopes
For `chat.completions` with `stream=true`, send multiple envelopes.
- `payload.chunk` should be a full SSE fragment already formatted exactly as AISBF should relay it.
- This keeps AISBF transport-simple and lets CoderAI own protocol correctness.
- Include `data: [DONE]\n\n` as one of the streamed chunks when the upstream semantics require it.
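A sketch of how a client might emit such a stream is shown below. Only `payload.chunk` and the `[DONE]` sentinel come from the text above; the top-level `event` field, the closing `done` envelope, and the `stream_envelopes` name are assumptions made for illustration.

```python
import json


def stream_envelopes(request_id: str, deltas, model: str):
    """Yield one envelope per content delta, then [DONE], then a final event."""
    for delta in deltas:
        chunk = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [
                {"index": 0, "delta": {"content": delta}, "finish_reason": None}
            ],
        }
        yield {
            "v": 1,
            "request_id": request_id,
            "event": "chunk",  # assumed field name
            # Pre-formatted SSE fragment that AISBF relays unchanged.
            "payload": {"chunk": f"data: {json.dumps(chunk)}\n\n"},
        }
    yield {
        "v": 1,
        "request_id": request_id,
        "event": "chunk",
        "payload": {"chunk": "data: [DONE]\n\n"},
    }
    yield {"v": 1, "request_id": request_id, "event": "done"}
```

Every envelope repeats the same `request_id`, so AISBF can relay chunks for several concurrent streams over one socket.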
## Async broker behavior requirements
The broker client must be fully asynchronous.
- do not block the WebSocket receive loop on hardware probing, model loading, inference, file I/O, or reconnect bookkeeping
- handle inbound requests concurrently and correlate replies by `request_id`
- keep heartbeat handling lightweight
- preserve responsiveness to ping/pong traffic while local work is running
AISBF now independently drains queued outbound broker requests while also reading inbound WebSocket messages, so client implementations should not assume a strict request/response lockstep over the socket.
## Broker session visibility, persistence, and multi-node routing
AISBF now tracks two broker states:
- live connected sessions held in memory for active request routing
- persisted session metadata snapshots stored in `~/.aisbf/coderai_broker_sessions.json`
Persisted metadata is dashboard-facing only. It is used to show the last known session details after restart, but it is not treated as an active transport path until CoderAI reconnects.
For multi-node AISBF deployments behind a reverse proxy / load balancer:
- session status and ownership metadata are stored in the configured AISBF cache backend
- requests are enqueued into cache-backed broker queues keyed by broker session id
- the AISBF node holding the live WebSocket consumes queued requests and forwards them to CoderAI
- replies are written back through cache-backed reply keys so the AISBF node that originated the request can receive the result
Redis is the preferred backend for this distributed mode. SQLite/MySQL can operate as polling-based fallbacks. Memory/file cache backends are not suitable for cross-node broker routing.
Expected behavior:
- after reconnect, the persisted snapshot is refreshed with the new live session details
- after disconnect or AISBF restart, the dashboard may still show the last known client id / endpoint / last seen, but `connected` remains false until a new WebSocket is established
## Bridge operations CoderAI must implement
### `op = "models.list"`
Request:
```json
{
  "v": 1,
  "op": "models.list",
  "request_id": "...",
  "provider_id": "coderai",
  "client_id": "aisbf-default",
  "payload": {}
}
```
Response payload should be equivalent to `GET /v1/models`.
### `op = "chat.completions"`
Payload is equivalent to OpenAI `POST /v1/chat/completions` request body.
### `op = "capabilities"`
Response payload should be equivalent to `GET /coderai/capabilities`.
### `op = "register"`
Purpose:
- allow an outbound-only CoderAI agent to announce itself
- report its reachable transports
- report enabled Studio-native endpoints
- report model inventory
- attach metadata to the live AISBF broker session
Streaming and progress responses may emit multiple envelopes with `event` values like `progress`, `output`, `log`, `data`, `chunk`, and finally `done` or `completed`.
Capability advertisements should include endpoint metadata for custom pipelines, including supported methods, streaming mode, expected input/output modalities, and whether multipart or binary transport is required.
## Recommended CoderAI architecture
### Server components
1. **OpenAI compatibility router**
   - exposes `/v1/models`, `/v1/chat/completions`, and any other supported OpenAI endpoints
2. **Studio-native router**
   - exposes endpoints such as `/v1/video/dub`, `/v1/audio/tts`, `/v1/images/generate`, etc.
3. **Capabilities registry**
   - enumerates enabled endpoints
   - enumerates loaded models
   - computes normalized `studio_capabilities`
4. **WebSocket bridge server**
   - accepts AISBF envelopes
   - dispatches by `op`
   - for `proxy`, internally calls the same handler used by HTTP routes
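The bridge server's dispatch step can be sketched as two registries: one for HTTP routes and one for bridge operations, with `proxy` looking up the HTTP handler instead of duplicating it. Everything here is illustrative: the decorator names, the `unsupported_op` code, and the `method`/`path`/`body` fields of the `proxy` payload are assumptions.

```python
import asyncio

HTTP_ROUTES = {}  # (method, path) -> async handler, shared with the HTTP server
OPS = {}          # op name -> async handler for bridge envelopes


def route(method, path):
    def register(func):
        HTTP_ROUTES[(method, path)] = func
        return func
    return register


def op(name):
    def register(func):
        OPS[name] = func
        return func
    return register


@route("GET", "/v1/models")
async def list_models(body):
    return {"object": "list", "data": []}


@op("models.list")
async def op_models_list(payload):
    # Same result as GET /v1/models, served by the same handler.
    return await list_models(payload)


@op("proxy")
async def op_proxy(payload):
    key = (str(payload.get("method", "GET")).upper(), payload.get("path"))
    handler = HTTP_ROUTES.get(key)
    if handler is None:
        raise LookupError(f"no route for {key[0]} {key[1]}")
    return await handler(payload.get("body"))


async def dispatch(envelope: dict) -> dict:
    """Route one inbound envelope by op and wrap the result or the failure."""
    reply = {"v": 1, "request_id": envelope.get("request_id")}
    handler = OPS.get(envelope.get("op"))
    if handler is None:
        reply.update(status="error", error="unsupported op", code="unsupported_op")
        return reply
    try:
        reply.update(status="ok", payload=await handler(envelope.get("payload") or {}))
    except Exception as exc:
        reply.update(status="error", error=str(exc))
    return reply
```

Keeping a single handler per endpoint means the HTTP and WebSocket paths cannot drift apart in behavior.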