feat: model "to-download" list, mmproj vision, styled modals, broker + packaging

Web UI / models:
- "To download" wishlist: models known but not on disk and not configured show
  as non-configured to-download rows. Free-disk on an unconfigured model, Remove
  on a model with no files left, and a new "Add to list" button in the download
  window all record into models.json `to_download`; pruned on enable/download.
  New endpoints model-mark-download / model-unmark-download.
- mmproj multimodal components: mmproj GGUFs are classified as components (not
  models), selectable per-GGUF in the model config (auto-selected, enables vision
  capability). VulkanBackend loads them via llama.cpp's MTMDChatHandler (--mmproj
  equivalent), and the chat path now forwards image_url content end-to-end.
- All window.alert() replaced by a shared styled showAlert()/showConfirm() modal
  in base.html (used across every admin template).

Front proxy / broker:
- Fix engine model-assignment NameError (keep -> _keep).
- Brokered GET /coderai/capabilities now answers from the front (whole node) so
  multi-GPU hosts report every card, not a single engine's CUDA-visible one.
- Log a clear reason when the broker is disabled.

Packaging (distributable OCI image):
- Multi-stage venv image + smoke test; bundle ds4/wav2lip/sadtalker + parler;
  whisper-server etc. dereferenced (cp -aL) so no dangling symlinks.
- Dockerfile.update + update_oci_image.sh: ~30s incremental code-only rebuild on
  an immutable coderai:base (no 20GB bundle recopy).
- run_oci.sh: --local/--config-dir + --map to run against existing local config
  and data dirs without a rebuild; --debug[=flags] + --log-file for selectable
  debug flags and a host-tailable file log (launcher tees; supervisord kills the
  process group). tmp_janitor age-prunes the dedicated temp dir.
Co-Authored-By: 's avatarClaude Opus 4.8 <noreply@anthropic.com>
parent 9d023ec2
...@@ -43,6 +43,10 @@ township_output/ ...@@ -43,6 +43,10 @@ township_output/
.packaging-cache/ .packaging-cache/
tmp/ tmp/
# Exported image tarballs + local OCI run-state (large artifacts)
dist/
coderai-runtime/
# Video editor sessions + generated media (runtime artifacts) # Video editor sessions + generated media (runtime artifacts)
video_editor/sessions/ video_editor/sessions/
tools/coderai_media/ tools/coderai_media/
...@@ -286,3 +286,67 @@ safe. ...@@ -286,3 +286,67 @@ safe.
14. Thermal protection is config-driven and model-agnostic (config.json 14. Thermal protection is config-driven and model-agnostic (config.json
`thermal`). Don't special-case it per model/backend; it only reads temps and `thermal`). Don't special-case it per model/backend; it only reads temps and
sleeps. Honour the enable flags and high/resume hysteresis. sleeps. Honour the enable flags and high/resume hysteresis.
================================================================================
## Distributable Docker image (packaging/linux)
================================================================================
All-in-one image: coderai + tools (editor/videogen/township) behind nginx on a
single port (8776), built from the LOCAL install's venv + binaries.
Multi-stage `Dockerfile.oci-venv`:
- assembler stage stages the local bundle into /opt/coderai (python-build-
standalone interpreter + venv site-packages + ldd'd native libs + parler
overlay + lip-sync venv/repos + py310 + ds4). The ~20 GB bundle COPY lives
ONLY here; the runtime stage COPYs the assembled tree ONCE (no double-store).
- runtime stage: apt (nginx/supervisor/vulkan-tools/ffmpeg/...), COPY the
assembled /opt/coderai, then COPY app code → /opt/coderai/app, launchers →
/usr/local/bin, nginx/supervisor confs. Entry = coderai-entrypoint →
supervisord (nginx + main server + tool UIs).
- Do NOT set PYTHONHOME globally (breaks the system-python supervisord); set
PATH only. Bundle dereferences host symlinks (cp -aL) so binaries like
whisper-server are real files in the image, not dangling links.
Full build (slow, ~15 min — rebuilds the bundle):
packaging/linux/build_oci_image.sh # tags coderai:dist
Smoke test (no weights, checks services + every bundled binary):
DOCKER="sudo docker" GPU="--gpus all" PORT=18082 \
packaging/linux/smoke_test_services.sh coderai:dist
Run against your LIVE local config + data (no rebuild — pure bind-mounts):
packaging/linux/run_oci.sh --nvidia --local \
--map /AI/guffcache --map /AI/huggingface --map /AI/offloads
- The image launcher reads config from /config/coderai and runs
`coderai --config /config/coderai`, rewriting server.host/port in config.json.
- `--local` (= --config-dir ~/.coderai) copies ONLY the *.json config files to
a temp dir and mounts it at /config/coderai, so your real config is untouched
(use --inplace-config to edit it directly).
- `--map HOST[:CONT]` bind-mounts a host dir at the SAME path inside the
container so the ABSOLUTE paths in models.json/config.json (gguf/hf caches,
offloads) resolve unchanged. Without these maps the models won't be found.
- `--debug[=SPEC]` runs coderai with --debug* flags (SPEC default 'all';
e.g. `--debug=engine,requests,ws` --debug-engine/--debug-requests/--debug-ws,
`--debug` always auto-added) and writes a host-tailable file log. `--log-file
PATH` sets the in-container log path (default /cache/logs/coderai.log host
under the cache mount). Driven by env CODERAI_DEBUG + CODERAI_LOG_FILE, read
by the coderai-oci launcher, which tees output so `docker logs` still works.
supervisord [program:coderai] uses stopasgroup/killasgroup so the front's
engine subprocesses + the tee are torn down together. NOTE: the launcher +
supervisord.conf are baked in, so changes need a (fast) update_oci_image.sh.
Incremental update (FAST, ~30 s — code-only changes, NO bundle recopy):
DOCKER="sudo docker" packaging/linux/update_oci_image.sh
- `Dockerfile.update` is `FROM coderai:base` and re-layers ONLY the app code +
launchers + service confs. The heavy bundle layers are inherited unchanged.
- Keeps an immutable `coderai:base` (the bundle) and rebuilds `coderai:dist`
as base + a thin app layer. Every update starts from the SAME base, so app
layers never stack across updates. dist and base SHARE the bundle layers —
keeping both costs only the app layer (a few MB), not a second 23 GB.
- First run seeds coderai:base from the current coderai:dist (docker tag).
- Re-baseline the bundle (new venv/libs/tools): run build_oci_image.sh, then
`docker rmi coderai:base` so the next update re-seeds it from the new dist.
- Use this whenever ONLY codai/ app code (or launchers/confs) changed — a full
build_oci_image.sh is wasteful for that.
- CAUTION: COPY adds/overwrites but does NOT delete files removed from the
repo; the cleanup RUN prunes only known-stale paths (.git/venv*/dist/...). A
source file deleted from codai/ lingers in the overlay until a full rebuild.
...@@ -980,6 +980,14 @@ async def api_download_model( ...@@ -980,6 +980,14 @@ async def api_download_model(
if existing: if existing:
return {"session_id": existing, "attached": True} return {"session_id": existing, "attached": True}
# A download supersedes any "to download" wishlist entry for this model.
if config_manager is not None:
changed = _prune_to_download(model_id)
if file_pattern:
changed = _prune_to_download(file_pattern) or changed
if changed:
config_manager.save_models()
session_id = str(_uuid.uuid4()) session_id = str(_uuid.uuid4())
pq = _q.Queue() pq = _q.Queue()
_download_sessions[session_id] = pq _download_sessions[session_id] = pq
...@@ -1170,6 +1178,58 @@ def _hf_repo_id_from_path(path: str) -> str: ...@@ -1170,6 +1178,58 @@ def _hf_repo_id_from_path(path: str) -> str:
return '' return ''
# Categories that hold real (configured) models in models.json.
_VALID_MODEL_CATS = {
"text_models", "image_models", "audio_models", "gguf_models", "tts_models",
"vision_models", "video_models", "audio_gen_models", "embedding_models",
"spatial_models",
}
def _entry_key(entry) -> str:
"""The identifying path/id of a models.json entry (str or dict)."""
if isinstance(entry, str):
return entry
if isinstance(entry, dict):
return entry.get("path") or entry.get("id") or ""
return ""
def _basename_key(key: str) -> str:
import os as _os
return _os.path.basename(key) if ("/" in key or _os.sep in key) else key
def _is_model_configured(model_id: str) -> bool:
"""True if model_id is already a configured model (matched by id or basename)."""
if config_manager is None:
return False
fname = _basename_key(model_id)
for cat in _VALID_MODEL_CATS:
for m in config_manager.models_data.get(cat, []):
key = _entry_key(m)
if key == model_id or (fname and _basename_key(key) == fname):
return True
return False
def _prune_to_download(model_id: str) -> bool:
"""Drop any 'to download' wishlist entry matching model_id. Returns True if changed."""
if config_manager is None:
return False
lst = config_manager.models_data.get("to_download")
if not lst:
return False
fname = _basename_key(model_id)
kept = [e for e in lst
if not (_entry_key(e) == model_id
or (fname and _basename_key(_entry_key(e)) == fname))]
if len(kept) != len(lst):
config_manager.models_data["to_download"] = kept
return True
return False
def _scan_caches() -> dict: def _scan_caches() -> dict:
import os import os
result: dict = {"hf": [], "gguf": []} result: dict = {"hf": [], "gguf": []}
...@@ -1451,6 +1511,49 @@ def _scan_caches() -> dict: ...@@ -1451,6 +1511,49 @@ def _scan_caches() -> dict:
"configs": all_configs.get(path, []), "configs": all_configs.get(path, []),
}) })
# Surface "to download" wishlist entries: models the user wants listed for
# later download but has NOT configured and are NOT on disk. They appear as
# non-configured rows with a download button (in_config=False, missing=True).
seen_gguf = {m["path"] for m in result["gguf"]} | {m["filename"] for m in result["gguf"]}
seen_hf = {m["id"] for m in result["hf"]}
if config_manager:
for entry in config_manager.models_data.get("to_download", []):
e = entry if isinstance(entry, dict) else {"path": entry}
mid = (e.get("path") or e.get("id") or "").strip()
if not mid or _is_model_configured(mid):
continue
repo = e.get("source_repo") or mid
mtype = e.get("model_type") or "text_models"
is_gguf = (bool(e.get("is_gguf")) or mid.lower().endswith(".gguf")
or "gguf" in mid.lower() or mtype == "gguf_models")
fname = os.path.basename(mid) if ("/" in mid or os.sep in mid) else mid
caps = e.get("capabilities") or detect_model_capabilities(mid).to_list()
if is_gguf:
if mid in seen_gguf or fname in seen_gguf:
continue
result["gguf"].append({
"filename": fname, "path": mid,
"size_gb": 0, "size_bytes": 0,
"in_config": False, "missing": True, "to_download": True,
"source_repo": repo,
"model_type": mtype if mtype != "gguf_models" else "text_models",
"settings": {}, "capabilities": caps,
"incomplete": False, "configs": [],
})
seen_gguf.add(mid); seen_gguf.add(fname)
else:
if mid in seen_hf:
continue
result["hf"].append({
"id": mid, "size_gb": 0, "size_bytes": 0, "revision_count": 0,
"files": [], "file_count": 0,
"in_config": False, "missing": True, "to_download": True,
"source_repo": repo, "model_type": mtype,
"settings": {}, "capabilities": caps,
"incomplete": False, "configs": [],
})
seen_hf.add(mid)
return result return result
...@@ -1729,11 +1832,65 @@ async def api_model_add_known(request: Request, username: str = Depends(require_ ...@@ -1729,11 +1832,65 @@ async def api_model_add_known(request: Request, username: str = Depends(require_
return {"success": True, "already": True} return {"success": True, "already": True}
config_manager.models_data.setdefault(model_type, []).append(entry) config_manager.models_data.setdefault(model_type, []).append(entry)
_prune_to_download(model_id)
config_manager.save_models() config_manager.save_models()
_broker_notify_models_updated(request) _broker_notify_models_updated(request)
return {"success": True} return {"success": True}
@router.post("/admin/api/model-mark-download", summary="List a model for later download")
async def api_model_mark_download(request: Request, username: str = Depends(require_admin)):
"""Record a model in the 'to download' wishlist: it appears in the model list
as a non-configured, to-be-downloaded entry (no files fetched, no serving
config created). Used by 'Free disk' on unconfigured models, 'Remove' on a
model with no files left, and 'Add to list' in the download window."""
if config_manager is None:
raise HTTPException(status_code=503, detail="Config manager not initialized")
data = await request.json()
model_id = (data.get("model_id") or data.get("path") or "").strip()
if not model_id:
raise HTTPException(status_code=400, detail="model_id is required")
source_repo = (data.get("source_repo") or model_id).strip()
model_type = (data.get("model_type") or "").strip()
is_gguf = (bool(data.get("is_gguf")) or model_type == "gguf_models"
or model_id.lower().endswith(".gguf") or "gguf" in model_id.lower())
if is_gguf:
model_type = "gguf_models"
if model_type not in _VALID_MODEL_CATS:
model_type = "text_models"
# Already a real (configured) model — nothing to add.
if _is_model_configured(model_id):
return {"success": True, "already_configured": True}
import os as _os
lst = config_manager.models_data.setdefault("to_download", [])
fname = _basename_key(model_id)
for e in lst:
k = _entry_key(e)
if k == model_id or (fname and _basename_key(k) == fname):
return {"success": True, "already": True}
lst.append({"path": model_id, "source_repo": source_repo,
"model_type": model_type, "is_gguf": is_gguf})
config_manager.save_models()
_broker_notify_models_updated(request)
return {"success": True}
@router.post("/admin/api/model-unmark-download", summary="Remove a model from the download list")
async def api_model_unmark_download(request: Request, username: str = Depends(require_admin)):
"""Drop a model from the 'to download' wishlist (the user no longer wants it
listed). Has no effect on configured models or files on disk."""
if config_manager is None:
raise HTTPException(status_code=503, detail="Config manager not initialized")
data = await request.json()
model_id = (data.get("model_id") or data.get("path") or "").strip()
if not model_id:
raise HTTPException(status_code=400, detail="model_id is required")
if _prune_to_download(model_id):
config_manager.save_models()
_broker_notify_models_updated(request)
return {"success": True}
@router.post("/admin/api/model-enable", summary="Enable a model") @router.post("/admin/api/model-enable", summary="Enable a model")
async def api_model_enable(request: Request, username: str = Depends(require_admin)): async def api_model_enable(request: Request, username: str = Depends(require_admin)):
"""Register a cached model in models.json so CoderAI can use it.""" """Register a cached model in models.json so CoderAI can use it."""
...@@ -1747,8 +1904,13 @@ async def api_model_enable(request: Request, username: str = Depends(require_adm ...@@ -1747,8 +1904,13 @@ async def api_model_enable(request: Request, username: str = Depends(require_adm
if model_type not in valid: if model_type not in valid:
raise HTTPException(status_code=400, detail=f"model_type must be one of {valid}") raise HTTPException(status_code=400, detail=f"model_type must be one of {valid}")
lst = config_manager.models_data.setdefault(model_type, []) lst = config_manager.models_data.setdefault(model_type, [])
changed = False
if path not in lst: if path not in lst:
lst.append(path) lst.append(path)
changed = True
if _prune_to_download(path):
changed = True
if changed:
config_manager.save_models() config_manager.save_models()
_broker_notify_models_updated(request) _broker_notify_models_updated(request)
return {"success": True} return {"success": True}
...@@ -2285,7 +2447,7 @@ async def api_model_configure(request: Request, username: str = Depends(require_ ...@@ -2285,7 +2447,7 @@ async def api_model_configure(request: Request, username: str = Depends(require_
"component_quantization", "output_crf", "force_vram_update", "component_quantization", "output_crf", "force_vram_update",
"balanced_gpu_percent", "acceleration", "balanced_gpu_percent", "acceleration",
"cache_type_k", "cache_type_v", "turboquant", "engine", "cache_type_k", "cache_type_v", "turboquant", "engine",
"quant_backend", "kv_cache_budget_mb", "kv_cache_slots"): "quant_backend", "kv_cache_budget_mb", "kv_cache_slots", "mmproj"):
if key in data: if key in data:
entry[key] = data[key] entry[key] = data[key]
......
...@@ -335,7 +335,7 @@ async function deleteEntry() { ...@@ -335,7 +335,7 @@ async function deleteEntry() {
closeDetail(); closeDetail();
loadArchive(); loadArchive();
} catch(e) { } catch(e) {
alert('Delete failed: ' + e.message); showAlert('Delete failed: ' + e.message);
} }
} }
......
...@@ -104,6 +104,81 @@ function donateCopy(id, btn) { ...@@ -104,6 +104,81 @@ function donateCopy(id, btn) {
</main> </main>
{% endif %} {% endif %}
<!-- Shared confirm / notice modal (replaces window.confirm / window.alert) -->
<div id="confirm-modal" class="modal" onclick="if(event.target===this)document.getElementById('confirm-modal-cancel').click()">
<div class="modal-box" style="max-width:420px">
<div class="modal-head">
<span class="modal-title" id="confirm-modal-title">Confirm</span>
<button class="modal-close" id="confirm-modal-x">&times;</button>
</div>
<div class="modal-body">
<p id="confirm-modal-msg" style="margin:0 0 1.25rem;white-space:pre-wrap"></p>
<div style="display:flex;gap:.5rem;justify-content:flex-end">
<button class="btn btn-ghost" id="confirm-modal-cancel">Cancel</button>
<button class="btn btn-danger" id="confirm-modal-ok">Confirm</button>
</div>
</div>
</div>
</div>
<script>
// Global modal helpers, shared by every admin page. Defined here so templates
// can call showAlert()/showConfirm() instead of window.alert()/window.confirm().
if(typeof window.openModal!=='function') window.openModal=function(id){document.getElementById(id).classList.add('show')};
if(typeof window.closeModal!=='function') window.closeModal=function(id){document.getElementById(id).classList.remove('show')};
window.showConfirm=function(title, msg, okLabel){
return new Promise(resolve => {
document.getElementById('confirm-modal-title').textContent = title;
document.getElementById('confirm-modal-msg').textContent = msg;
const okBtn = document.getElementById('confirm-modal-ok');
const cancelBtn= document.getElementById('confirm-modal-cancel');
const xBtn = document.getElementById('confirm-modal-x');
okBtn.className = 'btn btn-danger';
okBtn.textContent = okLabel || 'Confirm';
cancelBtn.style.display = '';
openModal('confirm-modal');
function cleanup(result){
closeModal('confirm-modal');
okBtn.removeEventListener('click', onOk);
cancelBtn.removeEventListener('click', onCancel);
xBtn.removeEventListener('click', onCancel);
resolve(result);
}
function onOk(){ cleanup(true); }
function onCancel(){ cleanup(false); }
okBtn.addEventListener('click', onOk);
cancelBtn.addEventListener('click', onCancel);
xBtn.addEventListener('click', onCancel);
});
};
// Styled replacement for window.alert(): a single-button notice modal.
window.showAlert=function(msg, title, kind){
return new Promise(resolve => {
if(!title && !kind && /^\s*(error|failed|cannot|could not)\b/i.test(String(msg||''))) kind = 'error';
document.getElementById('confirm-modal-title').textContent =
title || (kind === 'error' ? 'Error' : 'Notice');
document.getElementById('confirm-modal-msg').textContent = msg;
const okBtn = document.getElementById('confirm-modal-ok');
const cancelBtn = document.getElementById('confirm-modal-cancel');
const xBtn = document.getElementById('confirm-modal-x');
okBtn.className = 'btn btn-primary';
okBtn.textContent = 'OK';
cancelBtn.style.display = 'none';
openModal('confirm-modal');
function cleanup(){
closeModal('confirm-modal');
cancelBtn.style.display = '';
okBtn.removeEventListener('click', onOk);
xBtn.removeEventListener('click', onOk);
resolve();
}
function onOk(){ cleanup(); }
okBtn.addEventListener('click', onOk);
xBtn.addEventListener('click', onOk);
});
};
</script>
{% block scripts %}{% endblock %} {% block scripts %}{% endblock %}
</body> </body>
</html> </html>
...@@ -4229,12 +4229,12 @@ async function loadCharProfileIntoSlot(prefix, idx, name) { ...@@ -4229,12 +4229,12 @@ async function loadCharProfileIntoSlot(prefix, idx, name) {
charSlots[prefix][idx].name = charSlots[prefix][idx].name || d.name; charSlots[prefix][idx].name = charSlots[prefix][idx].name || d.name;
charSlots[prefix][idx].images = (d.images||[]).map(img => img.data); charSlots[prefix][idx].images = (d.images||[]).map(img => img.data);
renderCharSlots(prefix); renderCharSlots(prefix);
} catch(e) { alert('Failed to load profile: '+e.message); } } catch(e) { showAlert('Failed to load profile: '+e.message); }
} }
async function saveCharSlotAsProfile(prefix, idx) { async function saveCharSlotAsProfile(prefix, idx) {
const slot = charSlots[prefix]?.[idx]; const slot = charSlots[prefix]?.[idx];
if (!slot || !slot.images.length) { alert('Add at least one image first.'); return; } if (!slot || !slot.images.length) { showAlert('Add at least one image first.'); return; }
const name = slot.name || prompt('Profile name:'); const name = slot.name || prompt('Profile name:');
if (!name) return; if (!name) return;
try { try {
...@@ -4246,8 +4246,8 @@ async function saveCharSlotAsProfile(prefix, idx) { ...@@ -4246,8 +4246,8 @@ async function saveCharSlotAsProfile(prefix, idx) {
charSlots[prefix][idx].name = name; charSlots[prefix][idx].name = name;
await loadCharProfileList(); await loadCharProfileList();
renderCharSlots(prefix); renderCharSlots(prefix);
alert(`Saved profile "${name}"`); showAlert(`Saved profile "${name}"`);
} catch(e) { alert('Save failed: '+e.message); } } catch(e) { showAlert('Save failed: '+e.message); }
} }
// ───────────────────────────────────────────────────────────────── // ─────────────────────────────────────────────────────────────────
...@@ -6051,14 +6051,14 @@ async function profCharView(name) { ...@@ -6051,14 +6051,14 @@ async function profCharView(name) {
try { try {
const d = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name)).then(r=>r.json()); const d = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name)).then(r=>r.json());
_openProfModal(`Character: ${d.name}`, d.description||'', d.images||[]); _openProfModal(`Character: ${d.name}`, d.description||'', d.images||[]);
} catch(e) { alert('Failed to load character: ' + e.message); } } catch(e) { showAlert('Failed to load character: ' + e.message); }
} }
async function profCharDelete(name) { async function profCharDelete(name) {
if (!confirm(`Delete character profile "${name}"?`)) return; if (!confirm(`Delete character profile "${name}"?`)) return;
const r = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name), {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name), {method:'DELETE'});
if (r.ok) await profCharLoad(); if (r.ok) await profCharLoad();
else alert('Delete failed: ' + await r.text()); else showAlert('Delete failed: ' + await r.text());
} }
...@@ -6139,7 +6139,7 @@ async function profVoiceDelete(name) { ...@@ -6139,7 +6139,7 @@ async function profVoiceDelete(name) {
if (!confirm(`Delete voice profile "${name}"?`)) return; if (!confirm(`Delete voice profile "${name}"?`)) return;
const r = await fetch(ROOT_PATH + '/admin/api/voices/'+encodeURIComponent(name), {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/voices/'+encodeURIComponent(name), {method:'DELETE'});
if (r.ok) await profVoiceLoad(); if (r.ok) await profVoiceLoad();
else alert('Delete failed: ' + await r.text()); else showAlert('Delete failed: ' + await r.text());
} }
// ───────────────────────────────────────────────────────────────── // ─────────────────────────────────────────────────────────────────
...@@ -6296,14 +6296,14 @@ async function profEnvView(name) { ...@@ -6296,14 +6296,14 @@ async function profEnvView(name) {
try { try {
const d = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name)).then(r=>r.json()); const d = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name)).then(r=>r.json());
_openProfModal(`Environment: ${d.name}`, d.description||'', d.images||[]); _openProfModal(`Environment: ${d.name}`, d.description||'', d.images||[]);
} catch(e) { alert('Failed to load environment: ' + e.message); } } catch(e) { showAlert('Failed to load environment: ' + e.message); }
} }
async function profEnvDelete(name) { async function profEnvDelete(name) {
if (!confirm(`Delete environment profile "${name}"?`)) return; if (!confirm(`Delete environment profile "${name}"?`)) return;
const r = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name), {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name), {method:'DELETE'});
if (r.ok) await profEnvLoad(); if (r.ok) await profEnvLoad();
else alert('Delete failed: ' + await r.text()); else showAlert('Delete failed: ' + await r.text());
} }
// ───────────────────────────────────────────────────────────────── // ─────────────────────────────────────────────────────────────────
...@@ -6528,7 +6528,7 @@ async function deleteCustomPipeline(id) { ...@@ -6528,7 +6528,7 @@ async function deleteCustomPipeline(id) {
_customPipelines = _customPipelines.filter(p => p.id !== id); _customPipelines = _customPipelines.filter(p => p.id !== id);
if (_editingPipelineId === id) { _editingPipelineId = null; _pbSteps = []; renderBuilderSteps(); } if (_editingPipelineId === id) { _editingPipelineId = null; _pbSteps = []; renderBuilderSteps(); }
renderCustomPipelineCards(); renderCustomPipelineCards();
} catch(e) { alert('Delete failed: '+e.message); } } catch(e) { showAlert('Delete failed: '+e.message); }
} }
function _renderPipelineResult(outId, progId, d) { function _renderPipelineResult(outId, progId, d) {
...@@ -6683,7 +6683,7 @@ async function archiveDelete(filename) { ...@@ -6683,7 +6683,7 @@ async function archiveDelete(filename) {
_archiveFiles = _archiveFiles.filter(f => f.filename !== filename); _archiveFiles = _archiveFiles.filter(f => f.filename !== filename);
renderArchive(); renderArchive();
} catch(e) { } catch(e) {
alert('Delete failed: ' + e.message); showAlert('Delete failed: ' + e.message);
} }
} }
......
This diff is collapsed.
...@@ -244,9 +244,9 @@ async function restartEngine(id, name){ ...@@ -244,9 +244,9 @@ async function restartEngine(id, name){
if (!confirm(`Restart engine "${name}"? In-flight requests on it will fail; the supervisor respawns it immediately.`)) return; if (!confirm(`Restart engine "${name}"? In-flight requests on it will fail; the supervisor respawns it immediately.`)) return;
try { try {
const r = await fetch(ROOT_PATH + '/admin/api/engines/' + id + '/restart', {method:'POST'}); const r = await fetch(ROOT_PATH + '/admin/api/engines/' + id + '/restart', {method:'POST'});
if (!r.ok) { const e = await r.json().catch(()=>({})); alert(e.detail || 'Restart failed'); } if (!r.ok) { const e = await r.json().catch(()=>({})); showAlert(e.detail || 'Restart failed'); }
setTimeout(loadEngines, 800); setTimeout(loadEngines, 800);
} catch(e) { alert(e.message); } } catch(e) { showAlert(e.message); }
} }
let _refreshing = false; let _refreshing = false;
...@@ -338,9 +338,9 @@ async function taskAction(id, action) { ...@@ -338,9 +338,9 @@ async function taskAction(id, action) {
const r = await fetch(ROOT_PATH + '/admin/api/tasks/' + encodeURIComponent(id) + '/' + action, {method:'POST'}); const r = await fetch(ROOT_PATH + '/admin/api/tasks/' + encodeURIComponent(id) + '/' + action, {method:'POST'});
if (!r.ok) { if (!r.ok) {
const e = await r.json().catch(() => ({})); const e = await r.json().catch(() => ({}));
alert(e.detail || (verb + ' failed')); showAlert(e.detail || (verb + ' failed'));
} }
} catch (e) { alert(e.message); } } catch (e) { showAlert(e.message); }
loadTasks(); loadTasks();
} }
...@@ -349,9 +349,9 @@ async function removeTask(id) { ...@@ -349,9 +349,9 @@ async function removeTask(id) {
const r = await fetch(ROOT_PATH + '/admin/api/tasks/' + encodeURIComponent(id), {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/tasks/' + encodeURIComponent(id), {method:'DELETE'});
if (!r.ok) { if (!r.ok) {
const e = await r.json().catch(() => ({})); const e = await r.json().catch(() => ({}));
alert(e.detail || 'Remove failed'); showAlert(e.detail || 'Remove failed');
} }
} catch (e) { alert(e.message); } } catch (e) { showAlert(e.message); }
loadTasks(); loadTasks();
} }
......
...@@ -126,15 +126,15 @@ async function createToken() { ...@@ -126,15 +126,15 @@ async function createToken() {
openModal('show-modal'); openModal('show-modal');
loadTokens(); loadTokens();
} else { } else {
const e = await r.json(); alert(e.detail || 'Failed'); const e = await r.json(); showAlert(e.detail || 'Failed');
} }
} catch (e) { alert(e.message); } } catch (e) { showAlert(e.message); }
} }
async function delToken(id) { async function delToken(id) {
if (!confirm('Delete this token? Clients using it will lose access immediately.')) return; if (!confirm('Delete this token? Clients using it will lose access immediately.')) return;
const r = await fetch(ROOT_PATH + '/admin/api/tokens/'+id, {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/tokens/'+id, {method:'DELETE'});
if (r.ok) loadTokens(); else alert('Failed to delete'); if (r.ok) loadTokens(); else showAlert('Failed to delete');
} }
loadTokens(); loadTokens();
......
...@@ -105,7 +105,7 @@ async function delUser(id, name) { ...@@ -105,7 +105,7 @@ async function delUser(id, name) {
if (!confirm('Delete user "' + name + '"?')) return; if (!confirm('Delete user "' + name + '"?')) return;
const r = await fetch(ROOT_PATH + '/admin/api/users/'+id, {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/users/'+id, {method:'DELETE'});
if (r.ok) location.reload(); if (r.ok) location.reload();
else { const e = await r.json(); alert(e.detail || 'Failed'); } else { const e = await r.json(); showAlert(e.detail || 'Failed'); }
} }
</script> </script>
{% endblock %} {% endblock %}
...@@ -243,6 +243,33 @@ def log_response_payload(payload, streamed=False): ...@@ -243,6 +243,33 @@ def log_response_payload(payload, streamed=False):
router = APIRouter() router = APIRouter()
def _normalize_vision_content(content: list) -> list:
"""Normalize an OpenAI multipart message content list to the shape the
llama.cpp multimodal (mmproj) handler expects: text parts as
``{"type":"text","text":...}`` and images as
``{"type":"image_url","image_url":{"url": ...}}``. The url may be an http(s)
link or a ``data:image/...;base64,...`` URI — both are accepted. Unknown
parts are dropped to a text placeholder so nothing crashes the handler."""
norm = []
for item in content:
if not isinstance(item, dict):
norm.append({"type": "text", "text": str(item)})
continue
t = item.get("type")
if t == "text" and "text" in item:
norm.append({"type": "text", "text": item["text"]})
elif t in ("image_url", "input_image"):
iu = item.get("image_url") if t == "image_url" else item.get("image")
url = iu.get("url") if isinstance(iu, dict) else iu
if url:
norm.append({"type": "image_url", "image_url": {"url": url}})
elif "text" in item:
norm.append({"type": "text", "text": str(item["text"])})
else:
norm.append({"type": "text", "text": f"[{t or 'unknown'} content]"})
return norm
@router.post("/v1/chat/completions", summary="Chat completions") @router.post("/v1/chat/completions", summary="Chat completions")
async def chat_completions(request: ChatCompletionRequest, http_request: Request = None): async def chat_completions(request: ChatCompletionRequest, http_request: Request = None):
"""Chat completions endpoint with streaming and tool support.""" """Chat completions endpoint with streaming and tool support."""
...@@ -519,6 +546,12 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request ...@@ -519,6 +546,12 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request
"Another model may be using all available VRAM.") "Another model may be using all available VRAM.")
current_manager = mm current_manager = mm
# Does the resolved (loaded) model accept images? True only when an mmproj
# projector was loaded into the llama.cpp backend (see VulkanBackend). When
# set, multipart image content is preserved end-to-end instead of being
# flattened to a text placeholder, so the multimodal handler can see it.
_vision_ok = bool(getattr(getattr(current_manager, 'backend', None), 'supports_vision', False))
# Inject system prompt if --system-prompt flag was provided # Inject system prompt if --system-prompt flag was provided
messages = request.messages messages = request.messages
global_system_prompt = get_global_system_prompt() global_system_prompt = get_global_system_prompt()
...@@ -733,19 +766,31 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request ...@@ -733,19 +766,31 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request
if content is None: if content is None:
content = "" content = ""
elif isinstance(content, list): elif isinstance(content, list):
# Handle multipart content array format: [{"type": "text", "text": "..."}] _has_image = _vision_ok and any(
parts = [] isinstance(it, dict) and it.get('type') in ('image_url', 'input_image')
for item in content: for it in content)
if isinstance(item, dict): if _has_image:
if item.get('type') == 'text' and 'text' in item: # Vision (mmproj) model: keep OpenAI multipart content so the
parts.append(item['text']) # llama.cpp multimodal handler receives the images themselves.
content = _normalize_vision_content(content)
else:
# Handle multipart content array format: [{"type": "text", "text": "..."}]
parts = []
for item in content:
if isinstance(item, dict):
if item.get('type') == 'text' and 'text' in item:
parts.append(item['text'])
else:
parts.append(f"[{item.get('type', 'unknown')} content]")
else: else:
parts.append(f"[{item.get('type', 'unknown')} content]") parts.append(str(item))
else: content = '\n'.join(parts)
parts.append(str(item)) # Ensure content is never None - convert to string (but keep multipart
content = '\n'.join(parts) # vision content as a list so the multimodal handler can consume it).
# Ensure content is never None - convert to string if isinstance(content, list):
msg_dict["content"] = str(content) if content is not None else "" msg_dict["content"] = content
else:
msg_dict["content"] = str(content) if content is not None else ""
# Handle tool_calls - convert to proper format if present # Handle tool_calls - convert to proper format if present
if msg.tool_calls: if msg.tool_calls:
# tool_calls should be a list of dicts with 'id', 'type', 'function' keys # tool_calls should be a list of dicts with 'id', 'type', 'function' keys
...@@ -765,8 +810,9 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request ...@@ -765,8 +810,9 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request
# Handle None content # Handle None content
elif m.get("content") is None: elif m.get("content") is None:
messages_dict[i]["content"] = "" messages_dict[i]["content"] = ""
# Handle content that's not a string (shouldn't happen but be safe) # Handle content that's not a string (shouldn't happen but be safe).
elif not isinstance(m["content"], str): # A list is legitimate multipart vision content — leave it intact.
elif not isinstance(m["content"], str) and not isinstance(m["content"], list):
messages_dict[i]["content"] = str(m["content"]) messages_dict[i]["content"] = str(m["content"])
......
This diff is collapsed.
...@@ -81,6 +81,8 @@ class FrontProxy: ...@@ -81,6 +81,8 @@ class FrontProxy:
requests are dispatched to the right engine through the same router/proxy.""" requests are dispatched to the right engine through the same router/proxy."""
cfg = getattr(self.config, "broker", None) cfg = getattr(self.config, "broker", None)
if cfg is None or not getattr(cfg, "enabled", False): if cfg is None or not getattr(cfg, "enabled", False):
print("[front] AISBF broker not started (broker.enabled is false in config)",
flush=True)
return return
try: try:
from codai.broker import build_broker_runtime_config, BrokerConfigError from codai.broker import build_broker_runtime_config, BrokerConfigError
...@@ -142,9 +144,23 @@ class FrontProxy: ...@@ -142,9 +144,23 @@ class FrontProxy:
return ("ok", {"object": "list", "data": [seen[i] for i in order]}) return ("ok", {"object": "list", "data": [seen[i] for i in order]})
async def broker_execute(self, *, method, path, headers, query, body): async def broker_execute(self, *, method, path, headers, query, body):
_clean_path = path.split("?", 1)[0].rstrip("/")
# Brokered capabilities must describe the WHOLE node. Routing this to a
# single engine would report only that engine's CUDA-visible card (its
# torch hardware summary), so a multi-GPU node looks like it has one card.
# Build it here in the (torch-free) front, which enumerates every physical
# GPU via nvidia-smi + sysfs.
if method.upper() == "GET" and _clean_path == "/coderai/capabilities":
from codai.broker.capabilities import (
build_capabilities_document, build_hardware_summary)
import json as _json
doc = build_capabilities_document(hardware=build_hardware_summary())
return {"status_code": 200,
"headers": {"content-type": "application/json"},
"body": _json.dumps(doc).encode()}
# Brokered models.list must reflect the WHOLE node (union across engines), # Brokered models.list must reflect the WHOLE node (union across engines),
# not a single engine's assigned subset. # not a single engine's assigned subset.
if method.upper() == "GET" and path.split("?", 1)[0].rstrip("/") == "/v1/models": if method.upper() == "GET" and _clean_path == "/v1/models":
hdrs = {k: v for k, v in (headers or {}).items() if k.lower() not in _DROP_REQ} hdrs = {k: v for k, v in (headers or {}).items() if k.lower() not in _DROP_REQ}
kind, val = await self.collect_models(hdrs) kind, val = await self.collect_models(hdrs)
if kind == "ok": if kind == "ok":
......
...@@ -709,7 +709,7 @@ def main(): ...@@ -709,7 +709,7 @@ def main():
# Also restrict /v1/models (list_models) to the assigned subset, so the # Also restrict /v1/models (list_models) to the assigned subset, so the
# per-engine model list matches what it actually serves — config_mgr's # per-engine model list matches what it actually serves — config_mgr's
# full models_data is untouched (the admin model list stays complete). # full models_data is untouched (the admin model list stays complete).
multi_model_manager.set_assigned_models(keep) multi_model_manager.set_assigned_models(_keep)
except Exception as _e: except Exception as _e:
print(f"[engine] assignment filter failed ({_e}); registering all models") print(f"[engine] assignment filter failed ({_e}); registering all models")
......
...@@ -943,7 +943,7 @@ class MultiModelManager: ...@@ -943,7 +943,7 @@ class MultiModelManager:
# KV-cache quantization (llama.cpp type_k/type_v) — pass through # KV-cache quantization (llama.cpp type_k/type_v) — pass through
# to the backend, with the raw models.json entry as a fallback. # to the backend, with the raw models.json entry as a fallback.
_raw = config.get('_raw_cfg') if isinstance(config.get('_raw_cfg'), dict) else {} _raw = config.get('_raw_cfg') if isinstance(config.get('_raw_cfg'), dict) else {}
for _kvk in ('cache_type_k', 'cache_type_v'): for _kvk in ('cache_type_k', 'cache_type_v', 'mmproj'):
_kvv = config.get(_kvk) _kvv = config.get(_kvk)
if _kvv is None: if _kvv is None:
_kvv = _raw.get(_kvk) _kvv = _raw.get(_kvk)
...@@ -1062,7 +1062,7 @@ class MultiModelManager: ...@@ -1062,7 +1062,7 @@ class MultiModelManager:
# KV-cache quantization (llama.cpp type_k/type_v) — pass through # KV-cache quantization (llama.cpp type_k/type_v) — pass through
# to the backend, with the raw models.json entry as a fallback. # to the backend, with the raw models.json entry as a fallback.
_raw = config.get('_raw_cfg') if isinstance(config.get('_raw_cfg'), dict) else {} _raw = config.get('_raw_cfg') if isinstance(config.get('_raw_cfg'), dict) else {}
for _kvk in ('cache_type_k', 'cache_type_v'): for _kvk in ('cache_type_k', 'cache_type_v', 'mmproj'):
_kvv = config.get(_kvk) _kvv = config.get(_kvk)
if _kvv is None: if _kvv is None:
_kvv = _raw.get(_kvk) _kvv = _raw.get(_kvk)
......
...@@ -1046,6 +1046,14 @@ def parse_gemma_native_tool_calls(text: str, tool_names=None): ...@@ -1046,6 +1046,14 @@ def parse_gemma_native_tool_calls(text: str, tool_names=None):
if tool_names and name not in tool_names: if tool_names and name not in tool_names:
continue continue
brace = m.end() - 1 # index of '{' brace = m.end() - 1 # index of '{'
# Some models double-wrap the args: call:NAME{{"k":"v"}}. Skip the
# redundant outer brace so the real object is parsed instead of being
# mangled into a single key like '{"k"'.
j = brace + 1
while j < len(text) and text[j] in ' \t\r\n':
j += 1
if j < len(text) and text[j] == '{':
brace = j
try: try:
args, _ = _parse_gemma_loose_object(text, brace) args, _ = _parse_gemma_loose_object(text, brace)
except Exception: except Exception:
......
"""Periodic cleanup of the temporary-working directory.
Several pipelines write scratch files with ``tempfile.NamedTemporaryFile(delete=
False)`` / ``mkdtemp()`` (frame extraction, upscaling, interpolation, dubbing,
voice cloning…). When a generation is interrupted those temp entries are never
removed, so a dedicated ``tmp_dir`` slowly fills up (it had grown to tens of GB).
This background janitor age-prunes that directory: every
``interval_minutes`` it deletes top-level entries whose most-recent mtime is older
than ``max_age_hours``. Age-based pruning means in-flight work (touched recently)
is left alone while abandoned scratch is reclaimed.
Safety: it only ever operates on the *configured* ``tmp_dir`` (a dedicated path).
It refuses to run against a bare system temp dir (/tmp, /var/tmp, …) so it can
never delete other processes' files. Mirrors ``codai.models.ram_monitor`` in
shape: module-level state + ``get_status()``, started once from ``codai.main``.
"""
import os
import shutil
import threading
import time
import logging
from typing import Optional, Dict, Any
_log = logging.getLogger(__name__)
# Paths we must never treat as a prunable dedicated tmp dir.
_FORBIDDEN = {"/", "/tmp", "/var/tmp", "/usr/tmp", "/dev/shm"}
_state_lock = threading.Lock()
_state: Dict[str, Any] = {
"enabled": False,
"tmp_dir": None,
"max_age_hours": None,
"interval_minutes": None,
"last_run_ts": 0.0,
"last_removed": 0,
"total_removed": 0,
"last_freed_bytes": 0,
"runs": 0,
}
_thread: Optional[threading.Thread] = None
_started = False
def get_status() -> Dict[str, Any]:
"""Snapshot for the admin status endpoint / dashboard."""
with _state_lock:
return dict(_state)
def _entry_newest_mtime(path: str) -> float:
"""Most-recent mtime under ``path`` (the entry itself, or the newest file in a
directory tree). Using the newest mtime avoids deleting a directory whose top
folder is old but which still has freshly written files inside."""
try:
newest = os.lstat(path).st_mtime
except OSError:
return 0.0
if os.path.isdir(path) and not os.path.islink(path):
for root, _dirs, files in os.walk(path):
for name in files:
try:
m = os.lstat(os.path.join(root, name)).st_mtime
if m > newest:
newest = m
except OSError:
continue
return newest
def _dir_size(path: str) -> int:
total = 0
if os.path.isdir(path) and not os.path.islink(path):
for root, _dirs, files in os.walk(path):
for name in files:
try:
total += os.lstat(os.path.join(root, name)).st_size
except OSError:
continue
else:
try:
total = os.lstat(path).st_size
except OSError:
total = 0
return total
def _sweep(tmp_dir: str, max_age_seconds: float) -> tuple[int, int]:
"""Remove top-level entries older than the cutoff. Returns (removed, freed)."""
now = time.time()
removed = 0
freed = 0
try:
entries = os.listdir(tmp_dir)
except OSError as e:
_log.debug("tmp janitor: cannot list %s: %s", tmp_dir, e)
return (0, 0)
for name in entries:
path = os.path.join(tmp_dir, name)
try:
if now - _entry_newest_mtime(path) < max_age_seconds:
continue
size = _dir_size(path)
if os.path.isdir(path) and not os.path.islink(path):
shutil.rmtree(path, ignore_errors=True)
else:
os.remove(path)
removed += 1
freed += size
except OSError as e:
_log.debug("tmp janitor: could not remove %s: %s", path, e)
return (removed, freed)
def _run(tmp_dir: str, max_age_hours: float, interval_minutes: float) -> None:
max_age_seconds = max(0.0, max_age_hours) * 3600.0
interval = max(60.0, interval_minutes * 60.0)
while True:
try:
removed, freed = _sweep(tmp_dir, max_age_seconds)
with _state_lock:
_state["last_run_ts"] = time.time()
_state["last_removed"] = removed
_state["total_removed"] += removed
_state["last_freed_bytes"] = freed
_state["runs"] += 1
if removed:
_log.info("tmp janitor: removed %d stale entr%s (%.1f MB) from %s",
removed, "y" if removed == 1 else "ies",
freed / (1024 * 1024), tmp_dir)
except Exception as e: # never let the janitor die
_log.warning("tmp janitor sweep failed: %s", e)
time.sleep(interval)
def start(tmp_dir: Optional[str], enabled: bool = True,
max_age_hours: float = 24.0, interval_minutes: float = 60.0) -> bool:
"""Start the janitor for ``tmp_dir``. No-op (returns False) when disabled, when
no dedicated tmp_dir is configured, or when tmp_dir is a shared system dir."""
global _thread, _started
if _started:
return True
if not enabled or not tmp_dir:
return False
real = os.path.abspath(os.path.expanduser(tmp_dir)).rstrip("/") or "/"
if real in _FORBIDDEN:
_log.info("tmp janitor: refusing to prune shared temp dir %s (set a dedicated tmp_dir)", real)
return False
if not os.path.isdir(real):
try:
os.makedirs(real, exist_ok=True)
except OSError:
return False
with _state_lock:
_state.update({
"enabled": True, "tmp_dir": real,
"max_age_hours": max_age_hours, "interval_minutes": interval_minutes,
})
_thread = threading.Thread(target=_run, args=(real, max_age_hours, interval_minutes),
name="tmp-janitor", daemon=True)
_thread.start()
_started = True
_log.info("tmp janitor: pruning %s every %.0f min (entries older than %.1f h)",
real, interval_minutes, max_age_hours)
return True
def sweep_once(tmp_dir: str, max_age_hours: float = 24.0) -> tuple[int, int]:
"""Run a single prune pass and return (removed, freed_bytes). For cron use."""
real = os.path.abspath(os.path.expanduser(tmp_dir)).rstrip("/") or "/"
if real in _FORBIDDEN or not os.path.isdir(real):
raise SystemExit(f"refusing to prune {real!r} (not a dedicated tmp dir)")
return _sweep(real, max(0.0, max_age_hours) * 3600.0)
if __name__ == "__main__":
# One-shot CLI for cron/systemd-timer use, e.g.:
# */30 * * * * /path/venv/bin/python -m codai.models.tmp_janitor \
# --tmp /storage/coderai/tmp --max-age-hours 24
import argparse
p = argparse.ArgumentParser(description="Prune a dedicated CoderAI temp dir.")
p.add_argument("--tmp", required=True, help="the dedicated tmp_dir to prune")
p.add_argument("--max-age-hours", type=float, default=24.0,
help="delete entries whose newest file is older than this")
a = p.parse_args()
n, b = sweep_once(a.tmp, a.max_age_hours)
print(f"tmp janitor: removed {n} entr{'y' if n == 1 else 'ies'} "
f"({b / (1024 * 1024):.1f} MB) from {a.tmp}")
...@@ -127,9 +127,11 @@ RUN apt-get update && apt-get install -y --no-install-recommends \ ...@@ -127,9 +127,11 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
# The fully assembled CoderAI tree (Python + venvs + tools), copied once. # The fully assembled CoderAI tree (Python + venvs + tools), copied once.
COPY --from=assembler /opt/coderai /opt/coderai COPY --from=assembler /opt/coderai /opt/coderai
# Now the standalone interpreter exists, activate it for the app + launchers. # Put the standalone interpreter first on PATH. Do NOT set PYTHONHOME globally:
ENV PYTHONHOME=/opt/coderai/python \ # supervisord runs on the system python3 (3.12) and a PYTHONHOME pointing at the
PATH=/opt/coderai/python/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin # standalone 3.13 stdlib breaks it ("No module named 'encodings'"). The standalone
# python is relocatable, and the per-service launchers set PYTHONHOME themselves.
ENV PATH=/opt/coderai/python/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
WORKDIR /opt/coderai/app WORKDIR /opt/coderai/app
COPY . /opt/coderai/app COPY . /opt/coderai/app
......
# Incremental update of an already-built coderai image.
#
# Re-layers ONLY the application code, launcher scripts and service configs on
# top of an existing base image (the heavy bundle: python, venvs, native libs,
# lip-sync, ds4, parler). Those base layers are inherited unchanged — there is no
# 20 GB bundle recopy — so this builds in seconds even with an empty build cache.
#
# Driven by packaging/linux/update_oci_image.sh, which keeps an immutable
# `coderai:base` tag so repeated updates always start from the same bundle and
# never stack app layers on top of each other.
ARG BASE_IMAGE=coderai:base
FROM ${BASE_IMAGE}
# Refresh the app tree plus the scripts/configs that live outside it. The big
# /opt/coderai/{python,*-venv,local-libs,Wav2Lip,SadTalker,ds4,py310} trees are
# left as inherited layers. (COPY overwrites/adds; a file deleted from the repo
# is pruned by the cleanup RUN below for the known-stale paths.)
COPY . /opt/coderai/app
COPY packaging/linux/launcher/coderai-oci /usr/local/bin/coderai
COPY packaging/linux/launcher/with-env /usr/local/bin/with-env
COPY packaging/linux/launcher/coderai-entrypoint /usr/local/bin/coderai-entrypoint
COPY packaging/linux/launcher/wav2lip /usr/local/bin/wav2lip
COPY packaging/linux/launcher/sadtalker /usr/local/bin/sadtalker
COPY packaging/linux/nginx.conf /etc/nginx/nginx.conf
COPY packaging/linux/supervisord.conf /etc/supervisor/supervisord.conf
COPY packaging/linux/README-RUN.txt /opt/coderai/README-RUN.txt
RUN set -eux; \
chmod +x /usr/local/bin/coderai /usr/local/bin/with-env /usr/local/bin/coderai-entrypoint \
/usr/local/bin/wav2lip /usr/local/bin/sadtalker /opt/coderai/app/coderai; \
mkdir -p /config /models /cache /opt/coderai/app/models; \
rm -rf \
/opt/coderai/app/.git \
/opt/coderai/app/venv* \
/opt/coderai/app/.venv \
/opt/coderai/app/township_output \
/opt/coderai/app/offload \
/opt/coderai/app/dist \
/opt/coderai/app/.packaging-cache; \
find /opt/coderai/app -type d -name __pycache__ -prune -exec rm -rf '{}' +; \
/opt/coderai/python/bin/python3 -c "import importlib.util, sys; m=[n for n in ('fastapi','uvicorn','torch') if importlib.util.find_spec(n) is None]; sys.exit('base image missing: '+', '.join(m) if m else 0)"
# ENTRYPOINT / EXPOSE / VOLUME / ENV / WORKDIR are inherited from the base image.
...@@ -263,7 +263,10 @@ prepare_venv_bundle() { ...@@ -263,7 +263,10 @@ prepare_venv_bundle() {
if [[ -e "$dest_path" && "$bin_path" -ef "$dest_path" ]]; then if [[ -e "$dest_path" && "$bin_path" -ef "$dest_path" ]]; then
continue continue
fi fi
cp -a --remove-destination "$bin_path" "$dest_path" # -L: dereference symlinks so the REAL binary is bundled. /usr/local/bin
# entries are often symlinks to a build dir (e.g. ~/whisper.cpp/build/bin);
# copying the link verbatim leaves a dangling symlink in the image.
cp -aL --remove-destination "$bin_path" "$dest_path"
done done
if [[ ${#LOCAL_BINARIES[@]} -gt 0 ]]; then if [[ ${#LOCAL_BINARIES[@]} -gt 0 ]]; then
......
#!/usr/bin/env sh
# Top-level entrypoint for the CoderAI distributable image.
# Prepares shared state directories and hands off to supervisord, which runs
# nginx + the main server + the bundled tool web UIs on the single published port.
set -eu
: "${CODERAI_CONFIG_DIR:=/config}"
: "${CODERAI_MODELS_DIR:=/models}"
: "${CODERAI_CACHE_DIR:=/cache}"
# Default parler model id; referenced by supervisord even when the parler program
# is disabled, so it must always be defined.
: "${CODERAI_PARLER_MODEL:=parler-tts/parler-tts-mini-multilingual}"
# Dedicated temp dir on the cache volume, shared by the server and the tool
# processes (so scratch from upscaling/lip-sync/ffmpeg lands in one place). The
# server's built-in janitor age-prunes it; see CODERAI_TMP below.
: "${CODERAI_TMP:=$CODERAI_CACHE_DIR/coderai-tmp}"
export TMPDIR="$CODERAI_TMP" TMP="$CODERAI_TMP" TEMP="$CODERAI_TMP"
# Don't write .pyc into the read-only /opt/coderai tree (esp. when run as --user).
export PYTHONDONTWRITEBYTECODE=1
export CODERAI_CONFIG_DIR CODERAI_MODELS_DIR CODERAI_CACHE_DIR CODERAI_PARLER_MODEL CODERAI_TMP
mkdir -p \
"$CODERAI_CONFIG_DIR/coderai" \
"$CODERAI_MODELS_DIR/coderai" \
"$CODERAI_CACHE_DIR/coderai" \
"$CODERAI_CACHE_DIR/township_output" \
"$CODERAI_CACHE_DIR/videogen_output" \
"$CODERAI_TMP" \
/tmp/nginx-client-body /tmp/nginx-proxy /tmp/nginx-fastcgi \
/tmp/nginx-uwsgi /tmp/nginx-scgi
# Seed the ds4 working dir on the cache volume from the bundled binary + scripts
# (DeepSeek-V4 weights download here at runtime, so it must be writable/persistent).
if [ -d /opt/coderai/ds4 ] && [ ! -e "$CODERAI_CACHE_DIR/ds4/ds4-server" ]; then
mkdir -p "$CODERAI_CACHE_DIR/ds4"
cp -an /opt/coderai/ds4/. "$CODERAI_CACHE_DIR/ds4/" 2>/dev/null || true
fi
# If invoked with arguments, run them directly (debugging / one-off commands)
# instead of the supervised stack.
if [ "$#" -gt 0 ]; then
exec "$@"
fi
# supervisord runs on the system python3; a leaked PYTHONHOME (pointing at the
# standalone 3.13) would break it. The per-service launchers set their own.
unset PYTHONHOME
exec /usr/bin/supervisord -c /etc/supervisor/supervisord.conf
...@@ -72,8 +72,47 @@ if changed: ...@@ -72,8 +72,47 @@ if changed:
PY PY
fi fi
# Point the server at the shared dedicated temp dir so its janitor prunes it. # Optional debug logging. CODERAI_DEBUG selects coderai's --debug* flags:
if [ -n "${CODERAI_TMP:-}" ]; then # all -> every debug flag
exec /opt/coderai/python/bin/python3 /opt/coderai/app/coderai --config "$CONFIG_DIR" --tmp "$CODERAI_TMP" "$@" # 1|true|yes|on -> just --debug
# "engine,ws,..." -> --debug-engine --debug-ws ... (bare names get --debug- prefixed;
# full "--debug-foo" tokens are passed through; comma OR space separated)
DEBUG_ARGS=""
case "${CODERAI_DEBUG:-}" in
"") : ;;
all|ALL|All)
DEBUG_ARGS="--debug --debug-ws --debug-web --debug-thermal --debug-lora --debug-requests --debug-engine --debug-engine-web" ;;
1|true|TRUE|yes|YES|on|ON)
DEBUG_ARGS="--debug" ;;
*)
for _f in $(echo "$CODERAI_DEBUG" | tr ',' ' '); do
case "$_f" in
--*) DEBUG_ARGS="$DEBUG_ARGS $_f" ;;
debug) DEBUG_ARGS="$DEBUG_ARGS --debug" ;;
*) DEBUG_ARGS="$DEBUG_ARGS --debug-$_f" ;;
esac
done ;;
esac
# --debug-* flags need --debug present to take effect; add it if the user picked
# only sub-flags.
case " $DEBUG_ARGS " in *" --debug "*) : ;; *[!\ ]*) DEBUG_ARGS="--debug$DEBUG_ARGS" ;; esac
# Assemble the server argv: --config, optional --tmp, debug flags, then passthrough.
set -- --config "$CONFIG_DIR" "$@"
[ -n "${CODERAI_TMP:-}" ] && set -- "$@" --tmp "$CODERAI_TMP"
CODERAI_BIN="/opt/coderai/python/bin/python3 /opt/coderai/app/coderai"
# Optional host-tailable file log. CODERAI_LOG_FILE should point under a mounted
# volume (e.g. /cache/logs/coderai.log) so it's visible + tailable on the host.
# We tee so output still reaches `docker logs` too. (supervisord runs this script
# with killasgroup, so the coderai front + its engine subprocesses + tee are all
# torn down together on stop.)
if [ -n "${CODERAI_LOG_FILE:-}" ]; then
mkdir -p "$(dirname "$CODERAI_LOG_FILE")" 2>/dev/null || true
echo "[coderai-oci] debug='${CODERAI_DEBUG:-off}' → logging to $CODERAI_LOG_FILE" >&2
# shellcheck disable=SC2086
exec $CODERAI_BIN "$@" $DEBUG_ARGS 2>&1 | tee -a "$CODERAI_LOG_FILE"
fi fi
exec /opt/coderai/python/bin/python3 /opt/coderai/app/coderai --config "$CONFIG_DIR" "$@" # shellcheck disable=SC2086
exec $CODERAI_BIN "$@" $DEBUG_ARGS
#!/usr/bin/env bash
# CLI shim for SadTalker talking-head generation, run in the shared lip-sync venv.
# codai/api/video.py invokes:
# sadtalker --driven_audio AUDIO --source_video VIDEO --result_dir DIR
# SadTalker animates a still image, so a source video is reduced to its first frame.
#
# Checkpoints are NOT baked into the image: on first use they download into the
# writable working dir (a /cache volume in the container) and persist there.
set -euo pipefail
VENV="${CODERAI_LIPSYNC_VENV:-$HOME/.coderai/lipsync_venv}"
SRC="${CODERAI_SADTALKER_SRC:-$HOME/.coderai/SadTalker}" # baked read-only repo code
DIR="${CODERAI_SADTALKER_DIR:-$SRC}" # writable working copy
if [ ! -x "$VENV/bin/python" ]; then
echo "sadtalker: lip-sync venv not found at $VENV" >&2
exit 127
fi
if [ ! -f "$DIR/inference.py" ]; then
mkdir -p "$DIR"
rsync -a --exclude 'checkpoints/*' --exclude 'gfpgan/weights/*' "$SRC/" "$DIR/"
fi
# Download checkpoints on first use (idempotent).
mkdir -p "$DIR/checkpoints" "$DIR/gfpgan/weights"
_dl(){ if [ ! -s "$2" ]; then echo "sadtalker: downloading $(basename "$2") …" >&2;
curl -fSL --retry 3 -o "$2" "$1" || { echo "sadtalker: download failed: $1" >&2; exit 1; }; fi; }
_b="https://github.com/OpenTalker/SadTalker/releases/download/v0.0.2-rc"
_dl "$_b/mapping_00109-model.pth.tar" "$DIR/checkpoints/mapping_00109-model.pth.tar"
_dl "$_b/mapping_00229-model.pth.tar" "$DIR/checkpoints/mapping_00229-model.pth.tar"
_dl "$_b/SadTalker_V0.0.2_256.safetensors" "$DIR/checkpoints/SadTalker_V0.0.2_256.safetensors"
_dl "$_b/SadTalker_V0.0.2_512.safetensors" "$DIR/checkpoints/SadTalker_V0.0.2_512.safetensors"
_dl "https://github.com/xinntao/facexlib/releases/download/v0.1.0/alignment_WFLW_4HG.pth" "$DIR/gfpgan/weights/alignment_WFLW_4HG.pth"
_dl "https://github.com/xinntao/facexlib/releases/download/v0.1.0/detection_Resnet50_Final.pth" "$DIR/gfpgan/weights/detection_Resnet50_Final.pth"
_dl "https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth" "$DIR/gfpgan/weights/GFPGANv1.4.pth"
_dl "https://github.com/xinntao/facexlib/releases/download/v0.2.2/parsing_parsenet.pth" "$DIR/gfpgan/weights/parsing_parsenet.pth"
driven=""; result=""; source_img=""; source_video=""
extra=()
while [ "$#" -gt 0 ]; do
case "$1" in
--driven_audio) driven="$2"; shift 2;;
--source_video) source_video="$2"; shift 2;;
--source_image) source_img="$2"; shift 2;;
--result_dir) result="$2"; shift 2;;
*) extra+=("$1"); shift;;
esac
done
result="${result:-./results}"
mkdir -p "$result"
cleanup_img=""
if [ -z "$source_img" ] && [ -n "$source_video" ]; then
source_img="$(mktemp --suffix=.png)"
cleanup_img="$source_img"
ffmpeg -y -i "$source_video" -frames:v 1 "$source_img" -loglevel error
fi
work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
cd "$work"
export PYTHONPATH="$DIR${PYTHONPATH:+:$PYTHONPATH}"
set +e
"$VENV/bin/python" "$DIR/inference.py" \
--driven_audio "$driven" \
--source_image "$source_img" \
--result_dir "$result" \
--checkpoint_dir "$DIR/checkpoints" \
${extra[@]+"${extra[@]}"}
rc=$?
set -e
[ -n "$cleanup_img" ] && rm -f "$cleanup_img" || true
newest="$(find "$result" -type f -name '*.mp4' -printf '%T@ %p\n' 2>/dev/null | sort -rn | head -1 | cut -d' ' -f2-)"
if [ -n "$newest" ] && [ "$(dirname "$newest")" != "$result" ]; then
cp -f "$newest" "$result/"
fi
exit $rc
#!/usr/bin/env bash
# CLI shim for Wav2Lip lip-sync, run inside the shared lip-sync venv.
# codai/api/video.py invokes: wav2lip --face VIDEO --audio AUDIO --outfile OUT
#
# Checkpoints are NOT baked into the image: on first use they download into the
# writable working dir (a /cache volume in the container) and persist there.
set -euo pipefail
VENV="${CODERAI_LIPSYNC_VENV:-$HOME/.coderai/lipsync_venv}"
SRC="${CODERAI_WAV2LIP_SRC:-$HOME/.coderai/Wav2Lip}" # baked read-only repo code
DIR="${CODERAI_WAV2LIP_DIR:-$SRC}" # writable working copy
if [ ! -x "$VENV/bin/python" ]; then
echo "wav2lip: lip-sync venv not found at $VENV" >&2
exit 127
fi
# Seed a writable copy of the repo code if the working dir isn't populated
# (the image ships the code read-only under /opt; weights are excluded).
if [ ! -f "$DIR/inference.py" ]; then
mkdir -p "$DIR"
rsync -a --exclude 'checkpoints/' --exclude 'face_detection/detection/sfd/*.pth' "$SRC/" "$DIR/"
fi
# Download checkpoints on first use (idempotent: skips non-empty files).
mkdir -p "$DIR/checkpoints" "$DIR/face_detection/detection/sfd"
_dl(){ if [ ! -s "$2" ]; then echo "wav2lip: downloading $(basename "$2") …" >&2;
curl -fSL --retry 3 -o "$2" "$1" || { echo "wav2lip: download failed: $1" >&2; exit 1; }; fi; }
_dl "https://huggingface.co/camenduru/Wav2Lip/resolve/main/checkpoints/wav2lip_gan.pth" \
"$DIR/checkpoints/wav2lip_gan.pth"
_dl "https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth" \
"$DIR/face_detection/detection/sfd/s3fd.pth"
CKPT="${CODERAI_WAV2LIP_CKPT:-$DIR/checkpoints/wav2lip_gan.pth}"
# Run from a writable scratch dir (inference.py writes ./temp/*), repo on PYTHONPATH.
work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
cd "$work"
mkdir -p temp
export PYTHONPATH="$DIR${PYTHONPATH:+:$PYTHONPATH}"
"$VENV/bin/python" "$DIR/inference.py" --checkpoint_path "$CKPT" "$@"
#!/usr/bin/env sh
# Set the CoderAI runtime environment (standalone Python, bundled native libs,
# nvidia wheel libs) then exec the given command. Used by supervisord to launch
# the bundled tool web UIs with the same library environment as the main server.
set -eu
export PYTHONHOME=/opt/coderai/python
export PATH="/opt/coderai/python/bin:$PATH"
NV="/opt/coderai/python/lib/python3.13/site-packages/nvidia"
LIBS="/opt/coderai/python/lib:/opt/coderai/local-libs"
if [ -d "$NV" ]; then
for d in "$NV"/*/lib; do
[ -d "$d" ] && LIBS="$LIBS:$d"
done
fi
export LD_LIBRARY_PATH="$LIBS${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
exec "$@"
# CoderAI single-port reverse proxy (in-container).
# Fronts the main server and the bundled tool web UIs on one published port.
# nginx runs in the foreground under supervisord (daemon off).
# No `user` directive: when the container runs as root, the master stays root and
# spawns workers as nobody; when run with `--user UID`, nginx runs entirely as that
# UID. All writable state below lives under /tmp so non-root runs work unchanged.
worker_processes auto;
daemon off;
pid /tmp/nginx.pid;
error_log /dev/stderr info;
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
sendfile on;
server_tokens off;
access_log /dev/stdout;
# Writable temp paths under /tmp so the listed user (root or --user UID) can
# always create them; the defaults under /var/lib/nginx are root-only.
client_body_temp_path /tmp/nginx-client-body;
proxy_temp_path /tmp/nginx-proxy;
fastcgi_temp_path /tmp/nginx-fastcgi;
uwsgi_temp_path /tmp/nginx-uwsgi;
scgi_temp_path /tmp/nginx-scgi;
# AI workloads: large uploads (images/audio/video) and long generations.
client_max_body_size 4096m;
proxy_read_timeout 3600s;
proxy_send_timeout 3600s;
proxy_connect_timeout 75s;
# Shared proxy headers. CoderAI builds public URLs from these
# (codai/api/urlutils.py); the tools honour X-Forwarded-Prefix for sub-paths.
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Forwarded-Host $host;
upstream coderai { server 127.0.0.1:18776; }
upstream editor { server 127.0.0.1:8420; }
upstream videogen { server 127.0.0.1:7790; }
upstream township { server 127.0.0.1:7788; }
server {
listen 8776 default_server;
listen [::]:8776 default_server;
server_name _;
# --- Video editor: https://host:8776/editor/ -------------------------
location /editor/ {
proxy_pass http://editor/; # trailing slash strips the prefix
proxy_set_header X-Forwarded-Prefix /editor;
proxy_request_buffering off; # stream large uploads through
proxy_buffering off; # SSE progress
}
# --- Videogen studio: https://host:8776/videogen/ -------------------
location /videogen/ {
proxy_pass http://videogen/;
proxy_set_header X-Forwarded-Prefix /videogen;
proxy_request_buffering off;
proxy_buffering off;
}
# --- Township fighters: https://host:8776/township/ ----------------
location /township/ {
proxy_pass http://township/;
proxy_set_header X-Forwarded-Prefix /township;
proxy_request_buffering off;
proxy_buffering off;
}
# --- CoderAI server + OpenAI API at the root ------------------------
location / {
proxy_pass http://coderai;
proxy_buffering off; # SSE: chat stream + task progress
}
}
}
...@@ -16,6 +16,17 @@ DATA_ROOT="$PWD/coderai-runtime" ...@@ -16,6 +16,17 @@ DATA_ROOT="$PWD/coderai-runtime"
DETACH=0 DETACH=0
NAME="coderai" NAME="coderai"
EXTRA_ARGS=() EXTRA_ARGS=()
# Optional: map an EXISTING local config dir + real data dirs so the image runs
# against your live config/models without a rebuild (an image is immutable; this
# is purely run-time bind-mounts). See --config-dir / --local / --map below.
CONFIG_DIR_SRC=""
INPLACE_CONFIG=0
MAPS=()
# Optional debug logging: CODERAI_DEBUG selects coderai's --debug* flags inside
# the container; LOG_FILE_CONT is the in-container log path (under a mounted
# volume so it's tailable on the host).
DEBUG_SPEC=""
LOG_FILE_CONT=""
usage() { usage() {
cat <<'EOF' cat <<'EOF'
...@@ -32,8 +43,26 @@ Options: ...@@ -32,8 +43,26 @@ Options:
--data-dir PATH Directory for config/models/cache (default: ./coderai-runtime). --data-dir PATH Directory for config/models/cache (default: ./coderai-runtime).
--name NAME Container name (default: coderai). --name NAME Container name (default: coderai).
-d, --detach Run in background. -d, --detach Run in background.
--config-dir PATH Use an EXISTING config dir (with config.json/models.json),
mounted at /config/coderai. Copied to a temp dir by default
so the image's host/port rewrite leaves your dir untouched.
--local Shortcut for --config-dir ~/.coderai.
--inplace-config Mount --config-dir in place (the image WILL edit host/port).
--map HOST[:CONT] Bind-mount a host dir at the SAME path (or HOST:CONT) inside
the container, so absolute paths in models.json resolve
(e.g. --map /AI/guffcache). Repeatable.
--debug[=SPEC] Run coderai with debug flags. SPEC (default 'all'):
all | engine,requests,ws,web,thermal,lora,engine-web
Also writes a host-tailable file log (see --log-file).
--log-file PATH In-container log path (default /cache/logs/coderai.log,
visible on the host under the cache mount). Implies a file
log even without --debug. tee'd, so `docker logs` still works.
-- ARGS Extra args passed to the container engine before the image name. -- ARGS Extra args passed to the container engine before the image name.
-h, --help Show this help. -h, --help Show this help.
Test against your live config + data (no rebuild):
packaging/linux/run_oci.sh --nvidia --local \
--map /AI/guffcache --map /AI/huggingface --map /AI/offloads
EOF EOF
} }
...@@ -53,6 +82,19 @@ while [[ $# -gt 0 ]]; do ...@@ -53,6 +82,19 @@ while [[ $# -gt 0 ]]; do
--name) --name)
[[ $# -ge 2 ]] || { echo "Error: --name requires a value" >&2; exit 2; } [[ $# -ge 2 ]] || { echo "Error: --name requires a value" >&2; exit 2; }
NAME="$2"; shift 2 ;; NAME="$2"; shift 2 ;;
--config-dir)
[[ $# -ge 2 ]] || { echo "Error: --config-dir requires a path" >&2; exit 2; }
CONFIG_DIR_SRC="$2"; shift 2 ;;
--local) CONFIG_DIR_SRC="$HOME/.coderai"; shift ;;
--inplace-config) INPLACE_CONFIG=1; shift ;;
--map)
[[ $# -ge 2 ]] || { echo "Error: --map requires HOST[:CONT]" >&2; exit 2; }
MAPS+=("$2"); shift 2 ;;
--debug) DEBUG_SPEC="all"; shift ;;
--debug=*) DEBUG_SPEC="${1#*=}"; shift ;;
--log-file)
[[ $# -ge 2 ]] || { echo "Error: --log-file requires a path" >&2; exit 2; }
LOG_FILE_CONT="$2"; shift 2 ;;
-d|--detach) DETACH=1; shift ;; -d|--detach) DETACH=1; shift ;;
--) --)
shift shift
...@@ -90,7 +132,61 @@ volume_suffix="" ...@@ -90,7 +132,61 @@ volume_suffix=""
if [[ "$ENGINE" == "podman" ]]; then if [[ "$ENGINE" == "podman" ]]; then
volume_suffix=":Z" volume_suffix=":Z"
fi fi
args+=(-v "$DATA_ROOT/config:/config$volume_suffix" -v "$DATA_ROOT/models:/models$volume_suffix" -v "$DATA_ROOT/cache:/cache$volume_suffix")
# Config mount: either the fresh scratch dir, or an EXISTING local config dir
# mounted at /config/coderai (where the image launcher reads config.json).
CONFIG_NOTE="$DATA_ROOT/config (fresh)"
if [[ -n "$CONFIG_DIR_SRC" ]]; then
[[ -d "$CONFIG_DIR_SRC" ]] || { echo "Error: --config-dir '$CONFIG_DIR_SRC' not found" >&2; exit 2; }
CONFIG_DIR_SRC="$(cd "$CONFIG_DIR_SRC" && pwd)"
if [[ "$INPLACE_CONFIG" == "1" ]]; then
CFG_MOUNT="$CONFIG_DIR_SRC"
CONFIG_NOTE="$CONFIG_DIR_SRC (in place — image rewrites host/port!)"
else
# Copy ONLY the json config files to a throwaway dir so the image's host/port
# rewrite never touches your real config, and we don't copy big subdirs
# (e.g. ~/.coderai/ds4 weights).
CFG_PARENT="$(mktemp -d "${TMPDIR:-/tmp}/coderai-cfg.XXXXXX")"
CFG_MOUNT="$CFG_PARENT/coderai"
mkdir -p "$CFG_MOUNT"
cp -a "$CONFIG_DIR_SRC"/*.json "$CFG_MOUNT/" 2>/dev/null || true
[[ -f "$CFG_MOUNT/config.json" ]] || { echo "Error: no config.json in '$CONFIG_DIR_SRC'" >&2; exit 2; }
CONFIG_NOTE="$CONFIG_DIR_SRC$CFG_MOUNT (copy; original untouched)"
fi
args+=(-v "$CFG_MOUNT:/config/coderai$volume_suffix" \
-v "$DATA_ROOT/models:/models$volume_suffix" -v "$DATA_ROOT/cache:/cache$volume_suffix")
else
args+=(-v "$DATA_ROOT/config:/config$volume_suffix" -v "$DATA_ROOT/models:/models$volume_suffix" -v "$DATA_ROOT/cache:/cache$volume_suffix")
fi
# 1:1 (or HOST:CONT) data mounts so absolute paths in models.json resolve.
for m in "${MAPS[@]:-}"; do
[[ -n "$m" ]] || continue
host="${m%%:*}"; cont="${m#*:}"; [[ "$m" == *:* ]] || cont="$host"
if [[ -d "$host" ]]; then
args+=(-v "$host:$cont$volume_suffix")
else
echo "Warning: --map source '$host' not found; skipping" >&2
fi
done
# Debug flags + host-tailable file log. A file log is enabled by --debug or
# --log-file; default path lives under /cache so it lands on the host mount.
LOG_HOST_NOTE="(none)"
if [[ -n "$DEBUG_SPEC" || -n "$LOG_FILE_CONT" ]]; then
: "${LOG_FILE_CONT:=/cache/logs/coderai.log}"
[[ -n "$DEBUG_SPEC" ]] && args+=(-e "CODERAI_DEBUG=$DEBUG_SPEC")
args+=(-e "CODERAI_LOG_FILE=$LOG_FILE_CONT")
# Translate the in-container path to the host path for the banner, for the
# standard /config|/models|/cache mounts.
case "$LOG_FILE_CONT" in
/cache/*) LOG_HOST_NOTE="$DATA_ROOT/cache/${LOG_FILE_CONT#/cache/}" ;;
/models/*) LOG_HOST_NOTE="$DATA_ROOT/models/${LOG_FILE_CONT#/models/}" ;;
/config/*) LOG_HOST_NOTE="$DATA_ROOT/config/${LOG_FILE_CONT#/config/}" ;;
*) LOG_HOST_NOTE="$LOG_FILE_CONT (in-container; mount it to see it on the host)" ;;
esac
fi
args+=("${EXTRA_ARGS[@]}" "$IMAGE_TAG") args+=("${EXTRA_ARGS[@]}" "$IMAGE_TAG")
cat <<EOF cat <<EOF
...@@ -100,6 +196,13 @@ Starting CoderAI OCI container ...@@ -100,6 +196,13 @@ Starting CoderAI OCI container
mode: $MODE mode: $MODE
url: http://127.0.0.1:$PORT/admin url: http://127.0.0.1:$PORT/admin
data: $DATA_ROOT data: $DATA_ROOT
config: $CONFIG_NOTE
debug: ${DEBUG_SPEC:-off}
log: $LOG_HOST_NOTE
EOF EOF
if [[ "$LOG_HOST_NOTE" != "(none)" ]]; then
echo " tail it: tail -F '$LOG_HOST_NOTE'"
fi
exec "$ENGINE" "${args[@]}" exec "$ENGINE" "${args[@]}"
#!/usr/bin/env bash
# Smoke test for the all-in-one CoderAI image: brings the container up and checks
# that nginx + the bundled services answer, and that every external binary/worker
# we rely on is present and runnable. Does NOT load models (no weights needed).
#
# Usage: [DOCKER="sudo docker"] [GPU=--gpus=all] ./smoke_test_services.sh [IMAGE]
set -uo pipefail
DOCKER_BIN="${DOCKER:-docker}"
read -r -a DK <<< "$DOCKER_BIN"
IMAGE="${1:-coderai:dist}"
PORT="${PORT:-18080}"
NAME="coderai-smoke-$$"
GPU="${GPU:-}"
TMP="$(mktemp -d)"
fails=0
note(){ printf '%-52s %s\n' "$1" "$2"; }
ok(){ note "$1" "OK"; }
bad(){ note "$1" "FAIL — $2"; fails=$((fails+1)); }
cleanup(){ "${DK[@]}" rm -f "$NAME" >/dev/null 2>&1 || true; rm -rf "$TMP"; }
trap cleanup EXIT
echo "== starting $IMAGE as $NAME (port $PORT) =="
mkdir -p "$TMP/config" "$TMP/models" "$TMP/cache"
# shellcheck disable=SC2086
"${DK[@]}" run -d --name "$NAME" $GPU --ipc=host \
--user "$(id -u):$(id -g)" \
-p "$PORT:8776" \
-v "$TMP/config:/config" -v "$TMP/models:/models" -v "$TMP/cache:/cache" \
"$IMAGE" >/dev/null || { echo "container failed to start"; exit 1; }
echo "== waiting for the front to answer =="
up=0
for _ in $(seq 1 60); do
code="$(curl -s -o /dev/null -w '%{http_code}' "http://127.0.0.1:$PORT/" || true)"
# Any non-5xx HTTP code means the front + coderai are up (the root path itself
# 404s — the UI lives at /admin); 502/503 means the upstream isn't ready yet.
case "$code" in 200|301|302|307|401|403|404) up=1; break;; esac
if ! "${DK[@]}" ps -q --filter "name=$NAME" | grep -q .; then
echo "container exited early; logs:"; "${DK[@]}" logs "$NAME" 2>&1 | tail -40; exit 1
fi
sleep 3
done
[ "$up" = 1 ] && ok "front http://…:$PORT/ responds" || bad "front /" "no response"
echo "== sub-path mounts =="
for p in editor videogen township; do
code="$(curl -s -o /dev/null -w '%{http_code}' "http://127.0.0.1:$PORT/$p/" || true)"
case "$code" in 200|301|302|307) ok "/$p/ ($code)";; *) bad "/$p/" "http $code";; esac
done
echo "== bundled binaries on PATH =="
for b in ffmpeg ffprobe vulkaninfo nginx supervisord whisper-server ds4-server wav2lip sadtalker lspci; do
if "${DK[@]}" exec "$NAME" sh -lc "command -v $b >/dev/null 2>&1"; then ok "bin: $b"; else bad "bin: $b" "missing"; fi
done
echo "== ds4 seeded on the cache volume =="
if "${DK[@]}" exec "$NAME" sh -lc "test -x /cache/ds4/ds4-server"; then ok "/cache/ds4/ds4-server"; else bad "/cache/ds4/ds4-server" "missing"; fi
echo "== shared lip-sync venv (py3.10 + torch) =="
if "${DK[@]}" exec "$NAME" /opt/coderai/lipsync_venv/bin/python -c "import torch,sys; print(sys.version.split()[0], torch.__version__)" >/dev/null 2>&1; then
ok "lipsync venv imports torch"
else
bad "lipsync venv" "python/torch import failed"
fi
# Repo code is bundled; weights are NOT (download on first lip-sync use).
if "${DK[@]}" exec "$NAME" sh -lc "test -f /opt/coderai/Wav2Lip/inference.py && test -f /opt/coderai/SadTalker/inference.py"; then
ok "lip-sync repo code present"
else
bad "lip-sync repo code" "missing"
fi
echo "== parler overlay present =="
if "${DK[@]}" exec "$NAME" sh -lc "test -d /opt/coderai/parler-venv/site-packages"; then ok "parler overlay"; else bad "parler overlay" "missing"; fi
echo
if [ "$fails" = 0 ]; then echo "SMOKE TEST PASSED"; else echo "SMOKE TEST: $fails failure(s)"; "${DK[@]}" logs "$NAME" 2>&1 | tail -30; fi
exit "$fails"
; Process supervisor for the CoderAI distributable image.
; Starts nginx (public :8776) plus the main server and the bundled tool web UIs,
; all bound to localhost behind nginx. Logs go to stdout/stderr so `docker logs`
; shows everything.
[supervisord]
nodaemon=true
logfile=/dev/null
logfile_maxbytes=0
; pid + control socket under /tmp so the container runs as root OR `--user UID`.
pidfile=/tmp/supervisord.pid
[unix_http_server]
file=/tmp/supervisor.sock
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[program:coderai]
; The OCI launcher seeds /config and binds the main server to localhost:18776.
command=/usr/local/bin/coderai
environment=CODERAI_HOST="127.0.0.1",CODERAI_PORT="18776"
autostart=true
autorestart=true
startsecs=5
stopwaitsecs=30
priority=10
; Signal the whole process group so the front's engine subprocesses (and the
; optional `tee` used for file logging) stop/kill together with the launcher.
stopasgroup=true
killasgroup=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
[program:nginx]
command=/usr/sbin/nginx -c /etc/nginx/nginx.conf
autostart=true
autorestart=true
startsecs=3
priority=20
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
[program:video_editor]
command=/usr/local/bin/with-env /opt/coderai/python/bin/python3 /opt/coderai/app/tools/video_editor.py
--no-browser --host 127.0.0.1 --port 8420
--base-url http://127.0.0.1:18776
directory=/opt/coderai/app
autostart=true
autorestart=true
startsecs=5
startretries=5
priority=30
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
[program:videogen]
command=/usr/local/bin/with-env /opt/coderai/python/bin/python3 /opt/coderai/app/tools/videogen.py
--host 127.0.0.1 --web-port 7790
--base-url http://127.0.0.1:18776
--out-dir /cache/videogen_output
directory=/opt/coderai/app
autostart=true
autorestart=true
startsecs=5
startretries=5
priority=30
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
[program:township]
command=/usr/local/bin/with-env /opt/coderai/python/bin/python3 /opt/coderai/app/tools/gen_township_fighters.py
--web-port 7788
--base-url http://127.0.0.1:18776
--out-dir /cache/township_output
directory=/opt/coderai/app
autostart=true
autorestart=true
startsecs=5
startretries=5
priority=30
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
; Parler-TTS runs in its OWN bundled venv (transformers 4.46, pinned). Its
; site-packages is prepended to PYTHONPATH so it shadows the main stack; torch and
; the rest resolve from the standalone Python's site-packages underneathexactly
; the local --system-site-packages layering. Internal-only (not proxied by nginx);
; coderai reaches it via a TTS model config { "service_url": "http://127.0.0.1:8123" }.
; Disabled by default: set autostart=true (or start it from supervisorctl) once a
; parler model is configured. Won't be fatal if the model isn't present.
[program:parler]
command=/usr/local/bin/with-env /opt/coderai/python/bin/python3 /opt/coderai/app/tools/parler_tts_service.py
--model %(ENV_CODERAI_PARLER_MODEL)s --port 8123
environment=PYTHONPATH="/opt/coderai/parler-venv/site-packages"
directory=/opt/coderai/app
autostart=false
autorestart=true
startsecs=10
startretries=3
priority=40
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
#!/usr/bin/env bash
# Fast incremental image update: re-layer ONLY the coderai app code + launcher
# scripts + service configs on top of an already-built image. No 20 GB bundle
# recopy — seconds, not the ~15 min of a full build_oci_image.sh run.
#
# It keeps an immutable `coderai:base` tag (the heavy bundle) and rebuilds the
# shipped `coderai:dist` as base + a thin app layer. Because every update starts
# from the SAME base, app layers never stack up over repeated updates.
#
# Usage:
# [DOCKER="sudo docker"] ./update_oci_image.sh
# BASE_IMAGE=coderai:base TAG=coderai:dist DOCKER="sudo docker" ./update_oci_image.sh
#
# First run seeds coderai:base from the current coderai:dist. To re-baseline the
# bundle (new venv/libs/tools), run build_oci_image.sh and then:
# docker rmi coderai:base # drop the stale base; next update re-seeds it
set -euo pipefail
HERE="$(cd "$(dirname "$0")" && pwd)"
REPO_ROOT="$(cd "$HERE/../.." && pwd)"
DOCKER_BIN="${DOCKER:-docker}"
read -r -a DK <<< "$DOCKER_BIN"
BASE_IMAGE="${BASE_IMAGE:-coderai:base}"
TAG="${TAG:-coderai:dist}"
SEED_FROM="${SEED_FROM:-coderai:dist}"
img_exists(){ "${DK[@]}" image inspect "$1" >/dev/null 2>&1; }
# Seed the immutable base from a previously built full image if it doesn't exist.
if ! img_exists "$BASE_IMAGE"; then
if img_exists "$SEED_FROM"; then
echo "== seeding immutable base '$BASE_IMAGE' from '$SEED_FROM' =="
"${DK[@]}" tag "$SEED_FROM" "$BASE_IMAGE"
else
echo "Base '$BASE_IMAGE' and seed '$SEED_FROM' both missing." >&2
echo "Run packaging/linux/build_oci_image.sh for a full build first." >&2
exit 1
fi
fi
echo "== updating '$TAG' from base '$BASE_IMAGE' (app code only) =="
t0=$(date +%s)
"${DK[@]}" build \
-f "$HERE/Dockerfile.update" \
--build-arg BASE_IMAGE="$BASE_IMAGE" \
-t "$TAG" "$REPO_ROOT"
echo "== done in $(( $(date +%s) - t0 ))s: '$TAG' (base '$BASE_IMAGE' unchanged) =="
echo " Tip: 'docker image prune -f' to drop the now-dangling previous '$TAG' layer."
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment