feat: model "to-download" list, mmproj vision, styled modals, broker + packaging

Web UI / models:
- "To download" wishlist: models known but not on disk and not configured show
  as non-configured to-download rows. Free-disk on an unconfigured model, Remove
  on a model with no files left, and a new "Add to list" button in the download
  window all record into models.json `to_download`; pruned on enable/download.
  New endpoints model-mark-download / model-unmark-download.
- mmproj multimodal components: mmproj GGUFs are classified as components (not
  models), selectable per-GGUF in the model config (auto-selected, enables vision
  capability). VulkanBackend loads them via llama.cpp's MTMDChatHandler (--mmproj
  equivalent), and the chat path now forwards image_url content end-to-end.
- All window.alert() replaced by a shared styled showAlert()/showConfirm() modal
  in base.html (used across every admin template).

Front proxy / broker:
- Fix engine model-assignment NameError (keep -> _keep).
- Brokered GET /coderai/capabilities now answers from the front (whole node) so
  multi-GPU hosts report every card, not a single engine's CUDA-visible one.
- Log a clear reason when the broker is disabled.

Packaging (distributable OCI image):
- Multi-stage venv image + smoke test; bundle ds4/wav2lip/sadtalker + parler;
  whisper-server etc. dereferenced (cp -aL) so no dangling symlinks.
- Dockerfile.update + update_oci_image.sh: ~30s incremental code-only rebuild on
  an immutable coderai:base (no 20GB bundle recopy).
- run_oci.sh: --local/--config-dir + --map to run against existing local config
  and data dirs without a rebuild; --debug[=flags] + --log-file for selectable
  debug flags and a host-tailable file log (launcher tees; supervisord kills the
  process group). tmp_janitor age-prunes the dedicated temp dir.
Co-Authored-By: 's avatarClaude Opus 4.8 <noreply@anthropic.com>
parent 9d023ec2
...@@ -43,6 +43,10 @@ township_output/ ...@@ -43,6 +43,10 @@ township_output/
.packaging-cache/ .packaging-cache/
tmp/ tmp/
# Exported image tarballs + local OCI run-state (large artifacts)
dist/
coderai-runtime/
# Video editor sessions + generated media (runtime artifacts) # Video editor sessions + generated media (runtime artifacts)
video_editor/sessions/ video_editor/sessions/
tools/coderai_media/ tools/coderai_media/
...@@ -286,3 +286,67 @@ safe. ...@@ -286,3 +286,67 @@ safe.
14. Thermal protection is config-driven and model-agnostic (config.json 14. Thermal protection is config-driven and model-agnostic (config.json
`thermal`). Don't special-case it per model/backend; it only reads temps and `thermal`). Don't special-case it per model/backend; it only reads temps and
sleeps. Honour the enable flags and high/resume hysteresis. sleeps. Honour the enable flags and high/resume hysteresis.
================================================================================
## Distributable Docker image (packaging/linux)
================================================================================
All-in-one image: coderai + tools (editor/videogen/township) behind nginx on a
single port (8776), built from the LOCAL install's venv + binaries.
Multi-stage `Dockerfile.oci-venv`:
- assembler stage stages the local bundle into /opt/coderai (python-build-
standalone interpreter + venv site-packages + ldd'd native libs + parler
overlay + lip-sync venv/repos + py310 + ds4). The ~20 GB bundle COPY lives
ONLY here; the runtime stage COPYs the assembled tree ONCE (no double-store).
- runtime stage: apt (nginx/supervisor/vulkan-tools/ffmpeg/...), COPY the
assembled /opt/coderai, then COPY app code → /opt/coderai/app, launchers →
/usr/local/bin, nginx/supervisor confs. Entry = coderai-entrypoint →
supervisord (nginx + main server + tool UIs).
- Do NOT set PYTHONHOME globally (breaks the system-python supervisord); set
PATH only. Bundle dereferences host symlinks (cp -aL) so binaries like
whisper-server are real files in the image, not dangling links.
Full build (slow, ~15 min — rebuilds the bundle):
packaging/linux/build_oci_image.sh # tags coderai:dist
Smoke test (no weights, checks services + every bundled binary):
DOCKER="sudo docker" GPU="--gpus all" PORT=18082 \
packaging/linux/smoke_test_services.sh coderai:dist
Run against your LIVE local config + data (no rebuild — pure bind-mounts):
packaging/linux/run_oci.sh --nvidia --local \
--map /AI/guffcache --map /AI/huggingface --map /AI/offloads
- The image launcher reads config from /config/coderai and runs
`coderai --config /config/coderai`, rewriting server.host/port in config.json.
- `--local` (= --config-dir ~/.coderai) copies ONLY the *.json config files to
a temp dir and mounts it at /config/coderai, so your real config is untouched
(use --inplace-config to edit it directly).
- `--map HOST[:CONT]` bind-mounts a host dir at the SAME path inside the
container so the ABSOLUTE paths in models.json/config.json (gguf/hf caches,
offloads) resolve unchanged. Without these maps the models won't be found.
- `--debug[=SPEC]` runs coderai with --debug* flags (SPEC default 'all';
e.g. `--debug=engine,requests,ws` --debug-engine/--debug-requests/--debug-ws,
`--debug` always auto-added) and writes a host-tailable file log. `--log-file
PATH` sets the in-container log path (default /cache/logs/coderai.log host
under the cache mount). Driven by env CODERAI_DEBUG + CODERAI_LOG_FILE, read
by the coderai-oci launcher, which tees output so `docker logs` still works.
supervisord [program:coderai] uses stopasgroup/killasgroup so the front's
engine subprocesses + the tee are torn down together. NOTE: the launcher +
supervisord.conf are baked in, so changes need a (fast) update_oci_image.sh.
Incremental update (FAST, ~30 s — code-only changes, NO bundle recopy):
DOCKER="sudo docker" packaging/linux/update_oci_image.sh
- `Dockerfile.update` is `FROM coderai:base` and re-layers ONLY the app code +
launchers + service confs. The heavy bundle layers are inherited unchanged.
- Keeps an immutable `coderai:base` (the bundle) and rebuilds `coderai:dist`
as base + a thin app layer. Every update starts from the SAME base, so app
layers never stack across updates. dist and base SHARE the bundle layers —
keeping both costs only the app layer (a few MB), not a second 23 GB.
- First run seeds coderai:base from the current coderai:dist (docker tag).
- Re-baseline the bundle (new venv/libs/tools): run build_oci_image.sh, then
`docker rmi coderai:base` so the next update re-seeds it from the new dist.
- Use this whenever ONLY codai/ app code (or launchers/confs) changed — a full
build_oci_image.sh is wasteful for that.
- CAUTION: COPY adds/overwrites but does NOT delete files removed from the
repo; the cleanup RUN prunes only known-stale paths (.git/venv*/dist/...). A
source file deleted from codai/ lingers in the overlay until a full rebuild.
...@@ -980,6 +980,14 @@ async def api_download_model( ...@@ -980,6 +980,14 @@ async def api_download_model(
if existing: if existing:
return {"session_id": existing, "attached": True} return {"session_id": existing, "attached": True}
# A download supersedes any "to download" wishlist entry for this model.
if config_manager is not None:
changed = _prune_to_download(model_id)
if file_pattern:
changed = _prune_to_download(file_pattern) or changed
if changed:
config_manager.save_models()
session_id = str(_uuid.uuid4()) session_id = str(_uuid.uuid4())
pq = _q.Queue() pq = _q.Queue()
_download_sessions[session_id] = pq _download_sessions[session_id] = pq
...@@ -1170,6 +1178,58 @@ def _hf_repo_id_from_path(path: str) -> str: ...@@ -1170,6 +1178,58 @@ def _hf_repo_id_from_path(path: str) -> str:
return '' return ''
# Categories that hold real (configured) models in models.json.
_VALID_MODEL_CATS = {
"text_models", "image_models", "audio_models", "gguf_models", "tts_models",
"vision_models", "video_models", "audio_gen_models", "embedding_models",
"spatial_models",
}
def _entry_key(entry) -> str:
"""The identifying path/id of a models.json entry (str or dict)."""
if isinstance(entry, str):
return entry
if isinstance(entry, dict):
return entry.get("path") or entry.get("id") or ""
return ""
def _basename_key(key: str) -> str:
import os as _os
return _os.path.basename(key) if ("/" in key or _os.sep in key) else key
def _is_model_configured(model_id: str) -> bool:
"""True if model_id is already a configured model (matched by id or basename)."""
if config_manager is None:
return False
fname = _basename_key(model_id)
for cat in _VALID_MODEL_CATS:
for m in config_manager.models_data.get(cat, []):
key = _entry_key(m)
if key == model_id or (fname and _basename_key(key) == fname):
return True
return False
def _prune_to_download(model_id: str) -> bool:
"""Drop any 'to download' wishlist entry matching model_id. Returns True if changed."""
if config_manager is None:
return False
lst = config_manager.models_data.get("to_download")
if not lst:
return False
fname = _basename_key(model_id)
kept = [e for e in lst
if not (_entry_key(e) == model_id
or (fname and _basename_key(_entry_key(e)) == fname))]
if len(kept) != len(lst):
config_manager.models_data["to_download"] = kept
return True
return False
def _scan_caches() -> dict: def _scan_caches() -> dict:
import os import os
result: dict = {"hf": [], "gguf": []} result: dict = {"hf": [], "gguf": []}
...@@ -1451,6 +1511,49 @@ def _scan_caches() -> dict: ...@@ -1451,6 +1511,49 @@ def _scan_caches() -> dict:
"configs": all_configs.get(path, []), "configs": all_configs.get(path, []),
}) })
# Surface "to download" wishlist entries: models the user wants listed for
# later download but has NOT configured and are NOT on disk. They appear as
# non-configured rows with a download button (in_config=False, missing=True).
seen_gguf = {m["path"] for m in result["gguf"]} | {m["filename"] for m in result["gguf"]}
seen_hf = {m["id"] for m in result["hf"]}
if config_manager:
for entry in config_manager.models_data.get("to_download", []):
e = entry if isinstance(entry, dict) else {"path": entry}
mid = (e.get("path") or e.get("id") or "").strip()
if not mid or _is_model_configured(mid):
continue
repo = e.get("source_repo") or mid
mtype = e.get("model_type") or "text_models"
is_gguf = (bool(e.get("is_gguf")) or mid.lower().endswith(".gguf")
or "gguf" in mid.lower() or mtype == "gguf_models")
fname = os.path.basename(mid) if ("/" in mid or os.sep in mid) else mid
caps = e.get("capabilities") or detect_model_capabilities(mid).to_list()
if is_gguf:
if mid in seen_gguf or fname in seen_gguf:
continue
result["gguf"].append({
"filename": fname, "path": mid,
"size_gb": 0, "size_bytes": 0,
"in_config": False, "missing": True, "to_download": True,
"source_repo": repo,
"model_type": mtype if mtype != "gguf_models" else "text_models",
"settings": {}, "capabilities": caps,
"incomplete": False, "configs": [],
})
seen_gguf.add(mid); seen_gguf.add(fname)
else:
if mid in seen_hf:
continue
result["hf"].append({
"id": mid, "size_gb": 0, "size_bytes": 0, "revision_count": 0,
"files": [], "file_count": 0,
"in_config": False, "missing": True, "to_download": True,
"source_repo": repo, "model_type": mtype,
"settings": {}, "capabilities": caps,
"incomplete": False, "configs": [],
})
seen_hf.add(mid)
return result return result
...@@ -1729,6 +1832,60 @@ async def api_model_add_known(request: Request, username: str = Depends(require_ ...@@ -1729,6 +1832,60 @@ async def api_model_add_known(request: Request, username: str = Depends(require_
return {"success": True, "already": True} return {"success": True, "already": True}
config_manager.models_data.setdefault(model_type, []).append(entry) config_manager.models_data.setdefault(model_type, []).append(entry)
_prune_to_download(model_id)
config_manager.save_models()
_broker_notify_models_updated(request)
return {"success": True}
@router.post("/admin/api/model-mark-download", summary="List a model for later download")
async def api_model_mark_download(request: Request, username: str = Depends(require_admin)):
"""Record a model in the 'to download' wishlist: it appears in the model list
as a non-configured, to-be-downloaded entry (no files fetched, no serving
config created). Used by 'Free disk' on unconfigured models, 'Remove' on a
model with no files left, and 'Add to list' in the download window."""
if config_manager is None:
raise HTTPException(status_code=503, detail="Config manager not initialized")
data = await request.json()
model_id = (data.get("model_id") or data.get("path") or "").strip()
if not model_id:
raise HTTPException(status_code=400, detail="model_id is required")
source_repo = (data.get("source_repo") or model_id).strip()
model_type = (data.get("model_type") or "").strip()
is_gguf = (bool(data.get("is_gguf")) or model_type == "gguf_models"
or model_id.lower().endswith(".gguf") or "gguf" in model_id.lower())
if is_gguf:
model_type = "gguf_models"
if model_type not in _VALID_MODEL_CATS:
model_type = "text_models"
# Already a real (configured) model — nothing to add.
if _is_model_configured(model_id):
return {"success": True, "already_configured": True}
import os as _os
lst = config_manager.models_data.setdefault("to_download", [])
fname = _basename_key(model_id)
for e in lst:
k = _entry_key(e)
if k == model_id or (fname and _basename_key(k) == fname):
return {"success": True, "already": True}
lst.append({"path": model_id, "source_repo": source_repo,
"model_type": model_type, "is_gguf": is_gguf})
config_manager.save_models()
_broker_notify_models_updated(request)
return {"success": True}
@router.post("/admin/api/model-unmark-download", summary="Remove a model from the download list")
async def api_model_unmark_download(request: Request, username: str = Depends(require_admin)):
"""Drop a model from the 'to download' wishlist (the user no longer wants it
listed). Has no effect on configured models or files on disk."""
if config_manager is None:
raise HTTPException(status_code=503, detail="Config manager not initialized")
data = await request.json()
model_id = (data.get("model_id") or data.get("path") or "").strip()
if not model_id:
raise HTTPException(status_code=400, detail="model_id is required")
if _prune_to_download(model_id):
config_manager.save_models() config_manager.save_models()
_broker_notify_models_updated(request) _broker_notify_models_updated(request)
return {"success": True} return {"success": True}
...@@ -1747,8 +1904,13 @@ async def api_model_enable(request: Request, username: str = Depends(require_adm ...@@ -1747,8 +1904,13 @@ async def api_model_enable(request: Request, username: str = Depends(require_adm
if model_type not in valid: if model_type not in valid:
raise HTTPException(status_code=400, detail=f"model_type must be one of {valid}") raise HTTPException(status_code=400, detail=f"model_type must be one of {valid}")
lst = config_manager.models_data.setdefault(model_type, []) lst = config_manager.models_data.setdefault(model_type, [])
changed = False
if path not in lst: if path not in lst:
lst.append(path) lst.append(path)
changed = True
if _prune_to_download(path):
changed = True
if changed:
config_manager.save_models() config_manager.save_models()
_broker_notify_models_updated(request) _broker_notify_models_updated(request)
return {"success": True} return {"success": True}
...@@ -2285,7 +2447,7 @@ async def api_model_configure(request: Request, username: str = Depends(require_ ...@@ -2285,7 +2447,7 @@ async def api_model_configure(request: Request, username: str = Depends(require_
"component_quantization", "output_crf", "force_vram_update", "component_quantization", "output_crf", "force_vram_update",
"balanced_gpu_percent", "acceleration", "balanced_gpu_percent", "acceleration",
"cache_type_k", "cache_type_v", "turboquant", "engine", "cache_type_k", "cache_type_v", "turboquant", "engine",
"quant_backend", "kv_cache_budget_mb", "kv_cache_slots"): "quant_backend", "kv_cache_budget_mb", "kv_cache_slots", "mmproj"):
if key in data: if key in data:
entry[key] = data[key] entry[key] = data[key]
......
...@@ -335,7 +335,7 @@ async function deleteEntry() { ...@@ -335,7 +335,7 @@ async function deleteEntry() {
closeDetail(); closeDetail();
loadArchive(); loadArchive();
} catch(e) { } catch(e) {
alert('Delete failed: ' + e.message); showAlert('Delete failed: ' + e.message);
} }
} }
......
...@@ -104,6 +104,81 @@ function donateCopy(id, btn) { ...@@ -104,6 +104,81 @@ function donateCopy(id, btn) {
</main> </main>
{% endif %} {% endif %}
<!-- Shared confirm / notice modal (replaces window.confirm / window.alert) -->
<div id="confirm-modal" class="modal" onclick="if(event.target===this)document.getElementById('confirm-modal-cancel').click()">
<div class="modal-box" style="max-width:420px">
<div class="modal-head">
<span class="modal-title" id="confirm-modal-title">Confirm</span>
<button class="modal-close" id="confirm-modal-x">&times;</button>
</div>
<div class="modal-body">
<p id="confirm-modal-msg" style="margin:0 0 1.25rem;white-space:pre-wrap"></p>
<div style="display:flex;gap:.5rem;justify-content:flex-end">
<button class="btn btn-ghost" id="confirm-modal-cancel">Cancel</button>
<button class="btn btn-danger" id="confirm-modal-ok">Confirm</button>
</div>
</div>
</div>
</div>
<script>
// Global modal helpers, shared by every admin page. Defined here so templates
// can call showAlert()/showConfirm() instead of window.alert()/window.confirm().
if(typeof window.openModal!=='function') window.openModal=function(id){document.getElementById(id).classList.add('show')};
if(typeof window.closeModal!=='function') window.closeModal=function(id){document.getElementById(id).classList.remove('show')};
window.showConfirm=function(title, msg, okLabel){
return new Promise(resolve => {
document.getElementById('confirm-modal-title').textContent = title;
document.getElementById('confirm-modal-msg').textContent = msg;
const okBtn = document.getElementById('confirm-modal-ok');
const cancelBtn= document.getElementById('confirm-modal-cancel');
const xBtn = document.getElementById('confirm-modal-x');
okBtn.className = 'btn btn-danger';
okBtn.textContent = okLabel || 'Confirm';
cancelBtn.style.display = '';
openModal('confirm-modal');
function cleanup(result){
closeModal('confirm-modal');
okBtn.removeEventListener('click', onOk);
cancelBtn.removeEventListener('click', onCancel);
xBtn.removeEventListener('click', onCancel);
resolve(result);
}
function onOk(){ cleanup(true); }
function onCancel(){ cleanup(false); }
okBtn.addEventListener('click', onOk);
cancelBtn.addEventListener('click', onCancel);
xBtn.addEventListener('click', onCancel);
});
};
// Styled replacement for window.alert(): a single-button notice modal.
window.showAlert=function(msg, title, kind){
return new Promise(resolve => {
if(!title && !kind && /^\s*(error|failed|cannot|could not)\b/i.test(String(msg||''))) kind = 'error';
document.getElementById('confirm-modal-title').textContent =
title || (kind === 'error' ? 'Error' : 'Notice');
document.getElementById('confirm-modal-msg').textContent = msg;
const okBtn = document.getElementById('confirm-modal-ok');
const cancelBtn = document.getElementById('confirm-modal-cancel');
const xBtn = document.getElementById('confirm-modal-x');
okBtn.className = 'btn btn-primary';
okBtn.textContent = 'OK';
cancelBtn.style.display = 'none';
openModal('confirm-modal');
function cleanup(){
closeModal('confirm-modal');
cancelBtn.style.display = '';
okBtn.removeEventListener('click', onOk);
xBtn.removeEventListener('click', onOk);
resolve();
}
function onOk(){ cleanup(); }
okBtn.addEventListener('click', onOk);
xBtn.addEventListener('click', onOk);
});
};
</script>
{% block scripts %}{% endblock %} {% block scripts %}{% endblock %}
</body> </body>
</html> </html>
...@@ -4229,12 +4229,12 @@ async function loadCharProfileIntoSlot(prefix, idx, name) { ...@@ -4229,12 +4229,12 @@ async function loadCharProfileIntoSlot(prefix, idx, name) {
charSlots[prefix][idx].name = charSlots[prefix][idx].name || d.name; charSlots[prefix][idx].name = charSlots[prefix][idx].name || d.name;
charSlots[prefix][idx].images = (d.images||[]).map(img => img.data); charSlots[prefix][idx].images = (d.images||[]).map(img => img.data);
renderCharSlots(prefix); renderCharSlots(prefix);
} catch(e) { alert('Failed to load profile: '+e.message); } } catch(e) { showAlert('Failed to load profile: '+e.message); }
} }
async function saveCharSlotAsProfile(prefix, idx) { async function saveCharSlotAsProfile(prefix, idx) {
const slot = charSlots[prefix]?.[idx]; const slot = charSlots[prefix]?.[idx];
if (!slot || !slot.images.length) { alert('Add at least one image first.'); return; } if (!slot || !slot.images.length) { showAlert('Add at least one image first.'); return; }
const name = slot.name || prompt('Profile name:'); const name = slot.name || prompt('Profile name:');
if (!name) return; if (!name) return;
try { try {
...@@ -4246,8 +4246,8 @@ async function saveCharSlotAsProfile(prefix, idx) { ...@@ -4246,8 +4246,8 @@ async function saveCharSlotAsProfile(prefix, idx) {
charSlots[prefix][idx].name = name; charSlots[prefix][idx].name = name;
await loadCharProfileList(); await loadCharProfileList();
renderCharSlots(prefix); renderCharSlots(prefix);
alert(`Saved profile "${name}"`); showAlert(`Saved profile "${name}"`);
} catch(e) { alert('Save failed: '+e.message); } } catch(e) { showAlert('Save failed: '+e.message); }
} }
// ───────────────────────────────────────────────────────────────── // ─────────────────────────────────────────────────────────────────
...@@ -6051,14 +6051,14 @@ async function profCharView(name) { ...@@ -6051,14 +6051,14 @@ async function profCharView(name) {
try { try {
const d = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name)).then(r=>r.json()); const d = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name)).then(r=>r.json());
_openProfModal(`Character: ${d.name}`, d.description||'', d.images||[]); _openProfModal(`Character: ${d.name}`, d.description||'', d.images||[]);
} catch(e) { alert('Failed to load character: ' + e.message); } } catch(e) { showAlert('Failed to load character: ' + e.message); }
} }
async function profCharDelete(name) { async function profCharDelete(name) {
if (!confirm(`Delete character profile "${name}"?`)) return; if (!confirm(`Delete character profile "${name}"?`)) return;
const r = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name), {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name), {method:'DELETE'});
if (r.ok) await profCharLoad(); if (r.ok) await profCharLoad();
else alert('Delete failed: ' + await r.text()); else showAlert('Delete failed: ' + await r.text());
} }
...@@ -6139,7 +6139,7 @@ async function profVoiceDelete(name) { ...@@ -6139,7 +6139,7 @@ async function profVoiceDelete(name) {
if (!confirm(`Delete voice profile "${name}"?`)) return; if (!confirm(`Delete voice profile "${name}"?`)) return;
const r = await fetch(ROOT_PATH + '/admin/api/voices/'+encodeURIComponent(name), {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/voices/'+encodeURIComponent(name), {method:'DELETE'});
if (r.ok) await profVoiceLoad(); if (r.ok) await profVoiceLoad();
else alert('Delete failed: ' + await r.text()); else showAlert('Delete failed: ' + await r.text());
} }
// ───────────────────────────────────────────────────────────────── // ─────────────────────────────────────────────────────────────────
...@@ -6296,14 +6296,14 @@ async function profEnvView(name) { ...@@ -6296,14 +6296,14 @@ async function profEnvView(name) {
try { try {
const d = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name)).then(r=>r.json()); const d = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name)).then(r=>r.json());
_openProfModal(`Environment: ${d.name}`, d.description||'', d.images||[]); _openProfModal(`Environment: ${d.name}`, d.description||'', d.images||[]);
} catch(e) { alert('Failed to load environment: ' + e.message); } } catch(e) { showAlert('Failed to load environment: ' + e.message); }
} }
async function profEnvDelete(name) { async function profEnvDelete(name) {
if (!confirm(`Delete environment profile "${name}"?`)) return; if (!confirm(`Delete environment profile "${name}"?`)) return;
const r = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name), {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name), {method:'DELETE'});
if (r.ok) await profEnvLoad(); if (r.ok) await profEnvLoad();
else alert('Delete failed: ' + await r.text()); else showAlert('Delete failed: ' + await r.text());
} }
// ───────────────────────────────────────────────────────────────── // ─────────────────────────────────────────────────────────────────
...@@ -6528,7 +6528,7 @@ async function deleteCustomPipeline(id) { ...@@ -6528,7 +6528,7 @@ async function deleteCustomPipeline(id) {
_customPipelines = _customPipelines.filter(p => p.id !== id); _customPipelines = _customPipelines.filter(p => p.id !== id);
if (_editingPipelineId === id) { _editingPipelineId = null; _pbSteps = []; renderBuilderSteps(); } if (_editingPipelineId === id) { _editingPipelineId = null; _pbSteps = []; renderBuilderSteps(); }
renderCustomPipelineCards(); renderCustomPipelineCards();
} catch(e) { alert('Delete failed: '+e.message); } } catch(e) { showAlert('Delete failed: '+e.message); }
} }
function _renderPipelineResult(outId, progId, d) { function _renderPipelineResult(outId, progId, d) {
...@@ -6683,7 +6683,7 @@ async function archiveDelete(filename) { ...@@ -6683,7 +6683,7 @@ async function archiveDelete(filename) {
_archiveFiles = _archiveFiles.filter(f => f.filename !== filename); _archiveFiles = _archiveFiles.filter(f => f.filename !== filename);
renderArchive(); renderArchive();
} catch(e) { } catch(e) {
alert('Delete failed: ' + e.message); showAlert('Delete failed: ' + e.message);
} }
} }
......
This diff is collapsed.
...@@ -244,9 +244,9 @@ async function restartEngine(id, name){ ...@@ -244,9 +244,9 @@ async function restartEngine(id, name){
if (!confirm(`Restart engine "${name}"? In-flight requests on it will fail; the supervisor respawns it immediately.`)) return; if (!confirm(`Restart engine "${name}"? In-flight requests on it will fail; the supervisor respawns it immediately.`)) return;
try { try {
const r = await fetch(ROOT_PATH + '/admin/api/engines/' + id + '/restart', {method:'POST'}); const r = await fetch(ROOT_PATH + '/admin/api/engines/' + id + '/restart', {method:'POST'});
if (!r.ok) { const e = await r.json().catch(()=>({})); alert(e.detail || 'Restart failed'); } if (!r.ok) { const e = await r.json().catch(()=>({})); showAlert(e.detail || 'Restart failed'); }
setTimeout(loadEngines, 800); setTimeout(loadEngines, 800);
} catch(e) { alert(e.message); } } catch(e) { showAlert(e.message); }
} }
let _refreshing = false; let _refreshing = false;
...@@ -338,9 +338,9 @@ async function taskAction(id, action) { ...@@ -338,9 +338,9 @@ async function taskAction(id, action) {
const r = await fetch(ROOT_PATH + '/admin/api/tasks/' + encodeURIComponent(id) + '/' + action, {method:'POST'}); const r = await fetch(ROOT_PATH + '/admin/api/tasks/' + encodeURIComponent(id) + '/' + action, {method:'POST'});
if (!r.ok) { if (!r.ok) {
const e = await r.json().catch(() => ({})); const e = await r.json().catch(() => ({}));
alert(e.detail || (verb + ' failed')); showAlert(e.detail || (verb + ' failed'));
} }
} catch (e) { alert(e.message); } } catch (e) { showAlert(e.message); }
loadTasks(); loadTasks();
} }
...@@ -349,9 +349,9 @@ async function removeTask(id) { ...@@ -349,9 +349,9 @@ async function removeTask(id) {
const r = await fetch(ROOT_PATH + '/admin/api/tasks/' + encodeURIComponent(id), {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/tasks/' + encodeURIComponent(id), {method:'DELETE'});
if (!r.ok) { if (!r.ok) {
const e = await r.json().catch(() => ({})); const e = await r.json().catch(() => ({}));
alert(e.detail || 'Remove failed'); showAlert(e.detail || 'Remove failed');
} }
} catch (e) { alert(e.message); } } catch (e) { showAlert(e.message); }
loadTasks(); loadTasks();
} }
......
...@@ -126,15 +126,15 @@ async function createToken() { ...@@ -126,15 +126,15 @@ async function createToken() {
openModal('show-modal'); openModal('show-modal');
loadTokens(); loadTokens();
} else { } else {
const e = await r.json(); alert(e.detail || 'Failed'); const e = await r.json(); showAlert(e.detail || 'Failed');
} }
} catch (e) { alert(e.message); } } catch (e) { showAlert(e.message); }
} }
async function delToken(id) { async function delToken(id) {
if (!confirm('Delete this token? Clients using it will lose access immediately.')) return; if (!confirm('Delete this token? Clients using it will lose access immediately.')) return;
const r = await fetch(ROOT_PATH + '/admin/api/tokens/'+id, {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/tokens/'+id, {method:'DELETE'});
if (r.ok) loadTokens(); else alert('Failed to delete'); if (r.ok) loadTokens(); else showAlert('Failed to delete');
} }
loadTokens(); loadTokens();
......
...@@ -105,7 +105,7 @@ async function delUser(id, name) { ...@@ -105,7 +105,7 @@ async function delUser(id, name) {
if (!confirm('Delete user "' + name + '"?')) return; if (!confirm('Delete user "' + name + '"?')) return;
const r = await fetch(ROOT_PATH + '/admin/api/users/'+id, {method:'DELETE'}); const r = await fetch(ROOT_PATH + '/admin/api/users/'+id, {method:'DELETE'});
if (r.ok) location.reload(); if (r.ok) location.reload();
else { const e = await r.json(); alert(e.detail || 'Failed'); } else { const e = await r.json(); showAlert(e.detail || 'Failed'); }
} }
</script> </script>
{% endblock %} {% endblock %}
...@@ -243,6 +243,33 @@ def log_response_payload(payload, streamed=False): ...@@ -243,6 +243,33 @@ def log_response_payload(payload, streamed=False):
router = APIRouter() router = APIRouter()
def _normalize_vision_content(content: list) -> list:
"""Normalize an OpenAI multipart message content list to the shape the
llama.cpp multimodal (mmproj) handler expects: text parts as
``{"type":"text","text":...}`` and images as
``{"type":"image_url","image_url":{"url": ...}}``. The url may be an http(s)
link or a ``data:image/...;base64,...`` URI — both are accepted. Unknown
parts are dropped to a text placeholder so nothing crashes the handler."""
norm = []
for item in content:
if not isinstance(item, dict):
norm.append({"type": "text", "text": str(item)})
continue
t = item.get("type")
if t == "text" and "text" in item:
norm.append({"type": "text", "text": item["text"]})
elif t in ("image_url", "input_image"):
iu = item.get("image_url") if t == "image_url" else item.get("image")
url = iu.get("url") if isinstance(iu, dict) else iu
if url:
norm.append({"type": "image_url", "image_url": {"url": url}})
elif "text" in item:
norm.append({"type": "text", "text": str(item["text"])})
else:
norm.append({"type": "text", "text": f"[{t or 'unknown'} content]"})
return norm
@router.post("/v1/chat/completions", summary="Chat completions") @router.post("/v1/chat/completions", summary="Chat completions")
async def chat_completions(request: ChatCompletionRequest, http_request: Request = None): async def chat_completions(request: ChatCompletionRequest, http_request: Request = None):
"""Chat completions endpoint with streaming and tool support.""" """Chat completions endpoint with streaming and tool support."""
...@@ -519,6 +546,12 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request ...@@ -519,6 +546,12 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request
"Another model may be using all available VRAM.") "Another model may be using all available VRAM.")
current_manager = mm current_manager = mm
# Does the resolved (loaded) model accept images? True only when an mmproj
# projector was loaded into the llama.cpp backend (see VulkanBackend). When
# set, multipart image content is preserved end-to-end instead of being
# flattened to a text placeholder, so the multimodal handler can see it.
_vision_ok = bool(getattr(getattr(current_manager, 'backend', None), 'supports_vision', False))
# Inject system prompt if --system-prompt flag was provided # Inject system prompt if --system-prompt flag was provided
messages = request.messages messages = request.messages
global_system_prompt = get_global_system_prompt() global_system_prompt = get_global_system_prompt()
...@@ -733,6 +766,14 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request ...@@ -733,6 +766,14 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request
if content is None: if content is None:
content = "" content = ""
elif isinstance(content, list): elif isinstance(content, list):
_has_image = _vision_ok and any(
isinstance(it, dict) and it.get('type') in ('image_url', 'input_image')
for it in content)
if _has_image:
# Vision (mmproj) model: keep OpenAI multipart content so the
# llama.cpp multimodal handler receives the images themselves.
content = _normalize_vision_content(content)
else:
# Handle multipart content array format: [{"type": "text", "text": "..."}] # Handle multipart content array format: [{"type": "text", "text": "..."}]
parts = [] parts = []
for item in content: for item in content:
...@@ -744,7 +785,11 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request ...@@ -744,7 +785,11 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request
else: else:
parts.append(str(item)) parts.append(str(item))
content = '\n'.join(parts) content = '\n'.join(parts)
# Ensure content is never None - convert to string # Ensure content is never None - convert to string (but keep multipart
# vision content as a list so the multimodal handler can consume it).
if isinstance(content, list):
msg_dict["content"] = content
else:
msg_dict["content"] = str(content) if content is not None else "" msg_dict["content"] = str(content) if content is not None else ""
# Handle tool_calls - convert to proper format if present # Handle tool_calls - convert to proper format if present
if msg.tool_calls: if msg.tool_calls:
...@@ -765,8 +810,9 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request ...@@ -765,8 +810,9 @@ async def chat_completions(request: ChatCompletionRequest, http_request: Request
# Handle None content # Handle None content
elif m.get("content") is None: elif m.get("content") is None:
messages_dict[i]["content"] = "" messages_dict[i]["content"] = ""
# Handle content that's not a string (shouldn't happen but be safe) # Handle content that's not a string (shouldn't happen but be safe).
elif not isinstance(m["content"], str): # A list is legitimate multipart vision content — leave it intact.
elif not isinstance(m["content"], str) and not isinstance(m["content"], list):
messages_dict[i]["content"] = str(m["content"]) messages_dict[i]["content"] = str(m["content"])
......
...@@ -20,6 +20,7 @@ ...@@ -20,6 +20,7 @@
import os import os
import json import json
import threading import threading
import time
from typing import AsyncIterator, Optional, Union, List, Dict, Any from typing import AsyncIterator, Optional, Union, List, Dict, Any
from pathlib import Path from pathlib import Path
...@@ -116,6 +117,74 @@ _KV_TYPE_ALIASES = { ...@@ -116,6 +117,74 @@ _KV_TYPE_ALIASES = {
# Sub-8-bit KV types that llama.cpp can only use with flash attention enabled. # Sub-8-bit KV types that llama.cpp can only use with flash attention enabled.
_KV_NEEDS_FLASH = {'q5_0', 'q5_1', 'q5', 'q4_0', 'q4_1', 'q4', 'iq4_nl'} _KV_NEEDS_FLASH = {'q5_0', 'q5_1', 'q5', 'q4_0', 'q4_1', 'q4', 'iq4_nl'}
_GGUF_META_CACHE: dict = {}
def _gguf_block_count(path) -> int:
"""Layer (block) count from a GGUF header (``*.block_count``), 0 if unknown.
Reads only the metadata KV section (no tensors). Cached per path."""
if not path:
return 0
if path in _GGUF_META_CACHE:
return _GGUF_META_CACHE[path]
import struct
result = 0
try:
with open(path, 'rb') as f:
if f.read(4) != b'GGUF':
_GGUF_META_CACHE[path] = 0
return 0
struct.unpack('<I', f.read(4)) # version
struct.unpack('<Q', f.read(8)) # tensor count
n_kv = struct.unpack('<Q', f.read(8))[0]
def rd_str():
ln = struct.unpack('<Q', f.read(8))[0]
return f.read(ln).decode('utf-8', 'replace')
def rd_val(vt):
if vt == 0: return struct.unpack('<B', f.read(1))[0]
if vt == 1: return struct.unpack('<b', f.read(1))[0]
if vt == 2: return struct.unpack('<H', f.read(2))[0]
if vt == 3: return struct.unpack('<h', f.read(2))[0]
if vt == 4: return struct.unpack('<I', f.read(4))[0]
if vt == 5: return struct.unpack('<i', f.read(4))[0]
if vt == 6: return struct.unpack('<f', f.read(4))[0]
if vt == 7: return struct.unpack('<?', f.read(1))[0]
if vt == 8: return rd_str()
if vt == 10: return struct.unpack('<Q', f.read(8))[0]
if vt == 11: return struct.unpack('<q', f.read(8))[0]
if vt == 12: return struct.unpack('<d', f.read(8))[0]
if vt == 9:
et = struct.unpack('<I', f.read(4))[0]
cnt = struct.unpack('<Q', f.read(8))[0]
return [rd_val(et) for _ in range(cnt)]
raise ValueError(f"unknown gguf value type {vt}")
for _ in range(n_kv):
key = rd_str()
val = rd_val(struct.unpack('<I', f.read(4))[0])
if key.endswith('.block_count'):
try:
result = int(val)
except (TypeError, ValueError):
result = 0
break
except Exception:
result = 0
_GGUF_META_CACHE[path] = result
return result
def _free_vram_gb(device: int = 0) -> float:
"""Free VRAM (GB) on the given CUDA device, 0.0 if unavailable."""
try:
import torch
free, _total = torch.cuda.mem_get_info(device)
return free / (1024 ** 3)
except Exception:
return 0.0
def _ggml_kv_type(name): def _ggml_kv_type(name):
"""Map a KV-cache quant name to the llama.cpp GGML type int, or None. """Map a KV-cache quant name to the llama.cpp GGML type int, or None.
...@@ -231,6 +300,7 @@ class VulkanBackend(ModelBackend): ...@@ -231,6 +300,7 @@ class VulkanBackend(ModelBackend):
self.main_gpu = 0 # Default to first GPU self.main_gpu = 0 # Default to first GPU
self.chat_template = None # Detected chat template name self.chat_template = None # Detected chat template name
self.hf_tokenizer = None # HuggingFace tokenizer for apply_chat_template self.hf_tokenizer = None # HuggingFace tokenizer for apply_chat_template
self.supports_vision = False # set True when an mmproj projector is loaded
self.force_cuda = original_backend in ("nvidia", "cuda") # Force CUDA if original was nvidia self.force_cuda = original_backend in ("nvidia", "cuda") # Force CUDA if original was nvidia
if self.force_cuda: if self.force_cuda:
print("DEBUG: GGUF model will use CUDA backend (forced by --backend nvidia)") print("DEBUG: GGUF model will use CUDA backend (forced by --backend nvidia)")
...@@ -713,6 +783,33 @@ class VulkanBackend(ModelBackend): ...@@ -713,6 +783,33 @@ class VulkanBackend(ModelBackend):
self.n_gpu_layers = -1 self.n_gpu_layers = -1
elif n_gpu_layers != -1: elif n_gpu_layers != -1:
self.n_gpu_layers = n_gpu_layers self.n_gpu_layers = n_gpu_layers
else:
# Auto (n_gpu_layers == -1): if the whole model won't fit in free VRAM,
# place as many layers on GPU as fit and leave the rest on CPU instead
# of trying to load everything and OOMing ("failed to create
# llama_context"). llama.cpp has no auto-fit, so we size it ourselves.
try:
_exp = kwargs.get('expected_vram_gb')
_nlayers = _gguf_block_count(model_path)
_free = _free_vram_gb(self.main_gpu if isinstance(self.main_gpu, int) else 0)
if _exp and _exp > 0 and _nlayers and _free > 0 and _exp > _free * 0.95:
# Scale layers on GPU by the VRAM ratio (weights + KV roughly
# scale per-layer). The estimate tends to undercount the KV
# cache at large n_ctx, and a few GB of compute/output buffers
# stay on GPU regardless — so reserve a context-scaled headroom
# and inflate the need, to err toward fitting (CPU layers are
# slow but a failed load is worse).
_headroom = 2.0 + (self.n_ctx or 0) / 12000.0 # ~2 GB + ~1 GB per 12k ctx
_usable = max(0.0, _free - _headroom)
_fit = int(_nlayers * _usable / (_exp * 1.20))
_fit = max(0, min(_nlayers - 1, _fit))
self.n_gpu_layers = _fit
print(f" Auto-offload: model needs ~{_exp:.1f} GB but only "
f"{_free:.1f} GB free — placing {_fit}/{_nlayers} layers on "
f"GPU, {_nlayers - _fit} on CPU (slower). Lower n_ctx or use a "
f"smaller model to keep it fully on GPU.", flush=True)
except Exception as _off_e:
print(f" (auto-offload sizing skipped: {_off_e})", flush=True)
# Configure context size # Configure context size
if no_ram: if no_ram:
...@@ -783,6 +880,35 @@ class VulkanBackend(ModelBackend): ...@@ -783,6 +880,35 @@ class VulkanBackend(ModelBackend):
print(f" KV cache: type_k={_ck or 'f16'} type_v={_cv or 'f16'}" print(f" KV cache: type_k={_ck or 'f16'} type_v={_cv or 'f16'}"
f"{' (flash_attn on)' if _flash else ''}") f"{' (flash_attn on)' if _flash else ''}")
# Multimodal projector (mmproj): pairs a CLIP/vision projector GGUF with
# this text model so it can accept images — the llama.cpp `--mmproj`
# equivalent, which adds vision capability (e.g. gemma). Uses llama.cpp's
# unified mtmd handler, which auto-detects the projector type from the file.
self.supports_vision = False
_mmproj = kwargs.get('mmproj', _raw_cfg.get('mmproj'))
if _mmproj:
_mmproj_path = os.path.expanduser(str(_mmproj))
if not os.path.isfile(_mmproj_path):
# Bare filename / moved cache → look beside the model file.
_cand = os.path.join(os.path.dirname(model_path),
os.path.basename(_mmproj_path))
if os.path.isfile(_cand):
_mmproj_path = _cand
if os.path.isfile(_mmproj_path):
try:
from llama_cpp.llama_chat_format import MTMDChatHandler
llama_kwargs['chat_handler'] = MTMDChatHandler(
clip_model_path=_mmproj_path,
verbose=False,
use_gpu=(self.n_gpu_layers != 0),
)
self.supports_vision = True
print(f" mmproj : {os.path.basename(_mmproj_path)} (vision enabled)")
except Exception as _e:
print(f" mmproj : failed to load projector ({_e}); continuing text-only")
else:
print(f" mmproj : configured path not found ({_mmproj}); skipping")
# Force CUDA if requested # Force CUDA if requested
if self.force_cuda: if self.force_cuda:
# Set environment variable to force CUDA # Set environment variable to force CUDA
...@@ -797,12 +923,44 @@ class VulkanBackend(ModelBackend): ...@@ -797,12 +923,44 @@ class VulkanBackend(ModelBackend):
print(f" GPU offload : {'supported' if gpu_supported else 'NOT supported by this build'}") print(f" GPU offload : {'supported' if gpu_supported else 'NOT supported by this build'}")
_log_cb = _install_layer_log_callback() _log_cb = _install_layer_log_callback()
# Progress feedback during the (otherwise silent) tensor load. llama.cpp's
# progress_callback isn't exposed by the Llama wrapper, so inject it by
# patching the default model-params factory for the duration of construction.
# Kept alive in a local for the whole load (avoids a ctypes use-after-free).
_prog = {'last': -5, 't0': time.time()}
@_llama_cpp.llama_progress_callback
def _progress_cb(progress, user_data):
try:
pct = int(progress * 100)
if pct >= _prog['last'] + 5 or pct >= 100:
_prog['last'] = pct
print(f" Loading model into VRAM/RAM: {pct}%"
f" ({time.time() - _prog['t0']:.0f}s)", flush=True)
except Exception:
pass
return True
# Patch the SUBMODULE attribute — llama.py does `import llama_cpp.llama_cpp
# as llama_cpp` and builds model_params from it, so the top-level package
# attribute is not what it looks up.
_params_mod = getattr(_llama_cpp, 'llama_cpp', _llama_cpp)
_orig_params = _params_mod.llama_model_default_params
def _params_with_progress():
p = _orig_params()
try:
p.progress_callback = _progress_cb
except Exception:
pass
return p
_params_mod.llama_model_default_params = _params_with_progress
try: try:
self.model = Llama(**llama_kwargs) self.model = Llama(**llama_kwargs)
except Exception as e: except Exception as e:
print(f"Error loading GGUF model: {e}") print(f"Error loading GGUF model: {e}")
raise raise
finally: finally:
_params_mod.llama_model_default_params = _orig_params
# Quiet logging after load — but DO NOT drop to NULL + GC the callback. # Quiet logging after load — but DO NOT drop to NULL + GC the callback.
# ggml keeps the log-callback pointer and may still invoke it during # ggml keeps the log-callback pointer and may still invoke it during
# generation (e.g. gemma's iSWA hybrid cache logs every step), so a # generation (e.g. gemma's iSWA hybrid cache logs every step), so a
......
...@@ -81,6 +81,8 @@ class FrontProxy: ...@@ -81,6 +81,8 @@ class FrontProxy:
requests are dispatched to the right engine through the same router/proxy.""" requests are dispatched to the right engine through the same router/proxy."""
cfg = getattr(self.config, "broker", None) cfg = getattr(self.config, "broker", None)
if cfg is None or not getattr(cfg, "enabled", False): if cfg is None or not getattr(cfg, "enabled", False):
print("[front] AISBF broker not started (broker.enabled is false in config)",
flush=True)
return return
try: try:
from codai.broker import build_broker_runtime_config, BrokerConfigError from codai.broker import build_broker_runtime_config, BrokerConfigError
...@@ -142,9 +144,23 @@ class FrontProxy: ...@@ -142,9 +144,23 @@ class FrontProxy:
return ("ok", {"object": "list", "data": [seen[i] for i in order]}) return ("ok", {"object": "list", "data": [seen[i] for i in order]})
async def broker_execute(self, *, method, path, headers, query, body): async def broker_execute(self, *, method, path, headers, query, body):
_clean_path = path.split("?", 1)[0].rstrip("/")
# Brokered capabilities must describe the WHOLE node. Routing this to a
# single engine would report only that engine's CUDA-visible card (its
# torch hardware summary), so a multi-GPU node looks like it has one card.
# Build it here in the (torch-free) front, which enumerates every physical
# GPU via nvidia-smi + sysfs.
if method.upper() == "GET" and _clean_path == "/coderai/capabilities":
from codai.broker.capabilities import (
build_capabilities_document, build_hardware_summary)
import json as _json
doc = build_capabilities_document(hardware=build_hardware_summary())
return {"status_code": 200,
"headers": {"content-type": "application/json"},
"body": _json.dumps(doc).encode()}
# Brokered models.list must reflect the WHOLE node (union across engines), # Brokered models.list must reflect the WHOLE node (union across engines),
# not a single engine's assigned subset. # not a single engine's assigned subset.
if method.upper() == "GET" and path.split("?", 1)[0].rstrip("/") == "/v1/models": if method.upper() == "GET" and _clean_path == "/v1/models":
hdrs = {k: v for k, v in (headers or {}).items() if k.lower() not in _DROP_REQ} hdrs = {k: v for k, v in (headers or {}).items() if k.lower() not in _DROP_REQ}
kind, val = await self.collect_models(hdrs) kind, val = await self.collect_models(hdrs)
if kind == "ok": if kind == "ok":
......
...@@ -709,7 +709,7 @@ def main(): ...@@ -709,7 +709,7 @@ def main():
# Also restrict /v1/models (list_models) to the assigned subset, so the # Also restrict /v1/models (list_models) to the assigned subset, so the
# per-engine model list matches what it actually serves — config_mgr's # per-engine model list matches what it actually serves — config_mgr's
# full models_data is untouched (the admin model list stays complete). # full models_data is untouched (the admin model list stays complete).
multi_model_manager.set_assigned_models(keep) multi_model_manager.set_assigned_models(_keep)
except Exception as _e: except Exception as _e:
print(f"[engine] assignment filter failed ({_e}); registering all models") print(f"[engine] assignment filter failed ({_e}); registering all models")
......
...@@ -943,7 +943,7 @@ class MultiModelManager: ...@@ -943,7 +943,7 @@ class MultiModelManager:
# KV-cache quantization (llama.cpp type_k/type_v) — pass through # KV-cache quantization (llama.cpp type_k/type_v) — pass through
# to the backend, with the raw models.json entry as a fallback. # to the backend, with the raw models.json entry as a fallback.
_raw = config.get('_raw_cfg') if isinstance(config.get('_raw_cfg'), dict) else {} _raw = config.get('_raw_cfg') if isinstance(config.get('_raw_cfg'), dict) else {}
for _kvk in ('cache_type_k', 'cache_type_v'): for _kvk in ('cache_type_k', 'cache_type_v', 'mmproj'):
_kvv = config.get(_kvk) _kvv = config.get(_kvk)
if _kvv is None: if _kvv is None:
_kvv = _raw.get(_kvk) _kvv = _raw.get(_kvk)
...@@ -1062,7 +1062,7 @@ class MultiModelManager: ...@@ -1062,7 +1062,7 @@ class MultiModelManager:
# KV-cache quantization (llama.cpp type_k/type_v) — pass through # KV-cache quantization (llama.cpp type_k/type_v) — pass through
# to the backend, with the raw models.json entry as a fallback. # to the backend, with the raw models.json entry as a fallback.
_raw = config.get('_raw_cfg') if isinstance(config.get('_raw_cfg'), dict) else {} _raw = config.get('_raw_cfg') if isinstance(config.get('_raw_cfg'), dict) else {}
for _kvk in ('cache_type_k', 'cache_type_v'): for _kvk in ('cache_type_k', 'cache_type_v', 'mmproj'):
_kvv = config.get(_kvk) _kvv = config.get(_kvk)
if _kvv is None: if _kvv is None:
_kvv = _raw.get(_kvk) _kvv = _raw.get(_kvk)
......
...@@ -1046,6 +1046,14 @@ def parse_gemma_native_tool_calls(text: str, tool_names=None): ...@@ -1046,6 +1046,14 @@ def parse_gemma_native_tool_calls(text: str, tool_names=None):
if tool_names and name not in tool_names: if tool_names and name not in tool_names:
continue continue
brace = m.end() - 1 # index of '{' brace = m.end() - 1 # index of '{'
# Some models double-wrap the args: call:NAME{{"k":"v"}}. Skip the
# redundant outer brace so the real object is parsed instead of being
# mangled into a single key like '{"k"'.
j = brace + 1
while j < len(text) and text[j] in ' \t\r\n':
j += 1
if j < len(text) and text[j] == '{':
brace = j
try: try:
args, _ = _parse_gemma_loose_object(text, brace) args, _ = _parse_gemma_loose_object(text, brace)
except Exception: except Exception:
......
"""Periodic cleanup of the temporary-working directory.
Several pipelines write scratch files with ``tempfile.NamedTemporaryFile(delete=
False)`` / ``mkdtemp()`` (frame extraction, upscaling, interpolation, dubbing,
voice cloning…). When a generation is interrupted those temp entries are never
removed, so a dedicated ``tmp_dir`` slowly fills up (it had grown to tens of GB).
This background janitor age-prunes that directory: every
``interval_minutes`` it deletes top-level entries whose most-recent mtime is older
than ``max_age_hours``. Age-based pruning means in-flight work (touched recently)
is left alone while abandoned scratch is reclaimed.
Safety: it only ever operates on the *configured* ``tmp_dir`` (a dedicated path).
It refuses to run against a bare system temp dir (/tmp, /var/tmp, …) so it can
never delete other processes' files. Mirrors ``codai.models.ram_monitor`` in
shape: module-level state + ``get_status()``, started once from ``codai.main``.
"""
import os
import shutil
import threading
import time
import logging
from typing import Optional, Dict, Any
_log = logging.getLogger(__name__)
# Paths we must never treat as a prunable dedicated tmp dir.
_FORBIDDEN = {"/", "/tmp", "/var/tmp", "/usr/tmp", "/dev/shm"}
_state_lock = threading.Lock()
_state: Dict[str, Any] = {
"enabled": False,
"tmp_dir": None,
"max_age_hours": None,
"interval_minutes": None,
"last_run_ts": 0.0,
"last_removed": 0,
"total_removed": 0,
"last_freed_bytes": 0,
"runs": 0,
}
_thread: Optional[threading.Thread] = None
_started = False
def get_status() -> Dict[str, Any]:
"""Snapshot for the admin status endpoint / dashboard."""
with _state_lock:
return dict(_state)
def _entry_newest_mtime(path: str) -> float:
"""Most-recent mtime under ``path`` (the entry itself, or the newest file in a
directory tree). Using the newest mtime avoids deleting a directory whose top
folder is old but which still has freshly written files inside."""
try:
newest = os.lstat(path).st_mtime
except OSError:
return 0.0
if os.path.isdir(path) and not os.path.islink(path):
for root, _dirs, files in os.walk(path):
for name in files:
try:
m = os.lstat(os.path.join(root, name)).st_mtime
if m > newest:
newest = m
except OSError:
continue
return newest
def _dir_size(path: str) -> int:
total = 0
if os.path.isdir(path) and not os.path.islink(path):
for root, _dirs, files in os.walk(path):
for name in files:
try:
total += os.lstat(os.path.join(root, name)).st_size
except OSError:
continue
else:
try:
total = os.lstat(path).st_size
except OSError:
total = 0
return total
def _sweep(tmp_dir: str, max_age_seconds: float) -> tuple[int, int]:
"""Remove top-level entries older than the cutoff. Returns (removed, freed)."""
now = time.time()
removed = 0
freed = 0
try:
entries = os.listdir(tmp_dir)
except OSError as e:
_log.debug("tmp janitor: cannot list %s: %s", tmp_dir, e)
return (0, 0)
for name in entries:
path = os.path.join(tmp_dir, name)
try:
if now - _entry_newest_mtime(path) < max_age_seconds:
continue
size = _dir_size(path)
if os.path.isdir(path) and not os.path.islink(path):
shutil.rmtree(path, ignore_errors=True)
else:
os.remove(path)
removed += 1
freed += size
except OSError as e:
_log.debug("tmp janitor: could not remove %s: %s", path, e)
return (removed, freed)
def _run(tmp_dir: str, max_age_hours: float, interval_minutes: float) -> None:
max_age_seconds = max(0.0, max_age_hours) * 3600.0
interval = max(60.0, interval_minutes * 60.0)
while True:
try:
removed, freed = _sweep(tmp_dir, max_age_seconds)
with _state_lock:
_state["last_run_ts"] = time.time()
_state["last_removed"] = removed
_state["total_removed"] += removed
_state["last_freed_bytes"] = freed
_state["runs"] += 1
if removed:
_log.info("tmp janitor: removed %d stale entr%s (%.1f MB) from %s",
removed, "y" if removed == 1 else "ies",
freed / (1024 * 1024), tmp_dir)
except Exception as e: # never let the janitor die
_log.warning("tmp janitor sweep failed: %s", e)
time.sleep(interval)
def start(tmp_dir: Optional[str], enabled: bool = True,
max_age_hours: float = 24.0, interval_minutes: float = 60.0) -> bool:
"""Start the janitor for ``tmp_dir``. No-op (returns False) when disabled, when
no dedicated tmp_dir is configured, or when tmp_dir is a shared system dir."""
global _thread, _started
if _started:
return True
if not enabled or not tmp_dir:
return False
real = os.path.abspath(os.path.expanduser(tmp_dir)).rstrip("/") or "/"
if real in _FORBIDDEN:
_log.info("tmp janitor: refusing to prune shared temp dir %s (set a dedicated tmp_dir)", real)
return False
if not os.path.isdir(real):
try:
os.makedirs(real, exist_ok=True)
except OSError:
return False
with _state_lock:
_state.update({
"enabled": True, "tmp_dir": real,
"max_age_hours": max_age_hours, "interval_minutes": interval_minutes,
})
_thread = threading.Thread(target=_run, args=(real, max_age_hours, interval_minutes),
name="tmp-janitor", daemon=True)
_thread.start()
_started = True
_log.info("tmp janitor: pruning %s every %.0f min (entries older than %.1f h)",
real, interval_minutes, max_age_hours)
return True
def sweep_once(tmp_dir: str, max_age_hours: float = 24.0) -> tuple[int, int]:
"""Run a single prune pass and return (removed, freed_bytes). For cron use."""
real = os.path.abspath(os.path.expanduser(tmp_dir)).rstrip("/") or "/"
if real in _FORBIDDEN or not os.path.isdir(real):
raise SystemExit(f"refusing to prune {real!r} (not a dedicated tmp dir)")
return _sweep(real, max(0.0, max_age_hours) * 3600.0)
if __name__ == "__main__":
# One-shot CLI for cron/systemd-timer use, e.g.:
# */30 * * * * /path/venv/bin/python -m codai.models.tmp_janitor \
# --tmp /storage/coderai/tmp --max-age-hours 24
import argparse
p = argparse.ArgumentParser(description="Prune a dedicated CoderAI temp dir.")
p.add_argument("--tmp", required=True, help="the dedicated tmp_dir to prune")
p.add_argument("--max-age-hours", type=float, default=24.0,
help="delete entries whose newest file is older than this")
a = p.parse_args()
n, b = sweep_once(a.tmp, a.max_age_hours)
print(f"tmp janitor: removed {n} entr{'y' if n == 1 else 'ies'} "
f"({b / (1024 * 1024):.1f} MB) from {a.tmp}")
...@@ -127,9 +127,11 @@ RUN apt-get update && apt-get install -y --no-install-recommends \ ...@@ -127,9 +127,11 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
# The fully assembled CoderAI tree (Python + venvs + tools), copied once. # The fully assembled CoderAI tree (Python + venvs + tools), copied once.
COPY --from=assembler /opt/coderai /opt/coderai COPY --from=assembler /opt/coderai /opt/coderai
# Now the standalone interpreter exists, activate it for the app + launchers. # Put the standalone interpreter first on PATH. Do NOT set PYTHONHOME globally:
ENV PYTHONHOME=/opt/coderai/python \ # supervisord runs on the system python3 (3.12) and a PYTHONHOME pointing at the
PATH=/opt/coderai/python/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin # standalone 3.13 stdlib breaks it ("No module named 'encodings'"). The standalone
# python is relocatable, and the per-service launchers set PYTHONHOME themselves.
ENV PATH=/opt/coderai/python/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
WORKDIR /opt/coderai/app WORKDIR /opt/coderai/app
COPY . /opt/coderai/app COPY . /opt/coderai/app
......
# Incremental update of an already-built coderai image.
#
# Re-layers ONLY the application code, launcher scripts and service configs on
# top of an existing base image (the heavy bundle: python, venvs, native libs,
# lip-sync, ds4, parler). Those base layers are inherited unchanged — there is no
# 20 GB bundle recopy — so this builds in seconds even with an empty build cache.
#
# Driven by packaging/linux/update_oci_image.sh, which keeps an immutable
# `coderai:base` tag so repeated updates always start from the same bundle and
# never stack app layers on top of each other.
ARG BASE_IMAGE=coderai:base
FROM ${BASE_IMAGE}
# Refresh the app tree plus the scripts/configs that live outside it. The big
# /opt/coderai/{python,*-venv,local-libs,Wav2Lip,SadTalker,ds4,py310} trees are
# left as inherited layers. (COPY overwrites/adds; a file deleted from the repo
# is pruned by the cleanup RUN below for the known-stale paths.)
COPY . /opt/coderai/app
COPY packaging/linux/launcher/coderai-oci /usr/local/bin/coderai
COPY packaging/linux/launcher/with-env /usr/local/bin/with-env
COPY packaging/linux/launcher/coderai-entrypoint /usr/local/bin/coderai-entrypoint
COPY packaging/linux/launcher/wav2lip /usr/local/bin/wav2lip
COPY packaging/linux/launcher/sadtalker /usr/local/bin/sadtalker
COPY packaging/linux/nginx.conf /etc/nginx/nginx.conf
COPY packaging/linux/supervisord.conf /etc/supervisor/supervisord.conf
COPY packaging/linux/README-RUN.txt /opt/coderai/README-RUN.txt
RUN set -eux; \
chmod +x /usr/local/bin/coderai /usr/local/bin/with-env /usr/local/bin/coderai-entrypoint \
/usr/local/bin/wav2lip /usr/local/bin/sadtalker /opt/coderai/app/coderai; \
mkdir -p /config /models /cache /opt/coderai/app/models; \
rm -rf \
/opt/coderai/app/.git \
/opt/coderai/app/venv* \
/opt/coderai/app/.venv \
/opt/coderai/app/township_output \
/opt/coderai/app/offload \
/opt/coderai/app/dist \
/opt/coderai/app/.packaging-cache; \
find /opt/coderai/app -type d -name __pycache__ -prune -exec rm -rf '{}' +; \
/opt/coderai/python/bin/python3 -c "import importlib.util, sys; m=[n for n in ('fastapi','uvicorn','torch') if importlib.util.find_spec(n) is None]; sys.exit('base image missing: '+', '.join(m) if m else 0)"
# ENTRYPOINT / EXPOSE / VOLUME / ENV / WORKDIR are inherited from the base image.
...@@ -263,7 +263,10 @@ prepare_venv_bundle() { ...@@ -263,7 +263,10 @@ prepare_venv_bundle() {
if [[ -e "$dest_path" && "$bin_path" -ef "$dest_path" ]]; then if [[ -e "$dest_path" && "$bin_path" -ef "$dest_path" ]]; then
continue continue
fi fi
cp -a --remove-destination "$bin_path" "$dest_path" # -L: dereference symlinks so the REAL binary is bundled. /usr/local/bin
# entries are often symlinks to a build dir (e.g. ~/whisper.cpp/build/bin);
# copying the link verbatim leaves a dangling symlink in the image.
cp -aL --remove-destination "$bin_path" "$dest_path"
done done
if [[ ${#LOCAL_BINARIES[@]} -gt 0 ]]; then if [[ ${#LOCAL_BINARIES[@]} -gt 0 ]]; then
......
#!/usr/bin/env sh
# Top-level entrypoint for the CoderAI distributable image.
# Prepares shared state directories and hands off to supervisord, which runs
# nginx + the main server + the bundled tool web UIs on the single published port.
set -eu
: "${CODERAI_CONFIG_DIR:=/config}"
: "${CODERAI_MODELS_DIR:=/models}"
: "${CODERAI_CACHE_DIR:=/cache}"
# Default parler model id; referenced by supervisord even when the parler program
# is disabled, so it must always be defined.
: "${CODERAI_PARLER_MODEL:=parler-tts/parler-tts-mini-multilingual}"
# Dedicated temp dir on the cache volume, shared by the server and the tool
# processes (so scratch from upscaling/lip-sync/ffmpeg lands in one place). The
# server's built-in janitor age-prunes it; see CODERAI_TMP below.
: "${CODERAI_TMP:=$CODERAI_CACHE_DIR/coderai-tmp}"
export TMPDIR="$CODERAI_TMP" TMP="$CODERAI_TMP" TEMP="$CODERAI_TMP"
# Don't write .pyc into the read-only /opt/coderai tree (esp. when run as --user).
export PYTHONDONTWRITEBYTECODE=1
export CODERAI_CONFIG_DIR CODERAI_MODELS_DIR CODERAI_CACHE_DIR CODERAI_PARLER_MODEL CODERAI_TMP
mkdir -p \
"$CODERAI_CONFIG_DIR/coderai" \
"$CODERAI_MODELS_DIR/coderai" \
"$CODERAI_CACHE_DIR/coderai" \
"$CODERAI_CACHE_DIR/township_output" \
"$CODERAI_CACHE_DIR/videogen_output" \
"$CODERAI_TMP" \
/tmp/nginx-client-body /tmp/nginx-proxy /tmp/nginx-fastcgi \
/tmp/nginx-uwsgi /tmp/nginx-scgi
# Seed the ds4 working dir on the cache volume from the bundled binary + scripts
# (DeepSeek-V4 weights download here at runtime, so it must be writable/persistent).
if [ -d /opt/coderai/ds4 ] && [ ! -e "$CODERAI_CACHE_DIR/ds4/ds4-server" ]; then
mkdir -p "$CODERAI_CACHE_DIR/ds4"
cp -an /opt/coderai/ds4/. "$CODERAI_CACHE_DIR/ds4/" 2>/dev/null || true
fi
# If invoked with arguments, run them directly (debugging / one-off commands)
# instead of the supervised stack.
if [ "$#" -gt 0 ]; then
exec "$@"
fi
# supervisord runs on the system python3; a leaked PYTHONHOME (pointing at the
# standalone 3.13) would break it. The per-service launchers set their own.
unset PYTHONHOME
exec /usr/bin/supervisord -c /etc/supervisor/supervisord.conf
...@@ -72,8 +72,47 @@ if changed: ...@@ -72,8 +72,47 @@ if changed:
PY PY
fi fi
# Point the server at the shared dedicated temp dir so its janitor prunes it. # Optional debug logging. CODERAI_DEBUG selects coderai's --debug* flags:
if [ -n "${CODERAI_TMP:-}" ]; then # all -> every debug flag
exec /opt/coderai/python/bin/python3 /opt/coderai/app/coderai --config "$CONFIG_DIR" --tmp "$CODERAI_TMP" "$@" # 1|true|yes|on -> just --debug
# "engine,ws,..." -> --debug-engine --debug-ws ... (bare names get --debug- prefixed;
# full "--debug-foo" tokens are passed through; comma OR space separated)
DEBUG_ARGS=""
case "${CODERAI_DEBUG:-}" in
"") : ;;
all|ALL|All)
DEBUG_ARGS="--debug --debug-ws --debug-web --debug-thermal --debug-lora --debug-requests --debug-engine --debug-engine-web" ;;
1|true|TRUE|yes|YES|on|ON)
DEBUG_ARGS="--debug" ;;
*)
for _f in $(echo "$CODERAI_DEBUG" | tr ',' ' '); do
case "$_f" in
--*) DEBUG_ARGS="$DEBUG_ARGS $_f" ;;
debug) DEBUG_ARGS="$DEBUG_ARGS --debug" ;;
*) DEBUG_ARGS="$DEBUG_ARGS --debug-$_f" ;;
esac
done ;;
esac
# --debug-* flags need --debug present to take effect; add it if the user picked
# only sub-flags.
case " $DEBUG_ARGS " in *" --debug "*) : ;; *[!\ ]*) DEBUG_ARGS="--debug$DEBUG_ARGS" ;; esac
# Assemble the server argv: --config, optional --tmp, debug flags, then passthrough.
set -- --config "$CONFIG_DIR" "$@"
[ -n "${CODERAI_TMP:-}" ] && set -- "$@" --tmp "$CODERAI_TMP"
CODERAI_BIN="/opt/coderai/python/bin/python3 /opt/coderai/app/coderai"
# Optional host-tailable file log. CODERAI_LOG_FILE should point under a mounted
# volume (e.g. /cache/logs/coderai.log) so it's visible + tailable on the host.
# We tee so output still reaches `docker logs` too. (supervisord runs this script
# with killasgroup, so the coderai front + its engine subprocesses + tee are all
# torn down together on stop.)
if [ -n "${CODERAI_LOG_FILE:-}" ]; then
mkdir -p "$(dirname "$CODERAI_LOG_FILE")" 2>/dev/null || true
echo "[coderai-oci] debug='${CODERAI_DEBUG:-off}' → logging to $CODERAI_LOG_FILE" >&2
# shellcheck disable=SC2086
exec $CODERAI_BIN "$@" $DEBUG_ARGS 2>&1 | tee -a "$CODERAI_LOG_FILE"
fi fi
exec /opt/coderai/python/bin/python3 /opt/coderai/app/coderai --config "$CONFIG_DIR" "$@" # shellcheck disable=SC2086
exec $CODERAI_BIN "$@" $DEBUG_ARGS
#!/usr/bin/env bash
# CLI shim for SadTalker talking-head generation, run in the shared lip-sync venv.
# codai/api/video.py invokes:
# sadtalker --driven_audio AUDIO --source_video VIDEO --result_dir DIR
# SadTalker animates a still image, so a source video is reduced to its first frame.
#
# Checkpoints are NOT baked into the image: on first use they download into the
# writable working dir (a /cache volume in the container) and persist there.
set -euo pipefail
VENV="${CODERAI_LIPSYNC_VENV:-$HOME/.coderai/lipsync_venv}"
SRC="${CODERAI_SADTALKER_SRC:-$HOME/.coderai/SadTalker}" # baked read-only repo code
DIR="${CODERAI_SADTALKER_DIR:-$SRC}" # writable working copy
if [ ! -x "$VENV/bin/python" ]; then
echo "sadtalker: lip-sync venv not found at $VENV" >&2
exit 127
fi
if [ ! -f "$DIR/inference.py" ]; then
mkdir -p "$DIR"
rsync -a --exclude 'checkpoints/*' --exclude 'gfpgan/weights/*' "$SRC/" "$DIR/"
fi
# Download checkpoints on first use (idempotent).
mkdir -p "$DIR/checkpoints" "$DIR/gfpgan/weights"
_dl(){ if [ ! -s "$2" ]; then echo "sadtalker: downloading $(basename "$2") …" >&2;
curl -fSL --retry 3 -o "$2" "$1" || { echo "sadtalker: download failed: $1" >&2; exit 1; }; fi; }
_b="https://github.com/OpenTalker/SadTalker/releases/download/v0.0.2-rc"
_dl "$_b/mapping_00109-model.pth.tar" "$DIR/checkpoints/mapping_00109-model.pth.tar"
_dl "$_b/mapping_00229-model.pth.tar" "$DIR/checkpoints/mapping_00229-model.pth.tar"
_dl "$_b/SadTalker_V0.0.2_256.safetensors" "$DIR/checkpoints/SadTalker_V0.0.2_256.safetensors"
_dl "$_b/SadTalker_V0.0.2_512.safetensors" "$DIR/checkpoints/SadTalker_V0.0.2_512.safetensors"
_dl "https://github.com/xinntao/facexlib/releases/download/v0.1.0/alignment_WFLW_4HG.pth" "$DIR/gfpgan/weights/alignment_WFLW_4HG.pth"
_dl "https://github.com/xinntao/facexlib/releases/download/v0.1.0/detection_Resnet50_Final.pth" "$DIR/gfpgan/weights/detection_Resnet50_Final.pth"
_dl "https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth" "$DIR/gfpgan/weights/GFPGANv1.4.pth"
_dl "https://github.com/xinntao/facexlib/releases/download/v0.2.2/parsing_parsenet.pth" "$DIR/gfpgan/weights/parsing_parsenet.pth"
driven=""; result=""; source_img=""; source_video=""
extra=()
while [ "$#" -gt 0 ]; do
case "$1" in
--driven_audio) driven="$2"; shift 2;;
--source_video) source_video="$2"; shift 2;;
--source_image) source_img="$2"; shift 2;;
--result_dir) result="$2"; shift 2;;
*) extra+=("$1"); shift;;
esac
done
result="${result:-./results}"
mkdir -p "$result"
cleanup_img=""
if [ -z "$source_img" ] && [ -n "$source_video" ]; then
source_img="$(mktemp --suffix=.png)"
cleanup_img="$source_img"
ffmpeg -y -i "$source_video" -frames:v 1 "$source_img" -loglevel error
fi
work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
cd "$work"
export PYTHONPATH="$DIR${PYTHONPATH:+:$PYTHONPATH}"
set +e
"$VENV/bin/python" "$DIR/inference.py" \
--driven_audio "$driven" \
--source_image "$source_img" \
--result_dir "$result" \
--checkpoint_dir "$DIR/checkpoints" \
${extra[@]+"${extra[@]}"}
rc=$?
set -e
[ -n "$cleanup_img" ] && rm -f "$cleanup_img" || true
newest="$(find "$result" -type f -name '*.mp4' -printf '%T@ %p\n' 2>/dev/null | sort -rn | head -1 | cut -d' ' -f2-)"
if [ -n "$newest" ] && [ "$(dirname "$newest")" != "$result" ]; then
cp -f "$newest" "$result/"
fi
exit $rc
#!/usr/bin/env bash
# CLI shim for Wav2Lip lip-sync, run inside the shared lip-sync venv.
# codai/api/video.py invokes: wav2lip --face VIDEO --audio AUDIO --outfile OUT
#
# Checkpoints are NOT baked into the image: on first use they download into the
# writable working dir (a /cache volume in the container) and persist there.
set -euo pipefail
VENV="${CODERAI_LIPSYNC_VENV:-$HOME/.coderai/lipsync_venv}"
SRC="${CODERAI_WAV2LIP_SRC:-$HOME/.coderai/Wav2Lip}" # baked read-only repo code
DIR="${CODERAI_WAV2LIP_DIR:-$SRC}" # writable working copy
if [ ! -x "$VENV/bin/python" ]; then
echo "wav2lip: lip-sync venv not found at $VENV" >&2
exit 127
fi
# Seed a writable copy of the repo code if the working dir isn't populated
# (the image ships the code read-only under /opt; weights are excluded).
if [ ! -f "$DIR/inference.py" ]; then
mkdir -p "$DIR"
rsync -a --exclude 'checkpoints/' --exclude 'face_detection/detection/sfd/*.pth' "$SRC/" "$DIR/"
fi
# Download checkpoints on first use (idempotent: skips non-empty files).
mkdir -p "$DIR/checkpoints" "$DIR/face_detection/detection/sfd"
_dl(){ if [ ! -s "$2" ]; then echo "wav2lip: downloading $(basename "$2") …" >&2;
curl -fSL --retry 3 -o "$2" "$1" || { echo "wav2lip: download failed: $1" >&2; exit 1; }; fi; }
_dl "https://huggingface.co/camenduru/Wav2Lip/resolve/main/checkpoints/wav2lip_gan.pth" \
"$DIR/checkpoints/wav2lip_gan.pth"
_dl "https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth" \
"$DIR/face_detection/detection/sfd/s3fd.pth"
CKPT="${CODERAI_WAV2LIP_CKPT:-$DIR/checkpoints/wav2lip_gan.pth}"
# Run from a writable scratch dir (inference.py writes ./temp/*), repo on PYTHONPATH.
work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
cd "$work"
mkdir -p temp
export PYTHONPATH="$DIR${PYTHONPATH:+:$PYTHONPATH}"
"$VENV/bin/python" "$DIR/inference.py" --checkpoint_path "$CKPT" "$@"
#!/usr/bin/env sh
# Set the CoderAI runtime environment (standalone Python, bundled native libs,
# nvidia wheel libs) then exec the given command. Used by supervisord to launch
# the bundled tool web UIs with the same library environment as the main server.
set -eu
export PYTHONHOME=/opt/coderai/python
export PATH="/opt/coderai/python/bin:$PATH"
NV="/opt/coderai/python/lib/python3.13/site-packages/nvidia"
LIBS="/opt/coderai/python/lib:/opt/coderai/local-libs"
if [ -d "$NV" ]; then
for d in "$NV"/*/lib; do
[ -d "$d" ] && LIBS="$LIBS:$d"
done
fi
export LD_LIBRARY_PATH="$LIBS${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
exec "$@"
# CoderAI single-port reverse proxy (in-container).
# Fronts the main server and the bundled tool web UIs on one published port.
# nginx runs in the foreground under supervisord (daemon off).
# No `user` directive: when the container runs as root, the master stays root and
# spawns workers as nobody; when run with `--user UID`, nginx runs entirely as that
# UID. All writable state below lives under /tmp so non-root runs work unchanged.
worker_processes auto;
daemon off;
pid /tmp/nginx.pid;
error_log /dev/stderr info;
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
sendfile on;
server_tokens off;
access_log /dev/stdout;
# Writable temp paths under /tmp so the listed user (root or --user UID) can
# always create them; the defaults under /var/lib/nginx are root-only.
client_body_temp_path /tmp/nginx-client-body;
proxy_temp_path /tmp/nginx-proxy;
fastcgi_temp_path /tmp/nginx-fastcgi;
uwsgi_temp_path /tmp/nginx-uwsgi;
scgi_temp_path /tmp/nginx-scgi;
# AI workloads: large uploads (images/audio/video) and long generations.
client_max_body_size 4096m;
proxy_read_timeout 3600s;
proxy_send_timeout 3600s;
proxy_connect_timeout 75s;
# Shared proxy headers. CoderAI builds public URLs from these
# (codai/api/urlutils.py); the tools honour X-Forwarded-Prefix for sub-paths.
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Forwarded-Host $host;
upstream coderai { server 127.0.0.1:18776; }
upstream editor { server 127.0.0.1:8420; }
upstream videogen { server 127.0.0.1:7790; }
upstream township { server 127.0.0.1:7788; }
server {
listen 8776 default_server;
listen [::]:8776 default_server;
server_name _;
# --- Video editor: https://host:8776/editor/ -------------------------
location /editor/ {
proxy_pass http://editor/; # trailing slash strips the prefix
proxy_set_header X-Forwarded-Prefix /editor;
proxy_request_buffering off; # stream large uploads through
proxy_buffering off; # SSE progress
}
# --- Videogen studio: https://host:8776/videogen/ -------------------
location /videogen/ {
proxy_pass http://videogen/;
proxy_set_header X-Forwarded-Prefix /videogen;
proxy_request_buffering off;
proxy_buffering off;
}
# --- Township fighters: https://host:8776/township/ ----------------
location /township/ {
proxy_pass http://township/;
proxy_set_header X-Forwarded-Prefix /township;
proxy_request_buffering off;
proxy_buffering off;
}
# --- CoderAI server + OpenAI API at the root ------------------------
location / {
proxy_pass http://coderai;
proxy_buffering off; # SSE: chat stream + task progress
}
}
}
...@@ -16,6 +16,17 @@ DATA_ROOT="$PWD/coderai-runtime" ...@@ -16,6 +16,17 @@ DATA_ROOT="$PWD/coderai-runtime"
DETACH=0 DETACH=0
NAME="coderai" NAME="coderai"
EXTRA_ARGS=() EXTRA_ARGS=()
# Optional: map an EXISTING local config dir + real data dirs so the image runs
# against your live config/models without a rebuild (an image is immutable; this
# is purely run-time bind-mounts). See --config-dir / --local / --map below.
CONFIG_DIR_SRC=""
INPLACE_CONFIG=0
MAPS=()
# Optional debug logging: CODERAI_DEBUG selects coderai's --debug* flags inside
# the container; LOG_FILE_CONT is the in-container log path (under a mounted
# volume so it's tailable on the host).
DEBUG_SPEC=""
LOG_FILE_CONT=""
usage() { usage() {
cat <<'EOF' cat <<'EOF'
...@@ -32,8 +43,26 @@ Options: ...@@ -32,8 +43,26 @@ Options:
--data-dir PATH Directory for config/models/cache (default: ./coderai-runtime). --data-dir PATH Directory for config/models/cache (default: ./coderai-runtime).
--name NAME Container name (default: coderai). --name NAME Container name (default: coderai).
-d, --detach Run in background. -d, --detach Run in background.
--config-dir PATH Use an EXISTING config dir (with config.json/models.json),
mounted at /config/coderai. Copied to a temp dir by default
so the image's host/port rewrite leaves your dir untouched.
--local Shortcut for --config-dir ~/.coderai.
--inplace-config Mount --config-dir in place (the image WILL edit host/port).
--map HOST[:CONT] Bind-mount a host dir at the SAME path (or HOST:CONT) inside
the container, so absolute paths in models.json resolve
(e.g. --map /AI/guffcache). Repeatable.
--debug[=SPEC] Run coderai with debug flags. SPEC (default 'all'):
all | engine,requests,ws,web,thermal,lora,engine-web
Also writes a host-tailable file log (see --log-file).
--log-file PATH In-container log path (default /cache/logs/coderai.log,
visible on the host under the cache mount). Implies a file
log even without --debug. tee'd, so `docker logs` still works.
-- ARGS Extra args passed to the container engine before the image name. -- ARGS Extra args passed to the container engine before the image name.
-h, --help Show this help. -h, --help Show this help.
Test against your live config + data (no rebuild):
packaging/linux/run_oci.sh --nvidia --local \
--map /AI/guffcache --map /AI/huggingface --map /AI/offloads
EOF EOF
} }
...@@ -53,6 +82,19 @@ while [[ $# -gt 0 ]]; do ...@@ -53,6 +82,19 @@ while [[ $# -gt 0 ]]; do
--name) --name)
[[ $# -ge 2 ]] || { echo "Error: --name requires a value" >&2; exit 2; } [[ $# -ge 2 ]] || { echo "Error: --name requires a value" >&2; exit 2; }
NAME="$2"; shift 2 ;; NAME="$2"; shift 2 ;;
--config-dir)
[[ $# -ge 2 ]] || { echo "Error: --config-dir requires a path" >&2; exit 2; }
CONFIG_DIR_SRC="$2"; shift 2 ;;
--local) CONFIG_DIR_SRC="$HOME/.coderai"; shift ;;
--inplace-config) INPLACE_CONFIG=1; shift ;;
--map)
[[ $# -ge 2 ]] || { echo "Error: --map requires HOST[:CONT]" >&2; exit 2; }
MAPS+=("$2"); shift 2 ;;
--debug) DEBUG_SPEC="all"; shift ;;
--debug=*) DEBUG_SPEC="${1#*=}"; shift ;;
--log-file)
[[ $# -ge 2 ]] || { echo "Error: --log-file requires a path" >&2; exit 2; }
LOG_FILE_CONT="$2"; shift 2 ;;
-d|--detach) DETACH=1; shift ;; -d|--detach) DETACH=1; shift ;;
--) --)
shift shift
...@@ -90,7 +132,61 @@ volume_suffix="" ...@@ -90,7 +132,61 @@ volume_suffix=""
if [[ "$ENGINE" == "podman" ]]; then if [[ "$ENGINE" == "podman" ]]; then
volume_suffix=":Z" volume_suffix=":Z"
fi fi
args+=(-v "$DATA_ROOT/config:/config$volume_suffix" -v "$DATA_ROOT/models:/models$volume_suffix" -v "$DATA_ROOT/cache:/cache$volume_suffix")
# Config mount: either the fresh scratch dir, or an EXISTING local config dir
# mounted at /config/coderai (where the image launcher reads config.json).
CONFIG_NOTE="$DATA_ROOT/config (fresh)"
if [[ -n "$CONFIG_DIR_SRC" ]]; then
[[ -d "$CONFIG_DIR_SRC" ]] || { echo "Error: --config-dir '$CONFIG_DIR_SRC' not found" >&2; exit 2; }
CONFIG_DIR_SRC="$(cd "$CONFIG_DIR_SRC" && pwd)"
if [[ "$INPLACE_CONFIG" == "1" ]]; then
CFG_MOUNT="$CONFIG_DIR_SRC"
CONFIG_NOTE="$CONFIG_DIR_SRC (in place — image rewrites host/port!)"
else
# Copy ONLY the json config files to a throwaway dir so the image's host/port
# rewrite never touches your real config, and we don't copy big subdirs
# (e.g. ~/.coderai/ds4 weights).
CFG_PARENT="$(mktemp -d "${TMPDIR:-/tmp}/coderai-cfg.XXXXXX")"
CFG_MOUNT="$CFG_PARENT/coderai"
mkdir -p "$CFG_MOUNT"
cp -a "$CONFIG_DIR_SRC"/*.json "$CFG_MOUNT/" 2>/dev/null || true
[[ -f "$CFG_MOUNT/config.json" ]] || { echo "Error: no config.json in '$CONFIG_DIR_SRC'" >&2; exit 2; }
CONFIG_NOTE="$CONFIG_DIR_SRC$CFG_MOUNT (copy; original untouched)"
fi
args+=(-v "$CFG_MOUNT:/config/coderai$volume_suffix" \
-v "$DATA_ROOT/models:/models$volume_suffix" -v "$DATA_ROOT/cache:/cache$volume_suffix")
else
args+=(-v "$DATA_ROOT/config:/config$volume_suffix" -v "$DATA_ROOT/models:/models$volume_suffix" -v "$DATA_ROOT/cache:/cache$volume_suffix")
fi
# 1:1 (or HOST:CONT) data mounts so absolute paths in models.json resolve.
for m in "${MAPS[@]:-}"; do
[[ -n "$m" ]] || continue
host="${m%%:*}"; cont="${m#*:}"; [[ "$m" == *:* ]] || cont="$host"
if [[ -d "$host" ]]; then
args+=(-v "$host:$cont$volume_suffix")
else
echo "Warning: --map source '$host' not found; skipping" >&2
fi
done
# Debug flags + host-tailable file log. A file log is enabled by --debug or
# --log-file; default path lives under /cache so it lands on the host mount.
LOG_HOST_NOTE="(none)"
if [[ -n "$DEBUG_SPEC" || -n "$LOG_FILE_CONT" ]]; then
: "${LOG_FILE_CONT:=/cache/logs/coderai.log}"
[[ -n "$DEBUG_SPEC" ]] && args+=(-e "CODERAI_DEBUG=$DEBUG_SPEC")
args+=(-e "CODERAI_LOG_FILE=$LOG_FILE_CONT")
# Translate the in-container path to the host path for the banner, for the
# standard /config|/models|/cache mounts.
case "$LOG_FILE_CONT" in
/cache/*) LOG_HOST_NOTE="$DATA_ROOT/cache/${LOG_FILE_CONT#/cache/}" ;;
/models/*) LOG_HOST_NOTE="$DATA_ROOT/models/${LOG_FILE_CONT#/models/}" ;;
/config/*) LOG_HOST_NOTE="$DATA_ROOT/config/${LOG_FILE_CONT#/config/}" ;;
*) LOG_HOST_NOTE="$LOG_FILE_CONT (in-container; mount it to see it on the host)" ;;
esac
fi
args+=("${EXTRA_ARGS[@]}" "$IMAGE_TAG") args+=("${EXTRA_ARGS[@]}" "$IMAGE_TAG")
cat <<EOF cat <<EOF
...@@ -100,6 +196,13 @@ Starting CoderAI OCI container ...@@ -100,6 +196,13 @@ Starting CoderAI OCI container
mode: $MODE mode: $MODE
url: http://127.0.0.1:$PORT/admin url: http://127.0.0.1:$PORT/admin
data: $DATA_ROOT data: $DATA_ROOT
config: $CONFIG_NOTE
debug: ${DEBUG_SPEC:-off}
log: $LOG_HOST_NOTE
EOF EOF
if [[ "$LOG_HOST_NOTE" != "(none)" ]]; then
echo " tail it: tail -F '$LOG_HOST_NOTE'"
fi
exec "$ENGINE" "${args[@]}" exec "$ENGINE" "${args[@]}"
#!/usr/bin/env bash
# Smoke test for the all-in-one CoderAI image: brings the container up and checks
# that nginx + the bundled services answer, and that every external binary/worker
# we rely on is present and runnable. Does NOT load models (no weights needed).
#
# Usage: [DOCKER="sudo docker"] [GPU=--gpus=all] ./smoke_test_services.sh [IMAGE]
set -uo pipefail
DOCKER_BIN="${DOCKER:-docker}"
read -r -a DK <<< "$DOCKER_BIN"
IMAGE="${1:-coderai:dist}"
PORT="${PORT:-18080}"
NAME="coderai-smoke-$$"
GPU="${GPU:-}"
TMP="$(mktemp -d)"
fails=0
note(){ printf '%-52s %s\n' "$1" "$2"; }
ok(){ note "$1" "OK"; }
bad(){ note "$1" "FAIL — $2"; fails=$((fails+1)); }
cleanup(){ "${DK[@]}" rm -f "$NAME" >/dev/null 2>&1 || true; rm -rf "$TMP"; }
trap cleanup EXIT
echo "== starting $IMAGE as $NAME (port $PORT) =="
mkdir -p "$TMP/config" "$TMP/models" "$TMP/cache"
# shellcheck disable=SC2086
"${DK[@]}" run -d --name "$NAME" $GPU --ipc=host \
--user "$(id -u):$(id -g)" \
-p "$PORT:8776" \
-v "$TMP/config:/config" -v "$TMP/models:/models" -v "$TMP/cache:/cache" \
"$IMAGE" >/dev/null || { echo "container failed to start"; exit 1; }
echo "== waiting for the front to answer =="
up=0
for _ in $(seq 1 60); do
code="$(curl -s -o /dev/null -w '%{http_code}' "http://127.0.0.1:$PORT/" || true)"
# Any non-5xx HTTP code means the front + coderai are up (the root path itself
# 404s — the UI lives at /admin); 502/503 means the upstream isn't ready yet.
case "$code" in 200|301|302|307|401|403|404) up=1; break;; esac
if ! "${DK[@]}" ps -q --filter "name=$NAME" | grep -q .; then
echo "container exited early; logs:"; "${DK[@]}" logs "$NAME" 2>&1 | tail -40; exit 1
fi
sleep 3
done
[ "$up" = 1 ] && ok "front http://…:$PORT/ responds" || bad "front /" "no response"
echo "== sub-path mounts =="
for p in editor videogen township; do
code="$(curl -s -o /dev/null -w '%{http_code}' "http://127.0.0.1:$PORT/$p/" || true)"
case "$code" in 200|301|302|307) ok "/$p/ ($code)";; *) bad "/$p/" "http $code";; esac
done
echo "== bundled binaries on PATH =="
for b in ffmpeg ffprobe vulkaninfo nginx supervisord whisper-server ds4-server wav2lip sadtalker lspci; do
if "${DK[@]}" exec "$NAME" sh -lc "command -v $b >/dev/null 2>&1"; then ok "bin: $b"; else bad "bin: $b" "missing"; fi
done
echo "== ds4 seeded on the cache volume =="
if "${DK[@]}" exec "$NAME" sh -lc "test -x /cache/ds4/ds4-server"; then ok "/cache/ds4/ds4-server"; else bad "/cache/ds4/ds4-server" "missing"; fi
echo "== shared lip-sync venv (py3.10 + torch) =="
if "${DK[@]}" exec "$NAME" /opt/coderai/lipsync_venv/bin/python -c "import torch,sys; print(sys.version.split()[0], torch.__version__)" >/dev/null 2>&1; then
ok "lipsync venv imports torch"
else
bad "lipsync venv" "python/torch import failed"
fi
# Repo code is bundled; weights are NOT (download on first lip-sync use).
if "${DK[@]}" exec "$NAME" sh -lc "test -f /opt/coderai/Wav2Lip/inference.py && test -f /opt/coderai/SadTalker/inference.py"; then
ok "lip-sync repo code present"
else
bad "lip-sync repo code" "missing"
fi
echo "== parler overlay present =="
if "${DK[@]}" exec "$NAME" sh -lc "test -d /opt/coderai/parler-venv/site-packages"; then ok "parler overlay"; else bad "parler overlay" "missing"; fi
echo
if [ "$fails" = 0 ]; then echo "SMOKE TEST PASSED"; else echo "SMOKE TEST: $fails failure(s)"; "${DK[@]}" logs "$NAME" 2>&1 | tail -30; fi
exit "$fails"
; Process supervisor for the CoderAI distributable image.
; Starts nginx (public :8776) plus the main server and the bundled tool web UIs,
; all bound to localhost behind nginx. Logs go to stdout/stderr so `docker logs`
; shows everything.
[supervisord]
nodaemon=true
logfile=/dev/null
logfile_maxbytes=0
; pid + control socket under /tmp so the container runs as root OR `--user UID`.
pidfile=/tmp/supervisord.pid
[unix_http_server]
file=/tmp/supervisor.sock
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[program:coderai]
; The OCI launcher seeds /config and binds the main server to localhost:18776.
command=/usr/local/bin/coderai
environment=CODERAI_HOST="127.0.0.1",CODERAI_PORT="18776"
autostart=true
autorestart=true
startsecs=5
stopwaitsecs=30
priority=10
; Signal the whole process group so the front's engine subprocesses (and the
; optional `tee` used for file logging) stop/kill together with the launcher.
stopasgroup=true
killasgroup=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
[program:nginx]
command=/usr/sbin/nginx -c /etc/nginx/nginx.conf
autostart=true
autorestart=true
startsecs=3
priority=20
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
[program:video_editor]
command=/usr/local/bin/with-env /opt/coderai/python/bin/python3 /opt/coderai/app/tools/video_editor.py
--no-browser --host 127.0.0.1 --port 8420
--base-url http://127.0.0.1:18776
directory=/opt/coderai/app
autostart=true
autorestart=true
startsecs=5
startretries=5
priority=30
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
[program:videogen]
command=/usr/local/bin/with-env /opt/coderai/python/bin/python3 /opt/coderai/app/tools/videogen.py
--host 127.0.0.1 --web-port 7790
--base-url http://127.0.0.1:18776
--out-dir /cache/videogen_output
directory=/opt/coderai/app
autostart=true
autorestart=true
startsecs=5
startretries=5
priority=30
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
[program:township]
command=/usr/local/bin/with-env /opt/coderai/python/bin/python3 /opt/coderai/app/tools/gen_township_fighters.py
--web-port 7788
--base-url http://127.0.0.1:18776
--out-dir /cache/township_output
directory=/opt/coderai/app
autostart=true
autorestart=true
startsecs=5
startretries=5
priority=30
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
; Parler-TTS runs in its OWN bundled venv (transformers 4.46, pinned). Its
; site-packages is prepended to PYTHONPATH so it shadows the main stack; torch and
; the rest resolve from the standalone Python's site-packages underneathexactly
; the local --system-site-packages layering. Internal-only (not proxied by nginx);
; coderai reaches it via a TTS model config { "service_url": "http://127.0.0.1:8123" }.
; Disabled by default: set autostart=true (or start it from supervisorctl) once a
; parler model is configured. Won't be fatal if the model isn't present.
[program:parler]
command=/usr/local/bin/with-env /opt/coderai/python/bin/python3 /opt/coderai/app/tools/parler_tts_service.py
--model %(ENV_CODERAI_PARLER_MODEL)s --port 8123
environment=PYTHONPATH="/opt/coderai/parler-venv/site-packages"
directory=/opt/coderai/app
autostart=false
autorestart=true
startsecs=10
startretries=3
priority=40
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
#!/usr/bin/env bash
# Fast incremental image update: re-layer ONLY the coderai app code + launcher
# scripts + service configs on top of an already-built image. No 20 GB bundle
# recopy — seconds, not the ~15 min of a full build_oci_image.sh run.
#
# It keeps an immutable `coderai:base` tag (the heavy bundle) and rebuilds the
# shipped `coderai:dist` as base + a thin app layer. Because every update starts
# from the SAME base, app layers never stack up over repeated updates.
#
# Usage:
# [DOCKER="sudo docker"] ./update_oci_image.sh
# BASE_IMAGE=coderai:base TAG=coderai:dist DOCKER="sudo docker" ./update_oci_image.sh
#
# First run seeds coderai:base from the current coderai:dist. To re-baseline the
# bundle (new venv/libs/tools), run build_oci_image.sh and then:
# docker rmi coderai:base # drop the stale base; next update re-seeds it
set -euo pipefail
HERE="$(cd "$(dirname "$0")" && pwd)"
REPO_ROOT="$(cd "$HERE/../.." && pwd)"
DOCKER_BIN="${DOCKER:-docker}"
read -r -a DK <<< "$DOCKER_BIN"
BASE_IMAGE="${BASE_IMAGE:-coderai:base}"
TAG="${TAG:-coderai:dist}"
SEED_FROM="${SEED_FROM:-coderai:dist}"
img_exists(){ "${DK[@]}" image inspect "$1" >/dev/null 2>&1; }
# Seed the immutable base from a previously built full image if it doesn't exist.
if ! img_exists "$BASE_IMAGE"; then
if img_exists "$SEED_FROM"; then
echo "== seeding immutable base '$BASE_IMAGE' from '$SEED_FROM' =="
"${DK[@]}" tag "$SEED_FROM" "$BASE_IMAGE"
else
echo "Base '$BASE_IMAGE' and seed '$SEED_FROM' both missing." >&2
echo "Run packaging/linux/build_oci_image.sh for a full build first." >&2
exit 1
fi
fi
echo "== updating '$TAG' from base '$BASE_IMAGE' (app code only) =="
t0=$(date +%s)
"${DK[@]}" build \
-f "$HERE/Dockerfile.update" \
--build-arg BASE_IMAGE="$BASE_IMAGE" \
-t "$TAG" "$REPO_ROOT"
echo "== done in $(( $(date +%s) - t0 ))s: '$TAG' (base '$BASE_IMAGE' unchanged) =="
echo " Tip: 'docker image prune -f' to drop the now-dangling previous '$TAG' layer."
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment