Commit 766fef3c authored by Stefy Lanza (nextime / spora)

Merge feat/township-match-upload: to-download list, mmproj vision, styled modals, broker + packaging
parents 56291911 cbf7f147
......@@ -21,6 +21,18 @@ township_output
dist
dist-package
*.log
tmp
debug.log
CoderAI.gif
# Produced artifacts and tool session/output dirs (mounted as volumes at runtime,
# never baked into the image)
video_editor/sessions
video_editor.config.json
tools/videogen_output
tools/township_output
tools/coderai_media
samples
# Build outputs
build
......
......@@ -17,6 +17,15 @@ __pycache__/
# Debug logs
debug.log
/logs/
# Runtime model cache (downloads, self-quantized checkpoints, job state).
# Root-anchored so it never shadows the tracked codai/models/ source package.
/models/
# Third-party source clone of the GPTQ quantizer — installed into the venv from
# source; the working tree is not part of this repo (it has its own .git).
/GPTQModel/
# Test files
test_*.py
......@@ -33,3 +42,11 @@ township_output/
# Packaging build cache + runtime temp (large artifacts)
.packaging-cache/
tmp/
# Exported image tarballs + local OCI run-state (large artifacts)
dist/
coderai-runtime/
# Video editor sessions + generated media (runtime artifacts)
video_editor/sessions/
tools/coderai_media/
......@@ -286,3 +286,67 @@ safe.
14. Thermal protection is config-driven and model-agnostic (config.json
`thermal`). Don't special-case it per model/backend; it only reads temps and
sleeps. Honour the enable flags and high/resume hysteresis.
================================================================================
## Distributable Docker image (packaging/linux)
================================================================================
All-in-one image: coderai + tools (editor/videogen/township) behind nginx on a
single port (8776), built from the LOCAL install's venv + binaries.
Multi-stage `Dockerfile.oci-venv`:
- assembler stage stages the local bundle into /opt/coderai (python-build-
standalone interpreter + venv site-packages + ldd'd native libs + parler
overlay + lip-sync venv/repos + py310 + ds4). The ~20 GB bundle COPY lives
ONLY here; the runtime stage COPYs the assembled tree ONCE (no double-store).
- runtime stage: apt (nginx/supervisor/vulkan-tools/ffmpeg/...), COPY the
assembled /opt/coderai, then COPY app code → /opt/coderai/app, launchers →
/usr/local/bin, nginx/supervisor confs. Entry = coderai-entrypoint →
supervisord (nginx + main server + tool UIs).
- Do NOT set PYTHONHOME globally (breaks the system-python supervisord); set
PATH only. Bundle dereferences host symlinks (cp -aL) so binaries like
whisper-server are real files in the image, not dangling links.
Full build (slow, ~15 min — rebuilds the bundle):
packaging/linux/build_oci_image.sh # tags coderai:dist
Smoke test (no weights, checks services + every bundled binary):
DOCKER="sudo docker" GPU="--gpus all" PORT=18082 \
packaging/linux/smoke_test_services.sh coderai:dist
Run against your LIVE local config + data (no rebuild — pure bind-mounts):
packaging/linux/run_oci.sh --nvidia --local \
--map /AI/guffcache --map /AI/huggingface --map /AI/offloads
- The image launcher reads config from /config/coderai and runs
`coderai --config /config/coderai`, rewriting server.host/port in config.json.
- `--local` (= --config-dir ~/.coderai) copies ONLY the *.json config files to
a temp dir and mounts it at /config/coderai, so your real config is untouched
(use --inplace-config to edit it directly).
- `--map HOST[:CONT]` bind-mounts a host dir at the SAME path inside the
container so the ABSOLUTE paths in models.json/config.json (gguf/hf caches,
offloads) resolve unchanged. Without these maps the models won't be found.
- `--debug[=SPEC]` runs coderai with --debug* flags (SPEC default 'all';
e.g. `--debug=engine,requests,ws` becomes --debug-engine/--debug-requests/
--debug-ws, with `--debug` always auto-added) and writes a host-tailable file
log. `--log-file PATH` sets the in-container log path (default
/cache/logs/coderai.log, which lands on the host under the cache mount).
Driven by env CODERAI_DEBUG + CODERAI_LOG_FILE, read by the coderai-oci
launcher, which tees output so `docker logs` still works.
supervisord [program:coderai] uses stopasgroup/killasgroup so the front's
engine subprocesses + the tee are torn down together. NOTE: the launcher +
supervisord.conf are baked in, so changes need a (fast) update_oci_image.sh.
Incremental update (FAST, ~30 s — code-only changes, NO bundle recopy):
DOCKER="sudo docker" packaging/linux/update_oci_image.sh
- `Dockerfile.update` is `FROM coderai:base` and re-layers ONLY the app code +
launchers + service confs. The heavy bundle layers are inherited unchanged.
- Keeps an immutable `coderai:base` (the bundle) and rebuilds `coderai:dist`
as base + a thin app layer. Every update starts from the SAME base, so app
layers never stack across updates. dist and base SHARE the bundle layers —
keeping both costs only the app layer (a few MB), not a second 23 GB.
- First run seeds coderai:base from the current coderai:dist (docker tag).
- Re-baseline the bundle (new venv/libs/tools): run build_oci_image.sh, then
`docker rmi coderai:base` so the next update re-seeds it from the new dist.
- Use this whenever ONLY codai/ app code (or launchers/confs) changed — a full
build_oci_image.sh is wasteful for that.
- CAUTION: COPY adds/overwrites but does NOT delete files removed from the
repo; the cleanup RUN prunes only known-stale paths (.git/venv*/dist/...). A
source file deleted from codai/ lingers in the overlay until a full rebuild.
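
The `--map HOST[:CONT]` rule described above can be sketched in Python. This is a hypothetical illustration, not the actual `run_oci.sh` logic: `parse_map` and `docker_volume_args` are names invented here, and the only behaviour taken from the text is that an omitted CONT defaults to the same absolute path as HOST.

```python
def parse_map(spec: str) -> tuple[str, str]:
    """Split a HOST[:CONT] spec; CONT defaults to HOST (same path in the container)."""
    host, sep, cont = spec.partition(":")
    if not host.startswith("/"):
        raise ValueError(f"host path must be absolute: {spec!r}")
    return host, (cont if sep and cont else host)


def docker_volume_args(specs: list[str]) -> list[str]:
    """Turn --map specs into `docker run -v HOST:CONT` arguments (assumed mapping)."""
    args: list[str] = []
    for spec in specs:
        host, cont = parse_map(spec)
        args += ["-v", f"{host}:{cont}"]
    return args


print(docker_volume_args(["/AI/guffcache", "/AI/huggingface:/hf"]))
# ['-v', '/AI/guffcache:/AI/guffcache', '-v', '/AI/huggingface:/hf']
```

Keeping the container path identical to the host path is what lets the absolute paths in models.json/config.json resolve without rewriting.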
......@@ -4,7 +4,7 @@
![CoderAI](CoderAI.gif)
An OpenAI-compatible API server to run models on your local GPU with web administration dashboard, supporting multiple GPU backends: NVIDIA (CUDA), AMD (Vulkan), and Intel (Vulkan). Configuration-driven architecture with per-model settings and full multi-modal support.
A multimodal and multi-backend local model orchestrator with an OpenAI-compatible API server to run models on local GPUs, supporting multiple GPU backends: NVIDIA (CUDA), AMD (Vulkan), and Intel (Vulkan). Configuration-driven architecture with per-model settings and full multi-modal support.
## Features
......
......@@ -35,12 +35,13 @@ BACKEND="${1:-all}"
FLASH=false
CUSTOM_VENV=""
PACKAGE=false
DS4=false
# Parse arguments
i=1
for arg in "$@"; do
case $arg in
--flash)
--flash)
FLASH=true
;;
--venv)
......@@ -50,6 +51,9 @@ for arg in "$@"; do
--package)
PACKAGE=true
;;
--ds4)
DS4=true
;;
esac
i=$((i + 1))
done
......@@ -68,6 +72,7 @@ if [[ "$BACKEND" != "nvidia" && "$BACKEND" != "vulkan" && "$BACKEND" != "vulkan-
echo ""
echo "Options:"
echo " --flash - Install Flash Attention 2 for faster inference (NVIDIA only)"
echo " --ds4 - Clone + build the ds4 (DeepSeek V4) native engine"
exit 1
fi
......@@ -755,6 +760,35 @@ package_app() {
echo -e "${YELLOW}Note: The target machine must still provide compatible system GPU/runtime libraries.${NC}"
}
# Optionally clone + build ds4 (DeepSeek V4 native engine). Opt-in via --ds4.
# coderai can also auto-build this at runtime on first use, but doing it here lets
# the OCI/Docker packaging bundle the prebuilt ds4-server binary.
build_ds4() {
local DS4_DIR="${CODERAI_DS4_DIR:-$HOME/.coderai/ds4}"
echo -e "${YELLOW}Building ds4 (DeepSeek V4 engine) → $DS4_DIR ...${NC}"
if [ ! -e "$DS4_DIR/Makefile" ]; then
mkdir -p "$(dirname "$DS4_DIR")"
git clone --depth 1 https://github.com/antirez/ds4 "$DS4_DIR" || {
echo -e "${YELLOW}Warning: could not clone ds4; skipping.${NC}"; return 0; }
fi
local TARGET="cpu"
if command -v nvcc &> /dev/null || [ -d "/usr/local/cuda" ]; then
TARGET="cuda-generic"
elif [ "$(uname -s)" = "Darwin" ]; then
TARGET="" # bare `make` builds the macOS Metal backend
fi
( cd "$DS4_DIR" && make $TARGET ) || {
echo -e "${YELLOW}Warning: ds4 build failed; it can still be built at runtime.${NC}"; return 0; }
if [ -x "$DS4_DIR/ds4-server" ]; then
echo -e "${GREEN}✓ ds4-server built at $DS4_DIR/ds4-server${NC}"
echo -e "${YELLOW}Note: DeepSeek V4 weights are downloaded on first use (multi-GB).${NC}"
fi
}
if [ "$DS4" = true ]; then
build_ds4
fi
# Create .backend file to track which backend was used
echo "$BACKEND" > .backend
......
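
For readers who prefer Python, the make-target choice in `build_ds4` above can be restated as a small pure function. This is only a sketch for illustration: `pick_ds4_target` and `detect_ds4_target` are hypothetical names, and the target strings (`cuda-generic`, `cpu`, bare `make` on macOS) are copied from the shell function rather than checked against the ds4 Makefile.

```python
import platform
import shutil
from pathlib import Path


def pick_ds4_target(has_nvcc: bool, has_cuda_dir: bool, system: str) -> str:
    """Same precedence as the shell code: CUDA first, then macOS, else CPU."""
    if has_nvcc or has_cuda_dir:
        return "cuda-generic"
    if system == "Darwin":
        return ""  # bare `make` (the script says this builds the Metal backend)
    return "cpu"


def detect_ds4_target() -> str:
    """Probe the local machine the way the script does."""
    return pick_ds4_target(
        has_nvcc=shutil.which("nvcc") is not None,
        has_cuda_dir=Path("/usr/local/cuda").is_dir(),
        system=platform.system(),
    )


print(pick_ds4_target(False, False, "Linux"))  # cpu
```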
......@@ -335,7 +335,7 @@ async function deleteEntry() {
closeDetail();
loadArchive();
} catch(e) {
alert('Delete failed: ' + e.message);
showAlert('Delete failed: ' + e.message);
}
}
......
......@@ -104,6 +104,81 @@ function donateCopy(id, btn) {
</main>
{% endif %}
<!-- Shared confirm / notice modal (replaces window.confirm / window.alert) -->
<div id="confirm-modal" class="modal" onclick="if(event.target===this)document.getElementById('confirm-modal-cancel').click()">
<div class="modal-box" style="max-width:420px">
<div class="modal-head">
<span class="modal-title" id="confirm-modal-title">Confirm</span>
<button class="modal-close" id="confirm-modal-x">&times;</button>
</div>
<div class="modal-body">
<p id="confirm-modal-msg" style="margin:0 0 1.25rem;white-space:pre-wrap"></p>
<div style="display:flex;gap:.5rem;justify-content:flex-end">
<button class="btn btn-ghost" id="confirm-modal-cancel">Cancel</button>
<button class="btn btn-danger" id="confirm-modal-ok">Confirm</button>
</div>
</div>
</div>
</div>
<script>
// Global modal helpers, shared by every admin page. Defined here so templates
// can call showAlert()/showConfirm() instead of window.alert()/window.confirm().
if(typeof window.openModal!=='function') window.openModal=function(id){document.getElementById(id).classList.add('show')};
if(typeof window.closeModal!=='function') window.closeModal=function(id){document.getElementById(id).classList.remove('show')};
window.showConfirm=function(title, msg, okLabel){
return new Promise(resolve => {
document.getElementById('confirm-modal-title').textContent = title;
document.getElementById('confirm-modal-msg').textContent = msg;
const okBtn = document.getElementById('confirm-modal-ok');
const cancelBtn= document.getElementById('confirm-modal-cancel');
const xBtn = document.getElementById('confirm-modal-x');
okBtn.className = 'btn btn-danger';
okBtn.textContent = okLabel || 'Confirm';
cancelBtn.style.display = '';
openModal('confirm-modal');
function cleanup(result){
closeModal('confirm-modal');
okBtn.removeEventListener('click', onOk);
cancelBtn.removeEventListener('click', onCancel);
xBtn.removeEventListener('click', onCancel);
resolve(result);
}
function onOk(){ cleanup(true); }
function onCancel(){ cleanup(false); }
okBtn.addEventListener('click', onOk);
cancelBtn.addEventListener('click', onCancel);
xBtn.addEventListener('click', onCancel);
});
};
// Styled replacement for window.alert(): a single-button notice modal.
window.showAlert=function(msg, title, kind){
return new Promise(resolve => {
if(!title && !kind && /^\s*(error|failed|cannot|could not)\b/i.test(String(msg||''))) kind = 'error';
document.getElementById('confirm-modal-title').textContent =
title || (kind === 'error' ? 'Error' : 'Notice');
document.getElementById('confirm-modal-msg').textContent = msg;
const okBtn = document.getElementById('confirm-modal-ok');
const cancelBtn = document.getElementById('confirm-modal-cancel');
const xBtn = document.getElementById('confirm-modal-x');
okBtn.className = 'btn btn-primary';
okBtn.textContent = 'OK';
cancelBtn.style.display = 'none';
openModal('confirm-modal');
function cleanup(){
closeModal('confirm-modal');
cancelBtn.style.display = '';
okBtn.removeEventListener('click', onOk);
xBtn.removeEventListener('click', onOk);
resolve();
}
function onOk(){ cleanup(); }
okBtn.addEventListener('click', onOk);
xBtn.addEventListener('click', onOk);
});
};
</script>
{% block scripts %}{% endblock %}
</body>
</html>
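
A note on the title inference in `showAlert()` above: the pattern is anchored at the start of the message, so only messages that begin with "error", "failed", "cannot" or "could not" get the "Error" title. The Python port below is a sketch for checking that behaviour; `alert_title` is a hypothetical helper, not part of the templates.

```python
import re

# Same pattern as the JavaScript regex in showAlert(), case-insensitive.
_ERROR_PREFIX = re.compile(r"^\s*(error|failed|cannot|could not)\b", re.IGNORECASE)


def alert_title(msg, title=None, kind=None):
    """Mirror showAlert()'s title choice: an explicit title wins, else infer from msg."""
    if not title and not kind and _ERROR_PREFIX.search(str(msg or "")):
        kind = "error"
    return title or ("Error" if kind == "error" else "Notice")


print(alert_title("Failed to load profile: timeout"))  # Error
print(alert_title("Delete failed: timeout"))           # Notice
```

If that reading is right, call sites such as `showAlert('Delete failed: ' + e.message)` would be titled "Notice" unless they pass `kind` explicitly.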
......@@ -2372,7 +2372,7 @@ const STUDIO_CAPABILITIES = {
optional:[],
notes:[
'Requires <code>insightface</code> and <code>onnxruntime</code>: <code>pip install insightface onnxruntime</code>.',
'The <b>inswapper_128.onnx</b> model is <b>auto-downloaded</b> from HuggingFace on first use (<a href="/admin/models?tab=search&q=inswapper&pipeline=&gguf=no-gguf" class="cap-find-link">deepinsight/inswapper<span class="cap-find-icon">↗</span></a>).',
'The <b>inswapper_128.onnx</b> model is <b>auto-downloaded</b> from HuggingFace on first use (<a href="' + (window.ROOT_PATH||'') + '/admin/models?tab=search&q=inswapper&pipeline=&gguf=no-gguf" class="cap-find-link">deepinsight/inswapper<span class="cap-find-icon">↗</span></a>).',
'No AI model selection needed — this feature uses its own dedicated backend.',
],
backendPath: ROOT_PATH + '/v1/images/faceswap',
......@@ -2386,7 +2386,7 @@ const STUDIO_CAPABILITIES = {
optional:[],
notes:[
'Requires <code>insightface</code> and <code>onnxruntime</code>: <code>pip install insightface onnxruntime</code>.',
'The <b>inswapper_128.onnx</b> model is <b>auto-downloaded</b> from HuggingFace on first use (<a href="/admin/models?tab=search&q=inswapper&pipeline=&gguf=no-gguf" class="cap-find-link">deepinsight/inswapper<span class="cap-find-icon">↗</span></a>).',
'The <b>inswapper_128.onnx</b> model is <b>auto-downloaded</b> from HuggingFace on first use (<a href="' + (window.ROOT_PATH||'') + '/admin/models?tab=search&q=inswapper&pipeline=&gguf=no-gguf" class="cap-find-link">deepinsight/inswapper<span class="cap-find-icon">↗</span></a>).',
'No AI model selection needed — this feature uses its own dedicated backend.',
],
backendPath: ROOT_PATH + '/v1/images/faceswap',
......@@ -2461,14 +2461,14 @@ function capSearchUrl(cap) {
const s = CAP_TO_HF_SEARCH[cap];
if (!s) return null;
const p = new URLSearchParams({ tab:'search', q: s.q, pipeline: s.pipeline, gguf: s.gguf });
return '/admin/models?' + p.toString();
return (window.ROOT_PATH || '') + '/admin/models?' + p.toString();
}
function capMissingHtml(caps, label) {
if (!caps.length) return '';
const links = caps.map(cap => {
const chip = `<span class="cap-chip dim">${cap.replace(/_/g,' ')}</span>`;
if (_localCapSet.has(cap)) {
const url = `/admin/models?local_cap=${encodeURIComponent(cap)}`;
const url = `${window.ROOT_PATH || ''}/admin/models?local_cap=${encodeURIComponent(cap)}`;
return `<a href="${url}" class="cap-find-link" title="You have a local model with ${cap.replace(/_/g,' ')} — click to configure it">${chip}<span class="cap-find-icon" style="color:#6ecf7e">↑ configure</span></a>`;
}
const url = capSearchUrl(cap);
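
The root-path fix in this hunk amounts to prefixing every admin link with the deployment prefix, falling back to an empty string when none is set. A minimal Python sketch of the same URL construction, with `cap_search_url` as a hypothetical stand-in for the JavaScript `capSearchUrl`:

```python
from urllib.parse import urlencode


def cap_search_url(root_path: str, q: str, pipeline: str, gguf: str) -> str:
    """Build the model-search link, prefixed with the reverse-proxy root path."""
    query = urlencode({"tab": "search", "q": q, "pipeline": pipeline, "gguf": gguf})
    return f"{root_path or ''}/admin/models?{query}"


print(cap_search_url("/coderai", "inswapper", "", "no-gguf"))
# /coderai/admin/models?tab=search&q=inswapper&pipeline=&gguf=no-gguf
```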
......@@ -4229,12 +4229,12 @@ async function loadCharProfileIntoSlot(prefix, idx, name) {
charSlots[prefix][idx].name = charSlots[prefix][idx].name || d.name;
charSlots[prefix][idx].images = (d.images||[]).map(img => img.data);
renderCharSlots(prefix);
} catch(e) { alert('Failed to load profile: '+e.message); }
} catch(e) { showAlert('Failed to load profile: '+e.message); }
}
async function saveCharSlotAsProfile(prefix, idx) {
const slot = charSlots[prefix]?.[idx];
if (!slot || !slot.images.length) { alert('Add at least one image first.'); return; }
if (!slot || !slot.images.length) { showAlert('Add at least one image first.'); return; }
const name = slot.name || prompt('Profile name:');
if (!name) return;
try {
......@@ -4246,8 +4246,8 @@ async function saveCharSlotAsProfile(prefix, idx) {
charSlots[prefix][idx].name = name;
await loadCharProfileList();
renderCharSlots(prefix);
alert(`Saved profile "${name}"`);
} catch(e) { alert('Save failed: '+e.message); }
showAlert(`Saved profile "${name}"`);
} catch(e) { showAlert('Save failed: '+e.message); }
}
// ─────────────────────────────────────────────────────────────────
......@@ -6051,14 +6051,14 @@ async function profCharView(name) {
try {
const d = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name)).then(r=>r.json());
_openProfModal(`Character: ${d.name}`, d.description||'', d.images||[]);
} catch(e) { alert('Failed to load character: ' + e.message); }
} catch(e) { showAlert('Failed to load character: ' + e.message); }
}
async function profCharDelete(name) {
if (!confirm(`Delete character profile "${name}"?`)) return;
const r = await fetch(ROOT_PATH + '/admin/api/characters/'+encodeURIComponent(name), {method:'DELETE'});
if (r.ok) await profCharLoad();
else alert('Delete failed: ' + await r.text());
else showAlert('Delete failed: ' + await r.text());
}
......@@ -6139,7 +6139,7 @@ async function profVoiceDelete(name) {
if (!confirm(`Delete voice profile "${name}"?`)) return;
const r = await fetch(ROOT_PATH + '/admin/api/voices/'+encodeURIComponent(name), {method:'DELETE'});
if (r.ok) await profVoiceLoad();
else alert('Delete failed: ' + await r.text());
else showAlert('Delete failed: ' + await r.text());
}
// ─────────────────────────────────────────────────────────────────
......@@ -6296,14 +6296,14 @@ async function profEnvView(name) {
try {
const d = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name)).then(r=>r.json());
_openProfModal(`Environment: ${d.name}`, d.description||'', d.images||[]);
} catch(e) { alert('Failed to load environment: ' + e.message); }
} catch(e) { showAlert('Failed to load environment: ' + e.message); }
}
async function profEnvDelete(name) {
if (!confirm(`Delete environment profile "${name}"?`)) return;
const r = await fetch(ROOT_PATH + '/admin/api/environments/'+encodeURIComponent(name), {method:'DELETE'});
if (r.ok) await profEnvLoad();
else alert('Delete failed: ' + await r.text());
else showAlert('Delete failed: ' + await r.text());
}
// ─────────────────────────────────────────────────────────────────
......@@ -6528,7 +6528,7 @@ async function deleteCustomPipeline(id) {
_customPipelines = _customPipelines.filter(p => p.id !== id);
if (_editingPipelineId === id) { _editingPipelineId = null; _pbSteps = []; renderBuilderSteps(); }
renderCustomPipelineCards();
} catch(e) { alert('Delete failed: '+e.message); }
} catch(e) { showAlert('Delete failed: '+e.message); }
}
function _renderPipelineResult(outId, progId, d) {
......@@ -6683,7 +6683,7 @@ async function archiveDelete(filename) {
_archiveFiles = _archiveFiles.filter(f => f.filename !== filename);
renderArchive();
} catch(e) {
alert('Delete failed: ' + e.message);
showAlert('Delete failed: ' + e.message);
}
}
......
......@@ -126,15 +126,15 @@ async function createToken() {
openModal('show-modal');
loadTokens();
} else {
const e = await r.json(); alert(e.detail || 'Failed');
const e = await r.json(); showAlert(e.detail || 'Failed');
}
} catch (e) { alert(e.message); }
} catch (e) { showAlert(e.message); }
}
async function delToken(id) {
if (!confirm('Delete this token? Clients using it will lose access immediately.')) return;
const r = await fetch(ROOT_PATH + '/admin/api/tokens/'+id, {method:'DELETE'});
if (r.ok) loadTokens(); else alert('Failed to delete');
if (r.ok) loadTokens(); else showAlert('Failed to delete');
}
loadTokens();
......
......@@ -105,7 +105,7 @@ async function delUser(id, name) {
if (!confirm('Delete user "' + name + '"?')) return;
const r = await fetch(ROOT_PATH + '/admin/api/users/'+id, {method:'DELETE'});
if (r.ok) location.reload();
else { const e = await r.json(); alert(e.detail || 'Failed'); }
else { const e = await r.json(); showAlert(e.detail || 'Failed'); }
}
</script>
{% endblock %}
......@@ -160,6 +160,32 @@ except ImportError:
pass
class _InternalAuthMiddleware:
"""Reject any HTTP request that doesn't carry the front's internal token.
Active only when CODERAI_INTERNAL_TOKEN is set (i.e. this process is an engine
spawned by the front). The engine already binds 127.0.0.1 only; the token also
stops anything else on localhost from talking to the engine directly and
bypassing the front. In single-process mode the token is unset and this is a
no-op."""
def __init__(self, app):
self._app = app
self._token = os.environ.get("CODERAI_INTERNAL_TOKEN")
async def __call__(self, scope, receive, send):
if self._token and scope.get("type") == "http":
headers = dict(scope.get("headers", []))
got = headers.get(b"x-coderai-internal", b"").decode("latin-1")
if got != self._token:
await send({"type": "http.response.start", "status": 403,
"headers": [(b"content-type", b"application/json")]})
await send({"type": "http.response.body",
"body": b'{"error":"forbidden: engines are reachable only '
b'through the front proxy"}'})
return
await self._app(scope, receive, send)
class _ForwardedPrefixMiddleware:
"""Populate ASGI root_path from X-Forwarded-Prefix / X-Script-Name headers."""
......@@ -180,6 +206,9 @@ class _ForwardedPrefixMiddleware:
app.add_middleware(_ForwardedPrefixMiddleware)
# Added last → outermost: the internal-token gate runs before anything else, so a
# request without the front's token never reaches a route.
app.add_middleware(_InternalAuthMiddleware)
# Mount static files for admin dashboard
from fastapi.staticfiles import StaticFiles
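
The gate added in this hunk can be exercised without a server by driving it with a hand-built ASGI scope. The sketch below is a condensed copy for experimentation, not the shipped class: `InternalAuthGate` takes the token as a constructor argument instead of reading the environment, the error body is shortened, and `ok_app` and `status_for` are hypothetical test helpers.

```python
import asyncio


class InternalAuthGate:
    """Condensed copy of the middleware, with the token injected for testing."""

    def __init__(self, app, token):
        self._app = app
        self._token = token

    async def __call__(self, scope, receive, send):
        if self._token and scope.get("type") == "http":
            headers = dict(scope.get("headers", []))
            got = headers.get(b"x-coderai-internal", b"").decode("latin-1")
            if got != self._token:
                await send({"type": "http.response.start", "status": 403,
                            "headers": [(b"content-type", b"application/json")]})
                await send({"type": "http.response.body",
                            "body": b'{"error":"forbidden"}'})
                return
        await self._app(scope, receive, send)


async def ok_app(scope, receive, send):
    """Stand-in for the real application: always answers 200."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def status_for(token, headers):
    """Run one fake HTTP request through the gate and return the response status."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    scope = {"type": "http", "headers": headers}
    asyncio.run(InternalAuthGate(ok_app, token)(scope, receive, send))
    return sent[0]["status"]


print(status_for("s3cret", []))                                   # 403
print(status_for("s3cret", [(b"x-coderai-internal", b"s3cret")]))  # 200
print(status_for(None, []))                                        # 200
```

As written, only scopes of type `http` are checked, so a `websocket` scope would pass straight through to the app; whether that is intended is not stated in the diff.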
......@@ -193,6 +222,77 @@ from fastapi.responses import FileResponse, Response as _FaviconResponse
_favicon_path = admin_static_dir / "favicon.ico"
@app.get("/healthz", include_in_schema=False)
async def healthz():
"""Cheap liveness probe that touches no torch/model state.
The front proxy's engine supervisor polls this to distinguish a *slow* engine
(busy loading a model — the event loop may be blocked, so this can be late but
will eventually answer) from a *dead* one (connection refused). It must stay
trivial and dependency-free so it returns the instant the loop is free."""
import os as _os
return {"ok": True, "pid": _os.getpid()}
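
The slow-versus-dead distinction in the `healthz` docstring suggests a probe on the supervisor side along these lines. This is a sketch under assumptions: the front proxy's real supervisor is not shown in this diff, `probe` and `classify_probe_error` are hypothetical names, and treating a timeout as "slow" and a refused connection as "dead" is my reading of the docstring.

```python
import socket
import urllib.error
import urllib.request


def classify_probe_error(exc):
    """Map a failed /healthz probe to 'dead' or 'slow' (assumed policy)."""
    reason = getattr(exc, "reason", exc)  # URLError wraps the socket-level error
    if isinstance(reason, ConnectionRefusedError):
        return "dead"   # nothing is listening on the port any more
    if isinstance(reason, (socket.timeout, TimeoutError)):
        return "slow"   # the process is up but its event loop is busy
    return "unknown"


def probe(base_url, timeout=2.0):
    """Poll the engine's /healthz endpoint once and classify the outcome."""
    try:
        with urllib.request.urlopen(base_url + "/healthz", timeout=timeout) as resp:
            return "alive" if resp.status == 200 else "unknown"
    except (urllib.error.URLError, OSError) as exc:
        return classify_probe_error(exc)
```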
@app.get("/internal/engine-state", include_in_schema=False)
async def internal_engine_state():
"""Auth-free engine introspection for the front proxy's router/aggregator.
Engines bind 127.0.0.1 only, so this is not publicly reachable. Returns which
models are resident (for model→engine routing) and this engine's GPU/VRAM (for
cross-engine status aggregation). Kept cheap so it answers even mid-generation.
"""
import os as _os
try:
loaded = list(multi_model_manager.models.keys())
except Exception:
loaded = []
vram = None
try:
import torch
if torch.cuda.is_available():
# Sum across every CUDA device this engine can see — an engine may own
# more than one GPU (e.g. two NVIDIA cards sharding one large model), so
# reporting only device 0 would under-count its VRAM.
n = torch.cuda.device_count()
used = free = total = 0
devs = []
for i in range(n):
f, t = torch.cuda.mem_get_info(i)
used += (t - f); free += f; total += t
devs.append({"index": i, "name": torch.cuda.get_device_name(i),
"free": round(f / 1e9, 2), "total": round(t / 1e9, 2)})
label = (torch.cuda.get_device_name(0) if n == 1
else f"{n}× CUDA")
vram = {"used": round(used / 1e9, 2), "free": round(free / 1e9, 2),
"total": round(total / 1e9, 2), "gpu": label,
"devices": devs, "device_count": n}
except Exception:
vram = None
# Running tasks so the front can show cross-engine activity without needing a
# session on this engine (sessions live only on the primary).
tasks = []
try:
from codai.tasks import task_registry
tasks = [t for t in task_registry.list()
if t.get("status") in ("running", "queued", "paused")]
except Exception:
tasks = []
# This engine's thermal cooldown state, so the front can show WHICH engine is
# cooling (each engine pauses on its own GPUs; CPU pauses everything).
cooling = None
try:
from codai.models import thermal
cs = thermal.get_cooldown_state()
if cs.get("active"):
cooling = {"gpu": cs.get("gpu"), "cpu": cs.get("cpu"),
"message": cs.get("message")}
except Exception:
cooling = None
return {"ok": True, "pid": _os.getpid(), "loaded_models": loaded,
"vram": vram, "tasks": tasks, "cooling": cooling}
@app.get("/favicon.ico", include_in_schema=False)
async def favicon():
if _favicon_path.exists():
......
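
The per-device VRAM summation in `internal_engine_state` above can be isolated into a pure function, which makes the arithmetic easy to check without a GPU. This is a sketch: `aggregate_vram` is a hypothetical helper, and it takes the `(free, total)` byte pairs that `torch.cuda.mem_get_info` returns as plain input instead of calling torch.

```python
def aggregate_vram(mem_info, names):
    """Sum (free_bytes, total_bytes) pairs across devices, reporting decimal GB."""
    used = free = total = 0
    devices = []
    for index, (f, t) in enumerate(mem_info):
        used += t - f
        free += f
        total += t
        devices.append({"index": index, "name": names[index],
                        "free": round(f / 1e9, 2), "total": round(t / 1e9, 2)})
    n = len(mem_info)
    label = names[0] if n == 1 else f"{n}× CUDA"
    return {"used": round(used / 1e9, 2), "free": round(free / 1e9, 2),
            "total": round(total / 1e9, 2), "gpu": label,
            "devices": devices, "device_count": n}


state = aggregate_vram([(6_000_000_000, 8_000_000_000),
                        (10_000_000_000, 24_000_000_000)], ["GPU A", "GPU B"])
print(state["used"], state["free"], state["total"], state["gpu"])
# 16.0 16.0 32.0 2× CUDA
```

Note that dividing by 1e9 yields decimal gigabytes, not GiB, so the figures will read slightly higher than tools that report in GiB.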
......@@ -106,6 +106,27 @@ async def create_embeddings(request: EmbeddingsRequest, http_request: Request =
"""
OpenAI-compatible embeddings endpoint.
"""
# Register a task so embeddings appear in the unified task list, like every
# other model type. Finished on success or error below.
from codai.tasks import task_registry
_title = request.input if isinstance(request.input, str) else "embeddings"
_tid = task_registry.register(
"embedding", title=str(_title)[:80], model=(request.model or "embedding"))
task_registry.start(_tid)
try:
_resp = await _run_embeddings(request, http_request)
task_registry.finish(_tid, "done")
return _resp
except HTTPException:
task_registry.finish(_tid, "error")
raise
except Exception as e:
task_registry.finish(_tid, "error", str(e)[:200])
raise
async def _run_embeddings(request: EmbeddingsRequest, http_request: Request = None):
"""Core embeddings logic; registered as a task by create_embeddings()."""
model_info = await asyncio.to_thread(
multi_model_manager.request_model, request.model, model_type="embedding")
model_name = model_info.get('model_name')
......
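
The register/start/finish bracket that `create_embeddings` adds here could also be expressed as an async context manager. The sketch below is one possible shape, not the project's code: `StubTaskRegistry` is a hypothetical in-memory stand-in for `codai.tasks.task_registry` (only the `register`, `start` and `finish` call shapes are taken from the diff), and unlike the original it records the error text for every exception instead of special-casing `HTTPException`.

```python
import asyncio
import contextlib
import itertools


class StubTaskRegistry:
    """Hypothetical in-memory stand-in for codai.tasks.task_registry."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.tasks = {}

    def register(self, kind, title, model):
        tid = next(self._ids)
        self.tasks[tid] = {"kind": kind, "title": title, "model": model,
                           "status": "queued", "error": None}
        return tid

    def start(self, tid):
        self.tasks[tid]["status"] = "running"

    def finish(self, tid, status, error=None):
        self.tasks[tid].update(status=status, error=error)


@contextlib.asynccontextmanager
async def tracked(registry, kind, title, model):
    """Register a task, mark it running, and finish it as done or error."""
    tid = registry.register(kind, title=title[:80], model=model)
    registry.start(tid)
    try:
        yield tid
    except Exception as exc:
        registry.finish(tid, "error", str(exc)[:200])
        raise
    else:
        registry.finish(tid, "done")


async def demo(registry):
    async with tracked(registry, "embedding", "hello world", "embedding") as tid:
        await asyncio.sleep(0)  # placeholder for the real embeddings call
    return registry.tasks[tid]["status"]


registry = StubTaskRegistry()
print(asyncio.run(demo(registry)))  # done
```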
# CoderAI - OpenAI-compatible API server
# Copyright (C) 2026 Stefy Lanza <stefy@nexlab.net>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
"""Fully-managed Parler-TTS worker.
parler-tts pins an old transformers/tokenizers/huggingface-hub that conflict with
the coderai server's stack, so it can't share this venv. Instead coderai owns the
whole lifecycle here: on first use it bootstraps a dedicated venv (installing
parler-tts), launches ``tools/parler_tts_service.py`` in it as a local HTTP
service, health-checks it, and hands back the URL. The matching
``_RemoteParlerBackend.cleanup()`` calls :func:`stop_service`, so the model
manager's normal eviction tears the process down — no manual setup or config.
"""
import os
import socket
import subprocess
import sys
import threading
import time
from pathlib import Path
_REPO_ROOT = Path(__file__).resolve().parents[2]
_SERVICE_SCRIPT = _REPO_ROOT / "tools" / "parler_tts_service.py"
# Dedicated venv for the (incompatible) parler-tts stack. Created fully isolated
# (no system site-packages), so it carries its own torch and parler's pinned
# transformers without leaking the server's packages in; see _bootstrap_venv().
_VENV_DIR = Path(os.environ.get("CODERAI_PARLER_VENV")
or os.path.expanduser("~/.coderai/parler_venv"))
_lock = threading.RLock()
_services: dict[str, dict] = {} # model_name -> {"proc","port","url"}
_bootstrapped = False
def _venv_python() -> Path:
return _VENV_DIR / ("Scripts" if os.name == "nt" else "bin") / (
"python.exe" if os.name == "nt" else "python")
def _pip_ok(py: Path) -> bool:
try:
return subprocess.run([str(py), "-c", "import parler_tts, soundfile"],
capture_output=True).returncode == 0
except Exception:
return False
def _venv_is_system_site() -> bool:
"""True if the venv was built with --system-site-packages (can't isolate)."""
try:
return "include-system-site-packages = true" in \
(_VENV_DIR / "pyvenv.cfg").read_text().lower()
except Exception:
return False
def _bootstrap_venv() -> Path:
"""Create a fully-isolated venv and install parler-tts (idempotent).
Isolation is the whole point: parler-tts pins an old transformers/tokenizers
that must NOT be shared with — or shadowed by — the server's stack, so the
venv gets its own copy of everything (torch included). Returns its python."""
global _bootstrapped
py = _venv_python()
if _bootstrapped and py.exists():
return py
# A previously-created shared-site venv leaks the server's transformers in;
# rebuild it isolated.
if py.exists() and _venv_is_system_site():
import shutil
print("[parler] rebuilding venv as fully isolated …", flush=True)
shutil.rmtree(_VENV_DIR, ignore_errors=True)
if not _venv_python().exists():
print(f"[parler] creating isolated venv at {_VENV_DIR} …", flush=True)
_VENV_DIR.parent.mkdir(parents=True, exist_ok=True)
subprocess.run([sys.executable, "-m", "venv", str(_VENV_DIR)], check=True)
py = _venv_python()
if not _pip_ok(py):
print("[parler] installing parler-tts + torch into the isolated venv "
"(first run, downloads several GB, this can take a while) …", flush=True)
subprocess.run([str(py), "-m", "pip", "install",
"git+https://github.com/huggingface/parler-tts.git",
"soundfile"], check=True)
if not _pip_ok(py):
raise RuntimeError("parler-tts install did not yield an importable package")
_bootstrapped = True
return py
def _free_port() -> int:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))
port = s.getsockname()[1]
s.close()
return port
def _pump_logs(proc: subprocess.Popen, tail):
for line in proc.stdout:
line = line.rstrip()
if line:
tail.append(line)
print(f"[parler] {line}", flush=True)
def _health_ok(url: str) -> bool:
import requests
try:
r = requests.get(url + "/health", timeout=3)
return r.ok and bool(r.json().get("ok"))
except Exception:
return False
def ensure_service(model_name: str, ready_timeout: float = 1800.0) -> str:
"""Start (or reuse) the worker for ``model_name`` and return its base URL.
First call bootstraps the venv and downloads the model, so the timeout is
generous. Raises RuntimeError if the service never comes up."""
with _lock:
svc = _services.get(model_name)
if svc and svc["proc"].poll() is None and _health_ok(svc["url"]):
return svc["url"]
if svc and svc["proc"].poll() is not None:
_services.pop(model_name, None) # died — restart below
py = _bootstrap_venv()
port = _free_port()
url = f"http://127.0.0.1:{port}"
env = dict(os.environ)
# The worker must use the model already pulled via coderai's HF download
# interface — it never downloads anything itself. Point it at coderai's
# cache and force offline mode, so a missing model fails fast instead of
# silently fetching.
try:
from codai.models.cache import get_hf_hub_cache_dir
hub = get_hf_hub_cache_dir()
env["HF_HUB_CACHE"] = hub
env["HUGGINGFACE_HUB_CACHE"] = hub
except Exception:
pass
env["HF_HUB_OFFLINE"] = "1"
env["TRANSFORMERS_OFFLINE"] = "1"
proc = subprocess.Popen(
[str(py), str(_SERVICE_SCRIPT), "--model", model_name,
"--host", "127.0.0.1", "--port", str(port)],
stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True,
bufsize=1, env=env, cwd=str(_REPO_ROOT),
)
import collections
tail = collections.deque(maxlen=15)
threading.Thread(target=_pump_logs, args=(proc, tail), daemon=True).start()
_services[model_name] = {"proc": proc, "port": port, "url": url}
def _tail_msg():
joined = " | ".join(list(tail)[-5:]).strip()
if "offline" in joined.lower() or ("not" in joined.lower() and "found" in joined.lower()):
return (f". The model isn't in coderai's cache — download "
f"'{model_name}' from the model interface first. ({joined})")
return f". Last output: {joined}" if joined else ""
# Wait (outside the lock) for the service to load the model and answer.
deadline = time.time() + ready_timeout
while time.time() < deadline:
if proc.poll() is not None:
raise RuntimeError(
f"Parler worker exited (code {proc.returncode}) before becoming ready"
+ _tail_msg())
if _health_ok(url):
print(f"[parler] service ready for {model_name} at {url}", flush=True)
return url
time.sleep(2)
stop_service(model_name)
raise RuntimeError(f"Parler worker for {model_name} did not become ready in time"
+ _tail_msg())
def stop_service(model_name: str) -> None:
with _lock:
svc = _services.pop(model_name, None)
if not svc:
return
proc = svc["proc"]
if proc.poll() is None:
try:
proc.terminate()
proc.wait(timeout=10)
except Exception:
pass
if proc.poll() is None:
try:
proc.kill()
except Exception:
pass
print(f"[parler] service for {model_name} stopped", flush=True)
def stop_all() -> None:
for name in list(_services.keys()):
stop_service(name)
import atexit as _atexit
_atexit.register(stop_all)
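
The readiness wait inside `ensure_service` follows a common pattern: poll until healthy, fail fast if the child exits, give up at a deadline. A standalone sketch of that loop, with assumptions stated: `wait_until_ready` is a hypothetical helper, the liveness and health checks are passed in as callables so the loop can be tested with fakes, and it uses `time.monotonic` where the module uses `time.time`.

```python
import time


def wait_until_ready(is_alive, is_healthy, timeout, interval=2.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until is_healthy() is true; fail fast if the process has exited."""
    deadline = clock() + timeout
    while clock() < deadline:
        if not is_alive():
            raise RuntimeError("worker exited before becoming ready")
        if is_healthy():
            return True
        sleep(interval)
    raise TimeoutError("worker did not become ready in time")


# Fake worker that turns healthy on the third check; sleeping is stubbed out.
checks = iter([False, False, True])
print(wait_until_ready(lambda: True, lambda: next(checks),
                       timeout=5, sleep=lambda seconds: None))  # True
```

With a real subprocess, `is_alive` would be `lambda: proc.poll() is None` and `is_healthy` a call to the `/health` endpoint, as in the module above.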
......@@ -45,6 +45,31 @@ global_args = None
global_file_path = None


def _spatial_task(title: str):
    """Decorator: register a spatial/3D endpoint in the unified task list so
    every model type is visible there. Finishes done/error around the call."""
    import functools

    def deco(fn):
        @functools.wraps(fn)
        async def wrap(*args, **kwargs):
            from codai.tasks import task_registry
            tid = task_registry.register("spatial", title=title, model="spatial")
            task_registry.start(tid)
            try:
                result = await fn(*args, **kwargs)
                task_registry.finish(tid, "done")
                return result
            except HTTPException:
                task_registry.finish(tid, "error")
                raise
            except Exception as e:
                task_registry.finish(tid, "error", str(e)[:200])
                raise
        return wrap
    return deco


def set_global_args(args):
    global global_args
    global_args = args
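
The register / start / finish bracket used by `_spatial_task` can be exercised in isolation. The sketch below is an illustration only: `StubRegistry` and `tracked` are hypothetical stand-ins, since the real `codai.tasks.task_registry` is not shown in this diff. Unlike the decorator above, it does not treat `HTTPException` separately.

```python
import asyncio
import functools


class StubRegistry:
    """Hypothetical in-memory stand-in for codai.tasks.task_registry."""

    def __init__(self):
        self.tasks = {}

    def register(self, kind, title, model):
        tid = len(self.tasks) + 1
        self.tasks[tid] = {"kind": kind, "title": title, "model": model,
                           "status": "registered", "error": None}
        return tid

    def start(self, tid):
        self.tasks[tid]["status"] = "running"

    def finish(self, tid, status, error=None):
        self.tasks[tid].update(status=status, error=error)


registry = StubRegistry()


def tracked(kind, title):
    """Wrap an async callable so each call shows up as a task."""
    def deco(fn):
        @functools.wraps(fn)
        async def wrap(*args, **kwargs):
            tid = registry.register(kind, title=title, model=kind)
            registry.start(tid)
            try:
                result = await fn(*args, **kwargs)
            except Exception as exc:
                # Truncate the message, as the diff does with str(e)[:200].
                registry.finish(tid, "error", str(exc)[:200])
                raise
            registry.finish(tid, "done")
            return result
        return wrap
    return deco


@tracked("spatial", "Image to 3D")
async def convert_ok():
    return "glb-bytes"


@tracked("spatial", "3D to image")
async def convert_fails():
    raise ValueError("no mesh")
```

Because `functools.wraps` preserves the wrapped function's metadata, a framework that inspects the handler signature still sees the original one.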
......@@ -500,6 +525,7 @@ class ImageTo3DRequest(BaseModel):
@router.post("/v1/images/to3d", summary="Image to 3D model")
@_spatial_task("Image → 3D")
async def image_to_3d(request: ImageTo3DRequest, http_request: Request = None):
    """Convert a 2D image to a 3D representation.
......@@ -568,6 +594,7 @@ class ImageFrom3DRequest(BaseModel):
@router.post("/v1/images/from3d", summary="Render a 3D model to an image")
@_spatial_task("3D → image")
async def image_from_3d(request: ImageFrom3DRequest, http_request: Request = None):
    """Render a 3D model (GLB/OBJ) to a 2D PNG image from a specified camera angle."""
    raw = _decode_b64(request.model_data)
......@@ -601,6 +628,7 @@ class VideoTo3DRequest(BaseModel):
@router.post("/v1/video/to3d", summary="Video to 3D model")
@_spatial_task("Video → 3D")
async def video_to_3d(request: VideoTo3DRequest, http_request: Request = None):
    """Convert a 2D video to a 3D video frame-by-frame.
......@@ -642,6 +670,7 @@ class VideoFrom3DRequest(BaseModel):
@router.post("/v1/video/from3d", summary="Render a 3D model to a video")
@_spatial_task("3D → video")
async def video_from_3d(request: VideoFrom3DRequest, http_request: Request = None):
    """Render a 3D model as a 360° turntable video."""
    raw = _decode_b64(request.model_data)
......@@ -675,6 +704,7 @@ class Generate3DRequest(BaseModel):
@router.post("/v1/3d/generate", summary="Generate a 3D model from a prompt")
@_spatial_task("Generate 3D")
async def generate_3d(request: Generate3DRequest, http_request: Request = None):
    """Generate a 3D model (GLB) from a text prompt and/or an image.
......
This diff is collapsed.
......@@ -135,6 +135,32 @@ async def create_transcription(
    if len(file_content) > _MAX_AUDIO_BYTES:
        raise HTTPException(status_code=413, detail="Audio file too large (max 100 MB)")

    # Register a task so transcription appears in the unified task list, like
    # every other model type. Finished on success or error below.
    from codai.tasks import task_registry
    _tid = task_registry.register(
        "transcription",
        title=(file.filename or "audio")[:80],
        model=model or "",
    )
    task_registry.start(_tid)
    try:
        _resp = await _run_transcription(
            file_content, model, language, prompt, response_format, temperature, file)
        task_registry.finish(_tid, "done")
        return _resp
    except HTTPException:
        task_registry.finish(_tid, "error")
        raise
    except Exception as e:
        task_registry.finish(_tid, "error", str(e)[:200])
        raise


async def _run_transcription(
    file_content: bytes, model: str, language, prompt, response_format, temperature, file
):
    """Core transcription logic; registered as a task by create_transcription()."""
    # Check if the requested model maps to a configured whisper-server instance first.
    # Try alias round-robin resolution before direct ID lookup.
    whisper_model_id = multi_model_manager.resolve_whisper_alias_model_id(model)
......
This diff is collapsed.
# CoderAI - OpenAI-compatible API server
# Copyright (C) 2026 Stefy Lanza <stefy@nexlab.net>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
"""ds4 (DeepSeek V4) proxy backend.
ds4-server already speaks the OpenAI HTTP API, so this backend is a thin proxy: it
forwards chat/completion requests to the managed ``ds4-server`` subprocess (whose
lifecycle is owned by :mod:`codai.api.ds4_worker`) and adapts the responses to the
:class:`~codai.backends.base.ModelBackend` contract the model manager expects.
Tool/think parsing is handled the same way as the other backends — by
``ModelParserAdapter`` over the returned text — so tools are not forwarded to
ds4-server; the text-level ``DeepSeekParser`` extracts ``<think>`` and tool calls.
"""
import asyncio
import threading
from typing import AsyncGenerator, Dict, List, Optional
from codai.backends.base import ModelBackend


class Ds4Backend(ModelBackend):
    """Proxy backend that routes generation to a managed ds4-server."""

    def __init__(self, cfg=None):
        # cfg is a codai.config.Ds4Config. When omitted, resolve the active one.
        if cfg is None:
            from codai.config import Ds4Config
            cfg = Ds4Config()
        self._cfg = cfg
        self._model_id = getattr(cfg, "model_id", "deepseek-v4") or "deepseek-v4"
        self._url: Optional[str] = None
        self._ctx = int(getattr(cfg, "ctx", 100000) or 100000)
        self._last_usage: Dict = {}

    # ------------------------------------------------------------------ #
    # lifecycle
    # ------------------------------------------------------------------ #
    def load_model(self, model_name: str, **kwargs) -> None:
        from codai.api import ds4_worker
        if model_name:
            self._model_id = model_name
        self._url = ds4_worker.ensure_service(self._cfg)

    def get_model_name(self) -> str:
        return self._model_id

    def get_context_size(self) -> int:
        return self._ctx

    def get_last_usage(self) -> dict:
        return dict(self._last_usage)

    def cleanup(self) -> None:
        from codai.api import ds4_worker
        ds4_worker.stop_service(getattr(self._cfg, "model_id", self._model_id))
        self._url = None

    # ------------------------------------------------------------------ #
    # helpers
    # ------------------------------------------------------------------ #
    def _base(self) -> str:
        if not self._url:
            raise RuntimeError("ds4 service not started")
        return self._url

    def _store_usage(self, usage: dict) -> None:
        if usage:
            self._last_usage = {
                "prompt_tokens": usage.get("prompt_tokens", 0),
                "completion_tokens": usage.get("completion_tokens", 0),
                "total_tokens": usage.get("total_tokens", 0),
            }

    def format_messages(self, messages) -> str:
        # ds4-server applies DeepSeek V4's own chat template server-side; this is only
        # used by callers that need a flat prompt string.
        parts = []
        for m in messages:
            role = m.get("role") if isinstance(m, dict) else getattr(m, "role", "")
            content = m.get("content") if isinstance(m, dict) else getattr(m, "content", "")
            parts.append(f"{role}: {content}")
        return "\n".join(parts)

    def _chat_payload(self, messages, max_tokens, temperature, top_p, stop, stream):
        payload = {
            "model": self._model_id,
            "messages": messages,
            "temperature": temperature,
            "top_p": top_p,
            "stream": stream,
        }
        if max_tokens:
            payload["max_tokens"] = max_tokens
        if stop:
            payload["stop"] = stop
        return payload

    # ------------------------------------------------------------------ #
    # chat-level generation (preferred by the manager)
    # ------------------------------------------------------------------ #
    def generate_chat(self, messages: List[Dict], max_tokens=None, temperature=0.7,
                      top_p=1.0, stop=None, tools=None, response_format=None):
        import requests
        payload = self._chat_payload(messages, max_tokens, temperature, top_p, stop, False)
        if response_format and response_format.get("type") == "json_object":
            payload["response_format"] = {"type": "json_object"}
        r = requests.post(self._base() + "/v1/chat/completions", json=payload, timeout=3600)
        r.raise_for_status()
        data = r.json()
        self._store_usage(data.get("usage", {}))
        return data["choices"][0]["message"].get("content") or ""

    async def generate_chat_stream(self, messages: List[Dict], max_tokens=None,
                                   temperature=0.7, top_p=1.0, stop=None, tools=None,
                                   response_format=None) -> AsyncGenerator[str, None]:
        payload = self._chat_payload(messages, max_tokens, temperature, top_p, stop, True)
        async for chunk in self._stream(self._base() + "/v1/chat/completions", payload,
                                        delta_key="delta"):
            yield chunk

    # ------------------------------------------------------------------ #
    # plain completion (fallback path)
    # ------------------------------------------------------------------ #
    def generate(self, prompt: str, max_tokens=None, temperature: float = 0.7,
                 top_p: float = 1.0, stop=None, repeat_penalty: float = 1.0,
                 presence_penalty: float = 0.0, frequency_penalty: float = 0.0) -> str:
        return self.generate_chat([{"role": "user", "content": prompt}],
                                  max_tokens, temperature, top_p, stop)

    async def generate_stream(self, prompt: str, max_tokens=None, temperature: float = 0.7,
                              top_p: float = 1.0, stop=None, repeat_penalty: float = 1.0,
                              presence_penalty: float = 0.0,
                              frequency_penalty: float = 0.0) -> AsyncGenerator[str, None]:
        async for chunk in self.generate_chat_stream(
                [{"role": "user", "content": prompt}], max_tokens, temperature, top_p, stop):
            yield chunk

    # ------------------------------------------------------------------ #
    # SSE streaming: iterate the blocking requests stream on a worker thread
    # and hand chunks to the event loop through an asyncio.Queue.
    # ------------------------------------------------------------------ #
    async def _stream(self, url: str, payload: dict, delta_key: str
                      ) -> AsyncGenerator[str, None]:
        import json
        # We are inside a coroutine, so the running loop is the one to hand chunks to.
        loop = asyncio.get_running_loop()
        queue: asyncio.Queue = asyncio.Queue()
        _SENTINEL = object()

        def _worker():
            import requests
            try:
                with requests.post(url, json=payload, stream=True, timeout=3600) as r:
                    r.raise_for_status()
                    for raw in r.iter_lines(decode_unicode=True):
                        if not raw or not raw.startswith("data:"):
                            continue
                        data = raw[len("data:"):].strip()
                        if data == "[DONE]":
                            break
                        try:
                            obj = json.loads(data)
                        except ValueError:
                            continue
                        choice = (obj.get("choices") or [{}])[0]
                        text = (choice.get(delta_key) or {}).get("content") or ""
                        if text:
                            loop.call_soon_threadsafe(queue.put_nowait, text)
                        if obj.get("usage"):
                            self._store_usage(obj["usage"])
                        if choice.get("finish_reason"):
                            break
            except Exception as exc:  # surface to the consumer
                loop.call_soon_threadsafe(queue.put_nowait, exc)
            finally:
                loop.call_soon_threadsafe(queue.put_nowait, _SENTINEL)

        threading.Thread(target=_worker, daemon=True).start()
        while True:
            item = await queue.get()
            if item is _SENTINEL:
                break
            if isinstance(item, Exception):
                raise item
            yield item
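
The thread-to-event-loop hand-off used by `_stream` can be tried without a server. The sketch below, under the assumption that any blocking iterable can stand in for the `requests` stream, shows the same technique: a worker thread pushes items through `loop.call_soon_threadsafe` into an `asyncio.Queue`, a sentinel marks the end, and exceptions are forwarded to the consumer. `bridge` and `collect` are hypothetical names, not part of the commit.

```python
import asyncio
import threading
from typing import AsyncGenerator, Callable, Iterable, List

_SENTINEL = object()


async def bridge(make_iter: Callable[[], Iterable[str]]) -> AsyncGenerator[str, None]:
    """Consume a blocking iterable on a worker thread, yield items asynchronously."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def _worker() -> None:
        try:
            for item in make_iter():
                # asyncio.Queue is not thread-safe; schedule the put on the loop.
                loop.call_soon_threadsafe(queue.put_nowait, item)
        except Exception as exc:
            loop.call_soon_threadsafe(queue.put_nowait, exc)
        finally:
            loop.call_soon_threadsafe(queue.put_nowait, _SENTINEL)

    threading.Thread(target=_worker, daemon=True).start()
    while True:
        item = await queue.get()
        if item is _SENTINEL:
            break
        if isinstance(item, Exception):
            raise item
        yield item


async def collect(make_iter: Callable[[], Iterable[str]]) -> List[str]:
    return [chunk async for chunk in bridge(make_iter)]
```

The queue here is unbounded, as in the diff, so a slow consumer does not apply backpressure to the worker thread.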
This diff is collapsed.
......@@ -49,7 +49,13 @@ def build_hardware_summary() -> Dict[str, Any]:
    total_vram_mb = 0
    available_vram_mb = 0

    # Only use torch if it's ALREADY loaded (i.e. we're in an engine). Never import
    # it here — the front is torch-free and must stay that way (importing torch in
    # the front is heavy and would initialise CUDA in the wrong process).
    import sys as _sys
    try:
        if "torch" not in _sys.modules:
            raise ImportError("torch not loaded (front) — using torch-free path")
        import torch
        if torch.cuda.is_available():
......@@ -76,6 +82,23 @@ def build_hardware_summary() -> Dict[str, Any]:
    except Exception:
        pass

    # Torch-free path (e.g. the front, which imports no torch): enumerate every
    # physical card via nvidia-smi + sysfs so VRAM is reported for the whole node.
    if not gpus:
        try:
            from codai.frontproxy.gpu_detect import gpu_stats
            for c in gpu_stats():
                total_mb = int(round((c.get("mem_total") or 0) * 1024))
                used_mb = int(round((c.get("mem_used") or 0) * 1024))
                if total_mb <= 0:
                    continue
                gpus.append({"name": c.get("name") or c.get("vendor"),
                             "total_vram_mb": total_mb})
                total_vram_mb += total_mb
                available_vram_mb += max(0, total_mb - used_mb)
        except Exception:
            pass

    if not gpus:
        for total_path in sorted(glob.glob("/sys/class/drm/card*/device/mem_info_vram_total")):
            used_path = total_path.replace("vram_total", "vram_used")
......
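
The aggregation in the torch-free path can be checked with plain dictionaries. The sketch below assumes `gpu_stats()` yields per-card dicts with `mem_total` and `mem_used` in GiB, which the multiplication by 1024 in the diff suggests but does not prove. `summarise_cards` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterable, List, Tuple


def summarise_cards(cards: Iterable[Dict[str, Any]]) -> Tuple[List[Dict[str, Any]], int, int]:
    """Return (gpus, total_vram_mb, available_vram_mb) for the given cards."""
    gpus: List[Dict[str, Any]] = []
    total_vram_mb = 0
    available_vram_mb = 0
    for c in cards:
        # Missing or None values count as 0; sizes are assumed to be in GiB.
        total_mb = int(round((c.get("mem_total") or 0) * 1024))
        used_mb = int(round((c.get("mem_used") or 0) * 1024))
        if total_mb <= 0:
            continue  # skip cards that report no memory
        gpus.append({"name": c.get("name") or c.get("vendor"),
                     "total_vram_mb": total_mb})
        total_vram_mb += total_mb
        available_vram_mb += max(0, total_mb - used_mb)
    return gpus, total_vram_mb, available_vram_mb


cards = [
    {"name": "Card A", "mem_total": 24.0, "mem_used": 4.0},
    {"name": None, "vendor": "amd", "mem_total": 8, "mem_used": None},
    {"name": "No VRAM", "mem_total": 0},
]
gpus, total_mb, free_mb = summarise_cards(cards)
# total_mb == 32768, free_mb == 28672, and the zero-VRAM card is skipped.
```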
......@@ -60,8 +60,13 @@ def _is_text_response(content_type: str | None) -> bool:
     )
 
 
-async def execute_broker_request(app, envelope):
-    """Validate and execute a broker request envelope."""
+async def execute_broker_request(app, envelope, executor=None):
+    """Validate and execute a broker request envelope.
+
+    ``executor`` is an ``async (method, path, headers, query, body) -> {status_code,
+    headers, body}`` callable. When omitted the request is run in-process against
+    ``app`` via the ASGI bridge (engine / single-process mode). The front passes its
+    own executor that proxies to the right engine over HTTP."""
     logger.debug(
         "broker dispatch → op=%s request_id=%s path=%r method=%r stream=%s",
......@@ -136,14 +141,20 @@ async def execute_broker_request(app, envelope):
         headers["content-type"] = envelope.content_type
 
     started_at = perf_counter()
-    response = await execute_internal_request(
-        app,
-        method=envelope.method,
-        path=envelope.path,
-        headers=headers,
-        query=envelope.query,
-        body=body,
-    )
+    if executor is not None:
+        response = await executor(
+            method=envelope.method, path=envelope.path, headers=headers,
+            query=envelope.query, body=body,
+        )
+    else:
+        response = await execute_internal_request(
+            app,
+            method=envelope.method,
+            path=envelope.path,
+            headers=headers,
+            query=envelope.query,
+            body=body,
+        )
     elapsed_ms = round((perf_counter() - started_at) * 1000, 3)
     response_headers = response["headers"]
......
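
The pluggable `executor` described in the docstring above can be sketched as follows. Everything here is illustrative: `run_in_process`, `proxy_to_engine` and `dispatch` are hypothetical stand-ins, and the envelope is modelled as a plain dict rather than the broker's real envelope type. Only the keyword-argument signature and the `{status_code, headers, body}` result shape come from the diff.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Optional

Response = Dict[str, Any]
Executor = Callable[..., Awaitable[Response]]


async def run_in_process(*, method: str, path: str, headers: Dict[str, str],
                         query: str, body: bytes) -> Response:
    """Hypothetical stand-in for the in-process ASGI bridge."""
    return {"status_code": 200, "headers": {"x-served-by": "in-process"}, "body": body}


async def proxy_to_engine(*, method: str, path: str, headers: Dict[str, str],
                          query: str, body: bytes) -> Response:
    """Hypothetical front-side executor; a real one would issue an HTTP call."""
    return {"status_code": 200, "headers": {"x-served-by": "engine"}, "body": body}


async def dispatch(envelope: Dict[str, Any],
                   executor: Optional[Executor] = None) -> Response:
    # Same selection rule as in the diff: use the injected executor if present,
    # otherwise fall back to the in-process path.
    run = executor if executor is not None else run_in_process
    return await run(method=envelope["method"], path=envelope["path"],
                     headers=envelope.get("headers", {}),
                     query=envelope.get("query", ""),
                     body=envelope.get("body", b""))


envelope = {"method": "POST", "path": "/v1/chat/completions", "body": b"{}"}
local = asyncio.run(dispatch(envelope))
proxied = asyncio.run(dispatch(envelope, executor=proxy_to_engine))
```

Injecting the executor keeps validation and response shaping in one place while letting the caller decide where the request actually runs.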
......@@ -224,6 +224,13 @@ configuration directory (--config DIR, default: OS-specific CoderAI directory).
        action="store_true",
        help="Dump model output: raw output, parsed output, and litellm debug info",
    )
    parser.add_argument(
        "--debug-requests",
        action="store_true",
        help="Log the full request/response payloads exchanged with API clients "
             "(opencode, etc.): incoming messages + tools and the outgoing "
             "content/tool_calls. Use to diagnose agentic tool-call loops.",
    )
    parser.add_argument(
        "--list-cached-models",
        action="store_true",
......@@ -278,4 +285,39 @@ configuration directory (--config DIR, default: OS-specific CoderAI directory).
        help="Ignore any existing pipeline cache and rebuild it from scratch this "
             "run (use after changing a model's quantization/precision config).",
    )
    # ─── Frontend/engine split ───────────────────────────────────────────────
    parser.add_argument(
        "--single-process",
        action="store_true",
        help="Run the legacy single-process server (UI/API and all model work in "
             "one process). Default boots a front proxy + supervised engine "
             "subprocess(es) so the web UI stays responsive during model work.",
    )
    parser.add_argument(
        "--engine-only",
        action="store_true",
        help="Run this process as an engine (binds an internal localhost port, no "
             "front proxy). Normally launched automatically by the front; not "
             "intended to be run by hand.",
    )
    parser.add_argument(
        "--internal-port",
        type=int,
        default=None,
        help="Internal port for --engine-only mode (the front assigns one per engine).",
    )
    parser.add_argument(
        "--debug-engine",
        action="store_true",
        help="General engine debugging in the front/engine split (engine lifecycle, "
             "spawn details, health transitions). Does NOT include the internal "
             "HTTP access log — use --debug-engine-web for that.",
    )
    parser.add_argument(
        "--debug-engine-web",
        action="store_true",
        help="Show the internal front↔engine HTTP requests in an engine's access log "
             "(proxied calls, /internal/engine-state, /healthz, …). Suppressed by "
             "default since every engine only ever serves internal front traffic.",
    )
    return parser.parse_args()
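
How the three mode flags might be turned into a run mode is sketched below. This is an assumption-laden illustration: `resolve_mode` is hypothetical, and both validation rules (the two modes being mutually exclusive, and `--engine-only` requiring `--internal-port`) are guesses that the diff does not confirm. Only the flag names and the "front by default" behaviour come from the help texts above.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--single-process", action="store_true")
    parser.add_argument("--engine-only", action="store_true")
    parser.add_argument("--internal-port", type=int, default=None)
    return parser


def resolve_mode(argv: List[str]) -> str:
    """Map the flags to 'front', 'single-process' or 'engine' (hypothetical rules)."""
    args = build_parser().parse_args(argv)
    if args.single_process and args.engine_only:
        # Assumed rule: the two modes exclude each other.
        raise ValueError("--single-process and --engine-only are mutually exclusive")
    if args.engine_only:
        if args.internal_port is None:
            # Assumed rule: an engine needs the port assigned by the front.
            raise ValueError("--engine-only requires --internal-port")
        return "engine"
    if args.single_process:
        return "single-process"
    return "front"  # default per the help text: front proxy plus supervised engines
```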
This diff is collapsed.
# CoderAI - OpenAI-compatible API server
# Copyright (C) 2026 Stefy Lanza <stefy@nexlab.net>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
"""Front proxy package: always-responsive web/API front + supervised engines.
See ``docs/frontend-engine-split.md`` and ``docs/process-isolation-plans.md``.
"""
from codai.frontproxy.app import run_front, build_app
__all__ = ["run_front", "build_app"]
This diff is collapsed.
......@@ -21,6 +21,7 @@ from threading import Lock
from typing import List, Optional
import json
import os
import re
import time
......@@ -179,11 +180,15 @@ def detect_model_capabilities(model_name: str) -> ModelCapabilities:
         return caps
 
     # ── Image: upscaling (checked before general SD rule to catch SD-family upscalers) ──
-    if any(x in n for x in ['real-esrgan', 'esrgan', 'swinir', 'edsr',
-                            'bsrgan', 'hat-', 'dat-',
+    # 'hat-'/'dat-' are short, ambiguous tokens (e.g. they appear inside
+    # "chat-", "update-"); require a word boundary before them so a text "chat"
+    # model isn't mistaken for the HAT/DAT super-resolution checkpoints.
+    if (any(x in n for x in ['real-esrgan', 'esrgan', 'swinir', 'edsr',
+                             'bsrgan',
                             'x2-upscaler', 'x4-upscaler', 'x2_upscaler', 'x4_upscaler',
                             'latent-upscaler', 'latent_upscaler',
-                            'ldm-super-resolution', 'rcan-', 'sr3-']):
+                             'ldm-super-resolution', 'rcan-', 'sr3-'])
+            or re.search(r'\b[hd]at-', n)):
         caps.image_upscaling = True
         caps.image_to_image = True
         return caps
......
This diff is collapsed.
python tools/video_editor.py --no-browser --host 0.0.0.0 --media-dir tools/coderai_media --session
tools/gen_township_fighters.py -c township_output/township_config.json
This diff is collapsed.