build/launcher: --versioned image tags, --host bind, libcuda-for-vulkan docs

build_oci_image.sh:
- --versioned derives a deterministic tag from codai.__version__ + build mode +
  GPU: local-venv -> coderai:full_<gpu>_<version>, from-scratch ->
  coderai:base_<gpu>_<version> (overrides -t/--tag).
- --gpu all|nvidia|vulkan (default all) sets the <gpu> token. It is a label
  only: the image always bundles both CUDA and Vulkan.

run_oci.sh:
- --host ADDR binds the published port to a specific interface
  (-p ADDR:PORT:8776); default unchanged (all interfaces). CODERAI_HOST stays
  0.0.0.0 in-container. Banner URL reflects the bind host.

Docs (AI.PROMPT, dist-bundle README.md/.txt, README-RUN.txt):
- Document --versioned/--gpu, additive --nvidia/--vulkan/--all, --vulkan
  auto-libcuda + --with-libcuda, graceful llama-cpp degradation, --host, and
  the new uninstall.sh + confirm gates.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Commit d669c797 authored Jun 21, 2026 by Stefy Lanza (nextime / spora )
6 changed files
--- a/AI.PROMPT
+++ b/AI.PROMPT
@@ -311,15 +311,25 @@ Full build (slow, ~15 min — rebuilds the bundle):
  packaging/linux/build_oci_image.sh                      # tags coderai:dist
  - By DEFAULT it then exports the image and assembles the final distribution
    bundle (dist/coderai-docker-dist.tar). Pass --no-dist to just build.
+  - --from-scratch (default) builds from pinned packages + native wheels;
+    --venv PATH / --from-venv build from a local venv (e.g. ./venv_all).
+  - --versioned derives a deterministic tag from codai.__version__ + build mode +
+    GPU: local-venv -> coderai:full_<gpu>_<version>, from-scratch ->
+    coderai:base_<gpu>_<version> (overrides -t/--tag). --gpu all|nvidia|vulkan
+    (default all) sets <gpu>; it is a LABEL only — the image always bundles both
+    CUDA and Vulkan. e.g. `--venv ./venv_all --versioned` -> coderai:full_all_0.1.0.
 Distribution bundle (packaging/linux/make_dist_bundle.sh → dist/coderai-docker-dist.tar):
-  Expands to coderai-docker-dist/{install.sh, coderai-docker, coderai-dist.tar.gz,
+  Expands to coderai-docker-dist/{install.sh, uninstall.sh, coderai-docker,
-  README.txt}. Staged UNDER dist/ (not /tmp — too small) and the 12 GB image
+  coderai-dist.tar.gz, README.txt/.md}. Staged UNDER dist/ (not /tmp — too small)
-  tarball is HARDLINKED into the stage, not copied. install.sh: `docker load`s the
+  and the 12 GB image tarball is HARDLINKED into the stage, not copied. install.sh:
-  image (sudo-fallback for daemon access), then installs the coderai-docker runner
+  `docker load`s the image (sudo-fallback for daemon access), then installs the
-  to /usr/local/bin (root) or ~/.local/usr/bin (user, added to ~/.bashrc PATH if
+  coderai-docker runner to /usr/local/bin (root) or ~/.local/usr/bin (user, added
-  missing). The runner is run_oci.sh with its default image tag pinned (sed) to
+  to ~/.bashrc PATH if missing). uninstall.sh reverses it (removes the runner from
-  whatever was shipped, so it matches the loaded tag.
+  both locations + optionally the image; --keep-image). Both print what they'll do
+  and confirm before proceeding, bypassable with --yes/-y. The runner is run_oci.sh
+  with its default image tag pinned (sed) to whatever was shipped, so it matches
+  the loaded tag.
 Smoke test (no weights, checks services + every bundled binary):
  DOCKER="sudo docker" GPU="--gpus all" PORT=18082 \
    packaging/linux/smoke_test_services.sh coderai:dist
@@ -327,6 +337,15 @@ Smoke test (no weights, checks services + every bundled binary):
 Run against your LIVE local config + data (no rebuild — pure bind-mounts):
  packaging/linux/run_oci.sh --nvidia --local \
    --map /AI/guffcache --map /AI/huggingface --map /AI/offloads
+  - GPU backends are ADDITIVE: --cpu/--nvidia/--vulkan combine (last-one no longer
+    wins). --nvidia --vulkan enables both (maps libcuda via --gpus all AND
+    /dev/dri); --all = nvidia+vulkan. The bundled llama-cpp is a CUDA build, so
+    --vulkan auto bind-mounts the host's libcuda.so.1 (override with
+    --with-libcuda[=PATH]); a missing libcuda now degrades gracefully (server
+    starts, Vulkan/GGUF backend just unavailable) instead of crash-looping.
+  - -p/--port PORT sets the host port; --host ADDR binds the published port to a
+    specific interface (e.g. 127.0.0.1 localhost-only; default = all interfaces).
+    CODERAI_HOST stays 0.0.0.0 so the server listens on all interfaces in-container.
  - The image launcher reads config from /config/coderai and runs
    `coderai --config /config/coderai`, rewriting server.host/port in config.json.
  - `--local` (= --config-dir ~/.coderai) copies ONLY the *.json config files to

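The additive backend rule described in the AI.PROMPT hunk above can be sketched as a small flag-to-argument mapping. This is a minimal illustration, not the code in run_oci.sh: `gpu_args` is a hypothetical helper, and it only covers the `--gpus all` and `--device /dev/dri` arguments named in the docs (the real runner also handles the libcuda bind-mount for `--vulkan`).

```sh
# Hypothetical helper: turn additive backend flags into docker run arguments.
# Order of the flags does not matter; repeated flags are harmless.
gpu_args() {
  nvidia=0
  vulkan=0
  for flag in "$@"; do
    case "$flag" in
      --nvidia) nvidia=1 ;;
      --vulkan) vulkan=1 ;;
      --all)    nvidia=1; vulkan=1 ;;   # shorthand for nvidia + vulkan
      --cpu)    ;;                      # contributes no device arguments
      *) echo "gpu_args: unknown flag '$flag'" >&2; return 2 ;;
    esac
  done
  out=""
  if [ "$nvidia" = 1 ]; then out="--gpus all"; fi
  if [ "$vulkan" = 1 ]; then out="${out:+$out }--device /dev/dri"; fi
  printf '%s\n' "$out"
}

gpu_args --nvidia --vulkan   # --gpus all --device /dev/dri
gpu_args --all               # --gpus all --device /dev/dri
```

Because each flag only sets a switch, `--vulkan --nvidia` and `--nvidia --vulkan` give the same result, which is the behavior the "last-one no longer wins" note describes.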
--- a/packaging/linux/README-RUN.txt
+++ b/packaging/linux/README-RUN.txt
@@ -56,7 +56,12 @@ Basic run (NVIDIA):
    -v "$PWD/coderai-cache:/cache" \
    coderai:local
-AMD/Intel Vulkan: replace `--gpus all` with `--device /dev/dri`.
+AMD/Intel Vulkan: replace `--gpus all` with `--device /dev/dri`. NOTE: the bundled
+  llama-cpp is a CUDA build, so GGUF needs libcuda.so.1 even under Vulkan — on a
+  host with the NVIDIA driver add e.g.
+    -v /usr/lib/x86_64-linux-gnu/libcuda.so.1:/usr/lib/x86_64-linux-gnu/libcuda.so.1:ro
+  (the coderai-docker runner's --vulkan does this automatically). Without it the
+  server still starts; only the Vulkan/GGUF text backend is unavailable.
 CPU only: drop the GPU flag entirely.
 Run as container-root instead: just omit the `--user` line (see "Running as a
 non-root user" below for rootless / userns-remap alternatives).

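For a manual `docker run`, the libcuda path in the note above does not have to be hard-coded; it can be looked up from the linker cache. The sketch below assumes the usual `ldconfig -p` line format (`libcuda.so.1 (libc6,x86-64) => /path`). `find_libcuda` is a hypothetical helper, and mounting the library at the same path inside the container is an assumption carried over from the example in the README.

```sh
# Hypothetical helper: read `ldconfig -p` output on stdin and print the first
# libcuda.so.1 path whose entry mentions the given architecture (default x86-64).
# Exits non-zero when no matching entry is found.
find_libcuda() {
  arch=${1:-x86-64}
  awk -v arch="$arch" '
    $1 == "libcuda.so.1" && index($0, arch) { print $NF; found = 1; exit }
    END { exit !found }
  '
}

# Print the bind-mount argument if the host has the library, else a warning.
if command -v ldconfig >/dev/null 2>&1 && path=$(ldconfig -p | find_libcuda); then
  echo "-v $path:$path:ro"
else
  echo "libcuda.so.1 not found; the Vulkan/GGUF text backend would be unavailable" >&2
fi
```

Filtering on the architecture string skips a 32-bit `libcuda.so.1` that some driver installs also register.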
--- a/packaging/linux/build_oci_image.sh
+++ b/packaging/linux/build_oci_image.sh
@@ -37,6 +37,12 @@ DS4_DIR="${CODERAI_DS4_DIR:-$HOME/.coderai/ds4}"
 # bundle (image tarball + install.sh + coderai-docker runner). Disable with
 # --no-dist (just builds the image).
 MAKE_DIST=1
+# --versioned auto-derives a structured image tag from the app version + build
+# mode + target GPU: full_<gpu>_<version> for a local-venv build, base_<gpu>_<version>
+# for a from-scratch build (e.g. coderai:full_all_0.1.0). GPU_SEL is a label only —
+# the image bundles CUDA + Vulkan regardless — so pick what you intend to ship.
+VERSIONED=0
+GPU_SEL="all"   # all | nvidia | vulkan
 usage() {
  cat <<'EOF'
@@ -62,6 +68,13 @@ Options:
  --no-parler             Do not bundle the Parler-TTS venv overlay.
  --no-tools              Do not bundle the lip-sync (wav2lip/sadtalker) venvs or ds4.
  -t, --tag TAG           Image tag to create (default: coderai:local or OCI_IMAGE from versions.env).
+  --versioned             Auto-derive the tag from the app version + build mode + GPU:
+                            local-venv build -> coderai:full_<gpu>_<version>
+                            from-scratch     -> coderai:base_<gpu>_<version>
+                          (overrides -t/--tag and the default). <version> comes from
+                          codai.__version__; <gpu> from --gpu (default all).
+  --gpu all|nvidia|vulkan Target-GPU label used by --versioned (default: all). Label
+                          only — the image always bundles both CUDA and Vulkan.
  --no-dist               Just build the image; skip exporting + bundling for distribution.
  --dist                  Force building the distribution bundle (default ON).
  -h, --help              Show this help.
@@ -147,6 +160,21 @@ while [[ $# -gt 0 ]]; do
      IMAGE_TAG="$2"
      shift 2
      ;;
+    --versioned)
+      VERSIONED=1
+      shift
+      ;;
+    --gpu)
+      if [[ $# -lt 2 ]]; then
+        echo "Error: --gpu requires one of: all nvidia vulkan" >&2
+        exit 2
+      fi
+      case "$2" in
+        all|nvidia|vulkan) GPU_SEL="$2" ;;
+        *) echo "Error: --gpu must be one of: all nvidia vulkan (got '$2')" >&2; exit 2 ;;
+      esac
+      shift 2
+      ;;
    --no-dist)
      MAKE_DIST=0
      shift
@@ -175,6 +203,26 @@ while [[ $# -gt 0 ]]; do
  esac
 done
+# --versioned: derive coderai:<prefix>_<gpu>_<version>. prefix=full for a local
+# venv build (the running stack), base for a from-scratch build. Overrides any
+# -t/--tag or positional tag, since the whole point is a deterministic name.
+if [[ "$VERSIONED" == "1" ]]; then
+  APP_VERSION="$(cd "$ROOT_DIR" && python3 - <<'PY'
+import re, pathlib, sys
+txt = pathlib.Path("codai/__init__.py").read_text()
+m = re.search(r'__version__\s*=\s*["\']([^"\']+)["\']', txt)
+print(m.group(1) if m else "")
+PY
+)"
+  if [[ -z "$APP_VERSION" ]]; then
+    echo "Error: --versioned could not read __version__ from codai/__init__.py" >&2
+    exit 2
+  fi
+  if [[ "$BUILD_MODE" == "venv" ]]; then PREFIX="full"; else PREFIX="base"; fi
+  IMAGE_TAG="coderai:${PREFIX}_${GPU_SEL}_${APP_VERSION}"
+  echo "[build] --versioned -> $IMAGE_TAG"
+fi
 PYTHON_VERSION="${PYTHON_VERSION:-3.13.5}"
 PBS_RELEASE="${PBS_RELEASE:-20250612}"
 UV_VERSION="${UV_VERSION:-0.7.13}"

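The tag rule added to build_oci_image.sh can be exercised in isolation with a small function. This is a sketch under stated assumptions: `derive_tag` is a hypothetical helper, and `scratch` is a placeholder mode name (the script itself only tests whether `BUILD_MODE` equals `venv`).

```sh
# Hypothetical helper mirroring the --versioned rule.
# usage: derive_tag BUILD_MODE GPU VERSION
derive_tag() {
  mode=$1
  gpu=$2
  version=$3
  case "$gpu" in
    all|nvidia|vulkan) ;;
    *) echo "derive_tag: gpu must be one of: all nvidia vulkan (got '$gpu')" >&2
       return 2 ;;
  esac
  if [ -z "$version" ]; then
    echo "derive_tag: empty version" >&2
    return 2
  fi
  # "venv" selects the full_ prefix; any other mode is treated as from-scratch.
  if [ "$mode" = "venv" ]; then prefix=full; else prefix=base; fi
  printf 'coderai:%s_%s_%s\n' "$prefix" "$gpu" "$version"
}

derive_tag venv all 0.1.0         # coderai:full_all_0.1.0
derive_tag scratch nvidia 0.1.0   # coderai:base_nvidia_0.1.0
```

One caveat: Docker tags only allow letters, digits, underscores, periods and dashes, so a version string containing `+` (a local version such as `0.1.0+local`) would need sanitizing before it is used this way.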
--- a/packaging/linux/dist-bundle/README.md
+++ b/packaging/linux/dist-bundle/README.md
@@ -9,6 +9,7 @@ studio + township fighters, all behind one nginx port).
 | File | Purpose |
 |------|---------|
 | `install.sh` | Loads the image into Docker and installs the runner. |
+| `uninstall.sh` | Removes the runner and (optionally) the image. |
 | `coderai-docker` | The run wrapper (installed to your PATH by `install.sh`). |
 | `coderai-dist.tar.gz` | The Docker image (gzip-compressed `docker save`). |
 | `README.md` / `README.txt` | This file. |
@@ -21,7 +22,7 @@ studio + township fighters, all behind one nginx port).
 ## Install
 ```sh
-./install.sh
+./install.sh            # prompts to confirm; add --yes to skip
 ```
 Installs `coderai-docker` to:
@@ -30,10 +31,20 @@ Installs `coderai-docker` to:
 - `~/.local/usr/bin` — when run as a normal user (added to PATH via `~/.bashrc`
  if it isn't there already).
+## Uninstall
+```sh
+./uninstall.sh          # removes the runner + (after a prompt) the image
+```
+`--keep-image` keeps the image; `--yes` skips all prompts. Your runtime data is
+left untouched.
 ## Run
 ```sh
 coderai-docker --nvidia          # CUDA; or --vulkan (AMD/Intel) / --cpu
+coderai-docker --nvidia --vulkan # both backends (also: --all)
 coderai-docker --help            # all options
 ```
@@ -43,7 +54,9 @@ Then open <http://127.0.0.1:8776/admin>.
 | Option | Description |
 |--------|-------------|
+| `--nvidia` / `--vulkan` | GPU backends; **additive** (pass both, or `--all`). `--vulkan` auto-maps the host `libcuda.so.1` (the bundled llama-cpp is a CUDA build); `--with-libcuda[=PATH]` overrides. |
 | `-p, --port PORT` | Host port (default `8776`). |
+| `--host ADDR` | Bind the published port to a specific interface (e.g. `127.0.0.1` localhost-only; default all interfaces). |
 | `--data-dir PATH` | Where config/models/cache live (default `./coderai-runtime`). |
 | `--local` | Run against your existing `~/.coderai` config. |
 | `--map HOST[:CONT]` | Bind-mount a host dir at the same path (for absolute model paths in `models.json`), e.g. `--map /AI/guffcache`. |

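install.sh and uninstall.sh are not part of this diff, so the following is only an illustration of the confirm gate the README describes (prompt before acting, skipped with `--yes`). `confirm` and the `ASSUME_YES` variable are hypothetical names, with `ASSUME_YES=1` standing in for a parsed `--yes`/`-y` flag.

```sh
# Hypothetical confirm gate. Returns 0 to proceed, non-zero to abort.
confirm() {
  if [ "${ASSUME_YES:-0}" = "1" ]; then
    return 0                       # --yes/-y was given: skip the prompt
  fi
  printf '%s [y/N] ' "$1" >&2      # prompt on stderr so stdout stays clean
  read -r reply || return 1        # EOF (no input available) counts as "no"
  case "$reply" in
    y|Y|yes|YES) return 0 ;;
    *)           return 1 ;;       # default answer is "no"
  esac
}

# Non-interactive demonstration: answer "n" on stdin.
printf 'n\n' | confirm "Remove the coderai image?" || echo "keeping the image"
```

Defaulting to "no" on empty input or EOF keeps a piped or scripted run from deleting anything unless `--yes` is passed explicitly.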
--- a/packaging/linux/dist-bundle/README.txt
+++ b/packaging/linux/dist-bundle/README.txt
@@ -8,6 +8,7 @@ studio + township fighters, all behind one nginx port).
 Contents
 --------
  install.sh           Loads the image into Docker and installs the runner.
+  uninstall.sh         Removes the runner and (optionally) the image.
  coderai-docker       The run wrapper (installed to your PATH by install.sh).
  coderai-dist.tar.gz  The Docker image (gzip-compressed `docker save`).
  README.txt           This file.
@@ -19,23 +20,35 @@ Requirements
 Install
 -------
-  ./install.sh
+  ./install.sh            (prompts to confirm; add --yes to skip the prompt)
  Installs `coderai-docker` to:
    - /usr/local/bin            when run as root
    - ~/.local/usr/bin          when run as a normal user (added to PATH via
                                ~/.bashrc if it isn't there already)
+Uninstall
+---------
+  ./uninstall.sh          Removes the runner from both locations and (after a
+                          prompt) the image. --keep-image keeps the image;
+                          --yes skips all prompts. Runtime data is left untouched.
 Run
 ---
  coderai-docker --nvidia                 # CUDA; or --vulkan (AMD/Intel) / --cpu
+  coderai-docker --nvidia --vulkan        # both backends (also: --all)
  coderai-docker --help                   # all options
  Then open http://127.0.0.1:8776/admin
 Useful options (see --help)
 ---------------------------
+  --nvidia / --vulkan     GPU backends; ADDITIVE (pass both, or --all). --vulkan
+                          auto-maps the host libcuda.so.1 (the bundled llama-cpp
+                          is a CUDA build); --with-libcuda[=PATH] overrides it.
  -p, --port PORT         Host port (default 8776).
+  --host ADDR             Bind the published port to a specific interface
+                          (e.g. 127.0.0.1 for localhost-only; default all).
  --data-dir PATH         Where config/models/cache live (default ./coderai-runtime).
  --local                 Run against your existing ~/.coderai config.
  --map HOST[:CONT]       Bind-mount a host dir at the same path (for absolute

--- a/packaging/linux/run_oci.sh
+++ b/packaging/linux/run_oci.sh
@@ -19,6 +19,9 @@ declare -A MODES=()
 # full --gpus all). "auto" = detect via ldconfig; or an explicit path.
 WITH_LIBCUDA=""
 PORT="${CODERAI_PORT:-8776}"
+# Host interface the published port binds to. Empty = Docker's default (all
+# interfaces, 0.0.0.0). Set e.g. 127.0.0.1 to expose only on localhost.
+HOST_BIND="${CODERAI_HOST_BIND:-}"
 DATA_ROOT="$PWD/coderai-runtime"
 DETACH=0
 NAME="coderai"
@@ -71,6 +74,9 @@ Options:
                      host. P is an explicit path; default auto-detects via
                      ldconfig. (Implied automatically when --nvidia is set.)
  -p, --port PORT     Host port to expose (default: 8776).
+  --host ADDR         Host interface to bind the published port to (e.g.
+                      127.0.0.1 for localhost-only, 0.0.0.0 for all interfaces).
+                      Default: Docker's default (all interfaces).
  --data-dir PATH     Directory for config/models/cache (default: ./coderai-runtime).
  --name NAME         Container name (default: coderai).
  -d, --detach        Run in background.
@@ -117,6 +123,9 @@ while [[ $# -gt 0 ]]; do
    -p|--port)
      [[ $# -ge 2 ]] || { echo "Error: $1 requires a port" >&2; exit 2; }
      PORT="$2"; shift 2 ;;
+    --host)
+      [[ $# -ge 2 ]] || { echo "Error: --host requires an address" >&2; exit 2; }
+      HOST_BIND="$2"; shift 2 ;;
    --data-dir)
      [[ $# -ge 2 ]] || { echo "Error: --data-dir requires a path" >&2; exit 2; }
      DATA_ROOT="$2"; shift 2 ;;
@@ -159,7 +168,15 @@ done
 mkdir -p "$DATA_ROOT/config" "$DATA_ROOT/models" "$DATA_ROOT/cache"
 DATA_ROOT="$(cd "$DATA_ROOT" && pwd)"
-args=(run --rm --name "$NAME" --ipc=host -p "$PORT:8776" -e CODERAI_HOST=0.0.0.0 -e CODERAI_PORT=8776)
+# Publish spec: bind to a specific host interface when --host is given, else let
+# Docker use its default (all interfaces). CODERAI_HOST stays 0.0.0.0 so the
+# server listens on all interfaces *inside* the container.
+if [[ -n "$HOST_BIND" ]]; then
+  PUBLISH="$HOST_BIND:$PORT:8776"
+else
+  PUBLISH="$PORT:8776"
+fi
+args=(run --rm --name "$NAME" --ipc=host -p "$PUBLISH" -e CODERAI_HOST=0.0.0.0 -e CODERAI_PORT=8776)
 if [[ "$DETACH" == "1" ]]; then
  args+=(-d)
 fi
@@ -297,7 +314,7 @@ Starting CoderAI OCI container
  image:   $IMAGE_TAG
  mode:    $(echo "${!MODES[@]}" | tr ' ' '+' | tr 'A-Z' 'a-z')
  libcuda: $LIBCUDA_NOTE
-  url:     http://127.0.0.1:$PORT/admin
+  url:     http://${HOST_BIND:-127.0.0.1}:$PORT/admin
  data:    $DATA_ROOT
  config:  $CONFIG_NOTE
  debug:   ${DEBUG_SPEC:-off}
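
The publish-spec branch in run_oci.sh can be tested on its own as a function. The sketch below is a hypothetical variant, not the committed code: `publish_spec` adds a numeric port check and wraps a bare IPv6 literal in brackets, which is the form Docker's `-p` option expects for IPv6 addresses. The container port 8776 is taken from the diff.

```sh
# Hypothetical helper: build the value for `docker run -p`.
# usage: publish_spec HOST_BIND PORT   (HOST_BIND may be empty)
publish_spec() {
  host=$1
  port=$2
  case "$port" in
    ''|*[!0-9]*) echo "publish_spec: port must be numeric (got '$port')" >&2
                 return 2 ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "publish_spec: port out of range (got '$port')" >&2
    return 2
  fi
  case "$host" in
    '')     printf '%s:8776\n' "$port" ;;               # Docker default: all interfaces
    \[*\])  printf '%s:%s:8776\n' "$host" "$port" ;;    # already bracketed IPv6
    *:*)    printf '[%s]:%s:8776\n' "$host" "$port" ;;  # bare IPv6 literal
    *)      printf '%s:%s:8776\n' "$host" "$port" ;;    # IPv4 address
  esac
}

publish_spec "" 8776          # 8776:8776
publish_spec 127.0.0.1 9000   # 127.0.0.1:9000:8776
publish_spec ::1 9000         # [::1]:9000:8776
```

Keeping `CODERAI_HOST=0.0.0.0` inside the container is still required with a localhost-only bind, because Docker forwards the published port to the container's own network interface rather than to its loopback.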