    Stefy Lanza (nextime / spora ) authored
    docker/backend: graceful llama-cpp load + additive GPU modes + libcuda mapping; admin GGUF batch/slots tuning
    
    Backend robustness:
    - vulkan.py catches Exception (not just ImportError) around the llama_cpp
      import: a CUDA-built llama-cpp missing libcuda.so.1 raised RuntimeError/OSError
      that crash-looped the whole server. Now it logs a warning and marks the
      Vulkan/GGUF backend unavailable; CUDA/CPU/ds4 keep working.
    - detect_available_backends() reads LLAMA_CPP_AVAILABLE instead of re-importing
      (which re-raised the same error).
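The two robustness points above can be illustrated with a minimal sketch. It assumes the import guard lives at module level and that the detector only needs the flag; `try_import`, the logger name, and the returned backend names are hypothetical and are not taken from vulkan.py.

```python
import importlib
import logging

logger = logging.getLogger("backend.vulkan")  # hypothetical logger name


def try_import(module_name):
    """Return (module, None) on success, or (None, exc) on any failure."""
    try:
        return importlib.import_module(module_name), None
    except Exception as exc:
        # Catching Exception rather than ImportError also covers the
        # RuntimeError/OSError raised when a shared library is missing.
        logger.warning("%s unavailable: %s", module_name, exc)
        return None, exc


# The import is attempted exactly once; the result is cached in a flag.
_llama_cpp, _llama_cpp_error = try_import("llama_cpp")
LLAMA_CPP_AVAILABLE = _llama_cpp is not None


def detect_available_backends():
    """Read the cached flag instead of importing llama_cpp a second time."""
    backends = ["cpu"]  # hypothetical: CPU is assumed to be always present
    if LLAMA_CPP_AVAILABLE:
        backends.append("vulkan")
    return backends
```

Because the flag is computed once, a broken llama-cpp install costs one warning at startup rather than an exception on every detection pass.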
    
    Docker launcher (run_oci.sh):
    - GPU backends are now additive: --nvidia --vulkan enables both (maps libcuda via
      --gpus all AND /dev/dri). Added --all and --with-libcuda[=PATH].
    - --vulkan auto bind-mounts the host's libcuda.so.1 (the bundled llama-cpp is a
      CUDA build), so Vulkan GGUF loads without full --gpus all. Banner shows mode set
      and libcuda status.
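The additive mode logic described above can be sketched as follows. The launcher itself is a shell script, so this Python version only illustrates the flag assembly; `docker_gpu_args`, the mode names, and mounting libcuda at the same path inside the container are assumptions.

```python
def docker_gpu_args(modes, libcuda_path=None):
    """Build `docker run` arguments for a set of GPU modes (additive)."""
    args = []
    if "nvidia" in modes:
        # --gpus all also makes the host's libcuda visible in the container.
        args += ["--gpus", "all"]
    if "vulkan" in modes:
        args += ["--device", "/dev/dri"]
        # Vulkan-only: bind-mount the host's libcuda.so.1 read-only so the
        # CUDA-built llama-cpp can still be loaded (target path is assumed).
        if "nvidia" not in modes and libcuda_path:
            args += ["-v", f"{libcuda_path}:{libcuda_path}:ro"]
    return args


both = docker_gpu_args({"nvidia", "vulkan"})
vulkan_only = docker_gpu_args(
    {"vulkan"}, libcuda_path="/usr/lib/x86_64-linux-gnu/libcuda.so.1"
)
```

Each mode only appends arguments, so combining modes never removes what another mode added.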
    
    Dist bundle:
    - New uninstall.sh (removes runner + optional image), wired into make_dist_bundle.
    - install.sh + uninstall.sh print what they'll do and confirm before proceeding,
      bypassable with --yes/-y.
    
    Admin GGUF tuning:
    - Expose n_batch / n_ubatch / n_seq_max (llama.cpp -b/-ub/-np) in the model config
      UI and apply them in the Vulkan backend to shrink VRAM at the ceiling; n_seq_max
      gated on llama-cpp-python support.
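One way to gate an option on library support is to inspect the constructor's signature and pass only the names it declares. This is a sketch under that assumption, not the backend's actual check; `supported_kwargs` and `fake_llama` are hypothetical, and `fake_llama` stands in for an older constructor that lacks n_seq_max.

```python
import inspect


def supported_kwargs(factory, candidates):
    """Keep only options that are set and that `factory` declares by name.

    A catch-all **kwargs parameter is deliberately not treated as support,
    since an option swallowed that way would be silently ignored.
    """
    params = inspect.signature(factory).parameters
    return {
        name: value
        for name, value in candidates.items()
        if value is not None and name in params
    }


# Hypothetical stand-in for a constructor without n_seq_max.
def fake_llama(model_path, n_batch=512, n_ubatch=512, **kwargs):
    return {"model_path": model_path, "n_batch": n_batch, "n_ubatch": n_ubatch}


tuning = {"n_batch": 256, "n_ubatch": 128, "n_seq_max": 1}
kwargs = supported_kwargs(fake_llama, tuning)
model = fake_llama("model.gguf", **kwargs)
```

With this stand-in, n_seq_max is dropped and the two batch settings are passed through; unset (None) values fall back to the constructor's defaults.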
    
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
    c077b7da
Name
codai
docs
packaging
samples
tests
tools
.dockerignore
.gitignore
AI.PROMPT
CODERAI_API_DOCUMENTATION.md
CoderAI.gif
DISTRIBUTION.md
LICENSE.md
MULTIMODAL_CAPABILITIES.md
MULTIMODAL_UI_EXAMPLES.md
README.md
build-oci.sh
build.ps1
build.sh
coderai
coderai-broker-implementation-reference.md
coderai-integration.md
commands
osxbuild.sh
package-oci.sh
package-tarball.sh
requirements-nvidia.txt
requirements-vulkan.txt
requirements.txt
run-oci.sh
smoke-test-oci.sh
todo.md
video_editor.config.json