    Fix CUDA backend for GGUF models - force CUDA via environment variables · f77d34da
    Your Name authored
    - Set GGML_DISABLE_VULKAN=1 and GGML_VULKAN_DEVICE='' before loading the model
    - These must be set before llama_cpp is imported, since it reads them at init
    - Restore the Vulkan settings on cleanup so that subsequent Vulkan models still work
    - Addresses an issue where GGUF models ran on the CPU instead of CUDA with --backend nvidia
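
The set-then-restore pattern described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the variable names and values are copied from the message above, while `force_cuda_env` and `restore_env` are hypothetical helper names, and the sketch does not verify that the library actually honors these variables.

```python
import os

# Names and values copied from the commit message; whether llama_cpp
# reads them is an assumption carried over from that message.
CUDA_FORCING_ENV = {
    "GGML_DISABLE_VULKAN": "1",
    "GGML_VULKAN_DEVICE": "",
}


def force_cuda_env(overrides=CUDA_FORCING_ENV):
    """Apply the overrides and return the previous values for later restore.

    A saved value of None means the variable was not set beforehand.
    """
    saved = {name: os.environ.get(name) for name in overrides}
    os.environ.update(overrides)
    return saved


def restore_env(saved):
    """Put the environment back as it was before force_cuda_env()."""
    for name, old_value in saved.items():
        if old_value is None:
            # The variable did not exist before, so remove it again.
            os.environ.pop(name, None)
        else:
            os.environ[name] = old_value
```

Under these assumptions, `force_cuda_env()` would be called before the first `import llama_cpp`, and `restore_env(saved)` would be called from the model's cleanup path so that a later Vulkan model sees its original settings.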
Name
.vscode
.gitignore
LICENSE.md
README.md
aaa
build.sh
coder
coderai
requirements-nvidia.txt
requirements-vulkan.txt
requirements.txt
requirements.txt~