Your Name authored
- Add --no-ram CLI option to force model loading without CPU RAM spilling
- Implement --no-ram behavior for:
  - llama-cpp-python: n_gpu_layers=-1, use_mmap=False, ignore --n-ctx
  - HuggingFace transformers: device_map='cuda:0', low_cpu_mem_usage=True
  - Diffusers: force full GPU loading
  - sd.cpp: maximize GPU usage
- Propagate flag through model manager
- Add startup banner message
b782a092
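
The per-backend settings named in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper names `build_parser`, `llama_cpp_kwargs`, and `transformers_kwargs` are hypothetical, and the behavior without `--no-ram` (passing `n_ctx` through, or passing no extra arguments) is an assumption. Only the keyword names and values (`n_gpu_layers=-1`, `use_mmap=False`, `device_map='cuda:0'`, `low_cpu_mem_usage=True`) come from the commit message.

```python
import argparse
from typing import Any, Dict, Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI exposing the two flags named in the commit message."""
    parser = argparse.ArgumentParser(description="model server (illustrative)")
    parser.add_argument(
        "--no-ram",
        action="store_true",
        help="load models fully onto the GPU without spilling into CPU RAM",
    )
    parser.add_argument("--n-ctx", type=int, default=None, help="context size")
    return parser


def llama_cpp_kwargs(no_ram: bool, n_ctx: Optional[int]) -> Dict[str, Any]:
    """Keyword arguments intended for the llama-cpp-python model constructor."""
    if no_ram:
        # Values taken from the commit message: offload every layer to the
        # GPU, disable memory mapping, and ignore --n-ctx entirely.
        return {"n_gpu_layers": -1, "use_mmap": False}
    # Assumption: without --no-ram, a user-supplied context size is passed on.
    return {} if n_ctx is None else {"n_ctx": n_ctx}


def transformers_kwargs(no_ram: bool) -> Dict[str, Any]:
    """Keyword arguments intended for a transformers from_pretrained call."""
    if no_ram:
        # Values taken from the commit message.
        return {"device_map": "cuda:0", "low_cpu_mem_usage": True}
    # Assumption: without --no-ram, no extra arguments are added.
    return {}
```

Returning plain dictionaries keeps the flag handling independent of the backend libraries, so the mapping can be checked without a GPU or any model installed.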
| Name | Last commit | Last update |
|---|---|---|
| __init__.py | ||
| app.py | ||
| images.py | ||
| log.py | ||
| state.py | ||
| text.py | ||
| transcriptions.py | ||
| tts.py | ||