-
Your Name authored
- Enhanced flash attention status output in NvidiaBackend to always show availability - Added debug output in chat completions endpoint for force-reasoning mode - Shows CLI flag value, API param, reasoning action, and whether injection was done - Displays the actual injected system prompt content when debug mode is enabled
b7d84534
| Name |
Last commit
|
Last update |
|---|---|---|
| .. | ||
| __init__.py | ||
| base.py | ||
| cuda.py | ||
| vulkan.py |