• Your Name's avatar
    Add debug output for flash-attention and force-reasoning mode · b7d84534
    Your Name authored
    - Enhanced flash attention status output in NvidiaBackend to always show availability
    - Added debug output in chat completions endpoint for force-reasoning mode
    - Shows CLI flag value, API param, reasoning action, and whether injection was done
    - Displays the actual injected system prompt content when debug mode is enabled
    b7d84534
Name
Last commit
Last update
..
__init__.py Loading commit data...
base.py Loading commit data...
cuda.py Loading commit data...
vulkan.py Loading commit data...