• Your Name's avatar
    Add debug output for flash-attention and force-reasoning mode · b7d84534
    Your Name authored
    - Enhanced flash attention status output in NvidiaBackend to always show availability
    - Added debug output in chat completions endpoint for force-reasoning mode
    - Shows CLI flag value, API param, reasoning action, and whether injection was done
    - Displays the actual injected system prompt content when debug mode is enabled
    b7d84534
Name
Last commit
Last update
..
backends Loading commit data...
models Loading commit data...
pydantic Loading commit data...
queue Loading commit data...
__init__.py Loading commit data...