    front: drain in-flight requests before bouncing an engine · 34d666d6
    Stefy Lanza (nextime / spora ) authored
    An engine restart (admin button / config change) previously SIGTERM'd the
    process immediately, severing any active SSE stream mid-response — the client
    saw httpcore.RemoteProtocolError "peer closed connection without sending
    complete message body".
    
    Now restart_engine marks the engine `draining` first: the router stops routing
    NEW requests to it (Engine.is_alive() reports false while draining, and the poll
    loop can't flip it back healthy), and the supervisor waits up to
    server.engine_restart_drain_grace seconds (default 30, 0 = immediate) for the
    in-flight count to reach zero before killing the process. Stragglers past the
    grace window are still bounced.
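
A minimal sketch of the drain-then-kill sequence described above, not the project's actual code. The `Engine` dataclass, its fields, and the `kill`, `clock`, and `sleep` parameters are hypothetical stand-ins. Only the names `restart_engine`, `is_alive`, `draining`, and the 30-second default grace come from the commit message.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Engine:
    """Hypothetical stand-in for the real engine record."""
    healthy: bool = True
    draining: bool = False
    in_flight: int = 0

    def is_alive(self) -> bool:
        # A draining engine reports not-alive, so the router skips it.
        return self.healthy and not self.draining

    def on_health_poll(self, ok: bool) -> None:
        # The poll loop only touches `healthy`; it never clears `draining`,
        # so it cannot make a draining engine routable again.
        self.healthy = ok


def restart_engine(
    engine: Engine,
    kill: Callable[[Engine], None],
    drain_grace: float = 30.0,   # assumed to mirror server.engine_restart_drain_grace
    poll_interval: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Mark the engine draining, wait for in-flight requests, then kill it.

    Returns True if the engine drained to zero before the kill, False if
    stragglers were cut off when the grace window expired.
    """
    engine.draining = True
    deadline = clock() + drain_grace
    # With drain_grace == 0 the deadline is already reached: immediate kill.
    while engine.in_flight > 0 and clock() < deadline:
        sleep(poll_interval)
    drained = engine.in_flight == 0
    kill(engine)  # stragglers past the grace window are still bounced
    return drained
```

Injecting `clock` and `sleep` is a choice made for this sketch so the wait loop can be exercised without real delays.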
    
    In-flight is tracked per engine in the front proxy: proxy() increments on send
    and decrements once the streamed response is fully drained (or the send failed).
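
The per-engine accounting could look roughly like the following. This is a synchronous sketch under stated assumptions: `InFlightCounter`, the `engine_id` key, and the `send` callable are hypothetical, and the real front proxy presumably uses async streaming rather than plain iterators.

```python
import threading
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator


class InFlightCounter:
    """Hypothetical thread-safe per-engine request counter."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: Dict[str, int] = defaultdict(int)

    def incr(self, engine_id: str) -> None:
        with self._lock:
            self._counts[engine_id] += 1

    def decr(self, engine_id: str) -> None:
        with self._lock:
            self._counts[engine_id] -= 1

    def get(self, engine_id: str) -> int:
        with self._lock:
            return self._counts[engine_id]


def proxy(
    counter: InFlightCounter,
    engine_id: str,
    send: Callable[[], Iterable[bytes]],
) -> Iterator[bytes]:
    """Count the request as in flight from send until the stream is drained."""
    counter.incr(engine_id)
    try:
        upstream = send()
    except Exception:
        # The send failed: nothing will be streamed, so release immediately.
        counter.decr(engine_id)
        raise

    def stream() -> Iterator[bytes]:
        try:
            yield from upstream
        finally:
            # Runs when the stream is exhausted, closed early, or errors
            # mid-iteration. Caveat: a generator that is never started
            # never reaches this block, so callers must consume or
            # start the stream for the count to be released.
            counter.decr(engine_id)

    return stream()
```

For example, `list(proxy(counter, "engine-a", lambda: iter([b"chunk"])))` leaves the count for `"engine-a"` at zero once the list has been built.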
    Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Name
..
admin
api
backends
broker
frontproxy
models
openai
pydantic
queue
tasks
__init__.py
cli.py
config.py
main.py
platform_paths.py