front: drain in-flight requests before bouncing an engine

An engine restart (admin button / config change) previously SIGTERM'd the
process immediately, severing any active SSE stream mid-response — the client
saw httpcore.RemoteProtocolError "peer closed connection without sending
complete message body".

Now restart_engine marks the engine `draining` first: the router stops routing
NEW requests to it (Engine.is_alive() reports false while draining, and the poll
loop can't flip it back healthy), and the supervisor waits up to
server.engine_restart_drain_grace seconds (default 30, 0 = immediate) for the
in-flight count to reach zero before killing the process. Stragglers past the
grace window are still bounced.
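The drain-then-kill sequence described above can be sketched roughly as follows. This is an illustration only, not the code in this commit: the Engine fields, mark_healthy(), wait_for_drain(), the kill callback, and the restart_engine signature shown here are all hypothetical; only the behaviour (draining hides the engine from the router, the poll loop cannot undo it, and the kill happens after the grace window regardless) comes from the description.

```python
import time
from typing import Callable


class Engine:
    """Hypothetical stand-in for the real Engine; field names are assumptions."""

    def __init__(self) -> None:
        self.healthy = True
        self.draining = False
        self.in_flight = 0

    def is_alive(self) -> bool:
        # A draining engine reports not-alive so the router skips it.
        return self.healthy and not self.draining

    def mark_healthy(self) -> None:
        # What a health poll loop might call; it never clears `draining`.
        self.healthy = True


def wait_for_drain(engine: Engine, grace_seconds: float,
                   poll_interval: float = 0.1) -> bool:
    """Return True if in-flight hit zero within the grace window."""
    if grace_seconds <= 0:
        # 0 means "immediate": do not wait at all.
        return engine.in_flight == 0
    deadline = time.monotonic() + grace_seconds
    while engine.in_flight > 0:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(poll_interval, remaining))
    return True


def restart_engine(engine: Engine, grace_seconds: float,
                   kill: Callable[[], None],
                   poll_interval: float = 0.1) -> bool:
    """Mark draining, wait up to the grace window, then kill either way."""
    engine.draining = True
    drained = wait_for_drain(engine, grace_seconds, poll_interval)
    kill()  # stragglers past the grace window are still bounced
    return drained
```

The boolean return value is an addition for the sketch, so a caller could log whether any requests were cut off.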
In-flight is tracked per engine in the front proxy: proxy() increments on send
and decrements once the streamed response is fully drained (or the send failed).
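A minimal sketch of that per-engine accounting, assuming a thread-safe counter and a streaming response exposed as an iterable of byte chunks. InFlightCounter and the proxy() signature below are hypothetical and chosen for illustration; the real front proxy may track this differently.

```python
import threading
from typing import Callable, Iterable, Iterator


class InFlightCounter:
    """Hypothetical per-engine counter of requests currently in flight."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._count = 0

    def increment(self) -> None:
        with self._lock:
            self._count += 1

    def decrement(self) -> None:
        with self._lock:
            self._count -= 1

    @property
    def count(self) -> int:
        with self._lock:
            return self._count


def proxy(counter: InFlightCounter,
          send: Callable[[], Iterable[bytes]]) -> Iterator[bytes]:
    """Count the request on send; release it when the stream ends or send fails."""
    counter.increment()
    try:
        chunks = iter(send())
    except Exception:
        # The send itself failed, so nothing will be streamed.
        counter.decrement()
        raise

    def stream() -> Iterator[bytes]:
        try:
            yield from chunks
        finally:
            # Runs when the stream is exhausted, raises mid-stream,
            # or is closed after iteration has started.
            counter.decrement()

    # Caveat: if the caller never starts iterating, the finally block
    # never runs; a real proxy would need to cover that case as well.
    return stream()
```

For example, `body = proxy(counter, lambda: upstream_chunks)` leaves `counter.count` at 1 until `body` has been fully consumed, which is the value a drain wait would watch.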
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>