Stefy Lanza (nextime / spora ) authored
An engine restart (admin button / config change) previously SIGTERM'd the process immediately, severing any active SSE stream mid-response. The client saw httpcore.RemoteProtocolError "peer closed connection without sending complete message body".

Now restart_engine marks the engine `draining` first: the router stops routing NEW requests to it (Engine.is_alive() reports false while draining, and the poll loop can't flip it back to healthy), and the supervisor waits up to server.engine_restart_drain_grace seconds (default 30, 0 = immediate) for the in-flight count to reach zero before killing the process. Stragglers past the grace window are still bounced.

In-flight is tracked per engine in the front proxy: proxy() increments on send and decrements once the streamed response is fully drained (or the send failed).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
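
For illustration only, the drain sequence described in this message could be sketched roughly as follows. This is not the code in this commit: the names Engine.is_alive(), proxy() and restart_engine come from the message, while the fields `healthy` and `in_flight`, the `drain_grace` and `poll` parameters, and the `demo()` helper are assumptions made up for the sketch, and `asyncio.sleep` stands in for relaying a real SSE stream.

```python
import asyncio
import time


class Engine:
    """Hypothetical stand-in for the per-engine record kept by the front proxy."""

    def __init__(self) -> None:
        self.healthy = True    # assumed field, set by the health poll loop
        self.draining = False  # set by restart_engine before the kill
        self.in_flight = 0     # assumed name for the per-engine request counter

    def is_alive(self) -> bool:
        # While draining, report not-alive so the router skips this engine,
        # whatever the health poll last observed.
        return self.healthy and not self.draining


async def proxy(engine: Engine, stream_seconds: float) -> None:
    """Count a request as in-flight until its response is fully drained."""
    engine.in_flight += 1
    try:
        await asyncio.sleep(stream_seconds)  # stand-in for streaming the response
    finally:
        engine.in_flight -= 1  # runs on success, failure, or cancellation


async def restart_engine(engine: Engine, drain_grace: float, poll: float = 0.01) -> int:
    """Mark draining, wait up to drain_grace seconds, return the straggler count."""
    engine.draining = True
    deadline = time.monotonic() + drain_grace
    while engine.in_flight > 0 and time.monotonic() < deadline:
        await asyncio.sleep(poll)
    # A real supervisor would signal the process and start a new one here.
    return engine.in_flight


async def demo():
    # Short stream, generous grace: the request finishes before the kill.
    quick = Engine()
    quick_task = asyncio.create_task(proxy(quick, 0.05))
    await asyncio.sleep(0)  # let proxy() start and bump in_flight
    drained = await restart_engine(quick, drain_grace=1.0)
    await quick_task

    # Long stream, short grace: the request is still in flight at kill time.
    slow = Engine()
    slow_task = asyncio.create_task(proxy(slow, 0.5))
    await asyncio.sleep(0)
    bounced = await restart_engine(slow, drain_grace=0.05)
    slow_task.cancel()  # stand-in for the connection dying with the process
    await asyncio.gather(slow_task, return_exceptions=True)

    return drained, bounced, quick.is_alive()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch a `drain_grace` of 0 skips the wait loop entirely, which matches the "0 = immediate" behaviour described above.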
34d666d6