Critical: Missing Graceful Shutdown for All Listeners #417

@bearice

Problem Description

The current listener implementations (HttpX, SOCKS, etc.) cannot be gracefully shut down. This is a critical architectural deficiency that affects production deployments.

Current Behavior

  • Accept loops run forever inside spawned tasks
  • No mechanism to signal shutdown to listeners
  • Process termination forcibly kills all active connections
  • Clients experience unexpected connection drops

Affected Code

  • src/listeners/httpx.rs:299 - tokio::spawn(this_tcp.accept(...)) never stops
  • src/listeners/socks.rs - Similar infinite accept loops
  • All listener implementations lack shutdown coordination

Technical Analysis

Root Cause

// src/listeners/httpx.rs:299 - PROBLEMATIC CODE
tokio::spawn(this_tcp.accept(tcp_listener, contexts.clone(), tcp_queue));
Ok(()) // Returns immediately, but accept task runs forever

The listen() method spawns a background task and returns Ok(()), but the spawned task has no way to receive shutdown signals.

Impact Assessment

  • High: Production services cannot perform rolling updates without connection drops
  • High: Violates graceful degradation principles for distributed systems
  • Medium: Makes integration testing difficult (can't cleanly stop listeners)

Proposed Solution

1. Update Listener Trait

Add a CancellationToken parameter (for example, tokio_util::sync::CancellationToken) to enable shutdown coordination:

async fn listen(
    self: Arc<Self>,
    contexts: Arc<ContextManager>,
    timeouts: Timeouts,
    queue: Sender<ContextRef>,
    shutdown_token: CancellationToken, // ADD THIS
) -> Result<()>;

2. Implement Graceful Accept Pattern

loop {
    tokio::select! {
        accept_result = tcp_listener.accept() => {
            // Handle new connections
        }
        _ = shutdown_token.cancelled() => {
            info!("Graceful shutdown initiated");
            break;
        }
    }
}
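
The loop above depends on tokio and a CancellationToken. As a runtime-free illustration of the same idea, here is a minimal std-only sketch that polls a non-blocking listener and checks an AtomicBool flag between accepts. The names accept_until_shutdown and on_conn are hypothetical and do not exist in the codebase; this is an analogue of the proposed pattern under those assumptions, not the implementation itself.

```rust
use std::io::ErrorKind;
use std::net::{TcpListener, TcpStream};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Accepts connections until `shutdown` is set, then returns how many
/// connections were accepted. (Hypothetical helper, std-only analogue.)
fn accept_until_shutdown<F>(
    listener: TcpListener,
    shutdown: Arc<AtomicBool>,
    mut on_conn: F,
) -> std::io::Result<usize>
where
    F: FnMut(TcpStream),
{
    // Non-blocking mode lets the loop re-check the flag between accepts.
    listener.set_nonblocking(true)?;
    let mut accepted = 0;
    loop {
        // Stop accepting as soon as shutdown has been requested.
        if shutdown.load(Ordering::Acquire) {
            break;
        }
        match listener.accept() {
            Ok((stream, _peer)) => {
                accepted += 1;
                on_conn(stream); // hand the connection to its handler
            }
            // Nothing pending: back off briefly, then poll again.
            Err(e) if e.kind() == ErrorKind::WouldBlock => {
                thread::sleep(Duration::from_millis(10));
            }
            Err(e) => return Err(e),
        }
    }
    Ok(accepted)
}
```

Once the flag is set, the loop exits within one poll interval and the listener is dropped, so no new connections are accepted. Streams already handed to on_conn are untouched; waiting for them is the job of connection draining.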

3. Connection Draining

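After the accept loop exits, wait for active connections to complete before shutting down, ideally bounded by a timeout so a stuck connection cannot block shutdown indefinitely.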
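
One way to picture the draining step is a shared counter of in-flight connections that shutdown can wait on. The sketch below is std-only and uses hypothetical names (ConnectionTracker, ConnectionGuard, wait_for_drain) that are not part of the codebase. In the actual async listeners an async-aware primitive would be needed instead of a blocking Condvar, so treat this as an illustration of the bookkeeping rather than a drop-in design.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

/// Counts in-flight connections so shutdown can wait for them to finish.
#[derive(Clone, Default)]
struct ConnectionTracker {
    inner: Arc<(Mutex<usize>, Condvar)>,
}

/// RAII guard: the connection counts as active until the guard is dropped.
struct ConnectionGuard {
    inner: Arc<(Mutex<usize>, Condvar)>,
}

impl ConnectionTracker {
    /// Registers a new active connection.
    fn track(&self) -> ConnectionGuard {
        *self.inner.0.lock().unwrap() += 1;
        ConnectionGuard { inner: Arc::clone(&self.inner) }
    }

    /// Number of connections currently being served.
    fn active(&self) -> usize {
        *self.inner.0.lock().unwrap()
    }

    /// Blocks until no connections remain or `timeout` elapses.
    /// Returns true if every connection finished in time.
    fn wait_for_drain(&self, timeout: Duration) -> bool {
        let (lock, cvar) = &*self.inner;
        let guard = lock.lock().unwrap();
        let (guard, _timed_out) = cvar
            .wait_timeout_while(guard, timeout, |n| *n > 0)
            .unwrap();
        *guard == 0
    }
}

impl Drop for ConnectionGuard {
    fn drop(&mut self) {
        let (lock, cvar) = &*self.inner;
        *lock.lock().unwrap() -= 1;
        // Wake any thread blocked in wait_for_drain.
        cvar.notify_all();
    }
}
```

Each connection handler would hold a guard for its lifetime, and the shutdown path would call wait_for_drain after the accept loop has stopped. A false return means the timeout expired with connections still open, at which point the caller can decide whether to force-close them.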

Implementation Phases

  • Phase 1: Add CancellationToken to Listener trait
  • Phase 2: Refactor all listener accept loops
  • Phase 3: Add connection tracking and draining
  • Phase 4: Integration testing

Priority: Critical

Blocks production-ready deployments.

Labels: bug