Feature Request: Worker Pool Reuse and Persistent Connection Management #403

@SuperCoolPencil

Overview

Currently, Surge manages concurrency at two distinct levels: a high-level WorkerPool for download tasks and a low-level set of goroutines within ConcurrentDownloader for individual connections. Each download creates its own http.Client and spawns a fresh set of workers, leading to significant overhead and sub-optimal resource utilization.

We propose implementing a Unified Network Worker Pool and Persistent Connection Manager to reduce latency, improve throughput, and minimize CPU/Memory churn.

Problem Statement

The current implementation suffers from several inefficiencies:

  1. Connection Thrashing: Every ConcurrentDownloader instance initializes a new http.Client. Connections opened during mirror probing or previous downloads from the same host are discarded, forcing expensive TCP/TLS handshakes for every new task.
  2. Goroutine Churn: Spawning 16-64 goroutines per download and destroying them immediately after completion adds unnecessary scheduler overhead, especially for small files or batch downloads.
  3. Cold Start Latency: Each new download starts with an empty connection pool, so every connection pays for a fresh handshake and TCP slow start instead of reusing a warm keep-alive connection whose congestion window has already grown.
  4. Resource Fragmentation: Memory allocated for worker buffers is localized to the ConcurrentDownloader instance, preventing efficient cross-download buffer reuse.

Proposed Solution

1. Global Network Worker Pool

Instead of ConcurrentDownloader spawning its own goroutines, it should request workers from a global pool managed at the engine level.

  • Persistent Workers: Workers should be long-lived goroutines that wait for work units (chunks/tasks).
  • Dynamic Assignment: Workers can be dynamically reassigned between downloads based on priority or bandwidth availability.
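
The persistent-worker idea above can be sketched as follows. This is a minimal illustration, not Surge's actual code: the names `NetworkWorkerPool`, `Task`, `Submit`, `Close`, and `runDownloads` are all hypothetical, and priority- or bandwidth-based reassignment is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical unit of work, e.g. fetching one chunk of a download.
type Task func()

// NetworkWorkerPool is a hypothetical engine-level pool of long-lived workers.
type NetworkWorkerPool struct {
	tasks chan Task
	wg    sync.WaitGroup
}

// NewNetworkWorkerPool starts n workers that live until Close is called.
func NewNetworkWorkerPool(n int) *NetworkWorkerPool {
	p := &NetworkWorkerPool{tasks: make(chan Task)}
	p.wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer p.wg.Done()
			// Block until work arrives or the pool is closed.
			for t := range p.tasks {
				t()
			}
		}()
	}
	return p
}

// Submit hands a task to the next idle worker, blocking while all are busy.
func (p *NetworkWorkerPool) Submit(t Task) { p.tasks <- t }

// Close stops accepting tasks and waits for in-flight tasks to finish.
func (p *NetworkWorkerPool) Close() {
	close(p.tasks)
	p.wg.Wait()
}

// runDownloads simulates several downloads sharing one pool and returns
// the total number of chunks processed.
func runDownloads(workers, downloads, chunksPerDownload int) int {
	pool := NewNetworkWorkerPool(workers)
	var mu sync.Mutex
	done := 0
	for d := 0; d < downloads; d++ {
		for c := 0; c < chunksPerDownload; c++ {
			pool.Submit(func() {
				mu.Lock()
				done++
				mu.Unlock()
			})
		}
	}
	pool.Close()
	return done
}

func main() {
	// Three downloads of eight chunks each, served by four shared workers.
	fmt.Println(runDownloads(4, 3, 8))
}
```

Because the workers outlive any single download, the goroutine count stays fixed at the pool size regardless of how many downloads are queued.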

2. Centralized Connection Manager (Shared Transport)

Implement a singleton or managed set of http.Transport instances shared across the entire application.

  • Mirror Warmth: Connections established during ProbeMirrorsWithProxy should be kept alive and handed off to the ConcurrentDownloader.
  • Cross-Download Reuse: If a user downloads multiple files from the same mirror (e.g., a parts-based download), the same TCP connections should be reused.
  • Hedged Dialing Integration: The existing hedged dialing logic can more effectively "pre-warm" a global pool rather than a per-download pool.
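
To illustrate why a shared transport matters, the sketch below counts the TCP connections a local test server accepts when a "probe" request and a "download" request go through clients that either share one http.Transport or each build their own. The name `sharedTransport`, the helper `countConns`, and the pool limits are assumptions for illustration; hedged dialing is not modeled.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// sharedTransport is a hypothetical application-wide transport.
// The limits below are placeholder values, not tuned recommendations.
var sharedTransport = &http.Transport{
	MaxIdleConns:        256,
	MaxIdleConnsPerHost: 64,
	IdleConnTimeout:     90 * time.Second,
}

// get performs a GET and drains the body so the connection can return
// to the transport's idle pool.
func get(c *http.Client, url string) error {
	resp, err := c.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

// countConns sends two requests (standing in for a mirror probe and a
// download) through two separate clients and returns how many TCP
// connections the server accepted.
func countConns(shared bool) (int64, error) {
	var opened int64
	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) { io.WriteString(w, "ok") }))
	srv.Config.ConnState = func(_ net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&opened, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	for i := 0; i < 2; i++ {
		c := &http.Client{Transport: sharedTransport}
		if !shared {
			// A fresh transport per client means a fresh, empty pool.
			c = &http.Client{Transport: &http.Transport{}}
		}
		if err := get(c, srv.URL); err != nil {
			return 0, err
		}
	}
	return atomic.LoadInt64(&opened), nil
}

func main() {
	for _, shared := range []bool{false, true} {
		n, err := countConns(shared)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("shared=%v connections=%d\n", shared, n)
	}
}
```

Note that an http.Client is cheap to create; the connection pool lives in the transport, so sharing the transport is what preserves warm connections.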

3. Shared Buffer Pool

Promote the sync.Pool for byte buffers to a global level so that buffers released by one download's chunks can be reused immediately by another instead of being allocated afresh and left for the GC to reclaim.
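
A package-level buffer pool might look roughly like this. The names `bufPool`, `copyChunk`, and `demo`, and the 32 KiB buffer size, are assumptions for illustration rather than Surge's existing code.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

const bufSize = 32 * 1024 // hypothetical chunk buffer size

// bufPool is a hypothetical package-level pool shared by every download.
// It stores *[]byte rather than []byte to avoid an allocation on each Put.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, bufSize)
		return &b
	},
}

// copyChunk copies src to dst through a pooled buffer and returns the
// number of bytes written. The buffer goes back to the pool on return.
func copyChunk(dst io.Writer, src io.Reader) (int64, error) {
	bp := bufPool.Get().(*[]byte)
	defer bufPool.Put(bp)

	var total int64
	for {
		n, rerr := src.Read(*bp)
		if n > 0 {
			if _, werr := dst.Write((*bp)[:n]); werr != nil {
				return total, werr
			}
			total += int64(n)
		}
		if rerr == io.EOF {
			return total, nil
		}
		if rerr != nil {
			return total, rerr
		}
	}
}

// demo copies 100000 bytes from an in-memory reader, standing in for a
// response body being written to disk.
func demo() (int64, error) {
	var out bytes.Buffer
	return copyChunk(&out, strings.NewReader(strings.Repeat("x", 100000)))
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```

sync.Pool may drop idle buffers during garbage collection, so the pool reduces allocation churn but does not guarantee that any particular buffer survives between downloads.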

Technical Considerations

Architecture Change

graph TD
    Engine[Download Engine] --> WP[Global Worker Pool]
    Engine --> CM[Connection Manager]
    WP --> W1[Worker 1]
    WP --> W2[Worker 2]
    WP --> Wn[Worker n]
    W1 -- Requests --> CM
    W2 -- Requests --> CM
    CM -- Shared Transport --> Mirrors((Mirrors))

Proposed Interface Changes

  • internal/engine/concurrent: Download() should accept a handle to the global worker pool and a shared client.
  • internal/download: The WorkerPool should be renamed to TaskPool to avoid confusion, and a new NetworkWorkerPool should be introduced.
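
One possible shape for the revised entry point is sketched below. The real `Download()` signature is not shown in this issue, so the parameter list, the `NetworkWorkerPool` interface, and the `inlinePool` stand-in are all hypothetical; the point is only that the pool and client are injected by the engine rather than created per download.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// NetworkWorkerPool is a hypothetical handle to the engine-level pool.
type NetworkWorkerPool interface {
	Submit(task func())
}

// inlinePool is a stand-in that runs each task immediately on the caller's
// goroutine. A real pool would dispatch to long-lived workers.
type inlinePool struct{}

func (inlinePool) Submit(task func()) { task() }

// fetch downloads url with the injected client and returns the byte count.
func fetch(ctx context.Context, client *http.Client, url string) (int64, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return io.Copy(io.Discard, resp.Body)
}

// Download is a hypothetical signature: it creates neither goroutines nor
// an http.Client of its own and relies on what the engine passes in.
func Download(ctx context.Context, pool NetworkWorkerPool, client *http.Client,
	url string, chunks int) (int64, error) {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		total    int64
		firstErr error
	)
	for i := 0; i < chunks; i++ {
		wg.Add(1)
		pool.Submit(func() {
			defer wg.Done()
			n, err := fetch(ctx, client, url)
			mu.Lock()
			defer mu.Unlock()
			total += n
			if err != nil && firstErr == nil {
				firstErr = err
			}
		})
	}
	wg.Wait()
	return total, firstErr
}

// demo runs Download against a local test server that returns 5 bytes
// per request.
func demo() (int64, error) {
	srv := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) { io.WriteString(w, "chunk") }))
	defer srv.Close()
	client := &http.Client{Transport: &http.Transport{}}
	return Download(context.Background(), inlinePool{}, client, srv.URL, 4)
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```

Accepting an interface for the pool also makes it straightforward to substitute a deterministic runner in tests, as `inlinePool` does here.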

Expected Benefits

  • Reduced Time-to-First-Byte (TTFB): Requests on warm connections skip the TCP/TLS handshake, which should noticeably shorten download startup.
  • Lower Resource Footprint: Reduced memory allocation and fewer active goroutines during idle or transition periods.
  • Improved Stability: Better control over total system-wide connection counts and goroutine limits.
  • Better Batch Performance: Sequential downloads in a batch will benefit from the "head start" provided by previous tasks' connections.

Verification Plan

  1. Benchmark Throughput: Compare batch download performance for 100 small files before and after enabling worker and connection reuse.
  2. Trace Connection Lifecycle: Use netstat or Go execution traces to verify that TCP connections persist across multiple ConcurrentDownloader.Download calls to the same host.
  3. Monitor Memory/GC: Validate that the sync.Pool hit rate improves and that heap allocations for workers decrease.
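
Besides netstat and execution traces, connection persistence can also be checked in-process with the standard net/http/httptrace package, which reports whether a request ran on a reused connection. The sketch below is one way to do that against a local test server; `tracedGet` and `checkReuse` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// tracedGet issues a GET and reports whether the transport served it on a
// connection taken from the idle pool.
func tracedGet(client *http.Client, url string) (bool, error) {
	var reused bool
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	// Drain the body so the connection can be returned to the idle pool.
	_, err = io.Copy(io.Discard, resp.Body)
	return reused, err
}

// checkReuse sends two sequential requests through one client and reports
// the Reused flag observed for each.
func checkReuse() (first, second bool, err error) {
	srv := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) { io.WriteString(w, "ok") }))
	defer srv.Close()

	client := &http.Client{Transport: &http.Transport{}}
	if first, err = tracedGet(client, srv.URL); err != nil {
		return
	}
	second, err = tracedGet(client, srv.URL)
	return
}

func main() {
	first, second, err := checkReuse()
	fmt.Println(first, second, err)
}
```

In a real verification run the same trace hook could be attached to requests issued by consecutive downloads to confirm that the second download starts on a reused connection.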
