Feature Request: Worker Pool Reuse and Persistent Connection Management
Overview
Currently, Surge manages concurrency at two distinct levels: a high-level WorkerPool for download tasks and a low-level set of goroutines within ConcurrentDownloader for individual connections. Each download creates its own http.Client and spawns a fresh set of workers, leading to significant overhead and sub-optimal resource utilization.
We propose implementing a Unified Network Worker Pool and Persistent Connection Manager to reduce latency, improve throughput, and minimize CPU/Memory churn.
Problem Statement
The current implementation suffers from several inefficiencies:
Connection Thrashing: Every ConcurrentDownloader instance initializes a new http.Client. Connections opened during mirror probing or previous downloads from the same host are discarded, forcing expensive TCP/TLS handshakes for every new task.
Goroutine Churn: Spawning 16-64 goroutines per download and destroying them immediately after completion adds unnecessary scheduler overhead, especially for small files or batch downloads.
Cold Start Latency: Each new download starts with an empty connection pool, so it cannot use warm keep-alive connections that have already completed their handshakes and, if recently active, have already grown past TCP slow start.
Resource Fragmentation: Memory allocated for worker buffers is localized to the ConcurrentDownloader instance, preventing efficient cross-download buffer reuse.
Proposed Solution
1. Global Network Worker Pool
Instead of ConcurrentDownloader spawning its own goroutines, it should request workers from a global pool managed at the engine level.
Persistent Workers: Workers should be long-lived goroutines that wait for work units (chunks/tasks).
Dynamic Assignment: Workers can be dynamically reassigned between downloads based on priority or bandwidth availability.
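The persistent-worker idea above can be sketched roughly as follows. This is a minimal illustration, not Surge's actual code: the names Chunk, NetworkWorkerPool, Submit, and processChunks are hypothetical, and the job body only adds up chunk lengths where a real worker would issue a ranged HTTP request.

```go
package main

import (
	"fmt"
	"sync"
)

// Chunk is a hypothetical unit of work: one byte range of one download.
type Chunk struct {
	DownloadID string
	Offset     int64
	Length     int64
}

// NetworkWorkerPool is a hypothetical pool of long-lived goroutines
// shared by every download in the process.
type NetworkWorkerPool struct {
	work chan func()
	wg   sync.WaitGroup
}

// NewNetworkWorkerPool starts n workers that live until Close is called.
func NewNetworkWorkerPool(n int) *NetworkWorkerPool {
	p := &NetworkWorkerPool{work: make(chan func())}
	p.wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer p.wg.Done()
			for job := range p.work { // wait for work units until the pool closes
				job()
			}
		}()
	}
	return p
}

// Submit blocks until a worker is free to take the job.
func (p *NetworkWorkerPool) Submit(job func()) { p.work <- job }

// Close stops accepting work and waits for running jobs to finish.
func (p *NetworkWorkerPool) Close() {
	close(p.work)
	p.wg.Wait()
}

// processChunks runs every chunk on the pool and returns the total bytes handled.
func processChunks(p *NetworkWorkerPool, chunks []Chunk) int64 {
	var (
		mu    sync.Mutex
		total int64
		done  sync.WaitGroup
	)
	done.Add(len(chunks))
	for _, c := range chunks {
		c := c // capture the loop variable for the closure
		p.Submit(func() {
			defer done.Done()
			// A real worker would fetch the byte range here.
			mu.Lock()
			total += c.Length
			mu.Unlock()
		})
	}
	done.Wait()
	return total
}

func main() {
	pool := NewNetworkWorkerPool(4)
	defer pool.Close()
	// Two "downloads" reuse the same four goroutines.
	a := processChunks(pool, []Chunk{{"a", 0, 100}, {"a", 100, 100}})
	b := processChunks(pool, []Chunk{{"b", 0, 50}})
	fmt.Println(a, b)
}
```

Priority-based or bandwidth-based reassignment is left out of this sketch; it would likely replace the single unbuffered channel with a scheduler that chooses which download's chunk to hand to the next idle worker.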
2. Centralized Connection Manager (Shared Transport)
Implement a singleton or managed set of http.Transport instances shared across the entire application.
Mirror Warmth: Connections established during ProbeMirrorsWithProxy should be kept alive and handed off to the ConcurrentDownloader.
Cross-Download Reuse: If a user downloads multiple files from the same mirror (e.g., a parts-based download), the same TCP connections should be reused.
Hedged Dialing Integration: The existing hedged dialing logic can more effectively "pre-warm" a global pool rather than a per-download pool.
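One possible shape for the "managed set" of transports is a small registry keyed by proxy URL, so that probing and downloading through the same proxy draw from the same connection pool. ConnectionManager, NewConnectionManager, and Client are hypothetical names, and the idle-connection limits are illustrative values rather than tuned recommendations. Note that net/http keeps only 2 idle connections per host by default (http.DefaultMaxIdleConnsPerHost), which is far below the 16-64 connections a download opens, so raising MaxIdleConnsPerHost is likely needed for reuse to help.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"sync"
	"time"
)

// ConnectionManager is a hypothetical registry of shared transports,
// one per proxy configuration.
type ConnectionManager struct {
	mu         sync.Mutex
	transports map[string]*http.Transport
}

func NewConnectionManager() *ConnectionManager {
	return &ConnectionManager{transports: make(map[string]*http.Transport)}
}

// Client returns an http.Client backed by the shared transport for proxyURL.
// An empty proxyURL falls back to the proxy settings in the environment.
func (m *ConnectionManager) Client(proxyURL string) (*http.Client, error) {
	m.mu.Lock()
	defer m.mu.Unlock()

	t, ok := m.transports[proxyURL]
	if !ok {
		proxy := http.ProxyFromEnvironment
		if proxyURL != "" {
			u, err := url.Parse(proxyURL)
			if err != nil {
				return nil, fmt.Errorf("parse proxy URL: %w", err)
			}
			proxy = http.ProxyURL(u)
		}
		t = &http.Transport{
			Proxy:               proxy,
			MaxIdleConns:        256,              // illustrative value
			MaxIdleConnsPerHost: 64,               // illustrative value
			IdleConnTimeout:     90 * time.Second, // illustrative value
		}
		m.transports[proxyURL] = t
	}
	// http.Client is a thin wrapper; the idle connections live in the transport.
	return &http.Client{Transport: t}, nil
}

func main() {
	m := NewConnectionManager()
	probe, _ := m.Client("")    // e.g. used while probing mirrors
	download, _ := m.Client("") // e.g. used by the downloader afterwards
	fmt.Println(probe.Transport == download.Transport) // true: same underlying pool
}
```

Because http.Transport is safe for concurrent use, handing the same instance to the mirror prober and to every downloader is enough for connections opened during probing to be picked up by later requests to the same host.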
3. Shared Buffer Pool
Promote the sync.Pool for byte buffers to a global level to ensure that memory used for one download's chunks can be immediately reused by another without waiting for GC.
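A package-level buffer pool could look roughly like this. The names bufPool, GetBuffer, PutBuffer, and copyChunk, as well as the 32 KiB buffer size, are assumptions made for illustration; the actual buffer size used by Surge is not stated here. Storing a pointer to the slice avoids an extra allocation each time a buffer is returned to the pool.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

// chunkBufSize is a hypothetical buffer size shared by all downloads.
const chunkBufSize = 32 * 1024

// bufPool is a process-wide pool, so a buffer released by one download
// can be picked up by another.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, chunkBufSize)
		return &b
	},
}

// GetBuffer returns a buffer of length chunkBufSize.
func GetBuffer() *[]byte { return bufPool.Get().(*[]byte) }

// PutBuffer returns a buffer to the pool. Buffers of an unexpected
// capacity are dropped so the pool stays uniform.
func PutBuffer(b *[]byte) {
	if b == nil || cap(*b) != chunkBufSize {
		return
	}
	*b = (*b)[:chunkBufSize]
	bufPool.Put(b)
}

// copyChunk copies src to dst with a pooled buffer. io.CopyBuffer uses the
// buffer as scratch space unless src or dst provides its own fast path.
func copyChunk(dst io.Writer, src io.Reader) (int64, error) {
	buf := GetBuffer()
	defer PutBuffer(buf)
	return io.CopyBuffer(dst, src, *buf)
}

func main() {
	var out bytes.Buffer
	n, err := copyChunk(&out, strings.NewReader("chunk data"))
	fmt.Println(n, err)
}
```

sync.Pool may still drop idle buffers during garbage collection, so this reduces allocations under steady load rather than guaranteeing that every buffer is reused.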
Technical Considerations
Architecture Change
graph TD
Engine[Download Engine] --> WP[Global Worker Pool]
Engine --> CM[Connection Manager]
WP --> W1[Worker 1]
WP --> W2[Worker 2]
WP --> Wn[Worker n]
W1 -- Requests --> CM
W2 -- Requests --> CM
CM -- Shared Transport --> Mirrors((Mirrors))
Proposed Interface Changes
internal/engine/concurrent: Download() should accept a handle to the global worker pool and a shared client.
internal/download: The WorkerPool should be renamed to TaskPool to avoid confusion, and a new NetworkWorkerPool should be introduced.
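To make the proposed signature change concrete, here is one way Download() might take its dependencies as arguments. Everything in this sketch is an assumption: WorkerHandle, goPool, the parameter order, and the single-request body are hypothetical and stand in for the real multi-connection logic. The local test server exists only so the example runs on its own.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// WorkerHandle is a hypothetical view of the global pool:
// anything that can run a job on a worker.
type WorkerHandle interface {
	Submit(job func())
}

// goPool is a stand-in that starts a goroutine per job.
// A real NetworkWorkerPool would reuse long-lived workers instead.
type goPool struct{}

func (goPool) Submit(job func()) { go job() }

// Download is a hypothetical shape for the new entry point: the pool and
// the shared client are injected instead of being created per download.
func Download(ctx context.Context, pool WorkerHandle, client *http.Client, url string, dst io.Writer) (int64, error) {
	type result struct {
		n   int64
		err error
	}
	done := make(chan result, 1)
	pool.Submit(func() {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			done <- result{0, err}
			return
		}
		resp, err := client.Do(req)
		if err != nil {
			done <- result{0, err}
			return
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			done <- result{0, fmt.Errorf("unexpected status %s", resp.Status)}
			return
		}
		n, err := io.Copy(dst, resp.Body)
		done <- result{n, err}
	})
	r := <-done
	return r.n, r.err
}

// runDemo downloads a 7-byte payload from a local test server.
func runDemo() (int64, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "payload")
	}))
	defer srv.Close()
	return Download(context.Background(), goPool{}, srv.Client(), srv.URL, io.Discard)
}

func main() {
	n, err := runDemo()
	fmt.Println(n, err)
}
```

Accepting an interface rather than a concrete pool type would also let tests substitute a trivial pool, as goPool does here.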
Expected Benefits
Reduced Time-to-First-Byte (TTFB): Warm connections skip the TCP and TLS handshakes, removing those round trips from the start of each download.
Lower Resource Footprint: Reduced memory allocation and fewer active goroutines during idle or transition periods.
Improved Stability: Better control over total system-wide connection counts and goroutine limits.
Better Batch Performance: Sequential downloads in a batch will benefit from the "head start" provided by previous tasks' connections.
Verification Plan
Benchmark Throughput: Compare the total time and throughput of a batch download of 100 small files before and after enabling reuse.
Trace Connection Lifecycle: Use netstat or Go execution traces to verify that TCP connections persist across multiple ConcurrentDownloader.Download calls to the same host.
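Monitor Memory/GC: Validate that buffer reuse improves (sync.Pool exposes no hit counter, so count calls to the pool's New function or compare heap profiles) and that heap allocations for workers decrease.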
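For the connection-lifecycle check, Go's net/http/httptrace package can complement netstat, because it reports for each request whether the transport reused an existing connection. The sketch below uses a local test server; fetch and reuseDemo are hypothetical helper names, and a real check would point at ConcurrentDownloader.Download instead of a bare GET.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// fetch issues one GET and reports whether the transport reused a connection.
func fetch(client *http.Client, url string) (reused bool, err error) {
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	// The body must be drained before the connection can return to the idle pool.
	_, err = io.Copy(io.Discard, resp.Body)
	return reused, err
}

// reuseDemo sends two sequential requests through one shared transport.
func reuseDemo() (first, second bool, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	client := &http.Client{Transport: &http.Transport{}}
	if first, err = fetch(client, srv.URL); err != nil {
		return
	}
	second, err = fetch(client, srv.URL)
	return
}

func main() {
	first, second, err := reuseDemo()
	fmt.Println("first reused:", first, "second reused:", second, "err:", err)
}
```

With a shared transport, the first request is expected to dial a new connection and the second to report it as reused; with a fresh http.Client and transport per download, as in the current design, every request would report a new connection.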