Workers
Configuring worker count and thread vs. process mode
Tuning parallelism for your workload.
Compression
Zstd, gzip, and content negotiation
Zero-dependency compression using the Python 3.14 standard library.
Production
Hardening, reverse proxy, and scaling patterns
Running Pounce in production environments.
Graceful Shutdown
Connection draining for Kubernetes
Zero dropped requests during rolling deployments.
Graceful Reload
Zero-downtime SIGHUP reload
Rolling worker restart without dropping connections.
Hot Reload
In-process code updates
Deploy new code without dropping connections.
OpenTelemetry
Distributed tracing with OTLP
Native integration with Jaeger, Datadog, and Tempo.
Security
Proxy headers, CRLF protection, request smuggling
Built-in security features for production deployments.
Observability
Health checks, request IDs, Prometheus metrics
Monitoring, tracing, and metrics for production.
Prometheus Metrics
Built-in /metrics endpoint for scraping
Prometheus text format export with zero dependencies.
Rate Limiting
Per-IP token bucket rate limiting
Protect against abusive clients and excessive API traffic.
Request Queueing
Bounded queue with load shedding
Graceful degradation under traffic spikes.
Sentry
Error tracking and performance monitoring
Optional Sentry integration for production errors.