The user stats build is currently the slowest step by a wide margin. Investigate parallelizing the aggregation and shard writing (or splitting by username ranges / month buckets).

Ideas:
- Parallelize by shard group or username prefix.
- Pre-aggregate monthly counts per raw chunk, then reduce.
- Consider a staged temp DB per worker, then merge.
- Explore WAL + chunked transactions to reduce lock contention.
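
To make the "pre-aggregate per chunk, then reduce" and "WAL + chunked transactions" ideas concrete, here is a minimal sketch, not the project's actual code. It assumes the store is SQLite (the WAL mention suggests this, but the issue does not confirm it). The record shape `(username, "YYYY-MM-DD")`, the table name `user_month_counts`, and every function name are hypothetical.

```python
import sqlite3
from collections import Counter
from functools import reduce

# Hypothetical record shape: (username, "YYYY-MM-DD") pairs. The real raw
# chunk format is not specified in this issue.


def aggregate_chunk(records):
    """Map step: count events per (username, 'YYYY-MM') within one raw chunk."""
    counts = Counter()
    for username, date in records:
        counts[(username, date[:7])] += 1
    return counts


def merge_counts(left, right):
    """Reduce step: fold one partial Counter into the running total."""
    left.update(right)  # Counter.update adds counts rather than replacing them
    return left


def build_monthly_counts(chunks, executor=None):
    """Aggregate every chunk (in parallel if an executor is given), then reduce."""
    mapper = executor.map if executor is not None else map
    return reduce(merge_counts, mapper(aggregate_chunk, chunks), Counter())


def write_counts(db_path, counts, batch_size=1000):
    """Write totals using WAL mode and one transaction per batch of rows."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS user_month_counts ("
            "username TEXT, month TEXT, n INTEGER, "
            "PRIMARY KEY (username, month))"
        )
        rows = [(u, m, n) for (u, m), n in sorted(counts.items())]
        for start in range(0, len(rows), batch_size):
            with conn:  # commits this batch as a single transaction
                conn.executemany(
                    "INSERT OR REPLACE INTO user_month_counts VALUES (?, ?, ?)",
                    rows[start:start + batch_size],
                )
    finally:
        conn.close()
```

Passing a `concurrent.futures.ProcessPoolExecutor` as `executor` would spread the map step across processes, while the reduce and the write stay in the parent. Keeping a single writer sidesteps lock contention, though it may not help if shard writing rather than aggregation turns out to be the bottleneck; that would need profiling first.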