@rg0now has proposed the following in #267:
Having only a single readLoop() per net.PacketConn substantially limits the maximum performance of UDP TURN listeners, since one goroutine handles all clients connecting via the listener (see l7mp/stunner#60). This is in contrast to net.Conn-based listeners, where each client connection is served by its own goroutine. Would you consider a somewhat intrusive patch to remove this bottleneck?
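
To make the bottleneck concrete, here is a minimal sketch of one possible direction: several read loops sharing the same net.PacketConn. This is not the patch from #267 and not how the library's readLoop() is written. `startReadLoops` and `runDemo` are hypothetical names made up for this illustration. It relies only on the documented guarantee that multiple goroutines may call methods on a net.PacketConn simultaneously.

```go
package main

import (
	"fmt"
	"net"
	"sync"
	"sync/atomic"
	"time"
)

// startReadLoops is a hypothetical helper (not part of any library): it starts
// `loops` goroutines that all call ReadFrom on the same net.PacketConn and
// hand every received datagram to handle.
func startReadLoops(conn net.PacketConn, loops int, handle func(pkt []byte, from net.Addr)) *sync.WaitGroup {
	wg := &sync.WaitGroup{}
	for i := 0; i < loops; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			buf := make([]byte, 1500) // each loop owns its buffer
			for {
				size, addr, err := conn.ReadFrom(buf)
				if err != nil {
					return // typically: the conn was closed
				}
				handle(buf[:size], addr)
			}
		}()
	}
	return wg
}

// runDemo sends `packets` datagrams to a loopback UDP listener served by
// `loops` read loops and returns how many datagrams were handled.
func runDemo(loops, packets int) (int64, error) {
	conn, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	var handled int64
	wg := startReadLoops(conn, loops, func(pkt []byte, from net.Addr) {
		atomic.AddInt64(&handled, 1) // handler must be safe for concurrent use
	})
	// Close the listener and wait for all loops on every return path.
	defer func() {
		conn.Close()
		wg.Wait()
	}()

	client, err := net.Dial("udp", conn.LocalAddr().String())
	if err != nil {
		return 0, err
	}
	defer client.Close()
	for i := 0; i < packets; i++ {
		if _, err := client.Write([]byte("ping")); err != nil {
			return 0, err
		}
	}

	// Wait (bounded) until every datagram has been handled.
	deadline := time.Now().Add(2 * time.Second)
	for atomic.LoadInt64(&handled) < int64(packets) && time.Now().Before(deadline) {
		time.Sleep(5 * time.Millisecond)
	}
	return atomic.LoadInt64(&handled), nil
}

func main() {
	got, err := runDemo(4, 8)
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("datagrams handled:", got)
}
```

Note the trade-off this sketch glosses over: with several loops on one socket, packets from the same client can be processed out of order and on different goroutines, so any per-client state touched by the handler needs its own synchronization. That is presumably part of what makes a real fix "somewhat intrusive".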