There's probably a 5-6x speedup to be had here. I could port over some experimental code I have in https://github.com/oconnor663/chacha20_simd. What do people think?