Simple command --> sambamba index -t 20 any_large_file.bam
According to top, 0.6.7 rarely uses more than 5 threads, while 0.6.6 uses all 20 I gave it.
Test on a 105 GB BAM file using 20 threads:
0.6.6 --> 3m35.989s
0.6.7 --> 12m44.751s
I see the same pattern on all files on multiple machines (both AMD and Intel).
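
To put a number on the regression, here is a small sketch that converts the two timings above to seconds and prints their ratio. It assumes the figures are wall-clock times in the `XmY.YYYs` format printed by the shell's `time`; the helper name `to_seconds` is made up for illustration and is not part of sambamba.

```shell
#!/bin/sh
# Hypothetical helper: convert a `time`-style duration such as "3m35.989s"
# into seconds. Splitting on "m" and "s" yields minutes and seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

old=$(to_seconds 3m35.989s)    # timing reported for 0.6.6
new=$(to_seconds 12m44.751s)   # timing reported for 0.6.7

# Ratio of the two run times (new / old)
awk -v o="$old" -v n="$new" \
    'BEGIN { printf "0.6.7 took %.2fx as long as 0.6.6\n", n / o }'
```

With the numbers reported here, 0.6.7 takes roughly 3.5 times as long as 0.6.6 for the same file and thread count.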