I think it might be nice to estimate the variances with an incremental algorithm instead of smoothing the variances estimated in each batch.
- In this case BN will work fine even with a batch size of 1.
- The estimated variance will be less biased when the batch size is small.
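
To make the idea concrete, here is a minimal sketch of one possible incremental estimator. It is only an illustration: the class name `RunningVariance` and its attributes are hypothetical, it tracks a single scalar activation rather than a per-channel tensor, and it uses the standard formula for merging the mean and sum of squared deviations of two sample sets, which may differ from whatever a real BN layer would end up using. With a batch size of 1 every per-batch variance is zero, so smoothing those values gives zero, while the merged estimate still converges to the variance of all samples seen so far.

```python
class RunningVariance:
    """Running mean and variance of a scalar, merged batch by batch (sketch)."""

    def __init__(self):
        self.count = 0    # number of samples seen so far
        self.mean = 0.0   # running mean of all samples
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, batch):
        """Merge the statistics of one batch (a list of floats) into the totals."""
        n_b = len(batch)
        if n_b == 0:
            return
        mean_b = sum(batch) / n_b
        m2_b = sum((x - mean_b) ** 2 for x in batch)  # zero when n_b == 1

        delta = mean_b - self.mean
        total = self.count + n_b
        # Merge formula for two sample sets: the cross term accounts for the
        # distance between the old mean and the batch mean.
        self.mean += delta * n_b / total
        self.m2 += m2_b + delta * delta * self.count * n_b / total
        self.count = total

    @property
    def variance(self):
        """Population variance of everything seen so far (0.0 if empty)."""
        return self.m2 / self.count if self.count else 0.0


# Usage: feed samples one at a time, i.e. with a batch size of 1.
# The population variance of 1, 2, 3, 4 is 1.25.
rv = RunningVariance()
for x in [1.0, 2.0, 3.0, 4.0]:
    rv.update([x])
print(rv.mean, rv.variance)
```

One caveat with this sketch: it weights all past batches equally, so statistics from early training never fade. A practical version would probably need some form of forgetting, which is left open here.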