Labels: `module: performance` (issues related to performance, either of kernel code or framework glue), `triaged` (this issue has been looked at by a team member and prioritized into an appropriate module)
Description
Found by @mirandaconrado
We wrote a simple benchmarking script to compare the speed of PyTorch on OSX between the conda install and the PyPI wheel install.
Here are the results:
**Install from PyPI wheel (using pip)** (version 0.1.12_2)

| volatile | batch size | time (s) | samples/sec |
|----------|-----------:|---------:|------------:|
| False    | 1          | 2.668    | 3.748       |
| True     | 1          | 2.125    | 4.707       |
| False    | 2          | 3.031    | 6.598       |
| True     | 2          | 2.222    | 9.002       |
| False    | 4          | 2.714    | 14.736      |
| True     | 4          | 2.305    | 17.356      |
| False    | 8          | 3.506    | 22.821      |
| True     | 8          | 3.012    | 26.558      |
| False    | 16         | 4.008    | 39.916      |
| True     | 16         | 3.616    | 44.243      |
| False    | 32         | 4.557    | 70.220      |
| True     | 32         | 3.822    | 83.730      |
**From conda install** (version 0.1.12_2)

| volatile | batch size | time (s) | samples/sec |
|----------|-----------:|---------:|------------:|
| False    | 1          | 2.234    | 4.476       |
| True     | 1          | 1.711    | 5.843       |
| False    | 2          | 2.359    | 8.479       |
| True     | 2          | 1.939    | 10.316      |
| False    | 4          | 2.443    | 16.371      |
| True     | 4          | 2.017    | 19.831      |
| False    | 8          | 2.444    | 32.730      |
| True     | 8          | 2.172    | 36.828      |
| False    | 16         | 2.773    | 57.708      |
| True     | 16         | 2.351    | 68.052      |
| False    | 32         | 3.424    | 93.453      |
| True     | 32         | 2.996    | 106.800     |
Here's the script I'm using to generate the results:
`speed_comparisson_test.sh`:

```sh
# Benchmark the PyPI wheel in a throwaway conda environment.
conda create --name pytorch_speed_from_pypi -y python=2.7.13 numpy pyyaml
source activate pytorch_speed_from_pypi
wget http://download.pytorch.org/whl/torch-0.1.12.post2-cp27-none-macosx_10_7_x86_64.whl
pip uninstall -y torch
pip install torch-0.1.12.post2-cp27-none-macosx_10_7_x86_64.whl --user
python test.py
pip uninstall -y torch
source deactivate
conda-env remove --name pytorch_speed_from_pypi -y
rm torch-0.1.12.post2-cp27-none-macosx_10_7_x86_64.whl

# Benchmark the conda package in a second throwaway environment.
conda create --name pytorch_speed_conda_only -y python=2.7.13 numpy pyyaml
source activate pytorch_speed_conda_only
conda install pytorch -y -c soumith
python test.py
source deactivate
conda-env remove -y --name pytorch_speed_conda_only
```

And here's a dump of test.py.
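(The dump itself isn't reproduced here.) As a rough sketch only, a script producing output in the same format might look like the following, assuming a torchvision model, 224×224 inputs, and an arbitrary iteration count; the model choice, shapes, and counts are all hypothetical and the actual test.py may differ:

```python
import time

import torch
import torchvision.models as models
from torch.autograd import Variable

# Hypothetical stand-in for the model benchmarked in test.py.
model = models.resnet18()
model.eval()

n_iters = 10  # assumed; the original iteration count is unknown

for volatile in (False, True):
    for batchsize in (1, 2, 4, 8, 16, 32):
        # volatile=True disables autograd bookkeeping (pre-0.4 API),
        # matching the "volatile" column in the results above.
        x = Variable(torch.randn(batchsize, 3, 224, 224), volatile=volatile)
        start = time.time()
        for _ in range(n_iters):
            model(x)
        elapsed = time.time() - start
        print('volatile = %s batchsize = %d time = %.3f samples/sec = %.3f'
              % (volatile, batchsize, elapsed, n_iters * batchsize / elapsed))
```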
It's also worth noting that when speed_comparisson_test.sh is run with OMP_NUM_THREADS=1, I don't see a significant difference in speed between the conda install and the wheel.
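For anyone reproducing that observation, the thread count can also be inspected and pinned from inside Python rather than via the environment variable; a minimal sketch, assuming `torch.get_num_threads`/`torch.set_num_threads` behave as in current releases (the script above only used `OMP_NUM_THREADS`):

```python
import torch

# Report how many threads torch uses by default on this install,
# then pin to a single thread to mimic running with OMP_NUM_THREADS=1.
print(torch.get_num_threads())
torch.set_num_threads(1)
```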
Metadata
Metadata
Assignees
Labels
module: performanceIssues related to performance, either of kernel code or framework glueIssues related to performance, either of kernel code or framework gluetriagedThis issue has been looked at a team member, and triaged and prioritized into an appropriate moduleThis issue has been looked at a team member, and triaged and prioritized into an appropriate module