Noticeable speed difference between conda install and PyPI version #2163

@alykhantejani

Description

Found by @mirandaconrado

We wrote a simple benchmarking script to compare the speed of the conda install of PyTorch against the PyPI wheel install on OSX.

Here are the results:

**Install from PyPI wheel (using pip)** (time in seconds)

```
0.1.12_2
volatile = False 	batchsize = 1	time = 2.668	 samples/sec = 3.748
volatile = True  	batchsize = 1	time = 2.125	 samples/sec = 4.707
volatile = False 	batchsize = 2	time = 3.031	 samples/sec = 6.598
volatile = True  	batchsize = 2	time = 2.222	 samples/sec = 9.002
volatile = False 	batchsize = 4	time = 2.714	 samples/sec = 14.736
volatile = True  	batchsize = 4	time = 2.305	 samples/sec = 17.356
volatile = False 	batchsize = 8	time = 3.506	 samples/sec = 22.821
volatile = True  	batchsize = 8	time = 3.012	 samples/sec = 26.558
volatile = False 	batchsize = 16	time = 4.008	 samples/sec = 39.916
volatile = True  	batchsize = 16	time = 3.616	 samples/sec = 44.243
volatile = False 	batchsize = 32	time = 4.557	 samples/sec = 70.220
volatile = True  	batchsize = 32	time = 3.822	 samples/sec = 83.730
```

**From Conda install**

```
0.1.12_2
volatile = False 	batchsize = 1	time = 2.234	 samples/sec = 4.476
volatile = True  	batchsize = 1	time = 1.711	 samples/sec = 5.843
volatile = False 	batchsize = 2	time = 2.359	 samples/sec = 8.479
volatile = True  	batchsize = 2	time = 1.939	 samples/sec = 10.316
volatile = False 	batchsize = 4	time = 2.443	 samples/sec = 16.371
volatile = True  	batchsize = 4	time = 2.017	 samples/sec = 19.831
volatile = False 	batchsize = 8	time = 2.444	 samples/sec = 32.730
volatile = True  	batchsize = 8	time = 2.172	 samples/sec = 36.828
volatile = False 	batchsize = 16	time = 2.773	 samples/sec = 57.708
volatile = True  	batchsize = 16	time = 2.351	 samples/sec = 68.052
volatile = False 	batchsize = 32	time = 3.424	 samples/sec = 93.453
volatile = True  	batchsize = 32	time = 2.996	 samples/sec = 106.800
```

Here's the script I'm using to generate the results (`speed_comparisson_test.sh`):

```sh
conda create --name pytorch_speed_from_pypi -y python=2.7.13 numpy pyyaml
source activate pytorch_speed_from_pypi
wget http://download.pytorch.org/whl/torch-0.1.12.post2-cp27-none-macosx_10_7_x86_64.whl
pip uninstall -y torch
pip install torch-0.1.12.post2-cp27-none-macosx_10_7_x86_64.whl --user
python test.py
pip uninstall -y torch
source deactivate
conda-env remove --name pytorch_speed_from_pypi -y
rm torch-0.1.12.post2-cp27-none-macosx_10_7_x86_64.whl

conda create --name pytorch_speed_conda_only -y python=2.7.13 numpy pyyaml
source activate pytorch_speed_conda_only
conda install pytorch -y -c soumith
python test.py
source deactivate
conda-env remove -y --name pytorch_speed_conda_only
```

And here's a dump of test.py.
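The dump itself isn't reproduced here, so for context, here is a minimal sketch of what such a benchmark could look like. It is a hypothetical reconstruction, not the actual test.py: the model (a torchvision `resnet18`), the 3x224x224 input size, and the 10 timed forward passes per configuration are assumptions, though 10 iterations is consistent with the reported numbers (e.g. 10 × 1 / 2.668 ≈ 3.748 samples/sec).

```python
# Hypothetical benchmark sketch for PyTorch 0.1.12 (old Variable/volatile API).
# Not the original test.py; the model, input size, and iteration count are assumed.
import time

import torch
from torch.autograd import Variable
import torchvision.models as models  # assumes torchvision is installed in the env

model = models.resnet18()  # placeholder model
model.eval()

n_iter = 10  # timed forward passes per configuration

for batchsize in [1, 2, 4, 8, 16, 32]:
    for volatile in [False, True]:
        x = Variable(torch.randn(batchsize, 3, 224, 224), volatile=volatile)
        start = time.time()
        for _ in range(n_iter):
            model(x)
        elapsed = time.time() - start
        print('volatile = %s \tbatchsize = %d\ttime = %.3f\t samples/sec = %.3f'
              % (volatile, batchsize, elapsed, n_iter * batchsize / elapsed))
```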

It's also worth noting that when `speed_comparisson_test.sh` is run with `OMP_NUM_THREADS=1`, I don't see a significant difference in speed between the conda install and the wheel.
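That hints the gap may come from how the two builds use threads (e.g. different bundled OpenMP/BLAS libraries) rather than from the framework code itself. Below is a small sanity-check sketch for comparing the threading setup of the two environments; it assumes `torch.get_num_threads()` is available in this PyTorch version.

```python
# Sanity check of threading/BLAS configuration in each environment (a sketch,
# assuming torch.get_num_threads() exists in PyTorch 0.1.12).
from __future__ import print_function
import os

import numpy as np
import torch

print('OMP_NUM_THREADS =', os.environ.get('OMP_NUM_THREADS'))
print('torch.get_num_threads() =', torch.get_num_threads())
np.__config__.show()  # reports which BLAS/LAPACK numpy was built against
```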

cc @VitalyFedyunin @ngimel @mruberry

Labels: module: performance (Issues related to performance, either of kernel code or framework glue), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
