@syed-ahmed (Collaborator) commented Jun 3, 2019

Stack from ghstack:

Resubmit of #20624

Effective Bandwidth Benchmark

Float Type

Before:

log_normal, size, elements 65536 forward 4.84466552734375e-06 bandwidth (GB/s) 54.1098242015748
log_normal, size, elements 131072 forward 5.0425529479980465e-06 bandwidth (GB/s) 103.97273075895983
log_normal, size, elements 262144 forward 7.326602935791016e-06 bandwidth (GB/s) 143.11898832098927
log_normal, size, elements 524288 forward 1.1749267578125e-05 bandwidth (GB/s) 178.49214736623378
log_normal, size, elements 1048576 forward 2.05230712890625e-05 bandwidth (GB/s) 204.37019103643124
log_normal, size, elements 2097152 forward 3.9284229278564456e-05 bandwidth (GB/s) 213.53627534643442
log_normal, size, elements 4194304 forward 7.281541824340821e-05 bandwidth (GB/s) 230.40746595613766
log_normal, size, elements 8388608 forward 0.00013544559478759766 bandwidth (GB/s) 247.7336531514311
log_normal, size, elements 16777216 forward 0.0002670741081237793 bandwidth (GB/s) 251.27431659866272
log_normal, size, elements 33554432 forward 0.0005250406265258789 bandwidth (GB/s) 255.63303336753222

After:

log_normal, size, elements 65536 forward 5.47647476196289e-06 bandwidth (GB/s) 47.86728897588159
log_normal, size, elements 131072 forward 6.859302520751953e-06 bandwidth (GB/s) 76.43459351936045
log_normal, size, elements 262144 forward 7.7056884765625e-06 bandwidth (GB/s) 136.07817175445544
log_normal, size, elements 524288 forward 8.029937744140625e-06 bandwidth (GB/s) 261.1666574289786
log_normal, size, elements 1048576 forward 1.1892318725585938e-05 bandwidth (GB/s) 352.6901773138733
log_normal, size, elements 2097152 forward 1.9683837890625e-05 bandwidth (GB/s) 426.1672975875969
log_normal, size, elements 4194304 forward 3.241539001464844e-05 bandwidth (GB/s) 517.5694629130921
log_normal, size, elements 8388608 forward 5.803346633911133e-05 bandwidth (GB/s) 578.1910700272298
log_normal, size, elements 16777216 forward 0.00011091709136962891 bandwidth (GB/s) 605.0362768381755
log_normal, size, elements 33554432 forward 0.00021491527557373046 bandwidth (GB/s) 624.5146029834174
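
The bandwidth column in these logs appears to follow directly from the element count and the forward time. The sketch below is a minimal illustration, assuming 4 bytes written per element and 1 GB = 1e9 bytes. The benchmark script itself is not part of this PR description, so the helper name `effective_bandwidth_gbs` and both assumptions are hypothetical.

```python
def effective_bandwidth_gbs(elements, seconds, bytes_per_element=4):
    """Return bytes moved per second in GB/s (1 GB = 1e9 bytes).

    Hypothetical helper: assumes a single write of `bytes_per_element`
    bytes for every element, which is an assumption about the benchmark.
    """
    return elements * bytes_per_element / seconds / 1e9


# Element count and forward time taken from the first "Before" and
# "After" rows of the float benchmark.
before = effective_bandwidth_gbs(65536, 4.84466552734375e-06)
after = effective_bandwidth_gbs(65536, 5.47647476196289e-06)
print(f"before: {before:.2f} GB/s, after: {after:.2f} GB/s")
```

Under these assumptions the helper reproduces the logged float figures for the 65536-element rows to within rounding, which suggests the reported bandwidth is simply bytes written divided by forward time.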

Double Type

Before:

log_normal, size, elements 65536 forward 5.793571472167969e-06 bandwidth (GB/s) 45.247392089547326
log_normal, size, elements 131072 forward 8.199214935302735e-06 bandwidth (GB/s) 63.943682918057576
log_normal, size, elements 262144 forward 1.3582706451416015e-05 bandwidth (GB/s) 77.19934195373004
log_normal, size, elements 524288 forward 2.3326873779296876e-05 bandwidth (GB/s) 89.90283137988553
log_normal, size, elements 1048576 forward 4.379749298095703e-05 bandwidth (GB/s) 95.76584673062604
log_normal, size, elements 2097152 forward 8.105754852294922e-05 bandwidth (GB/s) 103.48953493979646
log_normal, size, elements 4194304 forward 0.0001421213150024414 bandwidth (GB/s) 118.04855590951854
log_normal, size, elements 8388608 forward 0.00027796506881713865 bandwidth (GB/s) 120.71456367804988
log_normal, size, elements 16777216 forward 0.0005494546890258789 bandwidth (GB/s) 122.13721229493271
log_normal, size, elements 33554432 forward 0.0010767412185668946 bandwidth (GB/s) 124.65179718729368

After:

log_normal, size, elements 65536 forward 5.91278076171875e-06 bandwidth (GB/s) 44.33514628129032
log_normal, size, elements 131072 forward 7.789134979248047e-06 bandwidth (GB/s) 67.31017005056627
log_normal, size, elements 262144 forward 9.219646453857422e-06 bandwidth (GB/s) 113.73277763392811
log_normal, size, elements 524288 forward 1.5113353729248047e-05 bandwidth (GB/s) 138.7615242500079
log_normal, size, elements 1048576 forward 2.7089118957519532e-05 bandwidth (GB/s) 154.83353321964444
log_normal, size, elements 2097152 forward 4.64177131652832e-05 bandwidth (GB/s) 180.71997580169503
log_normal, size, elements 4194304 forward 8.719682693481446e-05 bandwidth (GB/s) 192.40626740399748
log_normal, size, elements 8388608 forward 0.0001693272590637207 bandwidth (GB/s) 198.16320293339717
log_normal, size, elements 16777216 forward 0.00033437252044677735 bandwidth (GB/s) 200.70089464986953
log_normal, size, elements 33554432 forward 0.0006206154823303223 bandwidth (GB/s) 216.26551676737367
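
To compare the runs, the log lines can be parsed and the forward times divided. The sketch below is one way to do that with the Python standard library; the helper names (`parse_line`, `speedup`, `implied_bytes_per_element`) are hypothetical and not part of the PR or of PyTorch.

```python
import re

# Matches one benchmark line in the format printed in this PR description.
LINE_RE = re.compile(
    r"log_normal, size, elements (\d+) forward (\S+) bandwidth \(GB/s\) (\S+)"
)


def parse_line(line):
    """Return (elements, seconds, gb_per_s) for one log line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized benchmark line: {line!r}")
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


def speedup(before_line, after_line):
    """Ratio of forward times (before / after); > 1 means the new code is faster."""
    n_before, t_before, _ = parse_line(before_line)
    n_after, t_after, _ = parse_line(after_line)
    if n_before != n_after:
        raise ValueError("lines describe different tensor sizes")
    return t_before / t_after


def implied_bytes_per_element(line):
    """Bytes per element that the logged bandwidth implies (1 GB = 1e9 bytes)."""
    elements, seconds, gb_per_s = parse_line(line)
    return gb_per_s * 1e9 * seconds / elements


# Largest-size rows copied from the benchmark output above.
float_before = "log_normal, size, elements 33554432 forward 0.0005250406265258789 bandwidth (GB/s) 255.63303336753222"
float_after = "log_normal, size, elements 33554432 forward 0.00021491527557373046 bandwidth (GB/s) 624.5146029834174"
double_before = "log_normal, size, elements 33554432 forward 0.0010767412185668946 bandwidth (GB/s) 124.65179718729368"
double_after = "log_normal, size, elements 33554432 forward 0.0006206154823303223 bandwidth (GB/s) 216.26551676737367"

print(f"float speedup:  {speedup(float_before, float_after):.2f}x")
print(f"double speedup: {speedup(double_before, double_after):.2f}x")
print(f"double bytes/element implied by log: {implied_bytes_per_element(double_after):.2f}")
```

At 33554432 elements this gives roughly a 2.4x speedup for float and 1.7x for double, while the smallest sizes in the logs above are slightly slower after the change. The double rows also imply about 4 bytes per element, so if the kernel writes 8 bytes per double, actual memory traffic would be about twice the logged figure. That is an inference from the numbers, not something the PR states.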

Differential Revision: D15632930

@pytorchbot added the module: cuda, module: internals, and module: operators labels Jun 3, 2019
Move THCTensor_(lognormal) to ATen

gh-metadata: pytorch pytorch 21299 gh/syed-ahmed/11/head
@syed-ahmed changed the title from "Move THCTensor_(exponential) to ATen" to "Move THCTensor_(lognormal) to ATen" Jun 3, 2019
@zou3519 deleted the gh/syed-ahmed/11/head branch June 5, 2019 02:16
@facebook-github-bot (Contributor) commented:

@ezyang merged this pull request in c82bf8e.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Jun 5, 2019
Summary:
Pull Request resolved: pytorch/pytorch#21299
ghimport-source-id: 2c63f289f02087f023feda8bff6b90ed49737889

Reviewed By: jerryzh168

Differential Revision: D15632930

Pulled By: ezyang

fbshipit-source-id: 85c17cdca486b46942c5b500e4fd4d95bb5657f9
Labels: Merged, module: cuda, module: internals, open source
