@aam-at
Contributor

@aam-at aam-at commented Apr 27, 2017

Use magma_*gesdd instead of magma_*gesvd. gesdd uses a divide-and-conquer algorithm and is significantly faster than the gesvd routine (roughly 3x to 4x in the comparison below). Speed comparison (10 runs, Titan X):

| N | Old (gesvd) | New (gesdd) |
|------|---------|--------|
| 1088 | 0.7469 | 0.2180 |
| 2112 | 3.4861 | 0.9318 |
| 3136 | 8.7378 | 2.2060 |
| 4160 | 14.7790 | 4.3269 |
| 5184 | 25.9419 | 8.6554 |

@colesbury
Member

@pytorchbot test this please

@soumith
Contributor

soumith commented Apr 28, 2017

Merged via a44317f.

Thanks a lot @aam-at

@soumith soumith closed this Apr 28, 2017
@aam-at aam-at deleted the magma_sgesdd branch April 28, 2017 12:42
@aam-at
Contributor Author

aam-at commented Apr 28, 2017

@soumith Thanks!

@aam-at aam-at restored the magma_sgesdd branch May 5, 2017 06:12
@aam-at aam-at deleted the magma_sgesdd branch May 5, 2017 06:13
jjsjann123 pushed a commit to jjsjann123/pytorch that referenced this pull request Jan 26, 2022
* Have Kernel Inherit IrContainer (pytorch#1375)
* Kernel<-Fusion Step 1 - Convert ExprSort to StmtSort (pytorch#1376)
* Kernel<-Fusion Step 2 - Mutator refactor (pytorch#1377)
* Kernel<-Fusion Step 3 - Debug print for expr_eval and type promotion fix (pytorch#1379)
* Kernel<-Fusion Step 4 - Have kernel inherit Fusion (pytorch#1380)
* Kernel<-Fusion Step 5 - Move lowering passes into their own files (pytorch#1382)
* Kernel<-Fusion Step 6 - Remove kir::IrBuilder (pytorch#1383)
* Kernel<-Fusion Step 7 - Remove kir functions from ComputeAtMap (pytorch#1384)
* Kernel<-Fusion Step 8 - Clean up [lower/executor] utils (pytorch#1387)
* Kernel<-Fusion Step 9 - Remove TensorView::fuserTv (pytorch#1388)
* Kernel<-Fusion Step 10 - Remove lowerVal/lowerExpr (pytorch#1389)
* Kernel<-Fusion Step 11 - Finish cleaning up kir (pytorch#1390)
hubertlu-tw pushed a commit to hubertlu-tw/pytorch that referenced this pull request Nov 1, 2022
* initial check in
* fix
* fix test
* address some review comments and cleanup
* fix
* bookmark
* fix sync placement to come before gather
* similar fix for non-gather case
* add async bert
* update gpt minimal test
* allow selection of default pp test
* fix bert test
* cleanup
* cleanup
rraminen pushed a commit to rraminen/pytorch that referenced this pull request May 14, 2024
* Pass wheel destination directory to fix_so.sh
* Update triton commit
* Using shell=True results in error processing arguments to script
* Update triton commit post-PR merge