Support sparse tensors in THPP #667

colesbury · 2017-02-01T19:49:46Z

The only op currently supported is dense tensor + sparse tensor addition, which is all we need for autograd.
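For readers unfamiliar with the operation, here is a minimal sketch of adding a sparse tensor into a dense one. It is not the THPP code: `SparseVec` and `addSparseInPlace` are hypothetical names, and a 1-D coordinate (index/value) layout is assumed purely for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical 1-D sparse vector in coordinate form (an assumption for this sketch).
struct SparseVec {
  std::size_t size;                  // logical length of the vector
  std::vector<std::size_t> indices;  // positions of the stored entries
  std::vector<float> values;         // stored entries, parallel to `indices`
};

// Computes dense += alpha * sparse, visiting only the stored entries.
// Repeated indices accumulate, so the sparse operand need not be coalesced.
void addSparseInPlace(std::vector<float>& dense, float alpha, const SparseVec& sparse) {
  if (dense.size() != sparse.size) {
    throw std::invalid_argument("dense and sparse sizes differ");
  }
  if (sparse.indices.size() != sparse.values.size()) {
    throw std::invalid_argument("indices and values have different lengths");
  }
  for (std::size_t i = 0; i < sparse.indices.size(); ++i) {
    const std::size_t idx = sparse.indices[i];
    if (idx >= dense.size()) {
      throw std::out_of_range("sparse index out of range");
    }
    dense[idx] += alpha * sparse.values[i];
  }
}
```

The loop costs time proportional to the number of stored entries rather than the dense size, which is the usual reason for keeping one operand sparse.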

I've made the tensor_type tensor member public. The alternative is to add a getter, but getters have always seemed verbose and lame in C++.
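The trade-off mentioned here can be sketched as follows. All names (`RawTensor`, `PublicHandle`, `GetterHandle`, `getRaw`) are hypothetical stand-ins rather than THPP classes, and the wrapped member is assumed to be a raw pointer.

```cpp
// Hypothetical stand-in for the underlying C tensor struct.
struct RawTensor {
  long size;
};

// Option A: expose the wrapped handle directly (the route described above).
struct PublicHandle {
  RawTensor* tensor;  // callers read and write this member directly
};

// Option B: keep the handle private and add a getter.
class GetterHandle {
 public:
  explicit GetterHandle(RawTensor* t) : tensor_(t) {}
  RawTensor* getRaw() const { return tensor_; }  // read-only access to the pointer

 private:
  RawTensor* tensor_;
};
```

Option A is shorter at every call site, while option B stops callers from reseating the pointer; which matters more depends on how widely the wrapper is used.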

torch/lib/THPP/tensors/THCTensor.hpp

torch/lib/THPP/tensors/generic/THSTensor.cpp

apaszke · 2017-02-01T20:52:35Z

I've merged my branch and we have a conflict now 😕

torch/lib/THPP/tensors/generic/THCTensor.cpp

 }

+template<>
+bool THCTensor<real>::isSparse() const {


torch/lib/THPP/tensors/generic/THSTensor.cpp

+
+template<>
+long long THSTensor<real>::numel() const {
+  throw std::runtime_error("THSTensor::numel() not supported");
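The pattern in these two hunks, a sparsity query plus operations that throw when the sparse type does not implement them, might look like the sketch below. The class names and the `numelOrMinusOne` helper are hypothetical, and the return values of `isSparse()` are assumptions, since the hunks do not show the function bodies.

```cpp
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical interface, loosely modelled on the hunks quoted above.
struct TensorBase {
  virtual ~TensorBase() = default;
  virtual bool isSparse() const = 0;
  virtual long long numel() const = 0;
};

struct DenseTensor : TensorBase {
  std::vector<float> data;
  explicit DenseTensor(std::vector<float> d) : data(std::move(d)) {}
  bool isSparse() const override { return false; }  // assumed for the dense case
  long long numel() const override { return static_cast<long long>(data.size()); }
};

struct SparseTensor : TensorBase {
  bool isSparse() const override { return true; }
  // Unsupported operations fail loudly instead of returning a bogus value.
  long long numel() const override {
    throw std::runtime_error("SparseTensor::numel() not supported");
  }
};

// Callers check isSparse() before using an op the sparse type lacks.
long long numelOrMinusOne(const TensorBase& t) {
  return t.isSparse() ? -1 : t.numel();
}
```

Throwing keeps the interface uniform while making every unimplemented path obvious at runtime; the cost is that misuse is only detected when the call is actually made.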


apaszke reviewed Feb 1, 2017

torch/lib/THPP/tensors/THCTensor.hpp Outdated

torch/lib/THPP/tensors/generic/THSTensor.cpp Outdated

colesbury force-pushed the thpp_sparse branch from 2af5b92 to e87f14b Compare February 1, 2017 22:03

Support sparse tensors in THPP

cba2a7b

colesbury force-pushed the thpp_sparse branch from e87f14b to cba2a7b Compare February 1, 2017 22:05

colesbury merged commit 138f254 into pytorch:master Feb 1, 2017

colesbury deleted the thpp_sparse branch February 1, 2017 22:44

fmassa reviewed Feb 1, 2017

torch/lib/THPP/tensors/generic/THCTensor.cpp

 }

+template<>
+bool THCTensor<real>::isSparse() const {

fmassa reviewed Feb 1, 2017

torch/lib/THPP/tensors/generic/THSTensor.cpp

+template<>
+long long THSTensor<real>::numel() const {
+  throw std::runtime_error("THSTensor::numel() not supported");

zou3519 pushed a commit to zou3519/pytorch that referenced this pull request Mar 30, 2018

[auto] Update onnx to 9d2b530 - Revert "[Typing 1/3] Setup mypy type …

c4e5001

…checker (pytorch#607)" (pytorch#667) onnx/onnx@9d2b530

jjsjann123 pushed a commit to jjsjann123/pytorch that referenced this pull request Apr 11, 2021

Revert "Fusion::lookupValue() (pytorch#652)" (pytorch#667)

f6a6b35

This reverts commit b9fde03.

KyleCZH pushed a commit to KyleCZH/pytorch that referenced this pull request Sep 20, 2021

conda: Unpin numpy's run dependency (pytorch#667)

58d7a67

Signed-off-by: Eli Uriegas <[email protected]>
