CUDA kernel for ClipGradNorm for TensorSeq gradients#12412

Merged
baijumeswani merged 6 commits into master from bmeswani/clipgradnorm on Aug 5, 2022
Conversation

@baijumeswani
Contributor

Currently, ClipGradNorm in onnxruntime is implemented as a composition of more primitive onnxruntime ops, followed by a SequenceConstruct before the result is fed into AdamWOptimizer.

This SequenceConstruct creates another copy of the input tensors and packs them into a TensorSeq. To avoid these copies, the ClipGradNorm kernel can instead take a TensorSeq as input and clip the gradients in place.
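For illustration, here is a minimal NumPy sketch of the in-place clip-by-global-norm operation the fused kernel performs on the gradient sequence. This is not the actual CUDA implementation; the function name and the `eps` stabilizer are assumptions for the sketch.

```python
import numpy as np

def clip_grad_norm_(grads, max_norm, eps=1e-6):
    """Scale every gradient in `grads` in place so the global L2 norm
    does not exceed `max_norm`. Returns the pre-clip total norm.

    Sketch only: the real kernel operates on a TensorSeq on the GPU,
    avoiding the copies a SequenceConstruct would introduce.
    """
    # Global L2 norm across all gradient tensors in the sequence.
    total_norm = np.sqrt(sum(float(np.sum(g * g)) for g in grads))
    clip_coef = max_norm / (total_norm + eps)
    if clip_coef < 1.0:
        for g in grads:
            g *= clip_coef  # in-place scale: no new tensor allocations
    return total_norm
```

Because the scaling is applied in place, the same tensors can then flow directly into the optimizer step without an intermediate copy.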

@baijumeswani baijumeswani added the training issues related to ONNX Runtime training; typically submitted using template label Aug 1, 2022
@baijumeswani baijumeswani force-pushed the bmeswani/clipgradnorm branch from 1eb026f to a1119e1 on August 2, 2022 16:42
@baijumeswani baijumeswani force-pushed the bmeswani/clipgradnorm branch from 536797d to b91946f on August 3, 2022 23:54
@askhade askhade previously approved these changes Aug 4, 2022
@baijumeswani baijumeswani merged commit a7d6290 into master Aug 5, 2022
@baijumeswani baijumeswani deleted the bmeswani/clipgradnorm branch August 5, 2022 05:28