
Fix integer overflow in LongformerAttention #12435

Merged
tianleiwu merged 1 commit into master from tlwu/fix_longformer_int_overflow
Aug 3, 2022

@tianleiwu tianleiwu commented Aug 3, 2022

Description:

LongformerAttention may need more than 2 GB of memory for a large batch size, in which case the GetScratch1Size function can hit an integer overflow. For example, with batch_size=64, sequence_length=4096, window=256, num_heads=12, and element_size=2, the computed value exceeds 2 GB and therefore overflows. With a smaller batch size such as 16 (or 32), the error is not triggered.
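The arithmetic behind the example can be sketched as follows. This is a minimal illustration, not the code changed in this PR: `ScratchBytes` and `FitsInInt32` are hypothetical helpers, and the size formula is an assumption chosen only to show how a product of these shape parameters passes the 32-bit limit. The point is that doing the multiplication in a 64-bit type keeps the result exact, while the same product stored in a 32-bit `int` would overflow.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a scratch-size helper. The formula is an
// assumption for illustration; it is not quoted from this PR.
// Every operand is 64-bit, so the product cannot wrap for these shapes.
constexpr std::uint64_t ScratchBytes(std::uint64_t element_size,
                                     std::uint64_t batch_size,
                                     std::uint64_t num_heads,
                                     std::uint64_t sequence_length,
                                     std::uint64_t window) {
  return (5 * sequence_length - 3 * window) * window * num_heads * batch_size * element_size;
}

// True when the value can be stored in a 32-bit signed int without overflow.
constexpr bool FitsInInt32(std::uint64_t value) {
  return value <= static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// Shape from the description: with batch_size=64 the size exceeds the int range.
static_assert(!FitsInInt32(ScratchBytes(2, 64, 12, 4096, 256)),
              "batch_size=64 needs a 64-bit size type");

// Under the assumed formula, batch_size=16 still fits in a 32-bit int.
static_assert(FitsInInt32(ScratchBytes(2, 16, 12, 4096, 256)),
              "batch_size=16 stays below the 32-bit limit");
```

Keeping every operand in a 64-bit unsigned type (or `size_t` on a 64-bit platform) avoids the problem without changing the formula itself; any narrowing back to `int` at a call site would reintroduce it.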

Motivation and Context

  • Why is this change required? What problem does it solve?
  • If it fixes an open issue, please link to the issue here.

The CUDA implementation of LongformerAttention might fail with a large batch size. In a debug build, the assertion assert(current_pointer == scratch2) fails. In a release build, the error looks like: onnxruntime::CudaCall] CUDA failure 700: an illegal memory access was encountered

@tianleiwu tianleiwu merged commit 97a340b into master Aug 3, 2022
@tianleiwu tianleiwu deleted the tlwu/fix_longformer_int_overflow branch August 3, 2022 17:29