Conversation

@anjali411 anjali411 requested review from nairbv and yf225 October 10, 2019 18:36
@pytorchbot pytorchbot added module: cpu CPU specific problem (e.g., perf, algorithm) module: operators labels Oct 10, 2019
}

TORCH_CHECK(!(std::isinf(minval) || std::isinf(maxval)), "range of [", minval, ", ", maxval, "] is not finite");

looks good, let's also add the example from the issue to a unit test too
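
For reference, a minimal sketch of such a unit test in Python (illustrative only; the exact repro from the issue and the test that eventually landed may differ, but the "not finite" message matches the TORCH_CHECK above):

import unittest
import torch

class TestHistc(unittest.TestCase):
    def test_histc_raises_on_inf(self):
        # Before this fix, histc on a tensor containing inf could segfault;
        # with the new range check it should raise a RuntimeError instead.
        x = torch.tensor([float('inf'), 1.0, 2.0])
        with self.assertRaisesRegex(RuntimeError, 'not finite'):
            torch.histc(x, bins=4)

if __name__ == '__main__':
    unittest.main()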

@nairbv nairbv changed the title [histc issue] added a finite range check to resolve torch.histc segfaults if tensor has inf torch.histc added a finite range check to resolve torch.histc segfaults if tensor has inf Oct 10, 2019
@nairbv nairbv changed the title torch.histc added a finite range check to resolve torch.histc segfaults if tensor has inf torch.histc added a finite range check to resolve segfaults if tensor has inf Oct 10, 2019
@vadimkantorov vadimkantorov left a comment

Will it be fine for NaN?

@anjali411
Contributor Author

@vadimkantorov good point! I added a check for that, and it will now throw an error for tensors with NaN values when a finite range is not specified
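
For illustration, a sketch of the NaN behavior described above (example values are made up; the error text follows the TORCH_CHECK message added in this PR):

import torch

x = torch.tensor([float('nan'), 1.0, 2.0])

# With no explicit range, histc falls back to the tensor's own min/max,
# which are NaN here, so the new check raises instead of misbehaving:
try:
    torch.histc(x, bins=4)
except RuntimeError as e:
    print(e)  # e.g. "range of [nan, nan] is not finite"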

@anjali411 anjali411 changed the title torch.histc added a finite range check to resolve segfaults if tensor has inf torch.histc added a finite range check to resolve segfaults if tensor has inf. also added checks for nan values, min>max Oct 11, 2019
@anjali411 anjali411 added the module: bc-breaking Related to a BC-breaking change label Oct 14, 2019
@anjali411 anjali411 added the module: cuda Related to torch.cuda, and CUDA support in general label Oct 16, 2019
@yf225 yf225 left a comment

Changes look awesome! The CI failure is likely unrelated; after rebasing and passing CI we should be good to go.

}
TORCH_CHECK(!(std::isinf(minvalue) || std::isinf(maxvalue) || std::isnan(minvalue) || std::isnan(maxvalue)), "range of [", minvalue, ", ", maxvalue, "] is not finite");
TORCH_CHECK(minvalue < maxvalue, "max must be larger than min");
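
As a quick sketch of what the second check rejects from the Python side (values are illustrative):

import torch

# An inverted range (min > max) now fails fast with a clear error
# instead of proceeding with an invalid range:
try:
    torch.histc(torch.ones(5), bins=4, min=3.0, max=1.0)
except RuntimeError as e:
    print(e)  # "max must be larger than min"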

Windows CI gives error:

15:26:54 C:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\ucrt\corecrt_math.h(409): error: more than one instance of overloaded function "fpclassify" matches the argument list:
15:26:54 SummaryOps.cu
15:26:54             function "fpclassify(float)"
15:26:54             function "fpclassify(double)"
15:26:54             function "fpclassify(long double)"
15:26:54             argument types are: (uint8_t)
15:26:54           detected during:
15:26:54             instantiation of "__nv_bool isinf(_Ty) [with _Ty=uint8_t]" 
15:26:54 C:/Jenkins/workspace/pytorch-builds/pytorch-win-ws2016-cuda9-cudnn7-py3-build/aten/src/ATen/native/cuda/SummaryOps.cu(326): here
15:26:54             instantiation of "at::Tensor at::<unnamed>::_histc_cuda_template(const at::Tensor &, int64_t, input_t, input_t) [with input_t=uint8_t]" 
15:26:54 C:/Jenkins/workspace/pytorch-builds/pytorch-win-ws2016-cuda9-cudnn7-py3-build/aten/src/ATen/native/cuda/SummaryOps.cu(357): here

To fix this issue, in this file I think we should write THCNumerics<input_t>::isinf(val) instead of std::isinf(val), and THCNumerics<input_t>::isnan(val) instead of std::isnan(val), following the existing pattern:

assert(!THCNumerics<scalar_t>::isinf(val));
assert(!THCNumerics<scalar_t>::isnan(val));

This only applies to CUDA code though; we don't need to change the CPU code.

TORCH_CHECK(!(THCNumerics<input_t>::isinf(minvalue) || THCNumerics<input_t>::isinf(maxvalue) || THCNumerics<input_t>::isnan(minvalue) || THCNumerics<input_t>::isnan(maxvalue)), "range of [", minvalue, ", ", maxvalue, "] is not finite");
TORCH_CHECK(minvalue < maxvalue, "max must be larger than min");

The ROCm build is failing with:

18:29:27 [ 87%] Linking CXX executable ../bin/generate_proposals_op_gpu_test
18:29:29 ../lib/libtorch.so: undefined reference to `isinf(float)'
18:29:29 ../lib/libtorch.so: undefined reference to `isnan(float)'
18:29:29 ../lib/libtorch.so: undefined reference to `isnan(double)'
18:29:29 ../lib/libtorch.so: undefined reference to `isinf(double)'
18:29:29 clang: error: linker command failed with exit code 1 (use -v to see invocation)

To fix this, I think we need to #include <THC/THCNumerics.cuh> at the top of this file after #include <ATen/cuda/CUDAApplyUtils.cuh>, taking inspiration from the aten/src/ATen/native/cuda/MultinomialKernel.cu file.

oh okay 👌 thanks!!

maxval = maxval + 1;
}

TORCH_CHECK(!(std::isinf(minval) || std::isinf(maxval) || std::isnan(minval) || std::isnan(maxval)), "range of [", minval, ", ", maxval, "] is not finite");

std::isnan -> th_isnan

yf225 commented Oct 22, 2019

Current CI errors seem unrelated and we are ready to merge this PR :D

@facebook-github-bot facebook-github-bot left a comment

@anjali411 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Oct 24, 2019
… has inf. also added checks for nan values, min>max (#27712)

Summary:
pytorch/pytorch#27464
Pull Request resolved: pytorch/pytorch#27712

Differential Revision: D18064544

Pulled By: anjali411

fbshipit-source-id: c9c6d8eb4d55f2b5320409ba238bf44b0be8902e
@facebook-github-bot

@anjali411 merged this pull request in 136bb07.

nairbv commented Dec 26, 2019

@anjali411 this previously segfaulted right? so not BC-breaking?

thiagocrepaldi pushed a commit to thiagocrepaldi/pytorch that referenced this pull request Feb 4, 2020
… has inf. also added checks for nan values, min>max (pytorch#27712)

Summary:
pytorch#27464
Pull Request resolved: pytorch#27712

Differential Revision: D18064544

Pulled By: anjali411

fbshipit-source-id: c9c6d8eb4d55f2b5320409ba238bf44b0be8902e