Instance Norm. #2562
Conversation
Not sure, but it looks like if we resolve the merge conflict, the memory check will pass.
Co-authored-by: Marcus Edel <[email protected]>
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
Keep open please, thanks!
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
@zoq @iamshnoo: Hi! I've figured out why "GradientInstanceNormLayersTest" crashes. The batchNorm object is initialized in the Forward() function of the InstanceNorm layer, so when ResetParameters() is called by FFN, no memory is allocated for the parameters (and subsequently the gradient) of the InstanceNorm layer; hence the memory crash later. We could initialize the batchNorm object in the constructor itself, although we would need to add input.n_cols to its arguments. Please let me know what you think about this.
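
To make the ordering issue concrete, here is a deliberately simplified, hypothetical C++ sketch (none of these class or member names are the actual mlpack or PR code): if the wrapped layer is only created inside Forward(), a reset pass that runs before the first forward call sees no parameters to allocate; constructing it in the InstanceNorm constructor, with the batch size (input.n_cols) passed as an extra argument, avoids that.

```cpp
// Hypothetical sketch of the initialization-order problem; not mlpack code.
#include <armadillo>
#include <iostream>
#include <memory>

// Stand-in for the wrapped BatchNorm layer: it owns some weights.
struct MockBatchNorm
{
  explicit MockBatchNorm(const arma::uword size) : weights(size, 1) { }
  arma::mat weights;
};

// Lazy version: the wrapped layer only exists after the first Forward() call,
// so a ResetParameters()-style pass that runs earlier has nothing to size.
struct LazyInstanceNorm
{
  arma::uword WeightSize() const
  {
    return batchNorm ? batchNorm->weights.n_elem : 0;  // 0 before Forward().
  }

  void Forward(const arma::mat& input)
  {
    if (!batchNorm)
      batchNorm.reset(new MockBatchNorm(input.n_rows * input.n_cols));
  }

  std::unique_ptr<MockBatchNorm> batchNorm;
};

// Eager version: the constructor takes the batch size (input.n_cols) as an
// extra argument and builds the wrapped layer immediately, so its weight
// size is already known when a reset pass walks the layers.
struct EagerInstanceNorm
{
  EagerInstanceNorm(const arma::uword size, const arma::uword batchSize) :
      batchNorm(size * batchSize)
  { }

  arma::uword WeightSize() const { return batchNorm.weights.n_elem; }

  MockBatchNorm batchNorm;
};

int main()
{
  LazyInstanceNorm lazy;
  EagerInstanceNorm eager(12, 4);

  // A reset pass running before Forward() would allocate 0 parameters for the
  // lazy layer, which is the root cause of the later memory error.
  std::cout << "lazy weight size before Forward():  " << lazy.WeightSize()
            << "\neager weight size before Forward(): " << eager.WeightSize()
            << std::endl;
}
```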
@hello-fri-end Yep, that seems like a possible solution. I am concerned about having to pass in the extra parameter, but feel free to create a new PR with the commits in this branch and add the modifications. If the tests pass there, and nobody has any issues with the addition of the extra parameter, that's good enough for me.
Instance Norm Layer.
Note: Currently, "InstanceNormLayerTest" and "InstanceNormLayerParametersTest" pass. "GradientInstanceNormLayerTest" crashes with strange memory errors for reasons unknown to me as of yet (I need some help debugging this, please). I am probably missing something minor in the Gradient test calculations.
To compare the Armadillo implementation with the PyTorch implementation, take a look at this Google Colab notebook.
The basic idea behind this implementation is that Instance Norm applied to an (N, C, H, W) input is equivalent to Batch Norm applied to a (1, N*C, H, W) input, so this layer is implemented as a wrapper around the BatchNorm layer. This is the same idea followed by the original author of the paper here, and it also matches the PyTorch implementation here. A rough sketch of the equivalence is given below.
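
As a rough illustration of that equivalence (a plain-Armadillo sketch, not the layer in this PR; the learnable scale/shift is omitted and an epsilon of 1e-5 is assumed): normalizing each (sample, channel) block of an (N, C, H, W) input with its own statistics gives the same result as computing per-channel batch-norm statistics over the same data viewed as a single sample with N*C channels.

```cpp
// Sketch only: compares per-(sample, channel) normalization with per-channel
// statistics on the data viewed as one sample with N * C channels.
#include <armadillo>
#include <cmath>
#include <iostream>

int main()
{
  const arma::uword N = 2, C = 3, H = 4, W = 5;
  const double eps = 1e-5;  // Assumed epsilon; gamma/beta omitted.

  // Flattened layout: one column per sample, each holding C * H * W values
  // (channel 0's H * W values first, then channel 1, and so on).
  arma::mat input(C * H * W, N, arma::fill::randn);

  // "Instance norm": normalize every (sample, channel) block of H * W values
  // with its own mean and (biased) variance.
  arma::mat instanceNorm(C * H * W, N);
  for (arma::uword n = 0; n < N; ++n)
  {
    for (arma::uword c = 0; c < C; ++c)
    {
      arma::vec block = input.submat(c * H * W, n, (c + 1) * H * W - 1, n);
      const double mean = arma::mean(block);
      const double var = arma::mean(arma::square(block - mean));
      instanceNorm.submat(c * H * W, n, (c + 1) * H * W - 1, n) =
          (block - mean) / std::sqrt(var + eps);
    }
  }

  // "Batch norm on the reshaped input": a column-major reshape turns the same
  // data into one column per (sample, channel) pair, i.e. a single sample
  // with N * C channels, so per-channel statistics are identical to above.
  arma::mat reshaped = arma::reshape(input, H * W, N * C);
  arma::rowvec means = arma::mean(reshaped, 0);
  arma::mat centered = reshaped.each_row() - means;
  arma::rowvec vars = arma::mean(arma::square(centered), 0);
  arma::rowvec stdevs = arma::sqrt(vars + eps);
  arma::mat batchNorm = centered;
  batchNorm.each_row() /= stdevs;

  // The two results should agree up to floating-point error.
  arma::mat diff = arma::abs(arma::reshape(batchNorm, C * H * W, N) -
                             instanceNorm);
  std::cout << "max abs difference: " << diff.max() << std::endl;
}
```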