
Conversation

@peterjc123
Collaborator

This PR adds the new Nadam optimizer. I wrote this code based on the Nadam implementation in Keras and the Adam code in the PyTorch repo. I've tested it, and it shows better performance than the original Adam on MNIST.
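For readers unfamiliar with Nadam (Adam with Nesterov momentum), a minimal sketch of one update step is below. This is the simplified formulation without Keras's momentum-decay schedule, written with NumPy for portability; it is an illustration, not the PR's actual PyTorch code.

```python
import numpy as np

def nadam_step(theta, grad, m, v, t, lr=2e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Nadam update (simplified, no momentum-decay schedule).

    theta: parameters; grad: gradient at theta; m, v: first/second
    moment estimates; t: 1-based step counter.
    Returns the updated (theta, m, v).
    """
    # Standard Adam moment updates.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Nesterov-style bias correction: the first-moment estimate
    # "looks ahead" one step in beta1, mixing in the current gradient.
    m_hat = (beta1 * m / (1 - beta1 ** (t + 1))
             + (1 - beta1) * grad / (1 - beta1 ** t))
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Running this on a simple quadratic (gradient `2 * theta`) drives the parameter toward zero, matching the behaviour one would expect from an Adam-family optimizer.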

peterjc123 and others added 3 commits April 30, 2017 14:21
Save memory by exploiting in-place operations.
@Jiaming-Liu
Contributor

Jiaming-Liu commented May 1, 2017

You might want to take a look at this PR against your branch, where in-place ops are exploited.
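The memory saving from in-place ops can be illustrated as follows (NumPy is used here for portability; in PyTorch the equivalents are the trailing-underscore methods such as `mul_` and `add_` on the moment buffers):

```python
import numpy as np

rng = np.random.default_rng(0)
grad = rng.standard_normal(1000)
m = rng.standard_normal(1000)   # first-moment buffer
beta1 = 0.9

# Out-of-place: allocates fresh arrays for each intermediate result.
m_new = beta1 * m + (1 - beta1) * grad

# In-place: reuses the existing buffer `m` instead of allocating a
# new array for the result (only the scaled gradient is a temporary).
m *= beta1
m += (1 - beta1) * grad

# Both orderings compute the same update.
assert np.allclose(m, m_new)
```

For an optimizer that keeps one or two moment buffers per parameter tensor, avoiding these per-step allocations noticeably reduces peak memory.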

Jiaming-Liu and others added 2 commits May 1, 2017 00:36
@Jiaming-Liu
Contributor

Tried it on CPU. It gradually became very slow after a few epochs. Any ideas?

@peterjc123
Collaborator Author

It didn't happen on my machine. Can you send me the log?

@Jiaming-Liu
Contributor

Jiaming-Liu commented May 14, 2017

I guess this is not super helpful. The "[mm:ss<00:00, ]" part indicates the time spent on each epoch. The problem was solved by switching to Adam.
(attached screenshot: screenshot_20170511-105638)


@unnir

unnir commented Nov 8, 2017

Any updates?

@peterjc123 peterjc123 closed this Mar 16, 2018
@peterjc123 peterjc123 deleted the master branch March 28, 2018 12:13
zou3519 pushed a commit to zou3519/pytorch that referenced this pull request Mar 30, 2018
Summary:
My commit bab5bc broke things with fp16 compute, as I had tested it only with the null input, which actually produced fp32 data (even though the dtype was given as float16). Also, I had confused the concepts of "float16 compute" and "fp16 data". Issue pytorch#1408.

This fixes those issues, tested on both Volta and M40 GPUs. Basically, I restored much of the previous code and fixed the null input to do FloatToHalf.
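The distinction the summary draws between fp16 *data* and fp16 *compute* can be sketched with NumPy (an illustration only; the actual fix lives in the GPU backend):

```python
import numpy as np

# fp16 data: values stored as float16.
x = np.ones(4096, dtype=np.float16) * 1e-4

# fp16 compute: accumulating in half precision as well. Once the
# running sum grows, each small addend falls below half an ulp and
# is rounded away, so the sum stalls far below the true total.
fp16_sum = np.float16(0)
for v in x:
    fp16_sum = np.float16(fp16_sum + v)

# fp16 data with fp32 compute: upcast before accumulating.
fp32_sum = x.astype(np.float32).sum()

print(fp16_sum, fp32_sum)
```

The true total is about 0.41; the all-fp16 accumulation stalls well below it, while the fp32 accumulation recovers it. This is why "data is fp16" and "compute is fp16" must be kept separate when testing.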

Reviewed By: pietern

Differential Revision: D6211849

fbshipit-source-id: 5b41cffdd605f61a438a4c34c56972ede9eee28e
