
Conversation

@nv-kkudrynski
Contributor

No description provided.

```
-# FIXME: GAN models checkpoints are on cuda.
-if [[ $f = $GANs* ]]; then
+# FIXME: GAN and NVIDIA models checkpoints are on cuda.
+if [[ $f = $GANs* ]] || [[ $f = $CUDAs* ]]; then
```
Contributor

Adding a CUDA checkpoint will require the user to have a CUDA build of PyTorch installed to load it. Would it be possible to convert the checkpoints to CPU in the hubconf entrypoints and relax it here?
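
For reference, the pattern being suggested here is roughly the sketch below, with a stand-in module and a placeholder checkpoint path rather than the actual NVIDIA entrypoint:

```
import torch
import torch.nn as nn

# Stand-in module so the sketch is self-contained; a real entrypoint
# would build the actual NVIDIA model here.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

def tiny_entrypoint(pretrained=True, checkpoint='checkpoint.pt'):
    """Hub-style entrypoint that returns a CPU-loadable nn.Module."""
    model = TinyModel()
    if pretrained:
        # map_location='cpu' remaps tensors saved from a CUDA run, so a
        # CPU-only PyTorch install can still load the checkpoint.
        state_dict = torch.load(checkpoint, map_location='cpu')
        model.load_state_dict(state_dict)
    return model
```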

Contributor

Considering this is NVIDIA, I think it's okay.

Contributor Author

Thanks. I had to revive the above condition in run_pytorch.sh to pass your CI.


@ailzhang
Contributor

ailzhang commented Jun 5, 2019

@nv-kkudrynski Also, I'm not sure why this didn't trigger the CircleCI check. See #23 as an example; maybe try rebasing?

@gottbrath

@soumith -- are we going to be able to accept this whole pull request? I believe from a side conversation that you had some concerns with the NCF. Have those been addressed?

If we can't accept the NCF part, would it make sense to split this so that we could get the WaveGlow and Tacotron 2 models up first, then address the NCF issues?

@soumith
Contributor

soumith commented Jun 6, 2019

yes, there are a few changes needed.

  1. can we remove NCF from this PR for now -- I don't think the model as it stands is useful for Hub (we are curating), because it isn't really useful for building research on top of it -- or useful to visualize anything about the output
  2. Can you add code that visualizes the outputs of WaveGlow / Tacotron 2. They produce spectrograms, that's great -- it'd be useful to see that. You can use the GAN model that I cleaned up as an example: https://pytorch.org/hub/facebookresearch_pytorch-gan-zoo_dcgan/ . This way, people pulling your model know what to do with it further. (A sketch of such a spectrogram plot follows below.)
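
For context, the kind of visualization being requested is roughly the following; the spectrogram here is random placeholder data, not actual model output:

```
import numpy as np
import matplotlib.pyplot as plt

# Placeholder mel spectrogram (80 mel channels x 140 frames) standing in
# for a real Tacotron 2 output, plotted the way a real one would be.
mel = np.random.randn(80, 140)

plt.figure(figsize=(10, 4))
plt.imshow(mel, aspect='auto', origin='lower')
plt.xlabel('frames')
plt.ylabel('mel channels')
plt.title('Mel spectrogram (placeholder data)')
plt.colorbar()
plt.show()
```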

```
hub_model = torch.hub.load(github='nvidia/DeepLearningExamples', model='nvidia_tacotron2')
hub_model = hub_model.cuda()
hub_model.eval()
inp = torch.randint(low=0, high=148, size=(1,140), dtype=torch.long)
```
Contributor Author

Changed the strategy of this example: it now goes from plain text to sound using the two torch.hub models in series.

```
hub_model = torch.hub.load(github='nvidia/DeepLearningExamples', model='nvidia_waveglow')
```
will load the WaveGlow model pre-trained on [LJ Speech dataset](https://keithito.com/LJ-Speech-Dataset/)

Contributor

Same thoughts as above. Add details on what the model takes as input and gives as output.
For an example, see https://pytorch.org/hub/facebookresearch_pytorch-gan-zoo_dcgan/

Contributor Author

Changed the strategy of this example: it now goes from plain text to sound using the two torch.hub models in series.

```
print('\nWaveglow model test output:')
print(out.size())
```

Contributor

Add a visualization of the test output, or some other way to interpret the output.

Contributor Author

Changed the strategy of this example: it now goes from plain text to sound using the two torch.hub models in series.


@nv-kkudrynski
Contributor Author

  1. can we remove NCF from this PR for now -- I don't think the model as it stands is useful for Hub (we are curating), because it isn't really useful for building research on top of it -- or useful to visualize anything about the output

Indeed, NCF is quite specific; we also had discussions about it. It could, however, be useful if loaded with pretrained=False (I added an example in the readme and provided a reference for details about training). The pre-trained model could then serve as a reference. Let me know what you think.
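
For illustration, loading the untrained reference model might look like the sketch below; the entrypoint name and its acceptance of pretrained=False are assumptions for illustration, not the final API described in the readme:

```
import torch

# Entrypoint name ('nvidia_ncf') and the pretrained=False keyword are
# assumptions here; the readme added in this PR is authoritative.
ncf = torch.hub.load(github='nvidia/DeepLearningExamples', model='nvidia_ncf',
                     pretrained=False)
ncf.train()  # train on your own interaction data; see the NCF repo for the reference recipe
```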

  1. Can you add code that visualizes the outputs of WaveGlow / Tacotron 2. They produce spectrograms, that's great -- it'd be useful to see that. You can use the GAN model that I cleaned up as an example: https://pytorch.org/hub/facebookresearch_pytorch-gan-zoo_dcgan/ . This way, people pulling your model know what to do with it further.

In our flow, formatting the input/output is not part of the model object, so we need a few internal changes to adapt to your request. We are working on this at the moment.

@soumith
Contributor

soumith commented Jun 7, 2019

@nv-kkudrynski I think NCF can serve as a reference with pretrained=False if it's easy to train upon. Is the code snippet that shows taking a CSV dataset or a Pandas DataFrame and training the NCF model with it going to be about ~20 lines or less? If so, I think it can be worth it for the reader who comes to the hub page. If the reference model is only useful when the data is preprocessed and run through a high-performance set of instructions given in the NCF repo, then it's almost better to just point users to the NCF repo directly, without a card.

@soumith
Contributor

soumith commented Jun 7, 2019

Even an "easy fine-tuning" option starting from the pretrained model has potential, but from what I can tell from the dataset, the pre-trained model doesn't have much to gain w.r.t. fine-tuning -- the features aren't generic enough to generalize to a different set of users + attributes to do recommendation on.

@soumith
Contributor

soumith commented Jun 7, 2019

> In our flow, formatting the input/output is not part of the model object, so we need a few internal changes to adapt to your request. We are working on this at the moment.

Great, thanks for the update.
Just FYI, if the input or output is not part of the model object, that is okay. This is why hub entrypoints are callables and not models.
For example, you can have an entrypoint for pre-processing and an entrypoint for post-processing.
In the BERT example, they have an entry bertTokenizer, which is not a model -- it's just a pre-processing tokenizer -- but it's available as an entrypoint.
The example showcases taking an input, tokenizing it, and then passing the output of bertTokenizer into bertModel. Maybe something like this will help with your case as well, just a thought.
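
As a rough illustration of that pattern (entrypoint names and the toy encoding below are made up, not the actual BERT or NVIDIA entrypoints), a hubconf.py could expose a pre-processing callable and a model as separate entrypoints:

```
# hubconf.py -- sketch only; entrypoint names and the toy encoding are hypothetical.
dependencies = ['torch']

import torch
import torch.nn as nn

def text_frontend():
    """Entrypoint returning a pre-processing callable rather than a model."""
    def encode(text):
        # Toy character-level encoding standing in for a real tokenizer.
        return torch.tensor([[ord(c) % 148 for c in text]], dtype=torch.long)
    return encode

def toy_model(pretrained=False):
    """Entrypoint returning an nn.Module that consumes the frontend's output."""
    model = nn.Embedding(num_embeddings=148, embedding_dim=32)
    # A real entrypoint would load a checkpoint when pretrained=True (omitted here).
    return model
```

A caller would then torch.hub.load each entrypoint separately and feed the frontend's output to the model.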

@nv-kkudrynski
Contributor Author

> Just FYI, if the input or output is not part of the model object, that is okay. This is why hub entrypoints are callables and not models.

I think there was a discussion about it which concluded with exactly the opposite :)
This is reflected in your guidelines at https://pytorch.org/docs/stable/hub.html:

> Entrypoint function should ALWAYS return a model(nn.module).

But don't worry, update is coming ;)

@soumith
Contributor

soumith commented Jun 7, 2019

@ailzhang that line needs to be removed I think?

@nv-kkudrynski
Contributor Author

I removed NCF from this PR so that tacotron2/waveglow can hopefully go in before the weekend.
Now the two models play very nicely together in the examples, producing sound from input text.
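
The resulting flow is roughly the sketch below; the .infer() calls and the character-id pre-processing are assumptions here, and the published hub cards show the exact API:

```
import torch

# Load both models from torch.hub (entrypoint names as used in this PR).
tacotron2 = torch.hub.load(github='nvidia/DeepLearningExamples', model='nvidia_tacotron2')
waveglow = torch.hub.load(github='nvidia/DeepLearningExamples', model='nvidia_waveglow')
tacotron2 = tacotron2.cuda().eval()
waveglow = waveglow.cuda().eval()

# Placeholder text encoding (integer ids), standing in for the card's real
# pre-processing utilities.
text = "Hello world, I missed you"
sequence = torch.randint(low=0, high=148, size=(1, len(text)), dtype=torch.long).cuda()

with torch.no_grad():
    mel = tacotron2.infer(sequence)   # assumed API: text ids -> mel spectrogram
    audio = waveglow.infer(mel)       # assumed API: mel spectrogram -> waveform
```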

@gottbrath

@soumith - have the changes here addressed all your concerns?

@soumith soumith merged commit aafb3e3 into pytorch:master Jun 10, 2019
@soumith
Contributor

soumith commented Jun 10, 2019

Thanks so much @nv-kkudrynski, the updated stuff looks good. I might make some additional changes so that it plays well with opening the notebooks in Google Colab.

Additionally, if you are interested in a follow-up (totally up to you and if it makes sense), add entrypoints for any intermediate features that might make sense, so folks can use these models as a feature extractor / fine-tuner.

@nv-kkudrynski
Contributor Author

Thanks to everyone for all the feedback and smooth cooperation!

facebook-github-bot pushed a commit to pytorch/pytorch that referenced this pull request Jun 10, 2019
Summary:
update doc as pointed out in pytorch/hub#22
Pull Request resolved: #21568

Differential Revision: D15732927

Pulled By: ailzhang

fbshipit-source-id: 78ab026539e5ee59e7c3a8144e2c9fcbbc225733
@nv-kkudrynski nv-kkudrynski deleted the publishing_nvidia_models branch June 10, 2019 20:22
@soumith
Contributor

soumith commented Jun 11, 2019

FYI, I have updated the hub card to play well with Google Colab -- now you can generate the speech and listen to it immediately, all in the Colab notebook.
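
For reference, playing generated audio inline in a Colab/Jupyter notebook generally looks like the sketch below; the waveform here is placeholder silence, and 22050 Hz matches the LJ Speech sampling rate:

```
import numpy as np
from IPython.display import Audio

# Placeholder waveform; in the hub card this would be the WaveGlow output
# moved to the CPU and converted to a NumPy array.
sampling_rate = 22050
waveform = np.zeros(sampling_rate, dtype=np.float32)  # one second of silence

Audio(waveform, rate=sampling_rate)  # renders an inline audio player in the notebook
```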

@nv-kkudrynski
Contributor Author

Thanks a lot for that!
