
Conversation

@arktrail
Contributor

@arktrail arktrail commented Jun 15, 2020

This PR tries to implement nn::ParameterDict for the C++ front-end, from #25883.

@yf225 please take a look and let me know if the PR has any problems. I used #36904 and #28652 as references. Any suggestions are welcome.

@dr-ci

dr-ci bot commented Jun 15, 2020

💊 CI failures summary and remediations

As of commit 2ee86b2 (more details on the Dr. CI page):


  • 2/2 failures possibly introduced in this PR
    • 1/2 non-CircleCI failure(s)

1 failure not recognized by patterns:

| Job | Step | Action |
| --- | --- | --- |
| CircleCI binary_linux_libtorch_3_7m_cpu_gcc5_4_cxx11-abi_shared-with-deps_build | Spin Up Environment | 🔁 rerun |

Extra GitHub checks: 1 failed



@arktrail
Contributor Author

When implementing the ParameterDict, I referred to the ModuleDict from #28652, in which ModuleDict derives from Cloneable. It seems that Cloneable is just for Module, so I guess that for the Parameter part (both ParameterDict and ParameterList), I don't need to derive from Cloneable?

This is my first time contributing to PyTorch, so please correct me if my understanding is totally wrong. Thank you all so much!

@glaringlee glaringlee self-requested a review June 17, 2020 18:52
@glaringlee
Contributor

@yyn19951228
Thanks for working on this. I think there are some misunderstandings here. Let me explain.
Overall, we treat the ParameterDict/List like a Module.

  1. ParameterDict should inherit from the Module class and be clone-able (what you've done is right). The reason is that we want the ParameterDict to be visible to all the Module functions, and clone-able, so a ParameterDict can be used as a normal dict, but it can also be used as a Module if necessary. We don't have any use case for that yet, but we should keep it consistent with the Python impl.

  2. You don't need to create a new OrderedDict in your ParameterDict class; instead, you should reuse the parameter OrderedDict within the Module class, so that all the other Module functions can see the ParameterDict's entries, and clone() works in that case.

Take a look at the Python implementation here:
https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/container.py#L485
https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/module.py (as reference)
We have the Module C++ impl here, and it supports most of the functionality in module.py:
https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/include/torch/nn/module.h
https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/src/nn/module.cpp

What you need to do is follow the ParameterDict Python impl and translate it into C++. I think most of the functions will be exactly the same as in the Python implementation. For the update() API, you might need to overload it for different argument types (iterator, dict, etc.). Let me know if you need any help. A rough sketch of the overall shape is below.
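For illustration only, a sketch of that shape, assuming libtorch's `register_parameter()` and `named_parameters()`; the `My*` names are placeholders, not the final API:

```cpp
#include <torch/torch.h>

// Sketch: the dict derives from Cloneable so it is a Module and clone() works.
class MyParameterDictImpl : public torch::nn::Cloneable<MyParameterDictImpl> {
 public:
  void reset() override {}

  // Route insertion through Module::register_parameter() so entries land in
  // the Module-owned parameter dict, visible to all Module functions.
  // Preserve the requires_grad flag of whatever is passed in.
  torch::Tensor& insert(std::string key, torch::Tensor param) {
    bool requires_grad = param.requires_grad();
    return register_parameter(std::move(key), std::move(param), requires_grad);
  }

  // update() would be overloaded for the different source types (another
  // ParameterDict, an OrderedDict, etc.); a real impl would replace existing
  // keys rather than rely on insert() rejecting duplicates.
  void update(const torch::OrderedDict<std::string, torch::Tensor>& other) {
    for (const auto& item : other) {
      insert(item.key(), item.value());
    }
  }

  bool contains(const std::string& key) const {
    return named_parameters(/*recurse=*/false).contains(key);
  }

  size_t size() const {
    return named_parameters(/*recurse=*/false).size();
  }
};

TORCH_MODULE(MyParameterDict);
```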

Contributor

@glaringlee glaringlee left a comment


See my comment above.

@arktrail
Contributor Author

@glaringlee Many thanks for reviewing my diff! I did have some misunderstandings about the implementation; let me rewrite those parts.

@arktrail
Contributor Author

@glaringlee I have updated my impl for ParameterDict; please check whether it meets the requirements. Also, I am stuck in two places:

  1. For the constructor, I would like to add an overload taking a std::initializer_list, like this:

```cpp
explicit ParameterDictImpl(
    std::initializer_list<std::pair<std::string, Tensor>> initializer_list) {
  parameters_.reserve(initializer_list.size());
  for (const auto& item : initializer_list) {
    // Note: std::move() on members of a const reference degrades to a copy.
    insert(std::move(item.first), std::move(item.second));
  }
}
```

But when I try to initialize it with

```cpp
ParameterDict dict(
    {
      {"a", torch::tensor({1.0})}
    }
  );
```

the compiler reports:

```
no matching constructor for initialization of 'torch::nn::ParameterDict' ccls(2)
pimpl.h(53, 18): candidate inherited constructor not viable: cannot convert initializer list argument to 'std::nullptr_t' (aka 'nullptr_t')
(162, 1): constructor from base class 'ModuleHolder<torch::nn::ParameterDictImpl>' inherited here
pimpl.h(70, 18): candidate inherited constructor not viable: cannot convert initializer list argument to 'std::shared_ptr<ParameterDictImpl>'
.....
no instance of constructor "torch::nn::ParameterDict::ParameterDict" matches the argument list -- argument types are: ({...})
```

May I ask what might be the proper way to implement this?

  2. I use a container as the argument for update(); may I ask whether an iterator-based overload is also needed in that case?

Thank you very much! And sorry if these questions seem naive.

@arktrail arktrail requested a review from glaringlee June 19, 2020 08:36
Contributor

@glaringlee glaringlee left a comment


@yyn19951228 Thank you so much for working on this. I've made some comments; please take a look. I pasted some links in the comments so you can see why the code should work that way. Let me know if you need any help. I had some environment issues today and couldn't build libtorch for a long time; sorry for the late reply.

"(b): Parameter containing: [CPUFloatType of size [1]]\n"
"(c): Parameter containing: [CPUFloatType of size [1]]\n"
")");
} No newline at end of file
Contributor


Please add a newline at the end of the file to avoid a lint error.

```cpp
TORCH_MODULE(ParameterDict);

} // namespace nn
} // namespace torch
```
(No newline at end of file.)
Contributor


Please add a newline at the end of the file to avoid a lint error.

```cpp
    torch::OrderedDict<std::string, torch::Tensor> params) {
  parameters_.reserve(params.size());
  for (const auto& item : params) {
    insert(std::move(item.key()), std::move(item.value()));
```
Contributor


I think here you should insert directly into `parameters_`. Check ordered_dict.h; maybe you can just do `parameters_(params)`. The reason is that this is a constructor, so we should keep whatever is passed in: if the passed-in parameters don't need grad, then we don't have to set requires_grad to true by default.

Contributor Author

@arktrail arktrail Jun 21, 2020


The compiler complains:

```
torch::OrderedDict<std::__1::string, at::Tensor> torch::nn::Module::parameters_
The registered parameters of this `Module`.
In order to access parameters_ in ParameterDict and ParameterList

type 'OrderedDict<std::string, at::Tensor>' (aka 'OrderedDict<basic_string<char, char_traits<char>, allocator<char> >, at::Tensor>') does not provide a call operator
call of an object of a class type without appropriate operator() or conversion functions to pointer-to-function type
```

if I use parameters_(params) here. So can I use parameters_ = params here? I think OrderedDict implements the assignment operator:

```cpp
OrderedDict& operator=(OrderedDict&& other) = default;
```
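For what it's worth, a minimal sketch of the assignment form under discussion, assuming `parameters_` is accessible from the subclass: since `parameters_` is inherited from Module, it cannot appear in the derived constructor's member-initializer list, and `parameters_(params)` inside the body is parsed as a call expression (hence the "call operator" error), so plain assignment sidesteps both:

```cpp
// Sketch only: assign in the constructor body; this uses OrderedDict's
// copy/move assignment and keeps requires_grad of the passed-in tensors.
explicit ParameterDictImpl(torch::OrderedDict<std::string, torch::Tensor> params) {
  parameters_ = std::move(params);
}
```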

```
@@ -0,0 +1,114 @@
#include <gtest/gtest.h>
```
Contributor


Please update the test as needed after you finish modifying the ParameterDict implementation.

@glaringlee
Contributor

Now to answer your previous two questions:

  1. You cannot use an initializer_list at this point. We actually use a ModuleHolder class to hold your implementation, so the initializer_list is passed to ModuleHolder first and then forwarded to the ParameterDict impl class. Unfortunately, we have no template that can correctly forward a single initializer_list at this point; see here. (A sketch of a workaround follows below.)

  2. We don't need to support a container type in the iterator, since the iterator is for the ParameterDict itself; the only underlying type is the OrderedDict (parameters_).
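As a sketch of the workaround for (1), construction can go through a torch::OrderedDict instead of a bare initializer_list, assuming a constructor like the one in this PR that takes an OrderedDict:

```cpp
// Build the OrderedDict first, then hand it to ParameterDict; ModuleHolder
// can forward a named object where it cannot forward an initializer_list.
torch::OrderedDict<std::string, torch::Tensor> params;
params.insert("a", torch::tensor({1.0}));
params.insert("b", torch::tensor({2.0}));
torch::nn::ParameterDict dict(params);
```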

@arktrail arktrail requested a review from glaringlee June 22, 2020 05:25
@glaringlee
Contributor

@yyn19951228 I'm stuck with another code review today; I will review yours tomorrow if I can't get to it today.

@arktrail
Contributor Author

@glaringlee No worries.

@zou3519 zou3519 added the "triaged" label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Jun 23, 2020
Contributor

@glaringlee glaringlee left a comment


@yyn19951228 Sorry about the delay. I added more comments; please take a look. Thanks a lot.

@arktrail arktrail requested a review from glaringlee June 25, 2020 05:41
Contributor

@glaringlee glaringlee left a comment


@yyn19951228 A few nits, then we are good to go. Thanks a lot for your help!

@arktrail arktrail requested a review from glaringlee June 26, 2020 05:44
@arktrail
Contributor Author

Hi @glaringlee, I have addressed the comments! Please check whether anything is left; thank you very much!

jeffdaily and others added 6 commits June 26, 2020 09:45
Summary:
Test was added a few months back in #36503 but recently became flaky for ROCm.

CC ezyang xw285cornell sunway513
Pull Request resolved: #40577

Differential Revision: D22258196

Pulled By: ezyang

fbshipit-source-id: 8a22b0c17b536b3d42d0382f7737df0f8823ba08
Summary: Pull Request resolved: #40534

Differential Revision: D22258874

Pulled By: seemethere

fbshipit-source-id: 1954a22ed52e1a65caf89725ab1db9f40ff917b8
Summary:
Switch Windows CPU testers from `windows.xlarge` to `windows.medium` class.
Remove the VS 14.16 CUDA build.
Only do smoke force-on-CPU tests using the VS2019+CUDA10.1 config.
Pull Request resolved: #40592

Differential Revision: D22259351

Pulled By: malfet

fbshipit-source-id: f934ff774dfc7d47f12c3da836ca314c12d92208
Summary:
Pull Request resolved: #40584

Also patch [this github issue](#33124)
involving an illegal assembly instruction in 8x8-dq-aarch64-neon.S.

Test Plan:
Build binaries, copy to shaker, run executables. Also run all
existing caffe tests.

Reviewed By: kimishpatel

Differential Revision: D22240670

fbshipit-source-id: 51960266ce58699fe6830bcf75632b92a122f638
Summary: Pull Request resolved: #40381

Differential Revision: D22173938

Pulled By: Krovatkin

fbshipit-source-id: 305fc4484977e828cc4cee6e053a1e1ab9f0d6c7
Summary:
Pull Request resolved: #40614

This update pulls in a one-liner fix, which sets the TCP_NODELAY option on the TCP sockets of the UV transport. This leads to exceptional performance gains in terms of latency, with about a 25x improvement in one simple benchmark. It thus resolves a regression that TensorPipe had compared to the ProcessGroup agent and, in fact, ends up beating it by 2x.
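(For context: TCP_NODELAY disables Nagle's algorithm, so small messages are sent immediately instead of being coalesced. A generic POSIX sketch of such a one-liner, not the actual TensorPipe code:)

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Disable Nagle's algorithm on a connected TCP socket so small RPC
// messages are flushed immediately rather than batched.
void setTcpNoDelay(int fd) {
  int one = 1;
  setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```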

The benchmark I ran is this, with the two endpoints pinned to different cores of the same machine:
```
@torch.jit.script
def remote_fn(t: int):
    return t

@torch.jit.script
def local_fn():
    for _ in range(1_000_000):
        fut = rpc.rpc_async("rhs", remote_fn, (42,))
        fut.wait()
```

And the average round-trip time (one iteration) is:
- TensorPipe with SHM: 97.2 us
- TensorPipe with UV _after the fix_: 205us
- Gloo: 440us
- TensorPipe with UV _before the fix_: 5ms

Test Plan: Ran PyTorch RPC test suite

Differential Revision: D22255393

fbshipit-source-id: 3f6825d03317d10313704c05a9280b3043920507
@glaringlee
Contributor

@yyn19951228
Hi, I think there are some problems with your rebase.
This PR should only have your changes, but it now has 28 files changed :).
If you have trouble rebasing this correctly, you can open a new PR against the latest PyTorch and put your code in.
I will close this one and approve your new PR.

Meghan Lele and others added 14 commits June 26, 2020 12:35
Summary:
Pull Request resolved: #40611

This commit removes a dead store in `transformWith` of exit_transforms.cpp.

Test Plan: Continuous integration.

Reviewed By: suo

Differential Revision: D22254136

fbshipit-source-id: f68c4625f7be8ae29b3500303211b2299ce5d6f6
Summary:
Pull Request resolved: #40595

ghstack-source-id: 106691774

Test Plan: waitforsandcastle

Differential Revision: D22247729

fbshipit-source-id: 14745588cae267c1e0cc51cd9541a9b8abb830e5
Summary: Pull Request resolved: #40632

Differential Revision: D22262316

Pulled By: malfet

fbshipit-source-id: 3d525767bfbfc8e2497541849d85cabf0379a43b
Summary:
Pull Request resolved: #36764

This allows bundling inputs that are large uniform buffers in
channels-last memory format.

Test Plan: Unit test.

Differential Revision: D21142660

Pulled By: dreiss

fbshipit-source-id: 31bbea6586d07c1fd0bcad4cb36ed2b8bb88a7e4
Summary:
Pull Request resolved: #37055

Sometimes it's okay to bundle a large example input tensor with a model.
Add a utility function to make it easy for users to do that *on purpose*.

Test Plan: Unit test.

Differential Revision: D22264239

Pulled By: dreiss

fbshipit-source-id: 05c6422be1aa926cca850f994ff1ae83c0399119
Summary:
This replicates the pattern of other "do for luck" commands.
Prep change to add ROCm to CircleCI.
Pull Request resolved: #40631

Differential Revision: D22261707

Pulled By: malfet

fbshipit-source-id: 3dadfa434deab866a8800715f3197e84169cf43e
Summary:
Reland of #35594
Pull Request resolved: #40551

Reviewed By: ezyang

Differential Revision: D22249831

Pulled By: ngimel

fbshipit-source-id: b221b3c0a490ccaaabba50aa698a2490536e0917
…ion API (#39459)

Summary:
Pull Request resolved: #39459

Update to this PR: this code isn't going to fully solve #37010; the changes required for #37010 are more than this PR initially planned. Instead, this PR switches the op registration of RNG-related tests to use the new API (similar to what was done in #36925).

Test Plan:
1) unit tests

Imported from OSS

Reviewed By: ezyang

Differential Revision: D22264889

fbshipit-source-id: 82488ac6e3b762a756818434e22c2a0f9cb9dd47
Summary:
Pull Request resolved: #40466

Extend the row-wise sparse Adagrad fusion op to FP16 (rounding-to-nearest) for PyTorch.

Reviewed By: jianyuh

Differential Revision: D22003571

fbshipit-source-id: e97e01745679a9f6e7b0f81ce5a6ebf4d4a1df41
Summary:
## TLDR
Support using a NaN default value for missing dense features in RawInputProcessor for *DPER2*, in preparation for subsequent support for null-flag features in *compute meta*. For train_eval, this is already supported in DPER3, and we do not plan to support it in DPER2 train_eval.
## Overview
Intern project plan: support adding dense flags for missing feature values instead of replacing them with zero.

Project plan :
https://docs.google.com/document/d/1OsPUTjpJycwxWLCue3Tnb1mx0uDC_2KKWvC1Rwpo2NI/edit?usp=sharing

## Code paths:
See https://fb.quip.com/eFXUA0tbDmNw for the call stack for all affected code paths.

Test Plan:
# A. DPER3 blob value inspection
## 1. Build local bento kernel in fbcode folder
`buck build mode/dev-nosan //bento/kernels:bento_kernel_ads_ranking`

## 2. Use kernel `ads_ranking (local)` to print dense feature blob values
n280239

## 2.1 Try `default_dense_value = "0.0"` (default)
```
preproc_6/feature_preproc_6/dper_feature_processor_7/raw_input_proc_7/float_feature_sparse_to_dense_7/float_features [[0.       ]
 [0.       ]
 [0.       ]
 [0.       ]
 [0.       ]
 [0.       ]
 [0.       ]
 [1.       ]
 [1.7857143]
 [1.7777778]
 [1.       ]
 [0.       ]
 [0.5625   ]
 [0.       ]
 [0.       ]
 [0.8      ]
 [0.       ]
 [1.       ]
 [0.56     ]
 [0.       ]]
```
## 2.2 Try `default_dense_value = "123"`
```
preproc_2/feature_preproc_2/dper_feature_processor_3/raw_input_proc_3/float_feature_sparse_to_dense_3/float_features [[123.       ]
 [123.       ]
 [123.       ]
 [123.       ]
 [123.       ]
 [123.       ]
 [123.       ]
 [  1.       ]
 [  1.7857143]
 [  1.7777778]
 [  1.       ]
 [123.       ]
 [  0.5625   ]
 [123.       ]
 [123.       ]
 [  0.8      ]
 [123.       ]
 [  1.       ]
 [  0.56     ]
 [123.       ]]
```
## 2.3 Try `default_dense_value = float("nan")`
```
RuntimeError: [enforce fail at enforce_finite_op.h:40] std::isfinite(input_data[i]). Index 0 is not finite (e.g., NaN, Inf): -nan (Error from operator:
input: "unary_4/logistic_regression_loss_4/average_loss_4/average_loss" name: "" type: "EnforceFinite" device_option { random_seed: 54 })
```
which is expected given the NaN input.

# B. Unit test
`buck test  fblearner/flow/projects/dper/tests/preprocs:raw_feature_extractor_test`

https://www.internalfb.com/intern/testinfra/testconsole/testrun/5348024586274923/

{F241336814}

Differential Revision: D21961595

fbshipit-source-id: 3dcb153b3c7f42f391584f5e7f52f3d9c76de31f
@arktrail
Contributor Author

Hi @glaringlee, sorry about that! I have created a new PR, #40654; could you please take a look at that one? Thank you very much!

@glaringlee
Contributor

Closing this PR since it is hard to rebase here.
@yyn19951228 opened a new PR, #40654, with the same change.
