Skip to content

3432 remove einops and let vit support torchscript#3433

Closed
yiheng-wang-nv wants to merge 1 commit intoProject-MONAI:devfrom
yiheng-wang-nv:remove-einops-to-support-vit-torchscript
Closed

3432 remove einops and let vit support torchscript#3433
yiheng-wang-nv wants to merge 1 commit intoProject-MONAI:devfrom
yiheng-wang-nv:remove-einops-to-support-vit-torchscript

Conversation

@yiheng-wang-nv
Copy link
Copy Markdown
Contributor

Signed-off-by: Yiheng Wang [email protected]

Fixes #3432 .

Description

This PR makes ViT supports torchscript via:

  1. remove einops
  2. make other necessary changes

Status

Ready

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

@wyli
Copy link
Copy Markdown
Contributor

wyli commented Dec 2, 2021

Hi Yiheng, is it possible to keep the einops implementation? It seems it's pytorch layers support TS and it has a much better readability.

Cc @ahatamiz

@ahatamiz
Copy link
Copy Markdown
Contributor

ahatamiz commented Dec 2, 2021

Hi @wyli @yiheng-wang-nv

Is it possible to close this PR ? : (1) einpos is compatible with TS (2) we certainly should not impose such huge breaking changes.

@ahatamiz
Copy link
Copy Markdown
Contributor

ahatamiz commented Dec 2, 2021

@yiheng-wang-nv
Copy link
Copy Markdown
Contributor Author

@wyli @yiheng-wang-nv

Can this PR not be approved ? I am not sure who comes up with these decisions, but : (1) einpos is compatible with TS (2) we certainly should not impose such huge breaking changes.

I would be more than happy to discuss in MONAI meetings, but it amazes me how we are careless in self-imposing breaking changes and not care for backward compatibility.

Hi @ahatamiz , I'm confused about the comments, why do you say "huge breaking changes"? Does the network structure change? In addition, can you point out which changes impact the backward compatibility? Thanks.

@yiheng-wang-nv
Copy link
Copy Markdown
Contributor Author

I did not change the existing test cases, and none of them failed. Therefore, if this PR brings huge breaking changes, is it proved that the current test cases are incomplete?

The purpose of this PR is to solve the torchscript error in https://github.com/Project-MONAI/MONAI/runs/4383941246?check_suite_focus=true#step:10:8707
If you can fix it without removing einops, I'm very willing to close this PR, thanks! @ahatamiz

@wyli
Copy link
Copy Markdown
Contributor

wyli commented Dec 2, 2021

Would be nice if both einops and torchscript are supported and included here...

@yiheng-wang-nv
Copy link
Copy Markdown
Contributor Author

Hi @ahatamiz , this is a simplified code that can test if torchscript works on Rearrange. Maybe you can have a try first:

import torch
import torch.nn as nn
from einops.layers.torch import Rearrange

class SimpleRearrange(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Sequential(Rearrange('b c h w -> b h w c'))
    
    def forward(self, x):
        result = self.layer(x)
        return result
    
net = SimpleRearrange()
net.eval()
with torch.no_grad():
    torch.jit.script(net)

@wyli
Copy link
Copy Markdown
Contributor

wyli commented Dec 2, 2021

ok, looks like it's difficult if we don't specify the dimensionality in advance

@ahatamiz
Copy link
Copy Markdown
Contributor

ahatamiz commented Dec 2, 2021

@yiheng-wang-nv

I did not change the existing test cases, and none of them failed. Therefore, if this PR brings huge breaking changes, is it proved that the current test cases are incomplete?

The purpose of this PR is to solve the torchscript error in https://github.com/Project-MONAI/MONAI/runs/4383941246?check_suite_focus=true#step:10:8707 If you can fix it without removing einops, I'm very willing to close this PR, thanks! @ahatamiz

Test cases are not incomplete. But if einops is removed, models trained with older versions will not be compatible with this new version. In addition, it needs to be investigated whether your new implementation changes the behavior of patch embedding or not.

I suggest closing this PR.

@ahatamiz
Copy link
Copy Markdown
Contributor

ahatamiz commented Dec 2, 2021

https://github.com/Project-MONAI/research-contributions/blob/master/UNETR/BTCV/main.py#L149

Hi @ahatamiz , this is a simplified code that can test if torchscript works on Rearrange. Maybe you can have a try first:

import torch
import torch.nn as nn
from einops.layers.torch import Rearrange

class SimpleRearrange(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Sequential(Rearrange('b c h w -> b h w c'))
    
    def forward(self, x):
        result = self.layer(x)
        return result
    
net = SimpleRearrange()
net.eval()
with torch.no_grad():
    torch.jit.script(net)

@yiheng-wang-nv did you check here where we use Torchscript for UNETR?
https://github.com/Project-MONAI/research-contributions/blob/master/UNETR/BTCV/main.py#L149

@ahatamiz
Copy link
Copy Markdown
Contributor

ahatamiz commented Dec 2, 2021

@yiheng-wang-nv
Copy link
Copy Markdown
Contributor Author

@yiheng-wang-nv

Maybe you look at this: https://github.com/arogozhnikov/einops/blob/45e795f122840beec70f0e1970b1e37b5809a8e6/tests/test_layers.py#L188

torchscript works on Rearrange.

Hi @ahatamiz , I double checked the einops library, and I'm sure I used the latest release (0.3.2), but it cannot support torchscript. Can you have a try first?

check einops

@yiheng-wang-nv
Copy link
Copy Markdown
Contributor Author

The link you set

@yiheng-wang-nv

I did not change the existing test cases, and none of them failed. Therefore, if this PR brings huge breaking changes, is it proved that the current test cases are incomplete?
The purpose of this PR is to solve the torchscript error in https://github.com/Project-MONAI/MONAI/runs/4383941246?check_suite_focus=true#step:10:8707 If you can fix it without removing einops, I'm very willing to close this PR, thanks! @ahatamiz

Test cases are not incomplete. But if einops is removed, models trained with older versions will not be compatible with this new version. In addition, it needs to be investigated whether your new implementation changes the behavior of patch embedding or not.

I suggest closing this PR.

Hi @ahatamiz , I'm sure the changes will not impact the behavior. Yes, the new state dict is not 100% the same as the old version. The reason is that in patch embedding, a nn.Sequential module is removed, but if you do something like:

new_weights["patch_embedding.patch_embeddings.weight"] = weights["patch_embedding.patch_embeddings.1.weight"]
new_weights["patch_embedding.patch_embeddings.bias"] = weights["patch_embedding.patch_embeddings.1.bias"]

The old trained weights can still be used, and the output of the network is exactly the same. Therefore, I admit that this PR brings breaking changes, but I do not think that it is: such huge breaking changes. and how we are careless in self-imposing breaking changes and not care for backward compatibility.

As I said, the purpose is to solve the torchscript error in https://github.com/Project-MONAI/MONAI/runs/4383941246?check_suite_focus=true#step:10:8707
If you can fix it without removing einops, I'm very willing to close this PR.

I did not know eniops before, and in order to fix the error I saw, I learnt it, and tried to replaced it with a no-error version, and when I saw the unit test passed, I submitted this PR, that's it.

@wyli
Copy link
Copy Markdown
Contributor

wyli commented Dec 6, 2021

Another idea would be having a system-wide configuration flag, to indicate whether we start a torchscriptable instance of the network with reduced functionality/readability.

@ahatamiz
Copy link
Copy Markdown
Contributor

ahatamiz commented Dec 6, 2021

Hi @yiheng-wang-nv and @wyli

Thanks for the comments. Unfortunately, there is no such as a concept as huge or small breaking change ( my bad to mention in that way earlier). A breaking change, no matter what it is, will disrupt the continuity of MONAI and hence shall be avoided unless absolutely necessary. @yiheng-wang-nv I like the fix you proposed for enabling the use of old weights with this version, but I doubt that all users from previous versions can adopt it correctly (some users may not be aware of this post or any announcements we make as a migrations guide).

I am also surprised that you mention einops still does not support Torchscript despite the provided links. I will look into it to see what can cause the issue you are facing.

Overall, I believe we need a wider discussion with the group to avoid these situations. Maybe we need better CI/CD pipelines. At this point, I don't believe we still have agreed upon a solution regarding this.

@Nic-Ma
Copy link
Copy Markdown
Contributor

Nic-Ma commented Dec 7, 2021

Thanks for the great discussions!
And I think it raised an important topic for us:

  1. As TorchScript always can't support the latest network layers or features, should we make sure all the MONAI networks support TorchScript?
  2. If yes, how can we write unit tests to make sure all the cases of a network args can support TorchScript?
  3. If not, what should we do for TorchScript format when developing networks?
  4. Should we also consider other export format? Like ONNX, etc?
    @ericspod @BenTenmann @wyli @rijobro Can we add this topic to the network API design discussion? And looking forward to your great ideas.

Thanks in advance.

@ericspod
Copy link
Copy Markdown
Member

ericspod commented Dec 7, 2021

Thanks for the great discussions! And I think it raised an important topic for us:

1. As TorchScript always can't support the latest network layers or features, should we make sure all the MONAI networks support TorchScript?

2. If yes, how can we write unit tests to make sure all the cases of a network args can support TorchScript?

3. If not, what should we do for TorchScript format when developing networks?

4. Should we also consider other export format? Like ONNX, etc?
   @ericspod @BenTenmann @wyli @rijobro Can we add this topic to the network API design discussion? And looking forward to your great ideas.

Thanks in advance.

  1. We should be trying our hardest to support Torchscript in our definitions and making the coding compromises where necessary, we've already done that in a number of places for existing networks. A few of our networks don't and it's just a necessity right now of their implementations so we should clearly document this.
  2. We have testing routines such as test_script_save to help us test for compatibility in our unit tests such as here.
  3. If Torchscript can't be supported I would document this clearly in the docstring for the network, we probably won't be able to get 100% of networks to be compatible so this is a necessity.
  4. I had a play with ONNX a while back with our networks, we should definitely try supporting it as an important feature, there are limitations and not all Torchscript objects are convertible but it should work with trace or script semantics.

@Nic-Ma
Copy link
Copy Markdown
Contributor

Nic-Ma commented Dec 7, 2021

Thanks @ericspod for the discussion.
So I think we should continue to complete this task to enable the TorchScript support for ViT network.
@ahatamiz are you working on it now? Let's work together to enable it with TorchScript unit tests as @yiheng-wang-nv developed in this PR.

Thanks in advance.

@ahatamiz
Copy link
Copy Markdown
Contributor

ahatamiz commented Dec 10, 2021

Hi @Nic-Ma

Thanks for the efforts. I hope you and @wyli can review and comment on this PR. If you believe something needs to be changed in any ways, please feel free to address it based on guidelines and policies. As such, I don't have any further suggestions or recommendations for this PR.

Thanks

P.S: The einops is still supported by TorchScript and this can be acheived, however I leave the final decision to the discretion of other reviewers of this PR.

@yiheng-wang-nv
Copy link
Copy Markdown
Contributor Author

New comments:
@ahatamiz and I figured out the issue, where the latest einops version is v0.3.2, which is released in Sep 1, 2021. However, their torchscript support comes a bit later (in Sep 11, 2021, see: arogozhnikov/einops@7f88ed5). Therefore, if using the released version, the issue will happen.
Now, let me close this PR, and I think we can keep the current implementation and add torchscript support after einops release its next version.

@wyli
Copy link
Copy Markdown
Contributor

wyli commented Feb 8, 2022

New comments:
@ahatamiz and I figured out the issue, where the latest einops version is v0.3.2, which is released in Sep 1, 2021. However, their torchscript support comes a bit later (in Sep 11, 2021, see: arogozhnikov/einops@7f88ed5). Therefore, if using the released version, the issue will happen.
Now, let me close this PR, and I think we can keep the current implementation and add torchscript support after einops release its next version.

Looks like the new version is released, shall we revisit this ticket? https://github.com/arogozhnikov/einops/releases/tag/v0.4.0

@Nic-Ma
Copy link
Copy Markdown
Contributor

Nic-Ma commented Feb 8, 2022

Hi @ahatamiz ,

Could you please help take a look?

Thanks in advance.

@wyli
Copy link
Copy Markdown
Contributor

wyli commented Feb 8, 2022

I think Ali is currently busy, @yiheng-wang-nv could you please help try to reenable this test https://github.com/yiheng-wang-nv/MONAI/blob/19e39562bdb70b68bd251bc1a524050586789c5a/tests/test_vit.py#L138

@yiheng-wang-nv
Copy link
Copy Markdown
Contributor Author

Hi @wyli , sure, I'll update this PR today

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ViT does not support torchscript

5 participants