Conversation

zou3519 (Contributor) commented Dec 6, 2019

Stack from ghstack:

This PR begins the process of removing BUILD_NAMEDTENSOR macros. There
will be followups.

Reasons for removing the macros:

  • BUILD_NAMEDTENSOR is always on and has been on since PyTorch 1.3.0.
  • Since we don't test building without it, it is useless to keep around.
  • Code becomes nicer to read without the macros (see the sketch after the next list).

Reasons for not removing the macros:

  • potential for feature flagging
  • binary size reduction / PyTorch modularity. I don't think this is a
    major consideration because macros like this are not the best way to
    achieve these goals.
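
To make the readability point concrete, here is a minimal, self-contained sketch of the pattern being deleted. This is not ATen code; `add_kernel` and `propagate_names` here are stand-ins invented for illustration:

```cpp
#include <iostream>

// In PyTorch the flag comes from the build system and has been
// unconditionally defined since 1.3.0; defined here so the sketch compiles.
#define BUILD_NAMEDTENSOR

void add_kernel() { std::cout << "compute add\n"; }
void propagate_names() { std::cout << "propagate names\n"; }

// Before: the named-tensor step at each call site sits behind a guard
// that is never false in any supported build.
void add_with_macro() {
  add_kernel();
#ifdef BUILD_NAMEDTENSOR
  propagate_names();
#endif
}

// After: the guard is deleted and the body reads straight through.
void add_without_macro() {
  add_kernel();
  propagate_names();
}

int main() {
  add_with_macro();     // prints both lines
  add_without_macro();  // prints the same two lines
  return 0;
}
```

Both functions compile to identical behavior; the only difference is the preprocessor noise the reader has to skip over.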

Now, I argue that we don't need a feature flag. The main reason to keep
one is as a fast switch to disable the feature, in case someone discovers
in the future that named tensors cause a regression in some existing workflows.

In #25798, I did a variety of macro- and micro-benchmarks to determine
the performance impact of named tensors on regular tensors.

The microbenchmarks were not very stable, and running them for more
iterations doesn't actually help because the noise is not distributed in
a nice way. Instead of microbenchmarks I ran a profiler (perf) to
estimate how much overhead named tensors add to unnamed code. I
estimated the overhead to be less than 100ns for `add` and even smaller
for `mm`; there are ways to optimize even further if we find this to be
a problem.
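
To see why the per-op cost can stay that small, here is a crude, self-contained sketch; `FakeTensor`, `add`, and `propagate_names` are invented for illustration and are not ATen code. For unnamed inputs, the only extra work per call is one cheap, well-predicted branch:

```cpp
#include <chrono>
#include <iostream>

// Toy stand-in for a tensor: unnamed tensors keep the flag false, so
// every op pays only the single branch below.
struct FakeTensor {
  bool has_names = false;
  long data = 0;
};

void propagate_names(FakeTensor&) { /* reached only for named inputs */ }

void add(FakeTensor& out, const FakeTensor& a, const FakeTensor& b) {
  out.data = a.data + b.data;  // stand-in for the real kernel
  if (a.has_names || b.has_names) {
    propagate_names(out);  // unnamed inputs skip this entirely
  }
}

int main() {
  FakeTensor a{false, 1}, b{false, 2}, out;
  long checksum = 0;
  const int iters = 10000000;
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iters; ++i) {
    add(out, a, b);
    checksum += out.data;  // defeat dead-code elimination
  }
  auto end = std::chrono::steady_clock::now();
  double total_ns = std::chrono::duration<double, std::nano>(end - start).count();
  std::cout << "checksum " << checksum
            << ", avg ns/call " << total_ns / iters << "\n";
  return 0;
}
```

A branch that always resolves the same way is nearly free once the predictor warms up, which is consistent with perf attributing well under 100ns to the names check.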

Initial macrobenchmarks were also not very stable. I ran ImageNet for
some number of epochs; to make the runs more stable, I got rid of the
data loading (which seemed to vary between runs). In some benchmark runs
without data loading, we can see that the results are less noisy. These
results support no noticeable regressions in speed.

Test Plan:

  • wait for CI

Differential Revision: [D18858543](https://our.internmc.facebook.com/intern/diff/D18858543)

zou3519 requested a review from apaszke as a code owner December 6, 2019 19:37
zou3519 added a commit that referenced this pull request Dec 6, 2019
This PR begins the process of removing BUILD_NAMEDTENSOR macros. (The full commit message repeats the PR description above.)
ghstack-source-id: 72c39f4
Pull Request resolved: #30894
facebook-github-bot added the oncall: jit (Add this issue/PR to JIT oncall triage queue) label Dec 6, 2019
zou3519 requested review from gchanan, izdeby and nairbv and removed request for apaszke December 6, 2019 19:38
nairbv (Collaborator) commented Dec 6, 2019

> Reasons for not removing the macros:

One other (possible) reason we could mention might be compiled code size. I assume we don't have a desire to compile without this, maybe for mobile?

nairbv commented Dec 6, 2019

> There will be followups.

I assume this means there are places in the code where the flag will still be present after this PR? What are those, and then what happens if we try to build with it turned off?

zou3519 (Contributor, Author) commented Dec 6, 2019

> One other (possible) reason we could mention might be compiled code size. I assume we don't have a desire to compile without this, maybe for mobile?

I can note that, but we don't have a desire to compile without named tensors. All builds right now (including mobile) have named tensors on.

> There will be followups.
>
> I assume this means there are places in the code where the flag will still be present after this PR? What are those[?]

The remaining cases are deleting BUILD_NAMEDTENSOR from the Python code generation. I don't want to include everything in this PR because that might result in lots of merge conflicts; in particular, I might even split this PR up to avoid them.

> and then what happens if we try to build with it turned off?

After this PR, we won't be able to build with named tensors off -- that will be an unsupported configuration.

zou3519 added a commit that referenced this pull request Dec 9, 2019
This PR begins the process of removing BUILD_NAMEDTENSOR macros. (The full commit message repeats the PR description above.)
ghstack-source-id: 36a86e2
Pull Request resolved: #30894
facebook-github-bot commented: @zou3519 merged this pull request in e05ee4c.


wuhuikx pushed a commit to wuhuikx/pytorch that referenced this pull request Jan 30, 2020
Summary:
Pull Request resolved: pytorch#30894

This PR begins the process of removing BUILD_NAMEDTENSOR macros. (The full commit message repeats the PR description above.)

Test Plan: - wait for CI

Differential Revision: D18858543

Pulled By: zou3519

fbshipit-source-id: 08bf3853a9f506c6b084808dc9ddd1e835f48c13

Labels: Merged, oncall: jit