@raghuramank100 raghuramank100 commented Nov 14, 2019

Stack from ghstack:

Speeds up histogram observers by about 10x by rewriting the histogram combination routine.
Previously, histograms were combined using explicit bilinear interpolation.
Now the combination is done as a sample rate conversion: the histogram is
resampled by upsampling (zero-order hold), followed by box filtering and
downsampling.
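
A minimal pure-Python sketch of the resampling idea described above; it is an illustration, not the code from this PR. The function name `resample_histogram` and the parameters `start_idx` and `nbins` are assumptions made for this example. It assumes a shared fine grid on which each source bin spans `upsample_rate` cells and each destination bin spans `downsample_rate` cells.

```python
def resample_histogram(hist, upsample_rate, downsample_rate, start_idx, nbins):
    """Resample `hist` onto `nbins` wider bins (illustrative sketch only)."""
    # Zero-order hold: repeat each count, giving a piecewise-constant density.
    upsampled = [count for count in hist for _ in range(upsample_rate)]

    # Fine grid covering the whole destination range, initialised to zero.
    fine_len = nbins * downsample_rate
    if start_idx < 0 or start_idx + len(upsampled) > fine_len:
        raise ValueError("source histogram does not fit in the destination range")
    fine = [0.0] * fine_len
    fine[start_idx:start_idx + len(upsampled)] = upsampled

    # Box filter + downsample: sum each block of `downsample_rate` cells.
    # Dividing by `upsample_rate` undoes the repetition so counts are preserved.
    return [
        sum(fine[i * downsample_rate:(i + 1) * downsample_rate]) / upsample_rate
        for i in range(nbins)
    ]


# Two source bins with counts 4 and 8, resampled onto two wider bins.
print(resample_histogram([4, 8], upsample_rate=4, downsample_rate=6,
                         start_idx=0, nbins=2))
# [8.0, 4.0]
```

In this example the second source bin straddles the boundary between the two destination bins, so its count of 8 is split evenly between them, and the total count of 12 is preserved.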

Test Plan:
import torch
import time
import numpy as np
from torch.quantization.observer import HistogramObserver

X = torch.randn(1,1,224,224)

obs = HistogramObserver(2048)
acc_time = 0
for i in range(100):
    X = torch.randn(10,1,320,320)
    start = time.time()
    obs(X)
    # obs.forward_new(X)
    acc_time = acc_time + time.time()-start
print(acc_time)

Before change:
6.9 (seconds, total over the 100 iterations)

After change:
0.6 (seconds, total over the 100 iterations)

ResNet-18 accuracy is unchanged with the faster histogram observer.
Acc1 = 69.4
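
For reference, the two timings above correspond to a ratio of roughly 6.9 / 0.6 ≈ 11.5. The sketch below shows one way the measurement could be wrapped in a reusable helper; `time_calls` and `speedup` are hypothetical names introduced here, and `time.perf_counter` is used in place of `time.time` as an assumption about what gives steadier interval measurements.

```python
import time


def time_calls(fn, make_input, n_iters=100):
    """Return the total seconds spent inside `fn` over `n_iters` calls."""
    total = 0.0
    for _ in range(n_iters):
        x = make_input()  # input creation is excluded from the timing
        start = time.perf_counter()
        fn(x)
        total += time.perf_counter() - start
    return total


def speedup(before_s, after_s):
    """Ratio of the old total time to the new total time."""
    return before_s / after_s


# Using the totals reported in the test plan above.
print(round(speedup(6.9, 0.6), 1))  # 11.5
```

With PyTorch installed, `fn` could be the observer instance and `make_input` a function returning `torch.randn(10, 1, 320, 320)`, mirroring the loop in the test plan.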

Differential Revision: D18508562

raghuramank100 pushed a commit that referenced this pull request Nov 14, 2019
ghstack-source-id: c6c795d
Pull Request resolved: #29790
raghuramank100 pushed a commit that referenced this pull request Nov 14, 2019
ghstack-source-id: 87a1583
Pull Request resolved: #29790
raghuramank100 pushed a commit that referenced this pull request Nov 14, 2019
ghstack-source-id: 09481b9
Pull Request resolved: #29790
raghuramank100 pushed a commit that referenced this pull request Nov 15, 2019
ghstack-source-id: 7b2778e
Pull Request resolved: #29790

@lly-zero-one lly-zero-one left a comment

NVM, it seems you already added it.

@hx89 hx89 left a comment

This is a great idea and a huge speedup!

dtype=torch.double)[downsample_rate - 1 :: downsample_rate]
# Finally perform interpolation
shifted_integral_histogram = torch.zeros((Nbins))
shifted_integral_histogram[0] = 0

nit: This assignment seems unnecessary, since the tensor is already initialized to zero on line 739?
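
To illustrate the pattern in the snippet under review, here is a small pure-Python sketch of box filtering plus downsampling via a cumulative sum. It is not the PR's code: `box_filter_downsample` is a hypothetical name, and `itertools.accumulate` stands in for the tensor cumulative sum. It also shows why the first entry of the shifted integral needs no separate assignment when the buffer starts out as zeros.

```python
from itertools import accumulate


def box_filter_downsample(fine, downsample_rate):
    """Sum consecutive blocks of `downsample_rate` cells using a running sum."""
    # Integral (cumulative) histogram, sampled at the end of each block.
    integral = list(accumulate(fine))[downsample_rate - 1::downsample_rate]
    # Same values shifted right by one block; the first entry is zero
    # because nothing precedes the first block.
    shifted = [0] + integral[:-1]
    # The difference of the two gives the sum over each block.
    return [hi - lo for hi, lo in zip(integral, shifted)]


fine = [4, 4, 4, 4, 8, 8, 8, 8, 0, 0, 0, 0]
print(box_filter_downsample(fine, downsample_rate=6))  # [32, 16]
```

Each output entry equals the sum of one block of six fine-grid cells, computed with a single pass over the data rather than a separate sum per block.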

self.min_val = None
self.max_val = None
self.dst_nbins = 2 ** torch.iinfo(self.dtype).bits
self.upsample_rate = 128

Could you add a comment explaining why 128 was chosen here? Also, upsample_rate could be made an input argument.
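
A rough sketch of what the suggestion above might look like, with the rate exposed as a constructor argument. Everything here is hypothetical: the class `ObserverConfigSketch`, the `dtype_bits` parameter (standing in for `torch.iinfo(self.dtype).bits`), and the `fine_grid_size` helper are illustrative names, not the observer's actual interface.

```python
class ObserverConfigSketch:
    """Hypothetical holder for the observer settings discussed in this review."""

    def __init__(self, bins=2048, upsample_rate=128, dtype_bits=8):
        if bins < 1 or upsample_rate < 1:
            raise ValueError("bins and upsample_rate must be positive")
        self.bins = bins
        # Exposed as an argument instead of being hard-coded.
        self.upsample_rate = upsample_rate
        # Number of quantized levels, e.g. 256 for an 8-bit dtype.
        self.dst_nbins = 2 ** dtype_bits

    def fine_grid_size(self):
        """Entries produced when every bin is repeated `upsample_rate` times."""
        return self.bins * self.upsample_rate


cfg = ObserverConfigSketch()
print(cfg.dst_nbins, cfg.fine_grid_size())  # 256 262144
```

Under these assumptions, a larger `upsample_rate` gives a finer resampling grid at the cost of a proportionally larger intermediate array, which is the kind of trade-off a code comment next to the constant could spell out.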

raghuramank100 pushed a commit that referenced this pull request Nov 19, 2019
Summary:
 Address review comments and typos

ghstack-source-id: 0785394
Pull Request resolved: #29790

@hx89 hx89 left a comment

LGTM!

raghuramank100 pushed a commit that referenced this pull request Nov 19, 2019
Summary:
 Address review comments and typos

ghstack-source-id: 34dec38
Pull Request resolved: #29790
@facebook-github-bot

This pull request has been merged in 67b77af.

@facebook-github-bot facebook-github-bot deleted the gh/raghuraman_k/8/head branch November 24, 2019 15:16
xxtEchjovs44 pushed a commit to xxtEchjovs44/pytorch that referenced this pull request Jan 29, 2020
ghstack-source-id: 1a6c6f9
Pull Request resolved: pytorch/pytorch#29790