Regression on split operator benchmark after __torch_function__ merge

(moved over from https://github.com/pytorch/pytorch/pull/30730#issuecomment-562242874, comment by @hl475)

Hi @ngoldbaum, we found a performance regression introduced by your PR. Our operator micro-benchmark for `split` reports:
```
================================================================================
Before the change, Program Output:
================================================================================
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cpu
# Input: M: 8, N: 8, parts: 2, device: cpu
Forward Execution Time (us) : 6.546

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cpu
# Input: M: 256, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 6.443

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cpu
# Input: M: 512, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 6.437

================================================================================
After the change, Program Output:
================================================================================
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cpu
# Input: M: 8, N: 8, parts: 2, device: cpu
Forward Execution Time (us) : 4.252

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cpu
# Input: M: 256, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 4.142

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cpu
# Input: M: 512, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 4.210

================================================================================
```
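Here "before the change" means the current codebase (with this PR merged), and "after the change" means the same codebase with this PR's change removed. In other words, `split` takes roughly 6.4 to 6.5 us with the PR and 4.1 to 4.3 us without it.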
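
As a rough sanity check on these numbers, the sketch below computes the per-call overhead and the relative slowdown from the timings in the output above. The dictionary and helper names are illustrative only and are not part of the benchmark suite.

```python
# Forward execution times (us) copied from the benchmark output above.
# First value: current codebase (with the PR); second: PR change removed.
timings_us = {
    "split_M8_N8_parts2_cpu": (6.546, 4.252),
    "split_M256_N512_parts2_cpu": (6.443, 4.142),
    "split_M512_N512_parts2_cpu": (6.437, 4.210),
}


def slowdown(with_pr, without_pr):
    """Return (absolute overhead in us, relative slowdown in percent)."""
    overhead = with_pr - without_pr
    return overhead, 100.0 * overhead / without_pr


results = {name: slowdown(*pair) for name, pair in timings_us.items()}

for name, (overhead_us, percent) in results.items():
    print(f"{name}: +{overhead_us:.3f} us ({percent:.1f}% slower)")
```

The overhead comes out at about 2.2 to 2.3 us per call (roughly 53% to 56% slower) for all three input sizes, which would be consistent with a fixed per-call cost rather than one that scales with the tensor size.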

To reproduce the benchmark, please read https://github.com/pytorch/pytorch/tree/master/benchmarks/operator_benchmark and run `python -m pt.split_test`.

cc @ezyang @gchanan @zou3519 @VitalyFedyunin @ngimel @mruberry @mingzhe09088 

