
Quantized model cannot run inference with CUDA? #27729


Description

@dongfangduoshou123

I see that PyTorch 1.3 is now available and quantization has been added, great! But the release notes say:

"This currently experimental feature includes support for post-training quantization, dynamic quantization, and quantization-aware training. It leverages the FBGEMM and QNNPACK state-of-the-art quantized kernel back ends, for x86 and ARM CPUs, respectively, which are integrated with PyTorch and now share a common API."
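As I understand it, the "common API" part means the two kernel libraries are selected through a single global engine setting rather than separate function calls. A minimal sketch (which engines are listed depends on how PyTorch was built):

```python
import torch

# The quantized kernel backends share one API; the library that actually
# runs is picked by a global engine flag: 'fbgemm' targets x86 servers,
# 'qnnpack' targets ARM/mobile.
print(torch.backends.quantized.supported_engines)  # e.g. ['none', 'fbgemm']
torch.backends.quantized.engine = "fbgemm"  # select the x86 backend
```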

Does this mean a quantized model can only run on the FBGEMM and QNNPACK (CPU) backends, and cannot be deployed with a CUDA backend?
Does it mean that all of PyTorch's quantization ops are currently implemented only for CPUs such as ARM and x86?
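To make the question concrete, here is a minimal dynamic-quantization sketch; the toy model and sizes are my own for illustration, and I am assuming the quantized modules only have CPU kernels:

```python
import torch
import torch.nn as nn

# Minimal sketch: dynamic post-training quantization of a small float model.
float_model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))
quantized_model = torch.quantization.quantize_dynamic(
    float_model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
print(quantized_model(x).shape)  # runs on CPU via the FBGEMM/QNNPACK kernels

# Moving the quantized model to the GPU is where I expect it to fail,
# since no quantized CUDA kernels appear to ship with this release:
# quantized_model.to("cuda")  # raises an error about the quantized backend
```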

Thank you!

cc @jerryzh168 @jianyuh @dzhulgakov

Metadata

Labels

module: cuda, module: docs, oncall: quantization, triaged
