Description
Motivation
We want to de-virtualize TensorImpl methods as much as possible, to achieve a performance gain for common Tensor methods such as `numel()` / `sizes()` / `dim()`.
Concerns
XLA relies on TensorImpl methods such as `numel()` / `sizes()` / `dim()` being virtual, so that it can provide its own implementations and calculate those values lazily when they are requested, rather than eagerly whenever a size-change event happens.
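For context, here is a minimal, self-contained C++ sketch of the pattern this concern refers to. It is not actual PyTorch code; the class and member names (`TensorImplSketch`, `LazyTensorImplSketch`, `materialize_shape`) are hypothetical and only illustrate how virtual accessors let a lazy backend defer shape computation until first access.

```cpp
// Hypothetical sketch, not actual PyTorch code: with virtual accessors,
// a lazy (XLA-style) backend can defer shape computation until requested.
#include <cstdint>
#include <vector>

struct TensorImplSketch {
  virtual ~TensorImplSketch() = default;
  // Today numel()/sizes()/dim() are virtual, so subclasses may override them.
  virtual std::int64_t numel() const { return numel_; }
  virtual std::int64_t dim() const {
    return static_cast<std::int64_t>(sizes_.size());
  }
  virtual const std::vector<std::int64_t>& sizes() const { return sizes_; }

 protected:
  std::vector<std::int64_t> sizes_;
  std::int64_t numel_ = 0;
};

// An XLA-style impl computes its shape only when it is actually requested.
struct LazyTensorImplSketch : TensorImplSketch {
  std::int64_t numel() const override {
    materialize_shape();
    return lazy_numel_;
  }
  std::int64_t dim() const override {
    materialize_shape();
    return static_cast<std::int64_t>(lazy_sizes_.size());
  }
  const std::vector<std::int64_t>& sizes() const override {
    materialize_shape();
    return lazy_sizes_;
  }

 private:
  void materialize_shape() const {
    if (!shape_known_) {
      // Placeholder: an XLA backend would query its lazily built graph here.
      lazy_sizes_ = {2, 3};
      lazy_numel_ = 6;
      shape_known_ = true;
    }
  }
  mutable bool shape_known_ = false;
  mutable std::vector<std::int64_t> lazy_sizes_;
  mutable std::int64_t lazy_numel_ = 0;
};
```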
Proposal
- To support XLA's use case, we should add virtual methods such as `virtual_numel()` / `virtual_sizes()` / `virtual_dim()` to OpaqueTensorImpl, and in TensorImpl we should add conditional branching in `numel()` / `sizes()` / `dim()` to call the opaque virtual methods when the TensorImpl is an OpaqueTensorImpl; see the sketch after this list. (We should run benchmarks such as JIT LSTM to make sure that adding conditional branching and de-virtualizing `numel()` / `sizes()` / `dim()` doesn't cause a perf regression for our CPU / CUDA use cases.)
- We should ask XLA to move their XLATensorImpl to OpaqueTensorImpl, and ask them to override the opaque methods for their lazy size computation.
- We should de-virtualize TensorImpl and all of its subclasses in the PyTorch core library (example PR: Remove virtual keyword from most functions in TensorImpl #20909).
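Below is a minimal, self-contained C++ sketch of the branching scheme proposed in the first bullet. It is not actual PyTorch code: the class and member names (`TensorImplSketch`, `OpaqueTensorImplSketch`, `is_opaque_`, `materialize_shape`) are hypothetical, and it only assumes that some flag or type check can identify an opaque impl.

```cpp
// Hypothetical sketch, not actual PyTorch code: numel()/sizes()/dim() become
// non-virtual and only branch into a virtual_*() hook for opaque impls.
#include <cstdint>
#include <vector>

struct TensorImplSketch {
  virtual ~TensorImplSketch() = default;

  // De-virtualized fast path: plain field reads for dense CPU / CUDA tensors.
  std::int64_t numel() const { return is_opaque_ ? virtual_numel() : numel_; }
  std::int64_t dim() const {
    return is_opaque_ ? virtual_dim()
                      : static_cast<std::int64_t>(sizes_.size());
  }
  const std::vector<std::int64_t>& sizes() const {
    return is_opaque_ ? virtual_sizes() : sizes_;
  }

 protected:
  explicit TensorImplSketch(bool is_opaque = false) : is_opaque_(is_opaque) {}

  // Virtual hooks that only opaque backends need to override.
  virtual std::int64_t virtual_numel() const { return numel_; }
  virtual std::int64_t virtual_dim() const {
    return static_cast<std::int64_t>(sizes_.size());
  }
  virtual const std::vector<std::int64_t>& virtual_sizes() const {
    return sizes_;
  }

  std::vector<std::int64_t> sizes_;
  std::int64_t numel_ = 0;
  bool is_opaque_ = false;
};

// An XLA-style opaque impl overrides only the virtual hooks and can still
// compute its shape lazily on first access.
struct OpaqueTensorImplSketch : TensorImplSketch {
  OpaqueTensorImplSketch() : TensorImplSketch(/*is_opaque=*/true) {}

 protected:
  std::int64_t virtual_numel() const override {
    materialize_shape();
    return lazy_numel_;
  }
  std::int64_t virtual_dim() const override {
    materialize_shape();
    return static_cast<std::int64_t>(lazy_sizes_.size());
  }
  const std::vector<std::int64_t>& virtual_sizes() const override {
    materialize_shape();
    return lazy_sizes_;
  }

 private:
  void materialize_shape() const {
    if (!shape_known_) {
      // Placeholder: an XLA backend would query its lazily built graph here.
      lazy_sizes_ = {2, 3};
      lazy_numel_ = 6;
      shape_known_ = true;
    }
  }
  mutable bool shape_known_ = false;
  mutable std::vector<std::int64_t> lazy_sizes_;
  mutable std::int64_t lazy_numel_ = 0;
};
```

The intent of this layout is that dense CPU / CUDA tensors never pay for a vtable dispatch on these hot accessors; only opaque backends take the extra branch into the virtual hook, which is what the JIT LSTM benchmarking in the first bullet is meant to validate.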