Description
Custom operators and Triton kernels can be marked with "require_stride_order" or "require_exact_strides". This means that during lowering, Inductor will ensure that the inputs to the custom operator have the same strides as observed in the post_grad Inductor graph.
However, Inductor passes are able to muck around with the strides of inputs to custom operators and triton kernels. See #137437 for an example where an mm padding pass changed the strides!
At the very beginning of Inductor, before any passes run, we should record the strides of all inputs to nodes that may need require_stride_order or require_exact_strides, and store that information somewhere (e.g. on the node). Later on, during lowering, we should use those recorded strides instead of whatever strides the inputs have by then.
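
A rough sketch of the idea, in plain Python with stand-in classes rather than real Inductor or FX types. Everything here is hypothetical and only for illustration: `Node`, `TensorInfo`, `STRIDE_SENSITIVE_TARGETS`, the `"recorded_arg_strides"` meta key, and both helper functions are made up, and it assumes each node carries a `meta` dict with a `"val"` entry describing its output.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class TensorInfo:
    # Stand-in for the tensor metadata stored on a graph node (hypothetical).
    shape: Tuple[int, ...]
    stride: Tuple[int, ...]


@dataclass
class Node:
    # Stand-in for a graph node with a free-form meta dict (hypothetical).
    name: str
    target: str
    args: List["Node"] = field(default_factory=list)
    meta: Dict[str, Any] = field(default_factory=dict)


# Hypothetical set of ops whose inputs must keep their original strides.
STRIDE_SENSITIVE_TARGETS = {"mylib::custom_op"}


def record_input_strides(graph: List[Node]) -> None:
    """Run first: snapshot the input strides of every stride-sensitive node."""
    for node in graph:
        if node.target in STRIDE_SENSITIVE_TARGETS:
            node.meta["recorded_arg_strides"] = [
                arg.meta["val"].stride for arg in node.args
            ]


def strides_to_enforce(node: Node) -> List[Tuple[int, ...]]:
    """Run at lowering: prefer the snapshot over the inputs' current strides."""
    recorded = node.meta.get("recorded_arg_strides")
    if recorded is not None:
        return list(recorded)
    return [arg.meta["val"].stride for arg in node.args]


# A contiguous 4x6 input feeding a stride-sensitive custom op.
x = Node("x", "placeholder", meta={"val": TensorInfo((4, 6), (6, 1))})
op = Node("op", "mylib::custom_op", args=[x])
graph = [x, op]

record_input_strides(graph)

# Simulate a later pass (e.g. padding) changing the row stride of x.
x.meta["val"] = TensorInfo((4, 6), (8, 1))

# Lowering still sees the strides recorded before the pass ran.
print(strides_to_enforce(op))
```

The snapshot lives on the consumer node rather than on the input, so a pass that rewrites or replaces the input does not silently change what the custom operator is promised.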
#136732 seems like a WIP version of this
cc @ezyang @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @bdhirsh