add index_put api by Courtesy-Xs · Pull Request #52886 · PaddlePaddle/Paddle

Courtesy-Xs · 2023-04-13T08:26:26Z

PR types

New features

PR changes

APIs

Description

This PR adds the index_put and index_put_ APIs to Paddle; please refer to the PaddlePaddle API doc for details.

(Supplementary note: Because of problems in the Paddle framework's indexing mechanism, indexing in Paddle is much slower than in Torch. A full rework of indexing will take time, so some advanced-indexing cases with poor performance are optimized first and exposed to users as the Paddle APIs index_put and index_put_.
Advanced indexing means using tensors as subscripts to index a tensor. For the functionality that the index_put API supports, its performance is far better than direct C-order indexing in Paddle.)
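
To make "using tensors as subscripts" concrete, here is a minimal pure-Python sketch of the semantics on a 2-D nested list. It is not Paddle's implementation: `index_put_2d` is a hypothetical helper, and the overwrite/add meaning of `accumulate` is assumed from the docstring quoted later in this review ("Whether the elements in values are added to x").

```python
import copy


def index_put_2d(x, indices, value, accumulate=False):
    """Out-of-place sketch: return a copy of x with x[rows[k]][cols[k]] updated."""
    rows, cols = indices
    if len(rows) != len(cols):
        raise ValueError("index lists must have the same length")
    # A scalar value is broadcast to every indexed position.
    values = value if isinstance(value, list) else [value] * len(rows)
    out = copy.deepcopy(x)
    for r, c, v in zip(rows, cols, values):
        if accumulate:
            out[r][c] += v  # add into the existing element
        else:
            out[r][c] = v   # overwrite the existing element
    return out


x = [[0, 0, 0], [0, 0, 0]]
print(index_put_2d(x, ([0, 1], [1, 2]), [10, 20]))
print(index_put_2d(x, ([0, 0], [1, 1]), 1, accumulate=True))
```

The second call indexes the same position twice; with `accumulate=True` both contributions are summed in this sketch, which is the case where an accumulate flag matters.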

paddle-bot · 2023-04-13T08:26:30Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

Xreki · 2023-05-05T09:31:01Z

+#include "paddle/phi/kernels/index_put_grad_kernel.h"
+#include <numeric>
+#include "paddle/phi/backends/gpu/gpu_context.h"
+#include "paddle/phi/backends/gpu/gpu_launch_config.h"


The CPU kernel does not need these GPU-related header files.

Xreki · 2023-05-05T09:32:33Z

+    int64_t offset = 0;
+
+    for (size_t i = 0; i < Rank; ++i) {
+      cur_ix = (int64_t(*(indices[i] + idx)));


Use static_cast for data type conversions.
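
The quoted loop appears to build a flat element offset from one index per dimension. The rest of the loop body is not shown in the diff, so the following is only a sketch under assumptions: the stride accumulation and the negative-index wrap are guesses at the intent, and `contiguous_strides` and `flat_offset` are hypothetical names. The explicit `int(...)` stands in for the `static_cast<int64_t>` the reviewer asks for.

```python
def contiguous_strides(shape):
    """Row-major (C-order) strides, in elements, for a contiguous tensor."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def flat_offset(indices, idx, shape, strides):
    """Flat offset of the idx-th indexed element; indices[i] holds the indices for dim i."""
    offset = 0
    for i in range(len(shape)):
        # Explicit conversion, the role static_cast<int64_t> plays in the C++ kernel.
        cur_ix = int(indices[i][idx])
        if cur_ix < 0:
            cur_ix += shape[i]  # assumption: negative indices wrap around
        offset += strides[i] * cur_ix
    return offset


shape = (2, 3, 4)
strides = contiguous_strides(shape)
indices = ([1, 0], [2, -1], [3, 0])
print(strides, flat_offset(indices, 0, shape, strides), flat_offset(indices, 1, shape, strides))
```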

Xreki · 2023-05-05T09:42:39Z

+                   value_grad->dtype(),
+                   false,
+                   &value_grad_dims_without1);
+      phi::ReshapeInferKernel<Context>(


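The purpose here is to resize value_grad, right? value_grad is an output, so can't you just call value_grad->Resize(...) directly?

The size of value_grad cannot be changed by calling resize. The dims of value_grad affect the shape of the backward gradient, which must stay consistent with the shape of value in the forward pass.

But doesn't your call to ReshapeInferKernel also modify the shape of value_grad? What I mean is: call value_grad->Resize once more at L190 to set the shape of value_grad again directly, which also avoids one memcpy inside ReshapeInferKernel.

The ReshapeInferKernel call here does not itself modify the shape of value_grad.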

Xreki · 2023-05-05T09:46:04Z

+template <typename T, size_t Rank>
+void set_zero_kernel(const int64_t N,
+                     const int64_t** indices,
+                     phi::Array<int64_t, Rank> stride,


A CPU kernel has no need for phi::Array; just use const std::vector<int64_t>& or const phi::DDim&, which also avoids a copy.

Xreki · 2023-05-05T09:47:35Z

+  }
+}
+
+template <typename T, typename Context, size_t Rank>


Don't make Rank a template parameter in the CPU kernel; your unit test coverage failed precisely because of Rank.

Xreki · 2023-05-06T02:44:03Z

+#include "paddle/phi/kernels/expand_kernel.h"
+#include "paddle/phi/kernels/nonzero_kernel.h"
+#include "paddle/phi/kernels/reshape_kernel.h"
+#include "paddle/phi/kernels/split_kernel.h"


Don't include so many header files.

Xreki · 2023-05-06T02:44:36Z

+
+#include <vector>
+#include "paddle/fluid/memory/malloc.h"
+#include "paddle/fluid/memory/memcpy.h"


Don't include header files under fluid; use the ones under the phi directory instead.

Xreki · 2023-05-06T02:47:25Z

                     PROPERTIES TIMEOUT 200)
 set_tests_properties(test_index_select_op PROPERTIES TIMEOUT 120)
 set_tests_properties(test_index_add_op PROPERTIES TIMEOUT 120)
+set_tests_properties(test_index_put_op PROPERTIES TIMEOUT 120)


Does this TIMEOUT really have to be set? What is the default?

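Without this setting, CI times out. I confirmed with the CI team that the default value is very small, apparently no more than 15s; at the time, CI reported a timeout as soon as the test ran past 15s. I'm not sure of the exact value.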

Xreki · 2023-05-06T02:47:39Z

@@ -0,0 +1,826 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.


2022 -> 2023

Xreki · 2023-05-06T02:49:38Z

+
+
+    """
+    assert len(indices) != 0, "indices can't be empty"


Dynamic graph mode doesn't need the assert; let the operator do the check internally.

paddle-ci-bot · 2023-05-06T03:28:33Z

Sorry to inform you that 7b71a3a's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

Xreki · 2023-05-08T01:23:53Z

Please fix this.

Courtesy-Xs · 2023-05-08T02:42:40Z

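Please fix this.

This looks like an issue that comes with adding a new API. I checked, and any change to the yaml triggers it; the API and Op parameters are aligned.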

zyfncg · 2023-05-08T02:45:49Z

  out->set_layout(x.layout());
 }

+void IndexPutInferMeta(const MetaTensor& x,


Place the InferMeta functions in alphabetical order.

zyfncg · 2023-05-08T02:48:17Z

+#pragma once
+
+#include <vector>
+#include "paddle/phi/common/place.h"


place.h doesn't appear to need to be included.

zyfncg · 2023-05-08T02:48:24Z

+#pragma once
+
+#include <vector>
+#include "paddle/phi/common/place.h"


Xreki · 2023-05-08T01:26:20Z

+  infer_meta :
+    func : IndexPutInferMeta
+  kernel :
+    func : index_put


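The inputs x and indices have different data types, so you need to specify whose data type is used to select the kernel. The keyword is data_type; see index_sample immediately below for how to write it.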

Xreki · 2023-05-08T02:25:30Z

+      const int64_t* pd_indices[7];
+      for (size_t i = 0; i < indices_v.size(); ++i) {
+        pd_indices[i] = indices_v[i]->data<int64_t>();
+      }


Since L108 - L111 is still used later, move it to L98 and delete the duplicated code at L121 - L124.

Xreki · 2023-05-08T02:32:41Z

+                   value_grad->dtype(),
+                   false,
+                   &value_grad_dims_without1);
+      phi::ReshapeInferKernel<Context>(


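But doesn't your call to ReshapeInferKernel also modify the shape of value_grad? What I mean is: call value_grad->Resize once more at L190 to set the shape of value_grad again directly, which also avoids one memcpy inside ReshapeInferKernel.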

Xreki · 2023-05-08T02:36:25Z

+  T* out = dev_ctx.template Alloc<T>(p_res);
+  range_kernel<T>(N, out);
+  return res;
+}


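I don't mean only that the CPU and GPU kernels duplicate each other; the forward and backward passes duplicate code as well. Resolve this with templates plus macros, or by giving the functions different names.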

Xreki · 2023-05-08T02:37:48Z

+                      const int64_t** indices,
+                      const phi::DDim& stride,
+                      const phi::DDim& shape,
+                      int64_t isSingleValTensor,


isSingleValTensor -> is_single_val_tensor

Xreki · 2023-05-08T02:38:10Z

+                          DenseTensor* out) {
+  auto* x_data = x.data<T>();
+  auto* val_data = value.data<T>();
+  bool isInitialized = out->initialized();


isInitialized -> is_initialized

Xreki · 2023-05-08T02:39:15Z

+#include "paddle/phi/kernels/reshape_kernel.h"
+#include "paddle/phi/kernels/split_kernel.h"
+
+namespace phi {


Add a namespace funcs layer.

Xreki · 2023-05-08T02:41:18Z

+  phi::DenseTensor res_tensor(tensor.dtype());
+  res_tensor.Resize(res_dim);
+  ExpandKernel<T, Context>(
+      dev_ctx, mid_tensor, IntArray(phi::vectorize(res_dim)), &res_tensor);


Calling Reshape and Expand here each incurs a memcpy, when all that is actually needed is the corresponding dims.

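At first I also wondered whether a plain resize would be enough. Earlier, when replacing
tmp_indices_v.emplace_back(DenseTensor(phi::DataType::INT64).Resize(phi::make_ddim({nonzero_indices.dims()[0],1}))); with
tmp_indices_v.emplace_back(DenseTensor(phi::DataType::INT64).Resize(phi::make_ddim({nonzero_indices.dims()[0]})));, I tried to see whether resize could eliminate some of the reshape operations.

The constraint here is that I need a src tensor and a dst tensor that satisfy the expand relationship to operate on, and I cannot modify the tensor object itself because it is a const reference, so the two copies here may be necessary.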
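
The reviewer's point is that the target dims can be derived from shape metadata alone, without touching tensor data. A minimal sketch of that computation follows, using the common NumPy-style broadcasting rule; whether Paddle's ExpandKernel follows exactly this rule is an assumption here, and `broadcast_shape` is a hypothetical helper, not a phi function.

```python
def broadcast_shape(*shapes):
    """Compute the broadcast result shape from shapes only (no data is copied)."""
    ndim = max(len(s) for s in shapes)
    result = []
    for axis in range(ndim):
        dim = 1
        for s in shapes:
            # Shapes are right-aligned; missing leading dims behave like 1.
            pos = axis - (ndim - len(s))
            d = s[pos] if pos >= 0 else 1
            if d != 1:
                if dim != 1 and dim != d:
                    raise ValueError(f"shapes {shapes} are not broadcastable")
                dim = d
        result.append(dim)
    return tuple(result)


print(broadcast_shape((3, 1), (4,)))
print(broadcast_shape((2, 1, 5), (3, 1)))
```

The author's objection still stands in one respect: knowing the dims does not by itself produce an expanded tensor, so this only helps where the downstream code needs the shape rather than the materialized data.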

Xreki · 2023-05-08T02:50:22Z

+                                      int64_t** indices,
+                                      phi::Array<int64_t, Rank> stride,
+                                      phi::Array<int64_t, Rank> shape,
+                                      int64_t isSingleValTensor,


isSingleValTensor -> is_single_val_tensor

Xreki · 2023-05-08T10:20:44Z

+    T* out = dev_ctx.template Alloc<T>(p_res);                  \
+    range_kernel<T>(N, out);                                    \
+    return res;                                                 \
+  }


I really didn't expect you to write the entire function as a macro. The simplest suggested change is as follows:

template <typename T>
void range_kernel(int64_t N, T* out) { ... }

template <typename T, typename Context>
phi::DenseTensor GetRangeTensor(const Context& dev_ctx,
                                int64_t N,
                                phi::DataType dtype) { ... }

#if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP)
template <typename T>
__global__ void range_cuda_kernel(int64_t N, T* out) { ... }

template <typename T, typename Context>
phi::DenseTensor GetRangeCudaTensor(const Context& dev_ctx,
                                    int64_t N,
                                    phi::DataType dtype) { ... }
#endif

Xreki

LGTM

Xreki

LGTM

zoooo0820 · 2023-05-09T09:59:28Z

+
+    Args:
+        x (Tensor) : The Source Tensor. Supported data types are int32, int64, float16, float32, float64, bool.
+        indices (Tensor): The tuple of Tensor containing the indices to index.


This should be List / tuple of Tensor.

This is tuple of tensor, aligned with the usage of the corresponding torch API.

Ligoml · 2023-05-09T11:18:57Z

+        indices (Tuple of Tensor): The tuple of Tensor containing the indices to index.
+            The data type of ``tensor in indices`` must be int32, int64 or bool
+        value (Tensor): The tensor used to be assigned to x.
+        accummulate (Bool): Whether the elements in values are added to x


Parameters that have default values need to be marked optional, with the default stated.

Ligoml · 2023-05-09T11:22:02Z

+
+def index_put(x, indices, value, accumulate=False, name=None):
+    """
+    Outplace version of ``index_put_`` API, the output Tensor will be inplaced with input ``x``.


Usually we say that index_put_ is the inplace version of index_put; could the docs PM please confirm whether it can be phrased the other way around? @sunzhongkai588

Ligoml · 2023-05-09T11:22:26Z

+
+    Returns:
+        Tensor, same dimention and dtype with x.
+    Examples:


The rendered preview on the official website needs to be checked here; wait for PR-CI-Paddle-Doc-Preview to finish.

Ligoml

LGTM for docs

XieYunshen

LGTM for set_tests_properties(test_index_put_op PROPERTIES TIMEOUT 120)

add index_put api

21c5464

Courtesy-Xs added 9 commits April 13, 2023 08:29

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

a75ded8

… clear_add_index_put_api

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

91c30e6

… clear_add_index_put_api

fix some bugs

9da71b6

fix value broadcast in backward and add test case in static

4538c1a

fix cpu backward bug

244d02d

add timeout=120s for index_put

01672f8

add op_compat for index_put

5a361ea

delete input_put in op_compat.yaml

a7f2d42

add inplace index_put test

d996d36

Courtesy-Xs marked this pull request as draft April 17, 2023 12:46

Courtesy-Xs marked this pull request as ready for review April 17, 2023 12:47

Courtesy-Xs marked this pull request as draft April 17, 2023 12:50

Courtesy-Xs marked this pull request as ready for review April 17, 2023 12:50

Courtesy-Xs added 6 commits April 18, 2023 03:41

refactor code

8a3fef4

add test case when index tensor in indices is int32 when indices.size…

5f77bb5

… less than x.dims

add index_put api backward in cpu place

6267d32

add backward test case

fdd0436

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

86d6cac

… clear_add_index_put_api

fix take in init.py bug

7b71a3a

Xreki reviewed May 6, 2023

Courtesy-Xs added 3 commits May 6, 2023 07:52

refactor code according to review result

48a03c6

refactor code to delete some duplicated code

0c6545a

zyfncg previously approved these changes May 8, 2023

Xreki reviewed May 8, 2023

add datatype flag in backward yaml

ed7a141

Xreki reviewed May 8, 2023

replace macro with template with conditional complilation

c92f75e

Xreki previously approved these changes May 8, 2023

fix rocmn bug

4de9b48

Courtesy-Xs dismissed Xreki’s stale review via 4de9b48 May 9, 2023 02:21

fix note and rocmn bug

ed00d81

Xreki previously approved these changes May 9, 2023

zoooo0820 reviewed May 9, 2023

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

f956aee

… clear_add_index_put_api

Courtesy-Xs dismissed Xreki’s stale review via f956aee May 9, 2023 10:22

fix conflict between flatten and index_put

43167ab

Ligoml reviewed May 9, 2023

lanxianghit previously approved these changes May 9, 2023

zyfncg approved these changes May 9, 2023

zyfncg previously approved these changes May 9, 2023

fix bug in documentation

b09221f

Courtesy-Xs dismissed stale reviews from zyfncg and lanxianghit via b09221f May 9, 2023 12:37

Ligoml reviewed May 9, 2023

Comment thread python/paddle/tensor/manipulation.py Outdated

Update python/paddle/tensor/manipulation.py

db0209f

Ligoml approved these changes May 9, 2023

XieYunshen approved these changes May 9, 2023

PaddlePaddle locked and limited conversation to collaborators May 9, 2023

PaddlePaddle unlocked this conversation May 9, 2023

raindrops2sea approved these changes May 10, 2023

lanxianghit approved these changes May 10, 2023

Xreki merged commit f3393f4 into PaddlePaddle:develop May 10, 2023

Courtesy-Xs deleted the clear_add_index_put_api branch July 7, 2023 03:14
