【Hackathon 5th No.6】 为 Paddle 增强put_along_axis API -part#59674
Conversation
69b10d3 to
c2fdb6f
Compare
|
你的PR提交成功,感谢你对开源项目的贡献! |
|
@zoooo0820 CI都过了,能否麻烦review一下,改动有点大,麻烦您了 |
| do { | ||
| assumed = old; | ||
| old = atomicCAS(address, assumed, val * assumed); | ||
| } while (assumed != old); |
| CudaAtomicAdd(imag, val.imag)); | ||
| } | ||
|
|
||
| // For atomicMul. |
There was a problem hiding this comment.
这里的atomicMul都是参考的前面的atomicAdd以及后面的atomicMin这些,只是把加改成了乘
| ): | ||
| tensor = paddle.to_tensor(input) | ||
| else: | ||
| tensor = input |
There was a problem hiding this comment.
这组atleast_xxx的改动是否和这个PR无关
There was a problem hiding this comment.
应该是之前解决冲突的时候导致的,develop分支的代码是绿色部分,不知道为啥这里显示是我这个PR里面的改动,我再改改
| *out, axis, index, value, include_self, dev_ctx); | ||
| } | ||
| } else { | ||
| PADDLE_THROW(errors::InvalidArgument( |
| *out, axis, index, value, include_self, dev_ctx); | ||
| } | ||
| } else { | ||
| PADDLE_THROW(errors::InvalidArgument( |
| } | ||
|
|
||
| int64_t index_idx = 0; | ||
| int* num_elements = new int[grad_size](); |
There was a problem hiding this comment.
此处的num_elements好像没有delete。看起来该项的几处用法都是数组语义,是否可以考虑用stl替代下,更安全一些
There was a problem hiding this comment.
已经改成了std::vector
| int64_t index_idx = index.numel() - 1; | ||
| for (int i = 0; i < grad_size; i++) { | ||
| grad_data[i] = static_cast<tensor_t>(0); | ||
| } |
There was a problem hiding this comment.
想确认下,此处的移除是因为这段赋值是非必要的,还是历史行为是错误的?
There was a problem hiding this comment.
这一段是因为value的grad没有初始化为0,我在 #59163 这个PR里面加的,我现在把value grad初始化为0挪到了put_along_axis_grad_kernel.cc以及put_along_axis_grad_kernel.cu初始化这个grad的时候。
| *x_grad, | ||
| include_self, | ||
| dev_ctx); | ||
| } else { |
There was a problem hiding this comment.
index_type现在在外部有检查吗,这里是只会有int32/int64两种情况吗
There was a problem hiding this comment.
之前就是只有这两种情况,我再加个检查
There was a problem hiding this comment.
已经在python接口加了类型检查
| self.axis_type = "int64" | ||
|
|
||
|
|
||
| class TestPutAlongAxisOpMulIncludeSelf(TestPutAlongAxisOp): |
There was a problem hiding this comment.
这几个includeself的类测试的是false的情况,命名可以加个not,更清晰一些
| self.dtype = 'float64' | ||
| self.x_type = "float64" | ||
| self.x_shape = (10, 10, 10) | ||
| self.value_type = "float64" |
There was a problem hiding this comment.
GPU的场景因为添加了很多dtype的atomicMul的方法,这里能否在单测中补充下目前支持的几个数据类型在新增的reduce方案下的case,保证新增的方案是正确的
There was a problem hiding this comment.
由于Optest不支持int类型的输入,所以我统一用的unittest在前向计算做的测试,目前已经加了float32 bfloat16 int32 int64的测试,都没问题
There was a problem hiding this comment.
由于Optest不支持int类型的输入,所以我统一用的unittest在前向计算做的测试,目前已经加了float32 bfloat16 int32 int64的测试,都没问题
因为前面代码中包含一个uint8的特殊情况,能再辛苦补充一个uint8类型的测试吗
|
@zoooo0820 CI现在都过了,能否辛苦您再review一下,看看还有哪里需要再改改,麻烦您了 |
| phi::CudaAtomicMax(self_data, *src_data); | ||
| } | ||
| template <typename tensor_t, | ||
| std::enable_if_t<std::is_same<tensor_t, uint8_t>::value>* = nullptr> |
There was a problem hiding this comment.
请问此处是因为uint8没有对应的atomic操作,所以做的特殊处理吗
There was a problem hiding this comment.
是的,没有实现uint8的atomic操作,这里的定义我是仿照这个文件之前就定义的reduce_add写的。
| self.dtype = 'float64' | ||
| self.x_type = "float64" | ||
| self.x_shape = (10, 10, 10) | ||
| self.value_type = "float64" |
There was a problem hiding this comment.
由于Optest不支持int类型的输入,所以我统一用的unittest在前向计算做的测试,目前已经加了float32 bfloat16 int32 int64的测试,都没问题
因为前面代码中包含一个uint8的特殊情况,能再辛苦补充一个uint8类型的测试吗
vivienfanghuagood
left a comment
There was a problem hiding this comment.
LGTM for api change
|
|
||
| - op : put_along_axis | ||
| args : (Tensor arr, Tensor indices, Tensor values, int axis, str reduce = "assign") | ||
| args : (Tensor arr, Tensor indices, Tensor values, int axis, str reduce = "assign", bool include_self = true) |
There was a problem hiding this comment.
there is a parameter of broadcast in Python API, shall we also add it here as include_self or delete it from API?
There was a problem hiding this comment.
broadcast is processed in the Python interface, so there is no need to pass it into the C interface again
|
The docstring needs to be modified, such as missing an introduction to the parameter of |
Done for doc of en. Chinese doc is modified at PaddlePaddle/docs#6348 |
| reduce (str, optional): The reduce operation, default is 'assign', support 'add', 'assign', 'mul' and 'multiply'. | ||
| include_self (bool, optional): whether to reduce with the elements of arr. (Only support True now) | ||
| broadcast (bool, optional): whether to broadcast indices. | ||
| reduce (str, optional): The reduce operation, default is 'assign', support 'add', 'assign', 'mul', 'multiply', "mean", "amin" and "amax". |
There was a problem hiding this comment.
| reduce (str, optional): The reduce operation, default is 'assign', support 'add', 'assign', 'mul', 'multiply', "mean", "amin" and "amax". | |
| reduce (str, optional): The reduce operation, default is 'assign', support 'add', 'assign', 'mul', 'multiply', 'mean', 'amin' and 'amax'. |
和前文统一吧
* 【Hackathon 5th No.6】 为 Paddle 增强put_along_axis API -part (#59674) * fix bug of put_along_axis (#60551) * Improve the performence of put_along_axis (#60618) * fix bug of put_along_axis * improve performence of put_along_axis * [Bug-Fix] fix compile bug of cudaxxxAsync (#60934) --------- Co-authored-by: YibLiu <[email protected]>
PR types
New features
PR changes
APIs
Description
这个PR的改动为: