
Conversation

@HawkAaron (Contributor) commented Jul 20, 2018

1. fix ctc_loss GPU bug
2. add blank label for gluon CTCLoss (edit @szha: crossed out second item)

length respectively.
weight : float or None
Global scalar weight for loss.
blank_label : {'first', 'last'}, default 'last'
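
Not part of the patch; a minimal sketch of what the two blank_label settings are generally understood to mean for the blank index (assumed semantics, for illustration only):

// Illustrative only, not taken from the MXNet source.
//   'first' -> blank occupies index 0, real labels use 1..alphabet_size-1
//   'last'  -> real labels use 0..alphabet_size-2, blank is the final index
inline int blank_index(int alphabet_size, bool blank_is_first) {
  return blank_is_first ? 0 : alphabet_size - 1;
}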
Member:

It was intentional not to expose this option in gluon.

Contributor Author:

Does that mean I need to revert this commit and resend a pull request?

Member:

You can simply add a commit that removes the blank_label change.

@chinakook (Contributor):

Has the speed issue with large vocabulary sizes been solved for this op?

@HawkAaron (Contributor Author):

@chinakook it looks good to me.

@szha (Member) commented Jul 25, 2018

@Jerryzcn could you give this a try?

@Jerryzcn (Contributor):

Will do.

@Jerryzcn (Contributor) commented Jul 26, 2018

Thanks @HawkAaron.

It seems removing the max subtraction negatively affects convergence. Do you observe a similar result?

In the original Baidu CTC, there is a section that does the max subtraction. I suspect the broadcast is too slow here. Maybe we should write a function for this.

for(int r = 0; r < alphabet_size_; ++r) {
    probs[r + col_offset] = std::exp(activations[r + col_offset] - max_activation);
    denom += probs[r + col_offset];
}
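
For reference, a minimal self-contained sketch of the same max-subtraction trick applied to a single column of activations; the names below are illustrative and not taken from warp-ctc or the MXNet operator:

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax over one column of raw activations.
// Subtracting the column maximum keeps every exponent <= 0, so std::exp cannot
// overflow for large logits; the result is unchanged because the common factor
// exp(max) cancels out in the normalization.
std::vector<double> stable_softmax(const std::vector<double>& activations) {
  const double max_activation =
      *std::max_element(activations.begin(), activations.end());
  std::vector<double> probs(activations.size());
  double denom = 0.0;
  for (std::size_t r = 0; r < activations.size(); ++r) {
    probs[r] = std::exp(activations[r] - max_activation);
    denom += probs[r];
  }
  for (double &p : probs) p /= denom;
  return probs;
}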

@Jerryzcn (Contributor) left a comment:

Subtracting the max is missing.

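// (Reviewed hunk, truncated in this excerpt.) The max is reduced along one
// axis, broadcast back, and subtracted from log_probs before exp; the exps
// are then summed to form the softmax denominators, keeping exp() from overflowing.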
denoms_handle = reduce_with_axis<red::sum, false>(
    F<mxnet::op::mshadow_op::exp>(
        log_probs_handle -
        broadcast<0>(reduce_with_axis<red::maximum, false>(log_probs_handle, 1),
Contributor:

The max is necessary here.
@Jerryzcn (Contributor) commented Jul 27, 2018:

I see. I did not look at the other parts of the code, thanks! Found a bug on my end.

@szha merged commit 2bddf6f into apache:master on Jul 27, 2018
@szha (Member) commented Jul 27, 2018

@HawkAaron thanks for the fix, and @Jerryzcn thanks for the review

XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
* fix ctc_loss GPU bug

* add blank_label parameter for CTCLoss

* Revert "add blank_label parameter for CTCLoss"

This reverts commit aab11f7.