
Conversation

@HawkAaron (Contributor) commented Jul 20, 2018

1. fix ctc_loss GPU bug
2. add blank label for gluon CTCLoss (edit @szha: crossed out second item)

length respectively.
weight : float or None
Global scalar weight for loss.
blank_label : {'first', 'last'}, default 'last'
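
Not part of the patch; a minimal sketch of what the two blank_label settings are generally understood to mean for the blank index (assumed semantics, for illustration only):

// Illustrative only, not taken from the MXNet source.
//   'first' -> blank occupies index 0, real labels use 1..alphabet_size-1
//   'last'  -> real labels use 0..alphabet_size-2, blank is the final index
inline int blank_index(int alphabet_size, bool blank_is_first) {
  return blank_is_first ? 0 : alphabet_size - 1;
}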
Member:

It was intentional not to expose this option in gluon.

Contributor Author:

Does that mean I need to revert this commit and resend a pull request?

Member:

You can simply add a commit that removes the blank_label change.

@chinakook (Contributor):

Has the speed issue with large vocabulary sizes been solved for this op?

@HawkAaron (Contributor Author):

@chinakook it looks good to me.

@szha (Member) commented Jul 25, 2018

@Jerryzcn could you give this a try?

@Jerryzcn (Contributor):

Will do.

@Jerryzcn (Contributor) commented Jul 26, 2018

Thanks @HawkAaron.

It seems removing the max subtraction negatively affects convergence. Do you observe a similar result?

In the original Baidu CTC, there is a section that does the max subtraction. I suspect the broadcast is too slow here. Maybe we should write a function for this.

for(int r = 0; r < alphabet_size_; ++r) {
    probs[r + col_offset] = std::exp(activations[r + col_offset] - max_activation);
    denom += probs[r + col_offset];
}
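
For reference, a minimal self-contained sketch of the same max-subtraction trick applied to a single column of activations; the names below are illustrative and not taken from warp-ctc or the MXNet operator:

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax over one column of raw activations.
// Subtracting the column maximum keeps every exponent <= 0, so std::exp cannot
// overflow for large logits; the result is unchanged because the common factor
// exp(max) cancels out in the normalization.
std::vector<double> stable_softmax(const std::vector<double>& activations) {
  const double max_activation =
      *std::max_element(activations.begin(), activations.end());
  std::vector<double> probs(activations.size());
  double denom = 0.0;
  for (std::size_t r = 0; r < activations.size(); ++r) {
    probs[r] = std::exp(activations[r] - max_activation);
    denom += probs[r];
  }
  for (double &p : probs) p /= denom;
  return probs;
}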

@Jerryzcn (Contributor) left a comment:

Subtracting the max is missing.

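// (Reviewed hunk, truncated in this excerpt.) The max is reduced along one
// axis, broadcast back, and subtracted from log_probs before exp; the exps
// are then summed to form the softmax denominators, keeping exp() from overflowing.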
denoms_handle = reduce_with_axis<red::sum, false>(
    F<mxnet::op::mshadow_op::exp>(
        log_probs_handle -
        broadcast<0>(reduce_with_axis<red::maximum, false>(log_probs_handle, 1),
Contributor:

The max is necessary here.
@Jerryzcn (Contributor) commented Jul 27, 2018:

I see. I did not look at the other parts of the code, thanks! Found a bug on my end.

@szha merged commit 2bddf6f into apache:master on Jul 27, 2018
@szha (Member) commented Jul 27, 2018

@HawkAaron thanks for the fix, and @Jerryzcn thanks for the review

XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
* fix ctc_loss GPU bug

* add blank_label parameter for CTCLoss

* Revert "add blank_label parameter for CTCLoss"

This reverts commit aab11f7.