This repository was archived by the owner on Nov 17, 2023. It is now read-only.

upcasting accumulation type in norm increases loss/Perplexity  #14760

@nswamy

Description

While fixing an issue described here, I applied the changes from master and found that the change from #14616 follows the same pattern as the softmax change in #14098: it increases the loss, and hence the validation perplexity, from 126.66 to 135. Without this change, and with the softmax fix from #14759, validation perplexity comes back to 126.66.
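For readers unfamiliar with the term, here is a minimal NumPy sketch of what "accumulation type" means for an L2 norm over float16 data. It does not reproduce the MXNet kernel from #14616 and makes no claim about the cause of the regression; the helper `l2_norm` and its `acc_dtype` parameter are hypothetical names used only for illustration.

```python
import numpy as np


def l2_norm(x, acc_dtype):
    """L2 norm of x, squaring and summing in acc_dtype (hypothetical helper).

    The result is cast back to the input dtype, so only the precision of the
    intermediate accumulation differs between calls.
    """
    squares = np.square(x.astype(acc_dtype))
    acc = np.sum(squares, dtype=acc_dtype)
    return np.sqrt(acc).astype(x.dtype)


rng = np.random.default_rng(0)
x = rng.standard_normal(4096).astype(np.float16)

# Same float16 input, two accumulation types.
norm_fp16_acc = l2_norm(x, np.float16)  # accumulate in the input type
norm_fp32_acc = l2_norm(x, np.float32)  # upcast accumulation to float32

# float64 reference for comparison.
reference = np.sqrt(np.sum(np.square(x.astype(np.float64))))

print("fp16 accumulation:", float(norm_fp16_acc))
print("fp32 accumulation:", float(norm_fp32_acc))
print("fp64 reference:   ", float(reference))
```

The two results can differ in the last float16 bits, which is the kind of numerical change an accumulation-type switch introduces; whether that explains the perplexity difference reported here would need to be checked with the steps in #14722.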

Details on how to test are provided in #14722

@haojin2 @eric-haibin-lin
