[Doc issue] RMSNorm formula #136597

@d-kleine

📚 The doc issue

Description:

There appears to be a discrepancy in the RMSNorm formula as documented on the PyTorch docs website compared to the original formulation presented in the paper "Root Mean Square Layer Normalization" (arXiv:1910.07467).

Details:

The formula for RMSNorm in the PyTorch documentation is currently represented as:

$$ y = \frac{x}{\sqrt{\text{RMS}[x] + \epsilon}} * \gamma $$

However, according to the original paper, the correct formula should be:

$$ y_i = \frac{x_i}{\text{RMS}(x)} \gamma_i, \quad \text{where} \quad \text{RMS}(x) = \sqrt{\epsilon + \frac{1}{n} \sum_{j=1}^{n} x_j^2} $$

Issue:

  • The PyTorch documentation formula takes the square root of RMS[x] + epsilon. Because RMS[x] already contains a square root, the documented formula applies the root twice, which the original paper's formula does not do.
  • This discrepancy could lead to confusion or incorrect implementation by users relying on the PyTorch documentation.
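
To make the difference concrete, here is a minimal pure-Python sketch that evaluates both formulas as written above. The function names (`mean_square`, `rmsnorm_paper`, `rmsnorm_as_documented`) and the sample input are hypothetical and only illustrate a literal reading of each formula; this is not PyTorch's actual implementation.

```python
import math


def mean_square(x):
    """Mean of the squared elements of x."""
    return sum(v * v for v in x) / len(x)


def rmsnorm_paper(x, gamma, eps=1e-6):
    # Formula quoted from the paper in this issue:
    # y_i = x_i / sqrt(eps + mean(x^2)) * gamma_i
    denom = math.sqrt(eps + mean_square(x))
    return [v / denom * g for v, g in zip(x, gamma)]


def rmsnorm_as_documented(x, gamma, eps=1e-6):
    # Literal reading of the documented formula:
    # y = x / sqrt(RMS[x] + eps) * gamma, with RMS[x] = sqrt(mean(x^2))
    rms = math.sqrt(mean_square(x))
    denom = math.sqrt(rms + eps)
    return [v / denom * g for v, g in zip(x, gamma)]


# Hypothetical sample input with unit gain.
x = [2.0, 4.0, 4.0, 6.0]
gamma = [1.0] * len(x)

paper = rmsnorm_paper(x, gamma)
documented = rmsnorm_as_documented(x, gamma)

# With unit gain, the paper's formula yields an output whose mean square is
# approximately 1; the literal documented formula does not.
print(mean_square(paper))
print(mean_square(documented))
```

Under the first formula the output is rescaled to a root mean square of roughly 1, while a literal reading of the documented formula leaves the scale dependent on the input.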

Suggest a potential alternative/fix

Suggested Correction:

Please update the PyTorch documentation to reflect the correct formula as per the original paper. The corrected formula should be:

$$ y = \frac{x}{\sqrt{\text{MS}[x] + \epsilon}} * \gamma $$

where MS is the Mean Square.
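
As a quick check, assuming MS[x] denotes the plain mean of the squared elements, the denominator of the suggested formula expands to the RMS(x) expression quoted from the paper earlier in this issue:

$$ \text{MS}[x] = \frac{1}{n} \sum_{i=1}^{n} x_i^2, \qquad \sqrt{\text{MS}[x] + \epsilon} = \sqrt{\epsilon + \frac{1}{n} \sum_{i=1}^{n} x_i^2} = \text{RMS}(x) $$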

Aligning the documented formula with the original paper removes the ambiguity about where the square root and epsilon apply, reduces the risk of incorrect reimplementations by users who rely on the documentation, and keeps the docs consistent with other references.

cc @svekars @brycebortree @sekyondaMeta @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki

    Labels

    module: docs (Related to our documentation, both in docs/ and docblocks)
    module: nn (Related to torch.nn)
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
