📚 The doc issue
Description:
There appears to be a discrepancy in the RMSNorm formula as documented on the PyTorch docs website compared to the original formulation presented in the paper "Root Mean Square Layer Normalization" (arXiv:1910.07467).
Details:
The formula for RMSNorm in the PyTorch documentation is currently represented as:

$$y = \frac{x}{\sqrt{\mathrm{RMS}[x] + \epsilon}} * \gamma$$
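However, according to the original paper, the correct formula should be:

$$y = \frac{x}{\mathrm{RMS}[x]} * \gamma, \qquad \mathrm{RMS}[x] = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$$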
Issue:
- The PyTorch documentation formula includes an additional square root operation around RMS[x] and epsilon, which is not present in the original paper's formula.
- This discrepancy could lead to confusion or incorrect implementation by users relying on the PyTorch documentation.
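To make the size of the discrepancy concrete, here is a minimal sketch in plain Python (no PyTorch) that evaluates both forms on a small vector. The function names, the test vector, and the default `eps=1e-6` are assumptions chosen for illustration; the second function is a literal reading of the documented formula as described above, not a claim about what `torch.nn.RMSNorm` actually computes.

```python
import math

def mean_square(x):
    """Mean of the squared elements of x."""
    return sum(v * v for v in x) / len(x)

def rmsnorm_paper(x, gamma):
    """x / RMS[x] * gamma, the form given in arXiv:1910.07467 (no epsilon)."""
    rms = math.sqrt(mean_square(x))
    return [v / rms * g for v, g in zip(x, gamma)]

def rmsnorm_as_documented(x, gamma, eps=1e-6):
    """x / sqrt(RMS[x] + eps) * gamma, the documented formula read literally.

    eps=1e-6 is an assumed value for illustration only.
    """
    rms = math.sqrt(mean_square(x))
    return [v / math.sqrt(rms + eps) * g for v, g in zip(x, gamma)]

x = [3.0, 4.0, 12.0, 0.0]        # mean square = 42.25, so RMS[x] = 6.5
gamma = [1.0] * len(x)           # unit gain, so only the normalization differs

paper_out = rmsnorm_paper(x, gamma)
documented_out = rmsnorm_as_documented(x, gamma)

# RMS of each output: the paper's form rescales the vector to unit RMS,
# while the literal reading leaves an RMS of roughly sqrt(RMS[x]).
print(math.sqrt(mean_square(paper_out)))
print(math.sqrt(mean_square(documented_out)))
```

With this input the paper's form returns a vector with RMS 1, whereas the literal reading returns one with RMS close to `sqrt(6.5)`, so the two formulas are not interchangeable.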
Suggest a potential alternative/fix
Suggested Correction:
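Please update the PyTorch documentation to reflect the correct formula as per the original paper. The corrected formula should be:

$$y = \frac{x}{\sqrt{\mathrm{MS}[x] + \epsilon}} * \gamma$$

where MS is the Mean Square, $\mathrm{MS}[x] = \frac{1}{n}\sum_{i=1}^{n} x_i^2$.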
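As a quick sanity check of the suggested form, the sketch below (plain Python, with a hypothetical function name and an assumed `eps=1e-6`) shows that dividing by `sqrt(MS[x] + eps)` agrees with the paper's `x / RMS[x]` up to a tiny epsilon-sized difference, and that the epsilon keeps the result finite for an all-zero input.

```python
import math

def mean_square(x):
    """Mean of the squared elements of x (the MS term)."""
    return sum(v * v for v in x) / len(x)

def rmsnorm_suggested(x, gamma, eps=1e-6):
    """x / sqrt(MS[x] + eps) * gamma; eps=1e-6 is an assumed value."""
    denom = math.sqrt(mean_square(x) + eps)
    return [v / denom * g for v, g in zip(x, gamma)]

x = [3.0, 4.0, 12.0, 0.0]            # MS[x] = 42.25, RMS[x] = 6.5
gamma = [1.0] * len(x)

suggested_out = rmsnorm_suggested(x, gamma)
paper_out = [v / 6.5 for v in x]     # paper's x / RMS[x], using RMS[x] = 6.5

# Largest element-wise difference between the two forms for this input.
max_gap = max(abs(s - p) for s, p in zip(suggested_out, paper_out))
print(max_gap)

# An all-zero input would divide by zero without eps; with eps it stays zero.
zero_out = rmsnorm_suggested([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
print(zero_out)
```

The only difference from the paper's formula is the epsilon inside the square root, which is there for numerical stability rather than as a change to the normalization itself.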
Aligning the documentation with the paper's formula would make clear how RMSNorm is intended to function and reduce the risk of implementation errors. It would also keep the PyTorch documentation consistent with other resources that describe RMSNorm.
cc @svekars @brycebortree @sekyondaMeta @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki