1 parent f2bd40e commit 1987422
olmocr/train/grpo_train.py
@@ -1,5 +1,5 @@
"""
-GRPO (Generative Reward-based Policy Optimization) training script for OlmOCR.
+GRPO (Group Relative Policy Optimization) training script for OlmOCR.


import argparse