🔨 Tweak translation script #15174

Merged
tiangolo merged 3 commits into master from tweak-translation-script
Apr 23, 2026

@YuriiMotov YuriiMotov commented Mar 20, 2026

1. Instruct the LLM to preserve the order of links from the original document.

In some languages (ja, ko), sentences may be inverted in translation, which changes the order of the links. But our translation fixer tool relies on links appearing in the same order as in the original document.
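To illustrate why link order matters, here is a minimal sketch of an order check. It is not the code from `scripts/translate.py`: the regex, the function names `link_targets` and `check_links`, and the error messages are all hypothetical, and the pattern only handles simple inline links of the form `[text](url)`.

```python
import re
from typing import List, Optional

# Hypothetical pattern for simple inline links: [text](url) or [text](url "title").
# It also matches images and ignores reference-style links; good enough for a sketch.
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)\s]+)[^)]*\)")


def link_targets(markdown: str) -> List[str]:
    """Return link URLs in the order they appear in the text."""
    return [match.group(2) for match in LINK_RE.finditer(markdown)]


def check_links(original: str, translated: str) -> Optional[str]:
    """Return an error message (hypothetical wording) or None if links line up."""
    expected = link_targets(original)
    actual = link_targets(translated)
    if len(expected) != len(actual):
        return f"link count differs: {len(actual)} in translation, {len(expected)} in original"
    for position, (want, got) in enumerate(zip(expected, actual), start=1):
        if want != got:
            return f"link at position {position} differs: expected {want}, got {got}"
    return None
```

A translation that keeps every link but swaps two of them passes a count-only check and still fails this one, which is the case the new prompt instruction is meant to prevent.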

Commit: 7c5797c

2. On retry, pass the validation error and the result of the previous attempt

If a translation fails validation, we just retry it with the same input. So a retry is kind of a lottery: we just hope the next attempt turns out better.

If we instead update the prompt to include the validation error text and pass the current result (instead of the initial translation), the results improve:

Before the changes, the output was often something like:


$ python scripts/translate.py translate-page --en-path docs/en/docs/index.md --language ja
Found existing translation: docs/ja/docs/index.md
Translating docs/en/docs/index.md to ja (日本語)
Running agent for docs/ja/docs/index.md (attempt 1/3)
Failed on attempt 1/3: Number of markdown links does not match the number in the original document (6 vs 40)
Running agent for docs/ja/docs/index.md (attempt 2/3)
Failed on attempt 2/3: Number of markdown links does not match the number in the original document (6 vs 40)
Running agent for docs/ja/docs/index.md (attempt 3/3)
Failed on attempt 3/3: Number of markdown links does not match the number in the original document (14 vs 40)
Translation failed for docs/ja/docs/index.md after 3 attempts
Saving translation to docs/ja/docs/index.md

(the result still doesn't pass validation)

After the changes, it still needs several attempts, but the result improves with every attempt:

$ python scripts/translate.py translate-page --en-path docs/en/docs/index.md --language ja
Found existing translation: docs/ja/docs/index.md
Translating docs/en/docs/index.md to ja (日本語)
Running agent for docs/ja/docs/index.md (attempt 1/3)
Failed on attempt 1/3: Number of markdown links does not match the number in the original document (6 vs 40)
Running agent for docs/ja/docs/index.md (attempt 2/3)
Failed on attempt 2/3: Number of markdown links does not match the number in the original document (39 vs 40)
Running agent for docs/ja/docs/index.md (attempt 3/3)
Saving translation to docs/ja/docs/index.md

(after the third attempt it gave a valid result)

Commit: c7ca144

The diff is actually not as big as GitHub shows. I just moved the prompt creation logic out of translate_page and changed it a bit to add the validation error from the last attempt and to pass the result of the last attempt instead of the initial translation.
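For readers who want the idea without opening the diff, a rough sketch of the retry-with-feedback pattern follows. It is an illustration only, not the code from the commit: `build_prompt`, `translate_with_feedback`, the section headings in the prompt, and the `run_agent` / `validate` callables are all hypothetical names.

```python
from typing import Callable, Optional, Tuple


def build_prompt(
    base_prompt: str,
    source_text: str,
    previous_result: Optional[str] = None,
    validation_error: Optional[str] = None,
) -> str:
    """Assemble the prompt; the section headings are made up for this sketch."""
    parts = [base_prompt, "### Original document", source_text]
    if previous_result is not None:
        parts += ["### Previous translation", previous_result]
    if validation_error is not None:
        parts += ["### Validation error to fix", validation_error]
    return "\n\n".join(parts)


def translate_with_feedback(
    source_text: str,
    existing_translation: Optional[str],
    run_agent: Callable[[str], str],
    validate: Callable[[str, str], Optional[str]],
    base_prompt: str = "Translate the document below.",
    max_attempts: int = 3,
) -> Tuple[Optional[str], Optional[str]]:
    """Retry translation, feeding each failure back into the next prompt."""
    current = existing_translation  # attempt 1 starts from the existing translation
    error: Optional[str] = None
    for _ in range(max_attempts):
        prompt = build_prompt(base_prompt, source_text, current, error)
        current = run_agent(prompt)  # later attempts see the last result
        error = validate(source_text, current)
        if error is None:
            break
    # Return the last result even on failure, plus the remaining error (if any).
    return current, error
```

The point of the design is that each attempt starts from the latest output and knows what was wrong with it, rather than repeating the same request with the same input.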

codspeed-hq Bot commented Mar 20, 2026

Merging this PR will not alter performance

✅ 20 untouched benchmarks


Comparing tweak-translation-script (1931046) with master (41df44a) [1]

Footnotes

  1. No successful run was found on master (2fa00db) during the generation of this report, so 41df44a was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@YuriiMotov YuriiMotov marked this pull request as ready for review March 22, 2026 20:31

@tiangolo tiangolo left a comment

Nice, thank you! 🙌

@tiangolo tiangolo merged commit 38039e1 into master Apr 23, 2026
38 checks passed
@tiangolo tiangolo deleted the tweak-translation-script branch April 23, 2026 16:35