Pull request overview
This PR bumps the version to 25.12.33 and introduces a major refactoring of the TTS Speech Markup Language (SML) system, transitioning from simple string tokens to a structured dictionary with compiled regex patterns and token values.
Key Changes
- SML Structure Overhaul: Changed `TTS_SML` from simple string mappings to dictionaries containing `'match'` (compiled regex) and `'token'` (string value) keys, enabling more flexible pattern matching
- Sentence Processing Refactor: Completely rewrote `get_sentences()` function in `lib/core.py` with new multi-pass splitting algorithm
- TTS Engine Updates: All TTS engines (XTTS, YourTTS, VITS, Tacotron, Fairseq, Bark) refactored to use new SML handling with dedicated `convert_sml()` and `set_voice()` methods
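As a rough illustration of the restructured mapping described in the first bullet, the sketch below pairs each entry with a compiled regex under `'match'` and a string under `'token'`. Only that two-key shape comes from the PR description; the entry names (`break`, `pause`), the bracket syntax, and the helper `find_sml_tokens` are assumptions made for this example.

```python
import re

# Hypothetical entries: key names and bracket syntax are assumptions.
# Only the {'match': compiled regex, 'token': str} shape is from the PR.
TTS_SML_SKETCH = {
    "break": {"match": re.compile(r"\[break\]"), "token": "[break]"},
    "pause": {
        # Optional numeric argument, e.g. [pause:1.5]
        "match": re.compile(r"\[pause(?::(\d+(?:\.\d+)?))?\]"),
        "token": "[pause]",
    },
}

def find_sml_tokens(text):
    """Return (entry_name, matched_text) pairs in order of appearance."""
    hits = []
    for name, entry in TTS_SML_SKETCH.items():
        for m in entry["match"].finditer(text):
            hits.append((m.start(), name, m.group(0)))
    hits.sort()  # order by position in the text
    return [(name, raw) for _, name, raw in hits]
```

Compared with plain string tokens, a compiled pattern per entry lets one entry match variants of the same marker, such as a pause with or without a duration.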
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 23 comments.
| File | Description |
|---|---|
| pyproject.toml, VERSION.txt, docker-compose.yml, podman-compose.yml | Version bumped to 25.12.33 |
| Dockerfile | Critical Issue: Version set to 26.1.3 instead of 25.12.33 |
| lib/conf_models.py | TTS_SML restructured with regex patterns and tokens; added default_sml_pattern |
| lib/conf_lang.py | Removed double-quote character from chars_remove list |
| lib/core.py | Major refactoring: rewrote get_sentences(), filter_sml(), updated chapter/sentence processing with new resume logic |
| lib/gradio.py | Minor UI fixes for variable naming consistency |
| lib/classes/tts_engines/*.py | All TTS engines refactored with new convert_sml() and set_voice() methods for SML token processing |
| lib/classes/tts_engines/common/headers.py | Added default_sml_pattern to imports |
| lib/__init__.py | Added default_sml_pattern to exports |
Critical Issues Found: Multiple bugs including version mismatch, incorrect pattern matching, indexing errors, return type mismatches, and broken sentence/chapter numbering logic that will affect resume functionality and VTT generation.
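The version mismatch flagged for the Dockerfile is the kind of issue a small consistency check can catch. The sketch below compares the version strings listed in the table against the expected value; the helper name `find_version_mismatches` is hypothetical, and how each version would actually be read from its file is left out.

```python
def find_version_mismatches(declared, expected):
    """Return {file: version} for every file whose version differs from `expected`."""
    return {name: ver for name, ver in declared.items() if ver != expected}

# Version strings as reported in the review table above.
declared_versions = {
    "pyproject.toml": "25.12.33",
    "VERSION.txt": "25.12.33",
    "docker-compose.yml": "25.12.33",
    "podman-compose.yml": "25.12.33",
    "Dockerfile": "26.1.3",
}

mismatches = find_version_mismatches(declared_versions, "25.12.33")
```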
      start = sentence_num
      if c in missing_chapters:
          msg = f'********* Recovering missing block {c} *********'
          print(msg)
      elif resume_chapter == c and c > 0:
          msg = f'********* Resuming from block {resume_chapter} *********'
          print(msg)
      msg = f'Block {chapter_idx} containing {len(sentences)} sentences…'
      print(msg)
    - for i, sentence in enumerate(sentences):
    + for sentence_num, sentence in enumerate(sentences):
`sentence_num` is read before it is assigned. `start = sentence_num` (line 1688) runs before the `for sentence_num, sentence in enumerate(sentences)` loop on line 1697 binds the variable, so it may be undefined at that point. Either initialize `sentence_num` to 0 before line 1684 or use a separate variable name for the starting index.
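One way to avoid the read-before-assignment is to keep the starting index and the loop counter in separate, explicitly initialized names. This is a minimal sketch of that pattern, not the code in `lib/core.py`; the function `number_sentences` and its parameter `start_index` are hypothetical.

```python
def number_sentences(sentences, start_index=0):
    """Return (global_sentence_number, sentence) pairs for one block.

    `start_index` is bound before the loop, so reading it is always safe,
    and the loop counter has its own name instead of reusing it.
    """
    start = start_index  # defined before any read
    numbered = []
    for offset, sentence in enumerate(sentences):
        numbered.append((start + offset, sentence))
    return numbered
```

For example, `number_sentences(["a", "b"], 5)` numbers the two sentences 5 and 6, which is what a running sentence counter across blocks would need.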
    if chapter_num <= resume_chapter:
        msg = f'**Recovering missing file block {chapter_num}'
        print(msg)
    if chapter_idx in missing_chapters or sentence_num > resume_sentence:
The condition on line 1724 tests `chapter_idx in missing_chapters`, but the two sides use different index bases: `missing_chapters` contains 0-indexed chapter numbers (from the range check on line 1638), while `chapter_idx` is 1-indexed (`c + 1`). The membership test is therefore off by one and will misidentify which chapters need to be combined.
Suggested change:

    - if chapter_idx in missing_chapters or sentence_num > resume_sentence:
    + if c in missing_chapters or sentence_num > resume_sentence:
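The effect of mixing index bases can be seen in a small standalone sketch. Both functions below are hypothetical and only mirror the shape of the check under review: `missing_chapters` is assumed to hold 0-based indices, `c` is the 0-based loop index, and `chapter_idx = c + 1` is used for display only.

```python
def select_chapters(total, missing_chapters):
    """Correct check: compare the 0-based index `c` with the 0-based set."""
    selected = []
    for c in range(total):
        chapter_idx = c + 1  # 1-based label, for messages only
        if c in missing_chapters:
            selected.append(chapter_idx)
    return selected

def select_chapters_mixed_bases(total, missing_chapters):
    """Faulty check: compares the 1-based label with the 0-based set."""
    selected = []
    for c in range(total):
        chapter_idx = c + 1
        if chapter_idx in missing_chapters:
            selected.append(chapter_idx)
    return selected
```

With five chapters and `missing_chapters = {0, 3}`, the first function selects chapters 1 and 4, while the second selects only chapter 3, which was never missing.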