Add small.en-tdrz checkpoint and initial support for speakerturn in decoding results #4

Merged
akashmjn merged 4 commits into main from decode-spkturn on Mar 30, 2023
akashmjn (Owner) commented on Mar 30, 2023

This PR adds support for a new model checkpoint, selected with --model small.en-tdrz:

  • This model was fine-tuned to output <|speakerturn|> tokens.
  • As a quick hack, <|speakerturn|> simply replaces the <|startoflm|> token, which was unused in the original whisper models (see "how to use <|startoflm|> in whisper", openai/whisper#414). This keeps the model structure exactly the same.
  • Small changes are made to the tokenizer, transcribe, and decoding code so that this special token is not suppressed during decoding.
  • A follow-up PR will control this via a flag (for now these tokens are always decoded) and will force sampling of a timestamp token after a speaker turn token.
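
To illustrate the idea of not suppressing the speaker-turn token, here is a minimal, self-contained sketch. It is not the code from this PR: the token ids, the `SPECIAL_TOKEN_IDS` table, and the helpers `build_suppress_ids` and `apply_suppression` are all hypothetical and exist only for illustration. The sketch assumes decoding masks a list of special-token ids by setting their logits to negative infinity, and shows that leaving one id off that list lets the token be sampled.

```python
# Hypothetical special-token table; the real ids in whisper's tokenizer differ.
SPECIAL_TOKEN_IDS = {
    "<|startoftranscript|>": 0,
    "<|speakerturn|>": 1,  # assumed to occupy the slot formerly used by <|startoflm|>
    "<|nospeech|>": 2,
    "<|notimestamps|>": 3,
}


def build_suppress_ids(special_ids, keep=("<|speakerturn|>",)):
    """Return the sorted ids to mask during decoding, skipping tokens named in `keep`."""
    return sorted(i for name, i in special_ids.items() if name not in keep)


def apply_suppression(logits, suppress_ids):
    """Return a copy of `logits` with the suppressed positions set to -inf."""
    masked = list(logits)
    for i in suppress_ids:
        masked[i] = float("-inf")
    return masked


suppress_ids = build_suppress_ids(SPECIAL_TOKEN_IDS)

# Toy logits over a 5-token vocabulary (ids 0-3 are special, id 4 is ordinary text).
logits = [0.1, 2.5, 0.3, 0.2, 1.0]
masked = apply_suppression(logits, suppress_ids)

# Greedy pick: the speaker-turn token survives masking and can be selected.
best = max(range(len(masked)), key=masked.__getitem__)
print(suppress_ids, best)
```

Passing `keep=()` in this sketch would suppress the speaker-turn token as well, which is roughly the kind of switch the follow-up flag mentioned above could control.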
