Tokenizer saving/loading #15
Closed
Description
We need to provide a way to save and load tokenizers to/from files.
Things that need to be saved:
- Each part (Normalizer, PreTokenizer, ...) and its options
- Added tokens / special tokens
- The model's vocabulary
There are several ways to approach this, but the end goal is a single self-contained file that represents a tokenizer. We will probably also need scripts to convert existing models to this new format.
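To make the single-file idea concrete, here is a minimal sketch in Python, assuming a JSON layout. Everything in it is hypothetical and only for illustration: the field names (`version`, `components`, `added_tokens`, `model`), the functions `save_tokenizer` and `load_tokenizer`, and the sample values are not the library's actual format or API.

```python
import json


def save_tokenizer(path, components, added_tokens, vocab):
    """Write every piece listed above into one JSON file (hypothetical layout)."""
    payload = {
        "version": "1.0",              # assumed field, lets the format evolve later
        "components": components,      # e.g. normalizer / pre-tokenizer and their options
        "added_tokens": added_tokens,  # added and special tokens
        "model": {"vocab": vocab},     # the model's vocabulary (token -> id)
    }
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII tokens human-readable in the file
        json.dump(payload, f, ensure_ascii=False, indent=2)


def load_tokenizer(path):
    """Read the file back and return the same three pieces."""
    with open(path, encoding="utf-8") as f:
        payload = json.load(f)
    return payload["components"], payload["added_tokens"], payload["model"]["vocab"]


# Illustrative values only; the option names here are made up for the example.
example_components = {
    "normalizer": {"type": "Lowercase"},
    "pre_tokenizer": {"type": "Whitespace"},
}
example_added_tokens = [{"content": "[UNK]", "special": True}]
example_vocab = {"[UNK]": 0, "hello": 1, "world": 2}
```

A text format like JSON keeps the file inspectable and easy to target from conversion scripts, at the cost of size for large vocabularies. A binary format would be an equally valid choice for the same design.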