🚀 Feature request
BertTokenizerFast accepts strip_accents=False, but BertTokenizer has no such option. This inconsistency should be removed by adding a strip_accents parameter to BertTokenizer.
Motivation
Without this parameter, BertTokenizer cannot be used for language models that are lowercase but keep accents.
For such a model (lowercase, with accents), you are forced to load the tokenizer like this:
tokenizer = AutoTokenizer.from_pretrained("<model_name_or_path>", use_fast=True, strip_accents=False)
This will NOT work: tokenizer = AutoTokenizer.from_pretrained("<model_name_or_path>")
Nor will this, since use_fast=True is missing: tokenizer = AutoTokenizer.from_pretrained("<model_name_or_path>", strip_accents=False)
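To make the problem concrete, here is a minimal sketch of what accent stripping typically does to lowercased text. It is not the library's code: the helper names strip_accents_sketch and normalize_sketch are hypothetical, and the sketch assumes accent stripping means NFD decomposition followed by dropping combining marks (Unicode category Mn).

```python
import unicodedata


def strip_accents_sketch(text: str) -> str:
    """Hypothetical helper: drop combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers combining accents such as U+0308.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def normalize_sketch(text: str, do_lower_case: bool = True,
                     strip_accents: bool = True) -> str:
    """Hypothetical normalizer where lowercasing and accent stripping
    are controlled by two independent flags."""
    if do_lower_case:
        text = text.lower()
    if strip_accents:
        text = strip_accents_sketch(text)
    return text


# Lowercase and strip accents: the accented characters are lost.
print(normalize_sketch("Über Café"))
# Lowercase only: the behavior a lowercase model with accents needs.
print(normalize_sketch("Über Café", strip_accents=False))
```

With both flags enabled the accents disappear, so the text no longer matches a vocabulary that was built from lowercase text with accents. An independent strip_accents flag, as requested here for BertTokenizer, avoids that.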
Your contribution
With some hints I am willing to contribute.