Remove inconsistency between BertTokenizer and BertTokenizerFast #6186

# 🚀 Feature request
`BertTokenizerFast` has the option to specify `strip_accents=False`. The `BertTokenizer` does not have this option. This inconsistency should be removed by adding the `strip_accents` parameter to `BertTokenizer`.

## Motivation
Without this parameter, `BertTokenizer` cannot be used for language models that are lowercased (uncased) but keep their accents.

For a lowercased language model that keeps accents, you are forced to load the fast tokenizer like this:

```python
tokenizer = AutoTokenizer.from_pretrained("<model_name_or_path>", use_fast=True, strip_accents=False)
```

This will NOT work, because the accents are still stripped: `tokenizer = AutoTokenizer.from_pretrained("<model_name_or_path>")`

Even this will not work, because `BertTokenizer` does not have the `strip_accents` option: `tokenizer = AutoTokenizer.from_pretrained("<model_name_or_path>", strip_accents=False)`
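To illustrate why this matters, here is a minimal standalone sketch of the kind of text normalization involved. It is not the library's code: `remove_accents` and `normalize` are hypothetical helper names, and the rule that `strip_accents=None` falls back to the value of `do_lower_case` is an assumption made for this example.

```python
import unicodedata


def remove_accents(text: str) -> str:
    """Drop combining marks after decomposing characters (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers combining accents such as U+0308.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def normalize(text: str, do_lower_case: bool = True, strip_accents=None) -> str:
    """Hypothetical normalizer with an explicit strip_accents switch."""
    # Assumption for this sketch: when unset, accent stripping follows lowercasing.
    if strip_accents is None:
        strip_accents = do_lower_case
    if do_lower_case:
        text = text.lower()
    if strip_accents:
        text = remove_accents(text)
    return text


# Lowercasing with accent stripping loses the accents the model was trained on.
print(normalize("Über Café"))                       # uber cafe
# With an explicit switch, lowercasing and accent stripping are decoupled.
print(normalize("Über Café", strip_accents=False))  # über café
```

A lowercased model whose vocabulary contains `über` would never see that token if the tokenizer always produces `uber`, which is why the two settings need to be independent.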

## Your contribution
With some hints I am willing to contribute.
