The HTR models must be used with Kraken.
Except for the preliminary experiments (bleu.mlmodel), we use the same test set, available here, to compare the performance of the different models.
Models are named after cheeses, in alphabetical order.
This model was produced with v. 1.0 of the dataset. The dataset was divided into three sets: train (training set), val (evaluation set) and test (test set), created with the following script.
- train contained 82.76% of the total dataset.
- val contained 7.61% of the total dataset.
- test contained 9.62% of the total dataset.
Commands used are:
- `ketos train -t train.txt -e val.txt -u NFKD -f alto` for training
- `ketos test -m model -f alto -e test.txt` for testing
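The training command above normalizes the transcriptions to NFKD, while the later models in this list use NFD. The sketch below uses only Python's standard `unicodedata` module to show how the two forms can differ. The sample characters are illustrative assumptions and are not taken from the dataset.

```python
import unicodedata


def codepoints(text: str, form: str) -> list[str]:
    """Normalize text with the given Unicode form and list its code points."""
    normalized = unicodedata.normalize(form, text)
    return [f"U+{ord(char):04X}" for char in normalized]


# Illustrative samples only (assumption: not drawn from the dataset).
E_ACUTE = "\u00e9"  # precomposed e with acute accent
LONG_S = "\u017f"   # long s

for label, sample in (("e acute", E_ACUTE), ("long s", LONG_S)):
    for form in ("NFD", "NFKD"):
        print(label, form, codepoints(sample, form))
```

Both forms split the accented letter into a base letter and a combining accent. Only NFKD also replaces compatibility characters such as the long s with their plain equivalent, so a model trained with NFKD cannot output them.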
Accuracy is:
- 96% on the evaluation set
- 91% on the test set.
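The text does not say how these figures are computed. Assuming they are character-level accuracies (one minus the edit distance divided by the number of reference characters), the sketch below shows how such a score can be derived. The function names are hypothetical and this is not Kraken's own code.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_char != hyp_char),  # substitution
            ))
        previous = current
    return previous[-1]


def character_accuracy(pairs: list[tuple[str, str]]) -> float:
    """Compute accuracy over (reference, prediction) line pairs."""
    errors = sum(levenshtein(ref, hyp) for ref, hyp in pairs)
    characters = sum(len(ref) for ref, _ in pairs)
    return (characters - errors) / characters
```

Under this assumption, a score of 91% on the test set would correspond to roughly nine character errors per hundred reference characters.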
This model was produced with v. 2.0 of the dataset. The dataset was divided into three sets: train (training set), val (evaluation set) and test (test set). The first two were created with train_val_prep.py. The test set is available here.
- train contained 75% of the total dataset.
- val contained 10% of the total dataset.
- test contained 15% of the total dataset.
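A split with these proportions can be sketched as follows. This is a minimal illustration and not the content of train_val_prep.py: the function names, the fixed seed and the choice to draw all three sets in one pass are assumptions, whereas here the test set was fixed separately. The manifests are written with one path per line so that they can be passed to `ketos train -t` and `-e`.

```python
import random
from pathlib import Path


def split_dataset(paths, val_ratio=0.10, test_ratio=0.15, seed=42):
    """Shuffle the paths reproducibly and split them into train/val/test."""
    shuffled = sorted(str(path) for path in paths)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    n_val = round(len(shuffled) * val_ratio)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]  # everything left, about 75%
    return train, val, test


def write_manifest(paths, destination):
    """Write one file path per line to the destination text file."""
    Path(destination).write_text("\n".join(paths) + "\n", encoding="utf-8")


# Hypothetical usage, assuming the ALTO files live in a "data" folder:
# train, val, test = split_dataset(Path("data").glob("*.xml"))
# write_manifest(train, "train.txt")
# write_manifest(val, "val.txt")
# write_manifest(test, "test.txt")
```

Fixing the seed keeps the split reproducible, so that models trained later are evaluated on the same pages.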
Note: a problem occurred during training. This model should not be used.
Commands used are:
- `ketos train -t train.txt -e val.txt -f alto -d cuda --normalization NFD` for training
- `ketos test -m model -f alto -e test.txt` for testing
Accuracy is:
- 96.3% on the evaluation set
This model was produced with v. 2.0 of the dataset. The dataset was divided into three sets: train (training set), val (evaluation set) and test (test set). The first two were created with train_val_prep.py. The test set is available here.
- train contained 75% of the total dataset.
- val contained 10% of the total dataset.
- test contained 15% of the total dataset.
Commands used are:
- `ketos train -t train.txt -e val.txt -f alto -d cuda --normalization NFD` for training
- `ketos test -m model -f alto -e test.txt` for testing
Accuracy is:
- 96.6% on the evaluation set