This repository was archived by the owner on Apr 8, 2025. It is now read-only.

Add support for ELECTRA #349

Merged
tholor merged 4 commits into deepset-ai:master from stefan-it:master on May 7, 2020

Conversation

@stefan-it
Contributor

Hi,

this PR integrates the previously introduced ELECTRA pre-training approach into FARM:

The ELECTRA model was proposed in the paper: ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. ELECTRA is a new pre-training approach which trains two transformer models: the generator and the discriminator. The generator’s role is to replace tokens in a sequence, and is therefore trained as a masked language model. The discriminator, which is the model we’re interested in, tries to identify which tokens were replaced by the generator in the sequence.
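
To make the discriminator's objective above a bit more concrete, here is a toy sketch of how its per-token targets could be derived. This is not code from this PR or from the ELECTRA implementation; the function name `replaced_token_labels` and the example sentence are assumptions chosen purely for illustration. Each position gets label 1 if the generator's token differs from the original token, and 0 otherwise.

```python
from typing import List


def replaced_token_labels(original: List[str], corrupted: List[str]) -> List[int]:
    """Return 1 for each position whose token differs from the original, else 0.

    Hypothetical helper for illustration only; it is not part of FARM.
    """
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Illustrative input: the generator replaced "cooked" with "ate".
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]

labels = replaced_token_labels(original, corrupted)
print(labels)  # [0, 0, 1, 0, 0]
```

The discriminator would then be trained as a per-token binary classifier against labels of this kind.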

Implementation notes

The Hugging Face Transformers dependency was bumped to the latest version, 2.8, which supports the ELECTRA model. As with DistilBERT, an additional pooler needs to be defined to obtain a single vector representation per sequence.
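
As a rough sketch of what such a pooler does (an assumption for illustration, not the code in this PR): the model returns one hidden vector per token, and the pooler reduces that to one vector per sequence, here by taking the first token's vector and optionally applying an activation. The name `first_token_pool`, the choice of `tanh` as default activation, and the use of plain Python lists instead of tensors are all hypothetical.

```python
import math
from typing import Callable, List, Optional

Vector = List[float]


def first_token_pool(
    sequence_output: List[List[Vector]],
    activation: Optional[Callable[[float], float]] = math.tanh,
) -> List[Vector]:
    """Reduce [batch, seq_len, hidden] to [batch, hidden] via the first token.

    Hypothetical helper for illustration only; it is not part of FARM.
    """
    pooled = []
    for token_vectors in sequence_output:
        first = token_vectors[0]  # hidden state of the first token, e.g. [CLS]
        if activation is not None:
            first = [activation(x) for x in first]
        pooled.append(list(first))
    return pooled


# Illustrative input: 2 sequences, 3 tokens each, hidden size 2.
batch = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0.7, 0.8], [0.9, 1.0], [1.1, 1.2]],
]

pooled = first_token_pool(batch, activation=None)
print(pooled)  # [[0.1, 0.2], [0.7, 0.8]]
```

The pooled vectors are what a sequence-level prediction head (for example, document classification) would consume.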

Experiments

I ran preliminary experiments on CoNLL-2003 for NER. The configuration can be found under experiments/electra_eval/conll2003_en_config.json.

Result for a single run using the ELECTRA base model: 94.30% (dev) and 89.86% (test).

@tholor requested a review from @brandenchan on May 7, 2020
@tholor added the labels "enhancement" (New feature or request) and "part: model" on May 7, 2020
Contributor

@brandenchan left a comment

This looks good to me! The structure of the Electra LM is like that of XLNet and in my tests I was able to get Electra to train on doc classification.

Thanks for your effort on this PR @stefan-it !

Member

@tholor left a comment

Looking good! Thanks for working on this!

@tholor merged commit 2a33382 into deepset-ai:master on May 7, 2020
@PhilipMay
Contributor

Thanks @stefan-it 🥇

Labels

enhancement (New feature or request), part: model
