Fix activation function for pooled ELECTRA output

Hi,

Inspired by https://github.com/huggingface/transformers/pull/4257, I just had a look at the original ELECTRA implementation, and it seems that they're using `gelu` instead of `tanh` to compute the pooled output:

https://github.com/google-research/electra/blob/79111328070e491b287c307906701ebc61091eb2/model/modeling.py#L247

and:

https://github.com/google-research/electra/blob/79111328070e491b287c307906701ebc61091eb2/model/modeling.py#L45

Unfortunately, I used `tanh` in the ELECTRA model integration:

https://github.com/deepset-ai/FARM/blob/37591d198813c999afe3766be2fa281af26a25e5/farm/modeling/language_model.py#L1240
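To illustrate why the choice matters, here is a minimal, dependency-free sketch. It assumes the pooling step is a dense layer on the first token followed by an activation; `pool_first_token` and the toy weights are hypothetical and not taken from FARM or ELECTRA, and `gelu` is the exact erf-based form, which may differ from what the linked code uses.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def pool_first_token(hidden_states, weight, bias, activation):
    """Hypothetical pooler: dense layer on the first token, then an activation."""
    first_token = hidden_states[0]
    projected = [
        sum(w * h for w, h in zip(row, first_token)) + b
        for row, b in zip(weight, bias)
    ]
    return [activation(v) for v in projected]


# Toy inputs (assumptions for illustration): 2 tokens, hidden size 2.
hidden_states = [[2.0, -1.0], [0.5, 0.5]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity projection keeps the numbers readable
bias = [0.0, 0.0]

pooled_tanh = pool_first_token(hidden_states, weight, bias, math.tanh)
pooled_gelu = pool_first_token(hidden_states, weight, bias, gelu)

print("tanh:", pooled_tanh)
print("gelu:", pooled_gelu)
```

With `tanh` every pooled value is squashed into (-1, 1), while `gelu` leaves large positive values almost unchanged, so a prediction head trained on top of one will not see the same inputs with the other.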

I will open a PR to fix that :)


Fix activation function for pooled ELECTRA output #362
