This repository was archived by the owner on Apr 8, 2025. It is now read-only.
Fix activation function for pooled ELECTRA output #362

@stefan-it

Hi,

Inspired by huggingface/transformers#4257, I had a look at the original ELECTRA implementation, and it seems that it uses gelu instead of tanh to compute the pooled output:

https://github.com/google-research/electra/blob/79111328070e491b287c307906701ebc61091eb2/model/modeling.py#L247

and:

https://github.com/google-research/electra/blob/79111328070e491b287c307906701ebc61091eb2/model/modeling.py#L45

Unfortunately, I used tanh in the ELECTRA model integration:

config.summary_activation = 'tanh'

I will open a PR to fix that :)
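To illustrate why the choice matters, here is a minimal sketch in plain Python, not the actual integration code. It assumes the pooled output is a dense projection of the first token's hidden state followed by the configured activation; the names `pooled_output`, `ACTIVATIONS`, `weight`, and `bias` are hypothetical, and the exact erf-based GELU is used here, which may differ slightly from an approximated variant.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Hypothetical lookup keyed by the value of config.summary_activation.
ACTIVATIONS = {"tanh": math.tanh, "gelu": gelu}


def pooled_output(first_token, weight, bias, summary_activation):
    """Project the first token's hidden state, then apply the activation.

    first_token: list of floats (hidden state of the first token)
    weight:      list of rows, one row per output unit (assumed layout)
    bias:        list of floats, one per output unit
    """
    act = ACTIVATIONS[summary_activation]
    return [
        act(sum(w * h for w, h in zip(row, first_token)) + b)
        for row, b in zip(weight, bias)
    ]


# Toy inputs: identity projection, zero bias.
hidden = [3.0, -3.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]

tanh_pooled = pooled_output(hidden, identity, zeros, "tanh")
gelu_pooled = pooled_output(hidden, identity, zeros, "gelu")
```

With these toy inputs, tanh squashes both values into (-1, 1), while gelu leaves large positive values almost unchanged and pushes large negative values toward zero, so a classifier head would see quite different pooled features depending on the setting.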

Metadata

Labels

bug: Something isn't working
