Incoherence in the pooled_output definition in ElectraBackbone and BertBackbone #1358

**Describe the bug**

The definition of `pooled_output` is inconsistent between ElectraBackbone and BertBackbone on one side and AlbertBackbone and FNet on the other.

- In ElectraBackbone and BertBackbone, `pooled_output` is the CLS token representation taken from the sequence output, before the dense layer is applied.
- In AlbertBackbone and FNet, `pooled_output` is the output of the dense layer, which takes the CLS token from the sequence output as its input.
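
To make the difference concrete, here is a minimal NumPy sketch of the two definitions as I describe them above. It is not the library code: the function names, shapes, and random weights are hypothetical, and the `tanh` activation is an assumption borrowed from the original BERT pooler.

```python
import numpy as np


def pooled_before_dense(sequence_output, cls_token_index=0):
    """Definition 1 (hypothetical helper): the raw CLS hidden state."""
    return sequence_output[:, cls_token_index, :]


def pooled_after_dense(sequence_output, kernel, bias, cls_token_index=0):
    """Definition 2 (hypothetical helper): dense layer applied to the CLS state.

    The tanh activation is an assumption, following the original BERT pooler.
    """
    cls_state = sequence_output[:, cls_token_index, :]
    return np.tanh(cls_state @ kernel + bias)


# Illustrative shapes and random values only, not real model weights.
rng = np.random.default_rng(0)
batch_size, seq_len, hidden_dim = 2, 4, 8
sequence_output = rng.normal(size=(batch_size, seq_len, hidden_dim))
kernel = rng.normal(size=(hidden_dim, hidden_dim))
bias = np.zeros(hidden_dim)

before = pooled_before_dense(sequence_output)
after = pooled_after_dense(sequence_output, kernel, bias)

# Both have shape (batch_size, hidden_dim), but the values differ,
# so downstream heads see different inputs depending on the backbone.
print(before.shape, after.shape)
```

Both outputs have the same shape, so swapping one definition for the other would not raise an error; it would only change the values passed to any head built on `pooled_output`.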

**Expected behavior**

`pooled_output` should either have a single definition across all backbones or follow each model's original implementation.

**Additional context**

For reference, the original implementations: [BERT](https://github.com/google-research/bert/blob/master/modeling.py#L224-L232), [FNet](https://github.com/google-research/google-research/blob/master/f_net/models.py#L119), [ALBERT](https://github.com/google-research/albert/blob/master/modeling.py#L247-L255).

**Would you like to help us fix it?**
I would like to work on this issue.
