Model | Number of Filters/Units/Encoders | Embedding Dimension | Max Sequence Length | Dropout | Activation Function | Optimizer | Total Parameters |
---|---|---|---|---|---|---|---|
CNN | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.51 M |
RNN | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.50 M |
GRU | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.50 M |
LSTM | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.50 M |
Bi-LSTM | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.51 M |
Transformer Encoder | 1 encoder (2 heads) | 200 | 557 | 0.3 | ReLU | Adam | 5.94 M |
BERT-Base | 12 encoders (12 heads) | 768 | 512 | 0.3 (fine-tune layer) | ReLU (fine-tune layer) | Adam (fine-tune layer) | 110 M |
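
The recurrent models in the table report nearly identical totals. One possible reading is that the recurrent layer itself contributes only a few thousand weights, so the total is dominated by layers the variants share. The sketch below counts the weights of a single recurrent layer using the standard textbook formulations with biases, assuming the layer input is the 200-dimensional embedding and the layer has 8 units. The function names are hypothetical, and the counts may differ slightly from the models actually used here (some GRU implementations, for example, carry an extra bias vector per gate).

```python
# Textbook parameter counts for one recurrent layer.
# Assumptions (not stated in the table): the layer input is the
# 200-dimensional embedding, the layer has 8 units, and every gate
# has one input matrix, one recurrent matrix, and one bias vector.

def rnn_params(input_dim: int, units: int) -> int:
    """Simple RNN: a single gate-like transformation."""
    return units * (input_dim + units + 1)

def gru_params(input_dim: int, units: int) -> int:
    """GRU: update gate, reset gate, and candidate state."""
    return 3 * rnn_params(input_dim, units)

def lstm_params(input_dim: int, units: int) -> int:
    """LSTM: input, forget, and output gates plus the cell candidate."""
    return 4 * rnn_params(input_dim, units)

def bilstm_params(input_dim: int, units: int) -> int:
    """Bi-LSTM: two independent LSTMs, one per direction."""
    return 2 * lstm_params(input_dim, units)

if __name__ == "__main__":
    embedding_dim, units = 200, 8
    for name, fn in [("RNN", rnn_params), ("GRU", gru_params),
                     ("LSTM", lstm_params), ("Bi-LSTM", bilstm_params)]:
        print(f"{name}: {fn(embedding_dim, units)} parameters")
```

Under these assumptions the layer sizes range from 1,672 weights (RNN) to 13,376 (Bi-LSTM), a spread of roughly 0.01 M, which would be consistent with the reported totals differing only in the second decimal place.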