
Big Loss And Low Accuracy On Training Data In Both Bert And Albert

March 31, 2024

I am using the Hugging Face TFBertModel for a classification task (from here: ). I am using the bare TFBertModel with an added dense head layer, not TFBertForSequenceClassification.

Solution 1:

The optimizer's default learning rate (1e-3 for Keras's Adam, for example) is too high for fine-tuning BERT. Try setting it explicitly to one of the learning rates recommended in Appendix A.3 of the original BERT paper: 5e-5, 3e-5 or 2e-5.
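As a minimal sketch of that change, assuming the model is compiled with Keras's Adam optimizer (the question does not show which optimizer was used), the learning rate can be passed explicitly instead of relying on the default. The names `BERT_LEARNING_RATES` and `build_optimizer` are hypothetical helpers introduced here for illustration only.

```python
# Fine-tuning learning rates listed in Appendix A.3 of the BERT paper.
BERT_LEARNING_RATES = (5e-5, 3e-5, 2e-5)


def build_optimizer(learning_rate=2e-5):
    """Return a Keras Adam optimizer with an explicit, small learning rate.

    Hypothetical helper: assumes TensorFlow is installed and that Adam is
    the optimizer in use.
    """
    # Imported here so the constant above can be reused without TensorFlow.
    import tensorflow as tf

    return tf.keras.optimizers.Adam(learning_rate=learning_rate)


# Example usage, assuming `model` is the Keras model built on TFBertModel
# with the added dense head (not defined in this sketch):
#
#   model.compile(
#       optimizer=build_optimizer(3e-5),
#       loss="sparse_categorical_crossentropy",  # assumed loss; use your own
#       metrics=["accuracy"],
#   )
```

If one value still gives a large training loss, it may be worth trying the other two, since the best choice can vary by task.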
