Lessons Learned from Applying off-the-shelf BERT: There is no Silver Bullet
arXiv - CS - Artificial Intelligence. Pub Date: 2020-09-15. DOI: arXiv:2009.07238. Victor Makarenkov and Lior Rokach.
One of the challenges in the NLP field is training large classification
models, a task that is both difficult and tedious. It is even harder when GPU
hardware is unavailable. The increased availability of pre-trained and
off-the-shelf word embeddings, models, and modules aims to ease the process of
training large models and to achieve competitive performance. We explore the
use of off-the-shelf BERT models, share our experimental results, and compare
them to those of LSTM networks and simpler baselines. We show that the
complexity and computational cost of BERT do not guarantee enhanced predictive
performance on the classification tasks at hand.
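The "simpler baselines" the abstract refers to are not specified in this excerpt; a minimal sketch of one such baseline is a bag-of-words nearest-centroid text classifier. The code below is a hypothetical illustration (not the authors' actual baseline), built in pure Python, of the kind of lightweight model that needs no GPU yet can be competitive on easy classification tasks.

```python
from collections import Counter
import math

def bow(text):
    """Bag-of-words vector as a token-count dictionary."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class CentroidClassifier:
    """Nearest-centroid classifier: sum BoW vectors per class,
    then assign each new text to the most similar class centroid."""

    def fit(self, texts, labels):
        self.centroids = {}
        for text, label in zip(texts, labels):
            self.centroids.setdefault(label, Counter()).update(bow(text))
        return self

    def predict(self, text):
        v = bow(text)
        return max(self.centroids,
                   key=lambda lab: cosine(v, self.centroids[lab]))

# Toy two-class sentiment-style usage
clf = CentroidClassifier().fit(
    ["great movie loved it", "wonderful acting great plot",
     "terrible movie hated it", "awful plot bad acting"],
    ["pos", "pos", "neg", "neg"],
)
print(clf.predict("loved the great acting"))  # → pos
```

Baselines of roughly this flavor train in milliseconds on a CPU, which is exactly the trade-off the paper probes: whether BERT's added cost buys a corresponding accuracy gain on a given task.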
Updated: 2020-09-21