Patent classification by fine-tuning BERT language model
World Patent Information Pub Date : 2020-06-01 , DOI: 10.1016/j.wpi.2020.101965
Jieh-Sheng Lee , Jieh Hsiang

Abstract In this work we fine-tune a pre-trained BERT model and apply it to patent classification. On a large dataset of over two million patents, our approach outperforms the previous state of the art, an approach based on a CNN with word embeddings. Moreover, we use patent claims alone, without the other parts of the patent documents. Our contributions include: (1) a new state-of-the-art result for patent classification, obtained by fine-tuning a pre-trained BERT model, (2) USPTO-3M, a large dataset labeled at the CPC subclass level, together with the SQL statements future researchers can use to reproduce it, and (3) showing that, contrary to conventional wisdom, patent claims alone are sufficient to achieve state-of-the-art classification results.
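Since the dataset is labeled at the CPC subclass level and a patent can carry several subclass codes, the task is multi-label classification. As a minimal sketch of the label preparation such a setup needs (the patent IDs and the exact preprocessing here are hypothetical, not from the paper), each patent's CPC subclass codes can be turned into a multi-hot target vector for the classifier head:

```python
# Sketch only: converts CPC subclass codes per patent into multi-hot
# label vectors, the usual target format for multi-label fine-tuning.
# Patent numbers and code assignments below are made-up examples.

def build_label_index(all_codes):
    """Map each distinct CPC subclass (e.g. 'G06F') to a column index."""
    return {code: i for i, code in enumerate(sorted(set(all_codes)))}

def to_multi_hot(patent_codes, label_index):
    """One vector per patent; several positions can be 1 at once."""
    vec = [0] * len(label_index)
    for code in patent_codes:
        vec[label_index[code]] = 1
    return vec

patents = {
    "US1234567": ["G06F", "G06N"],  # hypothetical patent -> CPC subclasses
    "US7654321": ["H04L"],
}
index = build_label_index(c for codes in patents.values() for c in codes)
labels = {pid: to_multi_hot(codes, index) for pid, codes in patents.items()}
```

With labels in this form, a BERT encoder over the claim text plus a sigmoid output layer of width `len(index)` (trained with binary cross-entropy) is one standard way to realize the fine-tuning the abstract describes.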

Updated: 2020-06-01