NOTE: Solution for KDD-CUP 2021 WikiKG90M-LSC
arXiv - CS - Information Retrieval
Pub Date: 2021-07-05, DOI: arxiv-2107.01892
Weiyue Su, Zeyang Fang, Hui Zhong, Huijuan Wang, Siming Dai, Zhengjie Huang, Yunsheng Shi, Shikun Feng, Zeyu Chen
WikiKG90M in KDD Cup 2021 is a large encyclopedic knowledge graph that
could benefit various downstream applications such as question answering and
recommender systems. Participants are invited to complete the knowledge graph
by predicting missing triplets. Recent representation learning methods have
achieved great success on standard datasets like FB15k-237. We therefore train
several advanced algorithms from different domains to learn the triplets: OTE,
QuatE, RotatE and TransE. Notably, we modify OTE into NOTE (short for
Norm-OTE) for better performance. We also use both DeepWalk and a
post-smoothing technique to capture the graph structure as a supplement. In
addition to the representations, we use various statistical probabilities
among the head entities, the relations and the tail entities for the final
prediction. Experimental results show that an ensemble of state-of-the-art
representation learning methods can draw on each other's strengths. We further
develop feature engineering from the validation candidates for additional
improvements. Please note that we apply the same strategy on the test set for
final inference, and that these features may not be practical in the real world
when ranking against all the entities.
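
To make the idea of combining an embedding score with a statistical probability more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify which probabilities are used or how the ensemble is weighted. The sketch assumes the standard TransE distance score, a simple count-based estimate of P(tail | relation), and a weighted sum as the combination rule. The function names `relation_tail_prob` and `rank_candidates` and the `weight` parameter are hypothetical.

```python
from collections import Counter

import numpy as np


def transe_score(h, r, t):
    """TransE plausibility: negative L2 distance between h + r and t.

    `t` may be a single vector or a matrix of candidate tail vectors.
    """
    return -np.linalg.norm(h + r - t, axis=-1)


def relation_tail_prob(triples):
    """Estimate P(tail | relation) by counting (head, relation, tail) triples.

    Assumption: this is one plausible "statistical probability" feature,
    chosen for illustration only.
    """
    pair_counts = Counter((r, t) for _, r, t in triples)
    rel_counts = Counter(r for _, r, _ in triples)
    return {(r, t): c / rel_counts[r] for (r, t), c in pair_counts.items()}


def rank_candidates(h_id, r_id, candidates, ent_emb, rel_emb, prob, weight=1.0):
    """Rank candidate tails by embedding score plus a weighted count feature.

    Assumption: a plain weighted sum stands in for the ensemble step.
    """
    emb = transe_score(ent_emb[h_id], rel_emb[r_id], ent_emb[candidates])
    stat = np.array([prob.get((r_id, t), 0.0) for t in candidates])
    order = np.argsort(-(emb + weight * stat))
    return [candidates[i] for i in order]
```

With `weight=0.0` the ranking reduces to pure TransE. Raising `weight` lets frequently observed relation-tail pairs move up the candidate list, which illustrates why such features depend on a fixed candidate set rather than a ranking over all entities.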
Updated: 2021-07-06