i-Dataquest: A heterogeneous information retrieval tool using data graph for the manufacturing industry
Computers in Industry ( IF 10.0 ) Pub Date : 2021-08-07 , DOI: 10.1016/j.compind.2021.103527
Lise Kim 1 , Esma Yahia 1 , Frédéric Segonds 2 , Philippe Véron 1 , Antoine Mallet 3

The manufacturing industry needs access to data in order to carry out its activities and to generate new value-added knowledge. However, it faces a large and growing volume of heterogeneous data, which limits its ability to exploit them optimally. Moreover, the data are distributed across different heterogeneous information systems, which limits the exploration of relationships during the information retrieval process. Usually, the challenge is addressed by trying to manage and normalize the data structure so that the data can be searched and exploited faster in a manufacturing context. The authors instead present i-Dataquest, an information retrieval system supported by (i) a graph-oriented model built from the structured and unstructured data of the company, (ii) a query system answering 'what' and 'about what', and (iii) the generation of three different results: a list of items, a list of property values and a list of sentences. The i-Dataquest prototype is built using Neo4J for graph generation, ConceptNet for lexical resource management and StanfordNLP for natural language processing. The prototype's performance is evaluated on a data set representing a drone manufacturer. The results show that transforming specific content such as tables into the graph and semantically expanding queries significantly improve recall and precision. The results also suggest that the filtering of less relevant results could be improved, particularly for queries looking for a specific value.
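To make the 'what' / 'about what' query idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it uses an in-memory list instead of Neo4J, a hand-written synonym table instead of ConceptNet, and simple substring matching instead of natural language processing. The names `Item`, `SYNONYMS`, `expand`, `search` and `precision_recall`, and the sample data, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical synonym table standing in for a lexical resource (assumption).
SYNONYMS = {"drone": {"uav"}, "weight": {"mass"}}


@dataclass
class Item:
    """A toy graph node: a named item with property values and linked sentences."""
    name: str
    properties: dict = field(default_factory=dict)
    sentences: list = field(default_factory=list)


def expand(term):
    """Return the lowercased term plus any synonyms from the toy table."""
    term = term.lower()
    return {term} | SYNONYMS.get(term, set())


def search(items, what, about_what):
    """Return three result lists: item names, property values, sentences."""
    what_terms = expand(what)
    about_terms = expand(about_what)
    found_items, values, sentences = [], [], []
    for item in items:
        # 'about what' selects the items the query is concerned with.
        if not any(t in item.name.lower() for t in about_terms):
            continue
        found_items.append(item.name)
        # 'what' selects the property values and sentences to report.
        for key, value in item.properties.items():
            if key.lower() in what_terms:
                values.append(value)
        for sentence in item.sentences:
            if any(t in sentence.lower() for t in what_terms):
                sentences.append(sentence)
    return found_items, values, sentences


def precision_recall(retrieved, relevant):
    """Standard set-based precision and recall."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented sample data, loosely themed on a drone manufacturer.
ITEMS = [
    Item("UAV frame A", {"mass": "1.2 kg"}, ["The mass of the frame was reduced."]),
    Item("Drone battery B", {"weight": "0.4 kg"}, ["Battery capacity is 5000 mAh."]),
    Item("Charger C", {"weight": "0.3 kg"}, []),
]

items, values, sentences = search(ITEMS, what="weight", about_what="drone")
print(items, values, sentences)
```

In this toy data, the query for 'weight' about 'drone' reaches "UAV frame A" only because the query terms are expanded with synonyms, which illustrates in miniature why semantic expansion can raise recall.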




Updated: 2021-08-07