A Framework for Capturing and Analyzing Unstructured and Semi-structured Data for a Knowledge Management System
arXiv - CS - Information Retrieval. Pub Date: 2020-07-14, DOI: arxiv-2007.07102
Gerald Onwujekwe, Kweku-Muata Osei-Bryson, Nnatubemugo Ngwum

Mainstream knowledge management researchers generally agree that knowledge extracted from unstructured and semi-structured data has become imperative for organizational strategic decision making. In this research, we develop a framework that captures and analyzes unstructured data using machine learning techniques and integrates the knowledge and insight gained from the data into traditional knowledge management systems. Unlike most frameworks published in the literature, which focus on a specific type of unstructured data, our framework cuts across the varieties of unstructured data, ranging from textual data from social network sites, online forums, discussion boards and reviews to audio, image and video data. We highlight some pre-processing and processing techniques for these data, as well as some standard outputs. We evaluate the framework by developing a textual data application programming interface (API) using Python and Beautiful Soup, and we perform sentiment analysis on the student review data collected through the API.
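
The evaluation step described in the abstract (collect review text from web pages, then score its sentiment) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names Python and Beautiful Soup but gives no code, so the sketch below swaps in the standard-library `html.parser` to stay dependency-free, and it uses a toy word-list scorer because the abstract does not say which sentiment method was used. The `<p class="review">` markup, the word lists, and the names `ReviewExtractor`, `extract_reviews` and `sentiment` are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical word lists for illustration only; a real system would use
# a trained model or a full sentiment lexicon.
POSITIVE = {"great", "helpful", "clear", "excellent", "good"}
NEGATIVE = {"boring", "confusing", "poor", "bad", "unhelpful"}


class ReviewExtractor(HTMLParser):
    """Collects the text inside <p class="review"> elements (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._in_review = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "p" and ("class", "review") in attrs:
            self._in_review = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_review:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_review:
            self.reviews.append("".join(self._buffer).strip())
            self._in_review = False


def extract_reviews(html):
    """Return the review texts found in an HTML string."""
    parser = ReviewExtractor()
    parser.feed(html)
    parser.close()
    return parser.reviews


def sentiment(text):
    """Label text by counting positive and negative words (toy scorer)."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    page = (
        '<div><p class="review">Great course, very clear lectures!</p>'
        '<p class="review">Boring and confusing.</p></div>'
    )
    for review in extract_reviews(page):
        print(sentiment(review), "|", review)
```

Keeping capture and analysis as separate functions mirrors the framework's split between data capture and processing, so either half could be replaced (for example, by a Beautiful Soup scraper or a learned classifier) without touching the other.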

Updated: 2020-07-15