Knowledge models from PDF textbooks, New Review of Hypermedia and Multimedia

Knowledge models from PDF textbooks
New Review of Hypermedia and Multimedia (IF 1.2), Pub Date: 2021-02-28, DOI: 10.1080/13614568.2021.1889692
Isaac Alpizar-Chacon, Sergey Sosnovsky

ABSTRACT

Textbooks are educational documents created, structured and formatted by domain experts with the primary purpose of explaining the knowledge in the domain to a novice. Authors use their understanding of the domain when structuring and formatting the content of a textbook to facilitate this explanation. As a result, the formatting and structural elements of textbooks carry elements of domain knowledge implicitly encoded by their authors. Our paper presents an extensible approach to the automated extraction of knowledge models from textbooks and the enrichment of their content with additional links (both internal and external). The textbooks themselves essentially become hypertext documents in which individual pages are annotated with important concepts in the domain. The evaluation experiments examine several aspects and stages of the approach: the accuracy of model extraction; the pragmatic quality of the extracted models in one of their possible applications, the semantic linking of textbooks in the same domain; the accuracy of linking models to external knowledge sources; and the effect of integrating multiple textbooks from the same domain. The results indicate high accuracy of model extraction on the symbolic, syntactic and structural levels across textbooks and domains, and demonstrate the added value of the extracted models on the semantic level.

Updated: 2021-02-28
