A Machine Learning-Based Model to Evaluate Readability and Assess Grade Level for the Web Pages
The Computer Journal (IF 1.5), Pub Date: 2020-09-07, DOI: 10.1093/comjnl/bxaa113
Muralidhar Pantula, K S Kuppusamy

Evaluating the readability of web documents has gained attention for several reasons, such as improving the effectiveness of writing and reaching a wider audience. Current practice relies on several statistical measures to evaluate the readability of a document. In this paper, we propose a machine learning-based model to compute the readability of web pages. The model also computes the grade level, that is, the minimum educational standard required to understand the contents of a web page. The proposed model classifies web pages as highly readable, readable or less readable using a specified feature set. The features used for this classification include sentence count, word count, syllable count, type-token ratio and lexical ambiguity. To increase the usability of the proposed model, we have developed an accessible browser extension that assesses every web page loaded into the browser.
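
The surface features named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `count_syllables` and `extract_features`, the vowel-group syllable heuristic and the regular expressions are all assumptions made for illustration. Lexical ambiguity is left out because it would need a lexical resource (for example, a sense inventory such as WordNet), which the abstract does not specify.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the paper's method):
    count groups of consecutive vowels, discounting a silent final 'e'."""
    w = word.lower()
    count = len(re.findall(r"[aeiouy]+", w))
    # A trailing "e" is usually silent, except in endings like "-le".
    if w.endswith("e") and not w.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def extract_features(text: str) -> dict:
    """Compute four of the features listed in the abstract for plain text."""
    # Sentences: split on runs of terminal punctuation, drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words (tokens): runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    # Types: distinct tokens, compared case-insensitively.
    types = {w.lower() for w in words}
    return {
        "sentence_count": len(sentences),
        "word_count": len(words),
        "syllable_count": sum(count_syllables(w) for w in words),
        "type_token_ratio": len(types) / len(words) if words else 0.0,
    }


if __name__ == "__main__":
    print(extract_features("The cat sat. The cat ran!"))
```

In a pipeline like the one the abstract describes, a feature dictionary of this kind would be computed from the visible text of a page and passed to a trained classifier. The choice of classifier and the thresholds between the three readability classes are not given in the abstract and are not assumed here.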

Updated: 2020-09-08