Gather-Narrow-Extract: A Framework for Studying Local Policy Variation Using Web-Scraping and Natural Language Processing
Journal of Research on Educational Effectiveness (IF 2.217), Pub Date: 2019-12-06, DOI: 10.1080/19345747.2019.1654576
Kylie L. Anglin

Abstract

Education researchers have traditionally faced severe data limitations in studying local policy variation; administrative data sets capture only a fraction of districts’ policy decisions, and it can be expensive to collect more nuanced implementation data from teachers and leaders. Natural language processing and web-scraping techniques can help address these challenges by assisting researchers in locating and processing policy documents located online. School district policies and practices are commonly documented in student and staff manuals, school improvement plans, and meeting minutes that are posted for the public. This article introduces an end-to-end framework for collecting these sorts of policy documents and extracting structured policy data: The researcher gathers all potentially relevant documents from district websites, narrows the text corpus to spans of interest using a text classifier, and then extracts specific policy data using additional natural language processing techniques. Through this framework, a researcher can describe variation in policy implementation at the local level, aggregated across state- or nationwide populations even as policies evolve over time.
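The three stages named in the abstract can be illustrated with a toy pipeline. This is a minimal sketch under stated assumptions, not the article's implementation: the district pages are hard-coded HTML strings rather than scraped sites, a keyword rule stands in for the trained text classifier, a regular expression stands in for the "additional natural language processing techniques", and the class-size policy, the district names, and the function names `gather`, `narrow`, and `extract` are all hypothetical.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text chunks of an HTML page."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def gather(pages):
    """Gather: turn raw HTML pages into (district, text) pairs.

    `pages` maps a hypothetical district name to HTML that a crawler
    would already have downloaded; no network access happens here.
    """
    corpus = []
    for district, html in pages.items():
        parser = TextExtractor()
        parser.feed(html)
        corpus.extend((district, chunk) for chunk in parser.chunks)
    return corpus


# Assumption: a keyword rule stands in for a trained text classifier.
KEYWORDS = ("class size", "students per teacher")


def narrow(corpus):
    """Narrow: keep only spans that mention the policy of interest."""
    return [(d, t) for d, t in corpus if any(k in t.lower() for k in KEYWORDS)]


def extract(spans):
    """Extract: pull one structured value (a class-size cap) per district."""
    pattern = re.compile(r"(\d+)\s+students")
    records = {}
    for district, text in spans:
        match = pattern.search(text)
        if match:
            records[district] = int(match.group(1))
    return records


# Illustrative input only; these districts and policies are made up.
pages = {
    "District A": "<html><body><p>Lunch is served at noon.</p>"
                  "<p>Class size is capped at 22 students.</p></body></html>",
    "District B": "<html><body><p>Board meets monthly.</p></body></html>",
}
policy_data = extract(narrow(gather(pages)))
print(policy_data)
```

Keeping the stages as separate functions mirrors the framework's ordering: the cheap, broad gather step runs first, and the more specific extraction logic only sees the spans that survive narrowing.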




Updated: 2019-12-06