The census place project: A method for geolocating unstructured place names
Explorations in Economic History (IF 1.857), Pub Date: 2022-09-20, DOI: 10.1016/j.eeh.2022.101477
Enrico Berkes, Ezra Karger, Peter Nencka

Researchers use microdata to study the economic development of the United States and the causal effects of historical policies. Much of this research focuses on county- and state-level patterns and policies because comprehensive sub-county data is not consistently available. We describe a new method that geocodes and standardizes the towns and cities of residence for individuals and households in decennial census microdata from 1790–1940. We release public crosswalks linking individuals and households to consistently-defined place names, longitude-latitude pairs, counties, and states. Our method dramatically increases the number of individuals and households assigned to a sub-county location relative to standard publicly available data: we geocode an average of 83% of the individuals and households in 1790–1940 census microdata, compared to 23% in widely-used crosswalks. In years with individual-level microdata (1850–1940), our average match rate is 94% relative to 33% in widely-used crosswalks. To illustrate the value of our crosswalks, we measure place-level population growth across the United States between 1870 and 1940 at a sub-county level, confirming predictions of Zipf’s Law and Gibrat’s Law for large cities but rejecting similar predictions for small towns. We describe how our approach can be used to accurately geocode other historical datasets.
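
The two laws named in the abstract can be illustrated with a short sketch. It is not the authors' code or estimation procedure: the function names are hypothetical, the populations are synthetic, and plain least squares is used only to show what each law predicts. Zipf's Law implies that a regression of log rank on log population has a slope near -1; Gibrat's Law implies that log growth is unrelated to log initial size, so that slope should be near 0.

```python
import math


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def zipf_slope(populations):
    """Rank-size regression: log(rank) on log(population).

    Zipf's Law predicts a slope near -1.
    """
    sizes = sorted(populations, reverse=True)
    log_rank = [math.log(r) for r in range(1, len(sizes) + 1)]
    log_size = [math.log(s) for s in sizes]
    return ols_slope(log_size, log_rank)


def gibrat_slope(pop_start, pop_end):
    """Regress log growth on log initial population.

    Gibrat's Law predicts a slope near 0 (growth independent of size).
    """
    log_start = [math.log(p) for p in pop_start]
    log_growth = [math.log(e) - math.log(s) for s, e in zip(pop_start, pop_end)]
    return ols_slope(log_start, log_growth)


# Synthetic places (made-up numbers, not census data):
# an exact rank-size distribution, with every place growing by 50%.
pop_1870 = [1_000_000 / rank for rank in range(1, 201)]
pop_1940 = [p * 1.5 for p in pop_1870]

print(zipf_slope(pop_1870))
print(gibrat_slope(pop_1870, pop_1940))
```

On real place-level data, the abstract reports that these predictions hold for large cities but not for small towns, so in practice the regressions would be run separately by size class.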




Updated: 2022-09-20