HathiTrust as a Data Source for Researching Early Nineteenth-Century Library Collections
Information Technology and Libraries (IF 1.5), Pub Date: 2019-12-16, DOI: 10.6017/ital.v38i4.11251
Julia Bauder

An intriguing new opportunity for research into the nineteenth-century history of print culture, libraries, and local communities is performing full-text analyses on the corpus of books held by a specific library or group of libraries. Creating corpora using books that are known to have been owned by a given library at a given point in time is potentially feasible because digitized records of the books in several hundred nineteenth-century library collections are available in the form of scanned book catalogs: a book or pamphlet listing all of the books available in a particular library. However, there are two potential problems with using those book catalogs to create corpora. First, it is not clear whether most or all of the books that were in these collections have been digitized. Second, the prospect of identifying the digital representations of the books listed in the catalogs is daunting, given the diversity of cataloging practices at the time. This article will report on progress towards developing an automated method to match entries in early nineteenth-century book catalogs with digitized versions of those books, and will also provide estimates of the fractions of the library holdings that have been digitized and made available in the Google Books/HathiTrust corpus.

Updated: 2019-12-16