What Library Digitization Leaves Out: Predicting the Availability of Digital Surrogates of English Novels
arXiv - CS - Digital Libraries Pub Date : 2020-09-01 , DOI: arxiv-2009.00513
Allen Riddell and Troy J. Bassett

Library digitization has made more than a hundred thousand 19th-century English-language books available to the public. Do the books which have been digitized reflect the population of published books? An affirmative answer would allow book and literary historians to use holdings of major digital libraries as proxies for the population of published works, sparing them the labor of collecting a representative sample. We address this question by taking advantage of exhaustive bibliographies of novels published for the first time in the British Isles in 1836 and 1838, identifying which of these novels have at least one digital surrogate in the Internet Archive, HathiTrust, Google Books, and the British Library. We find that digital surrogate availability is not random. Certain kinds of novels, notably novels written by men and novels published in multivolume format, have digital surrogates available at distinctly higher rates than other kinds of novels. As the processes leading to this outcome are unlikely to be isolated to the novel and the late 1830s, these findings suggest that similar patterns will likely be observed during adjacent decades and in other genres of publishing (e.g., non-fiction).

Updated: 2020-09-02