What Library Digitization Leaves Out: Predicting the Availability of Digital Surrogates of English Novels
arXiv - CS - Digital Libraries, Pub Date: 2020-09-01, DOI: arxiv-2009.00513
Allen Riddell and Troy J. Bassett
Library digitization has made more than a hundred thousand 19th-century
English-language books available to the public. Do the books which have been
digitized reflect the population of published books? An affirmative answer
would allow book and literary historians to use holdings of major digital
libraries as proxies for the population of published works, sparing them the
labor of collecting a representative sample. We address this question by taking
advantage of exhaustive bibliographies of novels published for the first time
in the British Isles in 1836 and 1838, identifying which of these novels have
at least one digital surrogate in the Internet Archive, HathiTrust, Google
Books, and the British Library. We find that digital surrogate availability is
not random. Certain kinds of novels, notably novels written by men and novels
published in multivolume format, have digital surrogates available at
distinctly higher rates than other kinds of novels. As the processes leading to
this outcome are unlikely to be isolated to the novel and the late 1830s, these
findings suggest that similar patterns will likely be observed during adjacent
decades and in other genres of publishing (e.g., non-fiction).
Updated: 2020-09-02