A Medieval Epigraphic Corpus and its Retro-Developments (CIFM-CBMA): The Exploratory Research of the Cosme2 Consortium
Digital Scholarship in the Humanities (IF 1.299) Pub Date: 2020-12-26, DOI: 10.1093/llc/fqaa069
Eliana Magnani, Nicolas Perreaux

The digital ‘Burgundian Epigraphic Corpus’ is the result of a collaboration between two teams, the Corpus of Inscriptions of Medieval France (CIFM) and the Corpus of Medieval Burgundian Texts (CBMA), as part of Cosme2 (Consortium Sources Médiévales, linked to the CNRS TGIR Huma-Num, France), a consortium dedicated to digital approaches to historical corpora. This article explains how a complex set of documents mixing Latin, Greek, and Old French texts, accompanied by rich metadata, has been processed to enable new kinds of inquiry by humanities scholars. It shows how the corpus is continually reworked and how its exploitation with digital methods generates new data and metadata that can be fed back into the corpus and exploited in turn, creating a kind of virtuous circle. Three retro-developments are briefly discussed here: (1) semantic web, connectivity, and named entities; (2) geographic information system (GIS) and automated extraction of new metadata; (3) lemmatization and automatic language detection.

Updated: 2020-12-26