-
Digital assemblages with AI for creative interpretation of short stories Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-03-06 Kieran O'Halloran
I demonstrate an approach fostering inventive interpretation of short stories in Literary Studies and higher education generally. It involves constructing an ‘assemblage’—at its simplest, an evolving network of unusual connections for creative outcome. The assemblage of this article combines freshly located research literature, directly and indirectly related to a story’s themes, and/or the personality
-
Using deep learning to analyse the times of the UN Security Council Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-02-29 Tobias Blanke
This article analyses how digital humanities scholarship can make use of recent advances in deep learning to analyse the temporal relations in an online textual archive. We use transfer learning as well as data augmentation techniques to investigate changes in United Nations Security Council resolutions. Instead of pre-defined periods, as is common, we target the years directly. Such a text regression
-
Gender relations in Spanish theatre during the Silver Age: a quantitative comparison of works in the Spanish Drama Corpus Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-02-27 Monika Dabrowska, María Teresa Santa María Fernández
One of the many changes witnessed by Spanish society at the beginning of the 20th century was the early reshaping of the role of women, including in the realm of theatre. During the first three decades of the new century, Spanish theatre was thriving, favouring the emergence of new gender roles: there were new female playwrights, professional actresses, stage designers, costume designers, theatre company
-
Disentangling semantic and prosodic features of English poetry Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-02-27 Wenyi Shang, Ted Underwood
The distinction between genre and form is still contested in literary studies. While scholars associated with the New Formalism are criticized for perceiving everything as a form, digital humanists tend to argue that everything is a genre. In this research, we employed machine learning models to classify 36,635 English poems in the Chadwyck-Healey Literature Collections into twenty-seven categories
-
What drives non-linguists’ hands (or mouse) when drawing mental dialect maps? Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-02-27 Péter Jeszenszky, Carina Steiner, Nina von Allmen, Adrian Leemann
In perceptual dialectology, mental mapping is a popular tool used for eliciting attitudes and the spatial imprint of linguistic cognition from non-linguists, by tasking them with drawing linguistic variation on maps. Despite the popularity of this method, research on the geometrical parameters of the shapes drawn on these maps has been limited. In our study, we utilized 500 mental maps
-
Whose Anthropocene?: a data-driven look at the prospects for collaboration between natural science, social science, and the humanities Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-02-09 Carlos Santana, Kathryn Petrozzo, T J Perkins
Although the idea of the Anthropocene originated in the earth sciences, there have been increasing calls for questions about the Anthropocene to be addressed by pan-disciplinary groups of researchers from across the natural sciences, social sciences, and humanities. We use data analysis techniques from corpus linguistics to examine academic texts about the Anthropocene from these disciplinary families
-
Understanding poetry using natural language processing tools: a survey Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-02-07 Mirella De Sisto, Laura Hernández-Lorenzo, Javier De la Rosa, Salvador Ros, Elena González-Blanco
Analyzing poetry with automatic tools has great potential for improving verse-related research. Over the last few decades, this field has expanded notably and a large number of tools aiming at analyzing various aspects of poetry have been developed. However, the concrete connection between these tools and traditional scholars investigating poetry and metrics is often missing. The purpose of this article
-
Linguistic annotation of cuneiform texts using treebanks and deep learning Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-02-02 Matthew Ong, Shai Gordin
We describe an efficient pipeline for morpho-syntactically annotating an ancient language corpus which takes advantage of bootstrapping techniques. This pipeline is designed for ancient language scholars looking to jump-start their own treebank projects, which can in turn serve further pedagogical research projects in the target language. We situate our work in the field of similar ancient language
-
Epistemic consequences of unfair tools Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-01-24 Ida Marie S Lassen, Ross Deans Kristensen-McLachlan, Mina Almasi, Kenneth Enevoldsen, Kristoffer L Nielbo
This article examines the epistemic consequences of unfair technologies used in digital humanities (DH). We connect bias analysis informed by the field of algorithmic fairness with perspectives on knowledge production in DH. We examine the fairness of Danish Named Entity Recognition tools through an innovative experimental method involving data augmentation and evaluate the performance disparities
-
The analogy of computing Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-01-21 Willard McCarty
The digital machine is analogical by design: with it, we construct models of phenomena that by definition of that term are necessarily partial approximations. For that reason, we learn more by conceiving of them as analogues rather than imperfect copies. As the foofaraw over AI would make clear to anyone who bothered to separate its strange wheat from the common chaff, analogy is key to the digital
-
AGREE: a new benchmark for the evaluation of distributional semantic models of ancient Greek Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-01-15 Silvia Stopponi, Saskia Peels-Matthey, Malvina Nissim
Recent years have seen the application of Natural Language Processing, in particular language models, to the study of the semantics of ancient Greek, but only a little work has been done to create gold data for the evaluation of such models. In this contribution we introduce AGREE, the first benchmark for intrinsic evaluation of semantic models of ancient Greek created from expert judgements. In
-
Digitizing the USPTO patent backfile Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-01-15 Simon Rowberry
The digitization of the US Patent and Trademark Office’s (USPTO) backfile of six million patents undertaken between 1951 and 2001 was a five-decade struggle, featuring several media transitions from print and microfilm to CD-ROMs and, finally, the Web. This mass digitization project is on a similar scale to Google Books and the Internet Archive, but it is rarely discussed within critical digitization
-
Unsigned play by Milan Kundera? An authorship attribution study Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-01-11 Lenka Jungmannová, Petr Plecháč
In addition to being a widely recognized novelist, Milan Kundera has also authored three pieces for theatre: The Owners of the Keys (Majitelé klíčů 1961), The Blunder (Ptákovina 1967), and Jacques and his Master (Jakub a jeho pán 1971). In recent years, however, the hypothesis has been raised that Kundera was the true author of a fourth play, Juro Jánošík, first performed in a 1974 production under
-
Mapping Germanness in early 20th century USA: topic modeling and GIS within a small corpus framework Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-01-11 Sijie Wang, Maciej Kurzynski
The increased emphasis on language and ethnicity among German immigrants in the USA at the beginning of the 20th century resulted from inter-ethnic competition as well as assimilation pressures on Germans as a minority in American society. Following the unification of Germany and the improvement of German international status, Germans in America claimed superiority of German culture; middle-class advocates
-
The internal structure of medieval Latin legendaries: a computational analysis Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-01-11 Sébastien de Valeriola, Bastien Dubuisson
Since the middle of the 17th century, scholars have been systematically describing numerous medieval manuscripts preserved in libraries and religious institutions that contain hagiographic texts, that is, texts recounting the lives of saints. In this article, we apply quantitative tools to the resulting database to consider these codices from a new point of view. As such, we study their internal organization
-
Topic modelling literary interviews from The Paris Review Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-01-11 Derek Greene, James O'Sullivan, Daragh O'Reilly
The interview has always proved to be a rich source for those hoping to better understand the figures behind a text, as well as any social contexts and writing practices which might have informed their aesthetic sentiments. Over the past two decades, research into the literary interview has made significant strides, both in terms of how this literary genre is conceptualized and how its emergence and
-
Networks as interpretative frameworks: using co-citation analysis to explore large corpora of early modern letters Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2024-01-06 Paolo Rossini
The analysis of co-citation, which occurs when two publications or authors are mentioned together in the same text, has long been established as a practice within scientometrics, particularly in the field of “science mapping”. However, historiography has shown less openness to utilizing co-citation analysis for distant reading purposes. To address this gap, this article presents a comprehensive methodology
-
Using Bayesian phylogenetics to infer manuscript transmission history Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-12-19 Joey McCollum, Robert Turnbull
Bayesian phylogenetic methods offer various models that would be especially suitable in the reconstruction of textual traditions, but text-critical applications of phylogenetics to date have generally not taken advantage of these features. In this article, we offer a way forward for text-critical phylogenetics. On the side of theory, we highlight multiple Bayesian phylogenetic models and discuss their
-
A methodology for building domain ontology of cultural heritage Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-12-01 Tong Wei, Yuqi Chen
Ontology plays a vital role in linking and publishing heritage data. However, the main difficulty heritage institutions face in building an ontology is how to integrate cultural heritage information into a fine-grained ontology. Therefore, following the ISO principles of Terminology (ISO 1087 and 704), this article proposes a term-concept-characteristic methodology that is user-friendly for
-
Principal components analysis in stylometry Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-11-29 Hugh Craig
Principal components analysis (PCA) has been one of the staple methods used in stylometry. In a 2021 article, Pervez Rizvi casts doubt on this method and argues that some widely cited results based on it should be set aside. In the current article, I show that none of Rizvi’s theoretical claims or experimental results stand up to examination. Rizvi argues that discarding the principal components beyond
-
Film dialogue and R-stylo Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-11-29 Barry Salt
The dialogue of a large number of American feature films of the last 30 years is analysed with the stylometric tools contained in the R-stylo package. Various interesting results showing the capabilities and restrictions of this statistical package emerge.
-
All the world’s a (hyper)graph: A data drama Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-11-20 Corinna Coupette, Jilles Vreeken, Bastian Rieck
We introduce Hyperbard, a dataset of diverse relational data representations derived from Shakespeare’s plays. Our representations range from simple graphs capturing character co-occurrence in single scenes to hypergraphs encoding complex communication settings and character contributions as hyperedges with edge-specific node weights. By making multiple intuitive representations readily available for
-
Methodological observations concerning word rankings and z-score refinements Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-11-08 Hartmut Ilsemann
This article evaluates word rankings suggested by Ary L. Goldberger, Albert C. Yang, and C. Peng as a means of establishing the authorship of texts in the light of Delta, developed by John Burrows at about the same time. The tests carried out with high ranking function words and results established with the more modern approaches of Rolling Delta, Rolling Classify, and the General Imposters method
-
VR as a metaleptic possible world of global citizenship embodiment: a cognitive stylistic approach Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-11-08 Rania Magdi Fawzy
Bringing together narrative elements, virtual affordances, and participants’ embodied interactions, virtual reality (VR) movies instantiate new narrative techniques by offering an immersive experience. This study examines virtual narrative beyond mere interactional engagement and extends the phenomenon to include worlding, metaleptic embodiment, and instantiated possible selves. It aims at exploring
-
One-third of a century on: the state of the art, pitfalls, and the way ahead relating to digital humanities approaches to translation and interpreting studies Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-19 Chonglong Gu
The year 1993 represents a momentous milestone in the not-so-long history of translation and interpreting studies (TIS). The foundational paper published by Mona Baker entitled ‘Corpus linguistics and translation studies: Implications and applications’ in 1993 has signalled a defining moment in the application of digital humanities (DH) approaches in TIS. Since then, corpus-based TIS, as a most visible
-
Who could be behind QAnon? Authorship attribution with supervised machine-learning Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-19 Florian Cafiero, Jean-Baptiste Camps
A series of social media posts on 4chan and then 8chan, signed under the pseudonym ‘Q’, started a movement known as QAnon, which led some of its most radical supporters to violent and illegal actions. To identify the person(s) behind Q, we evaluate the coincidence between the linguistic properties of the texts written by Q and those written by a list of suspects provided by journalistic investigation
-
A quantitative window on the history of statistics: topic-modelling 120 years of Biometrika Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-14 Nicola Bertoldi, Francis Lareau, Charles H Pence, Christophe Malaterre
As one of the oldest continuously publishing journals in statistics (published since 1901), Biometrika provides a unique window onto the history of statistics and its epistemic development throughout the 20th and the beginning of the 21st centuries. While the early history of the discipline, with the works of key figures, such as Karl Pearson, Francis Galton, or Ronald Fisher, is relatively well known
-
What can digital humanities do for literary adaptation studies: distant reading of children’s editions of Robinson Crusoe Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-11 Haifeng Hui
While digital humanities has emerged as a cutting-edge research trend in the humanities over the past two decades, its application in literary research is still scarce. At present, the field of digital humanities for literary studies is largely focused on theoretical development, critical reflections, and infrastructure building. This article aims to explore the potential of digital humanities in advancing
-
Corpus philology: Using the Dictionary of Old English to get bigger data for Old English spelling variation Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-11 Mark Faulkner
This article presents a methodology for obtaining large datasets for the spelling of individual phonological segments in Old English texts, based on searching the Dictionary of Old English Corpus for the attested spellings listed in the Dictionary of Old English A-H. It exemplifies this ‘corpus philology’ through a study of 216,526 spellings for words beginning with h followed by a vowel, using a variety
-
Transmission problems? An embedded approach for unification of Latin prefixes and text variants during text matching Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-11 Franziska Schropp, Thomas E Konrad, Marie Revellio, Barbara Feichtinger
The manuscript tradition of pre-modern texts poses a specific problem for scholars in the field of Digital Humanities: before printing made the production of standardized editions of texts feasible, copying texts by hand (and often by different people) was inherently an error-prone process, which not only led to differences in wording but also in spelling—across multiple transmitted variants. This
-
The battle plans in the 17th century on the example of the ‘ordres de bataille’ album by Eric Dahlbergh. Research model proposal Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-10 Mariusz Balcerek
The article’s purpose is to present an original method of analysis of battle plans (orders of battle, battle orders, battle patterns, and battle settings) from the 17th century. We analysed 253 patterns from Eric Dahlbergh’s Album, which was a gift for King Charles XI of Sweden. In this text, we will take a look at the settings presented in Dahlbergh’s album. We hope that it will allow us to learn
-
R Stylo and the authorship determination of Henry V Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-10 Hartmut Ilsemann
Over 25 years, Thomas Merriam has argued that Henry V was co-authored by Shakespeare and Christopher Marlowe, and in his most recent publication ‘Is it time to reconsider Henry V’ (2023), he established differences in word length, which gives clear evidence. This article makes use of the R Stylo suite of stylometric tools and employs the Rolling Delta, Rolling Classify, and the General Imposters methods
-
‘This app is evil forest true true’: metaphor-based metadiscursive evaluations of Twitter by Nigerians Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-07 Onwu Inya
Previous linguistic studies on Nigeria-based Twitter discourse have investigated radicalist, terrorist, campaign, and electioneering discourses. These previous studies focus on discourses produced on Twitter, and not metadiscursive reflections about the social media space itself, and the discursive practice of dragging, in the context of celebrity–newcomer socialization, drawing theoretical insights
-
Tracing connections: using network analysis to study trade and movement in the Mediterranean in the 11th to 14th centuries Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-07 Annabel Hancock
This study uses network approaches to study late medieval Mediterranean trade and movement and test the validity of using network methods to investigate the past. Historical literature largely focuses on merchant communities and which cities were most central for trade. In this article, two networks, one created from archaeological finds and the other from the writings of four medieval travellers,
-
Deep learning-based lexical character identification in TV series Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-07 Paola Dalla Torre, Paolo Fantozzi, Maurizio Naldi
Automated character identification in movies and TV series has been typically carried out through face detection in video and the association of faces with characters’ names extracted from dialogues or cast lists. We propose a deep learning architecture to identify characters based on subtitles only, precisely through the lexicon those characters employ. The identification task is formalized as a multi-class
-
Metaphor repositories: the case of the mental health metaphor dictionary Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-10-07 Marta Coll-Florit, Salvador Climent
In recent years, there has been an emergence of online metaphor repositories. The purpose of this article is twofold. First, we present a review and comparison of the existing online databases of conceptual metaphors, showing that although there are a good number of domain-independent conceptual metaphor repositories based on English texts, repositories that are field-specific and/or in other languages
-
Personality recognition in Digital Humanities: A review of computational approaches in the humanities Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-25 Davide Picca, Jocelin Pitteloud
One of the most fascinating aspects of human beings is their personality. Two models that are currently being researched and widely used in computational approaches are the Myers–Briggs Type Indicator and the Big Five (or OCEAN). In this study, we will briefly examine the history of these two models and the current state of their applications in the Digital Humanities field. Although categorizing research
-
Revealing ‘invisible’ poetry by W. H. Auden through computer vision: Using photometric stereo to visualize indented impressions Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-25 Simon Brenner, Timo Frühwirth, Sandra Mayer
This article explores the use of computer-vision technologies in the context of digitally editing and researching letters and literary papers by the British-American poet W. H. Auden. Two documents in the previously inaccessible ‘Auden Musulin Papers’ contain colourless indented typewriter impressions of poetry. These impressions result from the papers’ original use as ‘backing sheets’, inserted into
-
A crossroad between lexicography and terminology work: Knowledge organization and domain labelling Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-21 Rute Costa, Ana Salgado, Margarida Ramos, Bruno Almeida, Raquel Silva, Sara Carvalho, Fahad Khan, Toma Tasovac, Mohamed Khemakhem, Laurent Romary
The MORDigital project aims to encode the selected editions of Diccionario de Lingua Portugueza by António de Morais Silva, first published in 1789. Our ultimate goals are, on the one hand, to promote accessibility to cultural heritage while fostering reusability and, on the other hand, to contribute towards a more significant presence of lexicographic digital content in Portuguese through open tools and
-
TBX and ‘Lemon’: What perspectives in terminology? Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-21 Silvia Piccini, Federica Vezzani, Andrea Bellandi
Different solutions are offered today for modelling multilingual terminological data. In this article, we focus on the description of two approaches: on the one hand, the model proposed in the context of ISO TC 37/SC 3, based on the adoption of the Terminological Markup Framework/TermBase eXchange standards; on the other hand, the ‘Lemon’ model and, more generally, the Web Ontology Language adopted
-
A systematic review of Automatic Term Extraction: What happened in 2022? Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-21 Giorgio Maria Di Nunzio, Stefano Marchesin, Gianmaria Silvello
Automatic Term Extraction (ATE) systems have been studied for many decades as, among other things, one of the most important tools for tasks such as information retrieval, sentiment analysis, named entity recognition, and others. The interest in this topic has even increased in recent years given the support and improvement of the new neural approaches. In this article, we present a follow-up on the
-
Terminologie collaborative: analyse d'un projet inter-universitaire outillé en contexte européen Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-21 Pascale Elbaz, Elpida Loupaki
The purpose of this article is to analyze a collaborative terminology project undertaken jointly by three academic institutions as well as a Terminology Unit, in the framework of a European cooperation. For this purpose, we will analyze the theoretical and practical aspects necessary for a successful collaboration, including methodology and workflow; we will evaluate the performance of the different
-
Des corpus aux bases de données… et retour. Quelle architecture pour une base de données socioterminologiques? Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-21 Valérie Delavigne
The fact is now accepted that specialised discourses do not deviate from the rules of language: variation in all its forms emerges from corpora. One of the questions that now arises is that of “implementing” socioterminological data in terminological resources: how can sociolinguistic diversity be integrated into a descriptive program? After reviewing the types of variation, this article explores an
-
Towards terminological resources tailored to the users’ needs: Terminology extraction based on appositive constructions Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-21 Giulia Speranza, Maria Pia di Buono, Johanna Monti
Terminological resources (TRs) are indispensable tools for accessing a specialized domain of knowledge. In this article, we propose a methodology for extracting terms and relevant linguistic information, useful for different users, hinging on the nature of some special linguistic structures: appositive constructions. The case study for our proof of concept in this investigation is the domain of Cultural
-
La Théorie du Concept des Normes ISO à l’Ere Numérique Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-21 Christophe Roche
Many IT applications rely on the operationalisation of terminologies, such as multilingual semantic search engines. By operationalisation of terminology, we mean a computational representation of the conceptual system as an ontology of knowledge engineering. This conceptual approach to terminology is close to the ISO principles on terminology as defined by the core standards ISO 1087:2019 and ISO 704:2022
-
Machine versus corpus-based translation of multiword terms Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-21 Melania Cabezas-García, Pilar León-Araúz
Machine translation (MT) post-editing is an increasingly common practice in the translation industry which is also slowly being applied in the development of terminological resources. However, more studies have been devoted to analyzing the practice in a translation scenario than in a terminographic context. Consequently, term-oriented post-editing guidelines are a current need if terminographers are
-
Time and space as two basic attributes of Buddhist monuments: An introduction to the design, implementation, and application of the data platform of Buddhist monuments in China Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-15 Weiqiao Wang, Kequan Li
The Data Platform of Buddhist Monuments in Time and Space (buddhist.wiki) is a research infrastructure that focuses on collecting and presenting materials related to Buddhist monuments, which are presented through a geographic map and a complete archive covering five aspects: overview, architecture, history, art, and personage. Time and space are extracted as two basic attributes and applied as design dimensions to organize
-
Comparative network analysis as a new approach to the editorship profiling task: A case study of the Mishnah and Tosefta from Rabbinic literature Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-14 Avital Zadok, Maayan Zhitomirsky-Geffet, Jonathan Schler, Binyamin Katzoff
Social network analysis of characters in historical works is a popular research methodology in the study of historical literature. This article proposes using this methodology to characterize and comparatively analyze editing styles of similar historical literary works to determine whether they were edited by the same hand. To that end, the study proposes constructing a network of characters for each
-
Lexical diversity as a lens into the classification of Slavic languages: A quantitative typology perspective Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-06-10 Chenliang Zhou, Haitao Liu
This study proposes a linguistic classification method based on quantitative typology, which leverages a large-scale multilingual parallel corpus to obtain valid language classification results by excluding the influence of covariates such as text genre and semantic content in cross-language comparison. To achieve this, we model the type–token relationships of each Slavic parallel text and calculate
-
“I would I had that corporal soundness”: Pervez Rizvi's Analysis of the Word Adjacency Network Method of Authorship Attribution Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-04-28 Gabriel Egan, Mark Eisen, Alejandro Ribeiro, Santiago Segarra
In his two-part article ‘An Analysis of the Word Adjacency Network Method—Part 1—The evidence of its unsoundness’ and ‘Part 2—A true understanding of the method’ Digital Scholarship in the Humanities, 38: 347-78 (2022), Pervez Rizvi attempts to replicate the Word Adjacency Network (WAN) method for authorship attribution and show that it does not produce the new knowledge that we, its inventors, claim
-
Provenance visualization: Tracing people, processes, and practices through a data-driven approach to provenance Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-04-25 Tomas Vancisin, Loraine Clarke, Mary Orr, Uta Hinrichs
Provenance disclosure—the documentation of an artifact’s origin and how it was produced—is an important aspect to consider when working with historical records which undergo multiple transformations in preparation for and during digitization. Provenance in this context is commonly communicated through explanatory text or static diagrams. However, the methodological and curatorial decisions that have
-
Proverbs as indicators of proficiency for art-generating AI Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-04-23 Luis J Tosina Fernández
Art generated by Artificial Intelligence (AI) is currently having a great impact online. This is because it allows people without creative talent to produce outstanding works by just typing in the description of what they want to illustrate. However, the appearance of this technology has also caused some discomfort among artists and graphic designers, who see their craft threatened
-
A new approach for the construction of historical databases—NoSQL Document-oriented databases: the example of AtlantoCracies Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-04-23 Manuel Diaz-Ordoñez, Domingo Savio Rodríguez Baena, Bartolomé Yun-Casalilla
This article proposes, and justifies, the use of document-oriented databases as a flexible, easy to use, and powerful digital tool in the field of historical research. First, the reasons that have made relational databases the predominant instrument among historians are studied, while detailing the problems involved in their use. Next, the way in which historians have tried to face these problems
-
Web archive analytics: Blind spots and silences in distant readings of the archived web Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-04-19 Simon Donig, Markus Eckl, Sebastian Gassner, Malte Rehbein
In this article, we discuss epistemological and methodological aspects of web archive analytics, a recent development towards more data-centred access to web archives. More specifically, we suggest understanding both the process of archiving and subsequent steps of analysis at scale as acts of observation that can be questioned for their epistemological a priori. Therefore, we propose the concepts of
-
NEAT—Named Entities in Archaeological Texts: A semantic approach to term extraction and classification Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-04-13 Maria Pia di Buono, Gennaro Nolano, Johanna Monti
The lack of annotated datasets affects the development of Natural Language Processing applications and heavily impacts the access to textual data, in particular for specific domains and specific languages. In this paper, we propose a methodology to annotate texts concerning domain-specific knowledge, to provide a reliable source of data for the task of Named Entity Recognition (NER) in the domain of
-
Unravelling interlanguage facts via explainable machine learning Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-04-10 Barbara Berti, Andrea Esuli, Fabrizio Sebastiani
Native language identification (NLI) is the task of training (via supervised machine learning) a classifier that guesses the native language of the author of a text. This task has been extensively researched in the last decade, and the performance of NLI systems has steadily improved over the years. We focus on a different facet of the NLI task, i.e. that of analysing the internals of an NLI classifier
-
Sagas and genre: A case for application of network analysis to manuscripts preserving Old Norse-Icelandic saga literature Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-04-07 Katarzyna Anna Kapitan, Tarrin Wills
This study applies statistical approaches to the analysis of the genre relationships of Old Norse-Icelandic literature in order to expand our understanding of the relationships between works, their transmission, and their possible modes of reception, as manifested in the extant manuscripts. This article contributes to the ongoing discussion of the genre boundaries of Old Norse-Icelandic literature
-
A workflow model for holistic data management and semantic interoperability in quantitative archival research Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-04-06 Pavlos Fafalios, Yannis Marketakis, Anastasia Axaridou, Yannis Tzitzikas, Martin Doerr
Archival research is a complicated task that involves several diverse activities for the extraction of evidence and knowledge from a set of archival documents. The involved activities are usually unconnected, in terms of data connection and flow, making their recursive revision and execution difficult, as well as the inspection of provenance information at data element level. This article proposes
-
A learning approach towards metre-based classification of similar Hindi poems using proposed two-level data transformation Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-03-20 Komal Naaz, Niraj Kumar Singh
With the advancement of technology and the digitalization of resources, humanities problems have not remained untouched by computation. Automatic poetry classification is now a well-defined problem which can be solved using various approaches. Mood-based poetry classification is one of the popular ones. We propose a learning approach towards metre-based classification of Hindi metrical poetry.
-
Is it time to reconsider Henry V? Digit. Scholarsh. Humanit. (IF 1.299) Pub Date : 2023-03-17 Thomas Merriam
The average word length of the verse sections of Henry V is greater than that of the prose sections. Contrary to the assumption that word length primarily reflects literary medium (prose–verse), a more extended examination reveals that differences in word length can also reflect differences in authorship. Since 1998, a variety of stylistic analyses have pointed in the same direction: the verse of Henry