Identifying speech acts in a corpus of historical migrant correspondence
Studia Neophilologica, Pub Date: 2019-05-04, DOI: 10.1080/00393274.2019.1616216
Rachele De Felice, Emma Moreton

ABSTRACT A full account of the pragmatics of personal correspondence requires speech act annotation, and as manual annotation of large datasets can be extremely difficult, this study proposes to use an automated speech act tagger developed by the first author. It was originally designed for use with business emails; however, the latest iteration of the tagger can be applied to other datasets – such as personal correspondence – providing a useful resource for the corpus linguistics community. In this study, the speech act tagger is tested on a collection of letters written by Irish migrants at the end of the nineteenth century. After discussing issues to do with the digitisation, transcription and annotation of historical migrant correspondence, the article will report on the results of this trial study, demonstrating how the tagger can perform with some success even on corpora with very different characteristics. Although the dataset used for this trial study is small, the findings show the potential for carrying out this type of analysis across larger digital archives allowing for different datasets to be compared, taking into consideration sociobiographic variables such as the author’s sex, class and role within the notional familial hierarchy.

Updated: 2019-05-04