Large scale analysis of violent death count in daily newspapers to quantify bias and censorship
Journal of Big Data (IF 8.1), Pub Date: 2020-08-11, DOI: 10.1186/s40537-020-00338-1
Marco Casolino

In this work we develop a series of techniques and tools to detect and quantify bias and censorship in newspapers. These algorithms are tested by analyzing the occurrence of the keywords ‘killed’ and ‘suicide’ (‘morti’ and ‘suicidio’ in Italian) and their changes over time, gender and reported location in the complete online archives (42 million records) of a major US newspaper (The New York Times) and the three major Italian ones (Il Corriere della Sera, La Repubblica, La Stampa). Because the Italian language distinguishes between the female and male cases, these tools reveal gender bias in all Italian newspapers, with reported deaths of a single woman amounting to about one-third of those involving a single man. Analyzing the historical trends, we show evidence of censorship in Italian newspapers both during World War 1 and during the Italian Fascist regime. Censorship in all countries during the World Wars and in Italy during the Fascist period is a historically ascertained fact, but until now there has been no estimate of the amount of censorship in newspaper reporting: in this work we estimate that about 75% of domestic deaths and suicides were not reported. This is also confirmed by statistical analysis of the distribution of the least significant digit of the number of reported deaths. We also find that the distribution function of the number of articles vs. the number of deaths reported in articles follows a power law, which is broken (with fewer articles being written) when reporting on few deaths occurring in foreign countries. This shortfall of articles is found to grow with geographical distance from the nation where the newspaper is printed.
Whereas assessing the truth of a single article or debunking what is now called ‘fake news’ requires specific fact-checking and becomes more difficult as time goes by, these methods can be used in historical analysis and to evaluate quantitatively the amount of bias and censorship present in other printed or online publications, and can thus contribute to quantitatively assessing the freedom of the press in a given country. Furthermore, they can be applied in wider contexts such as the evaluation of bias toward specific ethnic groups or specific accidents.
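
The abstract mentions a check on the distribution of the least significant digit of reported death counts. The abstract does not spell out the paper's actual procedure, so the following is only a minimal sketch of one common way to run such a check: a chi-square comparison of last-digit frequencies against a uniform expectation. The function names, the `min_count` filter, and the 5% significance level are all illustrative assumptions and are not taken from the paper.

```python
from collections import Counter

# Critical value of the chi-square distribution for 9 degrees of freedom
# at the 5% significance level (standard table value).
CHI2_CRIT_DF9_5PCT = 16.919


def last_digit_counts(death_counts):
    """Return a list of 10 frequencies, one for each final digit 0-9."""
    tally = Counter(n % 10 for n in death_counts)
    return [tally.get(d, 0) for d in range(10)]


def chi_square_uniform(observed):
    """Chi-square statistic of observed frequencies against a uniform expectation."""
    total = sum(observed)
    if total == 0:
        raise ValueError("no counts supplied")
    expected = total / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def digits_look_uniform(death_counts, min_count=10):
    """Return (statistic, verdict) for the last-digit uniformity check.

    Assumption (not from the paper): counts below `min_count` are dropped,
    because for single-digit counts the last digit is the count itself and
    is not expected to be uniform.
    """
    filtered = [n for n in death_counts if n >= min_count]
    stat = chi_square_uniform(last_digit_counts(filtered))
    return stat, stat <= CHI2_CRIT_DF9_5PCT
```

With hypothetical input, `digits_look_uniform(list(range(10, 110)))` gives a statistic of 0.0, since every final digit occurs equally often, whereas a list made only of multiples of ten is flagged as non-uniform. An excess of one final digit, typically 0 or 5, is the kind of signal that suggests rounded or altered figures rather than exact counts.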

Updated: 2020-08-11