Replaying Archived Twitter: When your bird is broken, will it bring you down?
arXiv - CS - Digital Libraries, Pub Date: 2021-08-27, DOI: arxiv-2108.12092
Kritika Garg, Himarsha R. Jayanetti, Sawood Alam, Michele C. Weigle, Michael L. Nelson

Historians and researchers trust web archives to preserve social media content that no longer exists on the live web. However, what we see on the live web and how it is replayed in the archive are not always the same. In this paper, we document and analyze the problems in archiving Twitter since Twitter forced the use of its new UI in June 2020. Most web archives were unable to archive the new UI, resulting in archived Twitter pages displaying Twitter's "Something went wrong" error. The challenges in archiving the new UI forced web archives to continue using the old UI. To analyze the potential loss of information in web archival data due to this change, we used the personal Twitter account of the 45th President of the United States, @realDonaldTrump, which was suspended by Twitter on January 8, 2021. Trump's account was heavily labeled by Twitter for spreading misinformation; however, we discovered that there is no evidence in web archives to prove that some of his tweets ever had a label assigned to them. We also studied the possibility of temporal violations in archived versions of the new UI, which may result in the replay of pages that never existed on the live web. Our goal is to educate researchers who may use web archives and caution them when drawing conclusions based on archived Twitter pages.

Updated: 2021-08-30