A Machine Learning-based Approach to Detect Threats in Bio-Cyber DNA Storage Systems
arXiv - CS - Emerging Technologies. Pub Date: 2020-09-28, DOI: arxiv-2009.13380
Federico Tavella, Alberto Giaretta, Mauro Conti, Sasitharan Balasubramaniam

Data storage is one of the main computing issues of this century. Not only are storage devices converging to strict physical limits, but the amount of data generated by users is also growing at an unbelievable rate. To face these challenges, data centres have grown constantly over the past decades. However, this growth comes at a price, particularly from the environmental point of view. Among various promising media, DNA is one of the most fascinating candidates. In our previous work, we proposed an automated archival architecture which uses bioengineered bacteria to store and retrieve data previously encoded into DNA. This storage technique is one example of how biological media can deliver power-efficient storage solutions. The similarities between these biological media and classical ones can also be a drawback, as malicious parties might use biological instruments and techniques to replicate traditional attacks against the biological archival system. In this paper, we first analyse the main characteristics of our storage system and the different types of attacks that could be executed on it. Then, aiming at identifying ongoing attacks, we propose and evaluate detection techniques which rely on traditional metrics and machine learning algorithms. We identify and adapt two suitable metrics for this purpose, namely generalized entropy and information distance. Moreover, our trained models achieve an AUROC over 0.99 and an AUPRC over 0.91.
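
The abstract names generalized entropy and information distance but does not define them. As a minimal sketch, assume that generalized entropy means the Rényi entropy of order alpha, and that information distance means the symmetrised Rényi divergence between two distributions, as is common in the traffic-anomaly literature. The choice of nucleotide frequencies as the monitored distribution, the add-one smoothing, and every function name below are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def distribution(sequence, alphabet="ACGT"):
    """Smoothed symbol frequencies (add-one smoothing avoids zero probabilities).

    Assumption: nucleotide frequency is the monitored feature; the abstract
    does not say which distribution the metrics are computed on.
    """
    counts = Counter(sequence)
    total = sum(counts[s] for s in alphabet) + len(alphabet)
    return [(counts[s] + 1) / total for s in alphabet]


def generalized_entropy(p, alpha):
    """Renyi entropy of order alpha in bits; alpha == 1 gives Shannon entropy."""
    if alpha == 1:
        return -sum(x * math.log2(x) for x in p if x > 0)
    return math.log2(sum(x ** alpha for x in p if x > 0)) / (1 - alpha)


def renyi_divergence(p, q, alpha):
    """Renyi divergence of order alpha (alpha != 1); needs strictly positive p and q."""
    total = sum(x ** alpha * y ** (1 - alpha) for x, y in zip(p, q))
    return math.log2(total) / (alpha - 1)


def information_distance(p, q, alpha):
    """Symmetrised Renyi divergence: D(p||q) + D(q||p)."""
    return renyi_divergence(p, q, alpha) + renyi_divergence(q, p, alpha)


if __name__ == "__main__":
    baseline = distribution("ACGT" * 10)        # balanced reference sample
    observed = distribution("A" * 37 + "CGT")   # heavily skewed sample
    print(generalized_entropy(baseline, alpha=2))
    print(generalized_entropy(observed, alpha=2))
    print(information_distance(baseline, observed, alpha=2))
```

Under these assumptions, a detector would flag a sample whose entropy or distance from a known-good baseline crosses a threshold. How that threshold is chosen, and how the metrics combine with the machine learning models, is not specified in the abstract.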

Updated: 2020-09-29