Computer Architecture-Aware Optimisation of DNA Analysis Systems
arXiv - CS - Other Computer Science Pub Date : 2021-01-13 , DOI: arxiv-2101.05012
Hasindu Gamaarachchi

DNA sequencing is revolutionising the field of medicine. DNA sequencers, the machines that perform DNA sequencing, have shrunk from the size of a fridge to that of a mobile phone over the last two decades. The cost of sequencing a human genome has also fallen from billions of dollars to hundreds of dollars. Despite these improvements, DNA sequencers output hundreds or thousands of gigabytes of data that must be analysed on computers to discover meaningful information with biological implications. Unfortunately, analysis techniques have not kept pace with rapidly improving sequencing technologies. Consequently, even today, DNA analysis is performed on high-performance computers, just as it was a couple of decades ago. Such high-performance computers are not portable, so the full utility of an ultra-portable sequencer for sequencing in the field or at the point of care is limited by the lack of portable, lightweight analytic techniques. This thesis proposes computer architecture-aware optimisation of DNA analysis software. DNA analysis software is inevitably convoluted due to the complexity associated with biological data. Modern computer architectures are also complex. Performing architecture-aware optimisations requires the synergistic use of knowledge from both domains (i.e., DNA sequence analysis and computer architecture). This thesis aims to draw the two domains together. In this thesis, gold-standard DNA sequence analysis workflows are systematically examined for algorithmic components that cause performance bottlenecks. Identified bottlenecks are resolved through architecture-aware optimisations at different levels, i.e., memory, cache, register and processor. The optimised software tools are used in complete end-to-end analysis workflows, and their efficacy is demonstrated by running them on prototypical embedded systems.

Updated: 2021-01-14