Exact probability of fixed patterns occurring in a random sequence
Communications in Statistics - Simulation and Computation (IF 0.9), Pub Date: 2020-06-30, DOI: 10.1080/03610918.2020.1766500
Ke-Ning Sheng, Joseph I. Naus

Abstract

We derive a procedure for obtaining the exact probability that a specific pattern of letters occurs in a longer random sequence of letters. The procedure generalizes to the exact probability of a single fixed (specific) pattern, and of a union or intersection of multiple fixed (specific) patterns, within a random sequence. It applies to any distribution of a cell in the random sequence and can handle patterns with uncertain letters (missing, blank, unclear, ambiguous, transposed, etc.). The procedure also gives the probability that a randomly chosen pattern will appear in a separate, longer random sequence of letters. These methods are particularly applicable to genetic sequence analysis, diagnostics, anthropology, clinical medicine, data mining, computational molecular biology, and pattern analysis and recognition.
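
To make the quantity in the abstract concrete, the following is a minimal sketch of one standard way to compute such a probability exactly. It is not the authors' procedure, which this listing does not describe. It assumes independent, identically distributed letters and a single fixed pattern with no uncertain letters. It tracks the longest pattern prefix matched so far (a KMP-style automaton) and propagates exact probabilities with `Fraction`. The function name `pattern_occurrence_probability` and its parameters are hypothetical.

```python
from fractions import Fraction


def pattern_occurrence_probability(pattern, n, letter_probs):
    """Exact P(pattern occurs at least once in n i.i.d. letters).

    Assumptions (not from the paper): letters are independent with the
    probabilities in letter_probs, and the pattern is a plain string.
    """
    m = len(pattern)
    if m == 0:
        return Fraction(1)

    # KMP failure function: fail[i] is the length of the longest proper
    # prefix of pattern[:i + 1] that is also a suffix of it.
    fail = [0] * m
    k = 0
    for i in range(1, m):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k

    def step(state, letter):
        # Next matched-prefix length after reading one more letter.
        while state and pattern[state] != letter:
            state = fail[state - 1]
        if pattern[state] == letter:
            state += 1
        return state

    # dist[s]: probability that the pattern has not occurred yet and the
    # currently matched prefix has length s.
    dist = [Fraction(0)] * m
    dist[0] = Fraction(1)
    for _ in range(n):
        new = [Fraction(0)] * m
        for s, p in enumerate(dist):
            if p == 0:
                continue
            for letter, q in letter_probs.items():
                t = step(s, letter)
                if t < m:  # t == m means the pattern occurred; drop that mass
                    new[t] += p * q
        dist = new
    return 1 - sum(dist)


fair_coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
print(pattern_occurrence_probability("HH", 3, fair_coin))  # 3/8
```

For a fair coin and three tosses, the pattern HH appears in HHH, HHT, and THH, which gives 3/8 and matches the printed value. Unions or intersections of several patterns, and patterns with uncertain letters, would need a larger state space than this sketch uses.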




Updated: 2020-06-30