Hypercluster: a flexible tool for parallelized unsupervised clustering optimization
BMC Bioinformatics (IF 2.9), Pub Date: 2020-09-29, DOI: 10.1186/s12859-020-03774-1
Lili Blumenberg, Kelly V Ruggles

Unsupervised clustering is a common and exceptionally useful tool for large biological datasets. However, clustering requires the algorithm and its hyperparameters to be selected upfront, which can introduce bias into the final clustering labels. It is therefore advisable to obtain a range of clustering results from multiple models and hyperparameters, a process that can be cumbersome and slow. We present hypercluster, a Python package and Snakemake pipeline for flexible and parallelized clustering evaluation and selection. Users can efficiently evaluate a wide range of clustering results from multiple models and hyperparameters to identify an optimal model. Hypercluster improves ease of use, robustness and reproducibility for unsupervised clustering applications in high-throughput biology. Hypercluster is available on pip and bioconda; installation, documentation and example workflows can be found at https://github.com/ruggleslab/hypercluster.
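
The general idea of comparing several clustering models and hyperparameter settings can be illustrated with a minimal sketch. It does not use hypercluster's own API, which the abstract does not describe. The helper names (`agglomerative`, `silhouette`), the toy data, and the choice of the silhouette coefficient as the evaluation metric are all assumptions made for illustration. The sketch uses only the Python standard library.

```python
import itertools
import math


def agglomerative(points, n_clusters, linkage):
    """Naive agglomerative clustering (hypothetical helper).

    linkage is 'single' (closest pair) or 'complete' (farthest pair).
    """
    clusters = [[i] for i in range(len(points))]
    pick = min if linkage == "single" else max

    def cluster_dist(a, b):
        return pick(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the two closest clusters under the chosen linkage and merge them.
        a, b = min(
            itertools.combinations(range(len(clusters)), 2),
            key=lambda ab: cluster_dist(clusters[ab[0]], clusters[ab[1]]),
        )
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; needs >= 2 clusters, singletons score 0."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in groups[lab] if j != i]
        if not own:
            continue  # a singleton cluster contributes 0
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in groups.items()
            if other != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (an assumption): three well-separated 2x2 blobs of points.
offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
centers = [(0, 0), (10, 0), (0, 10)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

# Sweep every (model, hyperparameter) combination and score each labeling.
results = {}
for linkage in ("single", "complete"):
    for n_clusters in range(2, 6):
        labels = agglomerative(points, n_clusters, linkage)
        results[(linkage, n_clusters)] = silhouette(points, labels)

best_linkage, best_k = max(results, key=results.get)
print(f"best: linkage={best_linkage}, n_clusters={best_k}")
```

Each combination in the sweep is independent of the others, which is what makes this kind of search straightforward to parallelize, for example by running each combination as a separate pipeline job.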

Updated: 2020-09-29