Effect of Increasing the Descriptor Set on Machine Learning Prediction of Small Molecule-Based Organic Solar Cells
Chemistry of Materials (IF 7.2), Pub Date: 2020-08-25, DOI: 10.1021/acs.chemmater.0c02325
Zhi-Wen Zhao 1,2, Marcos del Cueto 1, Yun Geng 2, Alessandro Troisi 1

In this work, we analyzed a data set of 566 donor/acceptor pairs drawn from recently reported organic solar cells. We explored the effect of different descriptors in machine learning (ML) models to predict the power conversion efficiency (PCE) of these cells. The investigated descriptors are classified into two main categories: structural descriptors (topology properties) and physical descriptors (energy levels, molecular size, light absorption, and mixing properties). In line with previous observations, ML predictions are more accurate when using both structural and physical descriptors than when using only one of the two categories. We observed that ML predictions are also improved by using larger and more varied data sets. Importantly, the structural descriptors contribute the most to the ML models. Some physical properties are highly correlated with PCE, but they do not notably improve the ML prediction accuracy, because they carry information already encoded in the structural descriptors. Given that the various descriptors have significantly different computational costs, the analysis presented here can be used as a guide to construct ML models that maximize predictive power and minimize computational cost when screening large sets of candidates.

Updated: 2020-09-22