Data Preprocessing for Multiblock Modelling – A Systematization with New Methods
Chemometrics and Intelligent Laboratory Systems (IF 3.7), Pub Date: 2020-04-01, DOI: 10.1016/j.chemolab.2020.103959
Maria P. Campos, Marco S. Reis

Abstract: With the advance of Industry 4.0, new data collectors are appearing at different points of the process, generating blocks of data whose integrity should be preserved during data analysis. This is the scope of multiblock methods, whose potential has been recognized in several application areas where they are becoming increasingly popular. Multiblock methods can be applied to a wide range of data-driven problems that practitioners face today, such as plant-wide process monitoring and diagnosis, process optimization, and quality prediction of key product properties. These methods can find associations and interpretative connections between data blocks that come from different sources and carry complementary or overlapping information, and they can assess the blocks' relative contributions to the final outcome. A critical stage in the application of multiblock methods is selecting the appropriate preprocessing for each block before proceeding to modelling. A well-chosen preprocessing strategy can amplify the information extracted from the blocks and their mutual interactions; an inappropriate one can hide, mask, or distort it. In this article, we present a systematic workflow in which both the intra-block and inter-block variation components are considered during preprocessing. We illustrate the application of the framework using two real case studies, in which the different preprocessing alternatives are critically compared.
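To make the distinction between intra-block and inter-block preprocessing concrete, the following is a minimal sketch of one common generic combination: autoscaling the variables within each block, then dividing each block by its Frobenius norm so that a block with many variables does not dominate the concatenated data. This is not the workflow proposed in the article, whose details are not given in the abstract; the function names `autoscale`, `block_scale`, and `preprocess_blocks` and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def autoscale(block):
    """Intra-block step: mean-center each column and scale it to unit variance."""
    block = np.asarray(block, dtype=float)
    centered = block - block.mean(axis=0)
    std = block.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant columns stay at zero instead of dividing by zero
    return centered / std

def block_scale(block):
    """Inter-block step: divide by the Frobenius norm so the block's total sum of squares is 1."""
    norm = np.linalg.norm(block)  # Frobenius norm for a 2-D array
    return block / norm if norm > 0 else block

def preprocess_blocks(blocks):
    """Apply the intra-block step, then the inter-block step, to every block."""
    return [block_scale(autoscale(b)) for b in blocks]

# Hypothetical data: two blocks measured on the same 20 samples,
# with very different numbers of variables and measurement scales.
rng = np.random.default_rng(0)
blocks = [rng.normal(size=(20, 3)), 100 * rng.normal(size=(20, 50))]

scaled = preprocess_blocks(blocks)
X = np.hstack(scaled)  # concatenated matrix, ready for a multiblock or PCA-type model
```

After this preprocessing, each block contributes the same total sum of squares to `X` regardless of how many variables it holds or what units they are in. Other choices, such as scaling by the square root of the number of variables, are equally plausible, and which one suits a given data set is the kind of question the article's comparison addresses.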

Updated: 2020-04-01