Automating the interpretation of PM2.5 time‐resolved measurements using a data‐driven approach
Indoor Air ( IF 4.3 ) Pub Date : 2020-12-28 , DOI: 10.1111/ina.12780
Hao Tang 1 , Wanyu Rengie Chan 2 , Michael D Sohn 2

The rapid development of automated measurement equipment enables researchers to collect greater quantities of time‐resolved data from indoor and outdoor environments. Although such data are valuable, interpreting them can be time‐consuming. This paper introduces an automated process for interpreting PM2.5 time‐resolved data and differentiating PM2.5 emissions from indoor and outdoor sources. We use Random Forest (RF), a machine learning approach, to study a dataset of 836 indoor emission events that occurred over a 2‐week period in 18 apartments in California. We describe the model's development and evaluate its performance as the sample size and source vary. We discuss which characteristics of the dataset tended to help source identification, and why. For example, we show that data from many events and from different apartments are essential for the model to be suitable for analyzing a new, separate dataset. We also show that longitudinal data appear to be more helpful than the time frequency of measurements within a given apartment. We use the resulting RF model to analyze PM2.5 data from an entirely separate dataset collected from 65 new homes in California. The RF model identifies 442 indoor emission events, with only a few misidentifications.

Updated: 2020-12-28