Unsupervised learning for feature projection: Extracting patterns from multidimensional building measurements
Energy and Buildings (IF 6.6), Pub Date: 2020-06-11, DOI: 10.1016/j.enbuild.2020.110228
Chunze Xiao, Fazel Khayatian, Giuliano Dall'O'

Data visualization is an important resource for decision makers to obtain information from large datasets. Based on data obtained from either predictions or measurements, different strategies are combined and tested to reduce energy demand whilst keeping indoor comfort at a suitable level. Although the information conveyed by a data representation can significantly influence decisions, little research has focused on extracting features from building measurements. This paper provides an in-depth view into the representation of building data, and applies three dimensionality reduction algorithms, Principal Component Analysis (PCA), an autoencoder, and t-Distributed Stochastic Neighbour Embedding (t-SNE), to measurements from a teaching building. Results show that whilst PCA returns linear representations, it also compresses the data the least, which can be useful for obtaining more general features. On the other hand, t-SNE returns the most compressed data, which is suitable for seeking large margins within a dataset. However, t-SNE may be unsuitable for datasets with recurring step-like temporal profiles. The autoencoder is the best overall option, as it captures the nonlinearities within a dataset whilst avoiding excessive data compression. Fine-tuning the hyperparameters of the studied algorithms and the perils of relying on poorly tuned models are discussed at the end of the study.
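
As a rough illustration of the linear projection the abstract attributes to PCA, the sketch below standardizes a small table of measurements and projects it onto its first two principal components using NumPy's SVD. It is not the authors' code or data: the function name `pca_project`, the four synthetic channels (temperature, CO2, power, humidity), and the occupancy schedule are all assumptions made up for the example.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (PCA via SVD)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X - mean) / std  # standardize so that units do not dominate
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    scores = Z @ components.T  # low-dimensional representation
    explained = (S ** 2) / np.sum(S ** 2)  # share of variance per component
    return scores, components, explained[:n_components]


# Hypothetical hourly measurements over two weeks (synthetic, for illustration).
rng = np.random.default_rng(0)
hours = np.arange(24 * 14)
occupied = ((hours % 24 >= 8) & (hours % 24 < 18)).astype(float)  # step-like profile
temperature = 21 + 2 * occupied + rng.normal(0, 0.3, hours.size)
co2 = 450 + 400 * occupied + rng.normal(0, 30, hours.size)
power = 5 + 20 * occupied + rng.normal(0, 1.5, hours.size)
humidity = 45 + rng.normal(0, 3, hours.size)

X = np.column_stack([temperature, co2, power, humidity])
scores, components, explained = pca_project(X, n_components=2)

print(scores.shape)  # one 2-D point per hourly sample
print(explained)     # fraction of total variance captured by each component
```

Because three of the synthetic channels follow the same occupancy schedule, most of the variance should land on the first component. An autoencoder or t-SNE would replace the linear map `Z @ components.T` with a nonlinear one, which is the contrast the abstract draws.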




Updated: 2020-06-24