Statistical insights into deep neural network learning in subspace classification
Stat (IF 0.7), Pub Date: 2020-06-25, DOI: 10.1002/sta4.273
Hao Wu, Yingying Fan, Jinchi Lv

Deep learning has benefited almost every aspect of modern big data applications. Yet its statistical properties still remain largely unexplored. It is commonly believed nowadays that deep neural networks (DNNs) benefit from representational learning. To gain some statistical insights into this, we design a simple simulation setting where we generate data from some latent subspace structure with each subspace regarded as a cluster. We empirically demonstrate that the performance of DNN is very similar to that of the two-step procedure of clustering followed by classification (unsupervised plus supervised). This motivates us to ask: Does DNN indeed mimic the two-step procedure statistically? That is, do bottom layers in DNN try to cluster first and then top layers classify within each cluster? To answer this question, we conduct a series of simulation studies, and to our surprise, none of the hidden layers in DNN conduct successful clustering. In some sense, our results provide an important complement to the common belief of representational learning, suggesting that at least in some model settings, although the performance of DNN is comparable with that of the ideal two-step procedure knowing the true latent cluster information a priori, it does not really do clustering in any of its layers. We also provide some statistical insights and heuristic arguments to support our empirical discoveries and further demonstrate the revealed phenomenon on the real data application of traffic sign recognition.
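The simulation design and the two-step baseline described in the abstract can be sketched as follows. This is a minimal illustration only: the dimensions, noise level, cluster offsets, and the plain k-means plus least-squares classifier are assumptions chosen for brevity, not the authors' actual setup or models.

```python
import numpy as np

rng = np.random.default_rng(0)
p, d, n_per, K = 20, 3, 300, 2  # ambient dim, subspace dim, points per cluster, clusters

# Generate data from K latent subspaces, each subspace regarded as one cluster.
X_parts, y_parts = [], []
for k in range(K):
    U, _ = np.linalg.qr(rng.standard_normal((p, d)))  # orthonormal basis of subspace k
    mu = 5.0 * rng.standard_normal(p)                 # offset keeping clusters separable
    Z = rng.standard_normal((n_per, d))               # latent coordinates
    w = rng.standard_normal(d)                        # cluster-specific labeling rule
    X_parts.append(mu + Z @ U.T + 0.05 * rng.standard_normal((n_per, p)))
    y_parts.append(np.sign(Z @ w))                    # binary labels in {-1, +1}
X = np.vstack(X_parts)
y = np.concatenate(y_parts)

# Step 1 (unsupervised): k-means to recover the latent clusters.
centers = X[[0, n_per]]  # one initial center drawn from each generated block
for _ in range(50):
    labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.stack([X[labels == k].mean(0) for k in range(K)])

# Step 2 (supervised): fit a separate linear classifier inside each recovered cluster.
correct = 0
for k in range(K):
    idx = labels == k
    Xk = np.hstack([X[idx], np.ones((idx.sum(), 1))])  # add intercept column
    beta = np.linalg.lstsq(Xk, y[idx], rcond=None)[0]  # least-squares sign classifier
    correct += int((np.sign(Xk @ beta) == y[idx]).sum())
acc = correct / len(y)
print(f"two-step training accuracy: {acc:.3f}")
```

Because the labeling rule differs across subspaces, no single linear classifier fits all the data; the two-step procedure succeeds only if step 1 first recovers the clusters, which is exactly the behavior the paper probes for inside the DNN's hidden layers.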
