Robust Detection and Affine Rectification of Planar Homogeneous Texture for Scene Understanding
International Journal of Computer Vision (IF 19.5), Pub Date: 2018-03-22, DOI: 10.1007/s11263-018-1078-2
Shahzor Ahmad, Loong-Fah Cheong

Man-made environments tend to be abundant with planar homogeneous texture, which manifests as regularly repeating scene elements along a plane. In this work, we propose to exploit such structure to facilitate high-level scene understanding. By robustly fitting a texture projection model to optimal dominant frequency estimates in image patches, we arrive at a projective-invariant method to localize such generic, semantically meaningful regions in multi-planar scenes. The recovered projective parameters also allow an affine-ambiguous rectification in real-world images marred by outliers, room clutter, and photometric severities. Comprehensive qualitative and quantitative evaluations show that our method outperforms existing representative work for both rectification and detection. The potential of homogeneous texture for two scene understanding tasks is then explored. First, in environments where vanishing points cannot be reliably detected, or the Manhattan assumption is not satisfied, homogeneous texture detected by the proposed approach is shown to provide alternative cues to obtain a scene geometric layout. Second, low-level feature descriptors extracted upon affine rectification of detected texture are found to be not only class-discriminative but also complementary to features without rectification, improving recognition performance on the 67-category MIT benchmark of indoor scenes. One of our configurations involving deep ConvNet features outperforms most current state-of-the-art work on this dataset, achieving a classification accuracy of 76.90%. The approach is additionally validated on a set of 31 categories (mostly outdoor man-made environments exhibiting regular, repeating structure), which is a subset of the large-scale Places2 scene dataset.
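
The abstract mentions dominant frequency estimates in image patches. As a rough illustration only, the sketch below picks the strongest non-DC peak of a patch's 2D Fourier spectrum. This is a generic FFT-based estimate, not the estimator or the texture projection model used in the paper; the function name `dominant_frequency` and the choice of a Hann window are assumptions made for this example.

```python
import numpy as np


def dominant_frequency(patch):
    """Return the (fx, fy) peak of a patch's magnitude spectrum, in cycles/pixel.

    Hypothetical helper for illustration; not the paper's estimator.
    """
    patch = np.asarray(patch, dtype=float)
    h, w = patch.shape
    # Remove the mean so the DC term does not dominate the spectrum.
    centered = patch - patch.mean()
    # A Hann window reduces spectral leakage from the patch borders.
    window = np.outer(np.hanning(h), np.hanning(w))
    spectrum = np.abs(np.fft.fft2(centered * window))
    # Ignore whatever DC energy the windowing reintroduced.
    spectrum[0, 0] = 0.0
    row, col = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    fy = np.fft.fftfreq(h)[row]
    fx = np.fft.fftfreq(w)[col]
    # The spectrum of a real image is symmetric about the origin,
    # so report the peak in a canonical half-plane (fx >= 0).
    if fx < 0 or (fx == 0 and fy < 0):
        fx, fy = -fx, -fy
    return fx, fy


# Synthetic patch: a sinusoidal grating with a known frequency.
ys, xs = np.mgrid[0:64, 0:64]
grating = np.cos(2 * np.pi * (0.125 * xs + 0.25 * ys))
fx, fy = dominant_frequency(grating)
```

Under perspective projection the local frequency of a planar texture varies across the image, so per-patch estimates like this would differ from patch to patch; fitting a projection model to that variation is the part the paper addresses and is not attempted here.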

Updated: 2018-03-22