Abstract
Effectively evaluating visual properties has become a popular topic in fine-grained visual comprehension. In this paper, we study how to estimate such visual properties from a ranking perspective, with the help of annotators from online crowdsourcing platforms. The main challenges of this task are two-fold. On one hand, the annotations are often contaminated: a small fraction of label flips might ruin the global ranking of the whole dataset. On the other hand, given the large volume of data, the annotations are often far from complete. Worse, the annotations may be imbalanced, with a small subset of samples annotated far more frequently than the rest. To address these challenges, we propose a robust ranking framework based on the Hodge decomposition of imbalanced and incomplete ranking data. According to HodgeRank theory, the major source of contamination is the cyclic ranking component of the Hodge decomposition. This leads to an outlier detection formulation as a sparse approximation of the cyclic ranking projection, which in turn yields a novel outlier detection model in the form of Huber’s LASSO from robust statistics. Moreover, simple yet scalable algorithms based on Linearized Bregman Iteration are developed to achieve an even less biased estimator. Statistical consistency of outlier detection is established in both cases under nearly the same conditions. Our studies are supported by experiments on both simulated examples and real-world data. The proposed framework provides a promising tool for robust ranking with large-scale crowdsourcing data arising from computer vision.
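The outlier detection idea summarized above can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption: pairwise labels y are modeled as score differences X·theta plus a sparse outlier vector gamma, gamma is estimated by an l1-penalized fit to the part of y that no global ranking can explain (the cyclic residual), and the solver is plain proximal gradient (ISTA) rather than the Linearized Bregman Iteration used in the paper. The function names `incidence_matrix`, `soft_threshold`, and `robust_rank` are hypothetical.

```python
import numpy as np

def incidence_matrix(pairs, n_items):
    """Row e has +1 at item i and -1 at item j for comparison e = (i, j)."""
    X = np.zeros((len(pairs), n_items))
    for e, (i, j) in enumerate(pairs):
        X[e, i], X[e, j] = 1.0, -1.0
    return X

def soft_threshold(v, lam):
    """Proximal operator of lam * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def robust_rank(pairs, y, n_items, lam=0.5, n_iter=500, tol=1e-8):
    """Assumed Huber-LASSO-style model: y = X theta + gamma + noise, gamma sparse.

    Solves min_gamma 0.5 * ||P (y - gamma)||^2 + lam * ||gamma||_1 with ISTA,
    where P projects onto the complement of range(X), i.e. the component of
    the labels that is inconsistent with any global score vector.
    """
    X = incidence_matrix(pairs, n_items)
    P = np.eye(len(pairs)) - X @ np.linalg.pinv(X)
    gamma = np.zeros(len(pairs))
    for _ in range(n_iter):
        # P is an orthogonal projector, so a unit step size is safe.
        gamma = soft_threshold(gamma + P @ (y - gamma), lam)
    # Debias: refit the scores by least squares on comparisons not flagged.
    clean = np.abs(gamma) <= tol
    theta = np.linalg.lstsq(X[clean], y[clean], rcond=None)[0]
    return theta, gamma

# Toy data (made up for illustration): 4 items, all 6 pairs, one flipped label.
pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
scores = np.array([3.0, 2.0, 1.0, 0.0])
y = np.array([scores[i] - scores[j] for i, j in pairs])
y[2] = -y[2]  # flip the (0, 3) comparison
theta, gamma = robust_rank(pairs, y, n_items=4)
print("flagged comparisons:", np.flatnonzero(np.abs(gamma) > 1e-6))
print("estimated scores (mean-centered):", theta)
```

On this toy example the sketch should flag only the flipped comparison and recover the scores up to an additive constant. The refit on unflagged comparisons is one simple way to reduce the shrinkage bias of the l1 penalty; it is not the paper's approach to bias reduction.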
Acknowledgements
This work was supported in part by the National Key R&D Program of China under Grant No. 2018AAA0102003, in part by the National Natural Science Foundation of China under Grants 61861166002, U1736219, 61976202, U1803264, 61620106009, 61931008 and 61836002, in part by the Youth Innovation Promotion Association CAS, and in part by the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDB28000000. The research of Yuan Yao was supported in part by Hong Kong Research Grant Council (HKRGC) Grant 16303817, ITF UIM/390, as well as awards from Tencent AI Lab, Si Family Foundation, and Microsoft Research-Asia.
Additional information
Communicated by Subhransu Maji.
About this article
Cite this article
Xu, Q., Xiong, J., Cao, X. et al. Evaluating Visual Properties via Robust HodgeRank. Int J Comput Vis 129, 1732–1753 (2021). https://doi.org/10.1007/s11263-021-01438-y
DOI: https://doi.org/10.1007/s11263-021-01438-y