An experimental study of graph-based semi-supervised classification with additional node information

Lebichot, Bertrand; Saerens, Marco

doi:10.1007/s10115-020-01500-0

Regular Paper
Published: 09 October 2020

Volume 62, pages 4337–4371, (2020)

Knowledge and Information Systems

Abstract

The volume of data generated by the internet and social networks is increasing every day, and there is a clear need for efficient ways of extracting useful information from them. As this information can take different forms, it is important to use all the available data representations for prediction; this is often referred to as multi-view learning. In this paper, we consider semi-supervised classification using both regular, plain, tabular data and structural information coming from a network structure (feature-rich networks). Sixteen techniques are compared and can be divided into three families: the first uses only the plain features to fit a classification model, the second uses only the network structure, and the last combines both information sources. These three settings are investigated on 10 real-world datasets. Furthermore, network embedding and well-known autocorrelation indicators from spatial statistics are also studied. Possible applications are the automatic classification of web pages or other linked documents, of nodes in a social network, or of proteins in a biological complex system, to name a few. Based on our findings, we draw some general conclusions and offer advice for tackling this particular classification task: it is clearly observed that some dataset labelings can be better explained by their graph structure or by their feature set.

Notes

Graph and network will be used interchangeably.
Recall that autocorrelation means that neighboring nodes tend to take similar values.
Hence the name autologistic.
The datasets are available at http://github.com/B-Lebichot/Research.
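The notion of autocorrelation recalled in the notes above can be made concrete with Moran's I, one of the classical autocorrelation indicators from spatial statistics. The following is a minimal sketch of the textbook formula on a small toy graph, not the authors' implementation; the function name `morans_i`, the variable `PATH_GRAPH`, and the example labelings are assumptions introduced purely for illustration.

```python
def morans_i(weights, values):
    """Textbook Moran's I for node values on a weighted graph.

    weights: square adjacency (weight) matrix given as a list of lists.
    values:  one numeric value per node (for example a 0/1 class indicator).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]               # centered node values
    s0 = sum(sum(row) for row in weights)        # total edge weight
    denom = sum(d * d for d in z)                # sum of squared deviations
    if s0 == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for an empty graph or constant values")
    # Cross-products of centered values, summed over all weighted node pairs.
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * num / denom


# Hypothetical toy graph: an undirected path 0 - 1 - 2 - 3.
PATH_GRAPH = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i(PATH_GRAPH, [1, 1, 0, 0])    # neighbors mostly agree
alternating = morans_i(PATH_GRAPH, [1, 0, 1, 0])  # neighbors always disagree
print(clustered, alternating)
```

On this toy path, the labeling in which neighboring nodes mostly share the same value yields a positive Moran's I, while the alternating labeling yields a negative one, which matches the intuition that positive autocorrelation means neighbors tend to take similar values.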

Acknowledgements

This work was partially supported by the Elis-IT project funded by the “Région wallonne” and the Brufence project supported by INNOVIRIS (“Région bruxelloise”), Belgium. We thank these institutions for giving us the opportunity to conduct both fundamental and applied research. We also thank the anonymous reviewers for their relevant remarks and suggestions, which helped us to significantly improve the manuscript.

Author information

Authors and Affiliations

Machine Learning Group – ICTEAM & LSM, Université catholique de Louvain, Place des Doyens 1, 1348, Louvain-la-Neuve, Belgium
Bertrand Lebichot & Marco Saerens

Corresponding author

Correspondence to Bertrand Lebichot.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lebichot, B., Saerens, M. An experimental study of graph-based semi-supervised classification with additional node information. Knowl Inf Syst 62, 4337–4371 (2020). https://doi.org/10.1007/s10115-020-01500-0

Received: 26 May 2018
Revised: 21 July 2020
Accepted: 25 July 2020
Published: 09 October 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s10115-020-01500-0
