Classification of origin with feature selection and network construction for folk tunes
Introduction
In folk music analysis, there is active research into building quantitative approaches to describe and locate melodies, attribute the authorship of manuscripts, and study differences and similarities between individual songs, and between songs of different geographical regions, ethnic groups, or other origins. Machine learning methods provide powerful tools for this, and benefit from various tools to digitize music [10], [30] and from existing databases [11], [36]. Classification methods can not only solidify or refute musicologists' arguments on the basis of more data than a human could analyze, but can also direct attention towards informative patterns in the data and raise new questions in musicology.
The idea of describing folk music of various countries by extracted features dates back to the cantometrics of Alan Lomax [33]. It has since been addressed with various methods (see the survey by [42]). Classification of folk music has been addressed by [22] and [44], who grouped 360 songs into 26 tune families based on human annotations of both global and local features.
Melody n-grams have been used widely for melody-based music similarity (e.g. by [23]), for genre classification [49] and for attribution of composers [27]. [8] use melodic, harmonic and rhythmic features of short sequences for style identification. [7] uses extracted patterns based on pitch and duration contours to discriminate between Swiss and Austrian music.
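As a minimal sketch of the n-gram idea discussed above (function and variable names are ours, for illustration only, not taken from any of the cited works): counting n-grams over pitch *intervals* rather than absolute pitches makes the resulting pattern counts invariant under transposition.

```python
from collections import Counter

def interval_ngrams(pitches, n=3):
    """Count pitch-interval n-grams: transposition-invariant melodic patterns."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return Counter(tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1))

# Two melodies a fifth apart share exactly the same interval trigrams.
tune_c = [60, 62, 64, 65, 64, 62, 60]   # fragment in C
tune_g = [67, 69, 71, 72, 71, 69, 67]   # same contour, transposed up a fifth
assert interval_ngrams(tune_c) == interval_ngrams(tune_g)
```

Such counts can then feed directly into similarity measures or into a feature vector for classification.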
A different approach to quantifying folk tune similarity has been taken by [45], [46], [47], who use multilevel comparison and local alignment to construct a similarity graph. Empirical evaluations of similarity measures have been introduced by [23] and by [43].
Statistical and machine learning algorithms have been applied to extracted features by [48], in particular for folk songs [4], [24], [40]. [15] classified 3367 folk songs from six geographical European regions using statistics on note counts, intervals and pitch n-grams. [18] conducted a study on audio data in which they detected tune families with k-nearest neighbours, using feature selection.
Whereas many similarity measures between songs use the edit distance, this is not easily applicable to songs of varying length. [12], [34] and [13] construct song similarity networks from feature vectors and propose a general similarity measure that avoids the problem of highly connected songs (hubs), though without using the content of the data.
In this paper, we focus on two questions. Firstly, we explore to what extent the features we use to describe melody are good predictors of a song's origin. We show some statistical differences between musical groups of origin. We train random forest classifiers to predict the origin of songs from their features, and we use the features' importance values to explore which musical elements distinguish between pairs of origins. Secondly, we are interested in song similarity, independently of origin. We construct novel similarity networks that avoid hubs while using information from classification about the importance of features. These networks allow interactive exploration of our dataset of songs by similarity, and let us study song similarity across different origins.
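A minimal sketch of one standard way to build a hub-free similarity network (this is our own simplification for illustration, not the paper's exact construction): weight each feature dimension by its classifier importance, then keep an edge only when two songs are *mutually* among each other's k nearest neighbours. The mutuality requirement is what suppresses hub nodes.

```python
import math

def weighted_dist(a, b, w):
    """Euclidean distance with per-feature weights (e.g. importances from a classifier)."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))

def mutual_knn_edges(vectors, weights, k=2):
    """Keep an edge (i, j) only if each song is among the other's k nearest
    neighbours; this symmetry requirement suppresses hub nodes."""
    n = len(vectors)
    knn = []
    for i in range(n):
        order = sorted(range(n), key=lambda j: weighted_dist(vectors[i], vectors[j], weights))
        knn.append(set(order[1:k + 1]))  # order[0] is i itself (distance 0)
    return {(i, j) for i in range(n) for j in knn[i] if i in knn[j] and i < j}

# The outlier (5, 5) lists others among its neighbours, but no song
# reciprocates, so it acquires no edges.
edges = mutual_knn_edges([(0, 0), (0, 1), (1, 0), (5, 5)], (1.0, 1.0), k=2)
```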
Section snippets
Data set
We use 80 tunes per group of origin in MIDI format, which were collected from various sources. Some were generated with the software MuseScore [25] from sheet music, and most were taken from existing collections [9], [14], [37], [38]. Only the tune itself without accompaniment was used. A list of songs used is given in https://github.com/cmetzig/Data_PatternRecognitionLetters2020. The tonic note was annotated manually, since some of the features require it, and neither the key in the MIDI files
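The manually annotated tonic matters because it lets every melody be expressed in a key-independent form. A minimal sketch of that normalization, assuming MIDI note numbers as input (the function name is ours, for illustration):

```python
def to_scale_degrees(midi_pitches, tonic):
    """Map MIDI note numbers to semitone distances above the annotated tonic
    (0..11), so that derived features are independent of the song's key."""
    return [(p - tonic) % 12 for p in midi_pitches]

# The same melodic fragment in C (tonic 60) and in G (tonic 67)
# yields identical degree sequences once normalized to the tonic.
assert to_scale_degrees([60, 62, 64, 60], tonic=60) == [0, 2, 4, 0]
assert to_scale_degrees([67, 69, 71, 67], tonic=67) == [0, 2, 4, 0]
```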
Classification
Random forests [5] are a supervised learning method based on decision trees. Individual decision trees, as classifiers, tend to have a low bias but high variance, since they have a tendency to overfit the training data [32]. In contrast, random forests collect many decision trees (here 500) which are constructed from a random subset of features. This procedure decreases variance while keeping a low bias (high quality of fit). It is possible to calculate the importance values for each feature,
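As an illustrative sketch of where such importance values come from (our own single-split simplification, not the paper's implementation): a feature's importance in a random forest is derived from the impurity decrease its splits achieve, averaged over many trees and nodes. The core quantity for one threshold split is:

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def stump_importance(values, labels):
    """Impurity decrease of the best threshold split on one feature --
    a single-node proxy for the mean-decrease-in-impurity importance
    that random forests average over many trees and nodes."""
    n = len(labels)
    base = gini(labels)
    best = 0.0
    for t in sorted(set(values))[:-1]:  # exclude max so both sides are nonempty
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        split = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, base - split)
    return best
```

A feature that separates two origins cleanly yields a large impurity decrease; a feature whose values are interleaved across classes yields a decrease near zero.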
Feature extraction
Fig. 1 shows several example features and their distribution across the dataset. The features occur with different frequencies in songs of different origin (see Fig. 2). This gives some insight into the scales used; for example, the fact that American and Spiritual songs spend relatively little time on the fourth (the fifth halftone) reflects that the pentatonic scale is widely used. German songs are mainly in a major key, and they spend a large fraction of time on the major third (fourth
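A minimal sketch of one such feature, the fraction of sounding time spent on each scale degree above the tonic (the `(pitch, duration)` input format and function name are ours, for illustration; this is not the paper's exact feature definition):

```python
from collections import defaultdict

def degree_time_fractions(notes, tonic):
    """Fraction of total sounding time spent on each semitone above the tonic.
    `notes` is a list of (midi_pitch, duration) pairs."""
    total = sum(d for _, d in notes)
    frac = defaultdict(float)
    for pitch, dur in notes:
        frac[(pitch - tonic) % 12] += dur / total
    return dict(frac)

# A pentatonic fragment in C spends no time on the fourth (5 semitones above the tonic).
notes = [(60, 1.0), (62, 1.0), (64, 2.0), (67, 1.0), (69, 1.0)]
fractions = degree_time_fractions(notes, tonic=60)
assert fractions.get(5, 0.0) == 0.0
```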
Discussion
We have explored a wide range of numerical features of songs, but of course there remain many features we have not included - for example, the sequence of first notes of each bar; sequences of every 2nd, 3rd, or 4th note; analogous sequences of intervals and rhythms; and many more. Furthermore, our feature set has some limitations that prevent expansion to very large data sets; for example, there is no systematic way to check the tonic note, which is necessary for the note length
Declaration of Competing Interest
We hereby state that we have no conflict of interest.
Acknowledgements
This work is supported by the Fusing Semantic and Audio Technologies for Intelligent Music Production and Consumption (FAST-IMPACt) project by the EPSRC (EP/L019981/1), by the Government of Canada's Canada 150 Research Chair Program, and by the EPSRC project EP/K026003/1. The support of Climate-KIC/European Institute of Innovation and Technology (ARISE project) is gratefully acknowledged.
We thank Yannick Wey, Celia Pendlebury, Julia Bishop, Thomas Binns and Michael McLoughlin for helpful
References (49)
- A scale-free distribution of false positives for a large class of audio similarity measures, Pattern Recognit. (2008)
- Decision forest: twenty years of research, Inf. Fusion (2016)
- Join my party! How can we enhance social interactions in music streaming, Proceedings of the Web Audio Conference (2019)
- The New Penguin Book of English Folk Songs (2012)
- Calculating similarity of folk song variants with melody-based features, ISMIR (2009)
- Random forests, Mach. Learn. (2001)
- An empirical comparison of supervised learning algorithms, Proceedings of the 23rd International Conference on Machine Learning (2006)
- Melody classification using patterns, Second International Workshop on Machine Learning and Music (2009)
- Musical style classification from symbolic data: a two-styles case study, International Symposium on Computer Music Modeling and Retrieval (2003)
- F. Dorfer, ...
- Tunepal: searching a digital library of traditional music scores, OCLC Syst. Serv.
- A MIREX meta-analysis of hubness in audio music similarity, ISMIR
- Mutual proximity graphs for improved reachability in music recommendation, J. New Music Res.
- Global feature versus event models for folk song classification, ISMIR
- Can shared-neighbor distances defeat the curse of dimensionality?, International Conference on Scientific and Statistical Database Management
- Evaluation of melody similarity measures
- A comparison between global and local features for computational classification of folk song melodies, J. New Music Res.
- Building predictive models in R using the caret package, J. Stat. Softw.
- Der Naturjodel in der Schweiz: Wesen, Entstehung, Charakteristik, Verbreitung: ein Forschungsergebnis über den Naturjodel in der Schweiz
- Classification and regression by randomForest, R News
- Universality and diversity in human song, Science
- Optimizing measures of melodic similarity for the exploration of a large folk song database, ISMIR
Editor: Prof. G. Sanniti di Baja.