Application of Machine Learning to Electroencephalography for the Diagnosis of Primary Progressive Aphasia: A Pilot Study

Moral-Rubio, Carlos; Balugo, Paloma; Fraile-Pereda, Adela; Pytel, Vanesa; Fernández-Romero, Lucía; Delgado-Alonso, Cristina; Delgado-Álvarez, Alfonso; Matias-Guiu, Jorge; Matias-Guiu, Jordi A.; Ayala, José Luis

doi:10.3390/brainsci11101262

by Carlos Moral-Rubio ¹, Paloma Balugo ², Adela Fraile-Pereda ², Vanesa Pytel ³, Lucía Fernández-Romero ³, Cristina Delgado-Alonso ³, Alfonso Delgado-Álvarez ³, Jorge Matias-Guiu ³, Jordi A. Matias-Guiu ³,*,† and José Luis Ayala ¹,†

¹ Department of Computer Architecture and Automation, Faculty of Informatics, Universidad Complutense de Madrid, 28040 Madrid, Spain

² Department of Neurophysiology, Institute of Neuroscience, Hospital Clínico San Carlos (IdISCC), Universidad Complutense de Madrid, 28040 Madrid, Spain

³ Department of Neurology, Institute of Neuroscience, Hospital Clínico San Carlos (IdISCC), Universidad Complutense de Madrid, 28040 Madrid, Spain

* Author to whom correspondence should be addressed.

† Jordi A. Matias-Guiu and José Luis Ayala contributed equally to this work as senior authors.

Brain Sci. 2021, 11(10), 1262; https://doi.org/10.3390/brainsci11101262

Submission received: 12 August 2021 / Revised: 22 August 2021 / Accepted: 23 September 2021 / Published: 24 September 2021

(This article belongs to the Special Issue Advances in Primary Progressive Aphasia)

Abstract

Background. Primary progressive aphasia (PPA) is a neurodegenerative syndrome whose diagnosis is usually challenging. Biomarkers are needed for diagnosis and monitoring. In this study, we aimed to evaluate electroencephalography (EEG) as a biomarker for the diagnosis of PPA. Methods. We conducted a cross-sectional study with 40 PPA patients categorized as non-fluent, semantic, and logopenic variants, and 20 controls. Resting-state EEG with 32 channels was acquired and preprocessed using several procedures (quantitative EEG, wavelet transformation, autoencoders, and graph theory analysis). Seven machine learning algorithms were evaluated (Decision Tree, Elastic Net, Support Vector Machines, Random Forest, K-Nearest Neighbors, Gaussian Naive Bayes, and Multinomial Naive Bayes). Results. Diagnostic capacity to distinguish between PPA and controls was high (accuracy 75%, F1-score 83% for the kNN algorithm). The most important features in the classification were derived from network analysis based on graph theory. Conversely, discrimination between PPA variants was lower (accuracy 58% and F1-score 60% for kNN). Conclusions. The application of ML to resting-state EEG may have a role in the diagnosis of PPA, especially in the differentiation from controls. Future studies with high-density EEG should explore the capacity to distinguish between PPA variants.

Keywords: electroencephalography; resting-state; primary progressive aphasia; biomarkers; machine learning; K-Nearest Neighbors; frontotemporal dementia; Alzheimer’s disease; graph theory

1. Introduction

Primary progressive aphasia (PPA) is a clinical syndrome secondary to the neurodegeneration of language brain regions and networks [1]. There are currently three main variants of PPA recognized in the literature: non-fluent (nfvPPA), semantic (svPPA), and logopenic (lvPPA) variants [2]. Diagnosis of PPA in the early stages is usually challenging. On the one hand, very mild word-finding difficulties may be present in aging, and the insidious onset of PPA symptoms limits early identification [3]. On the other hand, there is a certain overlap between the PPA variants, especially between nfvPPA and lvPPA [4,5]. Neuropsychological batteries and language assessments are usually time-consuming, and a high level of expertise is necessary for an adequate interpretation of the clinical findings. Although some novel and brief cognitive tests are being developed [6,7], neuroimaging and cerebrospinal fluid biomarkers are usually used to confirm the PPA diagnosis and the specific variant in each case. Structural magnetic resonance imaging (MRI) has shown adequate sensitivity and specificity for the diagnosis of svPPA, but its diagnostic properties for the other variants are poorer [8,9]. Other more advanced MRI sequences show different patterns between PPA variants, but are not generally applicable in routine practice. Regarding positron emission tomography imaging, the 18F-FDG tracer has shown adequate values for diagnosis of the three variants of PPA, especially svPPA and lvPPA [10]. Amyloid tracers may distinguish between patients with and without amyloid deposition (generally associated with lvPPA), but they are not sensitive enough to discriminate between subtypes of non-Alzheimer’s disease variants. Novel tracers, such as tau tracers, are still under investigation and are not usually available beyond research settings [11]. Consequently, the combination of several tools is often necessary to reach an adequate diagnosis. However, some of these techniques are not available in all clinical settings, which jeopardizes equality of opportunity. In recent years, there has been increasing interest in the accurate diagnosis of neurodegenerative disorders. Furthermore, early diagnosis may imply early access to language therapies, which have shown positive effects in PPA [12,13]. In addition, the classification of PPA into three clinical variants improves the prediction of the underlying pathology [14]. Thus, novel and cost-effective biomarkers are necessary for early detection and differential diagnosis between PPA variants.

One of the key processes in neurodegenerative disorders comprises alterations in brain activity and network disruptions [15]. There are several methods for measuring brain activity, which differ in spatiotemporal resolution and applicability. Some methods, such as single-unit recordings, have high spatial and temporal precision, but are invasive and are not applicable to large networks or clinical practice. Among the non-invasive methods, functional MRI, magnetoencephalography and electroencephalography (EEG) permit the assessment of brain activity across the entire brain [16]. These methods are generally well tolerated and applicable in clinical practice, and evaluate brain activity with a resolution on the scale of millimeters to centimeters. This means that each voxel of a conventional MRI or each channel of an EEG reflects the activity of thousands of neurons and billions of synapses [17]. In comparison to functional MRI, EEG shows lower spatial but higher temporal resolution. However, both techniques are regarded as useful for the assessment of brain activity and connectivity. As advanced computational algorithms promise to improve signal processing and filtering, noninvasive recording devices are increasingly being investigated and applied. Some approaches record neural potentials from the scalp and, depending on the intensity of recording, can capture the activity of thousands of neurons. Multiple layers obstruct information transmission from the cerebral cortex to the scalp, which reduces signal amplitude and spatial resolution. The electrodes are also sensitive to external interference such as eye movements, face movements, chewing, or swallowing, among others [18].

EEG is a widely available technique that is very useful for the diagnosis of epilepsy. In recent years, quantitative EEG has also been confirmed as a helpful biomarker in the assessment of several neurodegenerative disorders [19]. These studies suggest a potential clinical application of EEG in the assessment of neurodegenerative disorders, either in the differential diagnosis between them or from other non-neurodegenerative causes, including psychiatric conditions [20]. Data regarding the application of the EEG signal in PPA are scarce. In a recent study [21], three patients with nfvPPA and five with primary progressive apraxia of speech (two of them also showing aphasia) underwent EEG. Theta slowing was detected in almost all patients with nfvPPA, suggesting a potential clinical application. Another recent study detected some particular findings in the analysis of EEG microstates in 8 patients with svPPA in comparison with controls and patients with Alzheimer’s dementia [22].

Machine learning (ML) techniques may be helpful in improving the diagnostic performance of EEG, as has been shown in predicting epileptic seizures [23,24,25,26], Alzheimer’s disease [27], or depression [28,29]. The rationale for the application of ML to EEG is based on the following factors. First, the visual analysis of EEG is time-consuming and requires high levels of expertise. Second, changes in neurodegenerative disorders may be less visually evident than epileptiform activity, which also limits the inter-rater reliability. Third, filter settings, frequency bands and criteria for thresholds are not clearly defined in the setting of neurodegenerative disorders [30].

ML for EEG analysis may be divided into two approaches: feature-based and end-to-end. On the one hand, feature-based decoding algorithms have a long track record of effectiveness in various EEG decoding challenges [20,31]. In this approach, the data are usually represented by handcrafted, previously selected features. End-to-end decoding algorithms, on the other hand, accept raw or minimally pre-processed data as inputs [32,33]. To date, end-to-end deep learning has attracted considerable interest due to its success in other research disciplines. Because it does not involve handcrafting, this technique might lead to better solutions or to the discovery of unexpectedly informative characteristics, at least for feature extraction. However, in terms of the features they learn, end-to-end models have a reputation for being “black boxes”.

In this study, we aimed to evaluate the potential of EEG as a biomarker for the diagnosis of PPA. For that purpose, the raw EEG signal was pre-processed with several feature transformations to enlarge the representation domain. Applying ML models, we evaluated the diagnostic performance of EEG for the diagnosis of PPA and for the differential diagnosis between the three PPA variants.

2. Materials and Methods

2.1. Participants

Forty patients with PPA were enrolled in this study. All patients met the current diagnostic criteria for PPA [1]. Patients were evaluated with a comprehensive language and neuropsychological protocol, which has been described elsewhere [34]. Structural MRI and FDG-PET were performed in all cases supporting the clinical diagnosis. Accordingly, patients were categorized as nfvPPA (n = 18), svPPA (n = 10), and lvPPA (n = 12). Twenty controls (control group, CG) were also included for comparison. Table 1 shows the details of the groups participating in the study.

The CG was composed of patients who underwent EEG because of a previous history of syncope, and in whom visual analysis of EEG, neuroimaging, and clinical follow-up were normal, excluding potential neurological disorders.

2.2. EEG Acquisition

EEGs were recorded in a resting state condition with the eyes closed and under the supervision of trained personnel. EEGs were acquired on a NicoletOne device of 32 channels, using the standard 10/20 system and referenced to A1. Time of acquisition was 20 min.

2.3. Preprocessing

Different signal transformations were applied, aiming to expand the amount of information. Signal preprocessing was performed following the pipeline implemented by [35] in EEGLAB Software (Matlab). These procedures try to minimize the external noise and artefacts that are usually present in a raw EEG signal. The following steps were conducted, which are also summarized in Figure 1:

1. Time range selection. The original signals are too long to be analyzed and can contain additional noise, so we manually selected the time ranges with the highest signal quality to obtain the most accurate and clean signal. This process also considered the annotations recorded during the EEG acquisition and clinical assessment, which report the state of the patient, unexpected events, or activities that could affect the signal.
2. High-pass filtering at 1 Hz. This filter was applied to remove baseline noise, remove noise introduced by sweating, and prepare the signal for ICA.
3. CleanLine with the following configuration: 10 Hz of bandwidth at 50 Hz line frequency. This preprocessing step removes line noise and related harmonics from each of the scalp channels using the approach described in [36]. For that purpose, and for each sliding window over the original data, a multi-taper FFT is applied to transform the signal to the frequency domain; after that, the complex amplitude of the desired frequency is extracted. With that information, a noise signal is generated in the frequency domain and, finally, the associated time-domain noise signal that needs to be subtracted from the original one is created.
4. Re-referencing to the average. This is the most effective and easiest way to re-reference EEG data because it assumes that the activity summed across the scalp topography should be zero. In other words, we subtracted the mean over all scalp channels from every single channel to make sure that all channels contribute with the same weight.
5. Low-pass filtering at 40 Hz. This step was applied in order to remove any undesired high-frequency signal that was not removed by CleanLine. Although other investigations are looking for biomarkers in higher frequency ranges of the EEG signal, most recent research focuses on lower frequency ranges [37]. To simplify our analysis and control error sources, we limited our work to the lower frequency bands.
6. ICA (Independent Component Analysis). This method is a linear decomposition technique which aims to recover the source signals from a set of mixed signals, as occurs with EEG. Unlike PCA (Principal Component Analysis), ICA tries to retrieve those original signals that are maximally statistically independent in just one domain [38].
7. Epoching of the data into non-overlapping windows of one second.
8. Visual artifact rejection of epochs. As a final step, we manually reviewed all signals and all their epochs looking for artefacts or undesired signal events.
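
As an illustration of two of the steps above (average re-referencing and one-second epoching), the following is a minimal NumPy sketch. It is not the EEGLAB pipeline used in the study: the function names `average_rereference` and `epoch` and the random test signal are assumptions made for this example, and filtering, CleanLine and ICA are deliberately left out.

```python
import numpy as np

def average_rereference(eeg):
    """Subtract the mean over channels from every channel.

    eeg: array of shape (channels, samples).
    """
    return eeg - eeg.mean(axis=0, keepdims=True)

def epoch(eeg, fs, seconds=1.0):
    """Cut a (channels, samples) array into non-overlapping windows.

    Returns an array of shape (epochs, channels, samples_per_epoch);
    trailing samples that do not fill a whole window are dropped.
    """
    n = int(round(fs * seconds))
    n_epochs = eeg.shape[1] // n
    trimmed = eeg[:, : n_epochs * n]
    return trimmed.reshape(eeg.shape[0], n_epochs, n).transpose(1, 0, 2)

# Hypothetical recording: 32 channels, a little over 10 s at 500 Hz.
rng = np.random.default_rng(0)
fs = 500
raw = rng.normal(size=(32, 10 * fs + 123))

reref = average_rereference(raw)
epochs = epoch(reref, fs)
```

After re-referencing, the channels sum to zero at every time point, which is the property described in the re-referencing step.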

2.4. Quantitative EEG

Quantitative EEG (qEEG) is the frequency-domain transformation of the original EEG signal [39]. To obtain this transformed signal, the Discrete Fourier Transform was applied to our EEG signal, sampled at 500 Hz. Given a discrete EEG signal x_{n} of length N, the Fourier Transform (FT) is defined as follows:

X_{k} = \sum_{n = 0}^{N - 1} x_{n} e^{- 2 \pi i k n / N}, \quad 0 \le k \le N - 1.

(1)

This transformation adds a frequency-domain representation of the original signal and, hence, increases the information provided. We divided the total frequency range into non-overlapping frequency bands:

Delta from 1–4 Hz.
Theta from 4–8 Hz.
Alpha from 8–14 Hz.
Beta from 14–30 Hz.
Gamma from 30–45 Hz.
OoB (out of band) for frequencies higher than 45 Hz.
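
The band decomposition can be sketched with NumPy's real FFT. This is only an illustration under stated assumptions: the function `band_powers`, the power normalization and the synthetic test signal are choices made for this example, and the 4–8 Hz band is labelled with its conventional name, theta.

```python
import numpy as np

# Band edges in Hz, as listed above (lower edge inclusive, upper exclusive).
BANDS = {
    "delta": (1, 4),
    "theta": (4, 8),
    "alpha": (8, 14),
    "beta": (14, 30),
    "gamma": (30, 45),
}

def band_powers(x, fs):
    """Sum the DFT power of a 1-D signal inside each frequency band."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = np.abs(spectrum) ** 2 / len(x)
    out = {name: power[(freqs >= lo) & (freqs < hi)].sum()
           for name, (lo, hi) in BANDS.items()}
    out["oob"] = power[freqs >= 45].sum()  # everything above 45 Hz
    return out

# Synthetic one-second epoch at 500 Hz: strong 10 Hz plus weak 20 Hz.
fs = 500
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 10 * t) + 0.2 * np.sin(2 * np.pi * 20 * t)

powers = band_powers(x, fs)
dominant = max(powers, key=powers.get)
```

For this synthetic signal the dominant band is alpha, because the 10 Hz component carries most of the energy.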

2.5. Wavelet Transformation

Wavelet Transform (WT) is the decomposition of the original signal into a set of basis functions consisting of contractions, expansions, and translations [40]. This is a similar approach to FT but using wavelet functions to achieve the transformed domain.

The WT of the EEG signal was obtained by filtering repeatedly until reaching the desired level of decomposition. In each repetition, we applied a low-pass filter to obtain the approximation coefficients (CA) and a high-pass filter to obtain the detail coefficients (CD). After every filtering stage, the signal was down-sampled by a factor of two, halving the sampling frequency of the previous level.

A total of seven sequential subdivisions were applied until a sufficient number of transformations, all correctly subdivided in frequency, was achieved. This process provided eight signals, each one assigned to a different frequency range:

Subband 1 from 125 to 250 Hz.
Subband 2 from 62.5 to 125 Hz.
Subband 3 from 31.2 to 62.5 Hz.
Subband 4 from 15.6 to 31.2 Hz.
Subband 5 from 7.8 to 15.6 Hz.
Subband 6 from 3.9 to 7.8 Hz.
Subband 7 from 1.9 to 3.9 Hz.
Subband 8 from 0 to 1.9 Hz.

From all the extracted signals, sub-bands 1 and 2 were removed from the pipeline because neither of them offered any information after the application of step 5 in the preprocessing pipeline section (low-pass filter at 40 Hz).
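
The repeated filter-and-downsample scheme can be sketched as follows. The wavelet family used in the study is not stated here, so this example assumes the Haar wavelet purely for simplicity; the helper names `haar_step` and `decompose` are hypothetical. With a 500 Hz sampling rate, seven levels reproduce the eight frequency ranges listed above.

```python
import numpy as np

def haar_step(x):
    """One DWT level with the Haar wavelet (an assumption of this sketch).

    Returns (CA, CD): approximation and detail coefficients, each half
    as long as the input.
    """
    x = x[: len(x) // 2 * 2]          # drop a trailing odd sample
    even, odd = x[0::2], x[1::2]
    ca = (even + odd) / np.sqrt(2)    # low-pass filter + downsample by 2
    cd = (even - odd) / np.sqrt(2)    # high-pass filter + downsample by 2
    return ca, cd

def decompose(x, fs, levels=7):
    """Return a list of ((f_low, f_high), coefficients), highest band first."""
    subbands = []
    approx, top = np.asarray(x, dtype=float), fs / 2
    for _ in range(levels):
        approx, detail = haar_step(approx)
        subbands.append(((top / 2, top), detail))
        top /= 2
    subbands.append(((0.0, top), approx))  # final approximation band
    return subbands

fs = 500
x = np.random.default_rng(0).normal(size=1024)
subbands = decompose(x, fs)
ranges = [band for band, _ in subbands]
```

Because the Haar transform is orthonormal, the total energy of the coefficients equals the energy of the input, which is a convenient sanity check for any implementation.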

2.6. AutoEncoders

Autoencoders are a specific type of neural network architecture in which the target output is the same as the input. They compress the input into a lower-dimensional code and then reconstruct the output from this representation. The code is a compact “summary” or “compression” of the input, also called the latent-space representation. This modeling technique is also exploited in dimensionality reduction problems.

This architecture was created by using an Encoder-Decoder system (Figure 2). The first part, the Encoder, used fully-connected layers in which the number of input neurons is higher than the number of output neurons, to achieve that reduction of dimensionality. The second part, the Decoder, used fully-connected layers in which the number of input neurons is lower than the number of output neurons to achieve the reconstruction of the original signal.

In addition to this vanilla configuration, there are more complex configurations with multiple hidden layers in the encoding and decoding parts. Some of them even add noise to the input or to the intermediate representation as a regularization technique that forces the neural network to generalize better.

As stated, this type of neural network is usually applied as a technique for dimensionality reduction, but we decided to use it as a mechanism to transform the time domain (similarly to the FT and WT). We applied the Encoder to the EEG values along time, creating a new domain of features and information representation.
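
The encoder-decoder idea can be illustrated with a very small fully-connected autoencoder written in plain NumPy. Everything in this sketch is an assumption made for illustration (the layer sizes, the tanh activation, the learning rate, the synthetic data); it does not reproduce the architecture or training setup of the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 "epochs" of 20 samples lying close to a 3-D subspace.
latent_true = rng.normal(size=(200, 3))
X = latent_true @ rng.normal(size=(3, 20)) + 0.01 * rng.normal(size=(200, 20))
X = (X - X.mean(axis=0)) / X.std(axis=0)

n_in, n_code = X.shape[1], 3          # 20 inputs compressed to a 3-value code
W_enc = 0.1 * rng.normal(size=(n_in, n_code))
b_enc = np.zeros(n_code)
W_dec = 0.1 * rng.normal(size=(n_code, n_in))
b_dec = np.zeros(n_in)

def encode(x):
    """Encoder: more input neurons than output neurons."""
    return np.tanh(x @ W_enc + b_enc)

def decode(z):
    """Decoder: fewer input neurons than output neurons."""
    return z @ W_dec + b_dec

losses = []
lr = 0.02
for _ in range(1000):
    Z = encode(X)
    err = decode(Z) - X
    losses.append(float((err ** 2).mean()))   # mean squared reconstruction error
    # Backpropagation of the mean squared error.
    g_out = 2 * err / err.size
    g_pre = (g_out @ W_dec.T) * (1 - Z ** 2)  # derivative of tanh
    W_dec -= lr * (Z.T @ g_out)
    b_dec -= lr * g_out.sum(axis=0)
    W_enc -= lr * (X.T @ g_pre)
    b_enc -= lr * g_pre.sum(axis=0)

features = encode(X)   # the new feature domain produced by the Encoder
```

Only the encoder output (`features`) would be kept as a transformed representation; the decoder exists so that the reconstruction error can drive training.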

2.7. Graph Theory Analysis

Graph Theory Analysis (GTA) is a mathematical formalism used to model pairwise relations between objects. Here, we used the EEG sources from the different electrodes to generate the network that merges such data [41].

For that purpose, we generated a graph matrix, also called adjacency matrix, by calculating all pairs of partial correlations between all available channels or electrodes [42]. The absolute value operator was also applied to this matrix to achieve our final result, which in this case was an undirected weighted adjacency matrix.

Once the network was created, it was analyzed using different metrics:

Node degree. This metric represents the number of links detected for every node.
Path length. Mean of the shortest path lengths between pairs of nodes in the network.
Clustering coefficient. Number of triangular connections in the network, divided by the theoretical maximum number of triangular connections. This variable represents the clustering capacity of the generated network.

In addition to these metrics, we also created a brain representation that shows each electrode as a node in a graph structure, together with the connections that we obtained between those electrodes. An example of this representation can be found in Figure 3 and Figure 4, which show the connections of the CG and PPA groups, respectively, filtered by the alpha frequency range (from 8 to 14 Hz).
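
A hedged sketch of the network construction and the three metrics is given below. The partial correlations are computed from the inverse covariance matrix, which is one standard way of obtaining them; the binarization threshold used to count links is an assumption of this example, since no threshold is specified in the text, and the function names are hypothetical.

```python
import numpy as np

def partial_correlation(data):
    """Absolute partial correlations between channels.

    data: array of shape (channels, samples).
    Returns a symmetric weighted adjacency matrix with a zero diagonal.
    """
    precision = np.linalg.pinv(np.cov(data))
    d = np.sqrt(np.diag(precision))
    weights = np.abs(-precision / np.outer(d, d))
    np.fill_diagonal(weights, 0.0)
    return weights

def graph_metrics(weights, threshold):
    """Node degree, mean shortest path length and clustering coefficient.

    Links are the entries above `threshold` (an assumed binarization).
    """
    adj = (weights > threshold).astype(int)
    n = len(adj)
    degree = adj.sum(axis=1)
    # Triangles found, divided by the theoretical maximum C(n, 3).
    triangles = np.trace(adj @ adj @ adj) / 6
    clustering = triangles / (n * (n - 1) * (n - 2) / 6)
    # Floyd-Warshall shortest paths on the unweighted graph.
    dist = np.where(adj > 0, 1.0, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    reachable = dist[np.isfinite(dist) & (dist > 0)]
    path_length = reachable.mean() if reachable.size else np.inf
    return degree, path_length, clustering

# Adjacency from hypothetical data: 5 channels, 1000 samples.
rng = np.random.default_rng(0)
weights = partial_correlation(rng.normal(size=(5, 1000)))

# Toy network with a known answer: a triangle (0, 1, 2) plus the link 2-3.
toy = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (0, 2), (2, 3)]:
    toy[i, j] = toy[j, i] = 0.8
degree, path_length, clustering = graph_metrics(toy, threshold=0.5)
```

In the toy network the degrees are 2, 2, 3 and 1, one of the four possible triangles is present, and the mean shortest path length is 8/6.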

2.8. Data Analysis

2.8.1. Binary Classification Model between PPA and CG

A classification model was generated to differentiate between CG and PPA patients based only on the extracted EEG features. Seven classification algorithms were evaluated: Decision Tree, Elastic Net (EN), Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbors (kNN), Gaussian Naive Bayes (Gaussian NB) and Multinomial Naive Bayes (Multinomial NB).

The following pre-processing pipeline was applied to prepare the data for modelling:

Train-test split. In this step we randomly generated train and test samples from the original dataset by applying 80% for training sample and 20% for test. This split was stratified, namely, the proportion of examples in each class is preserved into train and test samples.
Scaling. We applied a MinMaxScaler method to each column in order to transform their range of values into the range [0, 1].
Univariate Feature Selection. A feature selection step was applied to reduce the number of features to only 50 features from the original set (309). ANOVA F-value was computed for each column-target model and only the best 50 scores were selected.
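
The three preparation steps can be sketched in NumPy as shown below. The data are synthetic and the helper names are hypothetical. Fitting the scaling on the training sample only and reusing it for the test sample is a choice of this sketch, not a detail confirmed by the text. The min-max scaling and the ANOVA F-value are written out by hand so that the example has no dependency beyond NumPy.

```python
import numpy as np

def stratified_split(y, test_fraction, rng):
    """Boolean train/test masks that preserve the class proportions."""
    test_mask = np.zeros(len(y), dtype=bool)
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        test_mask[idx[: int(round(test_fraction * len(idx)))]] = True
    return ~test_mask, test_mask

def minmax_fit(X):
    """Column minima and ranges used to map each column to [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return lo, np.where(hi > lo, hi - lo, 1.0)

def anova_f(X, y):
    """One-way ANOVA F-value of every column of X against the labels y."""
    classes = np.unique(y)
    grand = X.mean(axis=0)
    between = sum((y == c).sum() * (X[y == c].mean(axis=0) - grand) ** 2
                  for c in classes)
    within = sum(((X[y == c] - X[y == c].mean(axis=0)) ** 2).sum(axis=0)
                 for c in classes)
    return (between / (len(classes) - 1)) / (within / (len(y) - len(classes)))

# Synthetic stand-in for the dataset: 40 PPA (1), 20 controls (0), 309 features.
rng = np.random.default_rng(0)
y = np.array([1] * 40 + [0] * 20)
X = rng.normal(size=(60, 309))
X[:, 7] += 3.0 * y                    # make one synthetic feature informative

train, test = stratified_split(y, 0.2, rng)
lo, span = minmax_fit(X[train])
X_train, X_test = (X[train] - lo) / span, (X[test] - lo) / span

scores = anova_f(X_train, y[train])
top50 = np.argsort(scores)[::-1][:50]  # indices of the 50 best-scoring features
```

With 60 subjects and a 20% test fraction, the stratified split yields 12 test subjects, 8 from the larger class and 4 from the smaller one.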

In the training process, the best hyperparameters for each model were selected by a Bayes-search optimization algorithm (this algorithm defined a search space with all possible hyperparameter values and applied Bayes’ theorem to find the values that minimized the error function). All models performed a binary classification and were evaluated with 10-fold cross-validation, using the F1-Score (2) as the main metric. This metric was selected because the classification problem was imbalanced. Precision (3), Sensitivity (4), Specificity (5) and Youden Index (6) are also reported.

\text{F1-Score} = \frac{2 \cdot \text{Precision} \cdot \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}},

(2)

\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}},

(3)

\text{Sensitivity} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}},

(4)

\text{Specificity} = \frac{\text{True Negatives}}{\text{True Negatives} + \text{False Positives}},

(5)

\text{Youden Index} = \text{Sensitivity} + \text{Specificity} - 1.

(6)
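
As a worked example of Equations (2) to (6), the short function below computes the five metrics from the cells of a binary confusion matrix. The counts passed to it are hypothetical and are not taken from the results of this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Metrics of Equations (2)-(6) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    youden = sensitivity + specificity - 1
    return {
        "f1": f1,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": youden,
    }

# Hypothetical counts: 8 true positives, 2 false positives,
# 6 true negatives, 4 false negatives.
m = classification_metrics(tp=8, fp=2, tn=6, fn=4)
```

For these counts the precision is 0.8, the sensitivity 2/3, the specificity 0.75, the F1-Score 8/11 and the Youden Index 5/12.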

2.8.2. Classification Model for All Groups

Following the same pipeline described for the binary classification model, a multiclass classification model was applied in order to distinguish between nfvPPA, svPPA, lvPPA and CG. The same pre-processing steps, hyperparameter tuning techniques and cross-validation options were applied here. All models performed a multiclass classification with 4 different classes (one per group of patients), using the F1-Score as the main metric.

2.8.3. Network Analysis

We transformed each EEG signal into a network. This allowed us to extract additional network metrics and enlarge the set of features per patient, and also to evaluate the differences in activity between two brains according to these network metrics.

We also visualized the generated connections between EEG channels in each group of patients. This provided meaningful information about interactions of brain regions across the different groups of patients.

2.8.4. Principal Components Analysis

In order to explain the complexity of our working dataset, we applied Principal Component Analysis (PCA) to reduce the dimensionality. PCA aims to find the directions of maximum variance in high-dimensional data and projects the data onto a new subspace with equal or fewer dimensions than the original one. Hence, it reduces the number of features by combining them linearly. We performed a dimensionality reduction to only two principal components. Accordingly, all subjects can be visualized as data points whose coordinates are linear combinations of all the extracted features along these two principal components. In this way, it is possible to visualize the multi-dimensional data distribution and evaluate how mixed the data instances are in the representation space.
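
A two-component PCA projection can be sketched with the singular value decomposition of the centered data matrix. The function name `pca_project` and the random feature matrix are assumptions of this example.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the first principal components.

    Returns the projected points and the fraction of the total variance
    explained by each retained component.
    """
    Xc = X - X.mean(axis=0)                       # center every feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T             # linear combinations of features
    explained = S ** 2 / (S ** 2).sum()
    return scores, explained[:n_components]

# Hypothetical feature matrix: 60 subjects, 50 selected features.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 50))
scores, explained = pca_project(X)
```

Each row of `scores` gives the two coordinates of one subject, which can then be plotted and colored by group to inspect how mixed the groups are.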

3. Results

3.1. Classification Model between CG and PPA

Main metrics are summarized in Table 2.

Seven different models were evaluated for the binary classification: Decision Tree, EN, SVM, RF, kNN, Gaussian NB and Multinomial NB (Table 2). kNN model, a non-parametric supervised classification method, achieved the best performance, showing a Sensitivity of 0.88, an F1-Score value of 0.83 and a Specificity of 0.5. The confusion matrix from the best model (kNN) is shown in Figure 5, as well as its ROC curve in Figure 6.

The 20 most important variables used for training are depicted in Figure 7. A Decision Tree model is included in Figure 8. Regarding the most relevant variables, all variables except one were generated by the network transformations. Specifically, 40% from Node Degree, 50% from Clustering Coefficient, 5% Path Length, and 5% qEEG. Similarly, most features used in the Decision Tree algorithm are associated with network analysis.

3.2. Classification Model between All Groups

A multiclass model (4 classes) was developed to evaluate the possibility of automatic detection among all the PPA variants and CG. The same aforementioned models were evaluated (Table 3). Again, kNN model achieved the best performance. However, Sensitivity was 0.58 and F1-Score was 0.6. Confusion matrix for this model is shown in Figure 9.

As in the previous analysis, Figure 10 shows the graphical representation of the Decision Tree Model. In this case, decisions are mainly based on features obtained from network analysis and Autoencoder transformations.

4. Discussion

In this pilot study, we evaluated the diagnostic performance of a resting-state EEG obtained in clinical practice conditions for the diagnosis of PPA. We applied ML models, as they may be helpful to maximize the diagnostic capacity from many variables with no a priori hypotheses. In this regard, diagnostic performance was relatively high for the detection of patients with PPA in comparison with the control group. This suggests that there are certain EEG abnormalities that may be detected in patients with PPA. The most important features ranked by the algorithms for the classification and included in the decision trees algorithms involve mainly temporal and frontal channels in both hemispheres. Interestingly, features derived from network analysis obtained the best classification, emphasizing the role of graph theory in the analysis of EEG data [32]. These findings are consistent with recent investigations that are exploiting this new area of analysis [43,44].

Conversely, the application of EEG to the diagnosis of the specific variant of PPA did not achieve a satisfactory classification. Previous studies using quantitative data from EEG for the differential diagnosis of neurodegenerative disorders have generally obtained better results [31]. For instance, applying support vector machines, an accuracy of 91% was found in distinguishing Alzheimer’s disease from dementia with Lewy bodies, and 88% in distinguishing Alzheimer’s disease from Frontotemporal dementia [45]. Another study achieved an accuracy of 93.3% in classifying between Alzheimer’s disease and Frontotemporal dementia [46]. However, these studies were performed with small samples, and were not replicated in larger studies [47]. The application of EEG to PPA is probably more challenging, due to the regional overlap between PPA variants in contrast to other disorders such as Alzheimer’s disease and Frontotemporal dementia. In this regard, high-density EEG with a larger number of channels might obtain better results in the classification between PPA variants.

Our key insight is that machine learning itself can deal well with errors, qualitative and corrupted data and, more importantly for our purposes, integrate heterogeneous data from multiple domains. With this aim, our research work has enlarged the dataset to increase the information representation. The applied machine learning algorithms can jointly manage transformations from time to frequency domain, wavelet or network representation provided the setup parameters are carefully selected. In [48], a total of 49 experimental studies published from 2009 until 2020, which apply machine learning algorithms on resting-state EEG recordings from AD patients, were reviewed. These works did not evaluate the benefits of increased information representation in classification accuracy. Most of the studies focused on AD detection incorporating Support Vector Machines (SVM). Conversely, we found that classification algorithms based on distance (similar to kNN, where the function is only approximated locally and all computation is deferred until function evaluation) can improve performance.
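
The local, deferred nature of kNN can be seen in a few lines of NumPy: nothing is fitted in advance, and each prediction is a majority vote among the closest training points. The Euclidean distance, the value k = 3 and the toy data are assumptions of this sketch, not the tuned configuration of the study.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance).

    y_train must contain non-negative integer class labels.
    """
    # Pairwise distances, shape (n_query, n_train); all work happens here,
    # at prediction time.
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in y_train[nearest]])

# Toy data: two well-separated groups in a 2-D feature space.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])
pred = knn_predict(X_train, y_train, np.array([[0.5, 0.5], [5.5, 5.5]]))
```

Because the decision depends only on nearby points, the classifier does not require the classes to be globally separable by a single boundary.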

The multi-dimensional data used in our study, and the complexity of this dataset, were visualized with a PCA. Using this method, we found that patients are not clearly separated (Figure 11). Class separation is important for optimal classification performance in SVM models [49]. In contrast, the kNN model, which is based on local distances, could lead to better predictions. This explains why we observed better performance of the kNN model with respect to the SVM model. Additionally, the kNN model works better with a small number of features [50], which is our case after the application of the feature selection method. To explore this further, we replicated the same pipeline in the CG vs. PPA classification, skipping the feature selection phase. Thus, we compared the results of all models using only the best 50 features against using all generated features. As shown in Table 4, the SVM model obtained better performance than kNN. However, the absolute values of the quality metrics (F1-score, precision, sensitivity and accuracy) justify the need for the feature selection process followed in our study.

The trend of increased information representation may be seen in recent works like [51] (still SVM in AD), or [52] (where, apart from EEG, a wide range of diagnostic tests were included). Our approach has focused exclusively on the classification possibilities of the EEG signal for PPA, but we expect that our results could be improved by the addition of other tests such as neuroimaging, cognitive assessment, or genetics.

Patients included in this study fulfilled the current diagnostic criteria, with both MRI and FDG-PET supporting the diagnosis. As they were generally in early stages and EEG in comparison with controls was discriminative, these findings raise the possibility to explore in future studies the role of EEG in the clinical follow-up of patients with PPA, especially in the setting of clinical trials, in which reliable, reproducible and non-invasive endpoints are necessary [53].

Our study has some limitations. First, we included 40 patients, generally in early stages but not at the first consultation. Future studies should enroll a larger sample and focus specifically on patients in the early stages to confirm a potential role of EEG in the detection of PPA. Second, in this study, we only applied ML to EEG data. One of the main strengths of ML is the combination of multiple sources of information. Thus, the application of ML to studies including multimodality assessments (cognitive testing, MRI, PET) may be of interest to disentangle the best tools (isolated or in combination) for the diagnosis of PPA [54].

5. Conclusions and Future Work

Our study shows that the application of ML to resting-state EEG may have a role in the diagnosis of PPA, especially in the differentiation from controls. EEG may have some advantages compared with other biomarkers.

ML techniques were applied to evaluate the possibility to automatically classify EEG data from PPA patients with respect to a control group. Our work showed that a feature expansion process can increase the information representation and achieve good classification accuracy, using mainly features from the graph-network representation of the EEG signal. The capability to classify PPA variants was also evaluated. Although lower, the classification capacity is still promising and advises further development of these automatic techniques for phenotype classification from EEG signals.

We are currently increasing the sample size to improve the classification accuracy of the models. In addition, we aim to extend the frequency range of the input dataset (above 45 Hz) to evaluate whether higher-frequency components may aid biomarker discovery with machine learning and deep learning methods.
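To illustrate why the upper frequency limit matters for spectral features, the sketch below computes relative band power on a synthetic signal with and without a band above 45 Hz. It assumes NumPy; the sampling rate, the band edges, and the 80 Hz upper bound are illustrative assumptions and are not taken from the study's acquisition or preprocessing settings.

```python
import numpy as np

fs = 256.0        # assumed sampling rate in Hz (hypothetical)
duration = 4.0    # seconds of synthetic signal
t = np.arange(0, duration, 1.0 / fs)

# Synthetic single-channel signal: strong 10 Hz (alpha) plus weak 60 Hz activity.
signal = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)

# Periodogram: squared magnitude of the real FFT at each frequency bin.
freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
power = np.abs(np.fft.rfft(signal)) ** 2

def relative_band_power(bands):
    """Return each band's share of the power summed over all listed bands."""
    absolute = {
        name: power[(freqs >= low) & (freqs < high)].sum()
        for name, (low, high) in bands.items()
    }
    total = sum(absolute.values())
    return {name: value / total for name, value in absolute.items()}

# Illustrative band edges; conventions vary between laboratories.
bands_to_45 = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
               "beta": (13, 30), "low_gamma": (30, 45)}
bands_extended = dict(bands_to_45, high_gamma=(45, 80))

narrow = relative_band_power(bands_to_45)
wide = relative_band_power(bands_extended)
print({name: round(value, 3) for name, value in wide.items()})
```

With the 45 Hz cap, the 60 Hz component contributes to no feature at all; with the extended range it appears as a separate band. In real recordings, extending the range also brings mains interference (50 Hz in Europe) into the analysed band, so line-noise handling becomes part of the feature pipeline.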

Author Contributions

Conceptualization, J.A.M.-G. and J.L.A.; methodology, P.B., A.D.-Á., A.F.-P., C.M.-R.; formal analysis, C.M.-R. and J.L.A.; investigation, P.B., A.F.-P., V.P., L.F.-R., C.D.-A. and J.A.M.-G.; data curation, A.D.-Á., P.B.A, A.F.-P., V.P., C.D.-A. and J.A.M.-G.; writing—original draft preparation, C.M.-R., P.B., J.L.A. and J.A.M.-G.; writing—review and editing, All; supervision, J.M.-G.; project administration, J.A.M.-G.; funding acquisition, J.M.-G., J.A.M.-G. and J.L.A. All authors have read and agreed to the published version of the manuscript.

Funding

Jordi A. Matias-Guiu is supported by Instituto de Salud Carlos III through the project INT20/00079 (co-funded by European Regional Development Fund “A way to make Europe”).

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of Hospital Clínico San Carlos (protocol code 17/247, date of approval 27 June 2017).

Informed Consent Statement

Written informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

Jordi A. Matias-Guiu is supported by Instituto de Salud Carlos III through the project INT20/00079 (co-funded by European Regional Development Fund “A way to make Europe”). We also acknowledge support from the Spanish Ministry of Science and Innovation under project PID2019-110866RB-I00.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ML	Machine Learning
EEG	Electroencephalogram
MRI	Magnetic Resonance Imaging
PET	Positron Emission Tomography
qEEG	quantitative EEG
MCI	Mild Cognitive Impairment
AD	Alzheimer Disease
ICA	Independent Component Analysis
PCA	Principal Component Analysis
WT	Wavelet Transform
GTA	Graph Theory Analysis
OoB	Out of Bag
CG	Control Group
PPA	Primary Progressive Aphasia
nfvPPA	Non-Fluent Primary Progressive Aphasia
svPPA	Semantic Primary Progressive Aphasia
lvPPA	Logopenic Primary Progressive Aphasia
SVM	Support Vector Machine
SD	Standard Deviation
ROC	Receiver Operating Characteristic
kNN	k-Nearest Neighbors
NB	Naive Bayes

References

1. Gorno-Tempini, M.L.; Hillis, A.E.; Weintraub, S.; Kertesz, A.; Mendez, M.; Cappa, S.F.; Ogar, J.M.; Rohrer, J.D.; Black, S.; Boeve, B.F.; et al. Classification of primary progressive aphasia and its variants. Neurology 2011, 76, 1006–1014.
2. Marshall, C.R.; Hardy, C.J.D.; Volkmer, A.; Russell, L.L.; Bond, R.L.; Fletcher, P.D.; Clark, C.N.; Mummery, C.J.; Schott, J.M.; Rossor, M.N.; et al. Primary progressive aphasia: A clinical approach. J. Neurol. 2018, 265, 1474–1490.
3. Stiver, J.; Staffaroni, A.M.; Walters, S.M.; You, M.Y.; Casaletto, K.B.; Erlhoff, S.J.; Possin, K.L.; Lukic, S.; La Joie, R.; Rabinovici, G.D.; et al. The Rapid Naming Test: Development and initial validation in typically aging adults. Clin. Neuropsychol. 2021, 1–22.
4. Matías-Guiu, J.A.; Cabrera-Martín, M.N.; Moreno-Ramos, T.; Valles-Salgado, M.; Fernandez-Matarrubia, M.; Carreras, J.L.; Matías-Guiu, J. Amyloid and FDG-PET study of logopenic primary progressive aphasia: Evidence for the existence of two subtypes. J. Neurol. 2015, 262, 1463–1472.
5. Tetzloff, K.A.; Whitwell, J.L.; Utianski, R.L.; Duffy, J.R.; Clark, H.M.; Machulda, M.M.; Strand, E.A.; Josephs, K.A. Quantitative assessment of grammar in amyloid-negative logopenic aphasia. Brain Lang. 2018, 186, 26–31.
6. Matias-Guiu, J.A.; Pytel, V.; Hernández-Lorenzo, L.; Patel, N.; Peterson, K.A.; Matías-Guiu, J.; Garrard, P.; Cuetos, F. Spanish Version of the Mini-Linguistic State Examination for the Diagnosis of Primary Progressive Aphasia. J. Alzheimers Dis. 2021.
7. Epelbaum, S.; Saade, Y.M.; Flamand Roze, C.; Roze, E.; Ferrieux, S.; Arbizu, C.; Nogues, M.; Azuar, C.; Dubois, B.; Tezenas du Montcel, S.; et al. A Reliable and Rapid Language Tool for the Diagnosis, Classification, and Follow-Up of Primary Progressive Aphasia Variants. Front. Neurol. 2020, 11, 571657.
8. Sajjadi, S.A.; Sheikh-Bahaei, N.; Cross, J.; Gillard, J.H.; Scoffings, D.; Nestor, P.J. Can MRI Visual Assessment Differentiate the Variants of Primary-Progressive Aphasia? AJNR Am. J. Neuroradiol. 2017, 38, 954–960.
9. Matias-Guiu, J.A.; Cabrera-Martín, M.N.; Matías-Guiu, J.; Carreras, J.L. FDG-PET/CT or MRI for the Diagnosis of Primary Progressive Aphasia? AJNR Am. J. Neuroradiol. 2017, 38, E63.
10. Matías-Guiu, J.A.; Cabrera-Martín, M.N.; Pérez-Castejón, M.J.; Moreno-Ramos, T.; Rodríguez-Rey, C.; García-Ramos, R.; Ortega-Candil, A.; Fernandez-Matarrubia, M.; Oreja-Guevara, C.; Matías-Guiu, J.; et al. Visual and statistical analysis of ¹⁸F-FDG PET in primary progressive aphasia. Eur. J. Nucl. Med. Mol. Imaging 2015, 42, 916–927.
11. Josephs, K.A.; Martin, P.R.; Botha, H.; Schwarz, C.G.; Duffy, J.R.; Clark, H.M.; Machulda, M.M.; Graff-Radford, J.; Weigand, S.D.; Senjem, M.L.; et al. [¹⁸F]AV-1451 tau-PET and primary progressive aphasia. Ann. Neurol. 2018, 83, 599–611.
12. Henry, M.L.; Hubbard, H.I.; Grasso, S.M.; Mandelli, M.L.; Wilson, S.M.; Sathishkumar, M.T.; Fridriksson, J.; Daigle, W.; Boxer, A.L.; Miller, B.L.; et al. Retraining speech production and fluency in non-fluent/agrammatic primary progressive aphasia. Brain 2018, 141, 1799–1814.
13. Henry, M.L.; Hubbard, H.I.; Grasso, S.M.; Dial, H.R.; Beeson, P.M.; Miller, B.L.; Gorno-Tempini, M.L. Treatment for Word Retrieval in Semantic and Logopenic Variants of Primary Progressive Aphasia: Immediate and Long-Term Outcomes. J. Speech Lang. Hear. Res. 2019, 62, 2723–2749.
14. Bergeron, D.; Gorno-Tempini, M.L.; Rabinovici, G.D.; Santos-Santos, M.A.; Seeley, W.; Miller, B.L.; Pijnenburg, Y.; Keulen, M.A.; Groot, C.; van Berckel, B.N.M.; et al. Prevalence of amyloid-β pathology in distinct variants of primary progressive aphasia. Ann. Neurol. 2018, 84, 729–740.
15. McMackin, R.; Muthuraman, M.; Groppa, S.; Babiloni, C.; Taylor, J.P.; Kiernan, M.C.; Nasseroleslami, B.; Hardiman, O. Measuring network disruption in neurodegenerative diseases: New approaches using signal analysis. J. Neurol. Neurosurg. Psychiatry 2019, 90, 1011–1020.
16. Vinjamuri, R. (Ed.) Advances in Neural Signal Processing; IntechOpen: London, UK, 2020.
17. Logothetis, N.K. What we can do and what we cannot do with fMRI. Nature 2008, 453, 869–878.
18. Paszkiel, S. Analysis and Classification of EEG Signals for Brain–Computer Interfaces; Springer: Berlin, Germany, 2020.
19. Popa, L.L.; Dragoș, H.-M.; Strilciuc, Ș.; Pantelemon, C.; Mureșanu, I.; Dina, C.; Văcăraș, V.; Muresanu, D. Added Value of QEEG for the Differential Diagnosis of Common Forms of Dementia. Clin. EEG Neurosci. 2021, 52, 201–210.
20. Metin, S.Z.; Erguzel, T.T.; Ertan, G.; Salcini, C.; Kocarslan, B.; Cebi, M.; Metin, B.; Tanridag, O.; Tarhan, N. The Use of Quantitative EEG for Differentiating Frontotemporal Dementia From Late-Onset Bipolar Disorder. Clin. EEG Neurosci. 2018, 49, 171–176.
21. Utianski, R.L.; Caviness, J.N.; Worrell, G.A.; Duffy, J.R.; Clark, H.M.; Machulda, M.M.; Whitwell, J.L.; Josephs, K.A. Electroencephalography in primary progressive aphasia and apraxia of speech. Aphasiology 2018, 33, 1410–1417.
22. Grieder, M.; Koenig, T.; Kinoshita, T.; Utsunomiya, K.; Wahlund, L.O.; Dierks, T.; Nishida, K. Discovering EEG resting state alterations of semantic dementia. Clin. Neurophysiol. 2016, 127, 2175–2181.
23. Subasi, A.; Kevric, J.; Abdullah Canbaz, M. Epileptic seizure detection using hybrid machine learning methods. Neural Comput. Appl. 2019, 31, 317–325.
24. Hügle, M.; Heller, S.; Watter, M.; Blum, M.; Manzouri, F.; Dumpelmann, M.; Schulze-Bonhage, A.; Woias, P.; Boedecker, J. Early Seizure Detection with an Energy-Efficient Convolutional Neural Network on an Implantable Microcontroller. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–7.
25. Kiral-Kornek, I.; Roy, S.; Nurse, E.; Mashford, B.; Karoly, P.; Carroll, T.; Payne, D.; Saha, S.; Baldassano, S.; O’Brien, T.; et al. Epileptic Seizure Prediction Using Big Data and Deep Learning: Toward a Mobile System. EBioMedicine 2018, 27, 103–111.
26. Mirowski, P.; Madhavan, D.; LeCun, Y.; Kuzniecky, R. Classification of patterns of EEG synchronization for seizure prediction. Clin. Neurophysiol. 2009, 120, 1927–1940.
27. Lehmann, C.; Koenig, T.; Jelic, V.; Prichep, L.; John, R.E.; Wahlund, L.; Dodge, Y.; Dierks, T. Application and comparison of classification algorithms for recognition of Alzheimer’s disease in electrical brain activity (EEG). J. Neurosci. Methods 2007, 161, 342–350.
28. Cai, H.; Sha, X.; Han, X.; Wei, S.; Hu, B. Pervasive EEG diagnosis of depression using Deep Belief Network with three-electrodes EEG collector. In Proceedings of the 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016, Shenzhen, China, 15–18 December 2016; pp. 1239–1246.
29. Hosseinifard, B.; Moradi, M.H.; Rostami, R. Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal. Comput. Methods Programs Biomed. 2013, 109, 339–345.
30. Gemein, L.; Schirrmeister, R.; Chrabaszcz, P.; Wilson, D.; Boedecker, J.; Schulze-Bonhage, A.; Hutter, F.; Ball, T. Machine-learning-based diagnostics of EEG pathology. NeuroImage 2020, 220, 117021.
31. Garn, H.; Coronel, C.; Waser, M.; Caravias, G.; Ransmayr, G. Differential diagnosis between patients with probable Alzheimer’s disease, Parkinson’s disease dementia, or dementia with Lewy bodies and frontotemporal dementia, behavioral variant, using quantitative electroencephalographic features. J. Neural Transm. 2017, 124, 569–581.
32. Vecchio, F.; Miraglia, F.; Alù, F.; Orticoni, A.; Judica, E.; Cotelli, M.; Rossini, P.M. Contribution of Graph Theory Applied to EEG Data Analysis for Alzheimer’s Disease Versus Vascular Dementia Diagnosis. J. Alzheimers Dis. 2021, 82, 871–879.
33. Ieracitano, C.; Mammone, N.; Hussain, A.; Morabito, F.C. A Convolutional Neural Network based self-learning approach for classifying neurodegenerative states from EEG signals in dementia. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8.
34. Matias-Guiu, J.A.; Suárez-Coalla, P.; Pytel, V.; Cabrera-Martín, M.N.; Moreno-Ramos, T.; Delgado-Alonso, C.; Delgado-Álvarez, A.; Matías-Guiu, J.; Cuetos, F. Reading prosody in the non-fluent and logopenic variants of primary progressive aphasia. Cortex 2020, 132, 63–78.
35. SCCN. Makoto’s Preprocessing Pipeline. 2021. Available online: https://sccn.ucsd.edu/wiki/Makoto’s_preprocessing_pipeline (accessed on 21 June 2021).
36. SCCN. CleanLine. 2021. Available online: https://github.com/sccn/cleanline (accessed on 21 June 2021).
37. Maturana-Candelas, A.; Gómez, C.; Poza, J.; Ruiz-Gómez, S.J.; Hornero, R. Inter-band Bispectral Analysis of EEG Background Activity to Characterize Alzheimer’s Disease Continuum. Front. Comput. Neurosci. 2020, 14, 70.
38. Beharelle, A.R.; Small, S.L. Chapter 64-Imaging Brain Networks for Language: Methodology and Examples from the Neurobiology of Reading. In Neurobiology of Language; Hickok, G., Small, S.L., Eds.; Academic Press: San Diego, CA, USA, 2016; pp. 805–814.
39. Smailovic, U.; Jelic, V. Neurophysiological Markers of Alzheimer’s Disease: Quantitative EEG Approach. Neurol. Ther. 2019, 8, 37–55.
40. Cheong, L.C.; Sudirman, R.; Hussin, S. Feature extraction of EEG signal using wavelet transform for autism classification. ARPN J. Eng. Appl. Sci. 2015, 10, 8533–8540.
41. Mulders, P.; Eijndhoven, P.; Beckmann, C. Identifying Large-Scale Neural Networks Using fMRI; Academic Press: Cambridge, MA, USA, 2016; pp. 209–237.
42. Jalili, M.; Knyazeva, M.G. Constructing brain functional networks from EEG: Partial and unpartial correlations. J. Integr. Neurosci. 2011, 10, 213–232.
43. Prakash, B.; Baboo, G.K.; Baths, V. A Novel Approach to Learning Models on EEG Data Using Graph Theory Features-A Comparative Study. Big Data Cogn. Comput. 2021, 5, 39.
44. Wadhera, T. Brain network topology unraveling epilepsy and ASD Association: Automated EEG-based diagnostic model. Expert Syst. Appl. 2021, 186, 115762.
45. Snaedal, J.; Johannesson, G.H.; Gudmundsson, T.E.; Blin, N.P.; Emilsdottir, A.L.; Einarsson, B.; Johnsen, K. Diagnostic accuracy of statistical pattern recognition of electroencephalogram registration in evaluation of cognitive impairment and dementia. Dement. Geriatr. Cogn. Disord. 2012, 34, 51–60.
46. Lindau, M.; Jelic, V.; Johansson, S.E.; Andersen, C.; Wahlund, L.O.; Almkvist, O. Quantitative EEG abnormalities and cognitive dysfunctions in frontotemporal dementia and Alzheimer’s disease. Dement. Geriatr. Cogn. Disord. 2003, 15, 106–114.
47. Caso, F.; Cursi, M.; Magnani, G.; Fanelli, G.; Falautano, M.; Comi, G.; Leocani, L.; Minicucci, F. Quantitative EEG and LORETA: Valuable tools in discerning FTD from AD? Neurobiol. Aging 2012, 33, 2343–2356.
48. Tzimourta, K.D.; Christou, V.; Tzallas, A.T.; Giannakeas, N.; Astrakas, L.G.; Angelidis, P.; Tsalikakis, D.; Tsipouras, M.G. Machine Learning Algorithms and Statistical Approaches for Alzheimer’s Disease Analysis Based on Resting-State EEG Recordings: A Systematic Review. Int. J. Neural Syst. 2021, 31, 2130002.
49. Winters-Hilt, S.; Merat, S. SVM clustering. BMC Bioinform. 2007, 8 (Suppl. 7), S18.
50. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer Series in Statistics; Springer New York Inc.: New York, NY, USA, 2001.
51. Vecchio, F.; Miraglia, F.; Alù, F.; Menna, M.; Judica, E.; Cotelli, M.; Rossini, P.M. Classification of Alzheimer’s Disease with Respect to Physiological Aging with Innovative EEG Biomarkers in a Machine Learning Implementation. J. Alzheimers Dis. 2020, 75, 1253–1261.
52. Gaubert, S.; Houot, M.; Raimondo, F.; Ansart, M.; Corsi, M.C.; Naccache, L.; Sitt, J.D.; Habert, M.O.; Dubois, B.; De Vico Fallani, F.; et al. A machine learning approach to screen for preclinical Alzheimer’s disease. Neurobiol. Aging 2021, 105, 205–216.
53. Pytel, V.; Cabrera-Martin, M.; Delgado-Alvarez, A.; Ayala, J.; Balugo, P.; Delgado-Alonso, C.; Yus, M.; Carreras, M.; Carreras, J.; Matias-Guiu, J.; et al. Personalized repetitive transcranial magnetic stimulation for primary progressive aphasia. J. Alzheimers Dis. 2021.
54. Matias-Guiu, J.A.; Díaz-Álvarez, J.; Cuetos, F.; Cabrera-Martín, M.N.; Segovia-Ríos, I.; Pytel, V.; Moreno-Ramos, T.; Carreras, J.L.; Matías-Guiu, J.; Ayala, J.L. Machine learning in the clinical and language characterisation of primary progressive aphasia variants. Cortex 2019, 119, 312–323.

Figure 1. Preprocessing pipeline.

Figure 2. Encoder-Decoder architecture of AutoEncoder.

Figure 3. Network representation for CG group, including nodes, connections and their strength in alpha frequency band.

Figure 4. Network representation for PPA group, including nodes, connections and their strength in alpha frequency band.

Figure 5. Confusion matrix from kNN binary classification (PPA vs. CG) model for the test set.

Figure 6. ROC curve from kNN binary classification (PPA vs. CG) model.

Figure 7. Representation of the 20 most important features.

Figure 8. Representation of decisions in Decision Tree binary model. Scores represent the ANOVA F-value. Five most relevant variables are shown in red.

Figure 9. Confusion matrix from kNN multiclass classification model.

Figure 10. Representation of decisions in Decision Tree multiclass model.

Figure 11. PPA vs. CG in PCA two dimensions.

Table 1. Main clinical and demographic characteristics.

Characteristic	PPA	nfvPPA	svPPA	lvPPA
Number of participants	40	18 (45%)	10 (25%)	12 (30%)
Age (years)	68.7 ± 6.94	68.55 ± 7.29	66.80 ± 6.35	70.50 ± 6.97
Women	26 (65%)
Years of education	13.90 ± 4.26	13.33 ± 4.41	14.20 ± 4.15	14.50 ± 4.35
Years since symptom onset	4.00 ± 2.25	4.83 ± 1.94	4.00 ± 2.98	2.75 ± 1.42
ACE-III	55.78 ± 26.59	71.76 ± 22.07	53.89 ± 15.72	48.00 ± 23.37
CDR-FTLD (Sum of boxes)	2.6 ± 1.81	2.22 ± 1.54	2.60 ± 1.67	3.16 ± 2.26

Table 2. Metrics from classification models for PPA vs. CG.

Model	F1-Score	Precision	Sensitivity	Accuracy
Decision Tree	0.38	0.39	0.38	0.42
kNN	0.83	0.78	0.88	0.75
SVM	0.58	0.72	0.86	0.58
Random Forest	0.37	0.32	0.43	0.58
Elastic Net	0.4	0.33	0.5	0.66
Gaussian NB	0.78	0.9	0.75	0.83
Multinomial NB	0.73	0.73	0.75	0.75
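
For reference, the four columns of Table 2 are related through the binary confusion matrix. The sketch below is an illustration rather than the study's code, and the counts passed to the function are hypothetical; they are not read from Figure 5.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute F1, precision, sensitivity (recall) and accuracy from counts."""
    precision = tp / (tp + fp)        # share of predicted positives that are correct
    sensitivity = tp / (tp + fn)      # share of true positives that are detected
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # harmonic mean
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {"f1": f1, "precision": precision,
            "sensitivity": sensitivity, "accuracy": accuracy}

# Hypothetical counts with PPA as the positive class (NOT taken from Figure 5).
metrics = binary_metrics(tp=7, fn=1, fp=2, tn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a single positive class, F1 always lies between precision and sensitivity. When metrics are instead averaged over classes (for example, macro-averaging), a tabulated F1 need not equal the harmonic mean of the tabulated precision and sensitivity.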

Table 3. Metrics from classification models for 4 groups (nfvPPA, svPPA, lvPPA, and CG).

Model	F1-Score	Precision	Sensitivity	Accuracy
Decision Tree	0.32	0.32	0.4	0.42
kNN	0.6	0.68	0.58	0.58
SVM	0.39	0.40	0.46	0.5
Random Forest	0.39	0.38	0.48	0.5
Elastic Net	0.34	0.31	0.33	0.38
Gaussian NB	0.27	0.25	0.31	0.33
Multinomial NB	0.2	0.2	0.25	0.25

Table 4. Metrics from classification models for PPA vs. CG using all columns (with no feature selection).

Model	F1-Score	Precision	Sensitivity	Accuracy
Decision Tree	0.49	0.50	0.50	0.50
kNN	0.46	0.6	0.38	0.42
SVM	0.50	0.56	0.56	0.50
Random Forest	0.40	0.33	0.50	0.67
Elastic Net	0.40	0.33	0.50	0.67
Gaussian NB	0.37	0.32	0.44	0.58
Multinomial NB	0.56	0.56	0.56	0.58

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

