Introduction

Coronaviruses are a large family of viruses that cause illnesses ranging from the common cold to more serious diseases such as severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS), and COVID-19. They were discovered in the 1960s and continued to be studied until the mid-1980s (Tyrrell and Bynoe 1965; Visy et al. 1991). These viruses are common in mammals and birds in general, but seven coronaviruses that infect humans have also been identified (Zhou et al. 2020). The most recent of these, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), became widespread among people in Wuhan, China, in December 2019 and then spread across the world in a short time (Talaee et al. 2020).

As of December 25, 2020, approximately 1.8 million people had lost their lives to this disease. Despite great efforts, the lack of a definitive solution for this disease deeply concerns researchers and the medical world. Several measures to protect people during this pandemic have been proposed. These measures are limited to social distancing, frequent hand washing, and avoiding touching the face as much as possible. Based on recent research, the most common way this virus spreads is through the droplets released into the air when a person with COVID-19 coughs or sneezes (What is Coronavirus and|Johns Hopkins Medicine n.d). Since this virus behaves differently across subjects and has no fixed symptoms, it has been difficult for studies to reach the main goal of virus diagnosis. Common symptoms of COVID-19 include headache, fever, cough, and fatigue (Schuller et al. 2020). Other symptoms include shortness of breath, muscle pain and sore throat, and sometimes diarrhea and vomiting (Clinical Manifestations and of Patients with Coronavirus Disease 2019 (COVID-19) in a Referral Center in Iran - PubMed n.d; Brown et al. 2020). Among these symptoms, shortness of breath and dry cough associated with respiratory failure are the major causes of death. Unfortunately, the coronavirus sometimes appears to affect essential organs, namely the heart, kidneys, brain, and lungs (Sadegh Beigee et al. 2020). Despite all the information available about this virus, there are still many uncertainties about its behavior, virus-patient interaction, and the course of the pandemic (Features n.d). Based on the studies, the elderly and chronic patients are the most affected by this virus (Yang et al. 2020; Abrishami et al. 2020; Coronavirus Disease n.d). According to pioneering research, some COVID-19 patients act only as virus carriers. The data show that 81% of COVID-19 patients merely act as carriers without very serious symptoms (Features n.d).
In the remaining 19%, symptoms occur approximately 2–14 days after exposure to the virus. A study of 181 patients with COVID-19 found that the average incubation period was 5 days, and that 97% of the people who developed symptoms did so within 11.5 days of exposure to the virus (Lauer et al. 2019; When is someone infectious|FAQ n.d).

A detailed assessment of the COVID-19 situation as of May 2020, approximately 5 months into the outbreak, was carried out by Bchetnia et al. (2020). This study reviewed the literature on COVID-19 and served as an information repository. It also presented a comparative analysis of SARS-CoV, MERS-CoV, and SARS-CoV-2, together with the life cycle of the virus and how it affects infected cells.

Clinical and radiological features of COVID-19 and its detailed effects on chronic kidney disease have been studied by medical researchers (Abrishami et al. 2020). It has been observed that individuals with kidney disease are vulnerable to severe COVID-19. In another medical study (Sadegh Beigee et al. 2020), damage to lung tissue biopsy samples of people who died due to COVID-19 was detected. Thus, it is hoped that when research is expanded, it will shed light on the development of effective treatment methods. The behavior changes of this virus and its effects on the human body constituted the main target of many other medical studies (Rajapakse and Dixit 2020; Guo et al. 2019; Oliveros et al. 2019; Paterson et al. 2020).

The general diagnostic method of this virus in the world is the PCR test. However, it seems that this method is not adequate to contain this global disease. The reasons for this are the limitation of the number of tests because of temporal factors, the low availability and high cost of clinical tests despite the great demands for them, and the fact that these tests are largely dependent on hospital and clinic visits (Subirana et al. 2020). This mandatory visit can actually be problematic, as the COVID-19 virus is thought to last for hours and even days on different surfaces (Doremalen et al. 2020; Opinion|Hospitals are overwhelmed because of the coronavirus n.d). Moreover, this face-to-face test endangers healthcare professionals and medical personnel and carries a serious risk. Since the appearance of this virus, test results were initially determined after as long as 10 days, so it was a huge waste of time. As time passed, faster types of tests evolved (Detect COVID-19 in as Little as 5 Minutes|Abbott Newsroom n.d), but the decrease in accuracy raised different concerns (FDA says Abbott’s 5-minute Covid-19 test may miss infected patients n.d). After understanding the virus behavior to some extent, two different methods, X-ray and CT scan, were used for diagnosis (Zhang et al. 2020; Narin et al. 2020; Li et al. 2019; Ye et al. 2019). These methods seem to be as successful as the PCR test (Subirana et al. 2020; Gietema et al. 2020; Adams et al. 2020) but also require a clinic or hospital visit.

In the age of technology, machine learning techniques can be used to control this pandemic. These techniques build a model from experience by adapting to the data, without explicit programming (Shuja et al. 2020; Lan et al. 2018). Their most important advantage is that they remove the need to visit hospital- or clinic-based COVID-19 test centers (Imran et al. 2020). In short, they aim to make life easier within the scope of digital health by combining computer science and engineering with the information obtained in medicine (Schuller et al. 2020). As another example, prediction of the virus spread can be facilitated by the application of artificial intelligence (AI) (Hu et al. 2020; Wang et al. 2020; Maghded et al. 2020). In Schuller et al. (2020), the contribution of sound analysis to the diagnosis of COVID-19 was analysed. It was observed that the computer audition (CA) technique is an accessible and successful tool in diagnosis. These machine listening tools have only recently begun to be taken into account by researchers during the global epidemic. Based on technological information, it is predicted that the sensors of smartphones will also be useful in the early diagnosis of this virus (Maghded et al. 2020). In another smartphone-based study, a method capable of cheap, early, and reliable diagnosis of COVID-19 was developed (Fozouni et al. 2020).

The work done in the field of COVID-19 machine learning is generally divided into two domains, image analysis and sound analysis. Cough, a major symptom of COVID-19, is the basis of sound-based studies (Schuller et al. 2020; Belkacem et al. 2020). In fact, this symptom is an early harbinger of many human diseases (Chang et al. 2008; Higenbottam 2002; Windmon et al. 2019). Cough samples from COVID-19 patients can be classified with acceptable success using machine learning and signal processing techniques (Brown et al. 2020). A statistically intensive study developed a model for the analysis of recorded COVID-19 cough sound signals (Bagad et al. 2020). In the proposed artificial intelligence model, findings that facilitate the diagnosis of COVID-19 were found in the statistical properties of cough sounds recorded on the phone. In another artificial intelligence study based on cough samples, a fast and reliable pre-screening diagnostic mechanism was proposed to prevent the spread of coronavirus and to reduce the workload of healthcare institutions (Chowdhury et al. 2021). This deep learning model can provide convenience by giving great confidence to patient candidates who suspect COVID-19 and different respiratory diseases. A crowdsourced dataset consisting of 355 participants was analysed in detail using a deep neural network model. This pilot study showed that COVID-19 detection is possible from cough and breath sounds. The neural network model reached an area under the receiver operating characteristic curve (AUC) of 0.846 (Coppock et al. 2021). In a study focusing on two datasets, it was shown that cough plays a very strong role in the detection of COVID-19 based on cough, breath, and speech sounds (Pahar and Niesler 2021). In the analysis of cough, breath, and speech sounds, the highest AUC values for COVID-19 classification were 0.93, 0.92, and 0.91, respectively.
The importance of cough in the diagnosis of COVID-19 has become a common point of many studies. Based on this information, and by applying deep learning to validated samples, a cough analysis system and rapid infection detection platform were realized with promising AUC results (Andreu-Perez et al. 2021). Another comprehensive study focused on different types of cough and distinguished patients with COVID-19 from patients with asthma, patients with bronchitis, and healthy individuals with 95.04% success (Pal and Sankarasubbu 2021). In cough samples recorded using a smartphone app, COVID-19 was classified with an AUC of 97% using the ResNet50 classifier, owing to the distinct pattern of the cough (Laguarta et al. 2020).

It is an important and critical point that the dataset used to obtain information be safe (Shuja et al. 2020; Belkacem et al. 2020). Using these datasets, all researchers around the world have focused on a single target. It is aimed to expand and verify the data within the scope of cooperation by using open-source data in AI-based research by carrying out different studies on an engineering basis (Shuja et al. 2020). In fact, the possibility of easy access to open-source data is a key step in machine learning (Yates et al. 2018). These useful data lead to rich and scientific studies with high availability, accuracy, and clarity (Frazer et al. 2020).

In this study, the diagnosis of COVID-19 was made within the scope of classification using an appropriate combination of machine learning and “GitHub” open-source datasets (GitHub n.d). “Virufy” is a voluntary trustworthy organization to identify patterns caused by COVID-19 coughing noises. This organization offers free COVID-19 cough datasets as a pioneer in bringing innovation to human life, health, and the industry sector. The main goal of this study is to conduct an easy and early diagnosis based on the classification technique during this pandemic by focusing on cough sound samples. This model, based on computer algorithms and the analysis of cough samples, makes predictions and decisions by taking into account the training data. Thus, it is predicted that early diagnosis of the virus will be possible by using effective signal processing methods in this dataset consisting of positive and negative samples of COVID-19. As a result, thanks to the collaboration of technology and engineering, people with suspected infection will be less likely to seek clinical treatment. This research is a basic study in terms of the early diagnosis of COVID-19 using the machine learning technique. This pandemic has the capacity to offer encouraging and reliable information by making more and more detailed research in the academic and scientific field in a technological sense. The results showed that COVID-19 cough sound samples can be diagnosed with acceptable classification accuracy with software applications in homes and workplaces via smart tools (Dunne et al. 2020).

Methods and test protocol

COVID-19 dataset and subjects

The dataset was collected at the hospital or clinic under the supervision of physicians or medical personnel. The cough sound data were recorded with the "Virufy" mobile app upon the request of Stanford University and made available on “GitHub”. After the preprocessing steps were carefully carried out on the data, each sample was labeled as positive or negative according to the result of the PCR test (GitHub n.d). The subject demographics are presented in detail in Table 1. A data pool of 121 segments obtained from cough samples of 16 subjects was created. This segmentation yielded a total of 73 non-COVID and 48 COVID-19 coughs. The segmentation preprocessing step contributes greatly to the analysis by clarifying the importance and predominance of infected regions (Shuja et al. 2020; Shan et al. 2020). In this segmentation, the cough time of each subject was taken as 1640 milliseconds (ms), and the sampling frequency was 48,000 Hz.
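
The figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that every segment lasts the stated 1640 ms and that the sampling frequency is expressed in Hz; the function and constant names are hypothetical and not taken from the original study.

```python
# Values quoted in the text; treating 48,000 as Hz and 1640 ms as the
# length of every segment are assumptions made for this illustration.
SAMPLE_RATE_HZ = 48_000
SEGMENT_MS = 1640
N_NON_COVID = 73
N_COVID = 48


def samples_per_segment(duration_ms, sample_rate_hz):
    """Number of audio samples in one fixed-length cough segment."""
    return duration_ms * sample_rate_hz // 1000


def class_fraction(count, total):
    """Share of the data pool that belongs to one class."""
    return count / total


total_segments = N_NON_COVID + N_COVID
segment_len = samples_per_segment(SEGMENT_MS, SAMPLE_RATE_HZ)
covid_share = class_fraction(N_COVID, total_segments)

print("segments:", total_segments)
print("samples per segment:", segment_len)
print("COVID-19 share:", round(covid_share, 3))
```

The class shares show that the data pool is moderately imbalanced, which is one reason why sensitivity and specificity are reported alongside accuracy later in the paper.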

Table 1 Demographics of subjects suspected of COVID-19 (GitHub n.d)

Data processing

Figure 1 displays the flow chart that highlights the stages of the study in detail.

Fig. 1 Flow diagram of data processing

Cough data STFT-based spectrogram analysis

One way to evaluate how the frequency and phase content of a signal changes over time is the short-time Fourier transform (STFT) (Sejdić et al. 2009). This technique divides a long signal into short segments of equal length and applies the Fourier transform to each segment separately (Kehtarnavaz 2008). For a signal \(x\left( t \right)\) in the time domain, the STFT is given by Eq. (1) (He et al. 2016; Bajric et al. 2016).

$$Y\left( {t,f} \right) = STFT\left( {x\left( t \right)} \right) = \int\limits_{{ - \infty }}^{{ + \infty }} {x\left( p \right)w\left( {p - t} \right)e^{{ - 2\pi jfp}} } ~dp~~$$
(1)

where \(x\left( t \right)\) is the continuous-time signal and \(w\left( t \right)\) is the window function.

The visualization of this method (Krishnan et al. 2010), which provides a balance between time and frequency resolution (Mitrović et al. 2010), is called a spectrogram or waterfall plot and represents the squared magnitude of the STFT coefficients (Sairamya et al. 2019).

Because audio signals are non-stationary and unpredictable, the windowing process focuses the analysis on the signal properties at a given time. Thanks to these overlapping windows, each windowed portion of the non-stationary signal can be treated as approximately stationary. The length of the sliding window is an important factor in this transformation (Manshouri et al. 2020). The purpose of the window is to select a time segment of the signal whose frequency properties are almost invariant (Boashash 2003). Considering the properties of smooth windows, the Hamming window (Harris 1987) was selected in this study by trial and error (Manshouri and Kayikcioglu 2019). Furthermore, the size and overlap of the window were selected as 2400 and 600 samples, respectively.
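
As an illustration of the windowing described above, the following is a minimal discrete STFT sketch in plain Python. It is not the implementation used in the study: it assumes that "overlap" means the number of samples shared by consecutive windows (so the hop is the window size minus the overlap), that no padding is applied at the signal edges, and it uses a naive DFT on a short toy tone so that it runs quickly. All function names are hypothetical.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_count(n_samples, win_size, overlap):
    """Number of full windows that fit when no padding is used."""
    hop = win_size - overlap
    if n_samples < win_size:
        return 0
    return 1 + (n_samples - win_size) // hop


def stft_power(signal, win_size, overlap):
    """Squared-magnitude STFT: one list of bin powers per frame."""
    hop = win_size - overlap
    window = hamming(win_size)
    n_bins = win_size // 2 + 1  # non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - win_size + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(win_size)]
        powers = []
        for k in range(n_bins):
            coeff = sum(
                seg[i] * cmath.exp(-2j * math.pi * k * i / win_size)
                for i in range(win_size)
            )
            powers.append(abs(coeff) ** 2)
        frames.append(powers)
    return frames


# Toy example: a 100 Hz tone sampled at 800 Hz (not the study's data).
fs = 800
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(400)]
spec = stft_power(tone, win_size=64, overlap=16)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * fs / 64
print(len(spec), peak_hz)
```

Under the same assumptions, `frame_count(78720, 2400, 600)` gives the number of windows for a 1640 ms segment at 48,000 Hz with the window settings stated above; a toolbox that pads the signal would return a different count.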

A spectrogram analysis was performed for each recording of COVID-19 and non-COVID cough sounds; then, the average spectrogram was calculated for each class. Finally, taking into account the STFT-based spectrograms for COVID-19 and non-COVID cough sounds, the difference in power spectral density (PSD) was calculated for analysis. When the spectrograms were interpreted, the time and frequency slices that best reflected the PSD difference were considered for the classification of COVID-19 positive and negative samples. The dominant time and frequency intervals reflecting the PSD are given in Table 2. These time and frequency intervals formed the basis of the feature extraction method.

Table 2 The dominant time and frequency intervals based on the spectrogram

STFT and MFCC feature extraction techniques

The feature extraction methods were chosen on the basis of a literature review in the field of audio signal analysis. Two techniques were used, namely STFT and mel-frequency cepstral coefficients (MFCC). STFT is widely used in the analysis of audio signals (Mitrović et al. 2010; Mateo and Talavera 2020). In the STFT method (Mehala and Dahiya n.d; Mehala and Dahiya n.d), the features were chosen from 13 effective frequency sub-bands and their corresponding times in order to keep the number of features small. Using trapezoidal numerical integration (Trapz) over these effective time and frequency intervals, a feature matrix of dimension 121 × 13 was obtained.
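
The band-integration step can be sketched as follows. This is a hedged illustration, not the authors' code: the grid, the interval limits, and the names `trapz` and `band_feature` are assumptions, and the actual intervals are those listed in Table 2. Each feature is taken here as the trapezoidal integral of the spectrogram power over one time-frequency rectangle, so 13 rectangles would give 13 features per segment.

```python
def trapz(y, x):
    """Trapezoidal rule for samples y taken at (possibly uneven) points x."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def band_feature(power, times, freqs, t_range, f_range):
    """Integrate power[t][f] over one time-frequency rectangle."""
    t_idx = [i for i, t in enumerate(times) if t_range[0] <= t <= t_range[1]]
    f_idx = [j for j, f in enumerate(freqs) if f_range[0] <= f <= f_range[1]]
    # Integrate over frequency inside each selected frame, then over time.
    per_frame = [
        trapz([power[i][j] for j in f_idx], [freqs[j] for j in f_idx])
        for i in t_idx
    ]
    return trapz(per_frame, [times[i] for i in t_idx])


# Toy spectrogram with constant power 1.0 on a small hypothetical grid.
times = [0.0, 0.1, 0.2, 0.3]          # seconds
freqs = [0.0, 100.0, 200.0, 300.0]    # Hz
power = [[1.0 for _ in freqs] for _ in times]

feature = band_feature(power, times, freqs, t_range=(0.0, 0.2), f_range=(100.0, 300.0))
print(feature)
```

For a constant power of 1.0 the integral equals the area of the rectangle (0.2 s by 200 Hz), which gives a quick sanity check on the integration order.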

A second, more complex sound feature extraction technique was also employed to distinguish COVID-19 from non-COVID samples. The MFCC technique (Gonzalez 2013) is known to be compatible with the way the frequency resolution of the human ear varies across bands (Hossan et al. 2010). In addition, MFCC is generally known as a highly successful technique in sound recognition systems (Winursito et al. 2018; Wang and Lawlor 2017). In a different study, MFCC was shown to be a useful method for the differentiation of dry and wet coughs (Chatrzarrin et al. 2011). The working principle of MFCC can be briefly summarized as signal windowing, discrete Fourier transform (DFT) application, calculating the logarithm of the coefficients' magnitude, warping frequencies to a mel scale, and the application of discrete cosine transform (DCT) (Rao 2017). The detailed MFCC feature extraction flow chart is depicted in Fig. 2.

Fig. 2 MFCC feature extraction flowchart

The mel spectrum is calculated by passing the Fourier-transformed signal through a series of band-pass filters. The conversion from physical frequency (\(f\)), in Hz, to the mel scale can be expressed as in Eq. (2).

$$f_{{{\text{Mel}}}} = 2595\log _{{10}} \left( {1 + \frac{f}{{700}}} \right)$$
(2)

In this technique, the Hamming window was chosen with a window size of 1024 samples and an overlap of 512 samples. Finally, the feature matrix obtained from MFCC had a dimension of 121 × 13.
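
Equation (2) and its inverse can be written directly in code. The sketch below is illustrative: the function names, the number of filters, and the use of the Nyquist frequency (24,000 Hz for a 48,000 Hz sampling rate) as the upper band limit are assumptions, not details reported in the study.

```python
import math


def hz_to_mel(f_hz):
    """Eq. (2): physical frequency in Hz to mel."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of Eq. (2): mel back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min_hz, f_max_hz, n_filters):
    """Filter edge frequencies in Hz, equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_filters + 2)]


# Hypothetical filter bank: 13 filters between 0 Hz and the Nyquist frequency.
edges = mel_band_edges(0.0, 24_000.0, n_filters=13)
print([round(e) for e in edges])
print(round(hz_to_mel(1000.0), 1))
```

Because the edges are equally spaced in mel, they are narrow at low frequencies and progressively wider at high frequencies, which is the behavior the text attributes to the human ear.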

SVM classification algorithm

In the Classification Learner application (Train models to classify data using supervised machine learning - MATLAB n.d) of MATLAB, the obtained features were classified with the default parameters of the classifiers. Based on these results, the SVM classifier was found to be more successful than the other methods. This supervised learning model is a powerful technique for two-class problems. Various studies have reported that this classifier outperforms other classification methods (Hsu and Lin 2002; Joachims 1998). The advantages that distinguish SVM include its reliability, good generalization, strong theoretical basis, and clear geometric interpretation of the classification (Mavroforakis and Theodoridis 2006; El-Naqa et al. 2002). This classifier, which is based on statistical learning theory, minimizes structural risk (Romero et al. 2015). The working principle of SVM is almost the same for linear and nonlinear problems: it essentially consists of finding a hyperplane that separates positive and negative samples (Lin et al. 2008).

In the present study, linear and nonlinear SVM models were developed for cough-based COVID-19 diagnosis in the positively and negatively labeled problem. A brief description of linear and nonlinear SVM is given below.

Linear SVM

The linear SVM algorithm is effective when the training data are linearly separable. The \(n\)-point training dataset is defined as \(\left( {x_{1} ,y_{1} } \right),~ \ldots ,~\left( {x_{n} ,y_{n} } \right)\), where \(y_{i}\) is the class label and \(x_{i}\) is the pattern to be classified. The main problem is to build a decision function that can properly classify any input \(x\). To this end, the decision function \(g\left( x \right)\) is defined linearly as in Eq. (3).

$$g\left( x \right) = w^{T} x + b$$
(3)

where \(w\) represents the hyperplane normal vector and \(b\) is a scalar (El-Naqa et al. 2002; Romero et al. 2015). In a two-class problem, the training examples are therefore separated by the hyperplane \(w^{T} x + b = 0\).

SVM chooses the plane with the maximum margin between the two classes from different hyperplanes for the training set (Burges 1998). The mathematical analysis of optimum hyperplane calculation is explained in detail in El-Naqa et al. (2002).
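
To make Eq. (3) concrete, the sketch below evaluates the linear decision function for a hand-picked hyperplane and classifies a point by the sign of \(g(x)\). The weights are hypothetical and are not fitted to any data; the margin formula \(2/\|w\|\) is the standard result for a hyperplane scaled so that the closest training points satisfy \(g(x) = \pm 1\).

```python
import math


def decision(w, b, x):
    """Eq. (3): g(x) = w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Class label from the side of the hyperplane g(x) = 0."""
    return 1 if decision(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the hyperplanes g(x) = +1 and g(x) = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane, chosen by hand for illustration only.
w = [3.0, 4.0]
b = -5.0

print(decision(w, b, [3.0, 4.0]), predict(w, b, [3.0, 4.0]))
print(decision(w, b, [0.0, 0.0]), predict(w, b, [0.0, 0.0]))
print(margin_width(w))
```

Maximizing the margin is therefore equivalent to minimizing \(\|w\|\) subject to every training point being on the correct side, which is the optimization problem referred to above.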

Nonlinear SVM and kernel functions

By using a nonlinear operator in the decision function, the linear SVM can be extended to a nonlinear classifier (El-Naqa et al. 2002; Burges 1998; Viitaniemi et al. 2015). This operator maps the data into a higher-dimensional feature space, and computing inner products in that space directly through a kernel function is known as the “kernel trick” (Li et al. 2014).

Because different kernel functions are available, it is essential to choose an appropriate one (Lei 2017). The most common kernel function in SVM classification is the radial basis function (RBF). In this research, the two most commonly used kernel types in SVM studies were used, namely polynomial and RBF (Advances in Kernel Methods|The MIT Press n.d). These kernels are defined in Eqs. (4) and (5).

$$K_{{polynomial}} \left( {x,y} \right) = \left( {x^{T} y + 1} \right)^{p}$$
(4)
$$K_{{RBF}} \left( {x,y} \right) = {\text{exp}}\left( { - \frac{{\left\| {x - y} \right\|^{2} }}{{2\sigma ^{2} }}} \right)$$
(5)

\(p,\sigma > 0\) are constants defining the kernel order and width, respectively. In this study, the parameter \(p\) was chosen as 2. Hyper-parameters were automatically optimized using “fitcsvm” (Train support vector machine (SVM) classifier for one-class and binary classification - MATLAB fitcsvm n.d).
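
The two kernels in Eqs. (4) and (5) can be evaluated as in the following sketch. The function names and the default value of `sigma` are assumptions for illustration; only \(p = 2\) is stated in the text, and the kernel width used in the study was set by the automatic optimization mentioned above.

```python
import math


def polynomial_kernel(x, y, p=2):
    """Eq. (4): (x^T y + 1)^p, with p = 2 as stated in the text."""
    return (sum(a * b for a, b in zip(x, y)) + 1.0) ** p


def rbf_kernel(x, y, sigma=1.0):
    """Eq. (5): exp(-||x - y||^2 / (2 sigma^2)); sigma is a placeholder value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


u = [1.0, 2.0]
v = [3.0, 4.0]

print(polynomial_kernel(u, v))        # (1*3 + 2*4 + 1)^2
print(rbf_kernel(u, u))               # identical points give the maximum value
print(rbf_kernel(u, v, sigma=2.0))    # decays with squared distance
```

The RBF value depends only on the distance between the two feature vectors, so a smaller `sigma` makes the classifier more local, while a larger one makes it smoother.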

Classifier training

To assess the skill of the proposed model, leave-one-out (LOO) and hold-out (HO) cross-validation were used. In k-fold cross-validation (Shultz et al. 2011), the dataset is divided into \(k\) approximately equal-sized subsets, or folds (Webb et al. 2011). If these subsets are denoted \(N_{1} ,~N_{2} ,~ \ldots ,~N_{k}\), the classifier learning algorithm is applied \(k\) times, for \(j = 1\) to \(k\), each time using the union of all subsets except \(N_{j}\) as the training set and \(N_{j}\) as the test set.

The LOO method, which is closely related to the jackknife method (Efron 1982), is a special form of cross-validation in which the number of folds equals the number of samples in the dataset. A single sample is selected as the test set and all other samples form the training set, and this procedure is repeated once for each sample (Yuan et al. 2010; Cawley 2006).

The HO technique (Giresi et al. 2005) is a simple and comprehensive technique among cross-validation methods (Understanding 8 types of Cross-Validation|by Satyam Kumar | Towards Data Science n.d; Hold-out vs. Cross-validation in Machine Learning | by Eijaz Allibhai|Medium n.d). In this technique, the dataset is randomly divided into two sets, the training set and the test set. After a model is built on the training set, it is evaluated on the test set. A common split between training and testing is 80% and 20%, respectively. In this study, the training and test split was 50.41% and 49.58%. However, in the COVID-19 positive and negative two-class training data, two different separations were taken into account. This cross-validation process was repeated 1000 times.
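
The two validation schemes can be sketched as index generators. This is an illustration rather than the study's code: the function names are hypothetical, and the training-set size of 61 out of 121 segments is an assumption inferred from the reported 50.41% share. The sketch also ignores the class-wise separations mentioned above and splits the pooled indices at random.

```python
import random


def leave_one_out(n_samples):
    """Yield (train_indices, test_indices) with exactly one test sample per split."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], [i]


def hold_out(n_samples, n_train, rng):
    """One random split into n_train training indices and the rest for testing."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return sorted(idx[:n_train]), sorted(idx[n_train:])


N_SEGMENTS = 121
N_TRAIN = 61  # assumed from the reported 50.41% training share

loo_splits = list(leave_one_out(N_SEGMENTS))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
train_idx, test_idx = hold_out(N_SEGMENTS, N_TRAIN, rng)

# Repeating the hold-out split, as done 1000 times in the study.
repeated = [hold_out(N_SEGMENTS, N_TRAIN, rng) for _ in range(1000)]

print(len(loo_splits), len(train_idx), len(test_idx), len(repeated))
```

In practice, a classifier would be trained and scored inside each split, and the reported performance would be the average over all LOO folds or over the repeated hold-out runs.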

Evaluating the performance of classification

A confusion matrix is a useful tool for evaluating classification performance. The parameters obtained from the elements of the confusion matrix are accuracy, sensitivity, and specificity (Manshouri et al. 2020). These parameters were used for a detailed performance evaluation of the SVM classification technique. Their mathematical definitions are given in Eqs. (6), (7), and (8). The non-COVID class was defined as the positive samples, and the COVID class as the negative samples. The matrix is defined in Table 3, in which P = Positive, N = Negative, TP = True Positive, FP = False Positive, TN = True Negative, and FN = False Negative.

$$Accuracy = \frac{{TP + TN}}{{TP + TN + FP + FN}}$$
(6)
$$Sensitivity = ~\frac{{TP}}{{TP + FN}}$$
(7)
$$Specificity = \frac{{TN}}{{TN + FP}}$$
(8)
Table 3 Confusion matrix
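
Equations (6) to (8) translate directly into code. The counts in the sketch below are made up for illustration and are not results from this study; the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),   # Eq. (6)
        "sensitivity": tp / (tp + fn),                 # Eq. (7)
        "specificity": tn / (tn + fp),                 # Eq. (8)
    }


# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Sensitivity and specificity are computed within a single class each, so they remain informative when the two classes are of unequal size, whereas accuracy alone can be dominated by the larger class.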

Results

The spectrograms depicting the difference in PSD between COVID-19 positive and negative cough sounds from the time and frequency perspectives are illustrated in Figs. 3 and 4, respectively. The three-dimensional (3D) graph of this PSD difference is presented in Fig. 5. In short, this graph represents the PSD difference between the average COVID-19 and non-COVID cough sounds. Focusing on the graph, the PSD difference in this cough sound analysis is evident up to 0.65 s. This distinction is useful for the classification of COVID-19 positive and negative cough sounds. Based on the information obtained from these graphs, effective time–frequency intervals were taken into account for feature extraction. SVM classification results using the STFT feature extraction technique for LOO and HO cross-validation are presented in Tables 4 and 5, respectively. The corresponding results for the MFCC feature extraction method are given in Tables 6 and 7. Overall, the classification of COVID-19 cough sounds appears successful with the suggested methods. The MFCC technique yields a higher success rate than STFT feature extraction, and the RBF kernel appears more successful than the linear and polynomial kernels among the SVM types. In addition, the classification performance parameters largely agreed with each other. For the MFCC feature extraction method, the number of overlapping windows that gives the highest success for the sum of the 13 mel coefficients was also examined. This analysis is shown in detail in Fig. 6, where the best accuracy was obtained with 11 overlapping windows for almost all three SVM classifier types. The confusion matrix for the most successful feature extraction method identified in Table 6, including sensitivity, specificity, precision, and negative predictive value, is displayed in Fig. 7. In Fig. 7, labels 1 and 0 indicate COVID-19 and non-COVID, respectively.

Fig. 3 Spectrogram graph from a time perspective

Fig. 4 Spectrogram graph from a frequency perspective

Fig. 5 3D spectrogram

Table 4 SVM classification results using the STFT feature extraction technique for LOO cross-validation
Table 5 SVM classification results using the STFT feature extraction technique for the two training-set splits of HO cross-validation
Table 6 SVM classification results using the MFCC feature extraction technique for LOO cross-validation
Table 7 SVM classification results using the MFCC feature extraction technique for the two training-set splits of HO cross-validation
Fig. 6 Results of SVM based on the number of overlapping windows by considering 13 mel coefficients

Fig. 7 Confusion matrix for the best classification result

Feature selection

Choosing the relevant features is a critical step in building a suitable machine learning model. A feature selection technique (Pudil et al. 1994) has many advantages, including simplifying the model so that it is easier to understand, decreasing the size of the feature vector, and greatly reducing training time (Chandrashekar and Sahin 2014). Since the signals received for analysis often carry unnecessary information, it is useful to remove these data without losing too much information (Bermingham et al. 2015). The presence of such irrelevant data degrades the accuracy of the model because it causes the model to be trained on irrelevant features (Belkacem et al. 2020; Langley 1994; Zhao et al. 2010).

The sequential forward search (SFS) method, which is frequently used in the scientific field (Xue et al. 2016; Jain et al. 2000), was applied as a feature selection technique in this study. After performing SFS, a peak accuracy of 95.86% was observed on the dataset when using the best 9 features from among the 13 (Fig. 8). Also, as a result of the application of the SFS technique, the success rate of RBF kernel SVM classification increased from 94.21 to 95.86% (Pahar et al. 2020). For this successful feature selection, the confusion matrix is presented in Fig. 9.
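
The greedy procedure behind SFS can be sketched as below. This is a generic illustration, not the study's implementation: the scoring function here is a made-up stand-in, whereas in the study the score would be the cross-validated accuracy of the RBF kernel SVM on the chosen subset of the 13 features. The names `sequential_forward_selection` and `toy_score` are hypothetical.

```python
def sequential_forward_selection(n_features, score):
    """Greedy SFS: each round adds the feature that gives the best score.

    Returns the best subset seen, its score, and the full selection history.
    """
    selected = []
    remaining = set(range(n_features))
    history = []
    while remaining:
        # Try every unused feature and keep the one with the highest score.
        best_feature = max(sorted(remaining), key=lambda f: score(selected + [f]))
        selected.append(best_feature)
        remaining.remove(best_feature)
        history.append((list(selected), score(selected)))
    best_subset, best_score = max(history, key=lambda item: item[1])
    return best_subset, best_score, history


USEFUL = {0, 2, 5}  # hypothetical informative features


def toy_score(subset):
    """Stand-in for cross-validated accuracy: rewards useful features only."""
    hits = len(set(subset) & USEFUL)
    misses = len(set(subset) - USEFUL)
    return 0.5 + 0.1 * hits - 0.01 * misses


best_subset, best_score, history = sequential_forward_selection(6, toy_score)
print(sorted(best_subset), round(best_score, 2))
```

Tracking the score after every round, as `history` does, is what produces a curve of accuracy against the number of selected features; the peak of that curve identifies the subset size to keep.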

Fig. 8 RBF kernel SVM classification success percentage as a result of the SFS feature selection technique

Fig. 9 Confusion matrix after SFS application for RBF kernel SVM

Discussion

This study was conducted for the diagnosis of COVID-19, taking into account cough sounds, based on machine learning technology. This research was defined as a machine learning study for the COVID-19 pandemic based on cough sounds without being present in any hospital or clinical setting. This machine learning model provides great advantages by recording the cough sound of subjects suspected of COVID-19 based on the application of signal processing and audio recording of smartphones (Imran et al. 2020; Hu et al. 2020; Laguarta et al. 2020; Pahar et al. 2020). One of the advantages of these studies is that they can be performed anywhere, at any time, and under easy conditions. These models are thought to be beneficial as a health aid tool when COVID-19 test kits are lacking. As stated from the outset of the pandemic, the easiest way to be protected from this dangerous virus is to stay at home and minimize interaction with people as much as possible. Taking this important issue into account, the proposed machine learning-based models protect healthcare professionals to a large extent (Schuller et al. 2020; Imran et al. 2020; Lalmuanawma et al. 2020; Bachtiger et al. 2020; Schaar et al. 2020). In fact, computer-based methods to prevent the rapid and latent spread of the Coronavirus are highly appreciated by researchers. Thanks to the application recommended in Imran et al. (2020), the virus spread tracking has been easily possible. This study has developed a pre-diagnose model for COVID-19 from cough samples using AI technology.

A small portion of the studies on COVID-19 has focused on machine learning-based cough sound classification, a topic which has recently come under the attention of researchers. One reason for this is that there is a limited number of open-access datasets based on cough sound. As these datasets are unveiled, they are quickly examined by researchers because the goal is to overcome the COVID-19 epidemic as soon as possible and return to normal life.

In a new and similar machine learning study in this area, two datasets were considered in the cough-based analysis of patients with COVID-19 and non-COVID-19. In this study, MFCC was chosen as the feature extraction method and it was observed that the k-NN classification algorithm was successful among seven different classification algorithms (Maleki 2021). Due to the shortage of a wide range of COVID-19 studies based on cough sound classification, we do not think it is very accurate to compare our proposed classification study with the few available classification studies in detail (Brown et al. 2020; Laguarta et al. 2020; Pahar et al. 2020; Sharma et al. 2020) because each study focuses on different datasets. The common goal is to be able to classify COVID-19 positive and negative labeled cough sounds with high success, using effective feature extraction and classification techniques through signal processing steps.

Of the STFT and MFCC feature extraction methods, MFCC yielded more successful results, as expected, in classifying COVID-19 positive and negative cough sounds (Huzaifah 2017). Although STFT is known as a simple linear time–frequency transformation and provides valuable information about the frequency position over time (Krishnan et al. 2010), it can sometimes prove insufficient because of the trade-off between time and frequency resolution. Considering this issue, the appropriate method can be selected according to the area of use. The MFCC method holds an important place in the field of audio signal processing because it produces effective features by reducing the margin of error of sound signals exposed to noise (Kamarulafizam et al. 2007). Because the frequency bands in the MFCC method are logarithmically spaced, it is close to the response of the human auditory system, which distinguishes it from other methods (Janse et al. 2014). The disadvantage of the MFCC feature extraction method is that it does not provide the necessary accuracy in the analysis of noisy audio signals (Kohshelan 2014). To perform the classification, the SVM method, which has gained momentum in this field, was selected (Ismael and Şengür 2021; Singh et al. 2020; Loey et al. 2021). When the study is extended to larger datasets, the use of this classification technique should be reviewed, because the SVM algorithm is sometimes poorly suited to large and noisy datasets. In addition, a balance must always be struck between the number of features and the number of training samples to prevent the SVM algorithm from underperforming. Moreover, interpreting the final SVM model and selecting an appropriate kernel function for the algorithm can be difficult (Ray 2019). Finally, it was confirmed that feature selection is an important step in all aspects of signal processing.
The classification performance was improved by using the SFS feature selection technique in this study (Cen et al. 2016).

Conclusion

The main reason for the spread of Coronavirus can be the inadequacy of test kits, as well as the heavy cost and loss of time in determining clinical test results. In this study, we tried to find a solution to the COVID-19 epidemic using an open-access dataset and machine learning technology. Based on this technology, we have developed a model that can classify COVID-19 cough records obtained from smartphones. We knew that the dry cough symptom is an important factor in the diagnosis of COVID-19. Thus, we created the main framework of this research by taking cough-based studies into consideration. The most important goal was to provide a virtual test opportunity away from the clinical and hospital environment based on machine learning technology. Cough records of 16 subjects suspected of COVID-19 were analyzed to create the dataset. The obtained cough sounds became a ready-made dataset by the preprocessing and segmentation process. Considering the spectrogram graph based on STFT, we focused on effective features. Focusing on the PSD difference of negative and positively labeled cough sounds, strong features were prepared from dominant time and frequency ranges for the STFT feature extraction method. In addition, the MFCC feature extraction method was included in the analysis for the diagnosis of COVID-19 cough. Based on the work done for the diagnosis of COVID-19 cough sound, the MFCC feature extraction technique and RBF kernel SVM were selected as the method with the best percentage of success.

In the future, a stronger cough-based COVID-19 diagnostic study can be conducted with a larger dataset. Future studies can also increase the number of subjects and improve the classification accuracy using different feature extraction and classification methods. Moreover, the training and classification stages can be improved by using deep learning algorithms.