1 Introduction

In the last few months of 2019, a new virus belonging to the family Coronaviridae emerged. The virus is considered to have had a zoonotic origin [1]. It emerged in the city of Wuhan, in Hubei province, China, affected this region first, and then spread all over the world in a short time. The virus generally affects the upper and lower respiratory tract and the lungs and, less frequently, the heart muscle [2]. While it generally affects young and middle-aged people and people without chronic diseases to a lesser extent, it can have severe, even fatal, consequences in people who suffer from conditions such as hypertension, cardiovascular disease, and diabetes [3]. The World Health Organization declared the epidemic a pandemic in March 2020; by the first week of October of the same year, the number of cases was approaching thirty-six million and the death toll had passed one million. In addition, a modeling study carried out by Hernandez-Matamoros et al. [4] indicates that the effects of the epidemic will become more severe in the future.

In people who suffer severely from the disease, the serious adverse effects are generally seen in the lungs [3]. Accordingly, many studies published in a short time have shown these effects using lung CT scans and chest X-ray imaging. These studies indicate that radiological imaging, together with clinical symptoms and blood and biochemical tests, is an effective and reliable tool for diagnosing Covid-19.

Many clinical studies in which X-ray images were examined [5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24] have shown that Covid-19 causes interstitial involvement, bilateral and irregular ground-glass opacities, parenchymal abnormalities, a unilobar reversed halo sign, and consolidation in the lungs. The recent review article by Long and Ehrenfeld [25] highlighted the importance of artificial intelligence methods for diagnosing Covid-19 quickly and reducing the effects of the outbreak. Accordingly, several studies have used deep learning methods to classify X-ray images as Covid-19 or non-Covid-19. Table 1 summarizes the number of images, the methods, and the results reported in these studies.

Table 1 Results of previous studies for Covid-19 and non-Covid-19 classification using X-ray images

CT imaging generally contains more data than X-ray imaging. However, it is less suitable for following up all stages of the disease because of the higher radiation dose to which patients are exposed. For this reason, an artificial intelligence application using X-ray images was developed and tested in this study.

This study aims at the early diagnosis of Covid-19 from X-ray images using deep learning. The images were classified automatically by two different convolutional neural networks (CNNs). Experiments were carried out with the images used directly, with local binary pattern (LBP) applied as a pre-processing step, and with dual tree complex wavelet transform (DT-CWT) applied as a secondary operation, and the classification results were calculated separately for each case. In addition, four new classification approaches are proposed and tested; these run the experiments together and combine their results through a result generation algorithm. The results show that analyzing chest X-ray images with deep learning methods provides fast and highly accurate results in the diagnosis of Covid-19.

2 Methods

2.1 Used Data

The chest X-ray images of patients with Covid-19 used in the study were obtained by combining the data sets, with their metadata, that were made open access on GitHub by Cohen et al. [42] and on Kaggle by Dadario [43]. The images that these data sets have in common and the clinical notes related to these images were combined, and a mixed Covid-19 image data set consisting of 150 chest X-ray images was created. Only images obtained while the patients were facing the X-ray device directly were used. Where several images come from the same patient, they were taken on different days during the course of the disease and therefore do not have exactly the same content. The dimensions of the images vary widely, between 255 px × 249 px and 4280 px × 3520 px (px stands for pixel). The images also have different file formats (png, jpg, jpeg) and two different bit depths, 8-bit (gray-level) and 24-bit (RGB). Standardizing the images was therefore essential. First, all of the images were converted to 8-bit gray-level images. Then, to isolate the area of interest, each image was framed manually so as to cover the chest area. Finally, all the images were resized to 448 px × 448 px and saved in png format.
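The standardization steps described above (gray-level conversion, manual framing, resizing to 448 px × 448 px) can be illustrated with a minimal sketch. This is not the code used in the study: the luma weights, the nearest-neighbour interpolation, the frame format (top, left, height, width), and all function names are assumptions made only for illustration.

```python
from typing import List, Tuple

Gray = List[List[int]]                   # rows of 0..255 gray values
RGB = List[List[Tuple[int, int, int]]]   # rows of (r, g, b) pixels


def rgb_to_gray(img: RGB) -> Gray:
    """24-bit RGB to 8-bit gray using BT.601 luma weights (an assumption)."""
    return [[min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in img]


def crop(img: Gray, top: int, left: int, height: int, width: int) -> Gray:
    """Keep only the manually chosen frame covering the chest area."""
    return [row[left:left + width] for row in img[top:top + height]]


def resize_nearest(img: Gray, out_h: int = 448, out_w: int = 448) -> Gray:
    """Nearest-neighbour resize; the interpolation used in the study is not stated."""
    in_h, in_w = len(img), len(img[0])
    return [[img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def standardize(img: RGB, frame: Tuple[int, int, int, int]) -> Gray:
    """Gray conversion, then framing, then resizing to 448 x 448."""
    return resize_nearest(crop(rgb_to_gray(img), *frame))
```

An image that is already 8-bit gray-level would skip `rgb_to_gray` and enter at the `crop` step.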

For the non-Covid-19 X-ray images, two data sets, the Montgomery data set [44] and the Shenzhen data set [44], were used separately. These data sets contain 80 and 326 non-Covid-19 X-ray images, respectively. The first training-test data set contains a total of 230 X-ray images, of which 150 are Covid-19 images and 80 are non-Covid-19 images, while the second training-test data set contains 476 X-ray images, of which 150 are Covid-19 images and 326 are non-Covid-19 images. In this way, classification results were obtained for one data set containing predominantly Covid-19 images and one containing predominantly non-Covid-19 images. The processes applied to the Covid-19 images were also applied to the non-Covid-19 images. Fig. 1 shows original and edited versions of three X-ray images: one belonging to a patient with Covid-19 and two belonging to people without Covid-19.

Fig. 1 a) X-ray image of a patient with Covid-19 (Phan et al. [23]) b) Non-Covid-19 X-ray image (Montgomery data set [44]) c) Non-Covid-19 X-ray image (Shenzhen data set [44])

2.2 Local Binary Pattern (LBP)

Local binary pattern (LBP) is an approach that was proposed by Ojala et al. [45] to reveal local features. The method is based on comparing the gray-level value of each pixel in the image with the values of its neighboring pixels, one by one.
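As an illustration of this comparison, the following minimal sketch computes a basic 3 × 3 LBP code for every interior pixel. It is not the implementation used in the study: the neighbour ordering, the use of "greater than or equal" as the comparison, and the function names are assumptions. Because border pixels have no complete neighbourhood, the output is two pixels smaller in each dimension than the input.

```python
from typing import List

Gray = List[List[int]]

# Clockwise neighbour offsets (dy, dx), starting at the top-left pixel.
# The starting point and direction are a convention; this choice is an assumption.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Gray, y: int, x: int) -> int:
    """Build an 8-bit code: bit i is 1 if neighbour i is >= the centre pixel."""
    centre = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(NEIGHBOURS):
        if img[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code


def lbp_image(img: Gray) -> Gray:
    """Apply lbp_code to every pixel that has a full 3 x 3 neighbourhood."""
    h, w = len(img), len(img[0])
    return [[lbp_code(img, y, x) for x in range(1, w - 1)] for y in range(1, h - 1)]
```

For a flat region every neighbour equals the centre, so the code is 255; edges and textures produce codes with mixed bits.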

Fig. 2 shows the images obtained by applying the LBP operation to the X-ray images given in Fig. 1. LBP is used in this study to observe how the results change when LBP images, which reflect local features, are given to the CNN input instead of the original images. A further aim is to increase the depth of the image features used in the new result generation algorithm.

Fig. 2 Images created by applying LBP and resizing the images in Fig. 1

2.3 Dual Tree Complex Wavelet Transform (DT-CWT)

Dual tree complex wavelet transform (DT-CWT) was first introduced by Kingsbury [46,47,48]. This method is generally similar to the Gabor wavelet transform. In the Gabor wavelet transform, low-pass and high-pass filters are applied to the rows and columns of the image, that is, horizontally and vertically. In this way, two sub-band groups, low (L) and high (H), are formed along the rows and along the columns. These one-dimensional bands are crossed when they are converted into two dimensions. At the end of the process, one low sub-band, named LL, and three sub-bands containing high bands, named LH, HL, and HH, are obtained. Further sub-bands (such as LLL, LLH) can be obtained by applying the same operations to the LL sub-band.
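The row and column filtering that produces the LL, LH, HL, and HH sub-bands can be illustrated with a minimal single-level sketch. The Haar averaging and differencing pair is used here only as the simplest low-pass and high-pass filter pair; it is an assumption and not the filter set used by DT-CWT or in the study. The sub-band naming (first letter for the row filter, second for the column filter) is likewise an assumed convention.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def haar_pairs(values: List[float]) -> Tuple[List[float], List[float]]:
    """Low-pass (average) and high-pass (difference) of adjacent pairs."""
    low = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values) - 1, 2)]
    high = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values) - 1, 2)]
    return low, high


def filter_rows(m: Matrix) -> Tuple[Matrix, Matrix]:
    """Filter every row; each output has half as many columns."""
    lows, highs = [], []
    for row in m:
        lo, hi = haar_pairs(row)
        lows.append(lo)
        highs.append(hi)
    return lows, highs


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def filter_columns(m: Matrix) -> Tuple[Matrix, Matrix]:
    """Filter every column by transposing, filtering rows, transposing back."""
    lo_t, hi_t = filter_rows(transpose(m))
    return transpose(lo_t), transpose(hi_t)


def decompose(img: Matrix) -> Dict[str, Matrix]:
    """Single-level decomposition into four half-size sub-bands."""
    low, high = filter_rows(img)
    ll, lh = filter_columns(low)
    hl, hh = filter_columns(high)
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}
```

Applying `decompose` again to the `"LL"` entry gives the next level of sub-bands; a 448 × 448 input yields 224 × 224 sub-bands at the first level.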

Unlike the Gabor wavelet transform, DT-CWT uses two filter trees that work in parallel instead of a single filter. These two trees contain the real and imaginary parts of complex numbers. As a result, the DT-CWT process yields sub-bands containing more directions than the Gabor wavelet transform. When DT-CWT is applied to an image, the processes are performed for six different directions: +15, −15, +45, −45, +75, and −75 degrees. Three of these directions represent real sub-bands and the other three represent imaginary sub-bands. Figure 3 shows the DT-CWT decomposition tree. Fig. 4 shows the real and imaginary sub-band images obtained by applying the DT-CWT process (scale = 1) to the X-ray images given in Fig. 1. In this study, DT-CWT was used with a scale (level) value of 1, so the sub-band images obtained were half the size of the original images. This transform was preferred because the complex wavelet transform has been successful in many previous studies on medical images [49,50,51].

Fig. 3 Structure of the DT-CWT decomposition tree
Fig. 4 Real and imaginary sub-band images obtained by applying DT-CWT to the X-ray image (scale = 1)

2.4 Convolutional Neural Network (CNN)

Deep learning has come to the fore in recent years as an artificial intelligence approach that provides successful results in many image processing applications from image enhancement (such as [52]) to object identification (such as [53, 54]).

Convolutional neural networks (CNNs) have been the preferred deep learning model in image processing applications in recent years. A CNN classifier generally consists of convolution layers, activation functions, pooling layers, a flatten layer, and fully connected layers. Fig. 5 describes the general operation of the CNN classifier. More detailed information about the functions and operating modes of the layers in a CNN classifier can be found in [55,56,57,58,59].

Fig. 5 General operation of the CNN classifier

Within the scope of the study, a CNN architecture with a total of 23 layers was designed. An efficient design was sought, since increasing the number of layers in a CNN architecture increases the processing time of training and classification. Table 2 contains details of this first CNN architecture. A second CNN architecture was also used to check whether the proposed pipeline approaches are applicable to other CNN architectures. For this purpose, an architecture modeled on the VGG-16 CNN was used. To reduce the processing load, the number of filters and the fully connected layer sizes were reduced, and normalization layers were added after the intermediate convolution layers. Details of this second CNN architecture are given in Table 3.

Table 2 First CNN architecture used within the scope of the study
Table 3 Second CNN architecture used within the scope of the study

MATLAB 2019a was used as the software in the study. The layer names and parameters in Tables 2 and 3 are those used directly in the software. More than one experiment was carried out, and the sizes of the input images differ between experiments; for this reason, the input layer in Tables 2 and 3 has different sizes. These CNN architectures were used in all the experiments carried out within the scope of the study.

2.5 Evaluation Criteria of the Classification Results

The confusion matrix and the statistical parameters derived from it, namely sensitivity (SEN), specificity (SPE), accuracy (ACC), and F-1 score (F-1), were used to evaluate the results. Detailed information about the confusion matrix and these parameters can be found in [60].
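For reference, the four parameters can be computed from the confusion matrix counts as in the minimal sketch below, which uses their standard definitions. Treating Covid-19 as the positive class is an assumption, and the counts in the usage note are hypothetical, not results from the study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # positive (assumed: Covid-19) images labeled positive
    fn: int  # positive images labeled negative
    tn: int  # negative (assumed: non-Covid-19) images labeled negative
    fp: int  # negative images labeled positive


def sensitivity(c: Confusion) -> float:
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    return c.tn / (c.tn + c.fp)


def accuracy(c: Confusion) -> float:
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def f1_score(c: Confusion) -> float:
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)
```

With the hypothetical counts `Confusion(tp=9, fn=1, tn=8, fp=2)`, sensitivity is 0.9, specificity is 0.8, and accuracy is 0.85.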

Receiver operating characteristic (ROC) analysis was also used to evaluate the results, and the area under the ROC curve (AUC) was calculated. An ROC curve plots sensitivity (SEN, y-axis) against 1 − SPE (x-axis) as the decision threshold is changed gradually, in steps of a fixed precision, between the minimum and maximum outputs predicted by the classifier.
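The threshold sweep and the area calculation can be sketched as follows. This is a minimal illustration rather than the procedure used in the study: the number of steps, the trapezoidal rule for the area, and the orientation of the scores (a larger score means the positive class) are assumptions. If the positive class corresponds to outputs near 0, the scores would first have to be reversed, for example as 1 minus the output.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], positives: Sequence[bool],
               steps: int = 100) -> List[Tuple[float, float]]:
    """Return sorted (1 - SPE, SEN) pairs for thresholds swept from min to max score.

    Both classes must be present, otherwise a rate is undefined.
    """
    n_pos = sum(1 for p in positives if p)
    n_neg = len(positives) - n_pos
    lo, hi = min(scores), max(scores)
    points = {(0.0, 0.0)}  # threshold above every score: nothing labeled positive
    for i in range(steps + 1):
        t = lo + (hi - lo) * i / steps
        tp = sum(1 for s, p in zip(scores, positives) if p and s >= t)
        fp = sum(1 for s, p in zip(scores, positives) if not p and s >= t)
        points.add((fp / n_neg, tp / n_pos))
    return sorted(points)


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A classifier that separates the two classes perfectly gives an AUC of 1, while one that ranks every negative above every positive gives 0.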

2.6 Pipeline Methodology

In the proposed pipeline algorithm, training and test procedures were first performed on the 448 px × 448 px images and results were obtained (first experiment).

  • Before the remaining experiments were conducted, DT-CWT (scale = 1) was applied to the 448 px × 448 px images and 224 px × 224 px sub-band images were obtained.

  • In the second experiment, the real part of the LL sub-band image obtained by applying DT-CWT was given as input to the CNN.

  • In the third experiment, the imaginary part of the LL sub-band image obtained by applying DT-CWT was given as input to the CNN.

  • In the fourth experiment, the real parts of the LL, LH, and HL sub-band images obtained by applying DT-CWT were given together as input to the CNN.

  • In the fifth experiment, the imaginary parts of the LL, LH, and HL sub-band images obtained by applying DT-CWT were given together as input to the CNN.

  • In the sixth experiment, the real and imaginary parts of the LL sub-band image obtained by applying DT-CWT were given together as input to the CNN.

  • In the seventh experiment, the real and imaginary parts of the LL, LH, and HL sub-band images obtained by applying DT-CWT were given together as input to the CNN.

In each of these experiments, training and testing procedures were carried out and results were obtained.

A block diagram of the experiments carried out in the study is shown in Fig. 6. The seven experiments were then repeated using new images obtained by applying LBP to the X-ray images, which completed the first-stage experiments. Since the image size decreases after LBP processing, these images were resized to 448 px × 448 px.
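The input configurations of the seven experiments, each run without and with LBP, can be summarized in a small sketch. The table below follows the descriptions in the text; the dictionary layout, the channel ordering, and the function names are assumptions introduced only for illustration.

```python
from typing import Dict, List, Tuple

BASE_SIZE = 448  # side length of the standardized images, in pixels

# Sub-band parts stacked as CNN input channels; an empty list means the
# image itself is used without DT-CWT. The channel order is an assumption.
EXPERIMENTS: Dict[int, List[Tuple[str, str]]] = {
    1: [],
    2: [("LL", "real")],
    3: [("LL", "imag")],
    4: [("LL", "real"), ("LH", "real"), ("HL", "real")],
    5: [("LL", "imag"), ("LH", "imag"), ("HL", "imag")],
    6: [("LL", "real"), ("LL", "imag")],
    7: [("LL", "real"), ("LH", "real"), ("HL", "real"),
        ("LL", "imag"), ("LH", "imag"), ("HL", "imag")],
}


def input_shape(experiment: int) -> Tuple[int, int, int]:
    """CNN input size (height, width, channels) for one experiment."""
    bands = EXPERIMENTS[experiment]
    if not bands:
        return (BASE_SIZE, BASE_SIZE, 1)
    side = BASE_SIZE // 2  # one DT-CWT level halves each dimension
    return (side, side, len(bands))


def all_runs() -> List[Tuple[int, bool]]:
    """Every experiment paired with a flag for LBP pre-processing."""
    return [(e, lbp) for lbp in (False, True) for e in sorted(EXPERIMENTS)]
```

Here `all_runs()` lists 14 runs, and `input_shape(7)` gives a 224 × 224 input with six channels.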

Fig. 6 Block diagram of the experiments carried out in the study

In the next part of the study, four pipeline classification algorithms were designed using the principle of parallel operation. These algorithms combine the results of the previous experiments to obtain new results. The first two pipeline classification algorithms work as follows:

  • If the numbers of Covid-19 and non-Covid-19 labels (threshold value 0.5) obtained for an image in the experiments with and without LBP are not equal, the label obtained in more than half of the experiments is taken as the algorithm label. In this case, the algorithm result is the actual experiment output closest to 0 if the algorithm label is Covid-19, and the actual experiment output closest to 1 if the algorithm label is non-Covid-19.

  • If the numbers of labels (threshold value 0.5) obtained for an image in the experiments with and without LBP are equal, the actual outputs of the experiments conducted for the image are mixed (50%–50% in the first algorithm and 75%–25% in the second), and the mixture is taken as the algorithm result. The image is then labeled as Covid-19 or non-Covid-19 according to this result (threshold value 0.5).

The basic coding of the first two pipeline classification approaches is given in Table 4. In the code given in Tables 4, 5, and 6, Result-1 and Label-1 represent the actual test result and the label obtained without using LBP, while Result-2 and Label-2 represent the actual test result and the label obtained using LBP.
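The logic of the first two pipeline algorithms, as described in the text, can be sketched as follows. This is an illustrative reading and not a reproduction of Table 4. Two points are assumptions: an output of exactly 0.5 is labeled non-Covid-19, and in the 75%–25% mixture the larger weight is given to Result-1 (without LBP), which the text does not specify.

```python
from typing import Tuple

THRESHOLD = 0.5
COVID, NON_COVID = "Covid-19", "non-Covid-19"


def label_of(result: float) -> str:
    """Outputs near 0 mean Covid-19, outputs near 1 mean non-Covid-19.

    Labeling exactly 0.5 as non-Covid-19 is an assumption.
    """
    return COVID if result < THRESHOLD else NON_COVID


def combine(result_1: float, result_2: float, weight_1: float) -> Tuple[float, str]:
    """Combine the outputs obtained without LBP (result_1) and with LBP (result_2)."""
    label_1, label_2 = label_of(result_1), label_of(result_2)
    if label_1 == label_2:
        # Labels agree: keep the output closest to 0 (Covid-19) or to 1 (non-Covid-19).
        if label_1 == COVID:
            result = min(result_1, result_2)
        else:
            result = max(result_1, result_2)
    else:
        # Labels disagree: mix the two outputs, then threshold the mixture.
        result = weight_1 * result_1 + (1.0 - weight_1) * result_2
    return result, label_of(result)


def pipeline_1(result_1: float, result_2: float) -> Tuple[float, str]:
    return combine(result_1, result_2, weight_1=0.5)   # 50%-50% mixture


def pipeline_2(result_1: float, result_2: float) -> Tuple[float, str]:
    return combine(result_1, result_2, weight_1=0.75)  # 75%-25% mixture (assumed order)
```

With the hypothetical outputs 0.2 (without LBP) and 0.9 (with LBP), `pipeline_1` mixes them to 0.55 and labels the image non-Covid-19, whereas `pipeline_2` mixes them to 0.375 and labels it Covid-19.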

Table 4 Basic coding of the pipeline algorithms (pipeline-1 and -2) proposed in the study

The third and fourth pipeline algorithms differ from the first two in that, if the labels obtained in the two classification experiments differ, the result obtained without LBP is given priority. In the third pipeline algorithm, when the two labels differ and the label obtained without LBP is abnormal (Covid-19), the result is considered abnormal. In the fourth pipeline algorithm, when the two labels differ and the label obtained without LBP is normal (non-Covid-19), the result is considered normal. The other procedures are the same as in the first two pipeline algorithms, with a mixing rate of 50%–50%. The basic coding of the third and fourth pipeline classification approaches is given in Tables 5 and 6.
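The priority rule of the third and fourth pipeline algorithms can be sketched in the same way. This is an illustrative reading of the description and not a reproduction of Tables 5 and 6. Returning the output obtained without LBP as the numeric result when the priority rule applies, and labeling an output of exactly 0.5 as non-Covid-19, are assumptions.

```python
from typing import Tuple

THRESHOLD = 0.5
COVID, NON_COVID = "Covid-19", "non-Covid-19"


def label_of(result: float) -> str:
    """Outputs near 0 mean Covid-19; labeling exactly 0.5 non-Covid-19 is an assumption."""
    return COVID if result < THRESHOLD else NON_COVID


def prioritized(result_1: float, result_2: float, priority: str) -> Tuple[float, str]:
    """result_1 is obtained without LBP, result_2 with LBP."""
    label_1, label_2 = label_of(result_1), label_of(result_2)
    if label_1 == label_2:
        # Labels agree: keep the output closest to 0 or to 1, as in pipeline-1.
        result = min(result_1, result_2) if label_1 == COVID else max(result_1, result_2)
        return result, label_1
    if label_1 == priority:
        # Labels disagree and the no-LBP label is the prioritized class.
        return result_1, label_1  # returning result_1 as the value is an assumption
    mixed = 0.5 * result_1 + 0.5 * result_2  # 50%-50% mixture
    return mixed, label_of(mixed)


def pipeline_3(result_1: float, result_2: float) -> Tuple[float, str]:
    return prioritized(result_1, result_2, priority=COVID)


def pipeline_4(result_1: float, result_2: float) -> Tuple[float, str]:
    return prioritized(result_1, result_2, priority=NON_COVID)
```

With the hypothetical outputs 0.4 (without LBP) and 0.9 (with LBP), `pipeline_3` labels the image Covid-19 even though a 50%–50% mixture (0.65) would be labeled non-Covid-19.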

Table 5 Basic coding of the pipeline algorithm (pipeline-3) proposed in the study
Table 6 Basic coding of the pipeline algorithm (pipeline-4) proposed in the study

3 Results

3.1 Experiments

In this study, which aims to detect Covid-19 early using X-ray images, a deep learning approach was used and the images were classified automatically using CNNs. The first training-test data set contained 230 X-ray images, of which 150 were Covid-19 and 80 were non-Covid-19, while the second training-test data set contained 476 X-ray images, of which 150 were Covid-19 and 326 were non-Covid-19. Thus, classification results were obtained separately for a data set containing predominantly abnormal (Covid-19) images and a data set containing predominantly normal (non-Covid-19) images. Information about the training-test data sets is given in Table 7.

Table 7 Information about the images used in the study

The chest X-ray images were first framed manually to cover the lung region, in order to determine the areas of interest on the image. Standardization was then carried out, since the images were of very different sizes, formats, and bit depths. The areas of interest were resized to 448 px × 448 px, and the images were saved in png format as 8-bit gray-scale images. These operations were applied to all the abnormal and normal images used in the study.

In the next part of the study, a 23-layer CNN architecture and a 54-layer CNN architecture, the details of which were described previously, were designed and used. These CNN architectures were used in all the experiments. Because more than one experiment was performed, only the size of the images given to the CNN input differs between experiments.

In the experiments, training was carried out with the k-fold cross-validation method, with the k value chosen as 23. Since the first training-test data set consists of 230 images, at each stage (fold) 220 images were used for training and the remaining ten images were used for testing. The second training-test data set consists of 476 images, which were divided into 16 groups of 21 images and seven groups of 20 images; at each fold, 455 or 456 images were used for training and the remaining 21 or 20 images were used for testing. The test procedures were repeated 23 times, so that classification results were obtained for all the images.
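The fold sizes quoted above follow from dividing the images as evenly as possible among the folds, as the short sketch below shows. The function names are hypothetical, and how individual images were assigned to folds (for example, randomly or stratified by class) is not stated in the text and is not modeled here.

```python
from typing import List, Tuple


def fold_sizes(n_images: int, k: int) -> List[int]:
    """Split n_images into k test folds whose sizes differ by at most one."""
    base, extra = divmod(n_images, k)
    return [base + 1] * extra + [base] * (k - extra)


def train_test_sizes(n_images: int, k: int) -> List[Tuple[int, int]]:
    """(training size, test size) for each fold."""
    return [(n_images - size, size) for size in fold_sizes(n_images, k)]
```

For 230 images and k = 23, every fold tests ten images. For 476 images, 16 folds test 21 images and seven folds test 20 images.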

Finally, all the images were combined and the training and testing procedures were repeated with 2-fold cross-validation on a total of 556 X-ray images, comprising 150 Covid-19 images and 406 non-Covid-19 images. To limit the length of the paper, results for this combined set are reported only for the input data that gave the best results on the first and second data sets.

In this part of the study, a total of 14 experiments were carried out. Some initial weights and parameters in the CNN are assigned randomly. To make the results stable, each experiment was repeated five times, and the average results are reported.

The CPU time taken for an experiment to be completed entirely, including training and testing, was divided by the total number of images processed to obtain the CPU processing time per image. The experiments were carried out using MATLAB 2019a software running on a computer with 64 GB RAM and an Intel(R) Xeon(R) CPU E5-2680 at 2.7 GHz (32 CPUs).

4 Results

In the first experimental group, the training and testing procedures were first performed using the chest X-ray images directly, and the results were obtained. The LBP operation was then applied to these images, the training and testing procedures were repeated, and the results were calculated. Finally, the results were calculated using the proposed pipeline classification algorithms described previously. Because some initial variables in the internal structure of the CNN are assigned randomly, each experiment group was repeated five times to make the results more stable. The images given to the CNN as input in this experiment were 448 × 448 × 1 in size. The results are given in Table 8 (first training-test data set) and Table 9 (second training-test data set).

Table 8 Results obtained directly using chest X-ray images (first training-test data set)
Table 9 Results obtained directly using chest X-ray images (second training-test data set)

In the second experimental group, the training and testing procedures were performed using the real part of the LL sub-image obtained by applying DT-CWT to the chest X-ray images, and the results were obtained. The procedures were then repeated using the real part of the LL sub-image obtained by applying LBP and then DT-CWT to the X-ray images. Finally, the results were calculated using the proposed pipeline classification algorithms described previously. The images given to the CNN as input in this experiment were 224 × 224 × 1 in size. The results are given in Table 10 (first training-test data set) and Table 11 (second training-test data set).

Table 10 Results obtained by using the LL real sub-band obtained by applying DT-CWT to the chest X-ray images (first training-test data set)
Table 11 Results obtained by using the LL real sub-band obtained by applying DT-CWT to the chest X-ray images (second training-test data set)

In the third experimental group, the training and testing procedures were performed using the imaginary part of the LL sub-image obtained by applying DT-CWT to the chest X-ray images, and the results were obtained. The procedures were then repeated using the imaginary part of the LL sub-image obtained by applying LBP and then DT-CWT to the X-ray images. Finally, the results were calculated using the proposed pipeline classification algorithms described previously. The images given to the CNN as input in this experiment were 224 × 224 × 1 in size. The results are given in Table 12 (first training-test data set) and Table 13 (second training-test data set).

Table 12 Results obtained by using the LL imaginary sub-band obtained by applying DT-CWT to the chest X-ray images (first training-test data set)
Table 13 Results obtained by using the LL imaginary sub-band obtained by applying DT-CWT to the chest X-ray images (second training-test data set)

In the fourth experimental group, the training and testing procedures were performed using the real parts of the LL, LH, and HL sub-images obtained by applying DT-CWT to the chest X-ray images, and the results were obtained. The procedures were then repeated using the real parts of the LL, LH, and HL sub-images obtained by applying LBP and then DT-CWT to the X-ray images. Finally, the results were calculated using the proposed pipeline classification algorithms described previously. The images given to the CNN as input in this experiment were 224 × 224 × 3 in size. The results are given in Table 14 (first training-test data set) and Table 15 (second training-test data set).

Table 14 Results obtained using the LL, LH, HL real sub-bands obtained by applying DT-CWT to the chest X-ray images (first training-test data set)
Table 15 Results obtained using the LL, LH, HL real sub-bands obtained by applying DT-CWT to the chest X-ray images (second training-test data set)

In the fifth experimental group, the training and testing procedures were performed using the imaginary parts of the LL, LH, and HL sub-images obtained by applying DT-CWT to the chest X-ray images, and the results were obtained. The procedures were then repeated using the imaginary parts of the LL, LH, and HL sub-images obtained by applying LBP and then DT-CWT to the X-ray images. Finally, the results were calculated using the proposed pipeline classification algorithms described previously. The images given to the CNN as input in this experiment were 224 × 224 × 3 in size. The results are given in Table 16 (first training-test data set) and Table 17 (second training-test data set).

Table 16 Results obtained using the LL, LH, HL imaginary sub-bands obtained by applying DT-CWT to the chest X-ray images (first training-test data set)
Table 17 Results obtained using the LL, LH, HL imaginary sub-bands obtained by applying DT-CWT to the chest X-ray images (second training-test data set)

In the sixth experimental group, the training and testing procedures were performed using the real and imaginary parts of the LL sub-image obtained by applying DT-CWT to the chest X-ray images, and the results were obtained. The procedures were then repeated using the real and imaginary parts of the LL sub-image obtained by applying LBP and then DT-CWT to the X-ray images. Finally, the results were calculated using the proposed pipeline classification algorithms described previously. The images given to the CNN as input in this experiment were 224 × 224 × 2 in size. The results are given in Table 18 (first training-test data set) and Table 19 (second training-test data set).

Table 18 Results obtained by using the LL real and imaginary sub-bands obtained by applying DT-CWT to the chest X-ray images (first training-test data set)
Table 19 Results obtained by using the LL real and imaginary sub-bands obtained by applying DT-CWT to the chest X-ray images (second training-test data set)

In the seventh experimental group, the training and testing procedures were performed using the real and imaginary parts of the LL, LH, and HL sub-images obtained by applying DT-CWT to the chest X-ray images, and the results were obtained. The procedures were then repeated using the real and imaginary parts of the LL, LH, and HL sub-images obtained by applying LBP and then DT-CWT to the X-ray images. Finally, the results were calculated using the proposed pipeline classification algorithms described previously. The images given to the CNN as input in this experiment were 224 × 224 × 6 in size. The results are given in Table 20 (first training-test data set) and Table 21 (second training-test data set).

Table 20 Results obtained by using the LL, LH, HL real and imaginary sub-bands obtained by applying DT-CWT to the chest X-ray images (first training-test data set)
Table 21 Results obtained by using the LL, LH, HL real and imaginary sub-bands obtained by applying DT-CWT to the chest X-ray images (second training-test data set)

Finally, all the training-test data sets were combined to test the performance of the proposed method and the pipeline approaches. A combined training-test data set containing a total of 556 X-ray images, comprising 150 Covid-19 and 406 non-Covid-19 images, was created. The k value was set to 2 (cross training and testing on two halves, each with 75 Covid-19 and 203 non-Covid-19 images). The training and testing processes were carried out for the input images that gave the best results in the first and second training-test data sets (the original image and the real LL sub-band). The results obtained are given in Tables 22 and 23.

Table 22 Results obtained directly using chest X-ray images (k = 2 and a total of 556 images (150 Covid-19 and 406 non-Covid-19 images))
Table 23 Results obtained using the LL real sub-band obtained by applying DT-CWT to the chest X-ray images (k = 2 and a total of 556 images (150 Covid-19 and 406 non-Covid-19 images))

5 Conclusion

In this section, the results obtained without the pipeline algorithms are compared first. When the results given in Tables 8 to 23 are examined, it can be seen that, for the same input image, the results obtained without LBP are generally better than those obtained with LBP. The exceptions are the sensitivity values of some results obtained with the first CNN architecture on the first training-test data set. The highest mean sensitivity, specificity, accuracy, F-1 score, and AUC values obtained without the pipeline algorithms were, respectively: 0.9853, 0.9725, 0.9765, 0.9819, and 0.9983 for the first training-test data set and the first CNN architecture; 0.9613, 0.9725, 0.9600, 0.9691, and 0.9949 for the first training-test data set and the second CNN architecture; 0.9720, 0.9908, 0.9845, 0.9752, and 0.9982 for the second training-test data set and the first CNN architecture; and 0.9733, 0.9920, 0.9857, 0.9772, and 0.9987 for the second training-test data set and the second CNN architecture. These values show that the performance of the first and second CNN architectures is generally close. In terms of CPU run-time, however, the second CNN architecture is two times slower than the first. The main reason is that the second CNN architecture has approximately twice as many layers as the first. A similar situation arose in the experiments performed by combining all the data and using 2-fold cross-validation. For these experiments, the highest mean sensitivity, specificity, accuracy, F-1 score, and AUC values obtained without the pipeline algorithms were, respectively: 0.9253, 0.9892, 0.9719, 0.9468, and 0.9939 for the first CNN architecture, and 0.9240, 0.9936, 0.9745, 0.9511, and 0.9975 for the second CNN architecture.

Within the scope of the study, DT-CWT was used to reduce the image dimensions. In this way, DT-CWT offset the increase in result-producing time caused by the use of the pipeline algorithms. When the results obtained using the original images are compared with those obtained using DT-CWT, it can be seen that, in general, there is no serious decrease in the results. Using DT-CWT, the image sizes were reduced successfully and the result-producing times were shortened.

The pipeline algorithms proposed within the scope of the study are based on combining the results obtained without LBP and with LBP, as detailed previously. The results obtained by using the pipeline algorithms were analyzed next. With the introduction of the pipeline algorithms, improvements were achieved in all the parameters for both training-test data sets and both CNN architectures. Relative to the highest results obtained without LBP and with LBP, the improvement generally ranged between 0,67% and 3,73% for sensitivity, between 0,06% and 2,25% for specificity, between 0% and 2,61% for accuracy, between 0,03% and 2,04% for the F-1 score, and between 0% and 1,20% for AUC.

Similar improvements were also observed in the experiments performed by combining all data and using 2-fold cross-validation. Relative to the highest results obtained without LBP and with LBP, the improvement generally ranged between 2,13% and 5,07% for sensitivity, between 0,59% and 1,08% for specificity, between 0,58% and 1,87% for accuracy, between 1,18% and 3,55% for the F-1 score, and between 0,13% and 0,59% for AUC.

When the pipeline algorithms are compared in terms of how much they improve the results, it can be seen that, in general, pipeline-1 and pipeline-3 obtain the highest sensitivity values; pipeline-4 obtains the highest specificity values; pipeline-1 and pipeline-3 obtain the highest accuracy values; pipeline-1 and pipeline-3 obtain the highest F-1 score values; and pipeline-1, pipeline-2, and pipeline-3 obtain the highest AUC values.

When the input data that gave the best results with the pipeline algorithms are examined, it can be seen that the real part of the LL sub-band image gave the best results for the first training-test data set and the original images gave the best results for the second training-test data set. The experiments performed by combining all the data and using 2-fold cross-validation confirm this. For this reason, and in consideration of the length of the study, only the results of these experiments were included.

The highest mean sensitivity, specificity, accuracy, F-1 score, and AUC values obtained using the pipeline algorithms were, respectively, 0,9947, 0,9800, 0,9843, 0,9881, 0,9990 for the first training-test data set and the first CNN architecture; 0,9867, 0,9800, 0,9809, 0,9853, 0,9977 for the first training-test data set and the second CNN architecture; 0,9853, 0,9926, 0,9857, 0,9774, 0,9988 for the second training-test data set and the first CNN architecture; and 0,9920, 0,9939, 0,9891, 0,9828, 0,9991 for the second training-test data set and the second CNN architecture.
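As a worked example, the best values reported in this section for the first training-test data set and the first CNN architecture can be compared before and after the pipeline algorithms. The sketch below assumes that an improvement is expressed in percentage points (the difference between two scores multiplied by 100); the exact definition behind the ranges quoted in the study is not restated here, and those ranges may be computed per input configuration rather than between overall best values. The function name is hypothetical.

```python
def improvement_points(before, after):
    """Difference between two scores in [0, 1], in percentage points."""
    return round((after - before) * 100, 2)


# Best mean values reported in this section for the first training-test
# data set and the first CNN architecture.
best_without_pipeline = {
    "sensitivity": 0.9853,
    "specificity": 0.9725,
    "accuracy": 0.9765,
    "f1": 0.9819,
    "auc": 0.9983,
}
best_with_pipeline = {
    "sensitivity": 0.9947,
    "specificity": 0.9800,
    "accuracy": 0.9843,
    "f1": 0.9881,
    "auc": 0.9990,
}

gains = {
    name: improvement_points(best_without_pipeline[name], best_with_pipeline[name])
    for name in best_without_pipeline
}
```

Under this assumption, the gain in sensitivity is 0,94 percentage points and the gain in AUC is 0,07 percentage points.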

The highest mean sensitivity, specificity, accuracy, F-1 score, and AUC values obtained in the experiments performed by combining all data and using 2-fold cross-validation were, respectively, 0,9760, 1,0000, 0,9906, 0,9823, 0,9997 for the first CNN architecture and 0,9707, 1,0000, 0,9867, 0,9752, 0,9994 for the second CNN architecture.

Within the scope of the study, the best results obtained before and after using the pipeline algorithm and the comparison of these results with the recent literature studies are given in Table 24.

Table 24 Comparison of the results obtained, within the scope of the study, with previous studies

6 Discussion

In our study on the automatic classification of chest X-ray images using the CNN, one of the deep learning methods, important and comprehensive test results were obtained for the early diagnosis of Covid-19 disease. When the results obtained within the scope of the study are compared with the literature studies detailed in Tables 1 and 24, they are better than 14 of the 16 studies that reported sensitivity, all 13 studies that reported specificity, 13 of the 15 studies that reported accuracy, eight of the nine studies that reported the F-1 score, and all 3 studies that reported AUC. Run-times were shared in only two of the previous studies. Compared with these, our approach produced a result at least three times faster than the study conducted by Mohammed et al. [29] and at least ten times faster than the study conducted by Toraman et al. [39]. No information was given about run-times in the other previous studies.

Overall, the results obtained within the scope of the study lagged behind the results obtained in the studies conducted by Tuncer et al. [26], Benbrahim et al. [35], and Loey et al. [38]. However, for a more detailed comparison, the number of images used in these studies should be compared with the number used in our study, which is higher. In particular, the number of images used in our study is almost three times the number used by Loey et al. [38]. Another important issue is the procedure for training and testing. There was no cross-validation in the studies by Benbrahim et al. [35] and Loey et al. [38]. In our study, cross-validation in the training-test processes is one of the important measures taken against the overfitting problem that occurs during the training of the network. Moreover, cross-validation is known to improve the reliability of the study results while balancing them. These issues should be taken into consideration when making a comparison.
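The cross-validation procedure referred to above can be sketched as follows. This is a minimal illustration of k-fold splitting, assuming a simple shuffled split without stratification; the function name, the seed, and the sample count are hypothetical, and the exact protocol used in the study (for example, class balancing or repetitions) is not reproduced here.

```python
import random


def k_fold_indices(n_samples, k=2, seed=0):
    """Return k (train, test) index splits for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # fixed seed for repeatability
    folds = [indices[i::k] for i in range(k)]   # k disjoint folds
    splits = []
    for i in range(k):
        test = folds[i]
        # Every fold except the i-th one is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# With k=2, each half of the data is used once for training and once for testing.
splits = k_fold_indices(10, k=2)
```

Because every image is used for testing exactly once, the reported mean values do not depend on a single, possibly favorable, training-test partition.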

When feeding the images to the CNN directly is compared with feeding them after the LBP has been applied, it can be seen that the LBP-applied images produced worse results than the original images. However, the pipeline classification algorithm presented in this study improved the results by combining the original and LBP-applied images. A significant part of the best results obtained in the study was achieved using the pipeline classification algorithm. In this sense, the results of the study support other literature studies [61,62,63,64,65,66] in which the CNN and LBP methods were used together and the use of the LBP was shown to increase success.

The success achieved through the pipeline approaches in the study is due to the fact that some correct classifications that could not be obtained either without the LBP or with the LBP alone were obtained by using the two together. Feeding the results from the two sources into the pipeline approaches increases the running time. However, the results obtained within the scope of the study show that this time cost can be eliminated by using DT-CWT. In this way, classification performance can be increased significantly without a time cost. It is considered that the model proposed in the study can also be used in many other deep learning studies.
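The idea of combining two sources of results can be illustrated with a purely hypothetical two-stage rule. The sketch below is not one of the pipeline algorithms of the study, whose details are given earlier in the paper; the function name, the `margin` parameter, and the averaging step are assumptions made only to show how a second model can resolve cases on which the first model is uncertain.

```python
def combine_predictions(p_original, p_lbp, margin=0.2):
    """Hypothetical two-stage decision rule (not the study's pipeline).

    p_original: Covid-19 probability from the CNN fed with the original image.
    p_lbp:      Covid-19 probability from the CNN fed with the LBP image.
    Returns True when the combined decision is Covid-19.
    """
    # Stage 1: accept the original-image result when it is far enough
    # from the 0.5 decision boundary.
    if abs(p_original - 0.5) >= margin:
        return p_original >= 0.5
    # Stage 2: otherwise let both models vote through their mean probability.
    return (p_original + p_lbp) / 2 >= 0.5


confident_case = combine_predictions(0.90, 0.10)   # decided in stage 1
uncertain_case = combine_predictions(0.45, 0.90)   # decided in stage 2
```

In this toy rule the second model changes the outcome only for borderline inputs, which is one way in which two classifiers can reveal correct decisions that neither produces alone.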

Another important factor in achieving the successful results in this study is considered to be the framing process, which isolated the chest region and clarified the area of interest before the training and test procedures started. Thanks to this pre-processing step, the parts lacking medical diagnostic information were removed from the images and only the relevant areas were used in the procedures.

As the size of the inputs given to the CNN increases, the time taken for training and testing increases. The DT-CWT transformation used in the study reduces the size of the image by half. Although the image sizes are reduced by half, there is no serious adverse effect on the study results. On the contrary, some of the best results achieved in the study were obtained using the DT-CWT. Although the pipeline classification algorithms proposed in the study increase the time needed to produce a result for an image, the times in question are less than half the time required when the images are used directly without applying LBP and DT-CWT. Also, all the training and test times reported in the study are per image, and approximately 98% of these times are spent on the training procedures. Therefore, if results obtained by a transfer learning approach are used with the pipeline classification algorithm proposed in the study, the times mentioned will decrease accordingly.
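The effect of such a size reduction on the CNN input can be illustrated with a simple stand-in. The sketch below does not implement DT-CWT: it replaces each 2x2 block with its average, which is only a crude low-pass approximation of a one-level LL band, and it assumes that "half" refers to each image dimension. The function name and the test image are hypothetical.

```python
def lowpass_halve(image):
    """Average each 2x2 block of a 2-D list; a stand-in for an LL band.

    This is NOT the DT-CWT. It only shows how halving each dimension
    shrinks the input that the CNN has to process.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# A 4x4 test image with pixel values 0..15.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
small = lowpass_halve(image)   # 2x2 result
```

Under this assumption, the number of pixels passed to the network drops to one quarter, which is consistent with a shorter time per image.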

The pipeline algorithms proposed within the scope of the study were tested on data sets with different proportions of Covid-19 and non-Covid-19 images, with different training-test ratios, and with different CNN architectures. The pipeline algorithms were successful in all these situations, each of which may have affected the results. This shows that the proposed pipeline algorithms are not limited to a particular setting but are general solutions. From this point of view, it can be expected that adding these pipeline algorithms to the algorithms used in other literature studies would increase the success of those studies.

The results of the study show that analyzing chest X-ray images with deep learning methods in the diagnosis of Covid-19 disease will speed up the diagnosis and significantly reduce the burden on healthcare personnel. To further improve the results of the study, increasing the number of images in the training set, i.e., creating publicly accessible databases containing the clinical data of patients with Covid-19, is of prime importance.

As future work, we aim to develop applications using CT images of the lungs, which, like chest X-ray images, are an important diagnostic tool in Covid-19 disease diagnosis. In addition, we plan to analyze how using results obtained through direct transfer learning in the pipeline classification algorithms affects the study results. Classifying the complex-valued sub-bands of the images obtained by applying DT-CWT directly with a complex-valued CNN is considered another important application.