1 Introduction

The Enteric Nervous System (ENS), on which the small intestine depends, controls the digestive tract, coordinating different movement patterns such as fast propulsion of content (peristalsis), mixing movements (segmentation), slow propulsion, and retropulsion (the expulsion of harmful substances associated with vomiting) [16].

Two networks, or neural plexuses, form the main components of the ENS: (1) the myenteric (or Auerbach’s) plexus, located between the longitudinal and circular muscle layers; and (2) the submucosal (or Meissner’s) plexus, located in the submucosa. The myenteric plexus controls the gastrointestinal movements, while the submucosal plexus mainly controls gastrointestinal secretion and local blood flow [16]. Enteric Glial Cells (EGC) are another cell type found in the ENS, playing a vital role in the homeostasis of gastrointestinal tract (GIT) functions [44].

Formerly, it was thought that the EGC served only as structural support for the neurons. However, more recent studies [18, 44] verified that these cells also have other functions, contributing significantly to neuronal maintenance, survival, and function. In neurodegenerative processes, these cells increase their expression and play a role in neuronal reconstruction [41].

Immunostaining, i.e., an antibody-based method to detect a specific protein in a sample, of the S100 protein is used to identify the EGC. S100 is a calcium-binding protein located in the cytoplasm and/or nucleus of nerve and non-nerve tissues and, within the ENS, is expressed exclusively in the EGC. It regulates the cytoskeleton’s structure and function, as well as the calcium homeostasis in the EGC’s cytoplasm. It also presents neurotrophic properties that play a neuroprotective role [17].

The study of ENS cells is usually approached in pre-clinical research, in which new methodologies and techniques are first tested in animals before being employed in humans, so that patients are not exposed to the risk of death or permanent disability. Several projects involving the ENS have been developed by researchers in the field of neurogastroenterology [2, 34, 35]. In these works, the enteric neurons and EGC are studied in order to understand the impact of different diseases on these cells, as well as to analyze the performance of different treatments.

The enteric neurons and the EGC are usually preferred in such studies because these kinds of cells are heavily affected by a considerable portion of chronic degenerative diseases. Thus, studies on different diseases can be carried out by evaluating a single kind of image sample.

Considering that such diseases affect the shape and quantity of the EGC, researchers perform morphometric and quantitative analyses to ascertain the health of a target animal. These analyses are exhaustive and time-consuming, since the lack of automation makes the overall process extraordinarily manual and repetitive, which reaffirms the relevance of developing computational models that execute such tasks automatically and efficiently.

With that in mind, this work aimed to develop an approach for the automatic identification of chronic degenerative diseases in EGC animal images, capable of categorizing whether an image sample from the ENS, evidencing the EGC, was obtained from a healthy or a sick animal.

The proposed method classifies the images based on the extraction of handcrafted and non-handcrafted (automatically learned) features. The handcrafted features are extracted by texture descriptors, while the non-handcrafted features are obtained with Convolutional Neural Networks (CNNs). We have also investigated their complementarity by combining the resulting classifiers from both scenarios through classifier combination techniques. As far as we know, this is the first work to deal with identifying chronic degenerative diseases on ENS images.

To experimentally evaluate the proposed approach, we created three datasets with EGC images of rats affected by different chronic degenerative diseases: Diabetes Mellitus, Cancer (Walker-256 Tumor), and Rheumatoid Arthritis. Each dataset is composed of image samples, collected by evidencing the EGC of the myenteric plexus, from control (healthy) animals and from animals presenting the target disease. The datasets are freely available for download and can also be considered a contribution of this work.

By achieving this goal, we look forward to giving ENS researchers a texture-based automatic alternative for performing EGC image analysis and to fostering new computer science research on the ENS, considering its great unexplored potential. It is worth mentioning that this approach may be extended to automatically detect diseases in human histopathological and radiographic images, aiming to reduce the possible subjectivity in their analysis.

The remainder of this work is organized as follows: Sect. 2 presents the approach proposed in this work, describing the data augmentation protocol, feature extraction, classification, and combination phases. Section 3 presents the three degenerative disease datasets proposed in this work. Section 4 presents the experimental analysis of the proposed approach, which is subdivided into exploratory investigation, parameters, configurations and results. In Sect. 5, we present the analysis and discussion regarding the obtained results and, finally, in Sect. 6 we describe the concluding remarks and future works.

2 Proposed approach

By analyzing the content of the images investigated in this work, we can observe that texture is one of the primary visual attributes to be explored. Therefore, we organized our proposed approach mainly around different strategies aimed at describing the textural content, favoring methods that can also cooperate with each other from the perspective of combining classifiers or representations. In this vein, we created descriptors founded on both the so-called handcrafted and non-handcrafted scenarios. An overview of the proposed approach can be seen in Fig. 1.

Fig. 1 Representation of the approach used in this work, in which handcrafted and non-handcrafted features were approached

The handcrafted features correspond to features manually extracted, aiming to find the best representation of the addressed data through a process also known as feature engineering. In this work, these features were extracted using widely and successfully used texture operators available in the literature. In total, three texture descriptors were employed: Local Binary Pattern (LBP) [31], Robust Local Binary Pattern (RLBP) [52] and Local Phase Quantization (LPQ) [32].

The handcrafted features are then used as input to well-known classification algorithms: Support Vector Machines (SVM), Gradient Boosting (GB), Random Forests (RF), k-Nearest Neighbors (k-NN) and Naive Bayes (NB).
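To make this pipeline concrete, the sketch below (not the authors' code; file names and classifier settings are illustrative assumptions) trains the five classifiers mentioned above on a precomputed handcrafted feature matrix and reports cross-validated F-measures.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X = np.load("texture_features.npy")   # hypothetical (n_samples, n_features) matrix
y = np.load("labels.npy")             # hypothetical labels: 0 = control, 1 = sick

classifiers = {
    "SVM": SVC(kernel="rbf", probability=True),
    "GB": GradientBoostingClassifier(),
    "RF": RandomForestClassifier(),
    "k-NN": KNeighborsClassifier(),
    "NB": GaussianNB(),
}
for name, clf in classifiers.items():
    # 10-fold cross-validation (stratified by default for classifiers)
    scores = cross_val_score(clf, X, y, cv=10, scoring="f1")
    print(f"{name}: mean F-measure = {scores.mean():.4f}")
```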

As aforementioned, the second scenario studied here extracts non-handcrafted features, i.e., features automatically extracted from the images, through feature learning techniques. In this work, we used three well-established Convolutional Neural Network (CNN) architectures to obtain this kind of feature: LeNet5 [23], AlexNet [22] and MaxNet [40].

The concept of Transfer Learning was also explored, using pre-trained CNNs to extract features from the image samples. For this purpose, the CNN architectures VGG16 [45], InceptionV3 [49] and InceptionResNetV2 [48] were used. The feature learning layers of these models had their weights trained on the ImageNet dataset [10]. It is worth mentioning that the Chi-square test (\(\chi ^2\)) was employed as a feature selection method, aiming to reduce the number of features extracted from these CNN architectures. Differing from the traditional use of transfer learning, instead of redesigning the CNN model’s classification layers, we classified the resulting features using the SVM algorithm.
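The sketch below illustrates this transfer learning pipeline under stated assumptions (224x224 RGB inputs, hypothetical .npy files, the selected feature count fixed at 2048): ImageNet-trained VGG16 feature learning layers produce the non-handcrafted features, the Chi-square test selects a subset of them, and an SVM performs the classification.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

images = np.load("egc_images.npy").astype("float32")   # hypothetical (n, 224, 224, 3) array
labels = np.load("egc_labels.npy")

# Feature-learning layers only, with ImageNet weights; the classification head is dropped.
extractor = VGG16(weights="imagenet", include_top=False)
maps = extractor.predict(preprocess_input(images))      # (n, 7, 7, 512) activation maps
features = maps.reshape(len(maps), -1)                  # flatten to one vector per image

# Chi-square selection (valid here because post-ReLU activations are non-negative).
selected = SelectKBest(chi2, k=2048).fit_transform(features, labels)

svm = SVC(kernel="rbf", probability=True)
print(cross_val_score(svm, selected, labels, cv=10, scoring="f1").mean())
```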

The probability estimates generated by the classification experiments were then used to combine the resulting classifiers. Thus, the sum, product, and max combination rules proposed by Kittler et al. [21] were applied, aiming to take advantage of a possible complementarity between classifiers generated by the use of different features and techniques.

It is worth mentioning that the experiments used the Stratified K-Fold Cross-Validation technique to divide the dataset, keeping the existing proportions of the problem’s classes. In this work, the k value was set to ten. More details about the concepts introduced here may be found in the following subsections.
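As a minimal illustration of this evaluation protocol, the toy example below shows how a stratified 10-fold split preserves the class proportions in every fold.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

y = np.array([0] * 30 + [1] * 70)       # toy labels with a 30/70 class balance
X = np.random.rand(len(y), 16)          # toy feature matrix

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    # Each test fold keeps roughly 3 samples of class 0 and 7 of class 1.
    print(fold, np.bincount(y[test_idx]))
```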

2.1 Handcrafted modeling

This section describes the handcrafted modeling, which follows the Pattern Recognition framework to classify features extracted through feature engineering. For simplicity, we refer to them as handcrafted features.

2.1.1 Data augmentation

In this work, some pre-processing techniques were explored, considering two main goals. The first one performs variations in the image samples’ coloration. To achieve it, the image samples first had their colors omitted by converting them to grayscale. This is motivated by the fact that all image samples have a red tonality, implying that the two other channels of the RGB color system (blue and green) may not have any influence when extracting the handcrafted features. Figure 2 presents a comparison between an image sample in its original coloration and the same one after conversion to grayscale.

Fig. 2 Digital image sample of the AIA dataset (left) converted to the grayscale (right)

Then, in a second approach, the image samples were pseudo-colored to create a new color pattern that highlights the EGC. Pseudo-coloring is a technique that colors a grayscale image sample, commonly achieved by mapping each grayscale value to an RGB value through what is usually referred to as a color map. Since the image samples used in this work were originally in the RGB color system, it was necessary to convert them to grayscale to apply such a technique. A representation of pseudo-coloring may be observed in Fig. 3.
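Both color operations can be reproduced with OpenCV, as in the sketch below (the file name is a placeholder; the exact implementation used in this work may differ): the sample is converted to grayscale and then pseudo-colored with the HSV color map.

```python
import cv2

bgr = cv2.imread("egc_sample.png")                   # OpenCV loads images in BGR order
gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)         # drop color, keep intensity only
pseudo = cv2.applyColorMap(gray, cv2.COLORMAP_HSV)   # map each gray level to an HSV-based color

cv2.imwrite("egc_sample_gray.png", gray)
cv2.imwrite("egc_sample_hsv.png", pseudo)
```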

Fig. 3 Digital image sample of the D disease group (a) and the same one pseudo-colored (b) using the HSV color map (c)

It is worth mentioning that these kinds of operations do not affect the image samples’ existing texture.

Most of the captured EGC image samples lack sharpness in their edges/shapes. These images result from the immunohistochemical reaction for a protein expressed exclusively in EGC, the S100 protein. This protein can be irregularly distributed in the cell, which can form irregular outlines and, combined with the low resolution of the microscopy used, often generates images with a blurred aspect.

Considering that, the second goal aimed to reduce the existing blur in the image samples and increase the definition of the edges. To achieve this, the data samples had their edges (or borders) highlighted by detecting and adding them to the original image samples. This is made possible by edge detection methods, which generally apply a filter using different kernels. In this work, three different filters were used: Laplacian, Sobel, and Scharr. Figure 4 presents a comparison between examples generated by using these filters.
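A hedged sketch of this edge-highlighting step is given below; kernel sizes and the equal weighting of the image and its edge map are assumptions, not the settings used in this work.

```python
import cv2

gray = cv2.imread("egc_sample.png", cv2.IMREAD_GRAYSCALE)   # hypothetical sample

def edge_map(img, method):
    """Return an 8-bit edge map computed with the chosen filter."""
    if method == "laplacian":
        edges = cv2.Laplacian(img, cv2.CV_64F)
    elif method == "sobel":
        edges = cv2.magnitude(cv2.Sobel(img, cv2.CV_64F, 1, 0),
                              cv2.Sobel(img, cv2.CV_64F, 0, 1))
    else:  # "scharr"
        edges = cv2.magnitude(cv2.Scharr(img, cv2.CV_64F, 1, 0),
                              cv2.Scharr(img, cv2.CV_64F, 0, 1))
    return cv2.convertScaleAbs(edges)

for method in ("laplacian", "sobel", "scharr"):
    # Add the detected edges back onto the original image to sharpen its contours.
    highlighted = cv2.addWeighted(gray, 1.0, edge_map(gray, method), 1.0, 0)
    cv2.imwrite(f"egc_sample_{method}.png", highlighted)
```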

Fig. 4 Image sample from the C class, from the TW dataset. Sub-figure (a) presents the original data sample, while the remaining sub-figures show the resulting images by applying the edge highlighting methods using the following filters: Laplacian (b), Scharr (c) and Sobel (d)

2.1.2 Texture descriptors

Different methods may be used to extract features from a given image. Regarding the visual attribute captured, texture descriptors can potentially obtain good results in several different situations, standing out in different scenarios/applications, including medical image analysis, biometric identification, etc. [26].

The texture of a digital image is characterized by variations in color intensity. Observing the differences between the pixels of an image provides a practical way to analyze an object’s texture. Such analysis complements other ways of interpreting an image, e.g., by color.

Among the different works described in the literature, carried out on diverse application domains, using texture-based features extraction approaches, we can mention: music genre classification [8], bird species classification [28], north atlantic right whale identification [14], identification of infants’ cry motivation [12], speech recognition [36], acoustic scene classification [13], and COVID-19 identification using chest X-ray images [38], among others.

In this work, three handcrafted texture descriptor approaches were used: Local Binary Pattern (LBP), Robust Local Binary Pattern (RLBP), and Local Phase Quantization (LPQ).

The LBP was originally proposed by Ojala et al. [31] and uses a local neighborhood around every input pixel to generate a representative binary value. Two main parameters may be cited: P and R. The P parameter represents the number of neighbor pixels around a central pixel c, while R represents the distance from it. The most common setup for LBP uses eight local neighbors (\(P = 8\)) at a distance of two pixels from c (\(R = 2\)) [5]. This configuration can be denoted as \(LBP_{8,2}\) or LBP(8, 2). An extension of the original LBP defines the final feature vector as the normalized histogram that counts all uniform binary patterns (a binary pattern is considered uniform if it has no more than two transitions from 1 to 0 and vice-versa when evaluated as a circular list). This histogram has a total length of 59 features and presents better results when compared to the histogram of all individual binary patterns [5, 7, 31]. More information about the LBP can be found in [31].
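A minimal sketch of this configuration using scikit-image is shown below (the file name is a placeholder); the "nri_uniform" variant yields the 59-bin uniform-pattern histogram mentioned above for \(P = 8\).

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

gray = cv2.imread("egc_sample.png", cv2.IMREAD_GRAYSCALE)

P, R = 8, 2                                                     # LBP(8, 2)
codes = local_binary_pattern(gray, P, R, method="nri_uniform")  # 59 distinct codes for P = 8
hist, _ = np.histogram(codes, bins=59, range=(0, 59))
features = hist / hist.sum()                                    # normalized 59-dimensional vector
```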

The RLBP was originally proposed by Zhao et al. [52] as a variation of the LBP texture descriptor. According to the authors, the RLBP more accurately captures the textural content of images containing noise interference that results in non-monotonic gray-level changes (even when such changes are not significant). The RLBP searches for a bit in the LBP pattern that may have been flipped by some kind of noise and then revises it. This method increases the original LBP’s robustness by making the binary patterns’ uniformity concept slightly more flexible [5]. More details about the RLBP may be found in [5].

The LPQ was initially proposed by Ojansivu and Heikkilä [32], being designed as a texture descriptor robust to image samples affected by blur. This descriptor has presented good performance even in classification tasks not targeted at images affected by this type of interference [7]. The LPQ uses phase information from a bi-dimensional Discrete Fourier Transform (DFT), or more specifically, a Short-Term Fourier Transform (STFT). The STFT is computed in a rectangular neighborhood \(N_x\) for each pixel in an image sample. The rectangular window size (\(N_x\)) is an important parameter to be varied, considering its direct impact on the generated features. After locally computing the texture of each pixel, the resulting codes are accumulated in a histogram, similarly to the LBP method. More detailed information about the LPQ texture descriptor can be obtained in [32].

2.2 Classifier algorithms

In the machine learning context, the supervised learning task builds a hypothesis capable of predicting the label of an unobserved data sample, based on the knowledge obtained from a known dataset containing labeled data samples. In other words, given a dataset with n examples of input/output pairs (\(x_1\), \(y_1\)), (\(x_2\), \(y_2\)), ..., (\(x_n\), \(y_n\)), in which x and y may represent any value (not necessarily numerical), a hypothesis (function h) is built that approximates the true function f generating each \(y_i\) value from the corresponding \(x_i\) value, i.e., \(y=f(x)\). To evaluate this hypothesis, a set of test data samples is used.

This kind of task can be divided into two categories: regression problems and classification problems. If the output y assumes a finite set of values, the task is categorized as a classification problem. Otherwise, if y assumes a continuous numerical value the task is categorized as a regression problem [42].

To accomplish classification tasks, classification algorithms may be used. Different classification algorithms are described in the literature. In this work, five of them were used: Support Vector Machines (SVM), Gradient Boosting (GB), Random Forests (RF), Naïve Bayes (NB) and k-Nearest Neighbors (k-NN).

The SVM is a well-known method, widely used for its efficiency in classification tasks. In its training step, a hyperplane is built, i.e., a decision boundary with the largest possible distance to the closest example points. In other words, it searches for a line (or surface) that separates the patterns from different classes, with the margin defined as the distance from this boundary to the closest pattern. The support vectors are the transformed patterns that delimit such a margin. The input is mapped to a higher-dimensional space by a nonlinear function, using what is known as the Kernel Trick. Such a mapping is based on the fact that some data may not be linearly separable in their original input space, but become easily separable when another dimension is added [11, 42].

The NB can be categorized as a Bayesian classification algorithm. Such algorithms identify an object based on its posterior probability; thereby, an object’s class is assigned using Bayes’ theorem. Introduced originally by Thomas Bayes (1702-1761), Bayes’ theorem (or Bayes’ rule) is a simple equation, quite often used by modern artificial intelligence systems as a basis for probabilistic inference. It allows the calculation of a new, unknown probability from known conditional probabilities that can usually be easily found. Based on this logic, the NB classifier assumes that a dataset’s attributes are conditionally independent of each other. The prediction of an unseen data sample is then performed based on the probabilities calculated from its attribute values, given the labeled data samples, targeting one specific class [42].

The RF algorithm was originally proposed as a method of building classifiers based on decision trees, being capable of increasing accuracy both in the training step and for samples not previously observed. Its operation can be briefly described as building multiple decision trees in randomly selected subspaces of the feature space, thus generalizing the classification in different, complementary ways. At the end of the process, the tree-structured classifiers \(h(x, \theta _{i}), i \in \{1, ..., k\}\), with k being the total number of decision trees, cast individual votes for one of the possible classes of x. The final prediction is the most popular class, i.e., the class with the greatest number of votes. More details about this algorithm may be found in [19].

The k-NN is an instance-based learning algorithm whose operation is based on the nearest neighbor rule. Considering a dataset with n labeled samples \(D^n = \{x_1, ..., x_n\}\), a certain sample \(x'\), such that \(x' \in D^n\), can be described as the closest point to a test sample x. By the nearest neighbor rule, x is classified with the same label/class as \(x'\). This rule can be naturally extended to operate with a larger number of neighbors: the k nearest neighbors of x are used to perform the decision when classifying such a test sample. In other words, each of the neighbors casts a vote for one of the possible classes and, at the end, x is classified with the most popular class, i.e., the class with the largest number of votes. In this context, the traditional nearest neighbor rule corresponds to \(k=1\). It is noteworthy that, to avoid draws, the value of k always assumes an odd integer [11, 42].

Boosting refers to a general and effective method of producing an accurate learner by combining a set of weak (or moderately inaccurate) learners [43]. Many methods have been developed based on this concept, e.g., Gradient Boosting, AdaBoost, and others. In this work, the GB algorithm is adopted. The algorithm’s core consists of constructing new base learners that are maximally correlated with the negative gradient of the loss function associated with the whole ensemble [30]. Thus, the final model can reduce the error over time, considering the errors made by the previous predictors.

2.3 Non-handcrafted modeling

Traditionally, in the pattern recognition framework, features are extracted from the dataset and used as input in machine learning algorithms that are supposed to learn how to discriminate patterns from different classes. Thereby, a significant part of the effort put in works based on machine learning algorithms is dedicated to feature engineering. This is a time-consuming and challenging process that requires specialized knowledge.

That being said, we introduce the Feature Learning (FL), or Representation Learning (RL), concept. In this category, the techniques can automatically generate data representations, favoring the extraction of useful information during the building of classifiers and other predictive systems [1]. Among the countless ways to perform FL, Deep Learning methods, such as Convolutional Neural Networks (CNN), can be highlighted. Keeping that in mind, this section describes the non-handcrafted modeling, which aims to classify features extracted automatically by FL, here referred to as non-handcrafted features. Figure 5 illustrates the general scheme employed to obtain features from the penultimate layer of a CNN.

Fig. 5 General scheme of the feature learning

2.3.1 Pre-processing

Different approaches can be used to avoid overfitting, helping to ensure better and more accurate classification rates. One of them, Data Augmentation, artificially increases the total number of image samples by performing small changes to the original samples through different operations [47].

This approach is usually employed as a pre-processing technique when working with CNNs, taking into account the huge amount of data required to achieve satisfactory classification rates. Such a technique was applied in this work, considering the modest quantity of image samples in the used datasets. At this point, it is important to observe that the limited size of the datasets used in this work is virtually insurmountable: enlarging them involves inducing diseases in research animals and requires approval, through a long-term protocol, by the “Standing Committee on Ethics in Animals Experimentation” of our university.

To generate new image samples based on the available ones, randomly chosen image processing operations were applied to the original samples, such as adding Gaussian noise, contrast normalization, adding blur, vertical/horizontal flips, saturation variations, sharpening, and others. Some examples of generated image samples can be seen in Fig. 6.
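The augmentation pipeline can be sketched as below; the use of the imgaug library and the specific operations and value ranges are assumptions suggested by the caption of Fig. 6, not the exact configuration of this work.

```python
import numpy as np
import imgaug.augmenters as iaa

augmenter = iaa.SomeOf(4, [                               # apply four randomly chosen operations
    iaa.AdditiveGaussianNoise(scale=(0, 0.05 * 255)),
    iaa.GaussianBlur(sigma=(0.0, 2.0)),
    iaa.LinearContrast((0.75, 1.5)),
    iaa.AddToHueAndSaturation((-20, 20)),
    iaa.Fliplr(1.0),
    iaa.Flipud(1.0),
    iaa.Sharpen(alpha=(0.0, 1.0)),
    iaa.CoarseDropout(0.05, size_percent=0.1),
])

images = np.load("egc_images.npy")                        # hypothetical (n, h, w, 3) uint8 array
augmented_batches = [augmenter(images=images) for _ in range(32)]   # 32 new samples per original
```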

Fig. 6 Examples of samples generated using Data Augmentation, by applying four randomly chosen techniques. Some of the techniques experimented with are: adding Gaussian noise and/or blur; adding hue and saturation; flipping the image vertically and/or horizontally; changing the color space to BGR; Coarse Dropout; and others. The original image sample can be seen in the top left corner

2.3.2 Convolutional neural networks (CNN)

The CNN is an efficient algorithm, widely used in pattern recognition and image processing, due to favorable characteristics such as a simple structure, fewer training parameters, and adaptability. Its shared-weight structure makes it similar to the neurons’ connectivity pattern found in the human brain. Being heavily inspired by the structure of the human visual system, each neuron of a CNN does not globally visualize an entire image sample; instead, only a portion of it (a local area) is visualized, reducing the network model’s complexity and the number of weights [24, 25].

A CNN can be considered a variation of the Multi-Layer Perceptron networks, being capable of applying filters to visual data while keeping the neighborhood relation between the image pixels throughout the network processing [50].

Traditionally, the CNN’s layers can be separated into two sets: (1) the layers responsible for the image sample’s feature extraction by FL (producing what we name, in this work, non-handcrafted features), usually composed of convolutional and pooling layers; and (2) the layers responsible for the classification, composed of one or more fully connected (dense) layers [27, 50]. It is worth mentioning that the total number of layers may differ from one CNN architecture to another.
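The toy Keras model below makes this split explicit (layer sizes are arbitrary examples, not one of the architectures used in this work): convolution and pooling layers perform the feature learning, and the final dense layers perform the classification.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    # (1) feature-learning layers (convolution + pooling)
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(128, 128, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # (2) classification layers (fully connected)
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(2, activation="softmax"),   # control vs. sick
])
model.summary()
```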

2.4 Transfer learning

Traditionally, machine learning algorithms predict the label of unknown data samples using models previously trained on a labeled dataset. With the evolution of such algorithms and the growing complexity of the tasks that employ them, an enormous amount of training data becomes necessary to achieve satisfactory results [33].

The knowledge transfer concept induces the thought that previous knowledge, acquired for a specific purpose, may be reused for another one. With that in mind, the Transfer Learning paradigm is presented. It aims to solve new problems faster and with better solutions by using previously obtained knowledge [33].

Transfer Learning extracts knowledge from a source task and uses it in a target task. When using CNNs, this method makes it possible to use pre-trained models (trained on different datasets and for different purposes) in new classification tasks, thus demanding a smaller number of data samples for the CNN’s training, since the FL layers are kept unchanged and only the classification layers need to be trained. It is worth mentioning that the quantity and dimensions of such layers may differ from those used originally in the base model, aiming to adapt the network to the target task. The classification layers can also be replaced by traditional classification algorithms, e.g., Support Vector Machines (SVM).

Finally, it is essential to emphasize that transfer learning in this work can be seen as a timely strategy to deal with the dataset’s modest size. As already pointed out, in our specific application, it is not easy to enlarge the dataset, since doing so requires new animal resources and the application of a long-term protocol.

2.5 Classifier combination

Many classification algorithms can generate probability estimates in their outputs. Such values reference the prediction scores for each class present in the problem, obtained from the evaluation of test samples. It is possible to perform different operations over these probability scores aiming to combine them. This way, new prediction scores are obtained based on the previously reached values, generating a final result.

Three classifier combination rules, originally proposed by Kittler et al. [21], were used in this work for this purpose; a short numerical sketch of the three rules is given after the list below. In these equations, x refers to the pattern to be classified, n is the number of classifiers involved in the combination, c is the number of classes, \(y_i\) is the output of the \(i\)-th classifier in a problem with the possible class labels \(\Omega = \{\omega _1, \omega _2, ..., \omega _c\}\), and \(P(\omega _k | y_i(x))\) is the probability of the sample x belonging to the class \(\omega _k\) according to the \(i\)-th classifier.

  • Max Rule: for each existing class from the dataset, the maximum value between the prediction scores from different classifiers is chosen. Later on, the final result is given by the class with the biggest score. This combination rule can be represented by Equation 1.

    $$\begin{aligned} max(x)={\mathrm{arg\,max}}_{k=1}^c max_{i=1}^n P(\omega _k | y_i(x)) \end{aligned}$$
    (1)
  • Sum Rule: considering all generated classifiers, the calculated predictions are summed for each class. Then, the class with the maximum score is chosen as the final result. This combination rule can be represented by Equation 2.

    $$\begin{aligned} sum(x) = {\mathrm{arg\,max}}_{k=1}^c \sum _{i=1}^n P(\omega _k | y_i(x)) \end{aligned}$$
    (2)
  • Product Rule: this rule works similarly as the sum rule. But, instead of performing a sum operation, the values are multiplied. This combination rule can be represented by Equation 3.

    $$\begin{aligned} prod(x) = {\mathrm{arg\,max}}_{k=1}^c \prod _{i=1}^n P(\omega _k | y_i(x)) \end{aligned}$$
    (3)

It is important to remember that, by combining classifiers, we have the opportunity to merge results obtained using handcrafted and non-handcrafted features, since we have probability predictions as output from classifiers in both modes. The combination of handcrafted and non-handcrafted approaches aiming at a single final decision has already proven to be effective in other works [9, 29].

3 The datasets

This work presents three novel datasets created by researchers of the Enteric Neural Plasticity Laboratory of the State University of Maringá. Such datasets are composed of image samples obtained from the ENS of rats, in which the EGC can be visualized through the immunostaining of the S100 protein. Figure 7 shows some image samples taken from the datasets. Each dataset represents an investigated disease: rheumatoid arthritis (AIA), cancer (TW) and diabetes mellitus (D).

Each dataset is referred to in this work by the disease’s name abbreviation (AIA, TW, and D). It is worth mentioning that, in this scenario, the datasets can also be called “disease groups.” The datasets are composed of two classes, one containing image samples extracted from sick animals (S) and the other from control/healthy (C) ones. The exact quantity of image samples per dataset and per class, and the image samples’ dimensions, can be seen in Table 1.

Fig. 7 Image samples from the datasets and both classes. The red coloration results from the immunostaining of the S100 protein. The EGC can be visualized as the brightest points in the samples

The datasets were created taking into account the ethical principles under the terms set out in the Brazilian federal lawFootnote 1, established by the Brazilian Society of Science on Laboratory Animals (SBCAL). All the procedures were submitted to and approved by the Standing Committee on Ethics in Animals Experimentation of the State University of MaringáFootnote 2. After the experimental time, the animals were frozen and sent for incineration.

Table 1 Diseases evaluated in this work, according to the categories’ distribution in the experimental groups

The jejunum, i.e., the second part of the small intestine, of male adult Wistar rats (Rattus norvegicus), albinus variety (D and TW), and Holtzmann rats (AIA) was used in this study. The experimental models were developed according to the works of Frez et al. (2017) [15] (D), Vicentini et al. (2017) [51] (TW) and Souza et al. (2011) [46] (AIA). Variations in the animals’ ages were also present in every disease investigated here. The animals’ euthanasia was performed with thiopental (150 mg/kg)Footnote 3 administered intraperitoneally. Then, the jejunum was collected and processed to evidence the EGC.

The jejunum fixation and S100 immunostaining were performed according to the protocol proposed by Pereira et al. (2011) [39]. The images of the EGC were obtained using an optical microscopeFootnote 4 with immunofluorescence filters and a high-resolution cameraFootnote 5 attached to a computer. Photomicrographs were recorded using the Motic Images Plus 2.0ML software (Motic China Group Co.). The image samples were then obtained at 20\(\times \) magnification, from randomly chosen places of the animals’ jejunum.

It is worth mentioning that the image sample acquisition varies according to the target dataset. This can be justified mainly by the possibility of distinct tissue fixations and specific immunostaining responses. More detailed information about these processes and the datasets, such as the induction of the diseases, may be found in [15] (D), [51] (TW), and [46] (AIA).

The datasets used in this work were made freely available for research purposesFootnote 6, so that other researchers can benefit from them and properly compare the results obtained using different techniques with those obtained here.

4 Experimental analysis

In this section, we describe the results obtained by performing the experimental protocol designed in this work. Section 4.1 describes the exploratory investigation that led to this work’s best classification rates. Section 4.2 presents the parameters and implementations used in the algorithms employed in this work. Finally, Sect. 4.3 presents and briefly discusses this work’s results.

4.1 Exploratory investigation

The experiments were performed, following a result-oriented exploratory research approach. Such experiments may be divided into three categories: (1) the ones that evaluated the performance of handcrafted features; (2) the ones that evaluated the performance of non-handcrafted features; and (3) the ones that combined the resulting classifiers from the previous categories.

4.1.1 Handcrafted modeling

The set of experiments using handcrafted features was designed to evaluate different approaches, such as assessing different texture descriptors, enhancing the image samples’ visual features, and testing different classification algorithms.

The first experiments aimed to evaluate the performance of features extracted based on the image samples’ texture. To accomplish such a goal, we used LBP, RLBP, and LPQ texture descriptors. These were applied to the image samples, and the resulting features were used as input in the SVM classification algorithm. It is worth mentioning that the assumed values of the following parameters: neighborhood size, distance from the central pixel (LBP and RLBP), and window size (LPQ), were chosen according to the best performances obtained by Costa [7].

From these experiments’ results, it was possible to notice a tendency toward more significant results when using the LPQ texture descriptor, suggesting that it would be the most appropriate for the problem approached here. This may be justified by the fact that the LPQ was created to better extract features from image samples affected by blur, a kind of signal noise present in the image samples of this work’s datasets. It is worth mentioning that the AIA classification performance reached its best values using the RLBP texture descriptor. This can be justified because the image samples from this disease group are the ones least affected by blur and the ones most affected by the different noises generated by the immunohistochemistry background.

With that in mind, new experiments were performed to reach better classification rates. To achieve this goal, new features were extracted from the image samples using the LPQ texture descriptor and varying the window size among the values 3, 5, 9, 11, and 13. From these results, it is possible to observe better classification rates than in the previously performed experiments, showing that there is still room for improvement in the LPQ’s performance on this task if the parameters are properly adjusted.

From this point, we aimed to evaluate the performance of experiments carried out by applying pre-processing techniques. Firstly, the image samples had their colors omitted by converting them to grayscale. Since these kinds of operations do not affect the existing texture in the image samples, the texture descriptors, and their parameters, used in these experiments were the ones that had reached the best classification rates so far. The resulting F-measure values from these experiments showed an improvement on the AIA and D datasets.

In a second moment, the image samples were pseudo-colored. Different color maps were evaluated to perform the pseudo-coloring, such as HSV, Autumn, Jet, Rainbow, and others. The resulting images were observed, aiming to search for the best choice that created greater visibility of the EGC. From the ones tested, it was possible to note that the HSV colormap best suited the searched goal.

By analyzing the results obtained from this experiment, we observe an opposite behavior compared to the results that converted the image samples to the grayscale. For the AIA and TW datasets, the found F-Measure values were inferior to the ones found in past experiments. While for the D disease group, a slight improvement could be observed compared to the best one found until this point.

Then, considering the existing blur in the image samples, new experiments were performed aiming to reduce this kind of noise and increase the definition of the edges. To achieve this goal, the data samples had their edges (or borders) highlighted by detecting them (with the Laplacian, Sobel, and Scharr filters) and adding them to the original image samples. For this approach, the LBP, RLBP, and LPQ texture descriptors were used to extract handcrafted features.

The choice to re-test the LBP and RLBP texture descriptors in these experiments, even knowing the superior performance of the LPQ on this problem, is justified by the direct impact that the edge highlighting method has on the image’s texture. This creates the need to re-evaluate the performance of these texture descriptors, given the new environment to be explored. Besides, in this case, the test is performed with images supposedly free from the undesirable effects of blurring (a characteristic the LPQ excels at handling). The operations were applied to the entire dataset. From the new image samples generated, features were extracted using the texture descriptors previously mentioned and classified by the SVM classification algorithm.

By the end of these experiments, it is possible to conclude that although the image samples generated by edge highlighting methods may be visually better when used as input in classification systems, the values obtained from the classification rates were inferior to those obtained without such process. One may conclude that such an approach would be inefficient for the problem investigated here.

The experiments performed until this point aimed to identify the best possible way to represent the data samples of the approached problem when used as input through handcrafted features. In such experiments, the SVM classification algorithm was always used. Therefore, we carried out new experiments to evaluate the performance of different classification algorithms, using the best extracted features found, i.e., the features that led to the best classification rates so far, as input to the algorithms. To achieve this goal, the k-NN, GB, RF, and NB classification algorithms were tested. However, the results found in these experiments were still inferior to those found using the SVM classification algorithm.

4.1.2 Non-handcrafted modeling

From this point, the methods that use non-handcrafted features, i.e., features extracted automatically by FL, are evaluated. Therefore, experiments were performed using different CNN architectures and transfer learning.

The CNN tests were performed using the LeNet5, AlexNet, and MaxNet CNN architectures. From these experiments, it was possible to conclude that, although the number of samples in the dataset was artificially increased, the experimental models’ training was not able to succeed in the task of abstracting the problem in question.

New experiments were then employed, aiming to investigate the efficiency of applying the Transfer Learning technique to the problem investigated here. At the end of the proposed experiments, it is possible to notice the good performance rates obtained by applying the Transfer Learning technique. These rates reached values equivalent to or slightly greater than the ones resulting from classifying handcrafted features, showing that such a technique would be more appropriate for the classification of two out of the three datasets addressed here, AIA and D. This is justified by the fact that the strategy accomplished here does not demand a huge amount of data samples, like traditional CNN classification methods do, and extracts features using pre-trained FL layers proposed to solve problems from different domains.

4.1.3 Classifiers combination

Previously performed experiments used handcrafted and non-handcrafted features as input to create classification models. By such experiments, it was possible to observe the performance of different feature extraction and classification techniques to ascertain the generated classifications rates’ behaviors on the investigated problem. With this in mind, the last experiments performed aimed to combine the most promising classifiers presented until this point for each scenario, i.e., handcrafted and non-handcrafted.

For each dataset, six classifiers were chosen, three of them obtained from handcrafted features, and the other three from non-handcrafted features. Thus, all possible sets of configurations were generated, which is equivalent to 57 different possibilities (\(2^6 - 7 = 57\), ignoring the empty set and the ones with length equal to one) for each disease group. These were combined using the Sum, Product, and Max rules. At the end of these experiments, one may notice an improvement in the classification rates for every disease group, which indicates a complementarity between the classifiers (handcrafted and non-handcrafted) used.

4.2 Parameters and configurations

The image samples were pseudo-colored using the color-maps made available by the OpenCV libraryFootnote 7. The edge enhancement and the image samples’ conversion to the grayscale were performed using such a library.

The SVM implementation used here belongs to the libSVM library [3]. We have used the Radial Basis Function (RBF) kernel, and a grid search procedure optimized the parameters C and \(\gamma \). The remaining classification algorithms implementation used here are those available in the Scikit-learn [37] library. The k-NN algorithm used five nearest neighbors (\(k=5\)) to perform the classification. The distance was calculated by the Minkowski Distance with \(p=2\) and the voting system had uniform weights, i.e., the neighbors’ votes had the same weight in the final prediction. The RF algorithm used ten trees, and the Gaussian version of the NB algorithm was used. The GB algorithm optimized the deviance loss function, with a learning rate equivalent to 0.1, 100 estimators and quality of the split measured by the mean squared error with an improvement score by Friedman. Such parameters were chosen considering the default values suggested by the scikit-learn library.
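The configurations listed above can be expressed in scikit-learn roughly as follows (a sketch: the grid-search ranges are assumptions, and the original SVM implementation came from libSVM rather than scikit-learn).

```python
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.naive_bayes import GaussianNB

# RBF-kernel SVM with C and gamma tuned by grid search (ranges are illustrative).
svm = GridSearchCV(SVC(kernel="rbf", probability=True),
                   {"C": [2 ** i for i in range(-5, 16, 2)],
                    "gamma": [2 ** i for i in range(-15, 4, 2)]})
knn = KNeighborsClassifier(n_neighbors=5, weights="uniform", p=2)  # Minkowski distance, p = 2
rf = RandomForestClassifier(n_estimators=10)                       # ten trees
nb = GaussianNB()                                                  # Gaussian Naive Bayes
gb = GradientBoostingClassifier(learning_rate=0.1, n_estimators=100,
                                criterion="friedman_mse")          # default (deviance) loss
```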

The CNNs used were implemented by the use of the Keras [6] library. The created models were compiled using the Adam Optimizer [20], with the \(\beta _1\), \(\beta _2\) and decay parameters equivalent to 0.9, 0.999 and 0.0005 respectively.

The CNN tests were performed using the LeNet5, AlexNet, and MaxNet architectures, with a batch size of 128 and a learning rate varying between \(10^{-3}\) and \(10^{-5}\). The number of epochs was set to 1024. It is worth mentioning that the training process used two main callback functions: Early Stopping and Model Checkpoint.

The Model Checkpoint function was used during the training process to save the best model during the training iterations. It monitored the accuracy value obtained by the current \(i^{th}\) iteration and saved it once such value was greater than those found on prior iterations. After the training step, the best saved model was used for testing.

The other callback function, Early Stopping, was employed to avoid the execution of an unnecessary amount of epochs while training the models. As early experiments were performed, it was possible to notice that the models converged in a relatively small quantity of epochs. Having that in mind, such a function was set to monitor the validation loss value, stopping the training process if it did not improve in the last 64 epochs. Such protocol was repeated for each of the k-fold cross-validation rounds (with k=10).
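A hedged sketch of this training setup with Keras callbacks is given below (the file name and the exact spelling of the monitored metrics are assumptions; the model definition is omitted).

```python
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # Keep the weights of the best model observed during training.
    ModelCheckpoint("best_model.h5", monitor="accuracy",
                    save_best_only=True, mode="max"),
    # Stop if the validation loss does not improve for 64 consecutive epochs.
    EarlyStopping(monitor="val_loss", patience=64),
]

# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           batch_size=128, epochs=1024, callbacks=callbacks)
```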

To reduce the possibility of overfitting in the CNNs that did not use Transfer Learning, two different techniques were applied: (1) artificially increasing the number of image samples through data augmentation, generating 32 new image samples from each of the existing ones; and (2) using dropout layers in the CNN architectures.

The Transfer Learning experiments used the FL layers of the VGG16, InceptionV3, and InceptionResNetV2 architectures to extract features from the EGC images. It is worth mentioning that these layers’ weights were obtained by training on the ImageNet dataset. The implementations of such networks are all made available by the Keras library. Unlike common approaches, instead of classifying such features with a new set of fully connected (dense) classification layers, the SVM classification algorithm was used for this task. This is justified by the results previously described in this study.

Additionally, considering that the FL layers of the CNN architectures used with the Transfer Learning technique generate a great number of features, the Chi-square (\(\chi ^2\)) statistical test was employed as a feature selection technique, aiming to reduce the number of features to a smaller value n. By doing so, computational time is reduced without fully compromising the results. To accomplish this goal, the aforementioned method was applied after the features were extracted by the pre-trained feature learning layers of the CNN models. The n selected features for each image sample were then used as input to the classification algorithm. The tested values for n were 256, 512, 1024, 2048, and 4096 features.

4.3 Results

This section presents the classification rates found when executing the experiments according to the exploratory investigation presented in Sect. 4.1.

4.3.1 Handcrafted modeling

Table 2 presents the results found in the experiments that aimed to evaluate the performance of features extracted by texture descriptors. It can be observed that for the AIA disease, the best resultsFootnote 8 were found when the RLBP texture descriptor was used, with a F-Measure of 0.8062. For the remaining diseases, such results were found by using the LPQ texture descriptor, reaching F-Measures of 0.9447 and 0.8637 for TW and D diseases, respectively.

Table 2 F-Measure values found from the experiments using different texture descriptors

Table 3 reports the experiments’ classification rates that extracted features using the LPQ texture descriptor and used variations in the window size value. By analyzing these experiments, it was possible to observe a better efficiency of features extracted by the LPQ texture descriptor. Table 4 compares the best results.

Table 3 F-measure values found from experiments that explored different values for the LPQ’s window size
Table 4 Best results found in the experiments that evaluated the performance of handcrafted features, extracted by the use of different texture descriptors

In Table 4, it is possible to observe F-measure values of 0.8110, 0.9712, and 0.8773, respectively, for the AIA, TW, and D datasets. All of the recently mentioned values were reached by classifying handcrafted features extracted by the LPQ texture descriptor.

The results for the experiments that classified the image samples converted to grayscale are in Table 5. In this case, one may observe an improvement in the classification rates in two of the three approached disease groups, namely AIA and D. In both of them, the F-Measure values reached an average improvement of 2 percentage points, presenting 0.8325 and 0.8907, respectively. For the TW disease group, the greatest F-Measure value found was 0.9471, being 0.0241 inferior to the highest value found in past experiments.

Based on these values, it is possible to observe that the best classification rates found in this work until this point, were reached with the LPQ texture descriptor for all the evaluated diseases. With the window size parameter being equivalent to 11 for AIA, and 13 for TW and D.

Table 5 Classification rates found by the tests where handcrafted features were extracted from the image samples converted to the grayscale

The results of the experiments that pseudo-colored the image samples are presented in Table 6. When analyzing them, it can be observed that for the D dataset a slight difference of 0.006 can be noticed when compared to the best value found so far, described in Table 5.

Table 6 Results obtained from the tests where handcrafted features were extracted from image samples pseudo-colored with the HSV colormap

At the end of the experiments that varied the image samples’ coloration, it is possible to compare the applied variations with the image samples in their original color system (RGB) and analyze their overall performance. Figure 8 presents a visual comparison between the best results obtained for each of the color variations applied to the image samples. It is possible to observe that, for each disease group, a different variation of the image samples was used to achieve its best F-Measure value.

Fig. 8 Comparison between the best results obtained from the classifications that aimed to make variations in the image samples’ coloration

For the AIA disease group, the image samples that were converted to grayscale reached a better F-Measure than other approaches. Its value was equivalent to 0.8325, nearly 2 percentage points more than the remaining color systems. To achieve this value, the \(LPQ_{11}\) texture descriptor was used.

The TW disease group, in its original color system (RGB), presented a performance approximately 2.5 percentage points greater than the other two variations. The best F-Measure was 0.9712, obtained by classifying handcrafted features extracted using the \(LPQ_{13}\) texture descriptor.

Finally, the D disease group achieved its best F-Measure value, equivalent to 0.8967, by the extraction of handcrafted features through the use of the \(LPQ_{13}\) texture descriptor and using the image samples pseudo-colored to the HSV colormap.

The F-Measure values obtained from the experiments that highlighted the image samples’ edges are shown in Table 7. Figure 9 presents a graphical comparison between the best results found in these tests and the ones previously obtained.

Table 7 F-Measure values found from the experiment that highlighted the edges of the image samples, through the use of the Laplacian, Sobel, and Scharr filters
Fig. 9 Comparison between the resulting F-Measure values obtained from the classifications where the image samples had their shapes/edges highlighted and the ones obtained from the unchanged image samples (original)

For the AIA dataset, it is possible to observe that the Scharr filter produced the best results. Using this filter and the LBP texture descriptor, an F-Measure value of 0.8277 was reached, approximately 2 percentage points better than the best values found with the other filters.

For the TW dataset, the best F-Measure value was 0.9471, reached using the RLBP texture descriptor and the Sobel filter. It is worth mentioning that the classification rates found for this dataset showed a high level of similarity; with few exceptions, they remained around 0.94.

Finally, for the D dataset, the highest F-Measure value obtained was 0.8695. It was reached with features extracted by the LPQ texture descriptor from images whose edges were highlighted by the Laplacian filter. Comparing the F-Measure values obtained in these tests, it is possible to observe a higher performance when the LPQ texture descriptor was used, which shows its potential in the scenario where the image samples are less affected by blur.

The next experiments aimed to evaluate the performance of different classification algorithms. The features used as input in the experiments carried out here were extracted by:

  • AIA: the \(LPQ_{11}\) texture descriptor, with the image samples converted to the grayscale;

  • TW: the \(LPQ_{13}\) texture descriptor, with the image samples in their original coloration;

  • D: the \(LPQ_{13}\) texture descriptor, with the image samples pseudo-colored to the HSV colormap.

The results obtained from the execution of these tests can be seen in Table 8. In it, it is possible to observe that the RF classification algorithm produced better classification rates than the other algorithms. In contrast, the NB algorithm obtained the lowest classification rates in the executed tests; compared to those obtained by the RF algorithm, the NB’s F-Measure values were, on average, 0.0837 inferior. Such results may be justified by the algorithm’s nature, considering that its performance usually stands out in smaller datasets with categorical features.

Table 8 Classification rates obtained from the experiments that tested the execution of different classification algorithms. The results found by the classification using the SVM algorithm are also shown for comparison reasons
Fig. 10 Graphical comparison between the F-Measure values obtained by the classifications performed using different classification algorithms

A comparison is represented in Fig. 10. In it, it is possible to observe the SVM algorithm’s superior performance on the same set of features, with its values, on average, 0.0638 greater than those achieved by the RF algorithm, which had the best classification rates among the algorithms tested here.

4.3.2 Non-handcrafted modeling

From this moment, we describe the experiments carried out using CNNs to extract the features and perform the classification for the proposed problem. The best-reached results by each used architecture can be seen in Table 9.

Table 9 Best results obtained for each CNN architecture, from the experiments that used them to perform the FL and classification

In Table 9, it is possible to observe that, for the set of image samples of the AIA disease group, the LeNet5 CNN architecture presented the best F-Measure value. This leads us to conclude that a simpler architecture would be satisfactory for this work’s goal in this scenario. The TW and D disease groups, otherwise, had their best F-Measure values found using the AlexNet architecture, which may be considered a more robust architecture. However, when these results are compared to the ones found through handcrafted features, it is possible to observe lower classification rates, with an average difference of 0.1194 in the F-Measure values.

The best results from the experiments that employed the Transfer Learning technique may be found in Table 10. We may notice that the InceptionResNetV2 architecture achieved the worst F-Measure values, around 22 percentage points (on average) inferior to the best values found in these experiments.

Table 10 Best classification rates found from executing the tests using the Transfer Learning method, for each evaluated CNN architecture

For the AIA disease group, the best results were reached using the InceptionV3 architecture and 4096 features. In this case, the F-Measure value was 0.8468. The TW and D disease groups reached their best results using the VGG16 architecture, with the feature quantity reduced to 2048 (TW) and 4096 (D). It is worth mentioning that the best F-Measure values for the AIA and D datasets surpassed the best ones found until this point, obtained using handcrafted-features, by 0.0143 and 0.0065, respectively.

4.3.3 Classifiers combination

Finally, this last experiment evaluates the impact of combining classifiers. For this purpose, we selected the most promising classifiers presented until this point; Table 11 briefly introduces them. It is worth mentioning that, although the classifiers that used the GB algorithm presented better results than the RF algorithm in an isolated way, the latter was chosen to perform the combinations shown here due to its more efficient observed complementarity.

Table 11 Description of the classifiers used to execute the experiments that employed classifiers combination techniques

The best results obtained by the combination are reported in Table 12.

Table 12 Best F-Measure values found from the experiments that combined classifiers

For the AIA dataset, the combination of three classifiers through the max rule produced an F-Measure value of 0.893, which is 4.62 percentage points better than the best one found so far (available in Table 10). Among the combined classifiers, one used handcrafted features, while the other two used non-handcrafted features. The classifiers with IDs 3 and 4 were especially influential in the classification rates obtained in these tests, being present in all of the best results found.

When analyzing the classification rates for the TW dataset, the best F-Measure value found was 0.9845, approximately 1.33 percentage points higher than the best F-Measure value found until this point for this dataset (observable in Table 3). In total, five classifiers were combined by the sum rule to achieve this value: three that used handcrafted features and two that used non-handcrafted features.

For the D dataset, the best F-Measure value was 0.9513, which is 5.13 percentage points better than the best value previously found for this dataset (available in Table 10). This result was obtained by combining three classifiers using the sum rule: two that used handcrafted features and one that used non-handcrafted features. It is worth mentioning that the classifiers with IDs 1 and 2 can be considered essential to these performances.
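A minimal sketch of the sum, max, and product rules is given below, under the assumption that each selected classifier outputs class-membership probabilities for the same test samples; the probability values and variable names are invented for illustration, not taken from the authors' implementation.

```python
# Fusion of classifier outputs by the sum, max and product rules.
import numpy as np

def combine(prob_list, rule="sum"):
    """prob_list: list of (n_samples, n_classes) probability arrays,
    one per classifier. Returns the predicted class per sample."""
    stacked = np.stack(prob_list, axis=0)      # (n_classifiers, n, n_classes)
    if rule == "sum":
        fused = stacked.sum(axis=0)
    elif rule == "max":
        fused = stacked.max(axis=0)
    elif rule == "product":
        fused = stacked.prod(axis=0)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return fused.argmax(axis=1)

# Example with three hypothetical classifiers and two classes (healthy/sick).
p1 = np.array([[0.8, 0.2], [0.4, 0.6]])
p2 = np.array([[0.6, 0.4], [0.3, 0.7]])
p3 = np.array([[0.7, 0.3], [0.55, 0.45]])
for rule in ("sum", "max", "product"):
    print(rule, combine([p1, p2, p3], rule))
```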

Figure 11 graphically represents the configurations used to combine the classifiers that achieved the best classification rates presented in this section, as reported in Table 12, for each dataset approached in this work.

Fig. 11 Graphical representation of the configurations used to combine classifiers and achieve the best classification rates of this work

5 Discussion

In this section, we discuss the results presented so far. The discussion is guided by the following questions, aiming to reach a more comprehensive understanding of the behavior of the methods and techniques evaluated on the classification tasks:

  • Which features provided the best results in each feature representation scenario and disease group?

  • Which feature representation provided the best results?

  • Which disease group is easier/harder to predict?

  • Have the fusion strategies contributed to improving the results?

5.1 Which features provided the best results in each feature representation scenario and disease group?

This question is threefold, i.e., we have to consider the three different feature representation scenarios used in each dataset: the handcrafted features (LBP, RLBP, and LPQ); the non-handcrafted features with pre-configured CNNs (MaxNet, LeNet5, and AlexNet); and the non-handcrafted features with transfer learning (VGG16, InceptionV3, and InceptionResNetV2).

To answer this question, i.e., to define the best features in each representation scenario and disease group, we used the statistical evaluation protocol proposed by Charte et al. (2015) [4]. Using this protocol, we calculate the ranking of the F-Measure classification results obtained in all experiments with the different features, based on the Friedman statistical test. Within each representation scenario, the classification performances of the three features are ranked per disease group (from first to last) and an average rank is calculated; then, a general average ranking across all disease groups is computed for each feature.
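The ranking step can be illustrated as follows, with made-up F-Measure values standing in for the real experimental results: ranks are assigned per experiment (1 = best) and then averaged, and the Friedman test is computed over the same matrix.

```python
# Average-ranking sketch in the spirit of a Friedman-type analysis.
import numpy as np
from scipy.stats import rankdata, friedmanchisquare

# Rows: experiments/disease groups; columns: e.g., LBP, RLBP, LPQ (invented values).
fmeasures = np.array([
    [0.81, 0.79, 0.83],
    [0.95, 0.94, 0.97],
    [0.88, 0.86, 0.90],
    [0.84, 0.82, 0.85],
    [0.91, 0.90, 0.93],
])

# rankdata ranks ascending, so rank the negated scores to make 1 the best.
ranks = np.vstack([rankdata(-row) for row in fmeasures])
print("average ranks per feature:", ranks.mean(axis=0))

# Friedman test over the same matrix (one argument per feature/column).
stat, p = friedmanchisquare(*fmeasures.T)
print(f"Friedman statistic = {stat:.3f}, p-value = {p:.3f}")
```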

The average rankings obtained with the previously described procedure for the handcrafted features are presented in Table 13. We may observe that for the AIA disease the best-ranked feature was LBP, with an average ranking of 1.50, while for the TW and D diseases the best-ranked feature was LPQ, with an average ranking of 1.00. Moreover, considering all disease groups, the best-ranked handcrafted feature is LPQ, with an average ranking of 1.44.

Table 13 Average ranking of the classification results for the handcrafted features

Table 14 shows the average rankings for the non-handcrafted features obtained with pre-configured CNNs. We may observe that in all disease groups, AlexNet obtained the best classification results, reaching an average ranking of 1.50 for the AIA disease and 1.00 for TW and D diseases. Therefore, AlexNet is also the best-ranked feature in general, i.e., considering the three disease groups.

Table 14 Average ranking of the classification results for the non-handcrafted features obtained with pre-configured CNNs

Table 15 presents the average ranking for the non-handcrafted features obtained with transfer learning. We may note that VGG16 achieved the best overall average ranking considering all disease groups (1.13), with an average ranking of 1.40 for AIA and 1.00 for both TW and D disease groups.

Table 15 Average ranking of the classification results for the non-handcrafted features obtained with transfer learning

5.2 Which feature representation provided the best results?

To answer the second question with statistical significance, we applied the Wilcoxon statistical test. Our hypothesis is that the top-10 F-Measures obtained with the handcrafted features are higher than the top-10 F-Measures obtained with the non-handcrafted features (considering both the pre-configured CNN and transfer learning scenarios). The test was computed three times, once for each disease group.
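The test itself can be sketched as below, with invented numbers in place of the actual top-10 F-Measures; the one-sided alternative encodes the hypothesis that the handcrafted results are higher.

```python
# One-sided Wilcoxon signed-rank test on paired top-10 F-Measures.
from scipy.stats import wilcoxon

top10_handcrafted    = [0.971, 0.969, 0.968, 0.967, 0.966,
                        0.965, 0.964, 0.963, 0.962, 0.961]
top10_nonhandcrafted = [0.933, 0.931, 0.930, 0.929, 0.928,
                        0.927, 0.926, 0.925, 0.924, 0.923]

stat, p = wilcoxon(top10_handcrafted, top10_nonhandcrafted,
                   alternative="greater")
print(f"statistic = {stat}, p-value = {p:.4f}")
```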

The z-scores and p-values obtained with the Wilcoxon test are reported in Table 16. Considering a significance threshold of 0.05, we can statistically affirm that, for the TW and D disease groups, the handcrafted features obtained better classification results than the non-handcrafted ones, since their p-values of 0.0025 and 0.0035, respectively, are below the threshold. However, considering the same threshold, we cannot statistically confirm that the handcrafted features are better than the non-handcrafted ones for the AIA disease group, since its p-value was 0.1931.

Table 16 Wilcoxon statistical tests for the top-10 classification results with handcrafted features versus non-handcrafted

5.3 Which disease group is easier/harder to predict?

To answer the third question, we also used the Wilcoxon statistical test, hypothesizing that the F-Measure classification results obtained in all experiments for a certain disease group are lower than those obtained for another disease group. We computed the test three times, since there are three different disease groups. The z-scores and p-values obtained with the Wilcoxon test for this scenario are reported in Table 17.

Table 17 Wilcoxon statistical tests comparing the classification results obtained in the different diseases groups

Considering a threshold of 0.05, we can statistically affirm that the classification results obtained for the AIA disease are lower than those obtained for the TW disease, since the test achieved a p-value of \(6.5 \times 10^{-12}\) (below the threshold). We can also affirm that the results obtained for the D disease are lower than those obtained for TW, since the test obtained a p-value of \(7.0 \times 10^{-11}\). Thus, we can conclude that the TW disease is the “easiest” one to predict, as it achieved classification results higher than the two other disease groups, i.e., AIA and D.

Furthermore, we can also observe in Table 17 that the classification results obtained for the AIA disease are lower than those obtained for the D disease group, since the test achieved a p-value of \(7.0 \times 10^{-4}\) (below the threshold). Thus, we can conclude that the AIA disease is the “hardest” one to predict, as it achieved classification results lower than the two other disease groups (D and TW).

5.4 Have the fusion strategies contributed to improving the results?

To answer this question, our hypothesis is that the top-5 classification results obtained with the combination of the classifiers' outputs are higher than the top-5 results obtained without the combination, in each disease group. The z-scores and p-values obtained with the Wilcoxon test for this scenario are reported in Table 18.

Considering a threshold of 0.05, we can statistically affirm that the fusion strategies did contribute to improving the overall classification results, since the Wilcoxon test yielded a p-value of 0.0216 for all disease groups.

Table 18 Wilcoxon statistical tests comparing the classification results before and after combining the classifiers outputs

6 Concluding remarks and future work

Nowadays, as pre-clinical research advances, there is a growing tendency to build methods capable of supporting such activities, aiming to decrease the manual work demanded and to automatically analyze the generated data. Thus, we proposed a method to automatically identify chronic degenerative diseases using EGC image samples. Two types of features were considered (handcrafted and non-handcrafted), and the efficiency of both methodologies was measured. Furthermore, a hybrid methodology was evaluated: classifiers built with both methodologies were combined to ascertain whether it is possible to take advantage of a potential complementarity between them.

From the experiments performed with handcrafted features, it was possible to observe a tendency towards better results when the LPQ texture descriptor and the SVM classification algorithm were used. This descriptor's efficiency can be justified by the fact that it was designed to be robust to blurred images, like the ones present in this work's dataset. At the end of these experiments, the best F-Measure values found reached 0.8325, 0.9712 and 0.8967 for the AIA, TW, and D datasets, respectively. It is worth mentioning that a pre-processing step was applied to the AIA and D image samples, in which they were converted to grayscale (AIA) and pseudo-colored with the HSV colormap (D).

In the second approach, non-handcrafted features were extracted using Feature Learning methods. The experiments conducted with CNN architectures trained from scratch presented reasonable classification rates, but inferior to those found up to that point. Better classification rates were obtained with a Transfer Learning strategy: the tests that led to such results extracted features using CNN architectures pre-trained on the ImageNet dataset and then performed the classification with the SVM algorithm. The best F-Measure values found at the end of these tests were 0.8468, 0.9327, and 0.9043 for the AIA, TW, and D datasets, respectively. Therefore, a slight improvement may be observed for the AIA and D datasets compared to the best values found by classifying handcrafted features.

Lastly, classifiers from both approaches were combined. Different configurations/architectures were created, and the target classifiers were combined through the sum, max, and product rules. With this approach, this work's best results were obtained: final F-Measure values of 0.8930, 0.9845, and 0.9513 for the AIA, TW, and D datasets, respectively. These results were obtained by combining three, five, and three classifiers, respectively, and in all of them classifiers created with both handcrafted and non-handcrafted features were used.

At the end of this work, it was possible to conclude that the classification of EGC images, aiming to identify the presence of a target disease in the data samples, presented good results both with handcrafted and with non-handcrafted features. However, combining both strategies achieved the best performance for all disease groups, which corroborates the effectiveness of the proposed method.

In future work, we intend to expand the dataset to include other types of cells, such as enteric neurons. This could provide an alternative understanding of the studied diseases and allow us to evaluate whether it improves the results described in this work.

We also intend to use the data samples as input to a segmentation methodology, using the isolated EGC to perform morphometric and quantitative analyses and, consequently, a more precise classification. We plan to adapt our method to analyze samples of individuals under different treatments so that it can i) automatically identify the proximity level to both classes approached in this work (control/healthy and unhealthy); and ii) rank the performed treatments, highlighting the most successful ones. Finally, we plan to: i) test new data augmentation methods; ii) create a dedicated CNN architecture; iii) evaluate other texture descriptors; and iv) test different feature selection methods.