1 Introduction

Improper medical examination, failure to attend follow-up appointments, and difficulty in accessing an individual's medical records can all delay the detection of diseases [1]. IoT technology, already applied to a wide variety of services, is now being applied to the health sector as well. IoT is redefining healthcare services by transforming the interface between medical devices and people. IoT-based healthcare applications benefit patients, medical professionals, hospitals, and pharmaceutical and insurance companies; they are highly important since they improve the medical care provided to patients while reducing its cost [2]. The Internet of Medical Things (IoMT) enables effective and efficient supervision of patient health, and with timely treatment initiated by an early diagnosis it can even save an individual's life [3]. The availability of new-age sensors has improved the performance of IoMT services by enabling the timely and precise collection of an individual's physiological parameters. However, the accuracy of a diagnosis system depends not only on precise data but also on the image analysis technique. Researchers [4] have also developed deep learning-based frameworks for task scheduling and sequencing in IoT-assisted medical systems.

In the orthodox method of disease diagnosis, the procedures involved are costly and time-consuming. A pathological sample has to be collected by a trained individual and studied by pathologists; the pathologist's report, based on observation of the samples, is then referred to by practitioners. A shortage of trained pathologists can delay proper diagnosis and hinder appropriate medical intervention at the right time, endangering the patient's life. This raises the requirement for automated detection from medical images. The use of computer-assisted detection of diseases can be traced back to the 1990s [5]. The use of artificial intelligence in the analysis of medical images for the medical evaluation of an individual has proved immensely useful (Fig. 1). With the enormous growth of artificial intelligence and machine learning, the analysis of pathological images has become a prime area of focus. The need for efficient computer-assisted disease diagnosis is growing with the rise in critical disease cases all over the world. Automated analysis of digital images of a tissue sample is quite challenging, since the cellular structures in these images vary in colour, shape, size, and other physiological characteristics. The use of deep learning in computer-assisted diagnosis has improved the precision of disease screening. With the advent of highly accurate learning networks like the Convolutional Neural Network (ConvNet/CNN) [6] and the recurrent neural network (RNN) [7], it is possible to detect diseases from histological tissue images.

Fig. 1 Computer-assisted disease diagnosis using medical images

Much of the work on the examination of pathological images has emphasized the computerized analysis of cytological images for disease detection. These images are mostly characterized by lone or clustered cells and are easier to process than histopathological samples. Histopathological images provide a more extensive picture of the disease and its consequences on tissue samples. Moreover, screening of histopathological images for disease diagnosis is considered the gold standard, since it is capable of detecting a large number of diseases and several types of cancer [8]. However, the additional pathological traits present in these images pose new complexities for automated disease screening, which can be resolved by using sophisticated AI-based approaches. Such image analysis methods can assist medical practitioners in making an exact diagnosis of the ailment and in categorizing morphological characteristics correlated with the prognosis. In IoMT applications, automated image analysis can play a vital role in the early and accurate detection of diseases, but because such software is large, computation is possible only on fog nodes or cloud servers. To compute the images locally, lightweight software is needed that can be embedded in IoMT devices. Many of the existing works, as discussed in Sect. 2, achieve high accuracy in disease detection but are computationally expensive, making them unsuitable for deployment within medical devices. This paper proposes a novel deep learning model, ReducedFireNet, with a substantially reduced size and a considerably low computational requirement for analyzing histopathological samples. Majority voting [9] is deployed to generate the final prognosis. The model is further compressed using quantization [10] without any substantial degradation in its performance. The key contributions of this research work can be summarized as follows:

  • A deep learning-based histopathological image processing model is proposed that learns important traits from real-life samples essential for the diagnosis of diseases.

  • A novel model is proposed with a size of a few KBs and a requirement of low computational power, such that it can be effectively embedded into any medical image capturing device and can be used to process the information at the source.

  • The simulation results validate the proposed "ReducedFireNet" model on a real-life medical dataset. The comparative analysis in the results section demonstrates the model's efficiency in disease prognosis using histopathological images.

The remaining sections of this paper are arranged as follows: Sect. 2 reviews related work on the proposed problem; Sect. 3 discusses the motivation behind the work; Sect. 4 presents the proposed solution. The experimentation and the result analysis are described in Sect. 5 and Sect. 6, respectively, and the paper is concluded in Sect. 7.

2 Related Work

There are diverse medical imaging procedures, including ultrasonography (USG), computed tomography (CT), magnetic resonance imaging (MRI), and digitally scanned histology images, for studying medical cases. The last few decades have witnessed incredible growth in medical image analysis using deep learning approaches. To prevent patient deaths due to late diagnosis, the research community has been dedicated to modelling AI-based frameworks for the diagnosis of fatal diseases, specifically different forms of cancer.

Sun et al. [11] presented an adaptive fuzzy C-means-based mass detection approach along with a supervised neural network to inspect the presence of tumors in an area. The study aims to show the importance of an ipsilateral multi-view CAD mechanism with concurrent analysis in reducing false-positive rates. To improve the competence of a breast-cancer CAD system, Kumar et al. [12] offered a Zernike moments (ZMs) image retrieval system. To segment breast tumors, Saidin et al. [13] used a pixel-based region growing method. Xu et al. [14] performed a coarse segmentation followed by recognition of image edges; edge detection used the mean gray-scale value as a criterion for combining regions, a distance-based computation was done for internal markers, and morphological dilation was applied for the external marker. More recently, the authors in [15] presented an extreme learning machine (ELM) model for the prognosis of breast cancer, with a gain ratio feature selection method deployed to remove insignificant features. A cloud computing-based system enabled with ELM was also developed for remote diagnosis of breast cancer; the work reported an accuracy of 98.68% on the Wisconsin Diagnostic Breast Cancer dataset.

Huang et al. [16] presented a machine learning-aided ultrasound CAD for recognition of the fetal standard plane. In [17], Doyle et al. presented a graph embedding algorithm to distinguish the various grades of prostate cancer; their SVM classifier achieved a highest accuracy of 92.8% in differentiating between the cancer grades. Kim et al. [18] presented a CNN- and U-Net-based automatic estimation method for measuring the abdominal circumference of a fetus. In [19], the authors achieved an accuracy of 97% for grading prostate cancer, using H&E (hematoxylin and eosin) stained samples to derive features of nuclear structures. Rajpoot et al. [20] presented manifold learning for shape-based differentiation of "prostate nuclei". In [21], the researchers presented a CAD method for the detection of prostate cancer from high-quality diverse MRI images. An accuracy of 62.3–76.5% is reported for classifying H&E-stained cervical tissue in [22]. Roy et al. [23] presented a lung cancer detection model with an accuracy of 94.12%; the model used a fuzzy inference system for the prognosis and gray-scale transformation for contrast enhancement.

In [24], the authors proposed a naive Bayes network-based classification model for lymphoma. A two-stage framework is presented: the first stage transforms the raw pixel-level information into spectral planes, and global features are then calculated on each spectral plane. Classification of lesions with stationary wavelet transform-based descriptors is employed in [25]; the analysis of variance reported an accuracy of 100%. Nascimento et al. [26] presented a CAD prototype based on non-morphological and morphological features to diagnose subtypes of lymphoma; the model exhibited an accuracy between 94 and 96%. Abdulkareem et al. [27] proposed a machine learning and IoT-based framework to diagnose COVID-19; the study uses naive Bayes, support vector machine (SVM), and random forest classifiers for the classification task and reports the highest achieved accuracy (95%, with SVM).

Al-Waisy et al. [28] developed a deep learning-based diagnosis tool for detecting COVID-19 from X-ray images of an individual's chest; the proposed deep network reported an accuracy of 99.93%. Researchers in [29] used a ResNet-14 CNN architecture to detect anterior cruciate ligament injury from MRI (magnetic resonance imaging) scans and reported an average accuracy of 92%.

The research works discussed in this section are mostly based on complex frameworks with high computational requirements. For the simpler models deployed for disease prognosis, such as SVM or naive Bayes, the accuracy is relatively lower than that of deep learning-based models.

3 Motivation

The global community is witnessing a very difficult situation due to COVID-19, which has further raised the importance of the Internet of Medical Things in providing essential healthcare services to patients remotely. People suffering from life-threatening diseases like cancer, chronic kidney disorder, and cirrhosis can have a better life expectancy if proper, timely medical care is available. Diverse forms of medical image analysis exist for images obtained through USG, CT or MRI scans, virtual microscopy, and whole-slide scanning techniques. There has been increased interest in utilizing machine learning and deep learning for medical applications; however, very few approaches are being developed to take advantage of IoMT devices for improving medical pipelines. Most of the deep learning and statistical methods used for medical diagnosis are not suitable for embedding in medical imaging devices due to their large size and high computational requirements, which has led many researchers to ignore IoMT devices.

The motivation of the proposed work is to develop a lightweight histopathological image classification system using convolutional neural networks that can work within the weak computational power and low storage capacity of the IoMT devices slowly being integrated into medical equipment. The availability of a local automated image analyzer is immensely important, since it enables an early estimation of the disease without transferring the data to the next level for computation, allowing the medical practitioner to provide timely, appropriate care to patients. Apart from using IoMT devices to help lab technicians during the diagnosis of a disease, another objective of the proposed work is to protect the patient's medical data: since classification takes place on the IoMT device itself, there is no need to send the medical data to an external server. Success of the proposed work would allow similar techniques to be used in other medical classification applications; multiple low-cost systems based on our proposed model could be run on a patient's medical data for different diseases at the same time, greatly increasing the chance of detecting an ailment that would previously have gone unnoticed.

4 Proposed Solution

The importance of local automated analysis of histopathological images in IoMT applications has been emphasized in the earlier sections. Such an automated image analyzer requires a solution with very low memory and computational needs and a minimal drop in accuracy compared to the existing state-of-the-art models for image classification. Our proposed disease prognosis model is a lightweight image-based classifier with minimal resource requirements and high accuracy. The proposed solution is divided into four steps. First, data augmentation is performed to improve the available dataset, and the images are broken down into smaller patches. These patches are used to train our proposed model, called ReducedFireNet. The concept of majority voting is employed to produce the final prediction for the corresponding image. Finally, the model is compressed to a smaller size using quantization.

In this section, we discuss each of these approaches in the context of the proposed solution.

4.1 Data Augmentation and Patch Creation

Data augmentation [30] is a technique used to make a dataset more diverse and to expand its size artificially by applying meaningful transformations to the existing data. It also helps in solving the class imbalance problem that may exist in datasets. In image classification tasks, having a diverse and balanced dataset is very important: datasets are the lifeline of our systems, and biased or imbalanced datasets can lead to fatal errors in predictions. Some of the data augmentation methods that can be applied to images are rotation, shearing, brightness shift, random zoom, horizontal flip, and vertical flip. It is, however, important to note that not all augmentation techniques can be applied to all types of data; the methods should be picked so that the transformations result in realistic images and the label of the original image remains preserved. After carefully analyzing various data augmentation methods, we decided to use horizontal flip, vertical flip, and brightness shift, because these transformations result in images that are most likely to be present in an unknown input image. Examples of these augmentations are shown in Fig. 2, which illustrates the transformations performed to generate the final image dataset: the image can be flipped about the x-axis (Fig. 2b) or the y-axis (Fig. 2c), or a random brightness shift may be applied (Fig. 2d). Other networks, like Generative Adversarial Networks (GANs) [31], can also be utilized to create synthetic data samples that substitute for real data.

Fig. 2 Different data augmentation techniques for the same image

After applying data augmentation techniques, the image is broken down into patches. The entire process of data augmentation and patch creation is explained in Algorithm 1.

Algorithm 1 Data augmentation and patch creation
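As an illustration, a minimal Python sketch of Algorithm 1 is given below. The helper names and the 0.2 brightness delta are our own choices, and the 10 × 8 = 80 patch count assumes source images of roughly 1388 × 1040 pixels, consistent with the 80 patches per image reported in Sect. 5.

```python
import numpy as np
import tensorflow as tf

def augment(image):
    # The three augmentations named in the text: horizontal flip, vertical
    # flip, and a random brightness shift (the 0.2 delta is illustrative).
    return [
        image,
        np.fliplr(image),                                # horizontal flip
        np.flipud(image),                                # vertical flip
        tf.image.random_brightness(image, 0.2).numpy(),  # brightness shift
    ]

def make_patches(image, patch_size=128):
    # Tile an image into non-overlapping 128 x 128 patches; for a roughly
    # 1388 x 1040 image this yields 10 x 8 = 80 patches.
    h, w = image.shape[:2]
    return [image[y:y + patch_size, x:x + patch_size]
            for y in range(0, h - patch_size + 1, patch_size)
            for x in range(0, w - patch_size + 1, patch_size)]
```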

4.2 Proposed Model—ReducedFireNet

Our proposed model, ReducedFireNet, is a type of Convolutional Neural Network (CNN); CNNs are the backbone of all state-of-the-art models used for image classification. They are a variation of neural networks in which matrix multiplication is replaced by convolution in at least one layer of the model. Unlike a plain neural network, which uses each pixel of the input image as an independent input, convolution allows the neighbouring pixels to be taken into consideration, which drastically improves the network's performance. The ReducedFireNet model uses multiple Fire modules, which were originally the building blocks of SqueezeNet [32]. SqueezeNet is a CNN model developed by Iandola et al. in an attempt to reduce the model size while maintaining accuracy compared to AlexNet [33]. The SqueezeNet model mainly consists of 1 × 1 and 3 × 3 filters; it efficiently uses 1 × 1 filters to reduce the number of input channels to the 3 × 3 filters and adopts the strategy of late downsampling in the network. Using 1 × 1 filters allows the number of channels to be reduced: if an input of size 32 × 32 × 4 is passed to a convolutional layer containing 2 filters, each of size 1 × 1 × 4, the output will be of size 32 × 32 × 2, i.e. the number of channels is reduced from 4 to 2, as shown in Fig. 3.

Fig. 3 Reduction of the number of channels through 1 × 1 convolution

Using 1 × 1 filters to decrease the number of input channels to the 3 × 3 filters helps reduce the number of parameters, and downsampling late in the network helps achieve higher accuracy.
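The effect is easy to verify with a few lines of Keras; the snippet below reproduces the Fig. 3 example of mapping a 32 × 32 × 4 input to a 32 × 32 × 2 output with two 1 × 1 filters.

```python
import tensorflow as tf

x = tf.random.normal((1, 32, 32, 4))                     # batch of one image
y = tf.keras.layers.Conv2D(filters=2, kernel_size=1)(x)  # two 1 x 1 filters
print(y.shape)  # (1, 32, 32, 2): channels reduced from 4 to 2
```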

The Fire module consists of two main layers: the first is a 1 × 1 convolutional layer, and the second is a concatenation of 1 × 1 and 3 × 3 convolutional layers (Fig. 4). The number of filters in each of these layers can be set as needed, with the only restriction being that the number of 1 × 1 filters in the first layer should be less than the sum of the numbers of 1 × 1 and 3 × 3 filters in the second layer.

Fig. 4 A single Fire module
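A Fire module of this kind can be sketched in Keras as follows; the function name and signature are our own, and the filter counts are left as parameters subject to the restriction stated above.

```python
from tensorflow.keras import layers

def fire_module(x, squeeze_filters, expand_filters):
    # First layer: 1 x 1 convolutions cut the channel count fed to the
    # second layer (squeeze_filters should stay below 2 * expand_filters).
    s = layers.Conv2D(squeeze_filters, 1, activation="relu")(x)
    # Second layer: parallel 1 x 1 and 3 x 3 convolutions, concatenated.
    e1 = layers.Conv2D(expand_filters, 1, activation="relu")(s)
    e3 = layers.Conv2D(expand_filters, 3, padding="same", activation="relu")(s)
    return layers.Concatenate()([e1, e3])
```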

Our proposed ReducedFireNet model consists of 4 Fire modules. The entire model creation process is explained in Algorithm 2.

Algorithm 2 ReducedFireNet model creation
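As a rough illustration of Algorithm 2, the model can be assembled as below, following the layer ordering described at the end of this subsection. The squeeze/expand filter counts and the dropout rate are placeholders of ours, not the paper's tuned values.

```python
from tensorflow.keras import layers, models

def build_reduced_firenet(input_shape=(128, 128, 3), num_classes=3):
    inputs = layers.Input(shape=input_shape)
    x = fire_module(inputs, 16, 16)   # fire_module as sketched above
    x = layers.MaxPooling2D()(x)
    x = fire_module(x, 16, 32)
    x = layers.MaxPooling2D()(x)
    x = layers.Dropout(0.5)(x)        # dropout after the second Fire module
    x = fire_module(x, 32, 48)
    x = layers.MaxPooling2D()(x)
    x = fire_module(x, 32, 64)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)
```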

Max-pooling layers are applied after every Fire module. They reduce the size of the input, speed up the computation, and help in detecting more robust features. A dropout [34] layer is applied after the second Fire module to reduce overfitting and make the model more robust to new inputs. The Rectified Linear Unit (ReLU) [33] activation function is applied to the output of each convolutional layer; it outputs 0 if the input is negative and otherwise outputs the input directly. The final dense layer, which is the output layer, consists of 3 units corresponding to the output classes, and the activation function applied is softmax [33]. The softmax function (Eq. 1) converts a numeric vector into a vector of real values that can be interpreted as probabilities: it applies the exponential function to each element of the input vector and divides by the sum of all the exponentials, producing a normalized output in which each element lies between 0 and 1. It thus allows the output of the penultimate layer, a real-valued vector, to be represented as a normalized probability distribution. The softmax function is defined by the formula:

$$ \mathrm{softmax}(x)_a = \frac{\exp(x_a)}{\sum_{b} \exp(x_b)} $$
(1)
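For instance, applying the function to the logits (2.0, 1.0, 0.1) gives roughly (0.659, 0.242, 0.099), which sums to 1. A small illustrative sketch:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659, 0.242, 0.099]
```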

The comprehensive architecture of ReducedFireNet is shown in Fig. 5; the model is visualized using Netron [35]. In Fig. 5(a), the '?' in the output of the "input_1" module represents the number of training samples.

Fig. 5 (a) ReducedFireNet configuration; (b) ReducedFireNet architectural dimensions

Our proposed CNN model (ReducedFireNet) uses four Fire modules. A single Fire module consists of two layers: the first is a 1 × 1 convolutional layer, and the second is the concatenation of 1 × 1 and 3 × 3 convolutional layers. These convolutional layers use ReLU activation functions. Each of the first three Fire modules is followed by a MaxPooling layer, and the last Fire module is followed by a GlobalAveragePooling layer. A dropout layer is applied after the second MaxPooling layer. Finally, a dense layer with softmax as the activation function is applied for classification.

4.3 Majority Voting

The concept of majority voting is borrowed from ensemble-based classifiers, which utilize multiple base models to generate an optimized model that performs substantially better than the individual base models. In majority voting, a prediction is made for each patch of the image, and the final prediction is the outcome with the largest count among the patch predictions [9]. As shown in Fig. 6, a high-resolution medical image is divided into multiple image patches; our model provides a prediction for every patch, and the prediction with the highest count is taken as the final prediction for the given image. The entire process of model evaluation using majority voting is explained in Algorithm 3.

Fig. 6 Example of majority voting

Algorithm 3 Model evaluation using majority voting
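A compact sketch of this evaluation step is shown below; `predict_image` is a hypothetical helper name, and ties between patch labels simply resolve to the lower class index.

```python
import numpy as np

def predict_image(model, patches):
    # Classify every patch, then return the most frequent class label
    # as the final prediction for the whole image.
    probs = model.predict(np.stack(patches), verbose=0)
    patch_labels = probs.argmax(axis=1)
    return np.bincount(patch_labels).argmax()
```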

4.4 Compression

Mobile and IoT devices have limited computational power and memory space; hence there is a need to compress our model further. There are two main approaches to compressing a machine learning model: pruning and quantization [10]. Pruning reduces the size of a neural network by removing redundant connections, whereas quantization transforms a machine learning model's parameters so that the model can be trained and executed at a lower, approximated precision. Quantization decreases the size of the model by decreasing the number of bits required to represent its weights. To limit the number of effective weights, many weights are shared by different connections, and the weights are fine-tuned further to maintain high accuracy. An affine representation converts higher-precision weights to lower-precision values, as shown in Fig. 7. The other benefits of lower-precision operations, apart from a reduction in model size, are faster execution, reduced power consumption, and reduced hardware cost, as circuitry for lower-precision data is cheaper to build than for the original higher-precision data.

Fig. 7 Affine representation

The process of quantization is performed in 3 steps:

Step 1: A transfer function is defined that converts data from a higher precision to a lower precision.

Step 2: The conversion process from the original model to the new compressed model is performed.

Step 3: Calibration is performed to compute the new data required by the compressed model, and parameters are fine-tuned if the model requires it.
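To make the affine mapping concrete, the sketch below quantizes a float weight array to signed 8-bit integers via a scale and zero-point; it is a simplified illustration of the idea in Fig. 7, not TensorFlow Lite's exact scheme.

```python
import numpy as np

def affine_quantize(w, num_bits=8):
    # Map float weights onto the signed integer grid [-128, 127] for 8 bits.
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = (w.max() - w.min()) / (qmax - qmin)
    zero_point = int(round(qmin - w.min() / scale))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point  # dequantize via (q - zero_point) * scale
```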

5 Experimentation

This section describes the implementation of our proposed solution and is divided into four subsections. In the first subsection, we introduce the dataset used throughout our experiments, along with various complications of medical datasets and how we have tried to solve them. We then explain the training and evaluation of the ReducedFireNet model. In the third subsection, we analyze and compare the results of our model with various state-of-the-art CNN architectures. In the last subsection, compression of the model is performed to reduce the memory and computational requirements. TensorFlow [36] and Keras [37] were used in our experiments.

5.1 Dataset, Augmentation, and Patch Creation

The Malignant Lymphoma dataset [38] used in this work consists of 113 images of chronic lymphocytic leukemia (CLL), 122 images of mantle cell lymphoma (MCL), and 139 images of follicular lymphoma (FL): 374 H&E-stained histopathological images in total. A sample image of each subtype is shown in Fig. 8.

Fig. 8 An example image of each type of lymphoma

The data used to train state-of-the-art image classifiers are incredibly different from the available medical data; there is a severe lack of labelled medical datasets. For example, the ImageNet [39] dataset has over 14 million images in over 20,000 different subcategories, but no comparable medical dataset is available to the general public. Moreover, the creation of a similar combined medical dataset is challenging, as most medical diseases differ from one another and may require entirely different procedures for their identification. Further, the available medical images usually have very high resolution, and the general practice of resizing images to a lower size is invalid for them, because resizing might cause the loss of cellular details that are crucial for detecting the ailment. The dataset we are using likewise has only 374 high-resolution images. To resolve these problems, we applied data augmentation, mainly to expand the dataset size but also to tackle the problem of class imbalance. The distribution of lymphoma subtype classes in the dataset before and after applying augmentation techniques is shown in Table 1.

Table 1 Image distribution before and after augmentation

We then created patches of size 128 × 128 from each training image, such that each image generated 80 patches as shown in Fig. 9, and used these patches to train the model.

Fig. 9 Image patch creation

5.2 Training and Evaluation

The building block of our model is the Fire module; ReducedFireNet is, in a sense, a variation of SqueezeNet that we have tried to optimize for such medical applications. After experimenting with different numbers of Fire modules, we decided to use four Fire modules applied in sequence, with a max-pooling layer following each Fire module, as this combination achieved high accuracy while minimizing the computational requirements and the number of parameters, which in turn reduces the FLOPS (floating point operations per second) and the model size. A dropout layer is also applied before the third Fire module to reduce overfitting.

To train our model and compare it with other state-of-the-art models, we used stratified K-fold cross-validation [40]. Cross-validation divides the data such that every data point is part of the training data and the testing data at some point: the dataset is split into K equal parts, (K-1) parts are employed to train the model, and the remaining fold is used to test it, with a different fold used for testing in each of the K iterations. A larger value of K leads to a less biased model. In stratified cross-validation, the data is split so that each fold has the same ratio of categorical values, making the data distribution identical across every fold.
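The split itself is standard; a minimal sketch with scikit-learn's StratifiedKFold is shown below, assuming a balanced set of 450 image labels after augmentation, consistent with the 360/90 split reported next.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

labels = np.repeat([0, 1, 2], 150)   # placeholder: 450 balanced image labels
X = np.zeros((len(labels), 1))       # features are irrelevant to the split
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, labels)):
    # Every fold preserves the CLL/FL/MCL class ratio of the full dataset.
    print(f"fold {fold}: train={len(train_idx)}, test={len(test_idx)}")
```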

We performed 5-fold cross-validation such that each time the model trains, 80% (360 samples) of the total data is taken as training data and 20% (90 samples) as testing data. Each of the 360 training images was then broken into smaller patches of size 128 × 128; each image created 80 patches, so the training data finally had 28,800 (360 × 80) samples. Using these patches as the training data, we train our model. To evaluate its performance, we utilized the majority voting strategy, which selects the prediction with the maximum count among all the patch predictions. Mean F1 scores and accuracies are used for evaluation. In addition, we separately report the accuracy (Eq. 2) and F1 score (Eq. 5) of each fold, as shown in Table 2. The F1 score is calculated using the Precision (Eq. 3) and Recall (Eq. 4) values.

Table 2 Fold-wise Accuracy and F1 score of ReducedFireNet model

Accuracy: It denotes the ratio of correct predictions to the total predictions.

$$ \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} $$
(2)

Precision: It denotes the ratio of correct positive predictions to all the positive predictions.

$$ \text{Precision} = \frac{TP}{TP + FP} $$
(3)

Recall: It denotes the ratio of correct positive predictions to all the positive observations.

$$ \text{Recall} = \frac{TP}{TP + FN} $$
(4)

F1 Score: It is the harmonic mean of precision and recall.

$$ \text{F1-Score} = 2 \times \frac{Recall \times Precision}{Recall + Precision} $$
(5)

TP denotes true positive, TN true negative, FP false positive, and FN false negative values in the multiclass predictions.
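In practice these metrics can be computed directly with scikit-learn, as sketched below; the class-index mapping and the macro averaging mode for the multiclass F1 score are our assumptions, as the paper does not state them.

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 1, 0, 2]  # placeholder fold labels (0=CLL, 1=FL, 2=MCL)
y_pred = [0, 1, 2, 0, 0, 2]  # placeholder majority-voted predictions
print(accuracy_score(y_true, y_pred))             # Eq. 2
print(f1_score(y_true, y_pred, average="macro"))  # Eq. 5 (macro-averaged)
```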

6 Results and Limitations

To test the performance of our model, we compared it with state-of-the-art models like ResNet50 [42], Xception [43], and InceptionV3 [44], which are used for image classification at a large commercial scale. We also compared our model with MobileNet [45], which is used for small, low-latency, low-power image classification on mobile devices.

To show the effectiveness of patch-based training, we also performed image-based training on each of the models and compared those results with patch-based training. In image-based training, the entire image is resized to a lower size, and the whole image is used for training the model. The results of image-based training are shown in Tables 3 and 4, and the results of patch-based training in Tables 5 and 6. Tables 3 and 4 report the accuracies and F1 scores obtained when the entire image was used to train the models; Tables 5 and 6 report those obtained when the image was broken down into patches and the patches were used for training. In each of these tables, the accuracy and F1 score of every fold of the 5-fold cross-validation are reported along with their mean.

Table 3 Image based accuracies
Table 4 Image based F1 scores
Table 5 Patch based accuracies
Table 6 Patch based F1 scores

From Fig. 10, it is visible that patch-based training performs much better than image-based training.

Fig. 10 Comparison of image-based and patch-based accuracies

These state-of-the-art models have a large number of parameters, which in turn leads to a larger model size. Further, these models require a much larger number of floating point operations per second (FLOPS) in a single instance of model training.

The following five metrics were employed to analyze the results:

  1. Accuracy

  2. F1 score

  3. Number of parameters

  4. FLOPS

  5. Model size

The experiment results are summarized in Table 7.

Table 7 Detailed comparison of ReducedFireNet against the state-of-the-art models

From Table 7, we can observe that InceptionV3, Xception, and MobileNet all perform very well on both accuracy and F1 score. However, the FLOPS and model sizes of InceptionV3 and Xception are very high compared to our proposed ReducedFireNet model, and although MobileNet requires far fewer FLOPS than InceptionV3 and Xception, its model size remains large. ReducedFireNet performs on par with InceptionV3 and MobileNet, with only a 1.12% drop in accuracy, while achieving an approximately 100 times smaller model. It therefore has much lower memory and computational requirements, making it a strong candidate for deployment on mobile and IoT devices.

6.1 Compression

To compress our model, we used post-training quantization with the TensorFlow Lite framework. TensorFlow Lite (TFLite) is an open-source deep learning framework developed by TensorFlow for inference on mobile and IoT devices. Different optimizations are required to allow models to execute under these constraints, and TFLite provides optimizations tailored specifically for hardware acceleration on different kinds of mobile and IoT devices. Post-training quantization compresses a trained TensorFlow model, improving latency on the target hardware while having a minimal impact on accuracy. The TensorFlow model is quantized to its TFLite format after the training process has been completed. The entire process is shown in Fig. 11.
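A minimal sketch of this conversion, using the standard TFLite converter API on the trained Keras model, is shown below; the file name is illustrative.

```python
import tensorflow as tf

model = build_reduced_firenet()  # trained model, as sketched in Sect. 4.2
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # post-training quantization
tflite_model = converter.convert()
with open("reduced_firenet.tflite", "wb") as f:
    f.write(tflite_model)
```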

Fig. 11 Steps showing the model compression using TensorFlow Lite

As seen in Table 8, performing compression allowed us to decrease the model size from 391.2 to 42.8 KB, with a 1.11% decrease in accuracy. Figure 12 shows the size of our proposed ReducedFireNet model with and without compression.

Table 8 Comparison of uncompressed and compressed ReducedFireNet model
Fig. 12 Comparison of model size before and after compression

6.2 Limitations

Despite our best efforts to create a precise, lightweight model with a low computational requirement for disease detection using histopathological images, our process has a few limitations. First, the data augmentation performed on the dataset could be improved by generating high-quality synthetic data using a variation of GANs and by employing techniques like texture transfer and style transfer. Regarding compression, we have utilized post-training quantization, which could be improved slightly further by applying an efficient pruning strategy to our proposed model.

7 Conclusion and Future Work

The importance of IoMT for providing effective, low-cost, and timely medical care to patients is undeniable. To aid this medical facility, early diagnosis of critical diseases is equally important; timely detection of diseases can save millions of lives. However, accurate and prompt disease prognosis suffers from several issues. Transferring real-time patient data to the next IoT level for computation can delay the decision, and since histopathological images are usually large, their transfer requires large bandwidth. These issues can be resolved if the collected data is analyzed locally on the IoMT devices. To this end, we proposed the ReducedFireNet model, a high-performing lightweight model comparable with state-of-the-art models while maintaining a small size and low FLOPS requirement. As the model size still remained fairly sizeable, the compression process allowed us to decrease it to 43 KB with only a minute drop in accuracy.

In future work related to this paper, we will mitigate our process's limitations by utilizing a CycleGAN to generate high-quality synthetic histopathological images to further augment our dataset, and by devising an effective pruning strategy to further compress our proposed ReducedFireNet model. Apart from this, we also want to make other deep learning medical applications, such as cardiac arrhythmia detection and nuclei segmentation, feasible on IoMT devices to further improve medical pipelines.