1 Introduction

Plant diseases affect both the quality and quantity of agricultural products by interfering with a set of processes, including plant growth, flower and fruit development, and absorption capacity, to name but a few [1]. Therefore, early detection and classification of plant diseases play a vital role in agricultural farming. Two options are available for this task: manual inspection and computer vision techniques. The former is quite difficult and requires a lot of effort and time [2], while the latter is mostly followed because of its improved performance [3]. Plants show a range of symptoms from their early to final stages, which can be easily observed on fruits, leaves, and stems with the naked eye. Therefore, these symptoms can be categorized using computer vision (CV) and other machine learning (ML) methods [4].

A great deal of effort has been made in the field of CV to process visual features extracted from fruit images for the recognition of multiple diseases [5]. Several existing methods work well but do not consider different sets of constraints, specifically those related to image quality [6–11], training/testing samples, number of labels, and disease complexity, to name but a few [12]. In this article, two fruits are selected and four types of fruit diseases are initially focused on: apple scab, apple rust, grapes rot leaves, and grapes leaf blight. Most existing methods follow a typical architecture, which includes (a) a preprocessing block, (b) a segmentation block, (c) a feature extraction block, and (d) a classification block. Several detection methods are employed by scholars working in this domain, including clustering, thresholding, color-, shape-, and texture-based methods, and adaptive approaches. All these methods are somewhat problem dependent and follow the same trend of addressing one sort of problem while keeping other problems' parameters fixed. Therefore, no universal mechanism exists that efficiently deals with all kinds of problems.

In this article, we focus primarily on the classification of the aforementioned diseases through a sequence of fundamental steps; our contributions are summarized in the following subsection.

1.1 Major contributions

In this article, we introduce a new automated method for the identification and recognition of apple and grape diseases. The proposed method consists of five major steps: (a) contrast stretching; (b) identification of the diseased part by a fusion of novel adaptive and quartile deviation (QD)-based segmentation, which performs efficiently under changes in the scale, origin, and irregularity of the infection regions; (c) feature extraction and fusion; (d) an integrated framework of entropy and rank correlation for feature selection; and (e) classification. Our major contributions are listed below.

1. A contrast stretching technique based on global minimum and maximum values is proposed, which defines a contrast range to determine the lower and upper threshold values.

2. An adaptive thresholding method following the trapezoidal rule is proposed, which works in two steps: (1) locating the infected regions and (2) computing a threshold based on the maxima and minima calculated after taking the second derivative.

3. A parallel feature fusion methodology is adopted, which jointly exploits three sets of features (color, texture, and shape) to select the most discriminant values.

4. To overcome the curse of dimensionality, a feature selection methodology is proposed, which efficiently assigns ranks to sets of features based on entropy.

2 Literature review

Several methods exist in the literature that accurately classify fruit diseases using computer vision [13–18]. Specifically, for the identification of apple and grape diseases, various methods have been proposed that manage to classify sets of diseases with acceptable accuracy and sensitivity [19–22]. Among unsupervised methods, a range of algorithms has been proposed, including K-means clustering [23], global thresholding with morphological operations [24], graph cut methods [25], color segmentation [26], CLPSO-based fuzzy color segmentation [27], and adaptive approaches [28], to name but a few.

Bhivini et al. [2] introduced a framework to classify infected regions in apples. In the first stage, they utilized K-means clustering to extract the infected region and then computed color and texture features from the segmented part. Subsequently, feature fusion is performed using simple concatenation prior to classification with a random forest. Similarly, Shiv et al. [5] introduced a method to classify apple diseases based on color, texture, and shape features; it comprises three fundamental steps: segmentation using K-means; extraction of color, texture, and shape features; and classification using multi-class SVM. Following the same trend, Shiv et al. [28] introduced an adaptive approach to detect infectious regions, including apple scab, rot, and blotch, achieving a classification accuracy of 93%. That method also incorporates three primary steps: segmentation using K-means, feature extraction, and classification using multi-class SVM.

Zhang et al. [29] followed a machine learning approach for detecting apple diseases. They made use of the HSI, YUV, and gray color spaces for background removal via thresholding. The infectious regions are extracted by a region growing method, and shape, color, and texture features are calculated for each region. Finally, the most prominent features, selected using a genetic algorithm (GA) and correlation-based feature selection (CFS), are classified using SVM. Similarly, Soni et al. [30] identified plant diseases through two fundamental steps of segmentation and classification. In the first step, ring-based segmentation is performed to identify infectious regions, followed by feature extraction. A probabilistic neural network is used for the final classification of diseases from randomly selected images acquired from the web. Lee et al. [31] implemented a swarm optimization-based method for the identification of apple diseases, in which a stochastic PSO algorithm selects 10 spectral features based on pairs of bands that best discriminate between each pair of classes. The selected features are then utilized by SVM to achieve improved performance. Harshal et al. [32] introduced a framework for the identification and classification of grape diseases. They implemented background subtraction for segmentation and analyzed the regions after passing them through a high-pass filter. Thereafter, unique fractal-based texture features are extracted and finally classified through multi-class SVM. They selected downy mildew and black rot for evaluation and achieved a classification accuracy of 96.6%.

Pranjali et al. [33] introduced a fused-classifier approach for efficient classification of grape diseases: SVM and ANN are first utilized independently, and a new ensemble classifier is then constructed for the final classification. Similarly, Awate et al. [34] utilized K-means for segmentation; texture, color, morphological, and structural features are then calculated and fed to an ANN classifier. A general comparison with recent methods is also provided in Table 1 in terms of segmentation technique, type of features, feature selection, classification method, disease type, and classification accuracy.

Table 1 Comparison of proposed technique with existing methods

From recent studies, it is quite clear that methods such as fuzzy segmentation, thresholding, and K-means are the ones mostly utilized for the identification of infectious regions. Recently, the inclusion of saliency- and CNN-based techniques has shown improved performance in this domain of agricultural farming [38]. Moreover, color and texture features are mostly utilized for the final classification, but the "curse of dimensionality" is often ignored. In this article, we focus primarily on contrast stretching, infectious region segmentation, and feature selection to avoid the aforementioned problem. The contrast stretching technique improves the visual characteristics of an input image, which helps in the segmentation phase, and the proposed feature selection algorithm aids in improving the overall classification accuracy.

3 Proposed method

In this section, the proposed method is explained, which incorporates a series of steps: preprocessing; image segmentation and fusion; feature extraction, fusion, and selection; and a final classification step. Figure 1 illustrates the working framework of the proposed method and clearly shows this series of steps.

Fig. 1 Framework of the proposed detection and classification of diseases in plants and fruits

3.1 Contrast stretching

Contrast stretching is mostly applied to images whose visual contents need to be enhanced. In this article, a global contrast stretching technique is proposed, which directly affects the infectious regions by making them maximally distinguishable from the background. The method initially finds the global maximum and minimum of each of the red, green, and blue channels to generate new global minimum and maximum values. These values are then utilized to find a new range of intensity values for each channel, which in turn locates new low and high threshold values.

Let ψ(i,j,k) be an original 3-dimensional RGB image of size 256×256×3, where \(\psi _{1}(i, j)=\frac {r}{\sum _{k=1}^{3}r^{k}},\psi _{2}(i, j)=\frac {g}{\sum _{k=1}^{3}g^{k}}\), and \(\psi _{3}(i, j)=\frac {b}{\sum _{k=1}^{3}b^{k}}\) represent the modified red, green, and blue channels. Each channel is normalized by the sum of the three channels; for example, the red channel becomes \(\frac {red}{red+green+blue}\), so \(\sum \) denotes the addition of the pixel values of all three channels. The channels and their histograms are shown in Fig. 2.

Fig. 2 Original image and red, green, and blue channels with their respective histograms. a Original image. b Red channel. c Green channel. d Blue channel

Suppose TL and TH are the low and high threshold values, initialized as 0.01 and 1, respectively. The global maxima and minima are then calculated using the initial TL and TH values as follows:

$$ \phi_{\text{max}}(k)=Max(\psi_{i,j}); \ \phi_{\text{min}}(k)=Min(\psi_{i,j}) $$
(1)

where ϕmax and ϕmin are the global maximum and minimum values, and Max and Min represent the functions that select the maximum and minimum values from each channel k, with k∈{1,2,3} indexing the red, green, and blue channels denoted by ψ1, ψ2, and ψ3.

The initial values of the global maximum and minimum are 1 and 0. A new global minimum pixel image is then calculated by subtracting ϕmin from the original image ψ(i,j,k); the effect is shown in Fig. 3b. The subtracted image is stored in a temporary array (Tar) of size 256×256, and the maximum and minimum pixel values of the entire processed image are found by Eqs. 2 and 3:

$$ \phi_{\text{min}}(\text{new})=\psi(i,j,k)-\phi_{\text{min}}(k) $$
(2)
$$ \phi_{\text{max}}(\text{new})=\psi(i,j,k)-\phi_{\text{max}}(k) $$
(3)
Fig. 3 Proposed contrast stretching results. a Initial global minimum value image. b New global minimum pixel image. c Contrast range image. d Variation removal image. e Final enhanced image

These values are utilized to calculate the contrast range by Eq. 4.

$$ R_{ctr}= \phi_{\text{max}}(\text{new})\times \phi_{\text{min}}(\text{new}) $$
(4)

where Rctr denotes the contrast range image of dimension 256×256, as shown in Fig. 3c. To control the variation of contrast stretching, the low threshold (TL) and high threshold (TH) values are updated by Eqs. 5 and 6.

$$ T_{L}=T_{L} \times R_{ctr}, \ T_{H}=T_{H} \times R_{ctr} $$
(5)
$$ T_{L}=\phi_{\text{min}}(\text{new})+ T_{L}, \ T_{H}= R_{ctr}- T_{H} $$
(6)

The low and high threshold values are utilized in a contrast stretching cost function, which concatenates the results of each channel. The cost function produces a new image that is more enhanced than the original. The cost function is defined by Eq. 7:

$$ F_{\text{cost}}(i,j,k)= \left\{\begin{array}{ll} 1 & if \ \frac{ T_{ar}}{T_{H}-T_{L}}\geq R_{ctr}\\ 0 & if \ \frac{ T_{ar}}{T_{H}-T_{L}} < R_{ctr} \end{array}\right. $$
(7)

where Fcost(i,j,k) is the resultant contrast-stretched image and Rctr is the contrast range value, which lies between 0 and 1. Equation 7 shows that if \(\frac { T_{ar}}{T_{H}-T_{L}}\geq R_{ctr}\), the diseased region in the image is enhanced; otherwise, the background is improved. Final contrast stretching results are shown in Figs. 3 and 4 and are subsequently processed in the segmentation phase; a minimal code sketch follows Fig. 4.

Fig. 4 Proposed contrast stretching results. a Original image. b Final enhanced image. c Histogram plot

3.2 Disease identification

In this section, the proposed segmentation approach is elucidated, comprising the proposed segmentation and fusion methods. In the former, a trapezoidal-rule-based adaptive thresholding and a quartile deviation (Q.D)-based segmentation method are employed independently, while in the latter, the resulting binary images are fused using the proposed weighted coefficient of correlation method. Figure 1 illustrates the steps of image segmentation and fusion.

3.2.1 Trapezoidal-rule-based adaptive thresholding

Let Fcost(i,j,k) be the contrast-stretched image. To identify the infectious regions, the trapezoidal rule is employed [39], which calculates the area of infection by utilizing the maximum and minimum pixel values.

$$ \Delta =\frac{Max_{\text{old}}-Min_{\text{old}}}{\text{Total}_{n}} $$
(8)

where Totaln denotes the total number of pixels in Fcost(i,j,k). The second derivative of the image is then computed, and Eq. 8 is updated to find the new maximum and minimum pixel values. The obtained values are finally embedded into a cost function to extract the infectious regions.

$$ D(i,j)={f}'(F_{\text{cost}}(i,j))\ di \ dj $$
(9)
$$ D^{2}(i,j)=f^{\prime\prime}D(i,j) \ di \ dj $$
(10)
$$ Max_{\text{up}}=Max\left(D^{2}(i, j)\right) $$
(11)
$$ Min_{\text{up}}=Min\left(D^{2}(i,j)\right) $$
(12)

where D(i,j) and D2(i,j) represent the first and second derivatives of the input image, and Maxup and Minup are the updated maximum and minimum pixel values. These updated values are first compared with the old maximum and minimum values defined in Eq. 8 and then used to calculate the area of infection.

$$ \int_{\alpha}^{\beta}f(i)di=\frac{Max_{\text{up}}-Min_{\text{up}}}{\text{Total}_{n}}[D^{2}(i,j)] $$
(13)

\( \int _{\alpha }^{\beta }f(i)di\) represents the area of the infected region, which is further utilized in the threshold function.

$$ T(i,j)=\left\{ \begin{array}{ll} Foreground & if \ \int_{\alpha}^{\beta}f(i)di \geq \xi\\ Background & if \ \int_{\alpha}^{\beta}f(i)di < \xi \end{array}\right. $$
(14)

where ξ denotes the threshold on the pixels directly linked to \(\int _{\alpha }^{\beta }f(i)di\), and T(i,j) represents the optimized adaptive segmented image; sample results are shown in Fig. 5, and a code sketch follows its caption.

Fig. 5 Proposed optimized adaptive segmentation results. a Original image. b Segmented image. c Infection part mapped on the original image. d Contour image under mesh graph. e Contour of infection. f 3-D contour image
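A compact sketch of Eqs. 8–14 is given below, assuming a grayscale contrast-stretched image. Finite differences stand in for the derivatives of Eqs. 9–10, and the default choice of ξ (falling back to the step size Δ of Eq. 8) is our assumption, since the paper does not fix it.

```python
import numpy as np

def trapezoidal_threshold(f_cost, xi=None):
    """Trapezoidal-rule adaptive thresholding sketch (Eqs. 8-14)."""
    total_n = f_cost.size
    delta = (f_cost.max() - f_cost.min()) / total_n   # Eq. 8
    di, dj = np.gradient(f_cost)                      # Eq. 9: first derivative
    d2i, d2j = np.gradient(di + dj)                   # Eq. 10: second derivative
    d2 = d2i + d2j
    max_up, min_up = d2.max(), d2.min()               # Eqs. 11-12
    area = (max_up - min_up) / total_n * d2           # Eq. 13: infection area
    if xi is None:
        xi = delta            # assumed default threshold (not fixed in paper)
    return (area >= xi).astype(np.uint8)              # Eq. 14: foreground mask
```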

3.2.2 Quartile deviation-based segmentation

Quartile deviation-based segmentation is a new segmentation method that can be mapped directly onto the input image prior to the thresholding step that generates a binary image. The method works on the basis of coupling, depending on the curve changes. The coupling points are utilized with the normalization function, because Q.D is a property of the normal distribution. Let f(t)∈Fcost(i,j,k) with dimension 256×256×3; the initial function is defined as:

$$ \int_{\mu - r}^{\mu + r}f(t)dt=\frac{1}{2} $$
(15)

where (μ−r) and (μ+r) represent the points of inflection. Substituting the normal density into the left-hand side of Eq. 15 gives:

$$ =\frac{1}{\sigma\sqrt{2 \pi}}\int_{\mu - r}^{\mu + r}e^{\frac{-1}{2}(\frac{t-\mu}{\sigma})^{2}} dt $$
(16)

Substituting \(\frac {t-\mu }{\sigma }=X\), so that dt=σdX, yields:

$$ Q.D=\frac{1}{\sqrt{2 \pi}}\int_{-\frac{r}{\sigma}}^{\frac{r}{\sigma}}e^{\frac{-X^{2}}{2}} dX $$
(17)
$$ \frac{1}{\sqrt{2 \pi}}\int_{-\frac{r}{\sigma}}^{\frac{r}{\sigma}}e^{\frac{-X^{2}}{2}} dX=\frac{1}{2} $$
(18)

Since the normal density is an even function, this becomes:

$$ \frac{1}{\sqrt{2 \pi}}\int_{0}^{\frac{r}{\sigma}}e^{\frac{-X^{2}}{2}} dX=\frac{1}{4} $$
(19)
$$ \frac{r}{\sigma}=0.6745, \quad r=0.6745 \sigma $$
(20)

where r denotes the final Q.D value, which is utilized in the cost function for the extraction of infectious regions in fruits and plants. The output of the cost function separates infectious pixels from normal ones.

$$ F_{\text{out}}(t)=\int \frac{1}{2(1+t)} \times \frac{1}{(1+r)^{3}}dt $$
(21)
$$ F_{\text{out}}(t)=\frac{1}{(1+r)^{3}}\int \frac{1}{2(1+t)}dt $$
(22)
$$ F_{\text{out}}(t)=\frac{1}{2(1+r)^{3}} \ ln(1+t) $$
(23)

where t∈Fcost(i,j,k) and Fout(t) represents the pixels showing infection, which are passed to the threshold function to obtain a binary segmented image.

$$ F_{QD}(i,j)=\left\{ \begin{array}{ll} 1 & if \ F_{\text{out}}(t) \geq t_{i} \\ 0 & if \ F_{\text{out}}(t)< t_{i} \end{array}\right. $$
(24)

where FQD(i,j) represents the final Q.D-based segmented image and ti denotes the current enhanced image pixel. The Q.D segmentation results, including their contour, mesh graph, and 3-D contour images, are shown in Fig. 6; a code sketch follows its caption.

Fig. 6 Proposed Q.D-based segmentation results. a Original image. b Segmented image. c Infection part mapped on the original image. d Contour image under mesh graph. e Contour of infection. f 3-D contour image
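The Q.D result r=0.6745σ plugs directly into the cost function of Eq. 23. The sketch below treats ti in Eq. 24 as a scalar threshold (defaulting to the mean of Fout), which is our assumption, since the text describes ti only as "the current enhanced image pixel".

```python
import numpy as np

def qd_segment(f_cost, t_i=None):
    """Quartile-deviation-based segmentation sketch (Eqs. 15-24)."""
    t = f_cost.astype(np.float64)
    r = 0.6745 * t.std()                            # Eq. 20: Q.D of a normal
    f_out = np.log1p(t) / (2.0 * (1.0 + r) ** 3)    # Eq. 23: cost function
    if t_i is None:
        t_i = f_out.mean()     # assumed scalar threshold for Eq. 24
    return (f_out >= t_i).astype(np.uint8)          # Eq. 24: binary mask
```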

3.2.3 Image fusion

The image fusion concept is employed wherever information from multiple sources (images) is consolidated into fewer images, usually a single one. In this article, a weighted coefficient of correlation (WCoC)-based technique is implemented for pixel-based fusion of the two segmented images. The actual range of the CoC is (−1 : 1), but since we work on binary images, the resultant image is binary as well. The method finds strongly correlated pixels across both images: the most highly correlated pixels are assigned higher weights, while weakly correlated pixels are considered background and eliminated. Suppose \(\{p_{1}, p_{2}, \dots, p_{n}\}\) are uncorrelated pixels from the two segmented images T(i,j) and FQD(i,j) with the same standard deviation; the correlation coefficient is defined as:

$$ \gamma_{12}=\frac{\sum(p_{1}-\bar{p_{1}})(p_{2}-\bar{p_{2}})}{\sqrt{\sum(p_{1}-\bar{p_{1}})^{2}\sum(p_{2}-\bar{p_{2}})^{2}}} $$
(25)

where γ12 denotes the correlation between pixels, initialized as γ12=0.

Let σ2(p1)=S2 and σ2(p2)=S2, so that {i,j}=(u1+u2). Letting (i,j)∈(x,y), the formulation becomes:

$$ \gamma_{xy}=\frac{\sum(x-\bar{x})(y-\bar{y})}{\sqrt{\sum(x-\bar{x})^{2}\sum(y-\bar{y})^{2}}} $$
(26)

The weight and bias values are then assigned, selected as 0.8 and 2.5, respectively.

$$ \gamma_{xy}=W_{xy}\times \frac{\sum(x-\bar{x})(y-\bar{y})}{\sqrt{\sum(x-\bar{x})^{2}\sum(y-\bar{y})^{2}}} \times bias $$
(27)

The above equation is simplified as:

$$ R_{i,j}=0.8 \times \frac{nS^{2}}{2nS^{2}} \times 2.5=1 $$
(28)

where \(\sigma ^{2}(p_{1})=\frac {\sum (p_{1}-\bar {p_{1}})^{2}}{n}, S^{2}=\frac {\sum (p_{1}-\bar {p_{1}})^{2}}{n}, nS^{2}=\sum (p_{1}-\bar {p_{1}})^{2}, \sum (x-\bar {x})^{2}=2nS^{2}, \sum (y-\bar {y})^{2}=2nS^{2}, \sum (x-\bar {x})(y-\bar {y})=nS^{2}\), and Ri,j retains those pixels for which the weighted correlation equals 1. The final fusion results are shown in Fig. 7; a code sketch follows its caption.

Fig. 7 Proposed fusion results. a Original image. b Segmented image. c Infection part mapped on the original image. d Contour of infection. e 3-D contour image. f Contour image under mesh graph
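One possible reading of the WCoC fusion is sketched below: the global weighted correlation of Eq. 27 decides whether agreement between the two masks is strong (union of foregrounds) or weak (intersection only). The weight 0.8 and bias 2.5 come from the text; the union/intersection fallback is our assumption, since the paper states only that highly correlated pixels are kept and weakly correlated ones are treated as background.

```python
import numpy as np

def wcoc_fuse(mask_a, mask_b, weight=0.8, bias=2.5):
    """Weighted coefficient-of-correlation fusion sketch (Eqs. 25-28)."""
    a = mask_a.astype(np.float64)
    b = mask_b.astype(np.float64)
    a0, b0 = a - a.mean(), b - b.mean()
    denom = np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum()) + 1e-8
    gamma = weight * (a0 * b0).sum() / denom * bias      # Eq. 27
    if gamma >= 1.0:                                     # Eq. 28 satisfied
        return np.logical_or(a > 0, b > 0).astype(np.uint8)
    return np.logical_and(a > 0, b > 0).astype(np.uint8)
```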

3.2.4 Analysis of segmentation results

For the analysis of the proposed segmentation technique against each disease, we selected 400 image samples (100 per disease: apple scab, apple rust, grapes rot leaves, and grapes leaf blight); a few can be seen in Fig. 8. Three measures are implemented to assess the performance of the proposed method: accuracy, Jaccard Index, and false negative rate, calculated as follows:

$$ J_{Index}=\frac{\left | R_{i,j}\cap S(i,j) \right |}{\left |R_{i,j} \right |+\left | S(i,j) \right |-\left |R_{i,j}\cap S(i,j) \right |} $$
(29)
$$ Acc=\frac{TP_{l}}{R_{i,j}(l)+S(i,j)(l)-TP_{l}}\times 100 $$
(30)
$$ FNR=100-Acc $$
(31)
Fig. 8 Sample images for the identification of infection parts using the proposed segmentation algorithm

where Ri,j is the proposed segmented image, S(i,j) is the ground truth, and TPl represents the correlated pixels; a code sketch of these measures follows this paragraph. Tabular results are provided in Table 2, and graphical results along with their ground truths are shown in Figs. 9 and 10. Additionally, a few other sample segmentation results are provided in Fig. 11. The maximum accuracy achieved on the tested images is 95.63%; moreover, the minimum reported false negative rate is 4.37%, the maximum Jaccard Index is 99.26%, the overall average accuracy is 93.74%, the average Jaccard Index is 94.17%, and the average false negative rate is 6.26%. Average results are also plotted in Fig. 12, which describes the range of segmentation accuracy over all selected images.
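A minimal sketch of the three measures is given below for binary masks (1 = infected). Note that, as written, Eq. 30 is the Jaccard Index of Eq. 29 expressed as a percentage, and Eq. 31 is its complement.

```python
import numpy as np

def segmentation_scores(r, s):
    """Measures of Eqs. 29-31 for binary masks r (proposed) and s (ground truth)."""
    r, s = r.astype(bool), s.astype(bool)
    tp = np.logical_and(r, s).sum()        # TP_l: overlapping pixels
    union = r.sum() + s.sum() - tp
    j_index = tp / union                   # Eq. 29: Jaccard Index
    acc = 100.0 * j_index                  # Eq. 30: accuracy (percent)
    fnr = 100.0 - acc                      # Eq. 31: false negative rate
    return j_index, acc, fnr
```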

Fig. 9 Proposed infection part detection results with ground truth image. a Original image. b Segmented image. c Infection part mapped on the original image. d Contour of infection part. e Ground truth image

Fig. 10 Proposed segmentation results with their respective ground truth image. a Original image. b Segmented image. c Contour of infection part. d Infection part mapped on the original image. e Ground truth image

Fig. 11 Proposed segmentation results. a Original image. b Segmented image. c Infection part mapped on the original image. d Contour of the infection part

Fig. 12 Proposed average segmentation results plotted in the form of boxplots

Table 2 Performance results of the proposed identification of infection parts in the fruits and plants

3.3 Feature extraction

Features play a vital role in recognizing the primary contents of images or signals; therefore, in the field of pattern recognition and machine learning, a range of techniques has been proposed [40–45]. On the one hand, an optimal set of features leads to accurate classification; on the other hand, irrelevant and redundant features are a major factor behind misclassifications. In this article, we not only utilize multiple sets of features but also avoid feature redundancy by implementing a suitable feature selection method. We extract three different types of features from the segmented images: statistical, color [46], and texture (segmented local binary patterns, SLBP).

For color features, the RGB, HSV, LAB, and YCbCr color spaces are used, and four measures (mean, standard deviation, entropy, and skewness) are calculated for each channel. Each color space yields a feature vector of size 1×12, which grows to 1×48 over all selected color spaces and N×48 for N images; a code sketch follows.
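A sketch of this step is given below using scikit-image color conversions; the 256-bin Shannon entropy is our assumption, as the paper does not specify the entropy estimator.

```python
import numpy as np
from scipy.stats import skew
from skimage import color

def color_features(rgb_u8):
    """Mean, std, entropy, and skewness per channel over four color spaces (1x48)."""
    rgb = rgb_u8.astype(np.float64) / 255.0
    spaces = [rgb, color.rgb2hsv(rgb), color.rgb2lab(rgb), color.rgb2ycbcr(rgb)]
    feats = []
    for img in spaces:
        for k in range(3):
            ch = img[:, :, k].ravel()
            hist, _ = np.histogram(ch, bins=256)
            p = hist / max(hist.sum(), 1)
            ent = -np.sum(p[p > 0] * np.log2(p[p > 0]))   # Shannon entropy
            feats += [ch.mean(), ch.std(), ent, skew(ch)]
    return np.array(feats)        # 4 spaces x 3 channels x 4 measures = 48
```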

For statistical features, the Haralick descriptor set [47] is implemented, which originally comprises 14 features; we add 8 new features: correlation 2, cluster prominence, cluster shade, dissimilarity, energy, homogeneity 1, homogeneity 2, and maximum probability. The addition of these features improves the overall classification accuracy but also increases the computational time. A complete mathematical description of each feature is provided in Table 3, and the final vector size is 1×88; a GLCM sketch follows Table 3.

Table 3 Extraction of twenty-two GLCM features
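As an illustration of the GLCM pipeline behind Table 3, the sketch below extracts the six co-occurrence properties exposed directly by scikit-image over four orientations; the remaining listed features (cluster shade, cluster prominence, etc.) would be derived from the same normalized co-occurrence matrix.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_u8):
    """GLCM texture features over four angles (subset of Table 3)."""
    angles = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]
    glcm = graycomatrix(gray_u8, distances=[1], angles=angles,
                        levels=256, symmetric=True, normed=True)
    props = ['contrast', 'dissimilarity', 'homogeneity',
             'energy', 'correlation', 'ASM']
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])
```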

LBP [48] belongs to the category of texture features, capturing information about neighboring pixels. In this work, the 'A' channel of the LAB color space is utilized as the input for feature extraction, because it provides more information than the other channels. The proposed segmented local binary pattern features (SLBPF) are based on three steps: (a) calculate the distances between the extracted LBP features, (b) calculate the statistical features of the LBP, and (c) calculate the entropy features of their 8 neighborhood features. The extracted features are concatenated to form a new feature vector of size 1×72.

$$ \Psi_{LBP}=\sum_{p=0}^{n-1}2^{p} \ S(g_{p}-g_{c}) $$
(32)

where ΨLBP is the feature vector, \( S(u)=\left \{ \begin {array}{ll} 1 & if \ \ u\geq 0 \\ 0 & if \ \ u<0 \end {array}\right \}\) is a threshold function, n=8 is the number of neighbors, gp denotes the pth neighboring pixel, and gc is the pivot (center) pixel [49]. The distance between features is calculated using the relation:

$$ \vec{D}_{ij}=\sum_{i=1}^{M}\sum_{j=1}^{N}\left [ f_{i}^{n}-f_{j}^{n} \right ]^{2}, n\in n^{th} \ features $$
(33)

where \(\vec {D}_{ij}\) denotes the distance matrix, which is later utilized to compute the mean, variance, skewness, and kurtosis. These metrics are concatenated to generate a new vector of dimension 1×64. The entropy features of the 8 neighboring features are computed as:

$$ Ent_{f}(i)=MK \sum_{i=1}^{8}(f_{i}\ ln f_{i}, a_{x}) $$
(34)
$$ Ent_{f}(j)=NK \sum_{j=1}^{8}(f_{j}\ ln f_{j}, a_{y}) $$
(35)

where ax and ay denote the neighboring ith and jth features; 8 entropy features are extracted and concatenated with the previous vector to obtain a feature vector of size 1×72. Finally, all features are fused [50] to generate a resultant vector of size 1×208 (48 color + 88 statistical + 72 SLBP). The core architecture of feature extraction and selection is shown in Fig. 13, and a code sketch of the SLBP chain follows its caption.

Fig. 13 A system architecture of the proposed feature extraction and selection
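The SLBP chain can be sketched as follows. The histogram-based reading of the distance step (Eq. 33) and the grouping of the LBP histogram into eight blocks for the entropy features (Eqs. 34–35) are our assumptions, so the output dimensions are illustrative rather than exactly 1×72.

```python
import numpy as np
from scipy.stats import skew, kurtosis
from skimage import color
from skimage.feature import local_binary_pattern

def slbp_features(rgb_u8):
    """Sketch of the SLBP feature chain (Eqs. 32-35) on the LAB 'A' channel."""
    a_chan = color.rgb2lab(rgb_u8.astype(np.float64) / 255.0)[:, :, 1]
    lbp = local_binary_pattern(a_chan, P=8, R=1)          # Eq. 32
    hist, _ = np.histogram(lbp, bins=64, density=True)
    d = (hist[:, None] - hist[None, :]) ** 2              # Eq. 33: squared distances
    moments = [d.mean(), d.var(), skew(d.ravel()), kurtosis(d.ravel())]
    ent = []                                              # Eqs. 34-35: block entropies
    for block in np.array_split(hist, 8):
        p = block / max(block.sum(), 1e-12)
        ent.append(-np.sum(p[p > 0] * np.log(p[p > 0])))
    return np.concatenate([hist, moments, ent])
```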

3.4 Feature selection

To avoid redundancy, the feature selection step plays a primary role by discarding irrelevant and repeated information, thereby retaining the most discriminant information. In this article, we implement a new method based on the rank correlation and entropy value of the fused vector. The proposed method incorporates three fundamental steps: (a) calculate the rank correlation between the fused features; (b) calculate the entropy value of the fused vector and multiply it by the rank correlation; and (c) apply a threshold function to select those features that fall at or below the entropy-correlation value. Given that the extracted fused features f1,f2,...,fn are ranked from 1 to n, we need to find the correlation between the ranks of the given features. The rank correlation is defined as:

$$ R_{F}=\frac{n\sum f_{1}f_{2}-\sum(f_{1})\sum(f_{2})}{\sqrt{(n\sum f_{1}^{2}-(\sum f_{1})^{2})(n\sum f_{2}^{2}-(\sum f_{2})^{2})}} $$
(36)

where f1 and f2 represent the fused feature vectors. The above equation simplifies using \(\sum f_{1}=\sum f_{2}=\frac {n(n+1)}{2}\) and \(\sum (f_{1})^{2}=\sum (f_{2})^{2}=\frac {n(n+1)(2n+1)}{6}\). Let φ=f1−f2 denote the difference between ranks; squaring, applying \(\sum \), and dividing both sides by 2 gives \(\sum f_{1}f_{2}=\frac {n(n+1)(2n+1)}{6}-\frac {\sum \varphi ^{2}}{2}\), so that \(n\sum f_{1}f_{2}-\sum (f_{1})\sum (f_{2})=\frac {n^{2}(n^{2}-1)}{12}-\frac {n\sum \varphi ^{2}}{2}\). Similarly, \(n\sum f_{1}^{2}-(\sum f_{1})^{2}\) and \(n\sum f_{2}^{2}-(\sum f_{2})^{2}\) both equal \(\frac {n^{2}(n^{2}-1)}{12}\). Substituting these simplifications into Eq. 36 yields:

$$ R_{F}=\frac{\frac{n(n^{2}-1)}{6}-\sum \varphi^{2}}{\frac{n(n^{2}-1)}{6}} $$
(37)
$$ R_{F}=1-\frac{6\sum \varphi^{2}}{n(n^{2}-1)} $$
(38)

where \(\sum \varphi ^{2}=\sum (f_{1})^{2}+ \sum (f_{2})^{2}-2\sum f_{1}f_{2}\). The entropy value of the fused feature vector is then calculated and multiplied by the correlation. The obtained value is compared with each feature of the fused vector, and features are selected based on the final threshold function as follows:

$$ EC(f_{i})=Entropy \ \times (1-\frac{6\sum \varphi^{2}}{n(n^{2}-1)}) $$
(39)
$$ \overrightarrow{F(Vec)}=\left\{ \begin{array}{ll} Remove & if \ f_{i}>EC(f_{i})\\ Select & if \ f_{i}\leq EC(f_{i}) \end{array}\right. $$
(40)

The resultant vector \(\overrightarrow {F(Vec)}\) is utilized for the final classification. Across repeated simulations, the selected vector size mostly falls in the range of 180–195. Finally, multi-class SVM [51] is used as the base classifier for the classification of apple and grape diseases, and its classification results are compared with other well-known classification methods such as ensembles and decision trees. Two SVM kernel functions are utilized in this work: linear and radial basis function (RBF). The linear kernel is used for the binary class problems with an automatic kernel scale, a one-vs-one classification method, and standardized data. For the RBF kernel, the parameters include a manual kernel scale, a box constraint level of 4, a one-vs-all multi-class method, and gamma initialized as 0.3. A code sketch of the selection rule follows.
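The selection rule of Eqs. 36–40 is sketched below with SciPy's Spearman correlation. Pairing the fused vector with a shifted copy of itself for the rank correlation, and the 32-bin entropy estimate, are our assumptions, since the paper leaves the pairing of f1 and f2 and the entropy estimator open.

```python
import numpy as np
from scipy.stats import spearmanr

def entropy_rank_select(fused):
    """Entropy-rank-correlation feature selection sketch (Eqs. 36-40)."""
    f1, f2 = fused[:-1], fused[1:]          # assumed pairing of features
    r_f, _ = spearmanr(f1, f2)              # Eqs. 36-38: rank correlation
    hist, _ = np.histogram(fused, bins=32)
    p = hist / max(hist.sum(), 1)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    ec = entropy * r_f                      # Eq. 39: entropy-correlation value
    return fused[fused <= ec]               # Eq. 40: keep features <= EC
```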

4 Experimental results and discussion

In this section, the proposed method is validated on a publicly available dataset, PlantVillage [52], containing sets of diseased and healthy images (Fig. 14). To establish the merits of the proposed algorithm, individual features are classified first, and fusion and selection are applied afterwards. A 10-fold cross-validation methodology is adopted, along with a fair comparison against other state-of-the-art classifiers, including decision trees (DT), quadratic discriminant analysis (QDA), quadratic SVM (Q-SVM), cubic SVM (C-SVM), fine KNN (F-KNN), weighted KNN (W-KNN), ensemble boosted trees (EBT), and ensemble subspace discriminant analysis (E-SDA). Six statistical measures are considered for the performance comparison: sensitivity (Sen), specificity (Spec), precision (Prec), false positive rate (FPR), false negative rate (FNR), and accuracy; a sketch of these measures follows. The training/testing ratio is 50:50. All experiments are performed in MATLAB 2017b on a personal computer with an Intel Core i7 processor.
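For reference, the six measures follow directly from the binary confusion matrix; per-class scores in the multi-class tables would be averaged over classes. A minimal sketch:

```python
def eval_measures(tp, fp, fn, tn):
    """Six statistical measures from a binary confusion matrix."""
    sen = tp / (tp + fn)                     # sensitivity (recall)
    spec = tn / (tn + fp)                    # specificity
    prec = tp / (tp + fp)                    # precision
    fpr = fp / (fp + tn)                     # false positive rate
    fnr = fn / (fn + tp)                     # false negative rate
    acc = (tp + tn) / (tp + fp + fn + tn)    # accuracy
    return sen, spec, prec, fpr, fnr, acc
```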

Fig. 14 Sample selected images for testing

4.1 Apple scab disease

In this section, the classification results for apple scab disease are presented. A total of 2275 images, comprising apple scab (630) and apple healthy (1645), are collected from the PlantVillage dataset. The results are accomplished in two phases. In the first phase, results are obtained on each extracted set of features, as depicted in Table 4, with maximum accuracies on multi-class SVM of 94.1%, 86.3%, and 72.0% for SLBP, statistical, and color features, respectively. These results are then compared with the proposed entropy-rank correlation-based selection method. Table 5 shows a maximum accuracy of 97.1%, FNR of 2.9%, sensitivity of 96.15%, specificity of 96.2%, FPR of 0.039, and precision of 96.10%. The proposed results are confirmed by the confusion matrix for apple scab given in Table 6. From Tables 4 and 5, it is clear that the proposed feature selection method produces the best results compared to the individual sets of features. Moreover, the proposed method is also compared with previous state-of-the-art methods in Table 7, which confirms the effectiveness of the proposed entropy-rank correlation method.

Table 4 Classification result on each extracted set of features for apple scab disease
Table 5 Proposed feature selection results of apple scab disease
Table 6 Confusion matrix of all experiments for selected diseases
Table 7 Comparison with existing methods

4.2 Apple rust disease

A total of 1920 images are collected from the PlantVillage dataset, containing apple rust (275) and apple healthy (1645) images. The experiments are performed in two steps. In the first step, classification results are obtained on each extracted set of features (Table 8); the maximum accuracies, achieved on the M-SVM classifier, are 93.2%, 90.9%, and 95.8% for SLBP, Haralick, and color features, respectively. In the second step, the selected features are utilized for classification using the proposed method, showing improved performance (Table 9). The classification results are also confirmed by the confusion matrix given in Table 6. From Tables 8 and 9, it is quite clear that performance improves significantly with the proposed feature selection method. Additionally, the proposed classification results are compared with existing methods in Table 7.

Table 8 Classification result on each extracted set of features for apple rust disease
Table 9 Classification results on the proposed feature selection approach for apple rust

4.3 Grape diseases

Two types of grape diseases, grapes rot leaves and grapes leaf blight, are selected for classification in this section. A total of 2679 images are collected from the PlantVillage dataset, comprising grapes black rot (1180), grapes leaf blight (1076), and healthy (423) images. The same trend is followed: in the first step, classification results are obtained on each extracted set of features (Table 10). In Table 10, the classification results for grapes rot leaves show accuracies of 93.2%, 90.9%, and 95.8% for SLBP, Haralick, and color features, respectively. The proposed classification results for grapes leaf blight are presented in Table 11, with a maximum accuracy of 96.30%, also confirmed by the confusion matrix (Table 6). Finally, the proposed results are compared with existing methods in Table 7, which shows that the proposed method performs significantly better.

Table 10 Classification results on each extracted set of features for grapes rot leaves disease
Table 11 Proposed classification results for grapes leaf blight

4.4 Final classification

In this section, all selected diseases are combined for classification, and the proposed method is applied directly to them. The testing results are given in Table 12, with a maximum accuracy of 97.1% on multi-class SVM. The testing results are confirmed by the confusion matrix given in Table 13, which demonstrates the reliability of the proposed method.

Table 12 Proposed classification results on all selected diseases
Table 13 Confusion matrix of the final classification

4.5 Discussion

From a broader perspective, two primary problems are addressed: (1) infected region segmentation and (2) discriminant feature selection. The proposed segmentation method relies directly on the fusion of images from two different sources; selected results can be seen in Figs. 7, 9, 10, and 11 and Table 2, with a maximum achieved accuracy of 95.63% and an average accuracy of 93.45%. In the latter phase, feature selection, three types of features are fused by implementing a simple serial-based method and finalized using the entropy-rank correlation method. Five experiments are conducted on the selected diseases (apple scab, apple rust, grapes rot leaves, and grapes leaf blight) plus a final classification on all diseases, achieving accuracies of 97.1%, 94.70%, 96.60%, 96.30%, and 97.7%, respectively. For validation, classification results are obtained on each individual feature type, as presented in Tables 4, 8, and 10. The proposed entropy-rank correlation results are presented in Tables 5, 9, 14, 11, and 12 and confirmed by the confusion matrices in Tables 6 and 13, which clearly demonstrate the reliability of the proposed method. Additionally, the 8 new statistical features improve the overall accuracy by embedding a set of unique features (Fig. 15). As Fig. 15 shows, with the original 14 texture features the achieved accuracies are 81.9%, 82.7%, 81.8%, and 84.5% for apple scab, apple rust, grapes rot, and grapes leaf blight, respectively, whereas the addition of the 8 features increases the overall accuracies to 86.3%, 87.2%, 90.9%, and 91.7%, respectively.

Fig. 15 Change in the original 14 texture feature results after the addition of 8 new texture features. The lower lines show the accuracy of the original features, whereas the upper lines present the accuracy after the addition of the 8 features; the values on the left of the plot indicate accuracy

Table 14 Classification results on the proposed feature selection approach for grapes rot leaves

In Fig. 16, the F1 score is calculated for the proposed feature selection approach for all selected diseases: apple scab, apple rust, grapes rot, and grapes leaf blight. The results in terms of sensitivity, precision, F1 score, and accuracy show that the proposed feature selection method performs better than the individual feature sets. Finally, a comparison with the latest techniques is conducted in Table 7, which shows that the proposed method performs significantly better than existing methods.

Fig. 16 F1 score of the proposed feature selection algorithm. a F1 score for apple rot. b F1 score for apple rust. c F1 score for grapes rot. d F1 score for grapes leaf blight

5 Conclusion

Detection and classification of fruit diseases is an important research area in the field of computer vision and pattern recognition. Due to the complexity and irregularity of diseases in apple and grape leaves/fruits, several existing methods are unable to achieve the required classification accuracy. Therefore, in this article, a new technique is implemented for apple and grape disease detection and classification, based on the fusion of a novel adaptive thresholding and a Q.D-based segmentation. Later, different sets of features are extracted and subjected to serial-based fusion. A novel entropy-rank correlation technique is implemented for robust feature selection, which works efficiently compared to individual features and existing related methods in terms of accuracy, sensitivity, precision, and FPR. The proposed method works efficiently not only on web images but also on publicly available datasets, which contain many challenges such as noise and background complexity. From this research, we conclude that combining different sets of features increases the overall accuracy but also increases the computational time and complexity; therefore, a feature selection step is practically mandatory. The segmentation step plays its role in the extraction of better features, leading to better classification. As future work, deep features will be utilized instead of conventional ones, and the number of diseases will be increased, but the selection step remains obligatory even with deep features.