Abstract

With the advancements in biomedical imaging applications, it becomes increasingly important to provide significant results when searching biomedical imaging data. During health emergencies such as tremors, efficient results are required at rapid speed for spatial queries over the Web. An efficient biomedical search engine can obtain the significant search intention and return additional important contents in which users have already indicated some interest. The development of biomedical search engines is still an open area of research. Recently, many researchers have utilized various deep-learning models to improve the performance of biomedical search engines. However, the existing deep-learning-based biomedical search engines suffer from overfitting and hyperparameter tuning problems. Therefore, in this paper, a nondominated-sorting-genetic-algorithm-III- (NSGA-III-) based deep-learning model is proposed for biomedical search engines. Initially, the hyperparameters of the proposed deep-learning model are obtained using the NSGA-III. Thereafter, the proposed deep-learning model is trained by using the tuned parameters. Finally, the proposed model is validated on the testing dataset. Comparative analysis reveals that the proposed model outperforms the competitive biomedical search engine models.

1. Introduction

The advancements in biomedical imaging applications lead to the challenge of providing significant results when searching biomedical imaging data. During health emergencies such as tremors, efficient results are required at rapid speed for spatial queries over the Web [1]. Search engines which allow obtaining specific medical contents along with complementary and different details would considerably help biomedical researchers [2]. An efficient biomedical search engine can obtain the significant search intention and return additional important contents in which users have already shown some interest [3, 4].

Numerous biomedical search engines have been implemented, such as the inverted index and Boolean retrieval [5–8]. However, index sizes are becoming exponentially large as the number of biomedical contents is increasing at a rapid rate [5]. Thus, the ranking of biomedical contents has been achieved using their respective retrieval frequency [8]. It has been found that computing the results when the biomedical queries have minimum similarity scores is still an open area of research [9–11].

Essie is a well-known biomedical search engine which provides services to various websites at the National Library of Medicine. It is a phrase-based search engine with concept- and term-based query expansion and probabilistic relevancy ranking. It has proven that a judicious combination of exploiting document structure, phrase searching, and concept-based query expansion is a beneficial method for information retrieval in the biomedical field [12].

Bidirectional Encoder Representations from Transformers (BERT) has shown good advancement in the field of biomedical search engines. In precision medicine, matching patients to appropriate investigational support or probable therapies is a difficult job which needs both biological and clinical information. To resolve it, BERT-based ranking models can provide fair comparisons [13]. A computer-based query recommendation model was designed which recommends semantically interchangeable terms based on an initial user-entered query [14, 15].

The typical architecture of a biomedical search engine is represented in Figure 1. Initially, potential features of biomedical images are extracted. Thereafter, the similarity score is computed. A prediction model is then utilized to compute the image index. Finally, the obtained class-wise results are returned to the users.
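This pipeline can be sketched in Python. The feature extractor, similarity measure, and function names below are illustrative stand-ins (an intensity histogram and cosine similarity), not the components used in the paper:

```python
import numpy as np

def extract_features(image):
    # Placeholder feature extractor: a real system would use a CNN;
    # here a flattened intensity histogram serves as a stand-in.
    hist, _ = np.histogram(image, bins=16, range=(0.0, 1.0), density=True)
    return hist

def similarity_score(query_features, index_features):
    # Cosine similarity between the query and an indexed image.
    num = float(np.dot(query_features, index_features))
    den = float(np.linalg.norm(query_features) * np.linalg.norm(index_features)) + 1e-12
    return num / den

def search(query_image, indexed_images, top_k=3):
    # Rank indexed images by similarity to the query and return the top results.
    q = extract_features(query_image)
    scores = [(name, similarity_score(q, extract_features(img)))
              for name, img in indexed_images.items()]
    scores.sort(key=lambda s: s[1], reverse=True)
    return scores[:top_k]

rng = np.random.default_rng(0)
index = {f"img{i}": rng.random((8, 8)) for i in range(5)}
results = search(rng.random((8, 8)), index, top_k=2)
print(results)
```

In a full system, the prediction model discussed above would replace the plain similarity ranking with learned class-wise scores.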

The primary contributions of this paper are as follows:
(1) An NSGA-III-based deep-learning model is proposed for biomedical search engines.
(2) The hyperparameters of the proposed deep-learning model are obtained using the NSGA-III.
(3) The proposed deep-learning model is trained by using the tuned parameters obtained from NSGA-III.
(4) Finally, the proposed model is validated on the testing dataset.

The remaining paper is organized as follows: The related work is presented in Section 2. Section 3 presents the proposed biomedical search engine. Comparative analysis is discussed in Section 4. Section 5 concludes the paper.

2. Related Work

Hsieh et al. [16] proposed a semantic similarity approach by utilizing the page counts of two biomedical contents obtained from the Google AJAX web search engine. The features were extracted in co-occurrence form by considering the two provided words. Support vector machines (SVMs) were utilized for the classification purpose. Mao and Tian [17] utilized TCMSearch as a semantic-based search engine for biomedical images. It has shown good results for biomedical contents. Wang et al. [3] designed an ontology-graph-based web search engine named G-Bean for evaluating biomedical contents from the MEDLINE database. A multithreading parallel approach was utilized to build the document index. Kohlschein et al. [18] showed that ViLiP can be efficiently utilized to search the contents in PubMed. ViLiP was further improved using an NLP-based semantic search engine for obtaining better drug-related contents within a query. Depending upon the linguistic annotations, significant drug names can be obtained.

Tsishkou et al. [19] designed the TTA10 approach, which stores biomedical data in a hierarchical fashion. Logarithmic complexity was utilized to retrieve data from a huge repository. AdaBoost was utilized to integrate independent search results to obtain efficient results. Mao et al. [1] designed a prototype model of a subject-oriented spatial-content-based search engine (SOSC) for critical public health hazards. It can obtain Web contents from the Internet, build the Web page database, and obtain spatial content from these Web pages during a pandemic. Boulos [20] designed a GeoNames-powered PubMed search which has the ability to handle these problems. The geographic ontology can utilize potential words to obtain significant results from PubMed. Al Zamil and Can [21] improved the contextual retrieval and ranking performance (CRRP) with minimal input from researchers. The performance was evaluated using the retrieval procedure in terms of topical ranking, precision, and recall. Grandhe et al. [22] designed an ascendable search engine (ASE) for biomedical images. Researchers can select a region of interest iteratively to evaluate the corresponding region from the images. An efficient cluster-based engine was designed to reduce the content retrieval time. Mishra and Tripathi [23] proposed a vector- and deep-learning-based (VDP) biomedical search engine model. The degree of similarity was computed by integrating the vector space and deep-learning models.

The implementation of efficient biomedical search engines is still a challenging issue [24]. Recently, many researchers have utilized various deep-learning models to improve the performance of biomedical search engines [25]. However, the existing deep-learning-based biomedical search engines suffer from the overfitting and hyperparameter tuning problems.

3. Proposed Model

This section discusses the proposed biomedical search engine. Initially, the deep-learning model is discussed. Thereafter, the tuning of the deep-learning model is achieved using the NSGA-III.

3.1. Deep Convolutional Neural Network

The deep convolutional neural network (CNN) is widely accepted for classification problems, and many researchers have utilized it in the field of search engines. Figure 2 shows the deep-learning-model-based biomedical search engine. It utilizes numerous convolution filters to extract the potential features.

We assume a single-channel input, which is mathematically computed as

$x_{1:n} = x_1 \oplus x_2 \oplus \cdots \oplus x_n$.

Here, $x_i \in \mathbb{R}^{k}$, where $k$ shows the dimension of every input factor, $n$ represents the number of images, and $\oplus$ denotes concatenation. During the convolution operation, a filter $w \in \mathbb{R}^{hk}$ is utilized to extract the potential features as

$c_i = f(w \cdot x_{i:i+h-1} + b)$,

where $b \in \mathbb{R}$ represents a bias, $x_{i:i+h-1}$ is the integration of $x_i, \ldots, x_{i+h-1}$, and $f(\cdot)$ shows an activation function. The filter $w$ approaches every possible window $\{x_{1:h}, x_{2:h+1}, \ldots, x_{n-h+1:n}\}$. Thus, the feature map can be computed as

$c = [c_1, c_2, \ldots, c_{n-h+1}]$.

Max pooling is applied on $c$ to compute the peak value as $\hat{c} = \max\{c\}$. It shows the final feature obtained using the filter $w$.

The proposed model evaluates numerous feature groups by utilizing various filters with different sizes. The computed feature groups return a vector as

$z = [\hat{c}_1, \hat{c}_2, \ldots, \hat{c}_m]$,

where $m$ shows the number of filters. Softmax is utilized to evaluate the prediction probability as

$\hat{y} = \mathrm{softmax}(W_o z + b_o)$,

where $W_o$ and $b_o$ denote the weights and bias of the output layer.

We assume a training set $(s^{(j)}, y^{(j)})$, where $y^{(j)}$ shows the similarity label of the biomedical image query $s^{(j)}$ for the search engine, and the prediction probability of the proposed model is $\hat{y}^{(j)}_{l}$ for each label $l$. The computed error can be defined as

$E = -\sum_{j}\sum_{l \in L} \mathbb{1}\{y^{(j)} = l\} \log \hat{y}^{(j)}_{l}$.

Here, $L$ represents the set of labels, and $\mathbb{1}\{\cdot\}$ shows an indicator that equals $1$ if $y^{(j)} = l$ and $0$ otherwise. Gradient descent is then utilized to update the deep-learning variables.
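The convolution, max-pooling, softmax, and cross-entropy steps can be sketched in NumPy. The one-dimensional filters, sizes, and random inputs below are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def conv_features(x, w, b):
    # c_i = f(w . x[i:i+h-1] + b), with ReLU as the activation f.
    h = len(w)
    c = np.array([np.dot(w, x[i:i + h]) + b for i in range(len(x) - h + 1)])
    return np.maximum(c, 0.0)          # feature map c

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

rng = np.random.default_rng(1)
x = rng.random(10)                          # one single-channel input
filters = [rng.random(3), rng.random(5)]    # filters with different sizes
# Max pooling keeps the peak value of each feature map, giving the vector z.
z = np.array([conv_features(x, w, 0.1).max() for w in filters])

W_o = rng.random((4, 2))                    # output layer: 4 labels, 2 filters
b_o = rng.random(4)
y_hat = softmax(W_o @ z + b_o)              # prediction probabilities
loss = -np.log(y_hat[2])                    # cross-entropy for true label 2
print(y_hat, loss)
```

The per-example loss here is the inner sum of the error $E$ above; summing it over the training set and differentiating gives the gradient-descent updates.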

3.2. Nondominated Sorting Genetic Algorithm-III

Nondominated sorting genetic algorithm-III (NSGA-III) [26] is widely accepted to solve numerous engineering applications. Recently, many researchers have utilized NSGA-III to solve hyperparameter tuning issues with deep-learning models [9, 11, 27].

The nomenclature of NSGA-III is demonstrated in Table 1. The generation of the initial population is represented in Algorithm 1. Initially, the random population is computed. The computed solutions are then encoded to the initial attributes of CNN.

Optimal CNNs.
Evaluate CNN on optimal CNNs.
while … do
    Consider the CNN with maximum performance
    if … then
        …
    else
        …
    end if
end while
Select a random set of solutions from … using a normal distribution
Obtain a group of random solutions
return …
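The encoding of random solutions as initial CNN attributes can be sketched as follows. The hyperparameter names and ranges in `SPACE` are hypothetical illustrations, not the search space used in the paper:

```python
import random

# Hypothetical hyperparameter search space for the CNN; names and
# ranges are illustrative assumptions only.
SPACE = {
    "learning_rate": (1e-4, 1e-1),
    "dropout": (0.0, 0.5),
    "num_filters": (8, 128),
    "kernel_size": (2, 7),
}

def random_individual(rng):
    # Encode one candidate solution as a dict of sampled hyperparameters:
    # integer ranges are sampled with randint, real ranges with uniform.
    ind = {}
    for name, (lo, hi) in SPACE.items():
        if isinstance(lo, int):
            ind[name] = rng.randint(lo, hi)
        else:
            ind[name] = rng.uniform(lo, hi)
    return ind

def initial_population(size, seed=0):
    rng = random.Random(seed)
    return [random_individual(rng) for _ in range(size)]

pop = initial_population(10)
print(pop[0])
```

Each such individual would then be decoded into a concrete CNN configuration before training and fitness evaluation.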

The NSGA-III- and CNN-based biomedical search engine is discussed in Algorithm 2. The random-population-based CNN models are trained on a chunk of the biomedical dataset. The fitness of the computed CNN models is then evaluated. Solutions are then divided into dominated and nondominated groups. Crossover and mutation operators are further employed to compute the children. Nondominated sorting is implemented to sort the nondominated solutions. Based upon the termination criteria, the tuned parameters of the CNN models are returned.

Select randomly … solutions from the given elite
for all … do
    Decode … as hyperparameters of the CNN
    for … to … do
        Compute a random-population-based CNN in …
        if … then
            …
        end if
    end for
    if … then
        …
    end if
    for … to … do
        Select randomly an …
        if … then
            …
        else
            …
        end if
        if … then
            …
        end if
    end for
end for
if … then
    Select solutions obtained from NSGA-III
end if

Each random individual is decoded into the initial parameters of the CNN model.
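The grouping of solutions into dominated and nondominated sets, on which NSGA-III relies, can be illustrated with a minimal two-objective sketch. The objective values below (validation error and model size) are made-up examples:

```python
def dominates(a, b):
    # a dominates b if it is no worse on every objective and strictly
    # better on at least one (objectives are minimized).
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def first_front(objectives):
    # Return the indices of the nondominated solutions (the first front).
    front = []
    for i, fi in enumerate(objectives):
        if not any(dominates(fj, fi) for j, fj in enumerate(objectives) if j != i):
            front.append(i)
    return front

# Hypothetical objectives per candidate CNN: (validation error, model size).
objs = [(0.10, 5.0), (0.08, 9.0), (0.12, 4.0), (0.11, 9.5)]
print(first_front(objs))  # -> [0, 1, 2]; the last candidate is dominated
```

Full NSGA-III additionally sorts the remaining solutions into further fronts and uses reference points for diversity; this sketch only shows the dominance test at its core.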

4. Performance Analysis

The proposed biomedical search engine is implemented in MATLAB 2019a with the help of the deep-learning and image processing toolboxes. The proposed and the existing models are tested on the biomedical search engine dataset. The proposed model is compared with competitive models such as TCMSearch [17], SVM [16], G-Bean [3], TTA10 [19], ViLiP [18], SOSC [1], GeoNames [20], CRRP [21], ASE [22], and VDP [23]. To compute the performance of the NSGA-III-based CNN model, median and variation values (i.e., median ± variation) are computed. One portion of the biomedical dataset is used for building the model, another portion is used for validation purposes, and the remaining portion is used for testing purposes.

The training and validation loss analysis of the NSGA-III-based CNN model is represented in Figure 3. It clearly shows that the loss difference between training and validation is significantly small; therefore, the NSGA-III-based CNN model is least affected by the overfitting issue. Additionally, the loss approaches convergence as the epochs proceed. Thus, the proposed model is trained efficiently on the biomedical images.

Training and testing analyses of the NSGA-III-based CNN model are depicted in Tables 2 and 3. Specificity, area under the curve (AUC), sensitivity, f-measure, and accuracy metrics have been utilized to evaluate the performance of the NSGA-III-based CNN model against competitive models such as TCMSearch [17], SVM [16], G-Bean [3], TTA10 [19], ViLiP [18], SOSC [1], GeoNames [20], CRRP [21], ASE [22], and VDP [23]. It has been observed that the proposed model outperforms the competitive models. Bold values indicate the highest performance among the biomedical search engines. Comparative analysis reveals that the proposed model outperforms the competitive biomedical search engine models in terms of specificity, AUC, sensitivity, f-measure, and accuracy by , 1.8372, 1.8328, 1.4838, and 1.4828, respectively.
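The evaluation metrics above can be computed directly from confusion-matrix counts. The labels below are a toy binary example, not data from the paper (AUC is omitted since it requires continuous scores rather than hard predictions):

```python
def binary_metrics(y_true, y_pred):
    # Confusion-matrix counts for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)                 # true positive rate (recall)
    specificity = tn / (tn + fp)                 # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / len(y_true)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "f_measure": f_measure}

m = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
print(m)
```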

The comparative analysis of the NSGA-III-based CNN model with the state-of-the-art approaches is depicted in Table 4. It has been observed that the NSGA-III-based CNN model achieves significantly better results than the existing web search engines.

5. Conclusions

This paper has proposed an efficient model for biomedical search engines. It has been found that deep-learning models can be used to improve the performance of biomedical search engines. However, the existing deep-learning-based biomedical search engines suffer from overfitting and hyperparameter tuning problems. Therefore, an NSGA-III-based CNN model was proposed for biomedical search engines. Initially, the hyperparameters of the proposed model were obtained using the NSGA-III. Thereafter, the proposed CNN model was trained by using the tuned parameters. Finally, the proposed model was validated on the testing dataset. Comparative analysis reveals that the proposed model outperforms the competitive biomedical search engine models in terms of specificity, AUC, sensitivity, f-measure, and accuracy by , 1.8372, 1.8328, 1.4838, and 1.4828, respectively.

Data Availability

The dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.