Abstract

In this study, an algorithm based on improved maximally stable extremal regions (MSER) was used to match and recognize the content of airfield signage, so as to overcome recognition failures caused by low visibility or by the overlapping fields of view of multiple aircraft. Firstly, the MSER algorithm was employed to combine regional structure similarity with feature space, realizing the matching and recognition of signage in airport flight areas and improving the stability of signage content recognition. Secondly, the non-maximum suppression (NMS) algorithm was used to threshold and eliminate the repeated recognitions produced by the MSER algorithm. Finally, an identification plate on the taxiing paths of an airport flight area was selected for simulation. The experimental results demonstrated that the optimized MSER algorithm improves the accuracy of identification plate content recognition.

1. Introduction

No intelligent identification projects have yet been conducted on airfield signs; most related work concerns the taxiing identification of aircraft. However, the taxi guidance of arriving and departing aircraft has received more and more attention with the emergence of extreme weather (such as low-level wind shear, heavy fog, and snow), the increase in flight traffic, and problems in the signal relay of communication and navigation facilities in remote areas. Commonly used pattern recognition algorithms include machine learning methods, back-propagation (BP) networks, and convolutional neural networks (CNN). However, for safety reasons, no unified database exists for airfield sign identification. Meanwhile, the response time of the above algorithms is too long for the short arrival and departure windows of aircraft, so they cannot be run on the aircraft flight management system (FMS). Moreover, the rapid development of intelligent air traffic control technology makes it more convenient to travel by plane, and target recognition plays an increasingly crucial role as a part of this technology. Taking airports as an example, target recognition is already applied in automatic berth guidance systems [1], the identification of aircraft registration numbers in air traffic management [2], and engine borescope inspection in ground aircraft maintenance [3, 4]. Airport operation efficiency and flight punctuality can be effectively improved by introducing target recognition technology into the airport in combination with the intelligent optimization and scheduling algorithms proposed by Zhao et al. [5-7].

Object recognition technology is mainly composed of three parts: recognition positioning, character segmentation, and text recognition. At present, commonly used recognition and positioning algorithms are based on texture and color features [8, 9], domain transformation processing [10], and geometric form analysis [11]; character segmentation algorithms mainly consist of the gap method [12], the projection method [13, 14], and the connected domain method [15, 16]. For example, Yang et al. [17] proposed a connected domain bundle segmentation algorithm for irregularly arranged texts, but it was sensitive to extraction errors in the foreground area of the text. Zhao et al. [18] designed a multifeature texture image segmentation method combined with region division; however, the segmentation time was too long, and the edge segmentation accuracy was not improved. Chen et al. [19] presented a multithreshold image segmentation algorithm based on the firefly algorithm, which did not preserve edge information and details well. Barthakur and Sarma [20] constructed a neural network combined with the aggregation class division method; nonetheless, the complexity of the algorithm was high, and the traversal speed was slow. Character recognition algorithms primarily comprise the template matching method [21] and the feature extraction method [22]. For example, Wu et al. [23] adopted feature point clustering to identify horizontal text areas, but the clustering points tended to be too concentrated. Liu and Samarabandu [24] employed the second derivative of images to process text edge information and performed text recognition through regional fusion positioning; however, the response time was too long, and data was easily lost. Neumann and Matas [25] utilized edge and color information to extract and recognize text content. Tian et al. [26] recognized horizontal texts with CTPN but achieved a poor detection effect on irregular texts. Zhou et al. [27] handled vertical text recognition using EAST; nonetheless, longer texts could not be recognized.

The above character recognition algorithms cannot adapt to changes in the shooting environment or angle, and recognition distortion occurs in poor visibility. Likewise, the above identification and positioning algorithms were adapted to a specific angle and to good lighting, so it is difficult to ensure the accuracy, real-time performance, and robustness of sign identification under low visibility and low contrast. Therefore, the control should be optimized on the basis of ensuring the integrity of the identification information [28, 29]. In most cases, the background of the character images processed was relatively simple, and a good recognition effect could be reached through traditional segmentation methods. However, traditional segmentation methods did not achieve a good effect against the complex background of airport flight areas. The MSER algorithm supports multiscale detection with the basic functions of image retrieval, analysis, and recognition. Importantly, it is insensitive to lighting, which improves the accuracy of identification and positioning of the identification plates. In addition, NMS was adopted to screen the detections by threshold in combination with the boundary attribute of the MSER algorithm, which accelerates the traversal query speed of the algorithm.

2. Identification of Identification Plates and Positioning Detection

2.1. Pretreatment through MSER Extraction

The MSER is an effective feature detection operator that works on grayscale images, so the original color image is first converted into a uint8 grayscale image. At the same time, preprocessing weakens the influence of external environmental factors, strengthens the contrast of the gray image, and highlights the sign area. A segmented linear gray transformation (gray values in the range 0~255) is expressed as Equation (1), whose grayscale transformation nodes are adjusted to appropriate values after repeated tests. With suitable nodes, the transformation highlights the areas with signs while suppressing those without signs. The original images and the preprocessed gray images are illustrated in Figures 1 and 2, which cover different angles, backlights, and directional signs in order to verify the effectiveness of the improved MSER algorithm.
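The segmented linear gray transformation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the node names `node_in` and `node_out` and their default values are hypothetical (the paper tunes its nodes experimentally), and plain Python lists are used so the sketch runs without any image library.

```python
def piecewise_linear_gray(value, node_in=(80, 160), node_out=(40, 220)):
    """Map one gray level in [0, 255] through a three-segment linear curve.

    node_in and node_out are hypothetical node positions (assumptions);
    they must satisfy 0 < node_in[0] < node_in[1] < 255.
    """
    a, b = node_in    # input gray levels where the slope changes
    c, d = node_out   # output gray levels assigned to those nodes
    if value < a:     # dark segment: compressed
        return c / a * value
    if value < b:     # middle segment: stretched to raise contrast
        return (d - c) / (b - a) * (value - a) + c
    # bright segment: compressed
    return (255 - d) / (255 - b) * (value - b) + d


def enhance(image):
    """Apply the transformation to every pixel of a 2D list of gray levels."""
    return [[round(piecewise_linear_gray(v)) for v in row] for row in image]
```

With these assumed nodes, the middle range 80 to 160 is stretched to 40 to 220, which is the kind of contrast enhancement the paragraph describes; the dark and bright ends are compressed accordingly.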

After gray preprocessing, an image with good contrast was obtained, and then the MSER regions were extracted. The process is detailed as follows. (1) Pixel sorting: the contrast-enhanced gray values obtained according to Equation (1) were sorted by value. (2) Connected domain extraction: the gray image was binarized with a threshold, and the threshold was increased step by step; for each threshold, the connected regions in the corresponding binarized image were recorded, together with how each region changed when the threshold moved a small step below and above that value. (3) Area ratio calculation: the rate of change of the region area with respect to the threshold was computed, and a region at which this rate of change reaches a local minimum, that is, a region whose area stays nearly constant over a range of thresholds, is a maximally stable extremal region [30].
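Step (3) can be illustrated with a small sketch. It assumes the standard MSER stability measure, in which the area change of one nested region across a threshold step `delta` is divided by the area at the current threshold; the function names and the sample areas are hypothetical and serve only to show where the local minimum falls.

```python
def stability(areas, delta):
    """Relative area change q(i) of one nested region for each valid threshold i.

    areas[i] is the pixel count of the region at threshold i. Because the
    regions are nested, the area of the set difference between thresholds
    i + delta and i - delta equals the difference of the two pixel counts.
    """
    return {
        i: (areas[i + delta] - areas[i - delta]) / areas[i]
        for i in range(delta, len(areas) - delta)
    }


def most_stable_thresholds(areas, delta=2):
    """Thresholds at which q(i) has a local minimum (maximally stable regions)."""
    q = stability(areas, delta)
    keys = sorted(q)
    return [
        i
        for prev, i, nxt in zip(keys, keys[1:], keys[2:])
        if q[i] < q[prev] and q[i] <= q[nxt]
    ]


# Hypothetical region that grows quickly, plateaus, then merges with background.
areas = [10, 30, 60, 100, 104, 106, 108, 112, 160, 240, 400]
stable = most_stable_thresholds(areas, delta=2)
```

In this example the plateau around thresholds 4 to 7 produces the smallest relative change, so the threshold in the middle of the plateau is reported as stable.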

The MSER algorithm is robust to viewpoint, size, and lighting conditions but is overly sensitive to blurry images, which can cause recognition errors; therefore, edge detection segmentation was adopted in our study to extract the MSER regions. Candidate areas for identification plates were obtained by splitting the MSER regions along edges and were then filtered based on the geometric attributes of identification plate characters (aspect ratio of the minimum external rectangle, area size, and duty cycle). According to prior knowledge of character recognition, the duty cycle of a connected domain was restricted to the interval [0.2, 0.6], and the pixel count of a selected connected domain was restricted to the range 100 to 2000, so as to reduce the impact of noise on the images. The connected domain should therefore satisfy the constraints of Equation (2), which are stated in terms of the aspect ratio of the external rectangle (computed from its height and width), the total number of pixels, and the duty cycle. The images extracted through the MSER are displayed in Figure 3.
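The geometric filter can be sketched as a single predicate. The duty cycle interval [0.2, 0.6] and the pixel range 100 to 2000 come from the text; the aspect ratio bounds, the definition of the aspect ratio as height divided by width, and the definition of the duty cycle as foreground pixels divided by the area of the external rectangle are assumptions made for illustration.

```python
def keep_region(width, height, pixel_count,
                aspect_range=(0.5, 5.0),   # hypothetical bounds, not from the paper
                area_range=(100, 2000),    # pixel range stated in the text
                duty_range=(0.2, 0.6)):    # duty cycle interval stated in the text
    """Return True if a connected domain passes all three geometric constraints."""
    if width <= 0 or height <= 0:
        return False
    aspect = height / width                  # assumed definition: height / width
    duty = pixel_count / (width * height)    # assumed definition: fill ratio
    return (aspect_range[0] <= aspect <= aspect_range[1]
            and area_range[0] <= pixel_count <= area_range[1]
            and duty_range[0] <= duty <= duty_range[1])
```

For instance, a 20 by 40 rectangle containing 400 foreground pixels has a duty cycle of 0.5 and passes, while the same rectangle with 700 pixels (duty cycle 0.875) or only 50 pixels is rejected.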

2.2. Morphology-Based Stroke Width Transform (SWT) Processing

The content of the significant signs in nonflight zones was eliminated using the MSER algorithm to improve the accuracy of the candidate connected domains. After MSER processing, the remaining connected domains were morphologically processed to complete the stroke width transformation; that is, the operations of Equation (3) were applied to the regions in Figure 3. In Equation (3), the set of candidate connected domains is eroded repeatedly by a structural element, and the quantity of interest is the number of the last iteration before the candidate connected domain set is eroded to an empty set. The Euclidean distance from each foreground pixel to the center pixel was calculated according to the erosion order. Concurrently, the Euclidean distance from the boundary elements in the outer rectangle of each area to the center line elements was calculated. The value obtained was half of the stroke width [31], which completes the width transformation. After the width value of each candidate area was obtained, the connected domains were filtered, and those meeting the requirements of the Civil Airport Flight Area Technical Standard Manual were retained. The connected domains were further screened by Equation (4), which involves the stroke width change factor, the width value of each pixel in the image, and the number of pixels in the connected domain.
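The erosion-based estimate of stroke width can be sketched in a few lines. The 3x3 square structuring element, the zero padding at the image border, and the `2 * n + 1` pixel correction are assumptions made for this illustration; the text only states that the erosion count corresponds to about half the stroke width.

```python
def erode(mask):
    """One binary erosion with a 3x3 square structuring element (assumed shape).

    Pixels outside the image are treated as background.
    """
    h, w = len(mask), len(mask[0])

    def inside(y, x):
        return 0 <= y < h and 0 <= x < w and bool(mask[y][x])

    return [[all(inside(y + dy, x + dx)
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)] for y in range(h)]


def last_nonempty_erosion(mask):
    """Number of erosions that can be applied while the region stays non-empty."""
    n = 0
    current = erode(mask)
    while any(any(row) for row in current):
        n += 1
        current = erode(current)
    return n


def stroke_width_estimate(mask):
    """Approximate stroke width in pixels: about twice the erosion count."""
    if not any(any(row) for row in mask):
        return 0
    return 2 * last_nonempty_erosion(mask) + 1


# A horizontal stroke 5 pixels thick inside a 9 x 20 image.
bar = [[2 <= y <= 6 and 2 <= x <= 17 for x in range(20)] for y in range(9)]
```

For the 5-pixel stroke, two erosions leave a one-pixel center line and the third empties the set, so the erosion count of 2 is roughly half the stroke width, as the paragraph states. For strokes of even thickness this estimate is off by one pixel.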

2.3. Identification Plate Fine Positioning

After the above screening, the candidate connected domain areas of a single identification plate were obtained. The connected domains can then be aggregated, finely positioned, and identified based on the characteristics of the identification plate. Fine positioning required the construction of adjacent pairs over all areas of an image according to the height, width, and center point coordinates of their external rectangles [32]. Then, neighboring nodes were connected under the principle of least distance: the nearest connected domain that satisfied the conditions of Equation (5) was aggregated, the aggregation was repeated until all character candidate regions were marked, and the number of aggregation areas was counted. Equation (5) is expressed in terms of the length and width of the two connected domains being compared and the center coordinates of their outer rectangles. After aggregation, the content of the tag was accurately positioned according to Equation (6) by calculating the width and height of the text area in which the text of an entire tag was located, using the number of connected domains in the aggregation. The accurately positioned content markers of the identification plates are exhibited in Figure 4.
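The aggregation and fine positioning steps can be sketched as follows. This is a simplified illustration: instead of linking each domain to its single nearest neighbor, it links every pair whose centers are close enough and then merges the linked groups. The box format `(x, y, width, height)` and the thresholds `max_dx` and `max_dy` are hypothetical and do not come from the paper.

```python
def center(box):
    """Center point of a box given as (x, y, width, height)."""
    x, y, w, h = box
    return x + w / 2, y + h / 2


def are_neighbors(a, b, max_dx=2.0, max_dy=0.5):
    """Assumed adjacency test based on center distance relative to box size."""
    (ax, ay), (bx, by) = center(a), center(b)
    w, h = max(a[2], b[2]), max(a[3], b[3])
    return abs(ax - bx) <= max_dx * w and abs(ay - by) <= max_dy * h


def enclosing(group):
    """Smallest rectangle containing every box in the group."""
    x0 = min(b[0] for b in group)
    y0 = min(b[1] for b in group)
    x1 = max(b[0] + b[2] for b in group)
    y1 = max(b[1] + b[3] for b in group)
    return (x0, y0, x1 - x0, y1 - y0)


def aggregate(boxes):
    """Merge adjacent character boxes into text areas using union-find."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if are_neighbors(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return [enclosing(g) for g in groups.values()]


# Three characters on one line plus one distant region (hypothetical data).
boxes = [(0, 0, 10, 20), (14, 1, 10, 20), (28, 0, 10, 20), (200, 100, 10, 20)]
text_areas = aggregate(boxes)
```

The three neighboring character boxes are chained into one text area whose width and height cover the whole tag, while the distant region stays separate; the number of returned areas corresponds to the number of aggregation areas counted in the text.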

3. Recognition of Identification Tag Characters

3.1. Validation Segmentation and Normalization Processing

The connected domains obtained by MSER edge segmentation were corrected and divided according to the width, height, and center point coordinates of the character text. They were then normalized to improve the recognition rate when matching the character text against the template model. Given the width and height of the original image, the width and height of the transformed image, and the corresponding scale ratio between them, the matrix after normalization is given by Equation (7).

Since the premise of recognition was a graphic skeleton, the text content was normalized to extract the image structure skeleton.
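Size normalization of a character image can be sketched with nearest-neighbor sampling. The interpolation scheme and the definition of the scale ratio as input size divided by output size are assumptions, since the text does not specify them; the image is a plain 2D list so the sketch has no dependencies.

```python
def normalize(image, out_w, out_h):
    """Resize a 2D list to out_w x out_h using nearest-neighbor sampling.

    The scale ratios are taken as input size / output size (an assumption).
    """
    in_h, in_w = len(image), len(image[0])
    sx, sy = in_w / out_w, in_h / out_h
    return [[image[min(in_h - 1, int(y * sy))][min(in_w - 1, int(x * sx))]
             for x in range(out_w)]
            for y in range(out_h)]
```

For example, normalizing a 2 by 2 image to 4 by 4 repeats each pixel in a 2 by 2 block, so every character ends up at the same size as its template before matching.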

3.2. Recognition of Numbers, Letters, and Text

Considering the attribute characteristics of the text, a recognition method based on partition scanning of thinned characters was proposed, which scans from left to right and from top to bottom based on the geometric features and position attributes of the graphic. The Manhattan distance of Equation (8) was used to measure the similarity between the character and the template. Specifically, the position coordinates of each part were subtracted, and the sum of the absolute values of the differences was taken. The smaller the sum, the better the match between the character and the template, so the minimum sum was taken as the discriminant constraint to obtain the best recognition effect. Since the maximum number of text characters in the flight area sign database is 5, the value of 5 was adopted as the number of Euclidean distance judgments (with the 3rd value as the central reference). In Equation (8), the two quantities compared are the distance from the center point of each area scanned in the character to be detected to the center point of the image, and the distance from the center point of the corresponding area scanned in the character template to the center point of the template. Although simple texts can be detected through the above MSER algorithm, it was difficult to distinguish numbers such as 0 and 6 and locally curved characters such as the letters A and B. These can be distinguished by the sum of the inclinations at the 1/2 point, the upper left 1/4 point, and the lower right 1/4 point. Hence, the coordinates of the highest point at the upper left 1/4 position, the middle point position, and the lower right 1/4 position were taken, and the sum of the absolute values of the reciprocals of the slopes was obtained by calculating the connected domain many times.
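The Manhattan distance matching described above can be sketched as follows. The five-element feature vectors stand for the five distance judgments mentioned in the text; the template values and the function names are hypothetical and chosen only to make the example runnable.

```python
def manhattan(a, b):
    """Sum of absolute differences between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def best_match(features, templates):
    """Name of the template with the smallest Manhattan distance to features."""
    return min(templates, key=lambda name: manhattan(features, templates[name]))


# Hypothetical per-partition center distances for two templates.
templates = {
    "A": [4.0, 2.5, 1.0, 2.5, 4.0],
    "B": [3.0, 3.0, 2.0, 3.0, 3.0],
}
candidate = [3.9, 2.6, 1.2, 2.4, 4.1]
label = best_match(candidate, templates)
```

The candidate is assigned to the template with the smallest sum, which mirrors the discriminant constraint in the text. Characters that remain ambiguous under this measure are the ones for which the slope-based check is applied.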

The figure after initial identification is shown in Figure 5, and set IOU .

3.3. MSER-Based NMS Processing

The NMS (nonmaximum suppression) algorithm generally accompanies image area detection algorithms, where it removes duplicate areas. Its main application areas include video target tracking, data mining, 3D reconstruction, target recognition, and texture analysis. The basic idea of NMS is to sort all the boxes by score, select the box with the highest score, and then traverse the remaining boxes to compute the overlap of each with the currently selected box. Any box whose IOU (Intersection over Union) with the selected box is greater than a certain threshold is deleted. Afterward, the box with the next highest score among those remaining is selected, and the boxes whose IOU with it exceeds the threshold are deleted; the loop continues until the characters of all the texts are fully recognized. The NMS schematic diagram is exhibited in Figure 6.
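The basic NMS loop described above can be sketched in plain Python. This is the standard greedy procedure, not the fused MSER-NMS variant introduced later; the box format `(x0, y0, x1, y1)` and the default threshold of 0.5 are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, threshold=0.5):
    """Return indices of the boxes kept by greedy nonmaximum suppression."""
    # Sort box indices from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)           # highest-scoring remaining box
        keep.append(best)
        # Delete every remaining box that overlaps the selected one too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep


# Two heavily overlapping detections and one separate detection (hypothetical).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

The second box overlaps the first with an IOU of 81/119, which exceeds the threshold, so it is removed as a duplicate while the separate third box is kept.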

The principle of the MSER algorithm was combined with the principle of the NMS algorithm, and optimization processing was performed according to geometric morphology. Consequently, the process after the fusion of MSER and NMS was obtained, as illustrated in Figure 7.

For the identification plates in the flight area of the airport, the candidate boxes in Figure 5 were screened through the NMS algorithm, and the final value of each candidate box was calculated as expressed in Equation (9). The quantities in Equation (9) are the value of the connected domain under the MSER, the maximum value among the connected domains, the border to be detected, the intersection ratio (IOU) between the border to be detected and the border with the maximum value, and the fixed threshold. MSER-NMS was then proposed for computing these values, through which the final value was calculated by Equation (10).

In this calculation, the highest value of the candidate connected domain was normalized to obtain a corresponding interval. Then, a threshold calculated from the second-order difference formula was used for filtering. Because the text confidence then differs between candidates, misselection and omission occur less often. The segmentation effect after processing is displayed in Figure 8.

4. Experimental Analysis

Identification plates in the flight area of an airport were collected in different weather conditions, at different angles and altitudes, and with different content to verify the effectiveness of the improved MSER algorithm, as demonstrated in Figure 1. The collection involved a target number interval of [3, 6], the distance and range detection of the area images, and the image range based on the width of the stroke detected. In this paper, a considerable number of experiments were conducted, and random images were selected to compare the results of the traditional MSER with those of the improved MSER fused with the NMS algorithm (Figure 9).

In each picture, the two images on the left present the recognition ranges of the traditional MSER algorithm: the yellow area in the first image represents the connected domain collection area, and the second image shows the outline of the area, which is prone to repeated recognition after recognition based on the MSER algorithm. The two images on the right present the connected domain collection area after the fusion of the MSER and NMS algorithms and the outline of the recognition area after the fusion algorithm. With Figure 8(a) as an example, the pixels are , the height interval of the connected domain collection area is [1, 19], the width interval is [0.5, 19.5], the height interval of the contour of the identification area after MSER recognition is [10.5, 11.5], the width interval is [10, 15], the height interval of the connected domain collection area after the fusion of the MSER and NMS algorithms is [8, 13], and the width interval is [10, 15]. The connected domains and identification profiles before and after the improvement for the other images in Figure 8 are counted, as listed in Tables 1 and 2. The area of the connected domain of the traditional MSER and the improved MSER is , respectively.

Similarly, the recognition profile area of the traditional MSER and the improved MSER is , respectively.

The median height and width of the connected domain intervals were selected as the average value of the image recognition domain to ensure that the recognition of the connected domain did not lose the resolution or the complete extraction of edge information. They were recorded as . The area of the traditional MSER and improved MSER connected domain before the improvement was recorded as , respectively, where . The selected area was simplified, as presented in Table 3. The same was performed to ensure the smoothness and stability of the contour of the selected connected area. The average value of the height and width of the contour of the connected domain was selected as the average value of the identification contour, recorded as . Then, the contour area of the traditional MSER and the improved MSER before the improvement was recorded as , respectively, where the selected contour was simplified, as provided in Table 4.

The results of based on MSER and MSER-NMS in Table 3 were counted, as illustrated in Figure 10(a). The results for based on MSER and MSER-NMS in Table 4 are exhibited in Figure 10(b).

Following the characteristics of geometric segmentation, it was assumed that the identified connected domain area of the traditional MSER was , the difference between the identified connected domains before and after the improvement was , and the connected domain recognition rate before and after the improvement was . Similarly, the identification of the connected domain contour of the traditional MSER was 1, the difference in the connected domain before and after the improvement was , and the connected domain recognition rate before and after the improvement was . The median connected domain data and median area contour data of the histogram in Figure 10 were substituted into . Next, they were compared based on the algorithm before improvement (unit 1), and the result is provided in Figure 11.

As suggested in Figure 11, the recognition accuracy of the connected domain of the traditional MSER algorithm was taken as 1 unit level and used as the reference. The median connected domain data in the histogram in Figure 10(a) were substituted into the recognition rate defined above. The results demonstrated that the recognition accuracy of the improved connected domain was (1.31, 1.04, 1.11, 1.07) relative to this reference, that is, an improvement of (0.31, 0.04, 0.11, 0.07) over the traditional algorithm. Similarly, the connected domain contour of the traditional MSER algorithm was set as one unit level, and the median connected domain contour data in the histogram in Figure 10(b) were substituted into the corresponding rate. The recognition accuracy of the improved connected domain contour was (1.18, 1.10, 1.57, 1.09); in other words, the contour recognition accuracy increased by (0.18, 0.10, 0.57, 0.09). Through a comparative analysis of the connected domain and the contour of the connected domain before and after the improvement, three characteristics of the improved fusion algorithm were verified. First, it is invariant to affine transformations of image gray levels; second, for stability, only regions whose support stays nearly the same over a range of thresholds are selected; third, multiscale detection is achieved without any smoothing, so both small and large structures can be detected.
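The step from relative accuracy to improvement is a subtraction of the one-unit baseline, which the following sketch makes explicit. The input values are the relative accuracies reported above; the function name and the rounding to two decimals are choices made for this illustration.

```python
def improvement(relative_accuracy, baseline=1.0):
    """Improvement of each relative accuracy over the traditional baseline."""
    return [round(r - baseline, 2) for r in relative_accuracy]


# Relative accuracies reported in the text (traditional MSER = 1 unit level).
connected_domain = [1.31, 1.04, 1.11, 1.07]
contour = [1.18, 1.10, 1.57, 1.09]

domain_gain = improvement(connected_domain)
contour_gain = improvement(contour)
```

Both lists reproduce the improvement figures quoted in the paragraph, which confirms that they are measured relative to the traditional MSER result of each image rather than as absolute accuracies.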

5. Conclusion

The improved MSER algorithm can play an essential role in identifying the signs in airport flight areas, improving both the speed of target traversal and the accuracy of target sign recognition. In future work, the three-dimensional perception of sign recognition in flight areas and the convenience of feature information queries will be further improved. To this end, VR-3D virtual technology will be integrated with the improved MSER algorithm; a small airfield database will be established to support the accuracy of signage recognition in airfield scenes, with machine learning or deep learning adopted to obtain richer semantic information and local receptive fields; and the results will be verified and corrected through selected heuristic algorithms. Together, these steps will support the construction of smart and safe airports.

Data Availability

The numerical simulation data used to support the findings of this study are included in the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was sponsored in part by the “Visual Flight Programming and Approval Guideline Project of RNAV Procedures” (No. 14002600100015J013) and “Innovation and Entrepreneurship Training Program for College Students of Sichuan Province” (S202110624185). We are very grateful for the support of these funds.