Abstract

Along with the advancement of wireless technology, indoor localization technology based on Wi-Fi has received considerable attention from academia and industry. The fingerprint-based method is the mainstream approach for Wi-Fi indoor localization and can be easily implemented without additional hardware. However, signal fluctuations make it difficult to extract features that are robust enough to achieve the required localization performance. This study presents a fingerprint feature extraction method referred to as the Fisher score–stacked sparse autoencoder (Fisher–SSAE) method. Features with low Fisher scores are eliminated, and representative features are then extracted by the SSAE. Furthermore, this study establishes a hybrid localization model that combines a global model with submodels to avoid the significant coordinate localization errors caused by subregional localization errors. Combined with three accessible fingerprint-based positioning methods, namely, support vector regression, random forest regression, and multilayer perceptron classification, the experimental results demonstrate that the proposed methods improve the localization accuracy and response time compared to other feature extraction methods and the single localization model. Compared with some state-of-the-art methods, the proposed methods achieve better localization performance when a large number of features are used.

1. Introduction

Following the rapid development of wireless communication technology in recent years, location-based services (LBS) [1] in indoor environments, such as departments, shopping malls, hospitals, and airports, have become increasingly popular [2]. Accordingly, positioning technology that provides accurate position estimates has received widespread attention. Generally, positioning methods can be classified into outdoor and indoor types. Traditional satellite positioning technologies, including the global positioning system (GPS) and the BeiDou navigation satellite system (BDS), are utilized in outdoor positioning applications to meet the needs of outdoor activities. Nevertheless, in indoor environments, the signal from global navigation satellite systems (GNSSs) is limited [3]. Accordingly, researchers need to seek indoor localization methods with a higher accuracy. Many indoor localization methods have been proposed in academia as well as in industry. Typical indoor localization methods, such as Wi-Fi [4–9], Bluetooth [10], ultrasound (US) [11], infrared (IR) [12], radio frequency identification (RFID) [13, 14], magnetic field (MF) [15], and ultra-wideband (UWB) [16] methods, have been investigated. Indoor positioning systems estimate the location by using different types of measurements, such as angle of arrival (AOA), time of arrival (TOA), time difference of arrival (TDOA), and received signal strength (RSS). The AOA-based, TOA-based, and TDOA-based systems have serious limitations, including their fragility in dynamic environments and their increased costs.

Hence, wireless localization technology based on RSS measurements still constitutes the mainstream research method for these types of applications [17]. Its positioning accuracy can reach 1–10 m [18]. More importantly, the proliferation of smart mobile devices facilitates the development of indoor localization based on WLAN. Wireless positioning methods include two mechanisms, namely, the ranging-based and the fingerprint-based methods. The ranging-based localization method builds a propagation model between the RSS and the distance to the access point (AP). Accordingly, the user’s location can be calculated once the distances to multiple APs are estimated. The method can be easily implemented, yet complex indoor environments cause complex reflections during signal propagation, which lead to multipath signals. Because of fading and shadowing, the construction of accurate propagation models is challenging, and ranging-based approaches generally achieve a relatively poor accuracy.

At present, Wi-Fi fingerprinting localization technology is a popular method implemented in indoor positioning. The fingerprinting approach consists of two phases: the offline training phase and the online localization phase. During the training phase, the RSS measurements of APs at different reference points (RPs) are collected, after which a radio map is constructed with features extracted from the measured RSS. During the localization phase, the current RSS measurement is matched against the entries in the fingerprint database, and the location associated with the most relevant RSS fingerprint is returned.
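
The two phases can be illustrated with a minimal sketch. It assumes the simplest possible choices, which are not the method proposed in this study: the fingerprint feature is the mean RSS per AP, and matching is a nearest-neighbor search in signal space. The function names and the toy RSS values are hypothetical.

```python
import math

def build_radio_map(samples_by_rp):
    """Offline phase: average the RSS samples collected at each RP.

    samples_by_rp maps an RP coordinate (x, y) to a list of RSS vectors (dBm).
    """
    radio_map = {}
    for rp, samples in samples_by_rp.items():
        n_aps = len(samples[0])
        radio_map[rp] = [sum(s[k] for s in samples) / len(samples)
                         for k in range(n_aps)]
    return radio_map

def locate(radio_map, rss):
    """Online phase: return the RP whose stored fingerprint is closest to rss."""
    return min(radio_map, key=lambda rp: math.dist(radio_map[rp], rss))

# Hypothetical toy data: two RPs, two APs, two samples per RP.
samples = {
    (0.0, 0.0): [[-40.0, -70.0], [-42.0, -68.0]],
    (2.4, 0.0): [[-60.0, -50.0], [-58.0, -52.0]],
}
radio_map = build_radio_map(samples)
estimate = locate(radio_map, [-57.0, -53.0])
```

Here `estimate` is the coordinate of the RP whose averaged fingerprint best matches the online measurement.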

Machine learning methods have been extensively used in fingerprint-based localization. Abdou et al. [4] proposed an efficient indoor localization system that used an affinity propagation clustering (APC) algorithm to reduce the computational cost and that relied on support vector regression (SVR) to achieve an increased generalization capacity. Akram et al. [5] utilized principal component analysis (PCA) to reduce the dimension of raw data and established a localization method based on random forest (RF). The method provided 97% accuracy for room prediction. Liu et al. [6] obtained the final estimated location based on the averaging of the outcomes of three algorithms, including forest regression, multilayer perceptron (MLP) classification, and MLP regression, and achieved superior localization outcomes. Guo et al. [7] proposed an accurate Wi-Fi localization scheme based on the unsupervised fusion of an extended candidate location set (ECLS), whereby the proposed ECLS provided a large space that likely includes the true location. Accordingly, the experimental results showed that the algorithm was more robust to changing environments. Cui et al. [8] proposed a random vector functional link network (RVFL) to develop an efficient and robust indoor positioning system. Luo et al. [9] proposed a multifloor identification model called MA_LDA to find the floor number and an LL_KNN algorithm to obtain the location information of a target on the floor.

In this study, we propose a method for fingerprint feature extraction and a hybrid localization model. The contributions of this work include the following:
(1) Utilization of Fisher score–stacked sparse autoencoder (Fisher–SSAE) to extract robust features to achieve a better localization performance.
(2) Establishment of a hierarchical model that employed the clustering approach, constructed submodels for subregional localization, and tested the effects of hierarchical localization for several common localization algorithms.
(3) Proposition of a hybrid localization model that can retain the advantages of the hierarchical localization model and that can avoid significant errors caused by subregional positioning errors.

The remaining parts of this study are organized as follows: Section 2 shows the related works in the Wi-Fi fingerprint-based localization field. Section 3 introduces the preliminaries for the proposed fingerprint-based localization methods. Section 4 introduces the proposed Fisher–SSAE and hybrid model in detail. Section 5 evaluates the performance of the proposed methods in real wireless indoor environments. Section 6 outlines the conclusions of the study.

2. Related Works

Fingerprint-based algorithms are classified into deterministic and probabilistic algorithms. The probabilistic algorithm utilizes the distribution of RSS measurements at RPs to construct the probabilistic model in the offline phase and provides confidence intervals for predicted locations in the online phase. Castro et al. [19] described Nibble, a Wi-Fi location service that used Bayesian networks to infer the location of a device, and discussed how probabilistic models could be applied to a diverse range of applications that used sensor data. Bian et al. [20] proposed the least expectation of the positioning error (LEPE) algorithm, which utilized the Gaussian mixture model (GMM) and the expectation-maximization (EM) algorithm to minimize the expectation of the positioning error. Many probabilistic algorithms give confidence intervals for predicted locations but do not achieve an ideal positioning accuracy.

The deterministic algorithm exploits deterministic fingerprint features to identify the target location. For instance, the RADAR system [21] used the K-nearest neighbor (KNN) method to estimate the user’s location. Shin et al. [22] proposed the enhanced weighted KNN (WKNN), which reduced the error compared to KNN by adjusting the number of considered neighbors. Xu et al. [23] treated the extracted features as inputs to SVR and established the mapping between localization features and physical locations. Akram et al. [24] proposed a hybrid indoor localization method based on the use of random decision forest to achieve room-level and latitude-longitude prediction. Gu et al. [25] proposed a semisupervised deep extreme learning machine, which took advantage of deep learning and extreme learning machine methods and improved the accuracy and efficiency. Zou et al. [26] matched standardized fingerprints based on the signal tendency index (STI) to handle device heterogeneity and environmental changes. Additionally, they proposed an indoor positioning system that combined the STI with weighted extreme learning. Li et al. [27] proposed an SVR algorithm whose parameters were optimized for positioning by particle swarm optimization (PSO) to improve the online localization accuracy.

Fingerprint feature extraction has a significant impact on positioning accuracy. Wi-Fi signals change dynamically over time with various factors, such as humidity, the status of doors, and changes in temperature, which leads to poor feature extraction and low localization accuracy. Additionally, dimensionality reduction is one of the effective strategies used to improve the performance of classifiers [28]. The RADAR system [21] used the average value of RSS measurements as fingerprint features. This traditional feature extraction method does not yield a stable performance as a result of wireless signal fluctuations, and excessive fingerprint dimensions reduce the performance of a classifier. Thus, feature extraction for fingerprint-based localization has received extensive attention. Most fingerprint methods operate on the original fingerprints by means of feature selection or feature extraction. Chen et al. [29] proposed InfoGain, which selected a small subset of the APs as AP features based on the calculation of the information gain. Lin et al. [30] proposed a group discrimination-based access point selection method to improve the positioning accuracy and to reduce the computational overhead. Jia et al. [31] proposed a heuristic AP selection algorithm based on an error analysis. It was demonstrated that this method significantly reduced the redundancy of APs in fingerprint-based localization and considerably improved the localization accuracy. These fingerprint feature selection methods only retain features at a shallow level. Fang and Lin [32] used PCA to extract Wi-Fi features. Luo and Fu [33] relied on kernel principal component analysis (KPCA) to eliminate data redundancy and maintain useful characteristics for nonlinear feature extraction. Jia et al. [34] proposed a supervised kernel PCA (SKPCA) method to obtain a nonlinear and optimal embedding in a low-dimensional subspace, whereby online RSS vectors can be transformed within the subspace for localization. In these fingerprint feature extraction methods, features that do not contribute to the localization performance, or are even counterproductive, are not excluded, and they degrade the positioning accuracy. Thus, we propose the use of Fisher–SSAE to obtain robust fingerprint features for classification or regression. Accordingly, this method performs both feature selection and feature extraction.

Numerous efforts have been devoted to fingerprint clustering algorithms, which reduce the computational complexity and thus improve the rate of matching. Chen et al. [29] proposed the application of K-means to cluster fingerprint samples. Ding et al. [35] proposed a fingerprint clustering method based on APC to avoid an initial starting point selection. Saha and Sadhukhan [36] proposed a hierarchical clustering strategy combined with KNN to locate the object. Zhou and Van [37] proposed a location fingerprinting algorithm based on fuzzy c-means (FCM) clustering. Li et al. [27] proposed an APC algorithm based on the Shepard similarity metric. In this approach, the authors calculated the Shepard similarity among the fingerprints and eliminated the superposition of distant signals in the estimation of the node position. Akram et al. [5] considered the similarity between the Gaussian distribution and the radio propagation characteristics of a Wi-Fi AP and used Gaussian mixture model (GMM) clustering to divide the fingerprint database into subdatabases. These methods reduced the computational time, but the points located in the boundary region were not considered in detail. Thus, a hybrid positioning model, which includes the global and subregional models, is proposed to avoid significant errors caused by subregional localization errors.

3. Preliminaries

3.1. Technological Analysis

The positioning area is divided into several subregions based on clustering. After determining the subregion to which a test point (TP) belongs, we can identify the candidate RPs in that subregion that contribute to the estimated coordinate location. In this study, this localization method is defined as hierarchical localization. The direct positioning method without clustering is defined as one-step localization.

Compared with the one-step model, the hierarchical positioning method has a better efficiency, and the model is more in line with the local environment. However, when the user is in the boundary area, the localization performance is sometimes not ideal. Even if two fingerprints are physically far apart, this does not necessarily mean they are far apart in the signal space. In fact, they may be very close [38]. As shown in Figure 1, the yellow points are the RPs with high probability after calculation. If a suitable localization algorithm, such as the WKNN, is adopted, the user location is estimated in the area surrounded by these three yellow dots, and the localization performance is acceptable. When the hierarchical approach is adopted, the subregional localization step can easily place the user in the wrong area (Region A) adjacent to the true subregion (Region B). In that case, two high-quality RPs will not contribute to the estimation of location. Under this circumstance, the one-step localization model is more suitable.
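
The boundary effect can be reproduced with a small sketch. It assumes an inverse-distance-weighted WKNN, which is one common variant and not necessarily the one used in this study, and a hypothetical corridor in which the RSS of two APs varies linearly with position. Region A is taken to be x < 5 m.

```python
import math

def wknn(radio_map, rss, k=3, eps=1e-9):
    """Estimate (x, y) as the inverse-distance-weighted mean of the k nearest RPs."""
    ranked = sorted(radio_map.items(),
                    key=lambda item: math.dist(item[1], rss))[:k]
    weights = [1.0 / (math.dist(fp, rss) + eps) for _, fp in ranked]
    total = sum(weights)
    x = sum(w * rp[0] for w, (rp, _) in zip(weights, ranked)) / total
    y = sum(w * rp[1] for w, (rp, _) in zip(weights, ranked)) / total
    return x, y

def fingerprint(x):
    # Hypothetical RSS (dBm) of two APs that varies linearly along a corridor.
    return [-40.0 - 3.0 * x, -70.0 + 3.0 * x]

radio_map = {(float(x), 0.0): fingerprint(x) for x in (0, 2, 4, 6, 8, 10)}
region_a = {rp: fp for rp, fp in radio_map.items() if rp[0] < 5.0}

online = fingerprint(6.1)               # true position (6.1, 0) lies in Region B
one_step = wknn(radio_map, online)      # all RPs are candidates
wrong_region = wknn(region_a, online)   # TP misassigned to Region A
```

In this toy setting the one-step estimate stays close to the true position, whereas the estimate restricted to the wrong subregion cannot leave Region A, because the nearby RPs at 6 m and 8 m are excluded from the candidates.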

3.2. Fisher Score

The subregional localization can be perceived as a multiclassification problem. Quick elimination of the features that contribute weakly or even interfere with the positioning task can effectively save computation resources and improve the accuracy of classification.

The Fisher score is a representative supervised feature selection method that selects the features with the best discriminant ability [28]. A strong discriminant ability means that the distances among data points with the same label are as short as possible, while the distances between data points with different labels are as long as possible. In this study, the Fisher score was utilized to select a suitable AP subset, which reduced the fingerprint dimension.

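Fingerprints of RPs are stored in the fingerprint library. By clustering, RPs are divided into $c$ classes and are represented as $\omega_1, \omega_2, \ldots, \omega_c$. These classes, respectively, contain $n_1, n_2, \ldots, n_c$ RPs. The Fisher score of the $k$th feature can be expressed according to the following equation:

$$F(k) = \frac{\sum_{i=1}^{c} n_i \left(\mu_i^{k} - \mu^{k}\right)^{2}}{\sum_{i=1}^{c} \sum_{x_j \in \omega_i} \left(x_j^{k} - \mu_i^{k}\right)^{2}},$$

where $\mu^{k}$ is the average value of the $k$th dimension of the fingerprint vector and $\mu_i^{k}$ is the average value of the $k$th dimension of the fingerprint vectors which belong to the $i$th cluster, and it is represented as

$$\mu_i^{k} = \frac{1}{n_i} \sum_{x_j \in \omega_i} x_j^{k},$$

where $x_j^{k}$ is the $k$th dimensional value of the fingerprint vector on the $j$th RP.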
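
As a minimal sketch of this computation, the function below assumes the common form of the Fisher score, namely the ratio of the between-cluster scatter to the within-cluster scatter of each feature. The function name and the toy fingerprints are hypothetical.

```python
def fisher_scores(fingerprints, labels):
    """Return one Fisher score per feature (AP).

    fingerprints: list of RSS vectors, one per RP.
    labels: cluster index of each RP.
    """
    n_features = len(fingerprints[0])
    clusters = {}
    for x, c in zip(fingerprints, labels):
        clusters.setdefault(c, []).append(x)
    scores = []
    for k in range(n_features):
        mu = sum(x[k] for x in fingerprints) / len(fingerprints)
        between = within = 0.0
        for members in clusters.values():
            mu_i = sum(x[k] for x in members) / len(members)
            between += len(members) * (mu_i - mu) ** 2
            within += sum((x[k] - mu_i) ** 2 for x in members)
        # A feature that is constant inside every cluster separates them perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores

# Hypothetical toy data: AP 0 separates the two clusters, AP 1 does not.
fingerprints = [[-40.0, -60.0], [-42.0, -61.0], [-70.0, -60.0], [-72.0, -61.0]]
labels = [0, 0, 1, 1]
scores = fisher_scores(fingerprints, labels)
```

The first AP receives a high score because its cluster means differ strongly relative to the spread inside each cluster, while the second AP receives a score of zero.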

3.3. SSAE

Mining the intrinsic features of fingerprint data can effectively improve the accuracy of positioning. In this study, the SSAE was utilized to learn the intrinsic characteristics of fingerprint data before training the localization model. The autoencoder (AE) [39] is an extremely effective tool for learning the essential features, which can be used for noise reduction, dimensionality reduction, and the generation of models.

As an unsupervised learning algorithm, the single layer AE is a neural network and includes an input, a hidden, and an output layer. The number of neurons in the input and output layers is equal. Data processing consists of two steps: the original data are encoded from the input layer to the hidden layer, and the feature representation is decoded from the hidden layer to the output layer. The structure is shown in Figure 2.

The original data are represented as $x \in \mathbb{R}^{n}$, where $h \in \mathbb{R}^{m}$ denotes the feature representation of the hidden layer and $n$, $m$ are the numbers of neuron nodes of the input and hidden layers. In turn, $W \in \mathbb{R}^{m \times n}$ represents the weight matrix, $b$ represents the bias vector, and $f$ is the activation function. The sigmoid function, which is expressed as $f(z) = 1/(1 + e^{-z})$, is selected as the activation function. The process of encoding can be expressed as

$$h = f(Wx + b).$$

In this case, $\hat{x}$ denotes the reconstructed vector and $b'$ represents the bias vector from the hidden to the output layer. The process of decoding can be expressed as

$$\hat{x} = f(W^{T}h + b').$$

By comparing the input vector with the output vector, we can express the cost function as

$$J_{\mathrm{AE}} = \frac{1}{2}\lVert \hat{x} - x \rVert^{2} + \frac{\lambda}{2}\sum_{i=1}^{n}\sum_{j=1}^{m} W_{ji}^{2}.$$

The first term is the error between the input and the output. The second term is the weight decay term which can solve the overfitting problem, where $\lambda$ is the weight attenuation coefficient and $W_{ji}$ represents the weight corresponding to the input node $i$ and the hidden node $j$. The goal of training is to minimize the loss function so that the raw data are processed by the encoder and decoder to obtain results which are almost identical with the original data.
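
The encoding, decoding, and cost computation can be sketched for a single input vector as follows. This is an illustration under stated assumptions rather than the implementation used in this study: the decoder reuses the transposed encoder weights (tied weights), the reconstruction error is half the squared Euclidean distance, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def encode(W, b, x):
    """Hidden representation h = f(W x + b); W has one row per hidden node."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def decode(W, b_out, h):
    """Reconstruction with tied weights W^T (an assumption of this sketch)."""
    n_in = len(W[0])
    return [sigmoid(sum(W[j][i] * h[j] for j in range(len(h))) + b_out[i])
            for i in range(n_in)]

def ae_cost(W, b, b_out, x, lam):
    """Squared reconstruction error plus the L2 weight decay term."""
    x_hat = decode(W, b_out, encode(W, b, x))
    recon = 0.5 * sum((a - c) ** 2 for a, c in zip(x_hat, x))
    decay = 0.5 * lam * sum(w * w for row in W for w in row)
    return recon + decay

# Two inputs, one hidden node; inputs are assumed to be scaled to (0, 1).
cost_zero = ae_cost([[0.0, 0.0]], [0.0], [0.0, 0.0], [0.5, 0.5], lam=0.1)
cost_decay = ae_cost([[1.0, -1.0]], [0.0], [0.0, 0.0], [0.5, 0.5], lam=0.1)
```

With all-zero weights the sigmoid outputs 0.5 everywhere, so the input `[0.5, 0.5]` is reconstructed exactly and the cost vanishes. Nonzero weights add the decay penalty on top of the reconstruction error.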

Adding sparsity restrictions to the AE keeps most of the nodes in the hidden layer in a suppressed state so that a combination of a small number of activated neurons is used to represent the input. The resulting network, the sparse autoencoder (SAE), learns the essential features more easily. The cost function of SAE adds a sparse penalty term, and the loss function is expressed as

$$J_{\mathrm{SAE}} = J_{\mathrm{AE}} + \beta \sum_{i=1}^{m} \mathrm{KL}\left(P_i \,\|\, Q_i\right).$$

The first term is the cost function of AE, and the second term is the sparse penalty term, where $\beta$ is the weight of the sparse penalty item and $m$ is the number of neuron nodes of the hidden layer. $\mathrm{KL}(P_i \,\|\, Q_i)$ represents the KL divergence between the target $P_i$ and actual neuron sparsities $Q_i$. It can be expressed as

$$\mathrm{KL}\left(P_i \,\|\, Q_i\right) = P_i \log\frac{P_i}{Q_i} + \left(1 - P_i\right)\log\frac{1 - P_i}{1 - Q_i}.$$
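
A minimal sketch of the sparse penalty term is given below. It assumes that the actual sparsity of a hidden node is its mean activation over a batch and that the same target is used for every node. The function names and the toy activations are hypothetical, while the sparse target of 0.2 and sparse weight of 0.4 are the values selected in Section 5.4.

```python
import math

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli distributions with means p and q."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def sparsity_penalty(hidden_activations, target, beta):
    """beta times the summed KL divergence over all hidden nodes.

    hidden_activations: list of hidden vectors, one per training sample.
    """
    n = len(hidden_activations)
    m = len(hidden_activations[0])
    means = [sum(h[j] for h in hidden_activations) / n for j in range(m)]
    return beta * sum(kl_bernoulli(target, q) for q in means)

# Hypothetical activations of two hidden nodes over two samples.
activations = [[0.2, 0.9], [0.2, 0.7]]
penalty = sparsity_penalty(activations, target=0.2, beta=0.4)
```

The first node already matches the target and contributes nothing, so the whole penalty comes from the second node, whose mean activation of 0.8 is far from the target.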

SSAE has a multilayer structure that stacks multiple SAEs. Layer-by-layer training yields the weight matrices and the bias vectors, through which the abstract features of the input fingerprint data are learnt. This results in a low-dimensional feature representation. Compared with the single SAE, the training speed is higher, and a more effective expression can be obtained.

4. Proposed Localization Methods

4.1. Preprocessing

For the purpose of feature extraction and training of the hybrid model, fingerprint data preprocessing, including zero-mean normalization and FCM clustering, was utilized. Zero-mean normalization was applied as

$$x' = \frac{x - \mu}{\sigma},$$

where $\mu$ is the mean and $\sigma$ is the standard deviation.
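
As a short illustration, zero-mean normalization of the RSS values of one AP can be sketched as follows. The sketch assumes that the statistics are computed per feature with the population standard deviation. The function name and the sample values are hypothetical.

```python
import math

def zero_mean_normalize(column):
    """Shift a feature column to zero mean and scale it to unit deviation."""
    mu = sum(column) / len(column)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in column) / len(column))
    if sigma == 0:
        return [0.0] * len(column)   # constant feature carries no information
    return [(v - mu) / sigma for v in column]

z = zero_mean_normalize([-60.0, -50.0, -40.0])
```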

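The FCM clustering algorithm was applied in this study. The fingerprint data of $n$ RPs, which are represented as $X = \{x_1, x_2, \ldots, x_n\}$, are divided into $c$ classes. By continually updating the cluster centers and the membership matrix, the objective function is minimized. The objective function is defined as

$$J(U, V) = \sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m} \lVert x_j - v_i \rVert^{2}.$$

Accordingly, the restriction is represented as

$$\sum_{i=1}^{c} u_{ij} = 1, \quad j = 1, 2, \ldots, n,$$

where $U = [u_{ij}]$ is the membership matrix, $v_i$ is the clustering center, and $m$ is the weighted index. Based on the Lagrange function, the updating of the clustering center can be expressed as

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}}.$$

In addition, the updating of the membership matrix can be expressed as

$$u_{ij} = \left[\sum_{k=1}^{c}\left(\frac{\lVert x_j - v_i \rVert}{\lVert x_j - v_k \rVert}\right)^{2/(m-1)}\right]^{-1}.$$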
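
The alternating updates can be sketched in plain Python as follows. This is a minimal illustration of standard FCM with a fixed number of iterations and user-supplied initial centers. The function name, the default weighted index m = 2, and the toy points are assumptions, not settings reported in this study.

```python
import math

def fcm(points, centers, m=2.0, iterations=50):
    """Fuzzy c-means: alternate the membership and center updates.

    Returns (centers, u), where u[i][j] is the membership of point j in
    cluster i, computed in the last membership update.
    """
    exponent = 2.0 / (m - 1.0)
    dim = len(points[0])
    for _ in range(iterations):
        u = [[0.0] * len(points) for _ in centers]
        for j, x in enumerate(points):
            # Clamp distances to avoid division by zero when x equals a center.
            dists = [max(math.dist(x, v), 1e-12) for v in centers]
            for i, d_i in enumerate(dists):
                u[i][j] = 1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        new_centers = []
        for i in range(len(centers)):
            w = [u[i][j] ** m for j in range(len(points))]
            total = sum(w)
            new_centers.append([
                sum(w[j] * points[j][k] for j in range(len(points))) / total
                for k in range(dim)
            ])
        centers = new_centers
    return centers, u

# Hypothetical toy data: two well-separated groups of fingerprints.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
centers, u = fcm(points, centers=[[1.0, 0.0], [9.0, 0.0]])
```

Each column of `u` sums to one, which is the restriction stated above, and an RP can be assigned to the subregion in which its membership is largest.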

4.2. Fisher–SSAE

This study proposes the fingerprint feature extraction algorithm known as Fisher–SSAE, which is expected to yield representative features that improve the positioning performance. The Fisher–SSAE algorithm is described in Algorithm 1. It is worth mentioning that the maximum dimension of the hidden layer is set to bound the output dimension from above, which enforces dimension reduction, whereas the maximum depth is set to limit the depth of the SSAE, which limits the model complexity.

Algorithm 1: Fisher–SSAE.
Input:
(1) Training fingerprint data
(2) Number of features k selected by Fisher criteria
(3) Maximum dimension of hidden layer
(4) Maximum depth for SSAE
Output:
(1) The structure of the SSAE, including the dimension of hidden layer and depth
(2) The fingerprint features extracted
Calculate the Fisher score for each AP feature
Rank the Fisher scores according to their values (large to small), and keep the features that correspond to the first k values
Set the initial depth of SSAE
Set the initial dimension of hidden layer
t = 1
Repeat
  t = t + 1
  Calculate the accuracy of classification which is utilized to achieve subregional localization
Until the accuracy no longer improves or the maximum dimension of hidden layer is reached
Determine the dimension of hidden layer
t = 1
Repeat
  t = t + 1
  Calculate the accuracy of classification which is utilized to achieve subregional localization
Until the accuracy no longer improves or the maximum depth is reached
Determine the depth of SSAE
Train the SSAE on the fingerprint data
Return the data in the hidden layer
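
The structure search of Algorithm 1 can be sketched as below. This is a simplified reading, not the authors' implementation: the hidden dimension is chosen by sweeping a list of candidates at depth 1, and the depth is then increased until the accuracy stops improving. `evaluate(dim, depth)` is a hypothetical callback that would train an SSAE with the given structure and return the subregional classification accuracy. The lookup table is illustrative only, although the entries for dimension 30 at depths 1 and 2 echo the accuracies reported in Section 5.4.

```python
def select_structure(evaluate, dims, max_depth):
    """Greedy search: fix the hidden dimension first, then grow the depth."""
    best_dim, best_acc = None, -1.0
    for dim in dims:                       # candidates up to the maximum dimension
        acc = evaluate(dim, 1)
        if acc > best_acc:
            best_dim, best_acc = dim, acc
    best_depth = 1
    for depth in range(2, max_depth + 1):
        acc = evaluate(best_dim, depth)
        if acc <= best_acc:                # stop when another layer does not help
            break
        best_depth, best_acc = depth, acc
    return best_dim, best_depth, best_acc

# Illustrative accuracies standing in for real training runs.
table = {(10, 1): 0.85, (20, 1): 0.90, (30, 1): 0.923, (40, 1): 0.91,
         (30, 2): 0.94, (30, 3): 0.90}
structure = select_structure(lambda d, l: table[(d, l)], [10, 20, 30, 40], 3)
```

Searching the two parameters one after the other keeps the number of training runs linear in the number of candidates, at the cost of possibly missing a better joint combination.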

4.3. Proposed Hybrid Localization Model

The hybrid localization model proposed herein is divided into the training and localization phases. Offline fingerprint features extracted by Fisher–SSAE are needed to train the localization model, and online fingerprint features, extracted in the same manner, are needed to identify the candidate location. Therefore, feature extraction must be completed first in both the offline and online phases. Hierarchical localization methods allow the construction of one submodel for each subregion. On top of this positioning approach, the global localization model is used to predict the location of TPs which are likely located in the boundary areas. In this way, we expect to avoid the significant errors caused by subregional positioning errors. Based on the above considerations, four classification or regression models need to be trained in the training phase: (a) the boundary classification, which determines whether the TP is a boundary point, (b) the subregional classification, which identifies the subregion to which the TP belongs, (c) the precise localization classification (or regression) for high-accuracy positioning in every subregion, and (d) the global localization classification (or regression). The training phase can be summarized as follows:
(1) Prepare all the training data and mark them with subregional, boundary, and location labels.
(2) Extract features using the Fisher–SSAE.
(3) Train the four classification (or regression) models: the boundary, the subregional, the precise localization, and the global localization models.

The process of offline training is illustrated in Figure 3.

The localization phase can be summarized as follows:
(1) Collect the online fingerprint data.
(2) Extract the fingerprint features in the same way as in the training phase.
(3) Determine whether the TP is a boundary point based on the boundary classification.
(4) If the TP is a boundary point, estimate its location with the global model.
(5) If the TP is a nonboundary point, adopt the strategy of hierarchical localization: the subregion to which the TP belongs is determined first, and the precise coordinate location is estimated subsequently.
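
The routing logic of the online phase can be sketched as follows. All callables are hypothetical stand-ins for the trained boundary classifier, global localizer, subregional classifier, and per-subregion localizers. The simple rules used in the example only serve to exercise the three branches.

```python
def hybrid_locate(features, is_boundary, global_model, subregion_of, submodels):
    """Route a test point to the global model or to its subregion's submodel."""
    if is_boundary(features):
        return global_model(features)      # boundary point: one-step localization
    region = subregion_of(features)        # nonboundary point: hierarchical path
    return submodels[region](features)

# Hypothetical stand-ins for the four trained models.
is_boundary = lambda f: abs(f[0] - 5.0) < 1.0
global_model = lambda f: ("global", f[0])
subregion_of = lambda f: 0 if f[0] < 5.0 else 1
submodels = {0: lambda f: ("sub0", f[0]), 1: lambda f: ("sub1", f[0])}

near_edge = hybrid_locate([5.2], is_boundary, global_model, subregion_of, submodels)
interior = hybrid_locate([1.0], is_boundary, global_model, subregion_of, submodels)
```

Because boundary points never reach the subregional classifier, a wrong subregion decision cannot propagate into their coordinate estimate.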

The online process is illustrated in Figure 4.

5. Experimental Work

5.1. Experimental Environment

To test the performance of each algorithm, all of the data were collected in a real environment. A corridor with Wi-Fi coverage, together with a stairway, served as the experimental area. The layout of the 84 RPs is shown in Figure 5. The RPs were spaced at intervals of 2.4 m × 1.2 m. In this environment, we collected 30 s of fingerprint data at each RP with a sampling interval of 500 ms, that is, roughly 60 samples per RP.

5.2. Fingerprint Data Labels

By clustering, 84 RPs were divided into four classes. Subregional labels were set. The result is shown in Figure 6. We then placed boundary point labels on the RPs near the subarea boundary, and the remaining RPs were set as nonboundary points. The result is shown in Figure 7.

5.3. Fisher Score

To test the performance of Fisher–SSAE in feature extraction, an effective model must first be set up. All AP features need to be evaluated using the Fisher score. The number of initial features was 181. The scores of these features after normalization are shown in Figure 8. Features with scores less than 0.01 were rejected, and 119 features were retained in total.
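
The thresholding step can be sketched as follows. The normalization used in this study is not specified, so min-max scaling to [0, 1] is assumed here. The function name and the toy scores are hypothetical, and only the threshold of 0.01 comes from the text.

```python
def select_features(scores, threshold=0.01):
    """Return the indices of features whose normalized score reaches the threshold."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return list(range(len(scores)))    # all features are equally informative
    normalized = [(s - lo) / (hi - lo) for s in scores]
    return [k for k, s in enumerate(normalized) if s >= threshold]

# Hypothetical raw Fisher scores of four APs.
kept = select_features([0.0, 0.5, 100.0, 2.0])
```

In this toy example the first two APs fall below the threshold after normalization and are rejected.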

5.4. Model Generation of SSAE

After feature selection, the SSAE was constructed to obtain a reduced-dimensional representation of the original features. The training data collected at the RPs were copied 50 times, and random noise was added to improve the robustness of the model to noise. The process of parameter selection for SSAE is given in Algorithm 1. Initial values included the sparse target = 0.05 and sparse weight = 0.4. MLP classification was applied to test the accuracy obtained with the extracted features. The parameters for the MLP classifier included the “tanh” activation function, l2_reg = 0.002, learning rate = 0.01, and two hidden layers with 200 and 80 neurons.
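
The augmentation step can be sketched as follows. The noise distribution is not stated in this study, so zero-mean Gaussian noise with a hypothetical standard deviation of 2 dB is assumed. The function name and the toy fingerprints are also hypothetical.

```python
import random

def augment(fingerprints, labels, copies=50, noise_std=2.0, seed=0):
    """Replicate each fingerprint and perturb every copy with random noise."""
    rng = random.Random(seed)              # fixed seed for reproducibility
    out_x, out_y = [], []
    for x, y in zip(fingerprints, labels):
        for _ in range(copies):
            out_x.append([v + rng.gauss(0.0, noise_std) for v in x])
            out_y.append(y)                # a noisy copy keeps its label
    return out_x, out_y

xs, ys = augment([[-40.0, -70.0], [-60.0, -50.0]], ["A", "B"])
```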

In Figure 9, the accuracy of classification varied with the output dimension for SSAE. When the output dimension was 30, the model yielded its best accuracy which was equal to 92.3% in the testing database. After the output dimension of the SSAE was determined, the depth of SSAE was set.

In Figure 10, when the depth was equal to one, the model achieved an accuracy of 92.3%. The depth was then increased until the accuracy stopped improving. When the depth was equal to two, the model yielded a locally optimal result equal to 94.0%. When the depth was increased further, the accuracy dropped considerably. After the structure of SSAE was determined, the sparse target and sparse weight were tuned, and the corresponding results are shown in Figures 11 and 12.

Figure 11 presents the accuracy of the subregional localization at different values of the sparse target and sparse weight in the training dataset, while Figure 12 presents the performance in the testing dataset. Evidently, the combination of the parameters sparse weight = 0.4 and sparse target = 0.2 led to the optimal localization accuracy of 97.0%.

5.5. Feature Extraction Performance

The Fisher–SSAE was compared with other feature extraction methodologies: mean value [21], PCA [5], and AE [6]. Different features were placed into the MLP classifier to test the performance for subregional and precise coordinate localizations. We confirmed that robust features lead to better accuracies.

In the process of subregional localization, we tested the feature extraction ability of the different methods. The accuracy with different features is shown in Figure 13.

Figure 13 shows the outcomes of four algorithms. The first one “Mean value” is the performance with the mean value feature, and its accuracy is 84.8%. The second is “PCA” which reduces the dimension of the fingerprint data. However, it does not play an effective role in subregional localization, and its accuracy is 81.8%. The “AE” yields a 3.1% improvement compared with the “Mean value,” while the proposed method yields the highest accuracy of 97.0%. By using this subregional localization method, we achieved excellent results in nonboundary areas. For the points in the boundary region, the fusion of the global and the local models was adopted. Accordingly, significant distance errors caused by subregional localization errors will not occur.

In this experiment, different features were also applied in one-step localization and hierarchical localization to verify the effects of feature extraction. The mean value, AE, Fisher–SAE (depth = 1), and Fisher–SSAE (depth = 2) features were applied in two indoor localization schemes, namely, SVR and RF regression. The error distance criterion was adopted to quantify the localization performance. Specifically, if the distance between the estimated location and the expected location was smaller than the defined error r, the result was considered correct. The localization accuracy was the percentage of correct localizations among all localizations. The localization performance of different features is illustrated in Figures 14–16.
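
The error distance criterion can be written as a short function. The sketch assumes two-dimensional coordinates in meters and the Euclidean distance. The function name and the toy estimates are hypothetical.

```python
import math

def accuracy_within(estimates, truths, r):
    """Percentage of test points whose error distance is smaller than r."""
    hits = sum(1 for e, t in zip(estimates, truths) if math.dist(e, t) < r)
    return 100.0 * hits / len(truths)

# Hypothetical estimates with error distances of 0.5 m, 3 m, and 1 m.
estimates = [(0.0, 0.0), (3.0, 0.0), (0.0, 1.0)]
truths = [(0.0, 0.5), (0.0, 0.0), (0.0, 0.0)]
acc_2m = accuracy_within(estimates, truths, r=2.0)
acc_4m = accuracy_within(estimates, truths, r=4.0)
```

Evaluating the function over a range of r values gives the accuracy curves plotted against the error distance.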

The performance of different features in one-step localization based on SVR is illustrated in Figure 14. As the localization error distance increases, the localization accuracy for each method improves. Specifically, the Fisher–SSAE feature extraction method has a better localization performance than the other methods. When the error distance is 2 m, 4 m, 6 m, or 8 m, the proposed feature extraction methods achieve the best accuracy in every case. Compared with the mean value, which only extracts a shallow representation, the proposed method learned robust features from the fingerprint database. Compared with AE, the proposed method deleted the AP features which were noisy in the localization process and imposed sparsity, which facilitated the extraction of critical features.

Figure 15 shows the performance in one-step localization using RF regression. The proposed feature extraction method yields the best accuracy when the error distances are 2 m and 3 m.

In Figure 16, the same conclusion can be reached based on the hierarchical RF model: when the error distances are 1 m, 2 m, 3 m, 4 m, and 5 m, the localization accuracy of the proposed Fisher–SSAE is 15%, 48%, 70%, 82%, and 94%, respectively. Even though the RSS from different APs varied considerably, the proposed feature learning method kept the localization stable by using representative features. In the next section, we will show the influence of the hierarchical model.

5.6. Performance of Hierarchical Localization

During the process of the hierarchical localization, the submodels for subregions were constructed. To test whether the hierarchical approach has a positive impact on indoor localization, three categories of indoor localization approaches were taken into account: a matching method (WKNN), a regression method (RF regression), and a classification method (MLP classification). Figures 17–19 show the cumulative distribution functions (CDFs) of the localization errors of the TPs. Table 1 summarizes the detailed results, including the mean error, median error, maximum error, and positioning time.
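
The error statistics and the empirical CDF can be computed as in the following sketch. It assumes two-dimensional coordinates in meters. The function name and the toy errors are hypothetical and do not reproduce the values in Table 1.

```python
import math
import statistics

def error_summary(estimates, truths):
    """Mean, median, and maximum error plus the empirical CDF of the errors."""
    errors = sorted(math.dist(e, t) for e, t in zip(estimates, truths))
    # Each CDF point is (error, fraction of test points with error <= that value).
    cdf = [(err, (i + 1) / len(errors)) for i, err in enumerate(errors)]
    return {
        "mean": statistics.mean(errors),
        "median": statistics.median(errors),
        "max": errors[-1],
        "cdf": cdf,
    }

# Hypothetical estimates with error distances of 1 m, 2 m, and 6 m.
summary = error_summary([(1.0, 0.0), (0.0, 2.0), (6.0, 0.0)], [(0.0, 0.0)] * 3)
```

A CDF curve that lies further to the left indicates that a larger share of test points is located within a given error distance.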

In Figure 17, WKNN is adopted to evaluate the impact. As shown in this figure, the CDF curve of the hierarchical positioning shifts to the right. Compared to the one-step localization, whose mean localization error is 2.76 m, the mean error of the hierarchical approach is 4.12 m, that is, the error increased by 1.36 m. When the one-step method is used, the online fingerprint is compared with all RP fingerprints in the radio map, which facilitates the identification of suitable RPs that are close to the TP. By contrast, in the hierarchical WKNN, RPs which are close to the TP but belong to other subregions cannot contribute to the estimation of location. However, even though the one-step WKNN has a higher accuracy, it has a lower efficiency. The positioning time of the hierarchical WKNN is 0.437 s, which is 0.763 s shorter than that of the one-step WKNN.

In Figures 18 and 19, the RF regression and the MLP classifier are adopted. With hierarchical localization, the mean errors of these methods change from 4.21 m and 3.76 m to 2.31 m and 3.20 m, respectively. In contrast to matching methods such as WKNN, training multiple submodels that better fit the local RSS distribution significantly improves the localization accuracy. More importantly, the positioning time of the MLP classification is significantly reduced, whereas the positioning time of the RF regression changes only slightly. A likely explanation is that a regression method computes the location directly from the regression model and the input fingerprint, so an increase in the number of candidate RPs does not affect its efficiency. By contrast, the matching and classification methods select suitable RPs from the candidate RPs. Hierarchical localization reduces the number of candidate RPs and thus improves their efficiency.

5.7. Localization Performance of Proposed Hybrid Model

After verifying the validity of the proposed feature extraction method and the hierarchical positioning mode, we tested the impact of the hybrid model, which uses the global model to estimate TPs in the boundary region and the submodel to estimate TPs in the nonboundary region. Whether a TP lies inside or outside the boundary area was judged by the trained boundary point classifier, for which MLP classification was selected. We confirmed that the proposed model can avoid significant errors when the subregion positioning result is wrong.
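
The dispatch logic of the hybrid model can be sketched as follows. This is a minimal illustration under stated assumptions: the four callables are hypothetical stand-ins for the trained boundary point classifier, a subregion classifier, the global model, and the submodels, and the toy rules inside them are not the trained models from the experiments.

```python
def hybrid_localize(features, is_boundary, classify_subregion,
                    global_model, submodels):
    """Estimate a position with the global model for boundary points
    and with the matching submodel for nonboundary points."""
    if is_boundary(features):
        return global_model(features)
    region = classify_subregion(features)
    return submodels[region](features)

# Hypothetical stand-ins (assumptions for illustration only).
def is_boundary(features):
    # Stand-in for the trained MLP boundary point classifier.
    return abs(features[0]) < 0.5

def classify_subregion(features):
    # Stand-in for a subregion classifier.
    return "west" if features[0] < 0 else "east"

def global_model(features):
    return (0.0, 0.0)

submodels = {
    "west": lambda features: (-5.0, 0.0),
    "east": lambda features: (5.0, 0.0),
}

boundary_estimate = hybrid_localize(
    [0.1], is_boundary, classify_subregion, global_model, submodels)
interior_estimate = hybrid_localize(
    [2.0], is_boundary, classify_subregion, global_model, submodels)
```

Because boundary points never reach the subregion classifier, a wrong subregion decision near a border cannot route them to an unsuitable submodel.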

In this experiment, TPs with localization errors greater than 4 m were defined as error points. The error rate is the ratio of the number of error points to the total number of test points, expressed as a percentage. Several hierarchical localization methods were adopted, including RF regression, SVR, and MLP classification, all of which use features extracted by the proposed approach. By comparing the hybrid model-based methods with their counterparts that do not use the hybrid model, we found that the proposed hybrid model effectively avoided large localization errors. Table 2 shows the mean error and error rate. When the proposed hybrid model is used, the mean error decreases by 0.22 m, 0.39 m, and 0.76 m, and the error rate decreases by 6.1%, 9%, and 12.1%, respectively. The experimental results verify the validity of the proposed hybrid model with the global model and submodel: large errors caused by subregional localization errors are replaced by the localization errors of the global model.
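
The error rate defined above can be written as a short function. The sketch below uses the 4 m threshold from the text; the two error lists are hypothetical values chosen for illustration and are not taken from Table 2.

```python
def error_rate(errors, threshold=4.0):
    """Percentage of test points whose localization error exceeds
    the threshold (m); an error equal to the threshold is not counted."""
    error_points = sum(e > threshold for e in errors)
    return 100.0 * error_points / len(errors)

# Hypothetical localization errors (m) for the same five TPs.
hierarchical_errors = [1.0, 2.5, 6.0, 7.5, 3.0]
hybrid_errors = [1.0, 2.5, 3.5, 7.5, 3.0]

hierarchical_rate = error_rate(hierarchical_errors)
hybrid_rate = error_rate(hybrid_errors)
reduction = hierarchical_rate - hybrid_rate  # in percentage points
```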

In addition, three algorithms based on the proposed scheme were compared. Table 3 presents the detailed results. When trained on the feature data extracted by the proposed feature extraction method, the hybrid model based on RF yields a CDF curve in Figure 20 that shifts to the left. This algorithm yields the lowest average error, which is only 2.09 m.

Finally, Figure 21 shows the CDFs of different localization algorithms, including the method proposed by Li et al. [40], AutLoc [6], ECLS [7], and the proposed method. Table 4 lists the detailed results. The method proposed by Li et al. and AutLoc transform fingerprint data into low-dimensional spaces and improve efficiency but do not eliminate feature redundancy. ECLS expands the candidate point space so that the true location is likely to be included; however, the performance of its classifiers is affected when a large number of APs are used. The results show that the proposed method has a positioning advantage in this experimental environment with a large number of APs.

6. Conclusions

In this study, two problems were addressed. First, the features extracted by common feature extraction methods are unrepresentative owing to signal fluctuation. A feature extraction method called Fisher–SSAE was proposed to obtain robust fingerprint features, which improved the localization accuracy. Second, and more importantly, errors in subregional positioning lead to large localization errors. To address this problem, the TPs were divided into boundary and nonboundary points, and a hybrid scheme was proposed to solve the problem effectively. The positioning of moving targets was not taken into account in this study and needs to be studied in more detail in the future.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the National Key Research and Development Project of China (grant no. 2016YFB0502102), the Natural Science Foundation of Jiangsu Province (grant no. BK20181361), and the Fundamental Research Funds for the Central Universities (grant no. 2011QNA02).