Open-source data-driven urban land-use mapping integrating point-line-polygon semantic objects: A case study of Chinese cities
Introduction
Urban environmental problems caused by the rapid growth of urbanization have received worldwide attention during the past few decades (Kamusoko, 2017; Kawamura et al., 1998). Urban land-use maps are commonly used as a tool in urban analysis for describing socio-economic functions and environmental conditions (Dewan and Yamaguchi, 2009; Herold et al., 2003). With the development of very high resolution (VHR) satellite technologies, it is now possible for VHR images to clearly represent the geometry, texture, size, position, and other relevant information of the ground objects at a finer scale (Wang et al., 2019; Zhang et al., 2018). However, the variability of the object categories and the diversity of the object distributions in the land-use mapping units lead to a “semantic gap” between the low-level data and the high-level semantic information (Zhao et al., 2013), which means that land-use mapping remains a challenge (Grippa et al., 2018; Huang et al., 2018).
With regard to remote sensing image understanding, “scenes” are commonly used to identify accurate urban land-use patterns (Liu et al., 2017). In recent years, many studies using VHR images have attempted to map urban land use with scene classification models. The basic idea is to apply the trained model to the target data, thereby assigning the target data categories consistent with those of the training data (e.g., bridge, forest, airport, etc.) (Castelluccio et al., 2015; Huang et al., 2018; Nogueira et al., 2017). According to the feature level, these models can be grouped into four types (Huang et al., 2018): 1) models based on low-level features; 2) models based on mid-level features; 3) models based on object-level features; and 4) models based on high-level features. Low-level features are based on the assumption that land use can be directly represented by intrinsic features extracted or designed by experts. Examples of low-level features are shape features (Oliva and Torralba, 2001), texture features, and color features (Santos et al., 2010; Serrano et al., 2004). However, low-level features are insufficient to describe the variety and complexity of the true characteristics of the ground surface. Mid-level features focus on creating and coding a dictionary of the local low-level features, in order to describe the details more robustly (Zhao et al., 2016; Zhu et al., 2018). Object-level features are extracted from the results of object-oriented classification, focusing on the relationships between objects instead of low-level features (Cheng et al., 2017; Li et al., 2016; Zhong et al., 2017).
Compared with the previous three types of hand-crafted features, high-level features are optimal features learned autonomously by deep neural networks, such as convolutional neural networks (CNNs) (Yao et al., 2017b; Zhang et al., 2017) and fine-tuned CNNs (Hu et al., 2015; Tu et al., 2018), which can achieve relatively high land-use classification accuracies.
The previous land-use classification methods based on VHR images bridged the semantic gap between the low-level features of the objects and the high-level semantics of the scenes. However, in surveying production practice, we found that the obtained results cannot easily be applied in practical urban land-use mapping, on account of the “application gap” (Fig. 1). The main reasons for the land-use application gap are as follows.
- 1) It is difficult for VHR images to express land use composed of ground objects with similar spatial layouts but completely different socio-economic information. For example, consider two land parcels that each contain several office buildings. The floors of the office buildings in one parcel are occupied by enterprises, whereas those in the other parcel are occupied by government organizations. These categories are indistinguishable in VHR images.
- 2) It is difficult to achieve cross-regional universality with the traditional scene classification model. In practical applications, the training dataset needs to be recreated for each research city, which consumes both time and manpower.
- 3) The category systems for the scenes and urban land use are inconsistent, with different numbers of categories and different category meanings.
To extract socio-economic semantics for urban land-use mapping, multi-source geospatial data, such as social media data and volunteered geographic information (VGI) data, are being increasingly utilized in urban analysis (Soliman et al., 2017). Compared with social media data, VGI data are free and are updated in real time. Furthermore, the data formats are diverse, allowing the data to reflect the status of a city from different aspects. For example, OpenStreetMap (OSM) polygon data are commonly used as ground-truth data (Audebert et al., 2017; Fonte et al., 2017), while road line data are often used as land parcel boundaries (Chen et al., 2016). Point of interest (POI) data are widely deployed for their geographic information, including location, place, category, density, and distribution (Rodrigues et al., 2012; Yao et al., 2017a). Land-use classification using POIs generally consists of three main steps: 1) hard reclassification of the POIs; 2) quadrat analysis or kernel density estimation to quantify the spatial distribution patterns of the POIs; and 3) unsupervised or supervised classification (Hu et al., 2016; Liu et al., 2017; Long and Liu, 2015). However, hard classification of point-scale POI categories into land-use types loses much of the information contained in the points. For example, assigning the food and beverage category (a category in the Amap POI standards) to commercial and business facilities ignores the fact that food and beverage POIs are also likely to occur in residential areas, parks, and industrial areas. Furthermore, POIs do not cover natural attributes such as rivers, farmland, or meadows.
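The three-step POI pipeline can be illustrated with a minimal sketch. It is not the implementation of any of the cited studies: the lookup table `POI_TO_LAND_USE`, the toy coordinates, the 200 m bandwidth, and the argmax rule used in place of a trained classifier are all hypothetical. The sketch also makes the information loss visible, since restaurants that are hard-coded as commercial pull a residential block toward the commercial class.

```python
import math
from collections import defaultdict

# Hypothetical hard reclassification table (step 1); not the Amap standard itself.
POI_TO_LAND_USE = {
    "food_and_beverage": "commercial",
    "shopping": "commercial",
    "residential_community": "residential",
    "school": "education",
}


def kernel_density(pois, x, y, bandwidth=200.0):
    """Step 2: Gaussian kernel density of each land-use class at (x, y).

    pois is a list of (x, y, poi_category) tuples in projected metres.
    POIs whose category is not in the lookup table are skipped.
    """
    density = defaultdict(float)
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    for px, py, category in pois:
        land_use = POI_TO_LAND_USE.get(category)
        if land_use is None:
            continue
        d2 = (px - x) ** 2 + (py - y) ** 2
        density[land_use] += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return dict(density)


def classify(pois, x, y, bandwidth=200.0):
    """Step 3, reduced to the simplest rule: pick the densest class."""
    density = kernel_density(pois, x, y, bandwidth)
    if not density:
        return None
    return max(density, key=density.get)


# Toy residential block: one housing-estate POI surrounded by three restaurants.
toy_pois = [
    (0.0, 0.0, "residential_community"),
    (50.0, 0.0, "food_and_beverage"),
    (0.0, 50.0, "food_and_beverage"),
    (-50.0, 0.0, "food_and_beverage"),
]
label = classify(toy_pois, 0.0, 0.0)
print(label)
```

In this toy case the block is labelled commercial even though it is centred on a housing estate, which is the kind of error the hard reclassification step introduces.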
In fact, urban land-use identification is strongly associated with both the external physical characteristics and the inner socio-economic semantics. For these reasons, the current land-use classification methods combining VHR images with VGI data are used to recreate datasets and generate land-use maps for specific cities. OSM data are commonly used as boundaries, and a classifier is then used to classify the combined features extracted from the VHR images and POIs (Hu et al., 2016; Liu et al., 2017). However, this traditional land-use classification framework still faces three problems in practical urban analysis. Firstly, most of the existing methods employ supervised classification frameworks, which are not extensible for different cities. In addition, due to the poor transferability of the frameworks, it is necessary to recreate the training samples for different cities manually, resulting in a huge time cost. Secondly, in the traditional framework, the VHR images and POIs are fused at the feature level and, as such, it is difficult to analyze the constituent factors of complex land-use types. In actual circumstances, scenes may only be components of the land use, and the surrounding environment should also be taken into account. For example, basketball courts are commonly found in sports areas, but may also be found in education and science areas, such as university campuses. Thirdly, the categories of the input data used in the traditional methods are consistent with the categories of the output land use, which is not robust for different urban analyses.
Describing land parcels from the perspective of semantic objects can effectively explain the causes of the above problems. In geographic information systems, objects are modeled as points (or pixels), lines, and polygons (or areas) (Goodchild, 1993; Tang et al., 1996). In urban land-use classification research, the land-use mapping units can also be described by points, lines, and polygons (Fig. 2). Land-use mapping units segmented by regular grids cause the mosaic phenomenon, which is manifested as sawtooth boundaries and ambiguous geographical meanings. Therefore, it is necessary to use line objects, e.g., road networks, to segment the geographically meaningful land parcels. However, the semantics represented by the same objects at different scales can be different. The scales of polygon objects (i.e., VHR images) and point objects (i.e., POIs) are different from the land parcel scale of the land-use mapping units, so that they cannot express all the natural and socio-economic attributes. Fortunately, POIs can represent local details and contain socio-economic information, and VHR image polygons can express global information at a larger scale. For example, a land parcel may be “market” at the POI scale and “medium residential” or “playground” at the VHR image scale, but in fact is a “school” containing residential buildings at a larger polygon scale. Finally, the education and science land-use type is attached to the land parcel through the integration and category mapping of the point-line-polygon semantic objects. In addition, since the attributes of the POI and OSM data can reflect the semantic information, and the VHR images have semantic category information after being classified, the objects are called “semantic objects”, in order to distinguish them from ordinary objects.
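The school example can be made concrete with a small sketch of how point and polygon semantic objects inside one land parcel might be pooled and mapped to a land-use type. This is only an illustration of the idea, not the category mapping model proposed in the paper: the `SUPPORT` table, its weights, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical table: how strongly each semantic-object label supports
# each land-use type. The values are invented for illustration.
SUPPORT = {
    "market": {"commercial": 1.0, "education": 0.2},
    "medium_residential": {"residential": 1.0, "education": 0.4},
    "playground": {"education": 0.8, "sports": 0.8},
    "school": {"education": 1.0},
}


def parcel_histogram(point_labels, polygon_labels,
                     point_weight=1.0, polygon_weight=1.0):
    """Pool the semantic objects of one parcel into a weighted histogram."""
    hist = Counter()
    for label in point_labels:      # POI-scale (point) semantic objects
        hist[label] += point_weight
    for label in polygon_labels:    # VHR-scene (polygon) semantic objects
        hist[label] += polygon_weight
    return hist


def map_land_use(hist):
    """Accumulate support per land-use type and return the best-scoring one."""
    scores = Counter()
    for label, weight in hist.items():
        for land_use, support in SUPPORT.get(label, {}).items():
            scores[land_use] += weight * support
    return max(scores, key=scores.get) if scores else None


# Parcel from the example: "market" at the POI scale,
# "medium residential" and "playground" at the VHR image scale.
hist = parcel_histogram(["market"], ["medium_residential", "playground"])
land_use = map_land_use(hist)
print(land_use)
```

No single object in the parcel is labelled as education, yet the pooled evidence favours it, which mirrors the argument that land use emerges from the mixture of objects rather than from any one of them.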
In this paper, to break through the application gap and meet the requirements of practical urban analysis, point-line-polygon semantic objects are used to map the land use in an open-source data-driven manner, and the point-line-polygon semantic object mapping (PLPSOM) framework is proposed. The main contributions of this paper can be summarized as follows.
- 1) In order to reduce the time-consuming sample production, the proposed PLPSOM framework utilizes completely open-source data, including VHR images and VGI data. The VHR images provide clues from a visual perspective, and with regard to the VGI data, OSM road networks and water channels are used to generate the land parcels, and POIs are used to provide the socio-economic information.
- 2) In the PLPSOM framework, to improve the transferability and universality of the framework, an enhanced deep adaptation network (EDAN) scene classification model is proposed to obtain the categories of the polygon semantic objects through the use of the existing public datasets. The EDAN model introduces multi-scale grid sampling to reduce the distribution differences between the public datasets and the target images, and utilizes a weighted network to address the problem that the public datasets contain more categories than the target domains.
- 3) In order to fuse the point and polygon semantic objects of the POIs and VHR images, and solve the problem of inconsistent categories between the semantic object and land-use category systems, a rule-based category mapping (RCM) model is proposed, integrating the weighted bag-of-words (BoW) model and the Word2Vec model. The RCM model considers the mutual constraints between the semantic objects and the land use, to match the undefined land parcels with the correct land-use types. Finally, the PLPSOM framework was successfully applied to map urban land use in six areas of the four Chinese cities of Wuhan, Beijing, Hong Kong, and Macao.
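To give a feel for what "reducing the distribution differences" between a public dataset and target images can mean, the sketch below computes a squared maximum mean discrepancy (MMD), the quantity that deep adaptation networks in the literature typically minimize. Whether EDAN uses exactly this form is not stated in this section, so it should be read as an assumption. The optional `source_weights` argument is likewise a hypothetical stand-in for down-weighting source samples from categories that do not occur in the target domain; it is not the paper's weighted network.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def mmd2(source, target, sigma=1.0, source_weights=None):
    """Biased estimate of the squared MMD between two feature samples.

    source and target are lists of equal-length feature vectors.
    source_weights (assumed, optional) rescales each source sample.
    """
    n, m = len(source), len(target)
    if source_weights is None:
        source_weights = [1.0] * n
    total = sum(source_weights)
    w = [wi / total for wi in source_weights]   # normalized source weights
    v = [1.0 / m] * m                           # uniform target weights
    k_ss = sum(w[i] * w[j] * gaussian_kernel(source[i], source[j], sigma)
               for i in range(n) for j in range(n))
    k_tt = sum(v[i] * v[j] * gaussian_kernel(target[i], target[j], sigma)
               for i in range(m) for j in range(m))
    k_st = sum(w[i] * v[j] * gaussian_kernel(source[i], target[j], sigma)
               for i in range(n) for j in range(m))
    return k_ss + k_tt - 2.0 * k_st


# Toy features: the third source sample stands for a category
# that does not occur in the target domain.
source_feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
target_feats = [[0.0, 0.0], [0.1, 0.0]]
unweighted = mmd2(source_feats, target_feats)
weighted = mmd2(source_feats, target_feats, source_weights=[1.0, 1.0, 0.0])
print(unweighted, weighted)
```

In the toy data, setting the weight of the outlying source sample to zero removes the discrepancy, which is the intuition behind handling source categories that are absent from the target domain.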
Study areas
Six areas of cities with different levels of urbanization, city size, development degree, and urban structure were selected as the study areas: Beijing city center; Wuhan city center; the Hanyang District of Wuhan; the Hannan District of Wuhan; Macao; and the Wan Chai area of Hong Kong (Fig. 3). Beijing is not only the capital, but also the political, cultural, and educational center of China. According to the Beijing Municipal Bureau of Statistics, the population of Beijing reached 21.71
Methods
In order to suit the needs of land-use mapping in various cities, the PLPSOM framework is proposed. In the PLPSOM framework, the land parcels are generated as the cartography units based on the constraint of OSM line data. Naturally, the POI data and the VHR images are matched to the land parcels. Subsequently, the point semantic objects are acquired from the land parcels of POIs, inheriting the categories of the POIs. With regard to the polygon semantic objects, they are sampled from the land
Results
The presented results include five parts: 1) the land parcels obtained by segmentation; 2) the quantitative effect of the VHR image classification results; 3) the mixture of semantic objects in the land parcels; 4) visual interpretation of the land-use maps; and 5) the overall classification accuracy (OA) of the land-use mapping.
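For reference, overall accuracy is conventionally the share of evaluated samples whose predicted label matches the reference label. The sketch below computes it from paired label lists; the parcel labels are invented for illustration and are not results from the paper.

```python
from collections import Counter


def confusion_counts(reference, predicted):
    """Count (reference, predicted) label pairs, i.e. a sparse confusion matrix."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    return Counter(zip(reference, predicted))


def overall_accuracy(reference, predicted):
    """OA = correctly labelled samples / all samples."""
    counts = confusion_counts(reference, predicted)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples to evaluate")
    correct = sum(n for (ref, pred), n in counts.items() if ref == pred)
    return correct / total


# Hypothetical reference and predicted land-use labels for four parcels.
reference = ["residential", "residential", "commercial", "education"]
predicted = ["residential", "commercial", "commercial", "education"]
oa = overall_accuracy(reference, predicted)
print(oa)
```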
Scalability of the land-use category system
Although the established land-use category system can meet the general needs of urban analysis, there are still some imperfections in the category system, due either to limitations in the expert knowledge or the additional requirements of the urban application. Therefore, it was necessary to explore the robustness of the PLPSOM framework with regard to the land-use category system.
Firstly, by reducing the land-use categories to nine (the level I categories in the MOHURD urban land-use
Conclusion
In this paper, the open-source data-driven PLPSOM framework has been proposed to efficiently produce land-use maps that meet the needs of urban analysis, using point-line-polygon semantic objects of VHR images and multi-source geospatial data. The OSM road network data, as line data, provide a boundary constraint for the land parcels. The POIs and VHR images, as point and polygon data, respectively, make up the semantic objects at local and global scales. The EDAN scene classification model
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The authors would like to thank the editor, associate editor, and anonymous reviewers for their helpful comments and advice. This work was supported by the National Key Research and Development Program of China under Grant No. 2017YFB0504202, the National Natural Science Foundation of China under Grant Nos. 41771385, 41622107, 41820104006 and 61871299, and the Open Research Fund of the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University under Grant No. 18E03.
References (44)
- Land use and land cover change in greater Dhaka, Bangladesh: using remote sensing to promote sustainable urbanization. Appl. Geogr. (2009)
- Urban land-use mapping using a deep convolutional neural network with high spatial resolution multispectral remote sensing imagery. Remote Sens. Environ. (2018)
- Urban land use extraction from very high resolution remote sensing imagery using a Bayesian network. ISPRS J. Photogramm. Remote Sens. (2016)
- Towards better exploiting convolutional neural networks for remote sensing scene classification. Pattern Recogn. (2017)
- Improved scene classification using efficient low-level features and semantic cues. Pattern Recogn. (2004)
- An object-based convolutional neural network (OCNN) for urban land use classification. Remote Sens. Environ. (2018)
- Joint learning from earth observation and OpenStreetMap data to get faster better semantic maps
- Land use classification in remote sensing images by convolutional neural networks. arXiv (2015)
- Land use classification in construction areas based on volunteered geographic information
- Remote sensing image scene classification: benchmark and state of the art
- A Bayesian hierarchical model for learning natural scene categories
- Generating up-to-date and detailed land use and land cover maps using OpenStreetMap and GlobeLand30. ISPRS Int. J. Geo-Inf.
- The state of GIS for environmental problem-solving
- Mapping urban land use at street block level using OpenStreetMap, remote sensing data, and spatial metrics. ISPRS Int. J. Geo-Inf.
- Deep residual learning for image recognition
- Spatial metrics and image texture for mapping urban land use. Photogramm. Eng. Remote Sens.
- Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery. Remote Sens.
- Mapping urban land use by using Landsat images and open social data. Remote Sens.
- Toward mapping land-use patterns from volunteered geographic information. Int. J. Geogr. Inf. Sci.
- Importance of remote sensing and land change modeling for urbanization studies
- Comparison of urbanization of four Asian cities using satellite data. Doboku Gakkai Ronbunshu
- Classifying urban land use by integrating remote sensing and social media data. Int. J. Geogr. Inf. Sci.