1 Introduction

While rapidly developing modern cities in newly industrialized countries normally benefit from the construction of mega-scale infrastructure, they also suffer from severe traffic problems. Globally, there is increasing concern regarding the loss of human scale in city centers and suburbs. As a result, research on street-level pedestrian movement has become a central topic in the urban development literature. Numerous studies have investigated the impact of urban form on pedestrian travel behavior. Population density, land use, access to public transport, environmental comfort, and urban design have all been shown to affect pedestrian street usage. However, most of these studies focus on the city or neighborhood level, while relatively little attention has been paid to overall street patterns and their impacts on pedestrian movement at the street level.

Street patterns affect pedestrian movement at different scales. First, research shows that small-scale urban blocks can increase the walkability of an area [1, 2]. Other studies have shown that higher street density also attracts more active urban functions, such as retail shops and restaurants, which contributes to street vitality and attracts more human activity in the city [3].

Second, large-scale street connections can affect the distribution of different urban functions within cities. The location of office towers and local restaurants requires very different scales of accessibility. Intuitively, pedestrian movement is attracted to various land uses as origins and destinations, and those land uses and public transport stations are affected by large-scale street connections. In other words, although walking is primarily a short-range activity, the distribution of pedestrians in the city and neighborhoods is affected by collective route choices of multiple transportation modes and active land uses, which are affected by multiple scales of street connections.

Third, the morphology of street patterns can affect the vitality of urban places. Running a through-traffic motorway through a densely built, fine-grid local street network could activate the area with more shops and more pedestrians [4, 5]. This suggests that pedestrian-friendly space as a local experience is also affected by the large-scale networks that allow these vital places to emerge and thrive.

This paper aims to understand the impact of street patterns on pedestrian distribution. The main objective is to understand how street patterns measured at different scales might concurrently affect pedestrian distribution within cities. Specifically, two research questions are explored: (1) What are the major factors that can explain the pedestrian distribution in different neighborhoods? (2) How do land use, metro stations, and city- and neighborhood-scale street connectivity affect the pedestrian distribution at different locations?

2 Background

Previous research on walking generally falls into two categories. The first category primarily addresses the issue of walkability based on “comfort and safety,” and aims to evaluate the impact of the physical environment (ranging from pavement, shading, and street furniture to other physical conditions) on walking activities. Through questionnaires, these studies measure the influence of these factors on people’s willingness to walk or their sense of safety and comfort while walking, and also measure how these variables affect the frequency of walking [6, 7]. The second category investigates the spatial conditions of walking, analyzing the impact of urban form, such as density, land use, and street connectivity, on walking activity. Through on-site observation (such as gate count surveys) and questionnaires, these studies generally analyze the impact of these spatial conditions on the distribution of pedestrian volumes [8,9,10,11].

Empirical studies on comfort and safety typically compare different neighborhoods in a city but not at the resolution of the street segment. On the other hand, some empirical studies on “spatial conditions” search for factors that explain the detailed distribution of pedestrians on the streets. For designers, this could be used to assess the potential for walking within neighborhoods. This research follows the second category, focusing on how street patterns can affect both the distribution of land uses and the pedestrian volume within cities.

One of the key issues in the study of urban form is how street connectivity is measured. Traditionally, researchers use block size [1, 2], number of intersections per area [12,13,14], or average distance between intersections [15] to measure street connectivity. These methods are similar as they quantify street connectivity based on metric distance. Other researchers measured street connectivity based on accessible walking area and the route directness from a particular location in the street network [1, 13, 15,16,17]. While route directness provides another dimension related to pedestrian perception and navigation within a street network, it is still based on the local area of a street network.

Space syntax presents a different way to measure street connectivity in cities. First, as a theory based on movement economies, space syntax can assess the influence of street patterns on both movement and land use. It proposes that the street pattern can affect both the distribution of movement and the land use pattern, especially for movement economies such as retail, markets, catering, and entertainment [18]. Second, as a method, the syntactical measurements are based on topological geometry which can be calculated at various scales. It also allows for a combination of metric distance with topological distance measured in angular turns [19]. Last, the ability to reveal the multiscale structure embedded in the street network enables space syntax to determine the distribution of urban functions as well as different modes of movement patterns [20,21,22,23,24].

Specifically, there is a growing body of research on the influence of street connectivity on land use and pedestrian movement. Recent studies tend to focus on multiple case areas within a city. They also aim to systematically investigate how street connectivity interacts with other important factors, such as land use or density, to affect pedestrian movement. One study analyzing pedestrian volume in the Atlanta downtown, midtown, and highland areas showed that land use has a major impact on pedestrian volume among the three areas. Street connectivity (measured by a 1-mile metric reach over directional distance) had a significant influence on pedestrian distribution at the street scale in each area [10]. Another study in London used 6-year origin and destination (OD) survey data to build a prediction model with four variables: land use diversity equitability intensity, population density, integration r2 km, and public transport accessibility. The results show that integration r2 km and land use diversity play major roles [25]. Many studies have found a strong influence of active land use and local street patterns on the distribution of pedestrian movement.

Most of these studies, however, neglect the influence of street patterns on the distribution of active land use, and this influence may function at various scales beyond the local scale. In fact, the spatial analysis of commercial functions based on multiple-scale street connectivity is another well-studied topic in space syntax research. Early studies in the UK and Iran found that the location of a “center” depends on both how the area is connected to the city as a whole and the local catchment area (measured as 3–5 turns) [26]. Read’s study on Dutch cities suggests that the configuration of street networks has a “bi-plex” structure: a super-grid network that facilitates movement between different places in the city, and local street networks. This bi-plex structure leads to a clear distinction between the volumes of movement (pedestrian and car) and types of shops [20]. Read states that vital urban places are created by the interface between different scales of movement networks [5]. A recent empirical study on the multi-scalar structure of centralities on shopping frontage in Buenos Aires demonstrates that syntactical measurements have stronger explanatory power than the distance-decay model [24]. Sheng and Liu’s analysis of the Wangfujing shopping area in Beijing shows that the distribution of restaurants, the number of reviews in Dazhongdianping (Chinese Yelp), and street pedestrian volumes are all highly correlated with the same syntactical measurement (integration r3 km) [27]. These findings suggest that the distribution of active land use can be explained by multiple-scale street connectivity.

However, recent research findings challenge the assumed link between aggregate flow data, the scale factors in syntactical measurements, and individual movement patterns [28]. Based on agent simulation, this research argues that it is not the movement scale but the underlying topological structure, namely the difference between a few well-connected streets and many poorly connected ones, that determines the distribution of flow volumes. In other words, this research suggests that the topological geometry of the local street pattern itself determines the distribution of flow. Following this line of thinking, it could be inferred that different neighborhoods may have very different correlations between flow volumes and street patterns because they may have distinctive street patterns. Therefore, more detailed empirical study is needed to further test the correlation between different modes of movement and the scale attributes of syntactical measurements using a large amount of actual flow data.

This paper aims to address the following questions: Can the neighborhood pedestrian distributions be explained by both city- and local-scale street patterns? Are there any differences if we analyze the neighborhoods separately or combine them into one model? Finding answers to these questions requires sufficient samples in one city, ranging in location from central areas to inner and outer suburbs. Within these neighborhoods, the pedestrian observation gates should be evenly distributed to cover all kinds of streets, ranging from small alleys to large avenues. Using Tianjin as a case city, this research aims to understand how both local- and city-scale street patterns affect pedestrian distributions within different neighborhoods.

3 Case Study Area

3.1 Overview of Tianjin

Tianjin is the fourth largest city in China, with a population of approximately 15 million people. From the 1900s to the 1940s, Tianjin contained concession territories ceded to nine countries. Beginning in the 1980s, Tianjin underwent a period of rapid urban development. This resulted in a complex, hybrid street pattern of both Western and Eastern (historical and modern) influences, which makes Tianjin an excellent city for this type of study.

Three sets of data are used in this research: street connectivity maps, non-residential land use location, and pedestrian/vehicle movement data. The street network for all of Tianjin city is drawn based on the Baidu street map 2014. It is measured by the Depthmap software with two widely used measurements: integration and choice. “Integration” (also known as “angular closeness centrality”) measures the average angular distance from one street segment to all other street segments within a given metric radius. It shows the potential of the “to-movement” of each street segment. “Choice” (also known as “angular betweenness centrality”) measures the potential of a street segment to be passed by all pairs of shortest paths within a given metric radius. It shows the potential of the “through-movement” of each street segment. Because the value of choice does not follow a normal distribution, most empirical studies use a log-choice value of a certain radius. Figure 1 shows the street connectivity in Tianjin measured by the integration and logged choice values within a 10 km radius. One new syntactical measurement, normalized angular choice (NACH), based on choice and total depth [29], will also be used in the analysis.
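As a rough illustration of the two transformations mentioned above, the following sketch computes a logged choice value and NACH for a single segment. The text does not spell out either formula: the `log10(choice + 1)` form and the NACH definition `log(choice + 1) / log(total depth + 3)` are assumptions based on the usual space syntax convention, and the function names and input numbers are hypothetical.

```python
import math


def log_choice(choice: float) -> float:
    """Logged choice; the +1 offset (an assumption) keeps zero choice defined."""
    return math.log10(choice + 1.0)


def nach(choice: float, total_depth: float) -> float:
    """Normalized angular choice, assuming log(CH + 1) / log(TD + 3)."""
    return math.log10(choice + 1.0) / math.log10(total_depth + 3.0)


# Hypothetical segment values, not taken from the Tianjin data.
print(round(log_choice(999.0), 3))
print(round(nach(999.0, 97.0), 3))
```

Dividing by the logged total depth is what makes values comparable between segments embedded in networks of different size, which is the reason a normalized measure is preferred when neighborhoods are compared.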

Fig. 1 Integration r10 km and log-choice r10 km of Tianjin 2014

Land use data are obtained from a Baidu point of interest (POI) database. Previous studies show that commercial land use has a strong and positive relationship with the pedestrian flow on streets in China [10, 11]. Therefore, this research focuses on the sum of five important commercial POI types: commercial and office towers, retail, restaurants, offices, and hotels, all of which are chosen because of their potential to attract pedestrian movement.

3.2 Pedestrian Movement

Thirteen neighborhoods are selected within the city to monitor pedestrian movement and volume. The location of these neighborhoods ranges from typical city centers to the suburbs. Table 1 shows the general information about these neighborhoods. Based on the distance from the city center (the crossing of NanJingLu and YingKouDao), six neighborhoods are located in the central area of Tianjin, five neighborhoods are located in the inner suburbs, and two neighborhoods are located in the outer suburbs.

Table 1 General information of 13 neighborhoods

For each neighborhood, 30–100 street segments are chosen as gates to monitor traffic flows. For each street segment, 5 min of two-directional flow of pedestrian, cyclist, and vehicle traffic is counted four times per day in one weekday as well as one weekend day in September 2014 and September 2015. The only exception is the Binjiangdao neighborhood, which only has one weekday’s gate count. In total, 215,577 pedestrians are recorded in these two survey periods. Their distributions are presented in Fig. 2.

Fig. 2 Spatial distribution of pedestrian volumes in 13 neighborhoods

4 Method

4.1 Variables

Based on previous research, this study divides the measurable variables into four categories: (1) Urban performance data, which refers to the actual use of urban space. These include pedestrian and vehicle volume and POI distribution within each neighborhood. (2) City-scale street connectivity, which is quantified by the integration, choice, or normalized angular choice (NACH) value of large radii (5 km–n). These values correlate well with the observed vehicle volume data and serve as a proxy for it. Another variable included is the proximity to metro stations, which potentially has an impact on pedestrian distributions. (3) Local-scale street connectivity, which is quantified by the integration or choice value of small radii (500 m–3 km) to reflect different transportation movement at the neighborhood scale. (4) Street density, which refers to the total length of accessible streets from each street segment within certain radii (100 m–1 km). It has been widely used in previous studies on pedestrian volume and POIs without the use of space syntax.

4.2 Theoretical Framework

Based on the four types of variables mentioned above, Fig. 3 shows an analytical diagram for different neighborhoods. This diagram provides a hypothetical framework for the analysis. For instance, in center neighborhoods with high levels of POI and pedestrian volume, the local street network usually has a fine grid structure (high integration or choice value at small radii). If the city-scale streets mesh well with the local grids, a strong attractor can be created for both active land use and pedestrians [5]. This strong attractor might be further intensified by building a metro station. Neighborhoods located in the sub-center likely have a good local street network but limited connections with city-level streets. Suburban neighborhoods probably have either a fragmented local street network or a local grid that is poorly connected to the outside street network. This likely leads to a lower concentration of active land use and fewer pedestrians on the streets.

Fig. 3 Diagram of the multiscale street network model for different neighborhoods

Space syntax provides a useful tool to measure both city-scale and local street connectivity. With these measurements, the detailed distribution of pedestrians at the street scale can be analyzed in the 13 neighborhoods across the city.

5 Analysis

To understand the impact of the city street pattern on the distribution of pedestrian volumes, this research first analyzes the vehicle and POI data, which might have an impact on pedestrian distribution within cities.

5.1 Vehicle Volume

Although the vehicle volume is not the main subject of this study, it can be compared with the analysis on pedestrian movement to illustrate the relationship between different scales of movement and syntactical measurements. Together with the distance from the metro, this could potentially show where pedestrians are coming from at the city scale. Because driving in a city normally involves much longer distances than walking, we analyzed the vehicle movement data of 13 neighborhoods in one model.

Figure 4 shows the coefficient of determination (R2 value) between weekday and weekend vehicle flow and three syntactical measurements at 12 different radii. In total, there are 637 street segments that allow driving. The result shows that for both the weekday and weekend flows, the 10 km radius log-choice and NACH values have the strongest correlation (R2 = 0.577–0.607) with vehicle volume. Thus, in later analysis, the NACHr10 km value will be used to create a vehicle accessibility and visibility measurement.
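The radius selection described above can be sketched as a simple scan: compute R2 between the observed flow and the measurement at each radius, then keep the radius with the highest value. This is a minimal sketch rather than the procedure actually run for Fig. 4; the function names, radius labels, and the five-segment sample are hypothetical.

```python
import numpy as np


def r_squared(x, y) -> float:
    """Squared Pearson correlation between two 1-D samples."""
    r = np.corrcoef(np.asarray(x, float), np.asarray(y, float))[0, 1]
    return float(r ** 2)


def best_radius(flow, measure_by_radius):
    """Return (radius label, R2) for the radius whose measure best fits flow."""
    scores = {label: r_squared(values, flow)
              for label, values in measure_by_radius.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]


# Hypothetical gate counts and measurements for five segments.
flow = [120, 340, 560, 800, 990]
measures = {
    "r1km": [0.9, 0.2, 0.7, 0.3, 0.8],
    "r10km": [1.0, 2.1, 2.9, 4.2, 5.0],
}
label, r2 = best_radius(flow, measures)
print(label, round(r2, 3))
```

With a single predictor, the squared Pearson correlation equals the R2 of a simple linear regression, so no model fitting is needed for this step.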

Fig. 4 City-scale street connectivity analysis on vehicle volumes for all neighborhoods in one model

In order to better measure vehicle accessibility and visibility within the selected neighborhoods, we further combine NACHr10 km with the angular turn measurement (angular step depth, or ASD) and create a new variable, “main road ASD [normalized angular choice weight (NACH wgt.)].” The threshold to define a main road is set to 1500 cars/h. Angular step depth is a way of calculating the angular distance from one street segment to other segments. For instance, a 45° turn is 0.5, a 90° turn is 1, and a 135° sharp turn is 1.5.
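The turn weighting in the examples above amounts to dividing the turn angle by 90°. The sketch below applies that rule to single turns and to a route given as a list of successive turn angles. The function names are hypothetical, and the search for the least-angle route through the network, which Depthmap performs, is deliberately left out.

```python
def turn_cost(angle_deg: float) -> float:
    """Angular cost of one turn: 90 degrees counts as 1.0."""
    return abs(angle_deg) / 90.0


def angular_step_depth(turns_deg) -> float:
    """Total angular depth along a route, given its successive turn angles."""
    return sum(turn_cost(a) for a in turns_deg)


print(turn_cost(45))                   # 0.5
print(turn_cost(135))                  # 1.5
print(angular_step_depth([45, 90]))    # 1.5
```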

In Fig. 5, we use the BinJiangDao (BJD) neighborhood as an example to illustrate how this variable is constructed. First, the main roads are selected and the average values of NACHr10 km on those main roads are calculated. Second, angular step depths are calculated from those two main roads (NanJingLu and XinAnLu, respectively). Finally, the main road ASD for the BinJiangDao (BJD) neighborhood is calculated by the following formula:

$$\begin{aligned} & \text{Main road ASD} = \frac{1.4215}{1 + \text{ASD}_{1}} + \frac{1.1421}{1 + \text{ASD}_{2}} \\ & \text{ASD}_{1} \text{ is the angular step depth from NanJingLu.} \\ & \text{ASD}_{2} \text{ is the angular step depth from XinAnLu.} \end{aligned}$$

Fig. 5 Illustrations of main road ASD variable (using BinJiangDao as an example)

It is necessary to point out that main road ASD is a hybrid of a syntactical measurement and the actual layout of the main road structure in the city. It combines vehicle accessibility with visibility: NACHr10 km is well correlated with vehicle volumes, while angular step depth provides a topological decay factor that penalizes streets without a direct connection to main roads.
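A minimal sketch of the main road ASD calculation for the BinJiangDao example follows. It assumes that the two constants in the formula, 1.4215 and 1.1421, are the mean NACHr10 km values of NanJingLu and XinAnLu, as the construction steps suggest; the function name and the sample depths are hypothetical.

```python
def main_road_asd(weights, depths) -> float:
    """Sum of NACH weights, each discounted by 1 / (1 + angular step depth)."""
    if len(weights) != len(depths):
        raise ValueError("one depth is needed per main road")
    return sum(w / (1.0 + d) for w, d in zip(weights, depths))


# Constants from the BinJiangDao formula (assumed to be mean NACH r10 km
# values for NanJingLu and XinAnLu, in that order).
BJD_WEIGHTS = (1.4215, 1.1421)

# Hypothetical segment: on NanJingLu (ASD_1 = 0), one right-angle turn
# away from XinAnLu (ASD_2 = 1).
print(round(main_road_asd(BJD_WEIGHTS, (0.0, 1.0)), 5))

# The same segment three right-angle turns away from both main roads.
print(round(main_road_asd(BJD_WEIGHTS, (3.0, 3.0)), 5))
```

A segment lying directly on a main road receives that road's full weight, and every additional right-angle turn increases the denominator by one, which is how the measure penalizes streets hidden from the main roads.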

5.2 POIs

After constructing one variable to quantify vehicle accessibility and visibility, we further developed a new measure to properly quantify POI data within each neighborhood. In previous studies, the land use or POI data are either treated directly at the street segment scale or aggregated onto the street segment using a certain metric reach as a buffer zone [10]. The principle behind this type of analysis is that for each street segment, not only should the shops that open their doors directly onto this segment be considered as part of the potential of this particular space, but all the other shops located in the vicinity should also be taken into account. However, metric distance alone may not be a good way of defining vicinity. In this research, we propose another way of measuring POI data, which combines human visual perception and movement potential.

Figure 6 illustrates the logic of this method. A 10 m × 10 m grid is constructed in Depthmap. We start with a street segment in the corner (marked by white double arrows), where all shops aligned with the street are counted in full, and then apply a distance decay function. As illustrated in the bar chart below, the shops 200 m ahead are counted at 50%, and this number is added to the starting position. Wherever people make a turn into a side street, there is a sudden drop in values because not all shops on side streets are instantly visible from the starting position. The formula for this method is as follows:

$$\text{POI\#200} = \frac{2000}{\left(2000 + 0.05 \cdot \text{MSD}^{2}\right)\left(\text{ASD} + 1\right)^{2}}$$

Fig. 6 Illustration of the function of POI#200 and its application in a street segment in BinJiangDao (BJD) neighborhood

POI#200 gives the share of each POI that is counted toward the starting street segment, expressed as a proportion (0.5 corresponds to 50%). Metric Step Depth (MSD) is the metric distance from the selected starting street segment, measured in meters. Angular Step Depth (ASD) is the cumulative angular turn from the selected starting street segment, expressed as a decimal number (for example, a 90° turn counts as 1).
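The POI#200 formula can be checked numerically with a short sketch. The weight function follows the formula directly; the aggregation helper, which sums weighted POI counts for one starting segment, and the tuple layout of its input are assumptions made for illustration.

```python
def poi_200_weight(msd_m: float, asd: float) -> float:
    """Share of a POI credited to the starting segment (POI#200 formula)."""
    return 2000.0 / ((2000.0 + 0.05 * msd_m ** 2) * (asd + 1.0) ** 2)


def weighted_poi(shops) -> float:
    """Sum of POI counts weighted by POI#200; shops = (count, MSD in m, ASD)."""
    return sum(n * poi_200_weight(msd, asd) for n, msd, asd in shops)


print(round(poi_200_weight(0, 0), 3))     # 1.0   on the starting segment
print(round(poi_200_weight(200, 0), 3))   # 0.5   200 m straight ahead
print(round(poi_200_weight(200, 1), 3))   # 0.125 200 m away, one 90° turn
```

The second value reproduces the 50% weight at 200 m that gives the measure its name, and the third shows the sudden drop after a single right-angle turn, since the angular term enters as a square.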

We tested this method by comparing it with other POI measures in a regression analysis against the observed pedestrian volume, both in each neighborhood and in all neighborhoods together. The other POI measures simply count the number of POIs on each street segment and aggregate them within certain metric distances (presented as POI R100 or POI R500 in Table 2). Table 2 shows that the proposed measure, POI#200, has a substantially higher correlation with pedestrian volume in most neighborhoods (8 out of 13). When all neighborhoods are put in one model, POI#200 still explains 29.7% of the variance in pedestrian volume across all street segments (R2 = 0.297). This indicates that POI#200 is a better measure than the other POI indexes, so it is used in the following analysis. In the remainder of this paper, POI refers to POI#200.

Table 2 R2 value between pedestrian volume and different ways of measuring POI in each individual case and all cases together

5.3 Correlation Between Pedestrian Volume, POI, and Syntactical Measurements

After setting up variables for POI, main road ASD, and metro MSD (metric distance from each metro station), a correlation analysis between pedestrian volumes, POI, and all spatial variables is performed. Two kinds of syntactical measurements are used: integration and log-choice. For each kind of measurement, 13 radii are tested, ranging from 500 m to n (meaning the radius covers the whole map) (Table 3).

Table 3 Correlation matrix between pedestrian volumes, POI data and all spatial variables

First, the results show that in most neighborhoods (with the exception of the HuaYuan and NanJingLu neighborhoods) the weekday pedestrian data are strongly correlated with the syntactical measurements. Second, POI data also show a stronger correlation with syntactical measurements in most neighborhoods. That tendency is also very clear when observing all cases together: integration at small radii (1000–2000 m) is well correlated with POI (Pearson correlation coefficient R = 0.64–0.66), while the highest correlation between pedestrian volume and a syntactical measurement is with log-choice r2000 m (R = 0.46). Third, the syntactical measurement that correlates best with pedestrian volume differs considerably between neighborhoods. In some cases, the integration value is stronger than log-choice, while in other cases the opposite is true. Besides the type of measurement used, these best correlated variables also vary in scale, ranging from 1 km all the way to 25 km and n. This raises an important question that is analyzed in the later part of this paper.

Finally, street density measurements show very different results across the case areas. In the AnShanDao (AShD) neighborhood, they are negatively related to pedestrian volume. In the HongQiNanLu (HQNL), Jinwan (JW), XiaoBaiLou (XBL), and ZhongShanLu (ZSL) case areas, they are positively related to pedestrian volumes.

It is interesting to point out that when we put all the neighborhoods together, the correlation between street densities and pedestrian volumes decreased significantly, but the correlation between street densities and the distribution of POIs remained quite high: for segment length r1000 m, R = 0.639. This result can be compared with the correlation of syntactical measurements with POIs at the same radius: for integration r1000 m, R = 0.660. Integration r2500 m (R = 0.702) has the strongest correlation between syntactical measurements and POIs. The syntactical measurements therefore perform somewhat better than street density, but the two are related. The formula of integration is:

$$\text{Integration}_{Ri} = \frac{\left(\text{Node Count}_{Ri}\right)^{2}}{\text{Total Depth}_{Ri}}$$

Node Count Ri is the number of street segments within a given radius i. Thus, while node count neglects segment length, it is still related to street density, especially at a small radius.
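The dependence of integration on node count can be made concrete with a small sketch of the formula above (node count squared over total depth). The function name and the two sample segments are hypothetical.

```python
def integration(node_count: int, total_depth: float) -> float:
    """Integration at one radius: node count squared over total depth."""
    if total_depth <= 0:
        raise ValueError("total depth must be positive")
    return node_count ** 2 / total_depth


# Hypothetical segments with the same total depth but different numbers
# of reachable segments within the radius.
print(integration(20, 50.0))   # 8.0
print(integration(40, 50.0))   # 32.0
```

Because node count enters as a square, doubling the number of segments reachable within the radius quadruples integration when total depth is held fixed, which is consistent with the close relationship between small-radius integration and street density.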

This finding suggests that the way street density affects pedestrian volumes is through its more direct influence on POIs: the higher the street density, the more active land uses are located in the area and the more pedestrians are attracted to the area. To put it in simple terms, density centralizes pedestrian movement across the city; syntactical structure distributes pedestrians within neighborhoods.

5.4 Multiple Regressions of Pedestrian Volume, Syntactical Measures, and POI

In this section, we explore how pedestrian volume can be explained by POI and syntactical measures. Four variables, i.e., POI#200, main road ASD, metro MSD, and one space syntax measurement (integration or log-choice), are included in the analysis. To standardize the results, the Z-score of each variable is used (Table 4).
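The standardized regression described above can be sketched with NumPy as follows. This is an illustrative sketch, not the software or settings used for Table 4: the data are synthetic, the helper names are hypothetical, and ordinary least squares on Z-scores is assumed.

```python
import numpy as np


def zscore(a):
    """Standardize each column to zero mean and unit variance."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0)


def fit_standardized(X, y):
    """OLS on z-scored data; returns (standardized coefficients, R2)."""
    Xz, yz = zscore(X), zscore(y)
    # No intercept is needed because every z-scored variable has zero mean.
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    residual = yz - Xz @ beta
    r2 = 1.0 - (residual @ residual) / (yz @ yz)
    return beta, float(r2)


# Synthetic data standing in for the four predictors; not the survey data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([0.5, 0.1, -0.2, 0.3]) + rng.normal(scale=0.5, size=200)
beta, r2 = fit_standardized(X, y)
print(np.round(beta, 2), round(r2, 3))
```

Working with Z-scores puts all coefficients on a common scale, so their magnitudes can be compared directly across variables measured in different units, such as meters for metro MSD and turns for main road ASD.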

Table 4 Multiple regression on the distribution of pedestrian volumes with 4 variables in individual neighborhoods and all cases together

The R2 values of the model are highlighted for each neighborhood as well as for all cases together. The results show that except for XBL (R2 = 0.3304) and YKD (R2 = 0.1961), the R2 values for the individual neighborhoods are above 0.4. All four variables are significant when putting all samples in one model. However, when looking at the results case by case, there are some differences. For instance, the POI measure is significant in all neighborhoods except for AnShanDao (AShD), Jinwan (JW), XiNanJiao (XNJ), and NanJingLu (NJL) (P value > 0.05). In these four neighborhoods, the distribution of pedestrian volumes tends to depend more on street connectivity.

Generally, syntactical measures are significant, except in BinJiangDao (BJD), HongQiNanLu (HQNL), XiaoBaiLou (XBL), and NanJingLu (NJL) (P value > 0.05). In these cases, the pedestrian volumes are more related to POIs. Comparing the contributions from POIs and syntactical measures, they tend to be an “either/or” pair of variables, as they are correlated in most cases, as shown in the previous analysis. Therefore, in the following section, we focus on the influence of spatial conditions and remove POI from the model. Furthermore, among the three remaining variables (main road ASD, metro MSD, and one space syntax measurement), we select the two variables with stronger significance (lower P value) for the following analysis.

Not surprisingly, removing two variables (especially POI) decreases the R2 value of the model; however, there are still nine neighborhoods with R2 values higher than 0.4. Table 5 shows that the selected syntactical measurement is significant in all cases except HongQiNanLu (HQNL). For the other 12 cases, the coefficient values of syntactical measurements are generally over four times higher than those of the other variable. The exceptional cases are XiaoBaiLou (XBL), NanJingLu (NJL), and NanLou (NL), where the distance from the metro station has a similar impact on the distribution of pedestrian volumes.

Table 5 Multiple regression on distribution of pedestrian volumes with two selected spatial variables

Second, main road ASD is the least selected syntactical variable and is significant in only three cases: BinJiangDao (BJD), HongQiNanLu (HQNL), and XiNanJiao (XNJ). However, if we also consider the cases explained by syntactical measurements with radii larger than 3 km, almost half of the neighborhoods (6 out of 13, marked by yellow background) are still affected by large-scale street connectivity. Main road ASD may perform poorly in many cases because it is based on a large-radius syntactical measurement (NACHr10 km), which is strongly related to large-scale integration or log-choice values; its influence is therefore “overwritten” by other syntactical measurements of large radii.

Third, the syntactical measurements and metric distance from the metro are often selected together as a good combination. When looking at the radii of the syntactical measurements, 7 out of 13 cases are well correlated at small radii, 1000–2000 m. The AnShanDao (AShD), YingKouDao (YKD), NanJingLu (NJL), and WuJiaoYao (WJY) cases are highly correlated at very large radii, 20 km to n. When looking at the type of syntactical measurement, integration is better than log-choice in 5 out of 13 cases, especially at small radii (three cases). As mentioned before, small-radius integration values are often well correlated with both street density and the distribution of POI. This result indicates that in those three cases (BinJiangDao, NanLou, XiNanJiao) the “to-movement” (to the area where POIs concentrate) plays a dominant role over the “through-movement.” Similarly, there are also four cases in which small-scale log-choice performs better. In these four cases, the local street pattern has a great impact on distributing the pedestrians attracted by POIs. For those neighborhoods where pedestrian volumes are well correlated with large-scale syntactical measurements, log-choice performs better than integration. This result suggests that the distribution of pedestrian volume is a complex spatial phenomenon. Although the street pattern has a clear impact in each individual case, there are still variations that cannot be explained by one type of syntactical measurement at one radius.

Fourth, when putting all cases into one model, log-choice r2000 and distance from the metro together generate an R2 of 0.2752. This result is much lower than that of the analysis for vehicle movement. Unlike cars, which travel over longer distances and larger areas, pedestrian movement is restricted to much smaller areas and is more affected by local environments.

Figure 7 illustrates the influence of different variables on pedestrian distribution within the city. It combines all pedestrian observations in one model, starting with four variables and then removing POI and main road ASD in turn. In the four-variable model, POI has the highest coefficient. But as discussed earlier, this is because the POI data are weighted by both metric and angular reach. Using POI as the single variable yields an R2 value of 0.291 (adjusted R2 = 0.269). This partly confirms the previous findings in Atlanta, GA, USA [10], but in the case of Tianjin, the non-residential land use has much less impact. Removing POI from the model causes a 0.1 decrease in R2 value. The three spatial variables have different impacts on pedestrian volumes. Main road ASD and metro MSD represent the major street connectivity at the city scale. Log-choice r2000 represents detailed street connectivity at the local scale. In the three-variable model, the coefficients for main road ASD and metro MSD are 0.043 and − 0.100, indicating that the impact of the metro station location is more important than the accessibility and visibility of city-scale street networks. Removing main road ASD causes only about a 0.01 decrease in R2 value, which also suggests that the city-scale network has a weak impact on local pedestrian movement when all neighborhoods are put in one model.

Fig. 7

Multiple-variant model analysis of four, three, and two variables (above) and the scatter plots of three variables: logChr2000, main road ASD, and metro MSD (below)
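The stepwise comparison summarized in Fig. 7 can be illustrated with a minimal sketch. It fits ordinary least squares models with four, three, and two predictors and reports R2 and adjusted R2 for each. The numbers below are made-up toy values, not data from the Tianjin survey, and the helper names (`solve`, `ols_r2`) and the choice of plain OLS are assumptions made for illustration rather than the procedure used in the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_r2(columns, y):
    """Fit y on an intercept plus the given columns; return (R2, adjusted R2)."""
    n, p = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = p + 1
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return r2, adj


# Toy values for eight hypothetical gate-count locations (illustrative only).
ped = [140, 70, 230, 40, 170, 95, 200, 80]                 # pedestrian volume
poi = [5, 2, 9, 1, 7, 3, 8, 4]                             # aggregated POI count
log_choice_r2000 = [2.0, 1.5, 2.9, 1.1, 2.2, 1.9, 2.6, 1.4]
main_road_asd = [1.2, 3.4, 0.8, 2.9, 1.9, 2.2, 1.1, 3.0]   # angular step depth
metro_msd = [0.3, 0.9, 0.2, 1.2, 0.5, 0.7, 0.25, 1.0]      # metric distance, km

# Start with four variables, then drop POI, then drop main road ASD.
models = {
    "four": [poi, log_choice_r2000, main_road_asd, metro_msd],
    "three": [log_choice_r2000, main_road_asd, metro_msd],
    "two": [log_choice_r2000, metro_msd],
}
results = {name: ols_r2(cols, ped) for name, cols in models.items()}
for name, (r2, adj) in results.items():
    print(f"{name}-variable model: R2={r2:.3f}, adjusted R2={adj:.3f}")
```

Because the models are nested, R2 cannot increase when a variable is removed; the size of each drop is what indicates how much the removed variable contributed.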

5.5 Neighborhood Summary

The previous analysis shows that in at least six neighborhoods, the distribution of pedestrian volume is strongly correlated with a syntactical measurement at a radius smaller than 1.5 km. In six other neighborhoods, the radius of the best correlated syntactical measurement is larger than 5 km, which is clearly beyond the reach of most walking activity. When all samples are combined, a syntactical measurement (log-choice) at a 2 km radius is strongly correlated with pedestrian distribution within cities.
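Identifying the best correlated radius for a neighborhood amounts to computing R2 between pedestrian volumes and the syntactical measurement at each radius and keeping the maximum. The sketch below shows this for a single neighborhood. The values are made up for illustration, and the function names `r_squared` and `best_radius` are hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy * sxy / (sxx * syy)


def best_radius(measure_by_radius, pedestrian_counts):
    """Return (radius, R2) for the radius whose measure best fits the counts."""
    scores = {
        radius: r_squared(values, pedestrian_counts)
        for radius, values in measure_by_radius.items()
    }
    radius = max(scores, key=scores.get)
    return radius, scores[radius]


# Toy values for five street segments in one neighborhood (illustrative only).
counts = [120, 80, 200, 60, 150]
log_choice = {
    1000: [2.1, 1.6, 3.0, 1.2, 2.5],   # log-choice at r = 1000 m
    5000: [3.0, 3.1, 3.0, 2.9, 3.2],   # log-choice at r = 5000 m
}
radius, score = best_radius(log_choice, counts)
print(f"best radius: {radius} m (R2 = {score:.3f})")
```

Repeating this per neighborhood and per measurement type (integration or log-choice) gives the kind of radius-by-radius comparison discussed in this section.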

The syntactical measurements of different scales related to both pedestrian volumes and POIs in different neighborhoods are presented in Fig. 8. The relationship with choice values is shown in red. The relationship with integration values is shown in blue. The heights of the gray background represent the R2 value between pedestrian volumes and POIs for each neighborhood study. Comparing the effectiveness of choice (in red) and integration (in blue), regardless of the chart format, the choice measurement shows a more stable pattern than that of the integration measurement. Based on the curves of log-choice values, these 13 cases could be divided into three groups.

Fig. 8

R2 value between two kinds of syntactical measurements with pedestrian volumes (line chart) and POIs (bar chart)

The curve of group A appears in the shape of a “high heel,” starting with a low correlation at small radii, quickly reaching a peak, then gradually descending as the radius grows. Among the five cases, most neighborhoods are urban centers or have been considered sub-centers for years. JinWan (JW) is an exceptional case because it has an extremely low correlation between POI and pedestrian volumes. In fact, this area was redeveloped in 2014; therefore, very few shops or offices were open at that time. Despite this fact, the distribution of pedestrians still revealed a fairly good correlation with log-choice r1000. This group of neighborhoods demonstrates that the syntactical measurement of street pattern alone can explain the distribution of pedestrians to a certain degree, even when there are almost no active land uses.

The curve of group B appears in the shape of a “climbing” hill, starting with a low correlation at small radii, then gradually rising as the radius grows. Some of these neighborhoods are located near vital urban centers. For example, AnShanDao (AShD), YingKouDao (YKD), and NanJingLu (NJL) are located to the east, south, and west of the BinJiangDao (BJD) neighborhood, respectively. BinJiangDao (BJD) is the busiest shopping center of Tianjin. Apart from their locations, these three neighborhoods are quite similar: even though the distribution of POIs is highly related to syntactical measurements, the correlation between POIs and pedestrian volume (R2 < 0.2) is still relatively low. These areas are highly accessible by car because of their central location, but the shops inside are far less attractive than those in BinJiangDao (BJD). Therefore, pedestrian movement in these cases is mostly concentrated along main roads such as NanJingLu (NJL) and YingKouDao (YKD), appearing as a seepage pattern penetrating the neighborhoods. As a result, large-scale syntactical measurements (such as log-choice r10 km or r25 km) and distance decay from the metro have a large impact on the distribution of pedestrian volumes. Although the other cases of WuJiaYao (WJY), XiNanJiao (XNJ), and HuaYuan (HY) are not located near any dominant urban center, all of them are located near very busy junctions for vehicle traffic. This results in a similar distribution, with pedestrian traffic concentrated on main roads and then penetrating into the individual neighborhoods.

Two neighborhoods show neither the high heel nor the climbing pattern. XiaoBaiLou (XBL) is a famous historical center of Tianjin that still has many shops. However, the correlation between POI and pedestrian volumes is very low (R2 = 0.273) when compared with other centers such as BinJiangDao (BJD) (R2 = 0.597). The correlations with syntactical measurements are also very low at all radii. Only the distance decay from the metro station has a relatively higher coefficient value when compared with log-choice r1000 (see Table 5). The reason is that most shops are concentrated on KaiFengDao road, which is not a syntactically well-connected street at either large or small radii. The mismatch between POIs and the syntactical center makes it difficult to explain the distribution of pedestrians within the neighborhood.

In the case of HongQiNanLu (HQNL), the results are due to its unique location. First, it is a vital residential area with many local shops concentrated on a well-connected local street, YuanZhongLu. Second, it is near a busy intersection of major streets, which attracts many large-scale shops. As a result, there is a positive correlation between POI and pedestrian volumes in this case, but neither the distribution of POI nor that of pedestrian volume shows a good correlation with syntactical measurements. This is because the neighborhood has both a local center and a city-scale center in one area, which makes it a hybrid neighborhood that cannot be explained by any syntactical measurement at a single radius.
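The grouping described in this section can be expressed as a simple rule on the curve of R2 against radius. The sketch below is one possible formalization, not the procedure used to produce Fig. 8: the cutoff `min_peak`, the "first half of the radii" criterion, and the function name `classify_curve` are all assumptions introduced for illustration, and the example curves are invented.

```python
def classify_curve(r2_by_radius, min_peak=0.3):
    """Label a radius -> R2 curve as 'A' (high heel), 'B' (climbing) or 'other'.

    min_peak is a hypothetical cutoff below which no radius is considered
    to explain the pedestrian distribution; it is not a value from the paper.
    """
    radii = sorted(r2_by_radius)
    values = [r2_by_radius[r] for r in radii]
    peak = max(values)
    if peak < min_peak:
        return "other"          # weak correlation at every radius
    peak_index = values.index(peak)
    if peak_index == len(values) - 1:
        return "B"              # still climbing at the largest radius
    if peak_index < len(values) / 2:
        return "A"              # early peak followed by a decline
    return "other"


# Invented curves for three hypothetical neighborhoods (radii in meters).
high_heel = {500: 0.20, 1000: 0.55, 2000: 0.50, 5000: 0.40, 10000: 0.30, 25000: 0.25}
climbing = {500: 0.05, 1000: 0.10, 2000: 0.20, 5000: 0.30, 10000: 0.40, 25000: 0.45}
flat = {500: 0.10, 1000: 0.10, 2000: 0.10, 5000: 0.10, 10000: 0.10, 25000: 0.10}

labels = {
    "high_heel": classify_curve(high_heel),
    "climbing": classify_curve(climbing),
    "flat": classify_curve(flat),
}
print(labels)
```

A rule of this kind leaves hybrid cases such as HongQiNanLu (HQNL) unclassified, which is consistent with the observation that no single radius explains them.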

6 Discussion: What Truly Motivates People to Walk?

This paper starts from the research question of how multiscale street patterns affect pedestrian distribution within the city. Using data from 703 gate count surveys in 13 neighborhoods in Tianjin, we analyzed the correlation of pedestrian volumes with POIs and with street connectivity measured by the space syntax tool. Based on the empirical analysis, we constructed and tested different ways of aggregating POIs and different ways of including vehicle accessibility and visibility in the model.

The modeling process started with a correlation analysis between urban performance data (pedestrian and vehicle movement and POIs) and different spatial variables. The street density measurement of small radii (500 m–1 km) still correlated well with POIs, but not with pedestrian volumes. Pedestrian volumes showed good correlation with syntactical measurements but varied in the types and radii.

In the multiple regression analysis using four variables, when each neighborhood was analyzed separately, POI and syntactical measures were both major variables affecting the pedestrian volume. After removing POI and one less effective spatial variable, the regression model still had strong explanatory power (R2 > 0.4 in 9 out of 13 cases). Among the three spatial variables, syntactical measurements had the highest explanatory power, but the best-performing radius varied between cases. Together with the neighborhoods affected by main road ASD, nearly half of the cases were affected by syntactical measurements beyond walking distance. This finding demonstrates that the scale attributes of syntactical measurements correlate with different scales of movement (vehicle and pedestrian). For pedestrian movement, however, it suggests that the distribution follows at least two patterns: (1) a concentration pattern following the human-scale network, which is attracted by local-scale street connectivity and agglomerations of POIs at radii ranging from 1000 to 2000 m; and (2) a seepage pattern beyond the human-scale network, which is attracted by the interface between city-scale movement (vehicle and metro accessibility) and the local street pattern.

These two patterns reflect two distinct types of walking scenarios. In the neighborhoods dominated by the concentration pattern, the local street networks show clear differences between a few well-connected streets and many poorly connected ones. This follows the explanation described by Jiang [28]. The distribution of active urban functions further enhances the unevenness of the local streets. The pedestrian distribution logic of these neighborhoods is similar to that of many cases in the UK [25] and downtown cases in the United States [10]. In the neighborhoods dominated by the seepage pattern, local street connectivity tends to be uniform across streets, and only large-radius analysis can reveal the differences. Pedestrians are mostly attracted by office towers or an agglomeration of shops along main streets.

When all neighborhoods are put into one model that includes POI and three syntactical measurements, the R2 value increases to 0.38. Among these variables, POI has the highest coefficient when the proposed way of aggregating POIs (POI#200) is used. However, with other ways of aggregating POIs, log-choice r2000 has the highest coefficient. These findings suggest that in an across-city model, the local street pattern plays a vital role in the distribution of pedestrian volumes. The city-scale street network, represented by metro MSD and main road ASD, has only minor effects.

On the bigger question of what motivates people to walk, although POI and local street connectivity proved to be major factors, the city-scale variables (main road ASD and metro MSD) should not be underestimated. As a matter of common sense for planners and developers, changing the local street pattern or land use is relatively easy, since it is merely a local intervention, but changing how a particular area is linked within the city as a whole is far more difficult. It is a strategic decision that has long-term city-wide impacts. In this sense, creating a vital urban place requires a multiscale network strategy that combines the right local street pattern with good access to the metro and main roads in the city.