Elsevier

Applied Geography

Volume 45, December 2013, Pages 131-137
Variation in low food access areas due to data source inaccuracies

https://doi.org/10.1016/j.apgeog.2013.08.014

Highlights

  • We replicated two US food access measures, comparing results from commercial data with results from field census data.

  • Low food access areas vary substantially depending on secondary data sources used.

  • Verification of secondary data with field work is advisable for community efforts.

Abstract

Several spatial measures of community food access identifying so-called “food deserts” have been developed based on geospatial information and commercially available secondary data listings of food retail outlets. It is not known how data inaccuracies influence the designation of Census tracts as areas of low access. This study replicated the U.S. Department of Agriculture Economic Research Service (USDA ERS) food desert measure and the Centers for Disease Control and Prevention (CDC) non-healthier food retail tract measure in two secondary data sources (InfoUSA and Dun & Bradstreet) and in reference data from an eight-county field census covering 169 Census tracts in South Carolina. For the USDA ERS food desert measure, accuracy statistics for the secondary data sources were 94% concordance, 50–65% sensitivity, and 60–64% positive predictive value (PPV). For the CDC non-healthier food retail tract measure, both secondary data sources demonstrated 88–91% concordance, 80–86% sensitivity, and 78–82% PPV. While inaccuracies in secondary data sources used to identify low food access areas may be acceptable for large-scale surveillance, verification with field work is advisable for local community efforts aimed at identifying and improving food access.

Introduction

Neighborhood characteristics have been shown to be associated with community food access, which in turn can influence healthy dietary behaviors (Edmonds et al., 2001, Moore et al., 2008, Morland et al., 2002). A number of studies have shown that low access to healthier food outlets, specifically supermarkets, can contribute to poor diet quality, e.g., lower intake of fruits and vegetables and higher intake of calories from dietary fat (Franco et al., 2009, Laraia et al., 2004, Larson and Story, 2009). Additionally, access to unhealthy food outlets, such as convenience stores and fast food restaurants, also contributes to poor diet quality (Jago et al., 2007, Pearce et al., 2008).

Improving access to healthy and affordable food is an explicit goal of several federal and state policy initiatives in the United States (US), including the Healthy Food Financing Initiative (HFFI), a partnership of the Department of Agriculture (USDA), Department of the Treasury (Treasury), and Department of Health and Human Services (DHHS) (“Health food financing initiative (HFFI)”, 2010). In addition, the Centers for Disease Control and Prevention (CDC) (Center for Disease Control and Prevention, 2011a, Centers for Disease Control and Prevention, 2011c) and a variety of state efforts, such as the Pennsylvania Fresh Food Financing Initiative (FFFI) (Pennsylvania Fresh Food Financing Initiative (FFFI), 2010), have initiated several food environment initiatives. In order to identify areas eligible for the federal support initiatives, several agencies have developed spatial measures of community food access, including the USDA Economic Research Service's (ERS) food desert (FD) (Ver Ploeg et al., 2009) and CDC's healthier food retail tract (HFRT) (“Children's Food Environment State Indicator Report”, 2011b; “State Indicator Report on Fruits and Vegetables, 2009”, 2009). While the CDC's measure focuses on HFRT, its counterpart, the non-healthier food retail tract (NHFRT), provides a measure of low access similar to the USDA's FD.

Each of these measures of community food access was operationalized using geographic information system (GIS)-based approaches (Hubley, 2011, McEntee and Agyeman, 2010). These approaches relied on different sources of secondary data to locate and classify retail food outlets. For instance, USDA ERS used the database of stores authorized to receive Supplemental Nutrition Assistance Program (SNAP) benefits and data from Trade Dimensions TDLinx (New York, NY) in 2006 to define the FD (U.S. Department of Agriculture, 2013, U.S. Department of Agriculture, 2011). CDC used the Dun & Bradstreet (D&B) data (Short Hills, NJ) for locations of supermarkets in 2007 to define HFRT (“State Indicator Report on Fruits and Vegetables, 2009”, 2009). In recent years, the validity of secondary retail food data sources has been evaluated in different geographic settings in studies using ground-truthed field census validation (Fleischhacker et al., 2012, Gustafson et al., 2012, Liese et al., 2010, Liese et al., 2013, Powell et al., 2011). In such studies, the ground-truthed field census has been treated as the gold standard for measuring the food environment, and validity measures for the secondary data sources were estimated against it. Although specific findings varied across studies, all of them reported that secondary data sources such as Dun & Bradstreet and InfoUSA (Omaha, NE) contain substantial amounts of error, including undercounts and overcounts of outlets, geospatial inaccuracies, and incorrect assignments of store types (Fleischhacker et al., 2012, Gustafson et al., 2012, Liese et al., 2010, Liese et al., 2013, Powell et al., 2011). Errors in these secondary data may introduce bias into studies focusing on individual behaviors and may also affect policy-level food environment indicators such as FD and NHFRT. To date, very little is known about how inaccuracies in secondary data sources may influence the designation of a Census tract as an area of low food access.

The purpose of this study was to examine the variation in designation of low food access areas due to data source inaccuracies and to quantify the magnitude and direction of the inaccuracies. This study identified low access areas according to two agency-developed community food access measures (FD and NHFRT), using two secondary data sources (D&B and InfoUSA) and data from a validated field census.
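As a rough illustration of how tract-level designations from a secondary source can be compared with a reference, the sketch below computes concordance, sensitivity, and PPV from sets of tract identifiers. It is a minimal sketch, not the authors' analysis code: the function name `accuracy_stats`, the tract IDs, and the toy data are hypothetical, and concordance is assumed here to mean simple percent agreement across all tracts.

```python
from typing import Dict, Set


def accuracy_stats(
    all_tracts: Set[str],
    reference_low: Set[str],
    secondary_low: Set[str],
) -> Dict[str, float]:
    """Compare low-access designations from a secondary source with a reference.

    Assumes both designation sets are subsets of all_tracts.
    """
    tp = len(reference_low & secondary_low)   # low access in both sources
    fp = len(secondary_low - reference_low)   # low access in secondary data only
    fn = len(reference_low - secondary_low)   # low access in reference data only
    tn = len(all_tracts) - tp - fp - fn       # not low access in either source
    return {
        # Assumption: concordance is the share of tracts with matching designations.
        "concordance": (tp + tn) / len(all_tracts),
        # Share of reference low-access tracts that the secondary source also flags.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of secondary-flagged tracts that the reference confirms.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Hypothetical toy data: 10 tracts, 4 designated low access by each source.
tracts = {f"T{i:02d}" for i in range(1, 11)}
field_census_low = {"T01", "T02", "T03", "T04"}
secondary_low = {"T01", "T02", "T03", "T05"}

stats = accuracy_stats(tracts, field_census_low, secondary_low)
print(stats)
```

In this toy case both sources flag the same number of tracts, yet sensitivity and PPV are each 0.75 because one flagged tract differs, which mirrors the general point that similar counts do not imply the same tracts are identified.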

Study area

The study area included eight contiguous counties in the Midlands of South Carolina (Fig. 1). The area covers approximately 5,575 square miles and includes a population of more than 720,000, which accounts for about 16% of the total population of South Carolina. Geographically, according to the 2010 U.S. Census, this area includes 169 Census tracts.

Field census on food outlets (reference data)

A field census of retail food outlets that included direct observation and verification of all food outlets using global positioning systems (GPS)

Results

The number of food outlets by type is shown according to data source in Table 1. The reference data (field census) identified fewer supermarkets and fruit and vegetable markets but more supercenters than were listed in either secondary data source. It also included outlets in the categories warehouse club and large grocery store, which were not distinguishable in the secondary data sources because of a lack of specific NAICS codes.

Compared to the reference data, D&B data identified fewer tracts as low

Discussion

In this study, secondary data sources such as D&B and InfoUSA identified a similar number of low food access areas compared with field census data; however, the low food access areas were not the same across data sources. A much lower percentage of Census tracts was designated as low food access areas by the USDA ERS FD definition than by the CDC NHFRT definition. Compared to reference data, secondary data sources had good to excellent concordance for both FD and NHFRT, and had moderate

Conclusions

Our results suggest that Census tracts identified as having low food access vary substantially depending on the secondary data source used and the particular community food access measure chosen. The amount and direction of error introduced due to using secondary data sources is not acceptable to designate the USDA ERS FD and is probably acceptable to designate the CDC NHFRT if these community food access measures are used for large-scale surveillance purposes, e.g., to estimate the size of

Funding information

This work was supported by a grant from the RIDGE Center for Targeted Studies at the Southern Rural Development Center at Mississippi State University. The food environment data were funded by NIH 1R21CA132133. The contents of this article are solely the responsibility of the authors and do not necessarily represent the official views of the RIDGE Center for Targeted Studies or the National Cancer Institute or the National Institutes of Health.

Authors' contributions

XM conducted statistical analyses and drafted the manuscript; SB provided geographic expertise; BB provided statistical expertise; JH participated in acquisition of data, geocoded the data and conducted GIS-based data management; TB participated in collecting and managing the field census data; AL wrote the funding application, developed the idea for this manuscript, acquired and interpreted the data. All authors reviewed and edited the manuscript, and approved the final version of the

Conflict of interests

None.

References (29)

  • Centers for Disease Control and Prevention. Children's food environment state indicator report. (2011)

  • Centers for Disease Control and Prevention. Communities putting prevention to work. (2011)

  • S.E. Fleischhacker et al. Evidence for validity of five secondary data sources for enumerating retail food outlets in seven American Indian communities in North Carolina. The International Journal of Behavioral Nutrition and Physical Activity (2012)

  • A. Gustafson et al. Validation of food store environment secondary data source and the role of neighborhood deprivation in Appalachia, Kentucky. BMC Public Health (2012)