Features of the built environment are increasingly being recognised as potentially important determinants of obesity. This has come about, in part, because of advances in methodological tools such as Geographic Information Systems (GIS). GIS has made the procurement of data related to the built environment easier and given researchers the flexibility to create a new generation of environmental exposure measures such as the travel time to the nearest supermarket or calculations of the amount of neighbourhood greenspace. Given the rapid advances in the availability of GIS data and the relative ease of use of GIS software, a glossary on the use of GIS to assess the built environment is timely. As a case study, we draw on aspects of the food and physical activity environments, as they might apply to obesity, to define key GIS terms related to data collection, concepts, and the measurement of environmental features.
The role of the built environment in explaining the spatial patterning of obesity has recently received considerable attention in the public health and epidemiology literature [1, 2]. The built environment comprises urban design, land use, and transportation systems [3]. Research in this field has shown that features of the built environment exert an influence on physical and mental health as well as health behaviours, independently of the socio-demographic characteristics of the people living in these places [4–6]. For instance, researchers have evaluated whether aspects of the food environment including access to supermarkets, convenience stores, and fast food outlets are associated with body mass index (BMI) [7, 8]. Similarly, other features of the built environment that influence obesity through the promotion of physical activity include street connectivity, transport infrastructure, and the location and quality of community resources (e.g. parks and schools) [9, 10]. Built environments that encourage unhealthy eating or are not conducive to physical activity are often termed obesogenic [11].
Public health researchers with an interest in the built environment have benefited from the emergence of Geographic Information Systems (GIS) technology [12]. GIS offers the opportunity to integrate spatial information from a range of disparate sources into a single framework, and to use these data to develop precise measures of the built environment. The tools available within a GIS also enable precise spatial measures to be derived such as the road distance from a household location to the nearest supermarket or calculations of the amount of neighbourhood greenspace.
This glossary introduces users unfamiliar with GIS to key terminology and to some of the ways in which GIS can be utilised to measure and represent features of the built environment that may relate to obesity, and highlights some basic methodological issues. The terms covered are restricted to those where GIS has assisted, or has the potential to assist, in developing more precise measures of the built environment. Text in italics refers to terms defined elsewhere in the glossary. Terms are divided into three key categories: 1) data collection; 2) concepts; and 3) measurement.
Data collection
Data acquisition
One of the greatest challenges facing GIS users is the acquisition of detailed data sources that contain locational and attribute information on the built environment. Spatial data can be acquired using primary or secondary data collection methods. Primary data are often collected using two common methods: 1) "psychometric" [13–15], based on surveys of individuals who report on characteristics of the environmental feature of interest; and/or 2) "ecometric" [16, 17], through direct or "systematic social" observations undertaken by fieldwork auditors who visit neighbourhoods to make observations or to complete an audit tool [18]. More recently, tools that enable the direct integration of collected spatial data into GIS have been developed, including Global Positioning Systems (GPS) [19] and remote sensing (data captured remotely using satellites to identify green space, topography, etc.). Secondary spatial data are collected by external sources and include administrative data (e.g. from a census), commercial data (e.g. from market research companies), internet resources (e.g. company websites or Google Street View), and phone directories (e.g. Yellow Pages). Commercial data are increasingly being acquired by researchers as a key data source for identifying features of the built environment [20–22]. Compared to primary data, these and other secondary data sources may be relatively cost-effective to obtain and can usually be sourced for specific study areas or across a large geographical area (e.g. nationwide). Where secondary data are utilised, it is important to record the steps taken in this process (in the form of metadata) so that future users can accurately interpret and use these data and so that the process can be replicated by other researchers. A key drawback of secondary data sources is that they are often not designed for the analytical purposes for which they are being used and therefore may not entirely meet the needs of the researcher.
Therefore, to ensure their accuracy, validation against primary data is often preferable. Discordance between data collected in the field (primary data) and secondary data is mainly due to three possible errors [23]: 1) facilities included in the commercial database are not found in the field; 2) facilities are included in the commercial database but not considered to be the same service type when identified in the field; and 3) facilities found in the field were not in the commercial database. Specific results on the accuracy of secondary data sources have previously been reported for physical activity facilities [23, 24] and the food environment [24–27]. To summarise, findings suggest that most sources of secondary data have sufficient error to potentially introduce bias into analyses. Both primary and secondary data often require manual geocoding to transpose the data into a GIS-compatible format.
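The three sources of discordance listed above can be tallied with simple set operations once field and commercial records share a common identifier. The following is a minimal sketch under stated assumptions: the facility identifiers, the service-type labels, and the function name classify_discordance are hypothetical and are not taken from the cited validation studies, and real datasets would usually need a record-linkage step before such a comparison is possible.

```python
def classify_discordance(field, commercial):
    """Compare field-audit records with a commercial database.

    Both arguments map a (hypothetical) facility identifier to a
    service-type label. Returns the identifiers falling into each of the
    three error categories, plus the records on which both sources agree.
    """
    field_ids = set(field)
    commercial_ids = set(commercial)
    in_both = field_ids & commercial_ids
    return {
        # 1) listed in the commercial database but not found in the field
        "listed_not_found": sorted(commercial_ids - field_ids),
        # 2) present in both, but the service type differs
        "type_mismatch": sorted(f for f in in_both if field[f] != commercial[f]),
        # 3) found in the field but missing from the commercial database
        "found_not_listed": sorted(field_ids - commercial_ids),
        # records on which the two sources agree
        "agree": sorted(f for f in in_both if field[f] == commercial[f]),
    }


# Illustrative (made-up) records
field_audit = {"A": "supermarket", "B": "convenience store", "D": "fast food"}
commercial_db = {"A": "supermarket", "B": "supermarket", "C": "gym"}
result = classify_discordance(field_audit, commercial_db)
```

Counts in each category could then be reported alongside agreement statistics when describing the validity of a secondary data source.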
Geocoding
Geocoding is the process of matching raw address information (e.g. the household addresses of study participants or the addresses of neighbourhood resources such as supermarkets) with a digital spatial dataset that includes all addresses within the area of interest mapped to latitude and longitude coordinates [28]. Geocoding is often preceded by data acquisition whereby data are acquired from primary or secondary sources. Geocoding is prone to a number of errors which can bias estimates of the associations between the built environment and health [23, 29, 30]. The first source of error relates to the match rate, which is the percentage of addresses that are successfully geocoded. Higher match rates are achieved when the raw address file is accurate and the digital dataset is comprehensive and regularly updated. Low match rates may occur because of incomplete address information and errors such as incorrect street suffixes and misspelt street names, suburbs, or postal area information. Second, even when high match rates are achieved, addresses may be geocoded to the incorrect location. This error may arise because of inaccuracies in the raw address and spatial digital files or because of the program settings (i.e. the criteria used to define a match, such as sensitivity to the spelling of street names).
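As an illustration of the match rate described above, the sketch below computes the percentage of addresses that received coordinates. It assumes a hypothetical results format (a list of address and coordinate pairs, with None marking an unmatched address); the addresses and coordinates are invented, and no particular geocoding software is implied.

```python
def match_rate(results):
    """Percentage of addresses that were successfully geocoded.

    `results` is a list of (address, coordinates) pairs, where coordinates
    is a (latitude, longitude) tuple, or None when no match was found.
    """
    if not results:
        raise ValueError("no addresses supplied")
    matched = sum(1 for _, coords in results if coords is not None)
    return 100.0 * matched / len(results)


# Illustrative (made-up) geocoding output
geocoded = [
    ("12 High St", (-37.81, 144.96)),
    ("4 Station Rd", None),            # incomplete address, no match
    ("99 Park Ave", (-37.80, 144.97)),
    ("7 Mian Street", None),           # misspelt street name, no match
]
rate = match_rate(geocoded)
```

The match rate captures only the first source of error; a matched address may still have been placed at the wrong location, so positional accuracy needs to be checked separately.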
Global Positioning System (GPS)
A Global Positioning System (GPS) receiver is a device that uses a satellite system to pinpoint a stationary location on the earth to a latitude and longitude coordinate. In environment and health work, it is a valuable tool for field auditors that can facilitate the accurate and precise primary data acquisition of the location of features within the built environment such as food stores, parks or outdoor advertising [31]. GPS devices also enable investigators to track the mobility patterns of individuals through the environment to develop measures of their travel routes and activity spaces [32]. These technologies have recently been coupled with devices such as accelerometers (that provide objective measures of physical activity) so that the precise location where the physical activity is occurring is also captured [33, 34]. Given the high cost of the equipment, these data are often costly to collect, especially when seeking sufficient numbers to power epidemiological analyses. Further, GPS technologies are at the developmental stage and challenges remain, including signal loss, slow location detection, precision of the device, battery power, and study participants forgetting to switch on the device. These factors may affect the completeness and accuracy of the GPS data. However, to aid new users, data collection and cleaning protocols to reduce the severity of these potential issues have been developed [19, 34, 35].
Concepts
Accessibility
Accessibility refers to the ease of access to a particular neighbourhood feature, with more accessible destinations having lower travel costs in terms of distance, time, and/or financial resources [36]. Accessibility to built environment features is not only determined by their distribution across space but also by mobility factors such as private vehicle ownership or public transportation networks [36–38]. Handy and Niemeier [36] suggest three categories of accessibility measures: 1) cumulative opportunity measures, which are simply counts of features within a given distance with an equal weight applied to all occurrences of a specific feature; 2) gravity-based models, where features are weighted by factors such as the size of the destination or travel cost; and 3) random utility-based measures, where theory is used to inform the probability of an individual making a particular choice depending on the attributes assigned to that choice (e.g. attractiveness of the destination or potential travel barriers) relative to all choices. An alternative accessibility measure that incorporates a temporal dimension has been proposed by Kwan [39]. 'Space-time measures' incorporate the constraints imposed by the fixed locations an individual must visit during the day (e.g. location of workplace, child's school) when determining potential accessibility to discretionary locations that an individual may visit (e.g. supermarket) (see also activity space). Greater locational access to neighbourhood features may improve or worsen the health-related behaviours of local residents. For example, high levels of accessibility to a greengrocer or large supermarket may better enable the purchase of fresh fruits and vegetables, while greater accessibility to outlets selling fast food may encourage the consumption of fast food at levels that are damaging to health.
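The first two categories of accessibility measure can be sketched in a few lines. The example below is illustrative only: the function names, the negative-exponential distance decay, the decay parameter beta, and the store sizes and distances are all assumptions chosen for demonstration, and are not prescribed by Handy and Niemeier [36].

```python
import math


def cumulative_opportunity(distances_km, threshold_km):
    """Count of features within a threshold distance, each weighted equally."""
    return sum(1 for d in distances_km if d <= threshold_km)


def gravity_accessibility(destinations, beta=1.0):
    """Gravity-style accessibility score.

    `destinations` is an iterable of (size, distance_km) pairs. Each
    destination contributes its size discounted by distance. The
    negative-exponential decay exp(-beta * distance) is one common choice
    and is an assumption here; beta would normally be calibrated.
    """
    return sum(size * math.exp(-beta * d) for size, d in destinations)


# Illustrative (made-up) stores: (relative size, distance from home in km)
stores = [(1.0, 0.5), (3.0, 2.0), (2.0, 5.0)]

count_2km = cumulative_opportunity([d for _, d in stores], threshold_km=2.0)
gravity = gravity_accessibility(stores, beta=1.0)
```

In this sketch the cumulative opportunity measure treats the two stores within 2 km identically, whereas the gravity score lets the large store at 2 km contribute less than its size alone would suggest because of its greater travel cost.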
Traditionally, accessibility has been measured rather simply through the presence or absence of a resource in a particular locality because these data were readily available. These measures assume equal exposure for each person within the area unit irrespective of where they live in that unit, the amount of time that they spend in the area, and their ability to travel within and beyond the boundary of the administrative unit. GIS has improved measurements of accessibility by enabling the creation of more refined individual-level metrics such as density within buffers from a household location, proximity based on network distance, activity spaces, and continuous surfaces of accessibility such as kernel density estimation.
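Some of these individual-level metrics can be sketched as follows. The example makes several simplifying assumptions: coordinates are taken to be in a projected system measured in metres, straight-line distance stands in for network distance (which would require a road network dataset), the quartic kernel is one of several possible kernel functions, and all function names and locations are hypothetical. It requires Python 3.8 or later for math.dist.

```python
import math


def count_within_buffer(home, outlets, radius_m):
    """Number of outlets inside a circular buffer around the home."""
    return sum(1 for o in outlets if math.dist(home, o) <= radius_m)


def nearest_distance(home, outlets):
    """Straight-line distance (metres) from the home to the closest outlet."""
    return min(math.dist(home, o) for o in outlets)


def kernel_density(home, outlets, bandwidth_m):
    """Quartic-kernel density of outlets (per square metre) at the home.

    Outlets closer to the home contribute more; outlets at or beyond the
    bandwidth contribute nothing.
    """
    scale = 3.0 / (math.pi * bandwidth_m ** 2)
    total = 0.0
    for o in outlets:
        u = math.dist(home, o) / bandwidth_m
        if u < 1.0:
            total += scale * (1.0 - u * u) ** 2
    return total


# Illustrative (made-up) projected coordinates in metres
home = (0.0, 0.0)
outlets = [(300.0, 400.0), (600.0, 800.0), (0.0, 1500.0)]

n_in_buffer = count_within_buffer(home, outlets, radius_m=1200.0)
nearest = nearest_distance(home, outlets)
density = kernel_density(home, outlets, bandwidth_m=2000.0)
```

Unlike a presence or absence indicator for an administrative unit, each of these values is specific to the household location and changes as the household moves within the unit.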
