2. Transport Performance: An Overview
An overview of using the transport_performance package to calculate the transport performance of urban centre public transit networks.
This page discusses the main methods and tools used within the package and provides links to additional resources for further reading. In particular, it presents a methodology for assessing the performance of urban centre public transit networks using transport_performance. It is possible, however, to modify and extend this approach to suit the requirements of most transport analyses by varying the:
- Analysis area (no strict requirement on using Eurostat’s urban centre definition)
- Date of analysis
- Time of day
- Transport modes such as walking, cycling, public transit, private car or a combination of these modes
- Maximum journey duration
This page does not cover retrieving input data or transport_performance API usage. See the how-to, tutorials, and API reference pages for more information on these aspects. Note that transport_performance will work with any custom boundary provided, in which case urban centre detection is not required. Similarly, public transit schedule preprocessing is not required for modalities other than public transit.
transport_performance can be used to assess urban centre public transit performance by following the overall approach shown in Figure 1.
The process starts with urban centre detection. The urban centre definition was created by Eurostat and represents high-density population clusters (see the Eurostat level 1 degree of urbanisation methodology document for more details). In short, an urban centre is a cluster of contiguous 1 km² grid cells, each with a density of at least 1,500 inhabitants per km², that together have a total population of at least 50,000. This definition is advantageous because it can be applied consistently across countries.
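The two thresholds in this definition can be illustrated with a small flood fill over a population grid. This is a minimal sketch, not the package's implementation: the function name `detect_urban_centres`, the nested-list grid layout, and the choice of edge-sharing (4-neighbour) contiguity are all assumptions made for illustration, and any further refinement steps in the full Eurostat methodology are left out.

```python
from collections import deque

MIN_DENSITY = 1_500      # inhabitants per km2; cells are 1 km2, so density equals population
MIN_POPULATION = 50_000  # minimum total inhabitants for a cluster to qualify


def detect_urban_centres(grid):
    """Return qualifying clusters, each as a set of (row, col) cells.

    Hypothetical helper: `grid` is a list of rows of population counts
    for 1 km2 cells. Contiguity is assumed to mean sharing an edge.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    seen = set()
    centres = []
    for r in range(n_rows):
        for c in range(n_cols):
            if (r, c) in seen or grid[r][c] < MIN_DENSITY:
                continue
            # Flood fill across neighbouring cells that meet the density threshold.
            cluster, queue = set(), deque([(r, c)])
            seen.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                cluster.add((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (
                        0 <= nr < n_rows
                        and 0 <= nc < n_cols
                        and (nr, nc) not in seen
                        and grid[nr][nc] >= MIN_DENSITY
                    ):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            # Keep the cluster only if its total population is large enough.
            if sum(grid[i][j] for i, j in cluster) >= MIN_POPULATION:
                centres.append(cluster)
    return centres


# Made-up example: one large dense cluster (top left) and one small one (right).
example_grid = [
    [20_000, 18_000, 100, 2_000],
    [15_000, 200, 100, 1_600],
    [300, 100, 100, 500],
]
example_centres = detect_urban_centres(example_grid)
```

In this made-up grid, the three cells in the top-left corner form a cluster of 53,000 inhabitants and qualify, while the two dense cells on the right total only 3,600 and are discarded.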
transport_performance currently works with gridded population estimates. One such data source is the Global Human Settlement Layer (GHSL). The GHSL-POP layer provides high-resolution estimates with worldwide coverage. It combines satellite imagery and national census data to produce population estimates down to 100 metre grids (see section 2.5 of the GHSL technical paper for more details). Using transport_performance, it is also possible to reaggregate gridded population estimates (e.g. from 100m to 200m grids) to balance the granularity of the results against performance at the transport network routing stage.
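To make the reaggregation idea concrete, the sketch below sums population counts over non-overlapping blocks of cells, so a factor of 2 turns a 100m grid into a 200m grid. The function name `reaggregate` and the nested-list grid are assumptions for illustration only; the package's own reaggregation operates on raster inputs and may differ in detail.

```python
def reaggregate(grid, factor=2):
    """Sum population counts over non-overlapping factor x factor blocks.

    Hypothetical helper: counts are summed rather than averaged so that
    the total population is preserved at the coarser resolution.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    return [
        [
            sum(
                grid[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            )
            for c in range(0, n_cols, factor)
        ]
        for r in range(0, n_rows, factor)
    ]


# Made-up 4 x 4 grid of 100m cells, reaggregated to a 2 x 2 grid of 200m cells.
fine = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
coarse = reaggregate(fine, factor=2)
```

With a factor of 2 the number of cells falls by a factor of 4, so the number of origin-destination pairs considered at the routing stage falls by roughly a factor of 16.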
When considering public transit performance, schedule data is a core input (for other modalities this step is not required). Data in the widely adopted General Transit Feed Specification (GTFS) format are required for defining the public transit network within transport_performance. Because these are scheduled data, the effects of delays (such as traffic) are not accounted for in the final transport performance results. transport_performance provides a range of GTFS validation, cleaning, and filtering methods to pre-process the inputs for use during the transport network routing stage.
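As an illustration of the kind of filtering involved, the sketch below selects trips with at least one departure inside a time window. It does not use the package's API: `gtfs_time_to_seconds` and `trips_in_window` are hypothetical helpers, and the rows mimic records from a GTFS `stop_times.txt` file using its `trip_id` and `departure_time` fields.

```python
def gtfs_time_to_seconds(value):
    """Convert a GTFS HH:MM:SS string to seconds after the service day start.

    GTFS permits hours of 24 or more for trips that run past midnight,
    so values are parsed by hand instead of with a clock-time parser.
    """
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def trips_in_window(stop_times, start, end):
    """Return the IDs of trips with at least one departure in [start, end]."""
    lo, hi = gtfs_time_to_seconds(start), gtfs_time_to_seconds(end)
    return {
        row["trip_id"]
        for row in stop_times
        if lo <= gtfs_time_to_seconds(row["departure_time"]) <= hi
    }


# Made-up rows in the shape of stop_times.txt records.
example_stop_times = [
    {"trip_id": "A", "departure_time": "07:55:00"},
    {"trip_id": "A", "departure_time": "08:10:00"},
    {"trip_id": "B", "departure_time": "09:30:00"},
    {"trip_id": "C", "departure_time": "25:15:00"},
]
kept = trips_in_window(example_stop_times, "08:00:00", "09:00:00")
```

Restricting a feed to the analysis window in this way reduces the amount of schedule data that the routing stage has to load.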
The underlying route network is built using OpenStreetMap (OSM) data. OSM is an open, community-maintained source of map data worldwide. OSM data provides the spatial information about the street network, such as road and pathway locations, speed limits, transport rules and junction locations. With transport_performance, it is possible to optimise these data by spatially filtering OSM files to an area of interest (using Osmosis). This filtering also removes OSM features that are not required for transport routing (such as buildings and waterways).
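For orientation, a bounding-box crop with the Osmosis command-line tool can be assembled as shown below. This is a sketch under stated assumptions rather than the command the package runs: the helper name `osmosis_bbox_command`, the file names, and the coordinates are hypothetical, and the tag filtering that removes features such as buildings is not shown.

```python
import shlex


def osmosis_bbox_command(in_pbf, out_pbf, left, bottom, right, top):
    """Build an Osmosis command that crops a PBF file to a bounding box.

    Hypothetical helper: it only assembles the argument list. Running it
    requires Osmosis to be installed and available on the PATH.
    """
    return [
        "osmosis",
        "--read-pbf", f"file={in_pbf}",
        "--bounding-box",
        f"left={left}", f"bottom={bottom}", f"right={right}", f"top={top}",
        "--write-pbf", f"file={out_pbf}",
    ]


# Placeholder file names and coordinates (longitude/latitude in degrees).
cmd = osmosis_bbox_command("region.osm.pbf", "cropped.osm.pbf", -3.1, 51.5, -2.9, 51.7)
printable = shlex.join(cmd)  # shell-safe string, for logging or inspection
```

The argument list could be passed to `subprocess.run(cmd, check=True)` to execute the crop once Osmosis is installed.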
The transport network routing stage calculates the feasible journey travel times over multiple departure times. transport_performance uses R5py to undertake performant transit routing with the Round-Based Public Transit Routing engine (RAPTOR). R5py is also highly configurable and caters for a range of transport modalities, including public transit, private car, cycling, and walking. This approach improves upon the ONS Data Science Campus’ previous transport modelling work by calculating robust median travel times over many journeys. The travel duration calculated for a single departure time can vary significantly, depending on public transport service availability in the locality of the journey. Travel time statistics are therefore calculated across multiple consecutive journeys within a given time window. These statistics are a fairer representation of average journey travel times within a given area. For more details, see Fink, Klumpenhouwer, Saraiva, Pereira, and Tenkanen (2022) and Conway, Byrd, and van der Linden (2017).
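The benefit of summarising over several departure times can be seen with a small worked example. The sketch below is illustrative only and does not call R5py: `median_travel_time` is a hypothetical helper, the travel times are made up, and treating departures with no feasible journey as infinitely long is an assumption of this sketch.

```python
import math
from statistics import median


def median_travel_time(minutes_by_departure):
    """Median travel time across departures for one origin-destination pair.

    Hypothetical helper: a value of None marks a departure with no feasible
    journey and is treated as infinitely long (an assumption of this sketch).
    Returns None when the median itself is unreachable.
    """
    values = [math.inf if t is None else t for t in minutes_by_departure]
    if not values:
        return None
    result = median(values)
    return None if math.isinf(result) else result


# Made-up travel times (minutes) for five consecutive departure times.
times = [32, 25, 41, 25, 33]
typical = median_travel_time(times)

# One infeasible departure out of three still leaves a finite median.
with_gap = median_travel_time([30, 20, None])
```

A single departure in this example would report anything from 25 to 41 minutes depending on how it lines up with the timetable, whereas the median of 32 minutes is stable across the window.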
The final stage uses the network routing results (travel times) to calculate the transport performance. See the Transport Performance: A Definition page for more details on this step.
For more information on the known transport_performance package limitations, see the limitations and caveats page.