index | type | message | table | rows | GTFS |
---|---|---|---|---|---|
0 | error | Invalid route_type; maybe has extra space char... | routes | [1, 2, 3, 4] | /opt/hostedtoolcache/Python/3.11.9/x64/lib/pyt... |
1 | warning | Unrecognized column agency_noc | agency | [] | /opt/hostedtoolcache/Python/3.11.9/x64/lib/pyt... |
2 | warning | Feed expired | calendar | [] | /opt/hostedtoolcache/Python/3.11.9/x64/lib/pyt... |
3 | warning | Repeated pair (route_short_name, route_long_name) | routes | [13] | /opt/hostedtoolcache/Python/3.11.9/x64/lib/pyt... |
4 | warning | Unrecognized column stop_direction_name | stop_times | [] | /opt/hostedtoolcache/Python/3.11.9/x64/lib/pyt... |
5 | warning | Unrecognized column platform_code | stops | [] | /opt/hostedtoolcache/Python/3.11.9/x64/lib/pyt... |
6 | warning | Unrecognized column trip_direction_name | trips | [] | /opt/hostedtoolcache/Python/3.11.9/x64/lib/pyt... |
7 | warning | Unrecognized column vehicle_journey_code | trips | [] | /opt/hostedtoolcache/Python/3.11.9/x64/lib/pyt... |
Explanation
These explanation pages provide an understanding of the assess-gtfs package.
assess-gtfs allows users to validate, clean, inspect and filter transit timetable data in the General Transit Feed Specification (GTFS) format.
What is GTFS?
GTFS files are compressed zip archives of text files. Each text file contains information about one aspect of the timetable, such as routes, trips, the service calendar or stop locations. Various transport modelling software packages can use these files as a relational database in order to undertake routing operations.
Below are the file contents of a small sample of UK GTFS.
.../tests/data/chester-20230816-small_gtfs/
├── agency.txt
├── calendar.txt
├── calendar_dates.txt
├── feed_info.txt
├── routes.txt
├── shapes.txt
├── stop_times.txt
├── stops.txt
└── trips.txt
1 directory, 9 files
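To illustrate the "zip archive of text files" idea, here is a minimal sketch that uses only the Python standard library. It does not use assess-gtfs. The archive is built in memory, and its rows, identifiers and the `read_table` helper are made up for illustration.

```python
import csv
import io
import zipfile

# Build a tiny, made-up GTFS-like archive in memory so the example runs
# without any files on disk. The rows are illustrative, not real data.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "routes.txt",
        "route_id,agency_id,route_short_name,route_type\n"
        "R1,A1,1,3\n"
        "R2,A1,X5,200\n",
    )
    archive.writestr(
        "trips.txt",
        "route_id,service_id,trip_id\n"
        "R1,S1,T1\n"
        "R1,S1,T2\n"
        "R2,S1,T3\n",
    )


def read_table(archive, name):
    """Read one text file from the archive as a list of dict rows."""
    with archive.open(name) as handle:
        # utf-8-sig tolerates a byte order mark at the start of the file.
        text = io.TextIOWrapper(handle, encoding="utf-8-sig", newline="")
        return list(csv.DictReader(text))


buffer.seek(0)
with zipfile.ZipFile(buffer) as archive:
    table_names = sorted(archive.namelist())
    routes = read_table(archive, "routes.txt")
    trips = read_table(archive, "trips.txt")

# trips.txt refers to routes.txt through route_id, much like a foreign key.
trips_per_route = {
    route["route_id"]: sum(t["route_id"] == route["route_id"] for t in trips)
    for route in routes
}
print(table_names)
print(trips_per_route)
```

The shared `route_id` column is what lets software treat the separate text files as related tables.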
Working with GTFS
If you would prefer a demonstration of assess-gtfs, please follow the tutorial.
Filtering GTFS
When undertaking routing operations with GTFS, you typically need to filter large feeds to an area of interest. A smaller feed reduces the time and memory needed to build a transport network with a package such as r5py. Feeds can be restricted by location using a bounding box. They can also be restricted to a date or list of dates within the feed calendar. For more on filtering GTFS, please see the assess-gtfs API docs.
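The idea behind bounding box filtering can be sketched in plain Python. This is not the assess-gtfs implementation: the `filter_to_bbox` helper, the records and the coordinates are all hypothetical, and the box is assumed to be ordered as (min_lon, min_lat, max_lon, max_lat).

```python
# Hypothetical stop records; identifiers and coordinates are illustrative.
stops = [
    {"stop_id": "S1", "stop_lat": 53.19, "stop_lon": -2.89},
    {"stop_id": "S2", "stop_lat": 53.20, "stop_lon": -2.88},
    {"stop_id": "S3", "stop_lat": 51.50, "stop_lon": -0.12},
]
stop_times = [
    {"trip_id": "T1", "stop_id": "S1"},
    {"trip_id": "T1", "stop_id": "S2"},
    {"trip_id": "T2", "stop_id": "S3"},
]


def filter_to_bbox(stops, stop_times, bbox):
    """Keep stops inside bbox and the trips that call at any of them.

    bbox is assumed to be (min_lon, min_lat, max_lon, max_lat).
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    kept_stops = [
        s
        for s in stops
        if min_lon <= s["stop_lon"] <= max_lon
        and min_lat <= s["stop_lat"] <= max_lat
    ]
    kept_ids = {s["stop_id"] for s in kept_stops}
    kept_trips = sorted(
        {st["trip_id"] for st in stop_times if st["stop_id"] in kept_ids}
    )
    return kept_stops, kept_trips


bbox = (-3.0, 53.1, -2.8, 53.3)
kept_stops, kept_trips = filter_to_bbox(stops, stop_times, bbox)
print([s["stop_id"] for s in kept_stops], kept_trips)
```

A real filter would also need to cascade the selection to the remaining tables (routes, calendar, shapes and so on) so that the feed stays internally consistent.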
Inspecting GTFS
Routing analysis tends to be undertaken for a specific location and time or time window. It is therefore important to assess how services are distributed over the available dates within the GTFS. GTFS feeds tend to come with a range of calendar dates, but the service volume across those dates can vary, depending upon the publication frequency of the specific feed.
The objective is to ensure a selected time of analysis is representative of average service volume within the feed. For a guide to doing this with assess-gtfs, please see the tutorial section on summarising GTFS.
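As a rough sketch of what assessing service volume involves, the snippet below counts scheduled trips per date from calendar-style records. It is not how assess-gtfs summarises a feed: the `trips_per_date` helper and all records are hypothetical, and exceptions from calendar_dates.txt are ignored for brevity.

```python
from datetime import date, timedelta

WEEKDAYS = [
    "monday", "tuesday", "wednesday", "thursday",
    "friday", "saturday", "sunday",
]

# Hypothetical calendar rows: a date range plus one 0/1 flag per weekday.
calendar = [
    {
        "service_id": "weekday",
        "start_date": date(2023, 8, 14),
        "end_date": date(2023, 8, 20),
        "monday": 1, "tuesday": 1, "wednesday": 1, "thursday": 1,
        "friday": 1, "saturday": 0, "sunday": 0,
    },
    {
        "service_id": "weekend",
        "start_date": date(2023, 8, 14),
        "end_date": date(2023, 8, 20),
        "monday": 0, "tuesday": 0, "wednesday": 0, "thursday": 0,
        "friday": 0, "saturday": 1, "sunday": 1,
    },
]
# Hypothetical trips: three on the weekday service, one on the weekend one.
trips = [
    {"trip_id": "T1", "service_id": "weekday"},
    {"trip_id": "T2", "service_id": "weekday"},
    {"trip_id": "T3", "service_id": "weekday"},
    {"trip_id": "T4", "service_id": "weekend"},
]


def trips_per_date(calendar, trips):
    """Count trips scheduled on each date covered by the calendar."""
    trips_by_service = {}
    for trip in trips:
        service_id = trip["service_id"]
        trips_by_service[service_id] = trips_by_service.get(service_id, 0) + 1

    counts = {}
    for service in calendar:
        n_trips = trips_by_service.get(service["service_id"], 0)
        day = service["start_date"]
        while day <= service["end_date"]:
            # date.weekday() returns 0 for Monday through 6 for Sunday.
            if service[WEEKDAYS[day.weekday()]]:
                counts[day] = counts.get(day, 0) + n_trips
            day += timedelta(days=1)
    return dict(sorted(counts.items()))


counts = trips_per_date(calendar, trips)
for day, n_trips in counts.items():
    print(day.isoformat(), n_trips)
```

Comparing the count on a candidate analysis date with the typical count for the same weekday is one simple way to judge whether that date is representative.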
Validating GTFS
When working with GTFS from a range of sources, it is important to understand whether the feed you intend to use complies with the specification. Online tools, such as the one available on the French government’s Transport Data Portal, are excellent choices for manual validation of a small number of feeds.
assess-gtfs produces tabular outputs for specification warnings and errors using gtfs_kit under the hood. Note that not all of these errors are as severe as they initially appear. For example, the validation table at the top of this page is commonly seen when validating British GTFS.
The first row in the validation table shows an apparent error, reporting “Invalid route_type; maybe has extra space characters”. Examining the route_type column of the routes table around the affected rows gives:
0 3
1 200
2 200
3 200
4 200
5 3
6 3
7 3
8 3
9 3
Name: route_type, dtype: int64
We see that rows 1 through 4 use route_type 200. Google have proposed an extension to GTFS route_type that many publishers of GTFS have adopted. Under that extension, route_type 200 means a coach service, which would not cause a problem for most routing software. For more on validating GTFS feeds, consult the API reference for implementation details.
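The distinction between basic and extended route types can be sketched as follows. This is an illustration rather than the check that gtfs_kit or assess-gtfs performs: the `classify` helper is hypothetical, and the extended lookup holds only a handful of codes from Google's proposal, so it is deliberately not exhaustive.

```python
# route_type values defined in the core GTFS Schedule reference.
BASIC_ROUTE_TYPES = {0, 1, 2, 3, 4, 5, 6, 7, 11, 12}

# A few codes from Google's extended route types (not exhaustive).
EXTENDED_ROUTE_TYPES = {
    100: "Railway Service",
    200: "Coach Service",
    700: "Bus Service",
    900: "Tram Service",
}

# The ten route_type values shown in the output above.
route_types = [3, 200, 200, 200, 200, 3, 3, 3, 3, 3]


def classify(route_type):
    """Label a route_type as basic, extended or unknown."""
    if route_type in BASIC_ROUTE_TYPES:
        return "basic"
    if route_type in EXTENDED_ROUTE_TYPES:
        return "extended: " + EXTENDED_ROUTE_TYPES[route_type]
    return "unknown"


# Rows a validator limited to the basic set would flag.
flagged_rows = [
    i for i, rt in enumerate(route_types) if rt not in BASIC_ROUTE_TYPES
]
labels = {i: classify(route_types[i]) for i in flagged_rows}
print(flagged_rows)
print(labels)
```

The flagged rows match the `rows` entry of the first line in the validation table, and each of them resolves to a recognised extended code rather than a malformed value.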
Cleaning GTFS
assess-gtfs can be used to attempt to resolve some of the identified problems in GTFS. To see how to do this, please follow along with the tutorial’s clean_feed section. Alternatively, visit the API documentation for more detail.
Note that cleaning has not been implemented for every type of specification alert. To request a feature, please raise an issue with the package maintainers on GitHub.