import pandas as pd
import geopandas
import matplotlib.pyplot as plt
import requests
import contextily as cx
24 Visualising Travel Times in Python
24.1 Working with precalculated travel matrices
You may be able to find useful datasets that other people have created of common lookups you may need to do.
This is the travel time from every LSOA in the country to every hospital in the country (by car, assuming there is no traffic)
You’ll need to check whether the data is in hours, minutes or seconds! It may vary depending on the web service used and other processing.
In this dataset,
Rows = LSOA
Columns = Postcodes of Hospital and Clinic Sites
Cell = travel time from LSOA to hospital/clinic (minutes)
24.1.1 Calculating the shortest travel time
Let’s look at an example of how to take a matrix like this and calculate the shortest travel time to a site from each LSOA.
We’ve been given a dataset containing postcodes of hospitals with 24-hour A&E departments within 50 miles of the centre of Birmingham.
Let’s first import the packages we’ll be needing.
birmingham_area_hospitals_df
postcode name
0 B18 7QH City Hospital
1 B15 2TH Queen Elizabeth Hospital Birmingham
2 B9 5SS Heartlands Hospital
3 B71 4HJ Sandwell District General Hospital
4 B75 7RR Good Hope Hospital
5 WS2 9PS Manor Hospital
6 DY1 2HQ Russells Hall Hospital
7 WV10 0QP New Cross Hospital
8 B98 7UB Alexandra Hospital
9 CV10 7DJ George Eliot Hospital
10 CV34 5BW Warwick Hospital
11 CV2 2DX University Hospital (Coventry)
12 WR5 1DD Worcestershire Royal Hospital
13 ST16 3SA Stafford Hospital
14 TF1 6TF Princess Royal Hospital
15 LE1 5WW Leicester Royal Infirmary
16 DE22 3NE Royal Derby Hospital
17 OX16 9AL Horton General Hospital
18 ST4 6QG Royal Stoke University Hospital
19 SY3 8XQ Royal Shrewsbury Hospital
We know the extent we are interested in, and need to visualise the current areas with longer journey times.
We’ve got a precalculated travel matrix of car times available to us, showing the distance to any hospital in the UK from any LSOA in the UK.
= pd.read_csv("https://github.com/hsma-programme/h6_3c_interactive_plots_travel/raw/main/h6_3c_interactive_plots_travel/example_code/travel_matrix_minutes.zip") car_travel_time_lookup
When distances relate to LSOAs, MSOAs or postcodes, they will often relate to the centroid of that region (the centre point).
Remember - MSOAs and LSOAs (and even some postcodes) can cover a fairly large area. Datasets exist of population-weighted centroids, which may be more appropriate to use.
The first thing we’re going to do is make a new dataframe that only contains the columns for the relevant hospitals.
Always be very careful with boundary issues
In this case we’ve been told to only visualise their shortest journey to any one of the hospitals in this list. But someone at the edge of the region showing a very short journey time could be just down the road from a different hospital not in our dataset!
= car_travel_time_lookup.loc[
car_travel_time_lookup_birmingham_hospitals # keep all of the rows
:,# keep the LSOA column and columns for relevant hospitals
'LSOA'] + birmingham_area_hospitals_df.postcode.to_list()
[
]
car_travel_time_lookup_birmingham_hospitals.head()
LSOA B18 7QH ... ST4 6QG SY3 8XQ
0 Welwyn Hatfield 010F 122.066667 ... 158.966667 168.683334
1 Welwyn Hatfield 012A 122.783334 ... 159.683334 169.400000
2 Welwyn Hatfield 002F 123.533333 ... 160.433333 170.150000
3 Welwyn Hatfield 002E 123.399999 ... 160.299999 170.000001
4 Welwyn Hatfield 010A 122.100000 ... 159.000000 168.716667
[5 rows x 21 columns]
Now, we want to return the shortest travel time to any of these hospitals.
So what we need is the smallest value in any of the columns - except the LSOA name!
So first, let’s change the LSOA name to be our index.
This means it will be ignored when we try to look for the minimum travel time.
= car_travel_time_lookup_birmingham_hospitals.set_index('LSOA')
car_travel_time_lookup_birmingham_hospitals
car_travel_time_lookup_birmingham_hospitals
B18 7QH B15 2TH ... ST4 6QG SY3 8XQ
LSOA ...
Welwyn Hatfield 010F 122.066667 129.000000 ... 158.966667 168.683334
Welwyn Hatfield 012A 122.783334 129.716667 ... 159.683334 169.400000
Welwyn Hatfield 002F 123.533333 130.466667 ... 160.433333 170.150000
Welwyn Hatfield 002E 123.399999 130.316666 ... 160.299999 170.000001
Welwyn Hatfield 010A 122.100000 129.033333 ... 159.000000 168.716667
... ... ... ... ... ...
Selby 009C 135.750000 142.783334 ... 130.416667 165.850000
Selby 007D 139.116666 146.150001 ... 133.783334 170.050000
Selby 007B 136.750000 143.766667 ... 133.633333 168.550000
Selby 006A 138.550000 145.583333 ... 135.433333 170.350000
Selby 007A 140.150000 147.166667 ... 137.033333 171.816667
[32843 rows x 20 columns]
Now we can calculate the row-wise minimum (i.e. look across the columns and find the smallest value for that row) And turn this into a new column.
'shortest'] = (
car_travel_time_lookup_birmingham_hospitals[# axis = 1 means row-wise minimum (instead of columnwise)
min(axis=1)
car_travel_time_lookup_birmingham_hospitals.
)
car_travel_time_lookup_birmingham_hospitals.head()
B18 7QH B15 2TH ... SY3 8XQ shortest
LSOA ...
Welwyn Hatfield 010F 122.066667 129.000000 ... 168.683334 90.516666
Welwyn Hatfield 012A 122.783334 129.716667 ... 169.400000 91.233333
Welwyn Hatfield 002F 123.533333 130.466667 ... 170.150000 91.983334
Welwyn Hatfield 002E 123.399999 130.316666 ... 170.000001 91.850000
Welwyn Hatfield 010A 122.100000 129.033333 ... 168.716667 90.549999
[5 rows x 21 columns]
Now we need to join this to some kind of geometry - as at the moment we have the name of each LSOA, but no way of plotting them.
We don’t really know where they are until we import something like a geojson, geopkg or shapefile.
= 'https://github.com/hsma-programme/h6_3c_interactive_plots_travel/raw/main/h6_3c_interactive_plots_travel/example_code/LSOA_2011_Boundaries_Super_Generalised_Clipped_BSC_EW_V4.geojson'
lsoa_geojson_path
= geopandas.read_file(lsoa_geojson_path) lsoa_boundaries
Let’s just plot this to check it’s what we’re expecting and what the CRS is.
This looks like it’s using the British National Grid system.
(we can tell because the range is 0 to ~ 700,000 in each direction rather than numbers < 100 if it was latitude and longitude)
lsoa_boundaries.plot()
24.1.2 Joining the travel times to geometry data
Now it’s time to join it to the travel time data per LSOA.
We use the pandas merge() function.
Note that we pass in the lsoa boundaries geodataframe first out of the two dataframes - this ensures the output will be a geopandas file and not get converted to a standard dataframe.
Side note: we are resetting the index of our travel time dataframe so that LSOA11NM is a column, not an index (i.e. our row identifiers are 0, 1, 2, 3 etc., not the names of the LSOAs).
= pd.merge(
nearest_hospital_travel_bham_gdf
lsoa_boundaries,
car_travel_time_lookup_birmingham_hospitals.reset_index(),="LSOA11NM",
left_on="LSOA",
right_on="right"
how )
We’re doing a right join, which means every row in the car_travel_time_lookup_birmingham_hospitals dataframe will be kept, even if there isn’t a match in the lsoa_boundaries file.
24.1.3 Plotting the travel times
Now let’s try plotting this!
It looks like the travel times are shortest around Birmingham - which makes perfect sense as we’ve only been checking against the hospitals in Birmingham!
nearest_hospital_travel_bham_gdf.plot(='shortest',
column=True
legend )
24.1.4 Refining the travel time plot
24.1.4.1 Adjusting the plot boundaries
Let’s zoom in a bit so we’re not giving misleading information.
If we just zoom, the scale of our output isn’t very helpful.
This is because the legend is based on the entire dataframe, which still includes those distant regions, even if they aren’t shown in the final plot.
= 370000, 420000
xmin, xmax = 250000, 310000
ymin, ymax
= nearest_hospital_travel_bham_gdf.plot(
ax ='shortest',
column=True
legend
)
ax.set_xlim(xmin, xmax)
(370000.0, 420000.0)
ax.set_ylim(ymin, ymax)
(250000.0, 310000.0)
Remember the .cx method for filtering down to just those regions within a boundary box?
If we use that first, our plot becomes a lot more informative!
= 370000, 420000
xmin, xmax = 250000, 310000
ymin, ymax
= nearest_hospital_travel_bham_gdf.cx[xmin:xmax, ymin:ymax]
nearest_hospital_travel_bham_gdf_filter
= nearest_hospital_travel_bham_gdf_filter.plot(
ax ='shortest',
column=True
legend )
Would it look better with some whitespace around the edge of the map to make it clear what area we are considering?
24.1.4.2 Extending the plot extent back out
We can make the matplotlib limits slightly different to the boundary box to achieve this.
= 370000, 420000
xmin, xmax = 250000, 310000
ymin, ymax
= nearest_hospital_travel_bham_gdf.cx[xmin:xmax, ymin:ymax]
nearest_hospital_travel_bham_gdf_filter
= nearest_hospital_travel_bham_gdf_filter.plot(
ax ='shortest',
column=True
legend
)
-20000, xmax+20000) ax.set_xlim(xmin
(350000.0, 440000.0)
-20000, ymax+20000) ax.set_ylim(ymin
(230000.0, 330000.0)
24.1.4.3 Changing the colourschemes
Let’s experiment with a different colourscheme.
= 370000, 420000
xmin, xmax = 250000, 310000
ymin, ymax
= nearest_hospital_travel_bham_gdf.cx[xmin:xmax, ymin:ymax]
nearest_hospital_travel_bham_gdf_filter
= nearest_hospital_travel_bham_gdf_filter.plot(
ax ='shortest',
column=True,
legend="Reds"
cmap
)
-20000, xmax+20000) ax.set_xlim(xmin
(350000.0, 440000.0)
-20000, ymax+20000) ax.set_ylim(ymin
(230000.0, 330000.0)
24.1.4.4 Adding in points to show the possible destinations
Let’s make use of the postcodes.io web API.
We import requests to help us interact with the API.
We pass a list of postcodes to the relevant web address.
= requests.post(
postcode_lookup "https://api.postcodes.io/postcodes",
={"postcodes": birmingham_area_hospitals_df['postcode'].tolist()}
json
)
postcode_lookup
<Response [200]>
Then we turn it into a dataframe and join it to our hospital df.
# label: convert_postocde_json_to_df
= pd.json_normalize(postcode_lookup.json()['result'], sep='_')
postcode_lookup
postcode_lookup
query result_postcode ... result_codes_lau2 result_codes_pfa
0 B18 7QH B18 7QH ... E08000025 E23000014
1 B15 2TH B15 2TH ... E08000025 E23000014
2 B9 5SS B9 5SS ... E08000025 E23000014
3 B71 4HJ B71 4HJ ... E08000028 E23000014
4 B75 7RR B75 7RR ... E08000025 E23000014
5 WS2 9PS WS2 9PS ... E08000030 E23000014
6 DY1 2HQ DY1 2HQ ... E08000027 E23000014
7 WV10 0QP WV10 0QP ... E08000031 E23000014
8 B98 7UB B98 7UB ... E07000236 E23000016
9 CV10 7DJ CV10 7DJ ... E07000219 E23000017
10 CV34 5BW CV34 5BW ... E07000222 E23000017
11 CV2 2DX CV2 2DX ... E08000026 E23000014
12 WR5 1DD WR5 1DD ... E07000237 E23000016
13 ST16 3SA ST16 3SA ... E07000197 E23000015
14 TF1 6TF TF1 6TF ... E06000020 E23000016
15 LE1 5WW LE1 5WW ... E06000016 E23000021
16 DE22 3NE DE22 3NE ... E06000015 E23000018
17 OX16 9AL OX16 9AL ... E07000177 E23000029
18 ST4 6QG ST4 6QG ... E06000021 E23000015
19 SY3 8XQ SY3 8XQ ... E06000051 E23000016
[20 rows x 41 columns]
= pd.merge(
birmingham_area_hospitals_df
birmingham_area_hospitals_df,
postcode_lookup,="postcode",
left_on="query"
right_on )
Then we’ll turn the output into a geodataframe.
Because our original dataset was in eastings/northings, we can use those columns from our postcodes.io lookup result, and tell geopandas that we’re using EPSG:27700
= geopandas.GeoDataFrame(
birmingham_area_hospitals_gdf
birmingham_area_hospitals_df,= geopandas.points_from_xy(
geometry 'result_eastings'],
birmingham_area_hospitals_df['result_northings']
birmingham_area_hospitals_df[
),= 'EPSG:27700'
crs )
Now we can plot the points to check the results look roughly correct.
= birmingham_area_hospitals_gdf.plot()
ax
cx.add_basemap(
ax,=birmingham_area_hospitals_gdf.crs.to_string(),
crs=9
zoom )
Now we can add these points to our map.
First we plot the choropleth.
By passing the ax argument - remembering that we called our initial plot object ax - we make sure it plots these as an extra layer - not a totally new plot!
# label: plot_points_and_choropleth
= 380000, 440000
xmin, xmax = 250000, 310000
ymin, ymax
= nearest_hospital_travel_bham_gdf.cx[xmin:xmax, ymin:ymax]
nearest_hospital_travel_bham_gdf_filter
= nearest_hospital_travel_bham_gdf_filter.plot(
ax ='shortest',
column="Reds",
cmap=True,
legend="black",
edgecolor=0.5,
linewidth=0.6,
alpha=(12,12)
figsize
)
-20000, xmax+20000) ax.set_xlim(xmin
(360000.0, 460000.0)
-20000, ymax+20000) ax.set_ylim(ymin
(230000.0, 330000.0)
= birmingham_area_hospitals_gdf.plot(
hospital_points =ax,
ax='black',
color=50
markersize
)
cx.add_basemap(
ax,=nearest_hospital_travel_bham_gdf.crs.to_string(),
crs=10
zoom
)
"Travel Time (minutes - by car) to Nearest Hospitals with A&E Department\nBirmingham Area")
plt.title('off') ax.axis(
(360000.0, 460000.0, 230000.0, 330000.0)