29 Using code to solve facility location problems
Right - now that we know the combinations we want to test and how long it takes to get to each clinic location, we need some sort of demand figures.
29.1 Demand data
Chances are your demand isn’t spread evenly across your potential service user catchment area.
- Maybe certain LSOAs have more people of childbearing age, so maternity and paediatric services will get more demand from those areas
- Maybe certain LSOAs have an older population, so stroke services are more centred around there
- Maybe other LSOAs have a higher incidence of obesity, so you might expect more demand for type 2 diabetes services and retinopathy clinics
Your demand data can be formatted quite simply!
You just want to make sure that you're using the same level of location data as in your travel matrix (e.g. they need to both use LSOA, or both use postcode sectors - not a mixture).
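For example, a minimal sketch of a suitable demand dataframe (the values here are taken from the Brighton dataset used later in this chapter):

```python
import pandas as pd

# One row per LSOA, one demand column - the LSOA labels must match
# those used in the travel matrix
demand = pd.DataFrame({
    "LSOA": ["Brighton and Hove 027E", "Brighton and Hove 027F"],
    "demand": [3627, 2323],
})
```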
You might use historic demand figures as your starting point.
But always consider…
- If you seem to have areas with no historic demand, is this actually a pattern, or is it just that the historic numbers are small and there’s an element of chance?
- Is there some preexisting bias or issue with access that you might be perpetuating by using the historic figures?
How can you deal with these issues?
If you’re using historic data, can you bring in more years?
Alternatively, you could explore what happens when you just weight demand by the number of people in each LSOA.
If your service only covers a certain age range, you can find LSOA-level figures estimating the proportion of the population in that age group.
You might also just make predictions of demand based off something else:
- Number of people in a target age demographic or of a particular gender in your LSOAs, multiplied by some factor (e.g. 20% of women aged 18-34) - see the sketch below
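A minimal sketch of that kind of prediction - the population dataframe and its column names here are hypothetical, not part of the dataset used later:

```python
import pandas as pd

# Hypothetical LSOA-level population estimates (illustrative values)
population = pd.DataFrame({
    "LSOA": ["Brighton and Hove 027E", "Brighton and Hove 027F"],
    "women_18_34": [900, 650],
})

# Assumption for illustration: 20% of women aged 18-34 use the service per year
population["predicted_demand"] = population["women_18_34"] * 0.20
```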
Just always consider whether what you’re doing is
- appropriate
- sensible
- transparent
- justifiable
Make it clear what data you have used and why!
Make the limitations of your conclusions explicit.
29.2 Evaluating the weighted average travel time
Right! We’ve got everything we need - except a way to evaluate the weighted average travel time…
Cast your mind back many, many slides - when we present our options, we want to give more weight to the travel times from areas that have higher demand.
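As a toy illustration of what 'weighted' means here (numbers invented for this example), np.average with a weights argument does exactly this:

```python
import numpy as np

# Two demand areas: 10 and 30 minutes from their nearest clinic
travel_times = [10, 30]
# The 30-minute area generates three times as much demand
demand = [100, 300]

# (10*100 + 30*300) / (100 + 300) = 25.0 minutes
print(np.average(travel_times, weights=demand))
```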
Let’s write some code to do this.
It’s something we’re going to want to reuse quite a lot - so we can turn it into a class with…
Attributes
- our travel dataframe
- our demand dataframe
Methods
- Evaluate a solution
- Limit the travel dataframe to a subset of candidate locations
- Work out the smallest value per location (the ‘minimum cost’)
- Return a dataframe with demand and min cost per region
- Return the weighted average for the solution
- Return the unweighted average for the solution
- Return the maximum travel time for any location in the solution
The FacilityLocationObjective code is modified from the Metapy library created by Tom Monks
https://github.com/health-data-science-OR/healthcare-logistics/tree/master/optimisation/metapy
29.2.0.1 The init method
- demand is our dataframe containing demand sources (e.g. LSOAs) as rows and the demand (e.g. yearly patients coming from that LSOA) as the single column
- travel_matrix is the dataframe with demand sources as rows and potential service locations as columns, with the travel cost (time) as the values in the cells
- merge_col is the name of the column that is consistent between our demand and travel dataframes
- demand_col is a string referring to the column that contains the demand value in the demand dataframe
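Here is the method itself:

```python
def __init__(self, demand, travel_matrix, merge_col, demand_col):
    '''
    Store the demand and travel times

    Args:
        demand: pd.DataFrame: demand per location
        travel_matrix: pd.DataFrame: travel cost from each location to each site
        merge_col: str: the column shared by both dataframes
        demand_col: str: the column containing the demand values
    '''
    self.demand = demand.set_index(merge_col)
    self.travel_matrix = travel_matrix.set_index(merge_col)
    self.demand_col = demand_col
```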
29.2.0.2 The evaluate_solution method
This method, given a list of sites in the form of their column indices (e.g. [1, 4, 5] for sites 2, 5 and 6), will return a dataframe with the demand and minimum travel time per row for this potential solution.
```python
def evaluate_solution(self, site_list):
    '''
    Args:
        site_list: list: column indices of solution to evaluate
            (to apply to travel matrix)

    Returns:
        Pandas dataframe to pass to evaluation functions
    '''
    active_facilities = self.travel_matrix.iloc[:, site_list].copy()

    # Assume travel to closest facility
    active_facilities['min_cost'] = active_facilities.min(axis=1)

    # Merge demand and travel times into a single DataFrame
    problem = self.demand.merge(active_facilities,
                                left_index=True, right_index=True,
                                how='inner')

    return problem.reset_index()
```
29.2.0.3 The generate_solution_metrics method
This method, given a list of sites in the form of their column indices (e.g. [1, 4, 5] for sites 2, 5 and 6):
- Runs the evaluate_solution method
- Returns a range of metrics relating to the solution
```python
def generate_solution_metrics(self, site_list):
    '''
    Calculates the weighted average travel time for selected sites

    Args:
        site_list: list or np.array: site column indices (e.g. [0, 3, 4])

    Returns:
        A dictionary of summary metrics plus the evaluated problem dataframe
    '''
    problem = self.evaluate_solution(site_list)

    # Weighted (by demand), unweighted and worst-case travel figures
    weighted_average = np.average(problem['min_cost'], weights=problem[self.demand_col])
    unweighted_average = np.average(problem['min_cost'])
    max_travel = np.max(problem['min_cost'])

    return {
        'site_indices': site_list,
        'site_names': ", ".join(self.travel_matrix.columns[site_list].tolist()),
        'weighted_average': weighted_average,
        'unweighted_average': unweighted_average,
        'max': max_travel,
        'problem_df': problem
    }
```
::: {.callout-tip collapse="true"}
### Click here to view the whole class
```python
# Tweaked WeightedAverageObjective from Metapy package
# https://github.com/health-data-science-OR/healthcare-logistics/tree/master/optimisation/metapy
# Credit: Tom Monks
class FacilityLocationObjective:
    '''
    Encapsulates logic for calculation of metrics
    in a simple facility location problem

    Demand and travel matrices must have a common column

    demand: pd.DataFrame: Two-column dataframe. One column should be labels for
        the demand locations (e.g. LSOA identifiers, postcodes). The second column
        should contain demand figures of some kind (e.g. number of historical cases).
        If demand is assumed to be equal, all values in this column could be 1.

    travel_matrix: pd.DataFrame: dataframe with columns representing sites
        and rows representing locations demand will come from.
        One column should be labels for the demand locations (e.g. LSOA
        identifiers, postcodes). All other values will be either distance or
        time in float form. No additional columns of information must be
        included or they will be used as part of the calculation of the
        lowest-cost solution, which may lead to incorrect results.
    '''
    def __init__(self, demand, travel_matrix, merge_col, demand_col):
        '''
        Store the demand and travel times

        Args:
            demand: pd.DataFrame: demand per location
            travel_matrix: pd.DataFrame: travel cost from each location to each site
            merge_col: str: the column shared by both dataframes
            demand_col: str: the column containing the demand values
        '''
        self.demand = demand.set_index(merge_col)
        self.travel_matrix = travel_matrix.set_index(merge_col)
        self.demand_col = demand_col

    def evaluate_solution(self, site_list):
        '''
        Calculates the minimum travel cost per demand location
        for a candidate set of sites

        Args:
            site_list: list: column indices of solution to evaluate
                (to apply to travel matrix)

        Returns:
            Pandas dataframe to pass to evaluation functions
        '''
        active_facilities = self.travel_matrix.iloc[:, site_list].copy()

        # Assume travel to closest facility
        active_facilities['min_cost'] = active_facilities.min(axis=1)

        # Merge demand and travel times into a single DataFrame
        problem = self.demand.merge(active_facilities,
                                    left_index=True, right_index=True,
                                    how='inner')

        return problem.reset_index()

    def generate_solution_metrics(self, site_list):
        '''
        Calculates the weighted average travel time for selected sites

        Args:
            site_list: list or np.array: site column indices (e.g. [0, 3, 4])

        Returns:
            A dictionary of summary metrics plus the evaluated problem dataframe
        '''
        problem = self.evaluate_solution(site_list)

        # Weighted (by demand), unweighted and worst-case travel figures
        weighted_average = np.average(problem['min_cost'],
                                      weights=problem[self.demand_col])
        unweighted_average = np.average(problem['min_cost'])
        max_travel = np.max(problem['min_cost'])

        return {
            'site_indices': site_list,
            'site_names': ", ".join(self.travel_matrix.columns[site_list].tolist()),
            'weighted_average': weighted_average,
            'unweighted_average': unweighted_average,
            'max': max_travel,
            'problem_df': problem
        }
```
:::
29.2.1 Setting up an instance of this class
Let’s first import the packages we will need throughout this section.
```python
import pandas as pd
import geopandas
import contextily as cx
import matplotlib.pyplot as plt
```
We need to:
- import a dataset of demand
- import (or create) a geodataframe of site locations
- import (or create) a travel matrix for this combination of sites and demand sources
```python
brighton_demand = pd.read_csv("https://raw.githubusercontent.com/hsma-programme/h6_3d_facility_location_problems/main/h6_3d_facility_location_problems/example_code/brighton_demand.csv").drop(columns=["Unnamed: 0"])
brighton_demand.head()
```
|  | LSOA | demand |
|---|---|---|
0 | Brighton and Hove 027E | 3627 |
1 | Brighton and Hove 027F | 2323 |
2 | Brighton and Hove 027A | 2596 |
3 | Brighton and Hove 029E | 3132 |
4 | Brighton and Hove 029D | 2883 |
```python
brighton_sites = geopandas.read_file("https://raw.githubusercontent.com/hsma-programme/h6_3d_facility_location_problems/main/h6_3d_facility_location_problems/example_code/brighton_sites.geojson")
brighton_sites.head()
```
|  | site | geometry |
|---|---|---|
0 | Site 1 | POINT (-0.19544 50.84511) |
1 | Site 2 | POINT (-0.13365 50.84435) |
2 | Site 3 | POINT (-0.10763 50.83347) |
3 | Site 4 | POINT (-0.17652 50.83075) |
4 | Site 5 | POINT (-0.11961 50.86597) |
```python
brighton_travel = pd.read_csv("https://raw.githubusercontent.com/hsma-programme/h6_3d_facility_location_problems/main/h6_3d_facility_location_problems/example_code/brighton_travel_matrix_driving.csv").drop(columns=["Unnamed: 0"])
brighton_travel.head()
```
|  | LSOA | Site 1 | Site 2 | Site 3 | Site 4 | Site 5 | Site 6 |
|---|---|---|---|---|---|---|---|
0 | Brighton and Hove 027E | 773.93 | 527.69 | 444.29 | 491.85 | 607.54 | 554.91 |
1 | Brighton and Hove 027F | 757.39 | 499.11 | 517.57 | 561.07 | 578.97 | 538.37 |
2 | Brighton and Hove 027A | 763.24 | 601.38 | 517.98 | 410.40 | 681.23 | 557.35 |
3 | Brighton and Hove 029E | 743.62 | 651.72 | 660.36 | 379.72 | 731.58 | 557.58 |
4 | Brighton and Hove 029D | 665.85 | 664.65 | 658.20 | 313.00 | 744.50 | 570.51 |
Finally, we bring in our LSOA boundaries for plotting later.
```python
lsoa_boundaries = geopandas.read_file("https://raw.githubusercontent.com/hsma-programme/h6_3d_facility_location_problems/main/h6_3d_facility_location_problems/example_code/LSOA_2011_Boundaries_Super_Generalised_Clipped_BSC_EW_V4.geojson")
lsoa_boundaries.head()
```
|  | FID | LSOA11CD | LSOA11NM | LSOA11NMW | BNG_E | BNG_N | LONG | LAT | GlobalID | geometry |
|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | E01000001 | City of London 001A | City of London 001A | 532123 | 181632 | -0.097140 | 51.51816 | a758442e-7679-45d0-95a8-ed4c968ecdaa | POLYGON ((532282.629 181906.496, 532248.250 18... |
1 | 2 | E01000002 | City of London 001B | City of London 001B | 532480 | 181715 | -0.091970 | 51.51882 | 861dbb53-dfaf-4f57-be96-4527e2ec511f | POLYGON ((532746.814 181786.892, 532248.250 18... |
2 | 3 | E01000003 | City of London 001C | City of London 001C | 532239 | 182033 | -0.095320 | 51.52174 | 9f765b55-2061-484a-862b-fa0325991616 | POLYGON ((532293.068 182068.422, 532419.592 18... |
3 | 4 | E01000005 | City of London 001E | City of London 001E | 533581 | 181283 | -0.076270 | 51.51468 | a55c4c31-ef1c-42fc-bfa9-07c8f2025928 | POLYGON ((533604.245 181418.129, 533743.689 18... |
4 | 5 | E01000006 | Barking and Dagenham 016A | Barking and Dagenham 016A | 544994 | 184274 | 0.089317 | 51.53875 | 9cdabaa8-d9bd-4a94-bb3b-98a933ceedad | POLYGON ((545271.918 184183.948, 545296.314 18... |
We will then pass the travel and demand datasets into our FacilityLocationObjective; the site dataframe will be used later on.
```python
location_problem_brighton = FacilityLocationObjective(
    demand=brighton_demand,
    travel_matrix=brighton_travel,
    merge_col="LSOA",
    demand_col="demand"
)
```
29.2.2 Outputs
29.2.2.1 evaluate_solution
This is the output of the evaluate_solution method.
And now we can easily pass in a range of solutions - including with different numbers of sites.
Note the column names changing to reflect the site indices we are passing in.
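For example, an illustrative call for a two-site solution (the indices here are arbitrary - any valid list of column indices works):

```python
# Columns 0 and 2 of the travel matrix, i.e. Site 1 and Site 3
location_problem_brighton.evaluate_solution([0, 2]).head()
```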
Now we bring in our code from an earlier chapter to calculate every possible combination of a certain number of facilities.
```python
from itertools import combinations
import numpy as np

def all_combinations(n_facilities, p):
    facility = np.arange(n_facilities, dtype=np.uint8)
    return [np.array(a) for a in combinations(facility, p)]
```
We could even loop through this to find every combination with every possible number of facilities…
```python
possible_combinations_brighton = all_combinations(
    len(location_problem_brighton.travel_matrix.columns),
    4
)

possible_combinations_brighton
```
```
[array([0, 1, 2, 3], dtype=uint8),
 array([0, 1, 2, 4], dtype=uint8),
 array([0, 1, 2, 5], dtype=uint8),
 array([0, 1, 3, 4], dtype=uint8),
 array([0, 1, 3, 5], dtype=uint8),
 array([0, 1, 4, 5], dtype=uint8),
 array([0, 2, 3, 4], dtype=uint8),
 array([0, 2, 3, 5], dtype=uint8),
 array([0, 2, 4, 5], dtype=uint8),
 array([0, 3, 4, 5], dtype=uint8),
 array([1, 2, 3, 4], dtype=uint8),
 array([1, 2, 3, 5], dtype=uint8),
 array([1, 2, 4, 5], dtype=uint8),
 array([1, 3, 4, 5], dtype=uint8),
 array([2, 3, 4, 5], dtype=uint8)]
```
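And, as suggested above, a quick sketch of that loop - assuming we simply pool the combinations for every possible number of sites p:

```python
# Sketch: enumerate candidate solutions of every size, from 1 site up to all 6
n_sites = len(location_problem_brighton.travel_matrix.columns)

all_possible = []
for p in range(1, n_sites + 1):
    all_possible.extend(all_combinations(n_sites, p))

len(all_possible)  # 2**6 - 1 = 63 non-empty combinations
```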
29.2.3 Obtaining solution metrics in a loop
Now we can loop through every possible combination and save the outputs.
```python
outputs = []

for possible_solution in possible_combinations_brighton:
    outputs.append(
        location_problem_brighton.generate_solution_metrics(
            possible_solution
        )
    )
```
And easily turn our list of dictionaries into a dataframe!
```python
pd.DataFrame(outputs)
```
|  | site_indices | site_names | weighted_average | unweighted_average | max | problem_df |
|---|---|---|---|---|---|---|
0 | [0, 1, 2, 3] | Site 1, Site 2, Site 3, Site 4 | 313.257086 | 312.565758 | 1001.33 | LSOA demand Site 1 S... |
1 | [0, 1, 2, 4] | Site 1, Site 2, Site 3, Site 5 | 374.295378 | 370.452061 | 1001.33 | LSOA demand Site 1 S... |
2 | [0, 1, 2, 5] | Site 1, Site 2, Site 3, Site 6 | 368.551505 | 355.460000 | 1001.33 | LSOA demand Site 1 S... |
3 | [0, 1, 3, 4] | Site 1, Site 2, Site 4, Site 5 | 397.907659 | 385.938061 | 1302.58 | LSOA demand Site 1 S... |
4 | [0, 1, 3, 5] | Site 1, Site 2, Site 4, Site 6 | 410.033802 | 388.522909 | 1371.46 | LSOA demand Site 1 S... |
5 | [0, 1, 4, 5] | Site 1, Site 2, Site 5, Site 6 | 460.215392 | 436.115273 | 1302.58 | LSOA demand Site 1 S... |
6 | [0, 2, 3, 4] | Site 1, Site 3, Site 4, Site 5 | 304.810704 | 306.939576 | 1001.33 | LSOA demand Site 1 ... |
7 | [0, 2, 3, 5] | Site 1, Site 3, Site 4, Site 6 | 313.630625 | 311.418303 | 1001.33 | LSOA demand Site 1 ... |
8 | [0, 2, 4, 5] | Site 1, Site 3, Site 5, Site 6 | 363.242023 | 354.221818 | 1001.33 | LSOA demand Site 1 ... |
9 | [0, 3, 4, 5] | Site 1, Site 4, Site 5, Site 6 | 414.657901 | 393.878606 | 1302.58 | LSOA demand Site 1 S... |
10 | [1, 2, 3, 4] | Site 2, Site 3, Site 4, Site 5 | 316.865395 | 319.753939 | 1001.33 | LSOA demand Site 2 ... |
11 | [1, 2, 3, 5] | Site 2, Site 3, Site 4, Site 6 | 311.756166 | 304.151576 | 1001.33 | LSOA demand Site 2 ... |
12 | [1, 2, 4, 5] | Site 2, Site 3, Site 5, Site 6 | 379.697853 | 370.415455 | 1001.33 | LSOA demand Site 2 ... |
13 | [1, 3, 4, 5] | Site 2, Site 4, Site 5, Site 6 | 399.681921 | 382.101333 | 1302.58 | LSOA demand Site 2 S... |
14 | [2, 3, 4, 5] | Site 3, Site 4, Site 5, Site 6 | 306.446684 | 302.913394 | 1001.33 | LSOA demand Site 3 ... |
Then it’s easy to find the best combinations and tidy up the output table.
```python
pd.DataFrame(outputs).sort_values('weighted_average').round(1)
```
|  | site_indices | site_names | weighted_average | unweighted_average | max | problem_df |
|---|---|---|---|---|---|---|
6 | [0, 2, 3, 4] | Site 1, Site 3, Site 4, Site 5 | 304.8 | 306.9 | 1001.3 | LSOA demand Site 1 ... |
14 | [2, 3, 4, 5] | Site 3, Site 4, Site 5, Site 6 | 306.4 | 302.9 | 1001.3 | LSOA demand Site 3 ... |
11 | [1, 2, 3, 5] | Site 2, Site 3, Site 4, Site 6 | 311.8 | 304.2 | 1001.3 | LSOA demand Site 2 ... |
0 | [0, 1, 2, 3] | Site 1, Site 2, Site 3, Site 4 | 313.3 | 312.6 | 1001.3 | LSOA demand Site 1 S... |
7 | [0, 2, 3, 5] | Site 1, Site 3, Site 4, Site 6 | 313.6 | 311.4 | 1001.3 | LSOA demand Site 1 ... |
10 | [1, 2, 3, 4] | Site 2, Site 3, Site 4, Site 5 | 316.9 | 319.8 | 1001.3 | LSOA demand Site 2 ... |
8 | [0, 2, 4, 5] | Site 1, Site 3, Site 5, Site 6 | 363.2 | 354.2 | 1001.3 | LSOA demand Site 1 ... |
2 | [0, 1, 2, 5] | Site 1, Site 2, Site 3, Site 6 | 368.6 | 355.5 | 1001.3 | LSOA demand Site 1 S... |
1 | [0, 1, 2, 4] | Site 1, Site 2, Site 3, Site 5 | 374.3 | 370.5 | 1001.3 | LSOA demand Site 1 S... |
12 | [1, 2, 4, 5] | Site 2, Site 3, Site 5, Site 6 | 379.7 | 370.4 | 1001.3 | LSOA demand Site 2 ... |
3 | [0, 1, 3, 4] | Site 1, Site 2, Site 4, Site 5 | 397.9 | 385.9 | 1302.6 | LSOA demand Site 1 S... |
13 | [1, 3, 4, 5] | Site 2, Site 4, Site 5, Site 6 | 399.7 | 382.1 | 1302.6 | LSOA demand Site 2 S... |
4 | [0, 1, 3, 5] | Site 1, Site 2, Site 4, Site 6 | 410.0 | 388.5 | 1371.5 | LSOA demand Site 1 S... |
9 | [0, 3, 4, 5] | Site 1, Site 4, Site 5, Site 6 | 414.7 | 393.9 | 1302.6 | LSOA demand Site 1 S... |
5 | [0, 1, 4, 5] | Site 1, Site 2, Site 5, Site 6 | 460.2 | 436.1 | 1302.6 | LSOA demand Site 1 S... |
And, since we stored the combinations in our dataframe too, we can easily pull this out by chaining together a few steps.
```python
pd.DataFrame(outputs).sort_values('weighted_average').head(1)['site_names'].values
```
```
array(['Site 1, Site 3, Site 4, Site 5'], dtype=object)
```
29.2.4 Plotting the best solution
We can now put this all together with the map plotting skills we’ve learned over the last few sessions.
When plotting our sites as well, we can ensure we are only plotting the sites included in our solution - see the site-filtering lines in the code below.
Let’s explore what the output of the first line of code is:
```python
best_solution = pd.DataFrame(outputs).sort_values('weighted_average').head(1)
best_solution_df = best_solution['problem_df'].values[0]

best_solution_df
```
|  | LSOA | demand | Site 1 | Site 3 | Site 4 | Site 5 | min_cost |
|---|---|---|---|---|---|---|---|
0 | Brighton and Hove 027E | 3627 | 773.93 | 444.29 | 491.85 | 607.54 | 444.29 |
1 | Brighton and Hove 027F | 2323 | 757.39 | 517.57 | 561.07 | 578.97 | 517.57 |
2 | Brighton and Hove 027A | 2596 | 763.24 | 517.98 | 410.40 | 681.23 | 410.40 |
3 | Brighton and Hove 029E | 3132 | 743.62 | 660.36 | 379.72 | 731.58 | 379.72 |
4 | Brighton and Hove 029D | 2883 | 665.85 | 658.20 | 313.00 | 744.50 | 313.00 |
... | ... | ... | ... | ... | ... | ... | ... |
160 | Brighton and Hove 012A | 2497 | 446.54 | 1108.11 | 519.16 | 626.02 | 446.54 |
161 | Brighton and Hove 005C | 2570 | 469.80 | 1008.24 | 569.40 | 526.15 | 469.80 |
162 | Brighton and Hove 012B | 2051 | 464.53 | 1132.60 | 537.15 | 650.51 | 464.53 |
163 | Brighton and Hove 005A | 1164 | 567.49 | 1105.93 | 664.11 | 623.84 | 567.49 |
164 | Brighton and Hove 005B | 1097 | 495.50 | 1033.94 | 595.10 | 551.85 | 495.50 |
165 rows × 7 columns
This line then ensures we’re just pulling out the site indices for the best sites, and filtering our site dataframe to just those.
```python
brighton_sites_bng = brighton_sites.to_crs('EPSG:27700').iloc[best_solution["site_indices"].values[0]]

brighton_sites_bng
```
|  | site | geometry |
|---|---|---|
0 | Site 1 | POINT (527142.275 106616.053) |
2 | Site 3 | POINT (533356.778 105476.782) |
3 | Site 4 | POINT (528513.424 105052.430) |
4 | Site 5 | POINT (532421.163 109069.196) |
Now let’s do the plotting.
```python
best_solution_df = pd.DataFrame(outputs).sort_values('weighted_average').head(1)['problem_df'].values[0]

nearest_site_travel_brighton_gdf = pd.merge(
    lsoa_boundaries,
    best_solution_df,
    right_on="LSOA",
    left_on="LSOA11NM"
)

nearest_site_travel_brighton_gdf["min_cost_minutes"] = nearest_site_travel_brighton_gdf["min_cost"] / 60

ax = nearest_site_travel_brighton_gdf.plot(
    "min_cost_minutes",
    legend=True,
    cmap="Blues",
    alpha=0.7,
    edgecolor="black",
    linewidth=0.5,
    figsize=(12,6)
)

brighton_sites_bng = brighton_sites.to_crs('EPSG:27700').iloc[best_solution["site_indices"].values[0]]

hospital_points = brighton_sites_bng.plot(ax=ax, color='magenta', markersize=60)

cx.add_basemap(ax, crs=nearest_site_travel_brighton_gdf.crs.to_string(), zoom=14)

for x, y, label in zip(brighton_sites_bng.geometry.x,
                       brighton_sites_bng.geometry.y,
                       brighton_sites_bng.site):
    ax.annotate(label, xy=(x,y), xytext=(10,3), textcoords="offset points",
                bbox=dict(facecolor='white'))

ax.axis('off')

plt.title("Travel Time (driving - minutes) for best sites in Brighton")
```
29.2.5 Plotting all solutions
And then it doesn’t take much effort to create a map showing weighted average travel times for every possible solution!
The key thing is that we just iterate through our dataframe of all solutions - pulling one row out at a time.
If we order them by weighted average first, the map in the top left is the best solution.
Then it’s just our usual plot code - making sure to specify the axis we are plotting on to.
Another benefit of saving all these different things is that we can then easily add the weighted average travel time to the title of each plot!
```python
fig, axs = plt.subplots(3, 5, figsize=(30, 15))

for i, ax in enumerate(fig.axes):
    solution = pd.DataFrame(outputs).sort_values('weighted_average').iloc[[i]]
    solution_df = solution['problem_df'].values[0]

    nearest_site_travel_brighton_gdf = pd.merge(
        lsoa_boundaries,
        solution_df,
        right_on="LSOA",
        left_on="LSOA11NM"
    )

    nearest_site_travel_brighton_gdf["min_cost_minutes"] = nearest_site_travel_brighton_gdf["min_cost"] / 60

    ax = nearest_site_travel_brighton_gdf.plot(
        "min_cost_minutes",
        legend=True,
        cmap="Blues",
        alpha=0.7,
        edgecolor="black",
        linewidth=0.5,
        figsize=(12,6),
        ax=ax
    )

    brighton_sites_bng = brighton_sites.to_crs('EPSG:27700').iloc[solution["site_indices"].values[0]]

    hospital_points = brighton_sites_bng.plot(ax=ax, color='magenta', markersize=60)

    for x, y, label in zip(brighton_sites_bng.geometry.x,
                           brighton_sites_bng.geometry.y,
                           brighton_sites_bng.site):
        ax.annotate(label, xy=(x,y), xytext=(10,3), textcoords="offset points",
                    bbox=dict(facecolor='white'))

    ax.axis('off')

    weighted_travel_time_solution = solution["weighted_average"].values[0]
    weighted_travel_time_solution_minutes = (weighted_travel_time_solution / 60).round(2)

    ax.set_title(f"Weighted Average:\n{weighted_travel_time_solution_minutes} minutes")
```
29.2.6 Other ways to display the output
Consider other ways to display the outputs too.
```python
import plotly.express as px

outputs_df = pd.DataFrame(outputs)
outputs_df['weighted_average_minutes'] = (outputs_df['weighted_average']/60).round(2)

px.bar(
    data_frame=outputs_df.sort_values("weighted_average_minutes", ascending=False),
    y="site_names",
    x="weighted_average_minutes",
    title="Possible Site Combinations"
)
```
29.3 Full Code Example
Finally, here is a full copyable code example for this section.
```python
from itertools import combinations
import numpy as np
import pandas as pd
import geopandas
import contextily as cx
import matplotlib.pyplot as plt

# Tweaked WeightedAverageObjective from Metapy package
# https://github.com/health-data-science-OR/healthcare-logistics/tree/master/optimisation/metapy
# Credit: Tom Monks
class FacilityLocationObjective:
    '''
    Encapsulates logic for calculation of metrics
    in a simple facility location problem

    Demand and travel matrices must have a common column

    demand: pd.DataFrame: Two-column dataframe. One column should be labels for
        the demand locations (e.g. LSOA identifiers, postcodes). The second column
        should contain demand figures of some kind (e.g. number of historical cases).
        If demand is assumed to be equal, all values in this column could be 1.

    travel_matrix: pd.DataFrame: dataframe with columns representing sites
        and rows representing locations demand will come from.
        One column should be labels for the demand locations (e.g. LSOA
        identifiers, postcodes). All other values will be either distance or
        time in float form. No additional columns of information must be
        included or they will be used as part of the calculation of the
        lowest-cost solution, which may lead to incorrect results.
    '''
    def __init__(self, demand, travel_matrix, merge_col, demand_col):
        '''
        Store the demand and travel times

        Args:
            demand: pd.DataFrame: demand per location
            travel_matrix: pd.DataFrame: travel cost from each location to each site
            merge_col: str: the column shared by both dataframes
            demand_col: str: the column containing the demand values
        '''
        self.demand = demand.set_index(merge_col)
        self.travel_matrix = travel_matrix.set_index(merge_col)
        self.demand_col = demand_col

    def evaluate_solution(self, site_list):
        '''
        Calculates the minimum travel cost per demand location
        for a candidate set of sites

        Args:
            site_list: list: column indices of solution to evaluate
                (to apply to travel matrix)

        Returns:
            Pandas dataframe to pass to evaluation functions
        '''
        active_facilities = self.travel_matrix.iloc[:, site_list].copy()

        # Assume travel to closest facility
        active_facilities['min_cost'] = active_facilities.min(axis=1)

        # Merge demand and travel times into a single DataFrame
        problem = self.demand.merge(active_facilities,
                                    left_index=True, right_index=True,
                                    how='inner')

        return problem.reset_index()

    def generate_solution_metrics(self, site_list):
        '''
        Calculates the weighted average travel time for selected sites

        Args:
            site_list: list or np.array: site column indices (e.g. [0, 3, 4])

        Returns:
            A dictionary of summary metrics plus the evaluated problem dataframe
        '''
        problem = self.evaluate_solution(site_list)

        # Weighted (by demand), unweighted and worst-case travel figures
        weighted_average = np.average(problem['min_cost'],
                                      weights=problem[self.demand_col])
        unweighted_average = np.average(problem['min_cost'])
        max_travel = np.max(problem['min_cost'])

        return {
            'site_indices': site_list,
            'site_names': ", ".join(self.travel_matrix.columns[site_list].tolist()),
            'weighted_average': weighted_average,
            'unweighted_average': unweighted_average,
            'max': max_travel,
            'problem_df': problem
        }

def all_combinations(n_facilities, p):
    facility = np.arange(n_facilities, dtype=np.uint8)
    return [np.array(a) for a in combinations(facility, p)]

brighton_demand = pd.read_csv("https://raw.githubusercontent.com/hsma-programme/h6_3d_facility_location_problems/main/h6_3d_facility_location_problems/example_code/brighton_demand.csv").drop(columns=["Unnamed: 0"])

brighton_sites = geopandas.read_file("https://raw.githubusercontent.com/hsma-programme/h6_3d_facility_location_problems/main/h6_3d_facility_location_problems/example_code/brighton_sites.geojson")

brighton_travel = pd.read_csv("https://raw.githubusercontent.com/hsma-programme/h6_3d_facility_location_problems/main/h6_3d_facility_location_problems/example_code/brighton_travel_matrix_driving.csv").drop(columns=["Unnamed: 0"])

lsoa_boundaries = geopandas.read_file("https://raw.githubusercontent.com/hsma-programme/h6_3d_facility_location_problems/main/h6_3d_facility_location_problems/example_code/LSOA_2011_Boundaries_Super_Generalised_Clipped_BSC_EW_V4.geojson")

location_problem_brighton = FacilityLocationObjective(
    demand=brighton_demand,
    travel_matrix=brighton_travel,
    merge_col="LSOA",
    demand_col="demand"
)

possible_combinations_brighton = all_combinations(
    len(location_problem_brighton.travel_matrix.columns),
    4
)

outputs = []

for possible_solution in possible_combinations_brighton:
    outputs.append(
        location_problem_brighton.generate_solution_metrics(
            possible_solution
        )
    )

best_solution = pd.DataFrame(outputs).sort_values('weighted_average').head(1)
best_solution_df = best_solution['problem_df'].values[0]

fig, axs = plt.subplots(3, 5, figsize=(30, 15))

for i, ax in enumerate(fig.axes):
    solution = pd.DataFrame(outputs).sort_values('weighted_average').iloc[[i]]
    solution_df = solution['problem_df'].values[0]

    nearest_site_travel_brighton_gdf = pd.merge(
        lsoa_boundaries,
        solution_df,
        right_on="LSOA",
        left_on="LSOA11NM"
    )

    nearest_site_travel_brighton_gdf["min_cost_minutes"] = nearest_site_travel_brighton_gdf["min_cost"] / 60

    ax = nearest_site_travel_brighton_gdf.plot(
        "min_cost_minutes",
        legend=True,
        cmap="Blues",
        alpha=0.7,
        edgecolor="black",
        linewidth=0.5,
        figsize=(12,6),
        ax=ax
    )

    brighton_sites_bng = brighton_sites.to_crs('EPSG:27700').iloc[solution["site_indices"].values[0]]

    hospital_points = brighton_sites_bng.plot(ax=ax, color='magenta', markersize=60)

    for x, y, label in zip(brighton_sites_bng.geometry.x,
                           brighton_sites_bng.geometry.y,
                           brighton_sites_bng.site):
        ax.annotate(label, xy=(x,y), xytext=(10,3), textcoords="offset points",
                    bbox=dict(facecolor='white'))

    ax.axis('off')

    weighted_travel_time_solution = solution["weighted_average"].values[0]
    weighted_travel_time_solution_minutes = (weighted_travel_time_solution / 60).round(2)

    ax.set_title(f"Weighted Average:\n{weighted_travel_time_solution_minutes} minutes")
```