Health Service Modelling Associates Programme
Don’t forget that you can book time to see me.
Slots are available at
Slots are currently available up until the second week of June.
Book via this link
Pop me a message if you can’t make those times and we can work something out.
Notebook LM is a Google product that allows you to pass in specific sources that will be used by a generative AI model.
You can feed it a range of sources, including
(Thanks to Joel for originally making me aware of this tool)
Because it grounds its answers in the specific sources you give it, you can use it to remind yourself of the HSMA way of doing things.
Notebook LM has the really nice feature of actually referring back to the sources it has used
But in code, that means you get the sources appearing in [] in a way that would break the code…
The ‘deep dive’ feature is a way to get a good summary of complex information like scientific papers.
It generates a podcast-like conversation!
It can also generate other formats like an FAQ document.
This repository contains a script to automatically generate documentation from a codebase.
https://github.com/The-Pocket/PocketFlow-Tutorial-Codebase-Knowledge
It’s designed to write beginner-friendly documentation with loads of analogies and clear breakdowns of what is going on.
Documentation is the thing no-one wants to do…
And it always goes out of date!
In addition, if you’re trying to work with someone else’s code, it can be really hard to get your head around how everything is done.
The framework
(I’ve then been converting the output into Quarto for easy integration with existing documentation)
It generates code examples…
And diagrams…
Because of the way this code is designed, you can modify the prompt that gets fed in.
It just lives in text in the nodes.py file.
So you can ask it to make the documentation more or less beginner friendly, or add very specific requests.
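Purely as an illustration (the prompt text and variable names in the real nodes.py will differ), the kind of tweak you might make looks something like this:

# Hypothetical sketch - not the actual prompt from nodes.py
chapter_topic = "Event logging"   # placeholder for whatever the framework passes in

prompt = f"""
Write a beginner-friendly chapter about {chapter_topic}.
Use plenty of analogies and keep any code snippets short.
Finish with a short 'common pitfalls' section.
"""  # the last line is the kind of very specific extra request you could add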
Google’s most advanced model (gemini-2.5-pro-exp-03-25) is currently free for roughly 25 requests a day.
For me, that was enough to generate two sets of documentation per day before hitting the limit.
If there isn’t a billing account linked to your Google account, you shouldn’t be able to exceed the free tier and get charged - but check, and set up spending limits, first!
I’ll put together a quick guide on the ‘how’ in the next few days, but if you’re keen to try it out sooner, have a go at following the instructions in the repository readme (and message me if you get stuck).
And have a read of a sample generated here: https://sammirosser.com/vidigi_autodoc_test/
There’s a new chapter in the DES book on getting distributions from real data!
(And many more - scipy provides functions for over 80…)
Source: By Skbkekas - Own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=9447142
Source: By Inductiveload - Own work (Original text: self-made, Mathematica, Inkscape), Public Domain, https://commons.wikimedia.org/w/index.php?curid=3817954
How to work out what best represents our data?
We’ll use the fitter package.
Let’s assume we start with a dataframe of historical activity times.
You pass in a list of values…
We can then apply the ‘get_best’ method to our object to return the name and parameters of the best-fitting distribution.
And this can be used to then pass into your model.
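A minimal sketch of that workflow (the activity_time column name and the shortlist of candidate distributions here are just examples):

import pandas as pd
from fitter import Fitter

# Historical activity times (made-up values for illustration)
historical = pd.DataFrame({"activity_time": [12.1, 9.4, 15.2, 11.8, 10.3, 14.7, 8.9, 13.5]})

# Pass the list of observed values to fitter, trying a shortlist of candidate distributions
f = Fitter(historical["activity_time"].values,
           distributions=["gamma", "lognorm", "expon", "norm"])
f.fit()

# get_best returns the best-fitting distribution and its parameters,
# e.g. {'lognorm': {'s': 0.25, 'loc': 1.2, 'scale': 10.9}}
print(f.get_best(method="sumsquare_error"))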
Remembering that our output looked like this:
Running get_nurse_appt_duration() then returns one suitable time from that distribution.
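For a rough idea of what such a function could look like under the hood (a sketch only - the book's implementation may differ, and it assumes a recent version of fitter that returns parameters as keyword arguments):

from scipy import stats

def get_nurse_appt_duration(best_fit):
    # Draw one activity time from the distribution returned by fitter's get_best()
    dist_name, params = list(best_fit.items())[0]
    dist = getattr(stats, dist_name)   # e.g. stats.lognorm
    return dist(**params).rvs()        # one random sample from the fitted distribution

# e.g. duration = get_nurse_appt_duration(f.get_best())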
There is also a new chapter on event logging.
We will be using the term ‘event logging’ to describe the process of generating a step-by-step log of what happens to each entity as they pass through our system.
The resulting file will be an ‘event log’.
…
entity_id: a unique identifier to allow us to follow a given entity through their journey
event_type: this column is used to distinguish between three key kinds of events:
event: this column further breaks down what is happening during each event type, such as what stage of the system people are waiting to interact with
time: this can be an absolute timestamp in the form of a datetime (e.g. 2027-01-01 23:01:47), or a relative timestamp in time units from the start of the simulation.
run: in a multi-run Trial, which run the result came from
We first add an empty list to our model to store the logs…
Then, each time we want to record something about the patient, we add in something like this:
self.event_log.append(
{'patient': entity_identifier,
'pathway': 'My_Pathway_Name',
'event_type': 'arrival_departure', # or 'queue', 'resource_use', or 'resource_use_end'
'event': 'arrival', # or 'depart', or for 'queue' and 'resource_use' or 'resource_use_end' you can determine your own event name
'time': self.env.now}
)
(You could wrap that in a helper function if you preferred!)
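A hedged sketch of what such a helper could look like (the method and argument names are suggestions, not the book's exact code):

def log_event(self, entity_identifier, event_type, event):
    # Append one row to the event log, stamped with the current simulation time
    self.event_log.append({
        'patient': entity_identifier,
        'pathway': 'My_Pathway_Name',
        'event_type': event_type,   # 'arrival_departure', 'queue', 'resource_use' or 'resource_use_end'
        'event': event,             # e.g. 'arrival', 'depart', or your own event name
        'time': self.env.now
    })

# which you would then call like:
# self.log_event(patient_id, 'queue', 'wait_for_nurse')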
In our ‘run’ method and the ‘Trial’ class, you make a few more changes to turn these dictionaries into a final output and save them.
We can then output this - either as a value returned when the model is run, or as a csv (or both).
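As a rough sketch (assuming this sits at the end of the ‘run’ method, after the event log list has been built up):

import pandas as pd

# Turn the list of event dictionaries into a dataframe...
event_log_df = pd.DataFrame(self.event_log)

# ...then return it, write it out as a csv, or both
event_log_df.to_csv("event_log.csv", index=False)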
Once you have this log, you can use it for a wide range of visuals
Note
Library of code snippets coming soon!
(fancy contributing?)
bupaR is an R library for process mining - but we won’t let that stop us!
With our event logs + some handy code snippets, generating some process mapping outputs becomes possible!
There are a few different ways we could get our Python event logs to work with bupaR:
the reticulate package (which runs Python from R) - though due to the complexity of our code, this is likely to run into issues
the rpy2 package (which runs R from Python) - as we only want a little bit of R in a primarily Python project, this might be a better option
Quarto’s features for passing objects like dataframes between R and Python cells
exporting our event log as a csv, importing this into R, and saving the resulting bupaR visuals
You can choose between mean stage time, max, min, etc.
Find some code to help you do this with your own simulation logs: https://des.hsma.co.uk/process_logs_with_bupar.html
In that section you will find
When creating the activity_log object in R, you just give the path to your csv.
Explore other visuals in their documentation: https://bupaverse.github.io/docs/visualize.html
Book chapter coming soon…
Verification = did we build it right (to match our conceptual model, and without bugs?)
Validation = does it actually match the real world well enough to be useful?
Assumptions and simplifications documented and justified
There are a range of tests you could consider running to compare your simulation outputs with your historical data.
| Test | Type of Data | Compares | Purpose | Use Case in Simulation Validation |
|---|---|---|---|---|
| t-test | Continuous | Means between groups | Assess if the average values differ significantly | Compare average response times, service durations, etc. |
| Chi-squared test | Categorical | Frequency distributions | Assess if category frequencies match expected distribution | Compare call types, dispatch priority levels, station allocations |
| Kolmogorov–Smirnov (KS) 2-sample test | Continuous | Entire distributions | Assess if two samples come from the same distribution | Compare distributions of interarrival times, on-scene durations |
import numpy as np
import pytest
from scipy import stats

# Pull out daily number of calls across simulation and reality
sim_calls = np.array(average_monthly_calls['daily_calls']) # simulated data
# basically a list like [4, 6, 12, 5, 7, 4, 4, 19, ...]
real_calls = np.array(historical_monthly_calls['daily_calls']) # real data
# Welch’s t-test (does not assume equal variances)
t_stat, p_value = stats.ttest_ind(sim_calls, real_calls, equal_var=False)
# Thresholds
p_thresh = 0.05

# Mean difference and effect size
mean_diff = np.mean(sim_calls) - np.mean(real_calls)
pooled_std = np.sqrt((np.std(sim_calls, ddof=1) ** 2 + np.std(real_calls, ddof=1) ** 2) / 2)
cohen_d = mean_diff / pooled_std
# Thresholds
effect_size_fail_thresh = 0.5
# Will only fail if significance threshold is met and cohen's D is sufficiently large
if p_value < p_thresh and abs(cohen_d) > effect_size_fail_thresh:
pytest.fail(f"""[FAIL - COMPARISON WITH REALITY]
**Mean Daily Calls** significantly different between simulation and reality.
p={p_value:.4f}.
Cohen's d={cohen_d:.2f}.
Sim mean: {np.mean(sim_calls):.2f}
Real mean: {np.mean(real_calls):.2f}.
Mean diff: {mean_diff:.2f}.""")
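The same pattern works for the other tests in the table - here's a minimal sketch using scipy's ks_2samp and chisquare functions (the values are made up purely for illustration):

import numpy as np
from scipy import stats

# KS 2-sample test: do simulated and historical durations plausibly come from the same distribution?
sim_durations = np.array([5.2, 7.1, 6.3, 8.0, 6.9])    # simulated on-scene durations (made up)
real_durations = np.array([5.5, 6.8, 6.0, 9.1, 7.2])   # historical on-scene durations (made up)
ks_stat, ks_p = stats.ks_2samp(sim_durations, real_durations)

# Chi-squared test: do simulated category counts match the historical frequencies?
# (counts must sum to the same total for stats.chisquare)
sim_counts = np.array([40, 35, 25])    # e.g. calls per priority level in the simulation (made up)
real_counts = np.array([42, 33, 25])   # e.g. calls per priority level in reality (made up)
chi_stat, chi_p = stats.chisquare(f_obs=sim_counts, f_exp=real_counts)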
A lot more than before!
Custom fonts appear to be supported via downloaded font files
The sidebar theme is fully customizable too.
So how would we build up a new candidate combination?
LSOAs (or any other geography) have a concept of neighbours (something sharing a border)
Here, we start with a territory that’s pretty central in the existing territories
On each step, it
This randomness ensures we don’t end up with every player owning the same number of units of territory each time.
We can then generate multiple allocations, which will all (probably) differ.
We can then score each solution on a metric
Rather than just randomly generating thousands of candidate solutions, we may move to a better solution faster by varying the best random solutions.
With some more boundary logic, we can create new possible solutions that are a variation on our best solution from the previous step.
We then evaluate again.
And - in theory - we’d keep going for a set number of generations or a defined amount of compute time - and see how good a solution we can come up with!
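Pulling those steps together, here's a toy sketch of that loop (not the actual HSMA code - the allocation routine, scoring metric and variation logic are all stand-in placeholders rather than the real boundary/neighbour logic):

import random

def random_allocation(n_units, n_players, rng):
    # Stand-in for growing territories outwards from central units
    return [rng.randrange(n_players) for _ in range(n_units)]

def score(allocation, n_players):
    # Placeholder metric: how unevenly the units are split (lower is better)
    counts = [allocation.count(p) for p in range(n_players)]
    return max(counts) - min(counts)

def vary(allocation, n_players, rng, n_changes=3):
    # Create a variation on a good solution by reassigning a few units
    new_allocation = allocation.copy()
    for _ in range(n_changes):
        new_allocation[rng.randrange(len(new_allocation))] = rng.randrange(n_players)
    return new_allocation

rng = random.Random(42)
n_units, n_players, n_candidates, n_generations = 100, 4, 20, 10

# Start from a batch of random candidate allocations...
candidates = [random_allocation(n_units, n_players, rng) for _ in range(n_candidates)]

for generation in range(n_generations):
    # ...score them and keep the best...
    best = min(candidates, key=lambda a: score(a, n_players))
    # ...then build the next batch as variations on that best solution
    candidates = [vary(best, n_players, rng) for _ in range(n_candidates)]

print("Best score found:", score(best, n_players))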