Skip to content

Conversation

thomasleese
Copy link
Contributor

@thomasleese thomasleese commented Mar 20, 2025

This refactors the way outcome statuses are generated to avoid instantiating ActiveRecord model objects and instead pluck values from the database directly and process the raw values.

Locally, I’m seeing session pages load in about half the time than they would normally with 10,000, 50,000 and 100,000 patients. Further work is necessary to improve performance, likely requiring some sort of state machine or caching of the various statuses so they can be queried directly in SQL.

I’ve tried to make the commits in this PR as small as possible while avoiding breaking the functionality between the commits, but in general the refactors are wide reaching and therefore touch lots of part of the code. The feature tests haven’t required any changes.

This adds a Rake task which we can use to quickly generate fake data
suitable for some basic performance testing. It skips the import process
and instead generates data directly in the database.
@thomasleese thomasleese added the performance Improving performance label Mar 20, 2025
@thomasleese thomasleese force-pushed the improve-performance branch from b6cfe83 to 6927b4c Compare March 20, 2025 09:06
@thomasleese thomasleese force-pushed the improve-performance branch from 6927b4c to ce45a8b Compare March 20, 2025 09:52
@thomasleese thomasleese force-pushed the improve-performance branch from ce45a8b to 4734102 Compare March 20, 2025 10:31
@thomasleese thomasleese force-pushed the improve-performance branch from 4734102 to 42b77a1 Compare March 20, 2025 10:45
@thomasleese thomasleese force-pushed the improve-performance branch from 42b77a1 to 1d7a02e Compare March 20, 2025 11:47
@thomasleese thomasleese force-pushed the improve-performance branch from 1d7a02e to 2d2fa80 Compare March 20, 2025 12:51
@thomasleese thomasleese force-pushed the improve-performance branch from 2d2fa80 to 1517e33 Compare March 20, 2025 13:07
@thomasleese thomasleese force-pushed the improve-performance branch from 0d37cbc to 405a78e Compare March 20, 2025 13:17
@thomasleese thomasleese force-pushed the improve-performance branch from 405a78e to 3880f97 Compare March 20, 2025 13:19
@thomasleese thomasleese force-pushed the improve-performance branch from bf9c0fa to f654e60 Compare March 20, 2025 15:29
@thomasleese thomasleese force-pushed the improve-performance branch from f654e60 to d411390 Compare March 20, 2025 16:01
This makes the profiling Gem available in development which can be used
to check why parts of the code are slow.
This refactors the `SearchForm` in preparation for some future
performance optimisations, to make it easier to integrate with those
upcoming changes.
This refactors the `SessionOutcome` class to accept a scope of patient
sessions and determine the outcome of the patient sessions in one go by
running a query that plucks only the necessary data and avoids the need
to instanciate lots of ActiveRecord model objects.
This refactors the `RegisterOutcome` class to accept a scope of patient
sessions and determine the outcome of the patient sessions in one go by
running a query that plucks only the necessary data and avoids the need
to instanciate lots of ActiveRecord model objects.
This refactors a number of classes to accept a scope of patients
and determine the outcome of the patients in one go by running a
query that plucks only the necessary data and avoids the need to
instanciate lots of ActiveRecord model objects.

Ideally the three classes would be refactored in separate commits, but
they are closely tied together where changing one requires all of them
to be changed.
This refactors the `ConsentOutcome` class to accept a scope of patients
and determine the outcome of the patients in one go by running a query
that plucks only the necessary data and avoids the need to instanciate
lots of ActiveRecord model objects.
This refactors the `NextActivity` class to use the new `Outcomes` class
introduced in the earlier commits.
This removes a scope on the `PatientSession` that was previously used to
preload all the data necessary to generate an outcome status. Instead,
we should preload only the data necessary at the point of fetching the
patient sessions to avoid fetching unnecessary data.
This refactors the component to fetch all the patients sessions for all
the sessions at once to avoid N+1 issues.
@thomasleese thomasleese force-pushed the improve-performance branch from 54655bd to f4c8f35 Compare March 20, 2025 20:16
@tvararu tvararu temporarily deployed to mavis-pr-3262 March 20, 2025 20:16 Inactive
Copy link

@thomasleese thomasleese marked this pull request as ready for review March 20, 2025 20:56

def no_response_count(session:)
number_stat(session:) { it.patient.consent_outcome.no_response?(programme) }
number_stat(session:) do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned by the it syntax (not just here). It's not apparent what it stands for. I had to jump into the method to see what it returns. I think we should use a block variable with a suitable name to make the code more readable (where appropriate).

@thomasleese thomasleese added this to the v2.1.2 milestone Mar 24, 2025
@thomasleese
Copy link
Contributor Author

This has been superseded by #3263, #3268, #3272, #3273 and #3275 which caches the statuses in the database.

thomasleese added a commit that referenced this pull request Mar 28, 2025
This introduces a new model, `Patient::ConsentStatus`, which stores the
status of a patient-programme pair in the database so it can queried and
rendered quickly. Locally, I'm seeing the consent tab on sessions with
50,000 patients load almost instantly even when filtering on statuses.
The session overview page has a slight improvement, but that will only
load quickly once the other statuses are cached.

I've decided not to implement a state machine as had been previously
discussed, but this could be built on top of this change to enhancement
the functionality. For now, this effectively caches the existing logic
that runs on the fly to the database.

I've also decided not to implement this using Rails callbacks and
instead be explicit where the status is refreshed. This allows us to
optimise for bulk refreshes, as is necessary when adding new programmes
to sessions, running the seeds, or generally updating the status of an
entire session.

This is proposed as an alternative to #3262 for consent statuses, we'd
need to apply the same functionality to triage, programmes, sessions and
next actions. If we're happy with this approach, we can migrate the
statuses one by one to avoid a big bang approach unlike #3262.
@thomasleese thomasleese deleted the improve-performance branch May 6, 2025 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Improving performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants