
notifications from resolver CF updates #56

Open
@jerch

Description

Coming from the discussion in #49: we should establish some way of getting notified when CFs get updated by the resolver. There are several ways to achieve this, mainly custom signals and/or method hooks/overloads.

To get reliable behavior that any follow-up code can work with, several things have to be considered before we can place a hook or trigger a custom signal:

  • general resolver layout
  • atomicity of resolver updates
  • needed data in signal follow-up code
  • where and how to declare the signal tracking

general resolver layout

The resolver currently works as a DFS, resolving and updating CFs in the dependency graph on descent. There is no explicit backtracking / ascending work done at the moment. The logic is as follows (pseudocode):

def update_dependent(input_queryset, update_fields):
    # resolve dependent CFs (queryset construction)
    targets = _queryset_for_update(input_queryset, update_fields)
    # walk the nodes on the current dependency tree level
    for target_queryset, computed_fields in targets:
        bulk_updater(target_queryset, computed_fields)

def bulk_updater(target_queryset, computed_fields):
    # queryset-local work (local MRO, apply select/prefetch lookups)
    # walk and recalc all fields in MRO and records (building a [fields x records] matrix)
    ...
    # save in batches with `bulk_update`
    target_queryset.bulk_update(changed_data, computed_fields)
    if target_queryset:    # recursion exit if the queryset is empty (needed for cyclic deps)
        # descend to the next level of the dependency tree
        update_dependent(target_queryset, computed_fields)
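For context, this is roughly how the resolver gets entered from user code after a bulk action that bypasses save() (Entry and approved are made-up names here; the call pattern follows the documented update_dependent usage):

from computedfields.models import update_dependent

# a bulk action bypasses save(), so dependent CFs are now out of sync
Entry.objects.filter(pub_date__year=2010).update(approved=False)

# hand the changed rows to the resolver, which starts the DFS shown above
update_dependent(Entry.objects.filter(pub_date__year=2010), update_fields=['approved'])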

atomicity of resolver updates

As shown in the resolver layout above, the atomic unit of work is a queryset, filtered by dependency constraints, that updates certain computed fields on one model. This could be the entry point for a custom signal / method hook. But there are several issues with doing it at that level:

  • we are still in the middle of the update cascade, thus CFs are only partially synced
  • follow-up code triggered by such a signal would have to be aware of the dependency tree itself and of the current position in the tree, which is really hard to deal with

Imho hooking into DFS runs is a bad idea and should not be the official way offered by the API. It can still be done by overloading update_dependent or bulk_updater yourself (if you know what you are doing).

From an outer perspective, resolver updates should be atomic for a full update tree, not for its single nodes. Since the updates are spread across several models, this is somewhat hard to achieve. (Directly linked to this is the question of concurrent access and whether the sync state of computed fields can be trusted, also see #55.)

To get atomicity at whole-tree level working with custom signals/method hooks, we basically could call into them at the end of the top-level update_dependent invocation. The work data would have to be collected along the way (now with backtracking).
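A minimal sketch of that idea, assuming a module-level depth counter to detect the top-level invocation (the signal name resolver_update_done and the bookkeeping names are made up for illustration; a real implementation would also need thread-local state):

import django.dispatch

resolver_update_done = django.dispatch.Signal()  # hypothetical tree-level signal
_DEPTH = 0       # recursion depth, >0 while inside a tree update
_COLLECTED = {}  # model -> {fieldname: set of pks}, filled by the tree nodes

def update_dependent(input_queryset, update_fields):
    global _DEPTH
    _DEPTH += 1
    try:
        targets = _queryset_for_update(input_queryset, update_fields)
        for target_queryset, computed_fields in targets:
            bulk_updater(target_queryset, computed_fields)
    finally:
        _DEPTH -= 1
    # only the outermost call fires the signal - the whole tree is synced now
    if not _DEPTH and _COLLECTED:
        data = dict(_COLLECTED)
        _COLLECTED.clear()
        resolver_update_done.send(sender=None, data=data)

In this sketch bulk_updater would add its pks to _COLLECTED right after the bulk_update call, so the collecting itself stays cheap.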

needed data in signal follow-up code

With a signal at whole-tree update level this question gets ugly, since the backtracking would have to carry the updated data along. My suggestion, to not waste too much memory: every node in the update tree simply places entries with pks into a container. The entries could look like this:

{
  model1: {
    'comp1': set(pks_affected),
    'comp2': set(other_pks),
  },
  model2: {
    'compX': set(some_pks)
  }
}

The signal finally gets the container as an argument, so receivers can work with it.
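Follow-up code would then be a plain signal receiver iterating the container, with no knowledge of the tree (using the hypothetical resolver_update_done from the sketch above):

from django.dispatch import receiver

@receiver(resolver_update_done)
def on_cf_updates(sender, data, **kwargs):
    # data: {model: {fieldname: set of pks}}
    for model, fields in data.items():
        for fieldname, pks in fields.items():
            # e.g. invalidate caches or push change notifications here
            print(f'{model.__name__}.{fieldname}: {len(pks)} records updated')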

where and how to declare the signal tracking

Some ideas regarding this:

  • data tracking should be off by default (to not waste memory on data no one consumes)
  • place a keyword argument on @computed like resolver_signal=True to indicate collecting its updates during resolver runs (see the sketch after this list)
  • the resolver gets one resolver-wide default signal, which would trigger whenever data was collected
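Declaration-wise this could look like the following (resolver_signal is the proposed, not yet existing keyword; the model and the depends value are made up for illustration):

from django.db import models
from computedfields.models import ComputedFieldsModel, computed

class Order(ComputedFieldsModel):
    amount = models.DecimalField(max_digits=10, decimal_places=2)
    tax_rate = models.DecimalField(max_digits=4, decimal_places=2)

    # resolver_signal=True would opt this CF into data tracking
    @computed(models.DecimalField(max_digits=10, decimal_places=2),
              depends=[('self', ['amount', 'tax_rate'])],
              resolver_signal=True)
    def total(self):
        return self.amount * (1 + self.tax_rate)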

If we run into memory issues for quite big updates (really huge pk lists), we might have to find a plan B for aggregating the updated data.

@mobiware, @olivierdalang Up for discussion.
