
notifications from resolver CF updates #56

Open
@jerch

Description

Coming from the discussion in #49: we should establish some way of getting notified when CFs get updated by the resolver. There are several ways to achieve this, mainly custom signals and/or method hooks/overloads.

To get reliable behavior that any follow-up code can work with, several things have to be considered before we can place a hook or trigger a custom signal:

  • general resolver layout
  • atomicity of resolver updates
  • needed data in signal follow-up code
  • where and how to declare the signal tracking

general resolver layout

The resolver currently works as a DFS, resolving and updating CFs in the dependency graph on descent. There is no explicit backtracking / ascending work done at the moment. The logic is as follows (pseudocode):

def update_dependent(input_queryset, update_fields):
    # resolve dependent CFs (queryset construction)
    targets = _queryset_for_update(input_queryset, update_fields)
    # walk the nodes on the current dependency tree level
    for target_queryset, computed_fields in targets:
        bulk_updater(target_queryset, computed_fields)

def bulk_updater(target_queryset, computed_fields):
    # queryset-local work (local MRO, apply select/prefetch lookups)
    # walk and recalc all fields in MRO and records (building a [fields x records] matrix)
    ...
    # save in batches with `bulk_update`
    target_queryset.bulk_update(changed_data, computed_fields)
    if target_queryset:    # recursion exit if the queryset is empty (needed for cyclic deps)
        # descend to the next level of the dependency tree
        update_dependent(target_queryset, computed_fields)
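For context, this is roughly how the resolver gets entered from user code after a bulk action that bypasses save() (Entry and approved are made-up names here; the call pattern follows the documented update_dependent usage):

from computedfields.models import update_dependent

# a bulk action bypasses save(), so dependent CFs are now out of sync
Entry.objects.filter(pub_date__year=2010).update(approved=False)

# hand the changed rows to the resolver, which starts the DFS shown above
update_dependent(Entry.objects.filter(pub_date__year=2010), update_fields=['approved'])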

atomicity of resolver updates

As shown in the resolver layout above, the atomic unit of work is a queryset, filtered by dependency constraints, that updates certain computed fields on one model. This could be the entry point for a custom signal / method hook. But there are several issues with doing it at that level:

  • we are still in the middle of the update cascade, thus CFs are only partially synced
  • follow-up code triggered by such a signal would have to be aware of the dependency tree itself and of the current position in the tree, which is really hard to deal with

Imho hooking into DFS runs is a bad idea and should not be the official way offered by the API. It can still be done by overloading update_dependent or bulk_updater yourself (if you know what you are doing).

From an outer perspective, resolver updates should be atomic for a full update tree, not for its single nodes. Since the updates are spread across several models, this is somewhat hard to achieve. (Directly linked to this is the question of concurrent access and whether the sync state of computed fields can be trusted, also see #55.)

To get atomicity at whole-tree level working with custom signals/method hooks, we basically could call into them at the end of the top-level update_dependent invocation. The work data would have to be collected along the way (now with backtracking).
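A minimal sketch of that idea, assuming a module-level depth counter to detect the top-level invocation (the signal name resolver_update_done and the bookkeeping names are made up for illustration; a real implementation would also need thread-local state):

import django.dispatch

resolver_update_done = django.dispatch.Signal()  # hypothetical tree-level signal
_DEPTH = 0       # recursion depth, >0 while inside a tree update
_COLLECTED = {}  # model -> {fieldname: set of pks}, filled by the tree nodes

def update_dependent(input_queryset, update_fields):
    global _DEPTH
    _DEPTH += 1
    try:
        targets = _queryset_for_update(input_queryset, update_fields)
        for target_queryset, computed_fields in targets:
            bulk_updater(target_queryset, computed_fields)
    finally:
        _DEPTH -= 1
    # only the outermost call fires the signal - the whole tree is synced now
    if not _DEPTH and _COLLECTED:
        data = dict(_COLLECTED)
        _COLLECTED.clear()
        resolver_update_done.send(sender=None, data=data)

In this sketch bulk_updater would add its pks to _COLLECTED right after the bulk_update call, so the collecting itself stays cheap.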

needed data in signal follow-up code

With a signal at whole-tree update level this question gets ugly, since the backtracking would have to carry the updated data along. My suggestion, to not waste too much memory: every node in the update tree simply places entries with pks into a container. The entries could look like this:

{
  model1: {
    'comp1': set(pks_affected),
    'comp2': set(other_pks),
  },
  model2: {
    'compX': set(some_pks)
  }
}

The signal finally gets the container as an argument, so receivers can work with it.
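Follow-up code would then be a plain signal receiver iterating the container, with no knowledge of the tree (using the hypothetical resolver_update_done from the sketch above):

from django.dispatch import receiver

@receiver(resolver_update_done)
def on_cf_updates(sender, data, **kwargs):
    # data: {model: {fieldname: set of pks}}
    for model, fields in data.items():
        for fieldname, pks in fields.items():
            # e.g. invalidate caches or push change notifications here
            print(f'{model.__name__}.{fieldname}: {len(pks)} records updated')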

where and how to declare the signal tracking

Some ideas regarding this:

  • data tracking should be off by default (to not waste memory on data no one consumes)
  • place a keyword argument on @computed like resolver_signal=True to indicate collecting its updates during resolver runs (see the sketch after this list)
  • the resolver gets one resolver-wide default signal, which would trigger whenever data was collected
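Declaration-wise this could look like the following (resolver_signal is the proposed, not yet existing keyword; the model and the depends value are made up for illustration):

from django.db import models
from computedfields.models import ComputedFieldsModel, computed

class Order(ComputedFieldsModel):
    amount = models.DecimalField(max_digits=10, decimal_places=2)
    tax_rate = models.DecimalField(max_digits=4, decimal_places=2)

    # resolver_signal=True would opt this CF into data tracking
    @computed(models.DecimalField(max_digits=10, decimal_places=2),
              depends=[('self', ['amount', 'tax_rate'])],
              resolver_signal=True)
    def total(self):
        return self.amount * (1 + self.tax_rate)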

If we run into memory issues for quite big updates (really huge pk lists), we might have to find a plan B for aggregating the updated data.

@mobiware, @olivierdalang Up for discussion.
