Skip to content

Conversation

ialarmedalien
Copy link
Collaborator

@ialarmedalien ialarmedalien commented Oct 8, 2025

This PR adds a function to detect cycles when recursively executing a function, e.g. when retrieving class ancestors and repeatedly executing schemaview.class_parents(...).

The _closure function allows one to recursively execute functions without getting caught in cycles, but it is also useful to see if cycles do exist as they may represent errors in the schema logic that need to be fixed. Adding the kwargs detect_cycles to the _closure function or any functions that call it under the hood (e.g. class_ancestors) will execute the cycle detection algorithm before generating the closure.

It would be possible to integrate this function into the _closure function -- the main reason for not doing so is that it would make the _closure function more complicated and less readable. The performance benefits are unlikely to be worth it unless someone is dealing with a massive schema with a very complex inheritance hierarchy (and even then...).

It may be worth moving both the _closure and detect_cycles functions into a separate module at some point if it seems as though there might be more call for graph-type algorithms.

@ialarmedalien ialarmedalien self-assigned this Oct 8, 2025
@ialarmedalien ialarmedalien added the enhancement New feature or request label Oct 8, 2025
Copy link

codecov bot commented Oct 8, 2025

Codecov Report

❌ Patch coverage is 86.95652% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.65%. Comparing base (e63dbb0) to head (7384670).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
linkml_runtime/utils/schemaview.py 86.95% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #469      +/-   ##
==========================================
+ Coverage   77.59%   77.65%   +0.06%     
==========================================
  Files          52       52              
  Lines        4459     4480      +21     
  Branches      968      973       +5     
==========================================
+ Hits         3460     3479      +19     
- Misses        778      779       +1     
- Partials      221      222       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment on lines +102 to +108
Uses the classic white/grey/black colour coding algorithm to track which nodes have been explored. In this
case, "node" refers to any element in a schema and "neighbours" are elements that can be reached from that
node by executing function `f`.
WHITE: unexplored
GREY: node is being processed; processing includes exploring all neighbours reachable via f(node)
BLACK: node and all of its neighbours (and their neighbours, etc.) have been processed
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This naming system could be changed (e.g. to TODO / PROCESSING / DONE or something similar) but the white/grey/black nomenclature is used in numerous places where this algorithm is described, so may as well stick with it.

Comment on lines +175 to +176
if kwargs and kwargs.get("detect_cycles"):
detect_cycles(f, x)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allows users run schemaview.type_descendants("my_type", detect_cycles=True) or schemaview.class_ancestors("CoolClass", detect_cycles=True)



@pytest.mark.parametrize(("target", "cycle_start_node"), [(k, v) for k, v in CYCLES[TYPES][0].items()])
@pytest.mark.parametrize("fn", ["detect_cycles", "graph_closure", "type_ancestors"])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test calling the function directly, via the graph closure (_closure function), and through SchemaView methods that use _closure

Comment on lines +2412 to +2418
def check_recursive_id_slots(class_name: str) -> list[str]:
"""Given a class, retrieve any classes used as the range for the class identifier slot."""
id_slot = sv_cycles_schema.get_identifier_slot(class_name, use_key=True)
if not id_slot:
return []
ind_range = sv_cycles_schema.slot_range_as_union(id_slot)
return [sv_cycles_schema.get_class(x).name for x in ind_range if sv_cycles_schema.get_class(x)] or []
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use detect_cycles with a custom function -- here I'm checking whether I will enter recursion hell whilst trying to ascertain the type(s) allowed by a slot with a class as the range.

Copy link
Contributor

@kevinschaper kevinschaper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super cool!

@kevinschaper kevinschaper merged commit ff268a3 into main Oct 9, 2025
17 checks passed
@kevinschaper kevinschaper deleted the schemaview_cycle_check branch October 9, 2025 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants