[FEA] Add multi-label node matching predicates for GFQL #696

@lmeyerov

Description

name: Feature request
about: Suggest an idea for this project
title: "[FEA] Add multi-label node matching predicates for GFQL"
labels: enhancement
assignees: ''


Is your feature request related to a problem? Please describe.
Currently, GFQL provides the is_in() predicate for matching nodes that have any of several values (OR logic) for a given attribute. However, there's no built-in way to match nodes that must have multiple labels/values simultaneously (AND logic), which is a common pattern in graph databases like Neo4j where nodes can have multiple labels (e.g., :Person:Employee:Manager).

In dataframe-based graphs, labels might be stored as:

  • Array/list columns containing multiple labels per node
  • Multiple boolean columns (one per label)
  • Delimited strings containing multiple labels

There's no elegant way to express "find nodes that have ALL of these labels" without resorting to query strings.
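To make the three storage layouts concrete, here is a minimal pandas-only sketch of what "has ALL of these labels" looks like for each one today. The dataframes, column names, and the comma delimiter are illustrative assumptions, not part of GFQL or any existing schema.

```python
import pandas as pd

wanted = ["Person", "Employee"]  # labels a node must carry simultaneously

# Layout 1 (assumed): a list column holding every label of the node
nodes_list = pd.DataFrame({
    "id": ["a", "b", "c"],
    "labels": [["Person", "Employee"], ["Person"], ["Organization"]],
})
mask_list = nodes_list["labels"].apply(lambda ls: set(wanted).issubset(ls))

# Layout 2 (assumed): one boolean column per label
nodes_bool = pd.DataFrame({
    "id": ["a", "b", "c"],
    "is_person": [True, True, False],
    "is_employee": [True, False, False],
})
mask_bool = nodes_bool[["is_person", "is_employee"]].all(axis=1)

# Layout 3 (assumed): a comma-delimited string of labels
nodes_str = pd.DataFrame({
    "id": ["a", "b", "c"],
    "labels": ["Person,Employee", "Person", "Organization"],
})
mask_str = (
    nodes_str["labels"]
    .str.split(",")
    .apply(lambda ls: set(wanted).issubset(ls))
)

print(nodes_list[mask_list]["id"].tolist())
print(nodes_bool[mask_bool]["id"].tolist())
print(nodes_str[mask_str]["id"].tolist())
```

Each layout needs its own hand-written mask, which is the friction a built-in predicate would remove.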

Describe the solution you'd like
Add new predicates to support multi-label matching patterns:

  1. contains_all(values) - For array/list columns, match if the column contains all specified values

    n({"labels": contains_all(["Person", "Employee"])})  # Node must have both labels

  2. contains_any(values) - Alias for is_in() but clearer for array columns

    n({"labels": contains_any(["Person", "Organization"])})  # Node has at least one label

  3. has_labels(labels) - Specialized predicate for label matching (could handle various storage formats)

    n({"labels": has_labels(["Person", "Employee"])})  # Must have all labels

These predicates would complement the existing is_in() predicate and make GFQL more expressive for multi-label graph patterns.
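The intended semantics of the first two predicates can be sketched outside GFQL as plain functions that turn a pandas Series of list-like cells into a boolean mask. The names mirror the proposal, but the signatures and the Series-in, mask-out contract are assumptions for illustration only, not the GFQL predicate interface.

```python
import pandas as pd

def contains_all(values):
    """Hypothetical sketch: mask is True where a cell holds every value (AND)."""
    required = set(values)
    return lambda col: col.apply(lambda cell: required.issubset(cell))

def contains_any(values):
    """Hypothetical sketch: mask is True where a cell holds at least one value (OR)."""
    candidates = set(values)
    return lambda col: col.apply(lambda cell: not candidates.isdisjoint(cell))

labels = pd.Series([
    ["Person", "Employee", "Manager"],
    ["Person"],
    ["Organization"],
    [],  # node with no labels
])

all_mask = contains_all(["Person", "Employee"])(labels)
any_mask = contains_any(["Person", "Organization"])(labels)

print(all_mask.tolist())
print(any_mask.tolist())
```

One design point worth settling in the spec: with this definition, `contains_all([])` matches every row (the empty requirement is vacuously satisfied), while `contains_any([])` matches none.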

Describe alternatives you've considered

  1. Query strings - Currently possible but less readable and not type-safe:

    n(query="'Person' in labels and 'Employee' in labels")

  2. Multiple boolean columns - Works but requires different schema:

    n({"is_person": True, "is_employee": True})

  3. Custom predicates - Users could implement their own, but built-in support would be better:

    def contains_all(values):
        return lambda col: col.apply(lambda x: all(v in x for v in values))

Additional context

  • This feature would make GFQL more compatible with multi-label graph patterns common in Neo4j and other graph databases
  • The implementation could be optimized for vectorized operations on pandas/cuDF
  • Should be documented in the language spec alongside other predicates
  • Would enhance the synthesis examples for LLM-based code generation with multi-label patterns
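As a rough illustration of the vectorization point, one way to avoid a per-row Python callback in pandas is to explode the list column, keep only the required labels, and count distinct matches per original row. This is a sketch under stated assumptions: the helper name is hypothetical, the column is assumed to have a unique index, and only pandas is covered here (cuDF behavior is not verified).

```python
import pandas as pd

def contains_all_vectorized(col, values):
    """Hypothetical helper: AND-match on a list column without row-wise apply.

    Assumes `col` is a pandas Series of lists with a unique index.
    """
    required = set(values)
    exploded = col.explode()                   # one row per (node, label) pair
    hits = exploded[exploded.isin(required)]   # keep only required labels
    # Count distinct required labels found for each original row.
    counts = hits.groupby(level=0).nunique()
    # Rows with no hits are missing from `counts`; treat them as zero matches.
    return counts.reindex(col.index, fill_value=0) == len(required)

labels = pd.Series([
    ["Person", "Employee", "Manager"],
    ["Person", "Person"],   # duplicate label must not count twice
    ["Organization"],
    [],
])

mask = contains_all_vectorized(labels, ["Person", "Employee"])
print(mask.tolist())
```

Using `nunique` rather than a plain count keeps duplicated labels within a row from producing a false match.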
