Typed Graph References for Dataclasses

lex00 · December 28, 2025, 12:03am

Dataclasses naturally model tree-structured data, but many domains require graph structures where objects reference each other. Currently, there’s no standard way to express “this field references another dataclass” with full type safety.

The Problem

@dataclass
class Network:
    cidr: str

@dataclass
class Subnet:
    network: ???  # How do we express "reference to a Network"?
    cidr: str

Common workarounds each have drawbacks:

String identifiers (network_id: str): no type safety
Forward references (network: "Network"): the type checker sees the class, not a reference relationship
The class itself (network: type[Network]): semantically wrong (we want an instance reference, not the class)

A Proposal: Ref[T] and Attr[T, "name"]

I’ve been exploring typed reference markers that work with the existing type system:

  from dataclasses import dataclass
  from graph_refs import Ref, Attr, RefList

  @dataclass
  class Subnet:
      network: Ref[Network]           # Reference to a Network
      gateway_id: Attr[Gateway, "Id"] # Reference to Gateway's Id attribute

  @dataclass
  class LoadBalancer:
      targets: RefList[Instance]      # List of references

This enables:

Type checkers verify reference targets
IDE autocomplete for valid targets
Static dependency graph analysis
Framework introspection via get_refs() and get_dependencies()
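To make the introspection idea concrete, here is a minimal sketch of how a marker like Ref[T] could be declared and discovered with the standard typing helpers. It is not the graph-refs implementation: the Ref class below and the helper find_ref_fields are hypothetical stand-ins written only for illustration.

```python
from dataclasses import dataclass, fields
from typing import Generic, TypeVar, get_args, get_origin, get_type_hints

T = TypeVar("T")


class Ref(Generic[T]):
    """Hypothetical marker meaning 'this field refers to a T'. Holds no data."""


@dataclass
class Network:
    cidr: str


@dataclass
class Subnet:
    network: Ref[Network]  # marked as a reference to a Network
    cidr: str              # ordinary field


def find_ref_fields(cls):
    """Map each Ref-annotated field name of a dataclass to its target class."""
    hints = get_type_hints(cls)
    targets = {}
    for f in fields(cls):
        hint = hints[f.name]
        # Ref[Network] has origin Ref and args (Network,)
        if get_origin(hint) is Ref:
            targets[f.name] = get_args(hint)[0]
    return targets


ref_fields = find_ref_fields(Subnet)
```

Here ref_fields maps "network" to the Network class and leaves cidr out. One caveat with this naive generic form: a static checker treats Ref[Network] as its own type, so it would likely complain when a plain Network instance is assigned to the field.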

Use Cases

This pattern appears in infrastructure-as-code, configuration management, entity relationships, and workflow systems — anywhere objects form a graph rather than a tree:

  @entity
  class Order:
      customer: Ref[Customer]
      items: RefList[Product]

  @task
  class ProcessData:
      depends_on: RefList[Task]
      output: Attr[Storage, "Path"]

Implementation

I’ve published two packages exploring this:

lex00/graph-refs (GitHub): typed graph references for Python dataclasses. Provides the type markers (Ref, Attr, RefList, RefDict, ContextRef) plus introspection (get_refs, get_dependencies)
lex00/graph-refs-dataclasses (GitHub): runtime machinery for building declarative DSLs with a "no-parens" pattern, where references are expressed as direct class names rather than function calls

The example in the graph-refs-dataclasses repo demonstrates a complete mini-DSL with a custom decorator, dependency ordering, and JSON serialization.

Design Principles

Zero runtime cost — Type markers have no instance data; value is at development time
Minimal surface area — Five primitives that compose with existing types
Compatibility first — Works with get_type_hints(), mypy, pyright, dataclasses

Questions for Discussion

Is there interest in standardizing vocabulary for inter-dataclass references?
Are the semantics of Ref[T] vs type[T] clear and useful?
Should Attr[T, "name"] validate that T has attribute name?
Could @dataclass_transform (PEP 681) gain parameters to better support reference-aware decorators?

I’d welcome feedback on the concept, API design, or alternative approaches.

I would like to thank @ericvsmith for the foundation here as well, thanks Eric!

gcewing · December 28, 2025, 1:27am

You’ll have to elaborate on that. How exactly does the type checker treat Ref[Network] in a way that’s different from just Network?

lex00 · December 28, 2025, 2:09am

Ref[Network] doesn’t behave (much) differently from Network.

The value is for frameworks, which can inspect type hints and treat Ref fields differently, for example by serializing them as references or building dependency graphs. It is more a convention than a type system feature.

Should Ref[T] affect type checking somehow, or is the introspection use case enough?

gcewing · December 28, 2025, 4:09am

I believe there’s already a mechanism for attaching non-type metadata to type hints. It sounds like it would be better to use that for these kinds of purposes rather than add a new type notation that type checkers would need to be taught about, even if just to ignore it.
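The mechanism in question is presumably typing.Annotated (PEP 593), which lets arbitrary metadata ride along with a type hint while type checkers see only the underlying type. A rough sketch of what that could look like follows; the Ref metadata class and the ref_targets helper are made-up names for illustration, not an existing API.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


class Ref:
    """Hypothetical metadata object; type checkers ignore it."""


@dataclass
class Network:
    cidr: str


@dataclass
class Subnet:
    # Checked as a plain Network; the Ref() marker is metadata only.
    network: Annotated[Network, Ref()]
    cidr: str


def ref_targets(cls):
    """Return {field name: target class} for fields carrying a Ref marker."""
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    hints = get_type_hints(cls, include_extras=True)
    targets = {}
    for name, hint in hints.items():
        if get_origin(hint) is Annotated:
            base, *metadata = get_args(hint)
            if any(isinstance(m, Ref) for m in metadata):
                targets[name] = base
    return targets


subnet = Subnet(network=Network("10.0.0.0/16"), cidr="10.0.1.0/24")
targets = ref_targets(Subnet)
```

Because the annotation resolves to Network for the type checker, passing a real Network instance type-checks as usual, and only code that asks for the extras ever sees the marker.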

Eneg · December 28, 2025, 5:35am

Every variable is a reference in Python, so what is it that you’re trying to achieve that network: Network doesn’t?

lex00 · December 28, 2025, 6:15am

Thanks @gcewing, I was able to use Annotated to achieve what I need. I deleted graph-refs and graph-refs-dataclasses.

@Eneg Annotated[T, Ref()] is just a semantic distinction between a field that is simply typed as a class and one that is an actual dependency reference. This only matters if it helps your implementation; in my case it does.
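As one illustration of where that distinction could help an implementation, the sketch below derives a creation order from marked fields using the standard library's graphlib. It is only an assumption about how such a framework might work: Ref, dependencies, and creation_order are hypothetical names, and Gateway is an invented example class.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Annotated, get_args, get_origin, get_type_hints


class Ref:
    """Hypothetical marker: the annotated field is a dependency reference."""


@dataclass
class Network:
    cidr: str


@dataclass
class Gateway:
    network: Annotated[Network, Ref()]


@dataclass
class Subnet:
    network: Annotated[Network, Ref()]
    gateway: Annotated[Gateway, Ref()]
    description: str = ""  # plain field, not a dependency


def dependencies(cls):
    """Collect the classes that cls points at through Ref-marked fields."""
    deps = set()
    for hint in get_type_hints(cls, include_extras=True).values():
        if get_origin(hint) is Annotated:
            base, *metadata = get_args(hint)
            if any(isinstance(m, Ref) for m in metadata):
                deps.add(base)
    return deps


def creation_order(classes):
    """Order classes so that every class comes after its dependencies."""
    graph = {cls: dependencies(cls) for cls in classes}
    return list(TopologicalSorter(graph).static_order())


order = creation_order([Subnet, Gateway, Network])
```

With these three classes the only valid order is Network, then Gateway, then Subnet. A field annotated with the bare class would be left out of the graph, which is the practical effect of the marker.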