
Capability Graphs for Delegated Physical-World Action

Loose tool lists collapse under physical execution; capability graphs preserve the dependencies between identity, policy, service access, state, and order.

May 26, 2026

Central claim

The research direction is to make an action boundary visible before execution: what the agent can do, what it cannot do, and which dependencies shape the plan.

OpenMatter alignment

OpenMatter describes primitives for places, movement, storage, access, and receipts; capability graphs make those primitives inspectable and composable.

Abstract

Background

Physical-world agent plans often depend on chained services (places, access, payment, messaging, coverage) whose dependencies flat tool lists hide.

Objective

Formalize capability graphs as a planning structure for delegated physical-world action.

Methods

We compare tool-list planning with graph-based planning on dependency visibility, missing-scope detection, and pre-execution inspection.

Results

Capability graphs make dependencies explicit and treat missing capability as useful feedback before execution.

Conclusion

Graphs are a necessary planning primitive when agents compose services that touch atoms.

Keywords

capability graphs · agent planning · delegated action · tool composition · physical-world services · policy boundaries

Introduction

Tool lists hide the dependencies that matter in physical execution. A courier action may depend on places, access, payment, messaging, and coverage.

Capability graphs make these dependencies explicit, giving both humans and agents a way to inspect what can happen before anything is attempted.

Flat tool registries optimize for “what can the model call?” Physical execution requires “what must be true before this call is allowed?” Those are different questions. Graphs answer the second.

Literature Review

Prior agent frameworks emphasize tool registration and invocation order, but rarely model cross-service constraints as first-class graph edges.

Physical workflows require identity, policy, service access, state, and ordering to remain linked through the plan, not reconstructed at each step.

Orchestration UIs often show a linear trace after the fact. They rarely show a dependency graph before execution, which is the point at which developers and users can still change the plan safely.

When a step fails mid-run, tool-list systems tend to attribute the failure to the last API called. Graph systems surface the missing upstream capability instead: no payment scope, no resolved place, no custody record, no access grant.

Methodology

We model capabilities as nodes and dependencies as edges: scope requirements, service bindings, state preconditions, and ordering constraints.

Evaluation focuses on whether blocked steps are detected before execution and whether missing services produce actionable planning feedback.

Nodes represent executable capabilities: resolve place, authorize spend, request courier, unlock locker, emit receipt. Edges represent prerequisites: custody must be known before dispatch; payment scope before spend; place before route.
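The node-and-edge model above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the capability names and the exact edge set are assumptions chosen to mirror the examples in the text, and the standard-library graphlib module stands in for whatever graph representation a real system would use.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical capability names. Each key maps to the set of capabilities
# that must be satisfied first (its prerequisite edges).
PREREQUISITES = {
    "resolve_place": set(),
    "know_custody": {"resolve_place"},
    "grant_payment_scope": set(),
    "authorize_spend": {"grant_payment_scope"},  # payment scope before spend
    "quote_route": {"resolve_place"},            # place before route
    "request_courier": {                         # custody before dispatch
        "know_custody",
        "authorize_spend",
        "quote_route",
    },
    "emit_receipt": {"request_courier"},
}


def is_consistent(prerequisites):
    """Return False if the ordering constraints contradict each other."""
    try:
        TopologicalSorter(prerequisites).prepare()
    except CycleError:
        return False
    return True


def execution_order(prerequisites):
    """Return one ordering that respects every prerequisite edge."""
    return list(TopologicalSorter(prerequisites).static_order())


order = execution_order(PREREQUISITES)
print(order)
```

Keeping prerequisites as data rather than as call order means the same structure can be inspected, validated for contradictions, and rendered before any step runs.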

Results

Blocked steps should not be treated as failure. They are often useful safety outcomes: the system detected a missing permission, missing service, missing address, or unresolved dependency before execution.

Consider a courier plan as a minimal graph. Root intent: deliver parcel P to recipient R. Nodes: (1) resolve storage site S, (2) read custody for P at S, (3) bind payment mandate M, (4) quote route, (5) create delivery, (6) handoff receipt. Edges enforce order and scope: (2) requires addressable P and S; (5) requires (3) and (4); (6) requires (5) or an explicit blocked outcome.

If node (3) is missing, a tool list might still invoke (5) and fail at runtime. The graph blocks (5) at plan time with a legible reason: missing payment capability. The agent or human can fix scope before anything physical happens.
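The courier example can be made concrete with a small plan-time check. The sketch below rests on several assumptions: the node names, the mapping of nodes to capability labels, and the reading of "(2) requires addressable P and S" as a dependency on node (1) are all illustrative. The "or an explicit blocked outcome" alternative for node (6) is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    capability: str       # capability the agent must hold to run this node
    requires: tuple = ()  # names of upstream nodes


# Hypothetical encoding of the six-node courier plan, in dependency order.
PLAN = [
    Node("resolve_site", "places"),
    Node("read_custody", "storage", ("resolve_site",)),
    Node("bind_mandate", "payment"),
    Node("quote_route", "movement"),
    Node("create_delivery", "movement", ("bind_mandate", "quote_route")),
    Node("handoff_receipt", "receipts", ("create_delivery",)),
]


def check_plan(plan, granted):
    """Return {node name: (root node, reason)} for nodes blocked at plan time.

    Assumes `plan` lists nodes so that prerequisites come before dependents.
    """
    blocked = {}
    for node in plan:
        if node.capability not in granted:
            blocked[node.name] = (node.name, f"missing {node.capability} capability")
            continue
        upstream = [r for r in node.requires if r in blocked]
        if upstream:
            # Carry the root cause forward instead of blaming the nearest step.
            blocked[node.name] = blocked[upstream[0]]
    return blocked


# The payment capability has not been granted, so node (3) cannot be bound.
blocked = check_plan(PLAN, granted={"places", "storage", "movement", "receipts"})
for name, (root, reason) in blocked.items():
    print(f"{name}: blocked at {root} ({reason})")
```

Nothing is invoked here. The check only reads the plan and the granted scopes, which is what lets the reason reach the agent or human before anything physical happens.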

Missing capability as feedback is a design feature. It converts silent runtime failure into inspectable planning state.

Discussion

Graphs also support partial execution: some branches complete while others remain blocked, each with receipts. That matches how physical operations actually unfold, since not every dependency resolves on the first attempt.
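Partial execution can be illustrated with a simulated walk over the graph. This is a sketch under stated assumptions: the node names, the receipt fields, and the notion of "granted" capabilities are hypothetical, and no real service is called. A node is simply marked completed or blocked.

```python
from graphlib import TopologicalSorter

# Hypothetical graph with two independent branches: storage and payment.
GRAPH = {
    "resolve_place": set(),
    "read_custody": {"resolve_place"},
    "unlock_locker": {"read_custody"},
    "authorize_spend": {"payment_scope"},
}


def run_partial(prerequisites, granted):
    """Walk the graph in dependency order and return one receipt per node."""
    receipts = {}
    for node in TopologicalSorter(prerequisites).static_order():
        # Prerequisites always appear earlier in the order, so their
        # receipts already exist when this node is reached.
        unmet = sorted(
            p for p in prerequisites.get(node, ())
            if receipts[p]["status"] != "completed"
        )
        if node not in granted:
            receipts[node] = {"status": "blocked", "reason": "capability not granted"}
        elif unmet:
            receipts[node] = {"status": "blocked", "reason": "waiting on " + ", ".join(unmet)}
        else:
            receipts[node] = {"status": "completed", "reason": None}
    return receipts


# Payment scope is not granted: the storage branch still completes.
receipts = run_partial(
    GRAPH,
    granted={"resolve_place", "read_custody", "unlock_locker", "authorize_spend"},
)
for node, receipt in receipts.items():
    print(node, receipt)
```

Every node yields a receipt whether it ran or not, so a blocked branch leaves the same kind of record as a completed one.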

Future work should test whether graph inspection reduces unsafe or impossible plans across heterogeneous provider sets.

Limitations: graph richness must be bounded. Over-modeling every edge creates noise; under-modeling recreates tool-list blindness. The right granularity is workflow-dependent and should be validated in live ADK runs.

Conclusion

Loose tool lists collapse under physical execution. Capability graphs preserve the dependencies agents need to plan governed action across real-world systems.

For builders, the practical test is simple: can you see why a step is not allowed before you spend money, move inventory, or unlock a door? If not, you have a tool list. If yes, you have a graph.

Contributions

A graph-based alternative to flat tool lists for physical execution.
A model for missing capability as pre-execution safety feedback.
Shared graph primitives across movement, storage, payment, and access.

Declarations

Funding

This work was conducted as part of OpenMatter internal research and protocol design.

Conflict of interest

The authors are affiliated with OpenMatter.

Ethics approval

Not applicable. This essay presents systems research and protocol proposals rather than human-subjects experimentation.

Data availability

Supporting notes and implementation artifacts referenced in this essay are available to qualified researchers on request.
