Founder essay

OpenMatter Thesis: Making the Physical World Callable

Iddris Sandu

May 25, 2026

This is a founder note on where I think agentic systems are heading — not a product brief. The bottleneck in physical-world work is no longer reasoning. It is whether the field builds shared execution infrastructure: items and places you can reference, authority you can scope, context that inherits and persists, and proof that closes the loop when software touches the real world.

The next chapter is not on a screen

Intelligent systems are leaving the document. The work that still resists automation is physical: moving and storing things, unlocking access, coordinating people and machines, confirming what actually happened in the world.

Call the easy part pixel work: anything reducible to tokens on a screen — writing, coding, analysis, support tickets, scheduling. The marginal cost of that work is racing toward zero. Models improve; agents multiply. Differentiation on “we use AI for the same digital tasks” will not last.

Leverage is moving to atoms: custody, mobility, access, fulfillment — execution that software alone cannot finish. The companies that matter in this cycle will not be the ones with the best chat surface. They will be the ones that make physical execution legible, governable, and composable.

That is the field I am betting on. Not better transcripts — better rails.

What I see in the field

Agents are marketed as autonomous, but most products still assume a human operator behind the glass: someone approves spend, someone meets the driver, someone re-enters billing and shipping when the workflow hops to another agent.

Tool use improved what agents can request. Orchestration improved how they chain requests. Neither installed a standard completion path when the outcome requires presence, custody, or permission in the world. A successful HTTP response is not operational completion.

Payment and checkout are advancing on their own axis: scoped credentials, machine payments per API call, structured carts instead of scraping HTML. That is real progress for software-metered access. It does not cover the physical steps: custody accepted on an actor’s behalf, access granted at a door, or an item delivered. And it does not preserve billing, shipping, and mandate state when a different agent picks up the thread.

The market is also naming the gap in blunt ways: humans-for-hire wired to agents to receive on site or run errands because the default stack cannot complete those steps. That is an honest admission, not a substitute for governed execution.

I see the same pattern everywhere I look: plausible language, partial artifacts, and a person or field system closing what software left open.

The physical world is still pre-API

APIs gave software stable references that can be queried, composed, and revoked. The physical world still behaves like software before that layer: items and places exist, but agents cannot resolve them durably across services.

Watch any demo that “dispatches a courier.” The agent often re-derives the item, the site, and the mandate on every turn. That is fine for a screenshot. It collapses the moment the workflow runs for hours and crosses providers.

Agentic commerce is the same story under nicer UI. The agent may find a product and call checkout APIs, but the actor still supplies payment method, billing address, and shipping address at nearly every boundary — or a human approves each field again. Payment cleared is not delivery completed.

Product language splits payments, mobility, custody, and state into separate nouns. In execution they are one problem: context has to travel when the agent crosses a boundary.

Read the world, act on it, leave a receipt

The loop I want the field to standardize is simple: read state, form intent, execute through the right channel, return a receipt that updates shared context.
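As a rough illustration of that loop, here is a minimal Python sketch. It is not an OpenMatter API: `SharedContext`, `run_step`, and the stand-in callables are hypothetical names chosen for this example, and the "execution" is a plain function rather than a real courier or storage service.

```python
from dataclasses import dataclass, field


@dataclass
class SharedContext:
    """Hypothetical shared state that every step of a run reads and updates."""
    state: dict = field(default_factory=dict)
    receipts: list = field(default_factory=list)


def run_step(ctx, form_intent, execute):
    """One pass of the loop: read state, form intent, execute, return a receipt."""
    snapshot = dict(ctx.state)        # 1. read state
    intent = form_intent(snapshot)    # 2. form intent from what was read
    changes = execute(intent)         # 3. execute through some channel
    receipt = {"intent": intent, "changes": changes}
    ctx.state.update(changes)         # 4. the receipt updates shared context
    ctx.receipts.append(receipt)
    return receipt


# Usage with stand-in callables; no real provider is contacted.
ctx = SharedContext(state={"item-1": "in_storage"})
receipt = run_step(
    ctx,
    form_intent=lambda state: {"action": "dispatch", "item": "item-1"},
    execute=lambda intent: {intent["item"]: "with_courier"},
)
```

The point of the sketch is only the shape: the next step starts from `ctx.state` and `ctx.receipts` instead of rebuilding the item and its status from a prompt.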

Atoms have to flow into bits so agents can see custody, location, and constraints. Intent has to flow back into atoms so agents are not permanently advisory. Without the first, they are blind. Without the second, they are slides.

Imagine an item in storage that must reach an actor via a courier, with payment, access, and handoff scoped to the same run. In today’s stacks, each step is often a fresh tool call with locally rebuilt context. In the world I am working toward, the item, site, route, mandate, and grant are already addressable — the agent composes them instead of rediscovering them from prompts.

Proof is not optional logging. A receipt should say what was intended, under which grant, what changed in the world, and what blocked if execution stopped. Spend proof and completion proof are different receipts. The industry is starting to say the same thing about verification — did the agent do what was asked, not only did payment clear? I think that instinct is right and still incomplete without physical completion.
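One way to picture such a receipt is the sketch below. The field names (`kind`, `intent`, `grant_id`, `changes`, `blocked_by`) and the helper `physically_complete` are assumptions made for illustration, not a published schema. The sketch keeps spend proof and completion proof as separate receipts, so a cleared payment cannot be mistaken for a delivered item.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Receipt:
    """Hypothetical receipt: what was intended, under which grant, what changed."""
    kind: str                         # "spend" or "completion"
    intent: str                       # what was intended
    grant_id: str                     # the grant the step ran under
    changes: tuple = ()               # what changed in the world
    blocked_by: Optional[str] = None  # what blocked, if execution stopped

    @property
    def completed(self) -> bool:
        # Completed means something changed and nothing blocked the step.
        return self.blocked_by is None and len(self.changes) > 0


def physically_complete(receipts) -> bool:
    """True only if some completion receipt closed; spend receipts never count."""
    return any(r.kind == "completion" and r.completed for r in receipts)


paid = Receipt("spend", "pay courier fee", "grant-42", changes=("charge captured",))
blocked = Receipt("completion", "deliver item-1", "grant-42",
                  blocked_by="no one on site to accept handoff")
delivered = Receipt("completion", "deliver item-1", "grant-42",
                    changes=("item-1 handed to recipient",))
```

With these values, `physically_complete([paid, blocked])` is false even though the payment receipt is complete, and it becomes true only once `delivered` is added to the run.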

What I think has to exist

Software already has permissions, roles, logs, and revocation. The physical world still runs on fragmented keys, apps, schedules, phone calls, and improvised human handoffs.

A control surface is what makes the difference: who may act, when, under which constraints, with which proof. That is how physical execution stops being ad hoc coordination and becomes infrastructure agents can actually run.
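A control surface of this kind can be sketched as a scoped grant plus a check that runs before any physical action. Everything here is hypothetical: the `Grant` fields, the `authorize` function, and the actor and action names are illustrative assumptions, not an existing protocol.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Grant:
    """Hypothetical scoped grant: who may act, on what, and when."""
    actor: str
    actions: frozenset
    not_before: datetime
    not_after: datetime
    revoked: bool = False


def authorize(grant, actor, action, at):
    """Return (allowed, reason); the reason can be recorded in a receipt."""
    if grant.revoked:
        return False, "grant revoked"
    if actor != grant.actor:
        return False, "actor not named in grant"
    if action not in grant.actions:
        return False, "action outside grant scope"
    if not (grant.not_before <= at <= grant.not_after):
        return False, "outside grant time window"
    return True, "ok"


start = datetime(2026, 1, 1, 9, 0, tzinfo=timezone.utc)
grant = Grant(
    actor="courier-7",
    actions=frozenset({"unlock", "handoff"}),
    not_before=start,
    not_after=start + timedelta(hours=2),
)

in_window = start + timedelta(minutes=30)
unlock_result = authorize(grant, "courier-7", "unlock", in_window)
spend_result = authorize(grant, "courier-7", "spend", in_window)
late_result = authorize(grant, "courier-7", "unlock", start + timedelta(hours=3))

grant.revoked = True  # revocation takes effect on the next check
revoked_result = authorize(grant, "courier-7", "unlock", in_window)
```

The design choice worth noting is that a refusal returns a reason rather than failing silently, so a blocked step can still leave a record of why it stopped.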

Five requirements keep showing up when I pressure-test real workflows. Addressability — stable handles for the item, site, and services. Bidirectional translation — custody readable, dispatch writable. Scoped authority — grants for unlock, spend, and handoff, not vague permission. Context — inherited before the action starts, multimodal evidence bound to the same run, persistent references across providers without session amnesia. Proof — receipts that tie each physical step back to intent.

Context is three problems, not one slogan. Inherited context is mandates, custody, and prior grants that should enter every action without prompt archaeology. Multimodal context is scans, photos, location, signatures — field evidence tied to the run, not orphaned in chat. Persistent context is the same item, place, and policy references from custody through commerce through admission.

These should compose as one execution harness — not a scattered feature list per vendor. Most stacks today chain API calls and hope someone closes the gaps. The vision I want to carry forward assumes the gaps are modeled: visible before execution, receipted after, revocable by policy — including when a physical step must still be assigned to a person or robot with explicit authority.

Where workflows still break

When I pressure-test products claiming operational agents, the same failure modes return. They are not academic — they are what I see in demos and roadmaps.

Custody on behalf of an actor: software may know an item is arriving, but cannot be present at a home, curb, counter, or facility to witness the handoff and record it under grant.

Outbound handoff: drafts and forms, not scoped release authority a carrier can execute against.

Mobility: reservation IDs without durable infrastructure state — the actor still standing outside while the run thinks it succeeded.

Access: “code sent” without admission bound to policy, receipted and revocable.

Commerce: checkout without end-to-end actor context — payment, billing, shipping, fulfillment proof, continuity when another agent resumes.

Context across the run: the next turn cannot answer where the item is, who may release it, what mobility ran, who was admitted, what was purchased for whom — without rebuilding from prompts.

If most of these are still open, the product is selling autonomy while shipping human proxy dependency. The field should be honest about that gap instead of hiding it behind better chat.

The vision I want to carry forward

One question captures what I care about: when a workflow crosses a physical boundary, do context, authority, and proof travel with it — or does a human carry what software forgot?

Every real-world action inherits context before it starts — preference, place, time, identity, custody, prior choices, local constraints. The field needs protocols that make that context portable without making it invisible. People need to see what an agent can identify, what authority it carries, and how to revoke it.

Agency must stay with people. The goal is not automation for its own sake. It is amplification: less friction, more legibility, override when it matters.

OpenMatter is where I am placing that bet — open protocols and software so agents can act across human-built systems with governed execution. I do not believe one company owns the answer. I do believe the design bar is real: items addressable, authority explicit, services composable, context that inherits and persists, proof that closes the loop.

The physical world does not need another chat interface. It needs callable execution, and a field that treats physical completion with the same seriousness that software already gives to API success.

Author

Iddris Sandu