Pre-alpha · Milestone M1 · 223+ tests

The operating system for AI agents.

Orchestrate, run, and govern multi-agent flows with contracts that don't drift. Self-host day one. Air-gap supported. Built on AgentsKit.

Join the waitlist

Desktop app coming soon · CLI alpha shipping in M1

12 ADRs published
6 RFCs in review
30+ LLM adapters
>90% core test coverage
MIT open license

Flow engine

DAG-native. Durable. Time-travel debuggable.

Compose agents into flows: compare, vote, debate, auction, blackboard. Pause for humans. Branch from any past step.

Example flow nodes: Trigger (webhook) · Researcher (claude-opus) · Critic (gpt-4.1) · Vote (quorum=2) · HITL gate
compare · vote · debate · auction · blackboard · checkpoint · replay · branch-from-step
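
To make "DAG-native" concrete, here is a minimal sketch of how a flow graph could be turned into an execution order with Kahn's algorithm. The `Flow` type, the `executionOrder` function, and the edges in the usage note are assumptions for illustration only; this is not the AgentsKit OS flow engine or its API.

```typescript
// Hypothetical flow shape: node ids plus directed edges [from, to].
type Flow = { nodes: string[]; edges: [string, string][] };

// Kahn's algorithm: returns one valid execution order,
// or throws if the graph references unknown nodes or contains a cycle.
function executionOrder(flow: Flow): string[] {
  // Prototype-free objects so node ids like "constructor" cannot collide.
  const indegree: Record<string, number> = Object.create(null);
  const next: Record<string, string[]> = Object.create(null);
  for (const n of flow.nodes) {
    indegree[n] = 0;
    next[n] = [];
  }
  for (const [from, to] of flow.edges) {
    if (indegree[from] === undefined || indegree[to] === undefined) {
      throw new Error(`unknown node in edge ${from} -> ${to}`);
    }
    next[from].push(to);
    indegree[to] += 1;
  }
  // Start with every node that has no incoming edge.
  const ready = flow.nodes.filter((n) => indegree[n] === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const n = ready.shift() as string;
    order.push(n);
    for (const m of next[n]) {
      indegree[m] -= 1;
      if (indegree[m] === 0) ready.push(m);
    }
  }
  if (order.length !== flow.nodes.length) {
    throw new Error("flow contains a cycle");
  }
  return order;
}
```

For example, with nodes `trigger`, `researcher`, `critic`, `vote`, `hitl` and assumed edges where the trigger fans out to both agents, both agents feed the vote, and the vote feeds the HITL gate, the function places `trigger` first, `vote` after both agents, and `hitl` last.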

Why it's different

Foundation over speed.

Existing agent platforms optimized for speed of shipping. The result: drift, lock-in, abandoned plugins. We optimize for the opposite.

Stable contracts

Zod at every boundary. SemVer strict. ADR before architecture. RFC before breaking changes. Backward-compat within a major.
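
As an illustration of "backward-compat within a major", the sketch below checks whether a provided contract version can serve a consumer built against a required one. The names `parseSemVer` and `satisfiesContract` are hypothetical, and the sketch assumes strict `MAJOR.MINOR.PATCH` strings with no pre-release tags and no special handling of `0.x`; it is not taken from the AgentsKit OS codebase.

```typescript
type SemVer = { major: number; minor: number; patch: number };

// Parse a strict MAJOR.MINOR.PATCH string; anything else is rejected.
function parseSemVer(v: string): SemVer {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (!m) throw new Error(`not a strict MAJOR.MINOR.PATCH version: ${v}`);
  return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]) };
}

// True when `provided` can serve a consumer built against `required`:
// same major, and at least the required minor.patch.
function satisfiesContract(provided: string, required: string): boolean {
  const p = parseSemVer(provided);
  const r = parseSemVer(required);
  if (p.major !== r.major) return false; // a major bump may break the contract
  if (p.minor !== r.minor) return p.minor > r.minor;
  return p.patch >= r.patch;
}
```

Under these assumptions, `satisfiesContract("1.4.0", "1.2.3")` is true, while `satisfiesContract("2.0.0", "1.9.9")` is false because the major differs.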

Zero lock-in

30+ LLM adapters. Self-host day one. Air-gap supported. Workspace lockfile guarantees byte-reproducible runs across machines.

Enterprise-native

Signed audit log (Merkle chain). Capability-based RBAC. Egress default-deny. SOC 2 / HIPAA / GDPR aligned, not bolted on.

Capabilities

Everything serious teams need.

Signed audit log

Merkle-chained, HSM-ready. Tamper-evident trails for regulated workloads.
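
The tamper-evidence idea can be sketched as a linear hash chain: each entry stores the hash of the previous one, so editing any past entry invalidates everything after it. This is a simplified illustration using Node's built-in `node:crypto`; the `AuditEntry` shape and function names are assumptions, and signing, HSM integration, and the project's actual Merkle structure are not shown.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape, for illustration only.
type AuditEntry = { index: number; event: string; prevHash: string; hash: string };

// Fixed starting value used as the "previous hash" of the first entry.
const GENESIS = "0".repeat(64);

function hashEntry(index: number, event: string, prevHash: string): string {
  return createHash("sha256").update(`${index}\n${prevHash}\n${event}`).digest("hex");
}

// Returns a new log with the event appended and linked to the previous entry.
function appendEntry(log: AuditEntry[], event: string): AuditEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  const index = log.length;
  return [...log, { index, event, prevHash, hash: hashEntry(index, event, prevHash) }];
}

// Recomputes every hash and link; any edited entry makes this return false.
function verifyLog(log: AuditEntry[]): boolean {
  let prevHash = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const e = log[i];
    if (e.index !== i || e.prevHash !== prevHash) return false;
    if (e.hash !== hashEntry(e.index, e.event, e.prevHash)) return false;
    prevHash = e.hash;
  }
  return true;
}
```

A chain like this only detects tampering by someone who cannot also rewrite every later hash; that gap is what signing the chain head is meant to close.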

OpenTelemetry gen_ai

Datadog, Honeycomb, Langfuse, New Relic, Grafana, PostHog — out of the box.

MCP bridge v2

Publish AgentsKit tools as MCP servers. Consume any MCP server. Bidirectional.

Generative OS

Natural language → agent, flow, trigger, or tool. Editable, never opaque.

Run modes

production · preview · dry-run · replay · simulate · deterministic. Pick the safety floor.

Multi-agent topologies

compare · vote · debate · auction · blackboard. ReAct loops. Speculative execution.
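
As one example of these topologies, a `vote` step with a quorum could be sketched as below. The `Ballot` and `VoteResult` types and the `tallyVotes` function are hypothetical, and ties are resolved in favor of the first choice seen; a real engine might instead escalate a tie to a human gate.

```typescript
// Hypothetical types, not the AgentsKit OS API.
type Ballot = { agent: string; choice: string };

type VoteResult =
  | { status: "decided"; choice: string; votes: number }
  | { status: "no-quorum"; tally: Record<string, number> };

// Count ballots and return the leading choice only if it reaches the quorum.
function tallyVotes(ballots: Ballot[], quorum: number): VoteResult {
  const tally: Record<string, number> = {};
  for (const b of ballots) {
    tally[b.choice] = (tally[b.choice] ?? 0) + 1;
  }
  let bestChoice = "";
  let bestVotes = 0;
  for (const choice of Object.keys(tally)) {
    const votes = tally[choice] ?? 0;
    if (votes > bestVotes) {
      bestChoice = choice;
      bestVotes = votes;
    }
  }
  if (bestVotes > 0 && bestVotes >= quorum) {
    return { status: "decided", choice: bestChoice, votes: bestVotes };
  }
  return { status: "no-quorum", tally };
}
```

With `quorum = 2`, two agents choosing `approve` and one choosing `reject` yields a decided `approve`; two agents that disagree yield `no-quorum`.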

Pre-flight cost estimate

Token + dollar projection before run. Live counter during. Per-tenant guardrails.
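
The arithmetic behind a pre-flight estimate can be sketched as tokens times a per-model price, summed over steps, then compared against a tenant limit. The types, function names, model ids, and prices here are placeholders chosen for illustration, not real pricing or the project's API.

```typescript
// Hypothetical price table in USD per million tokens (placeholder values).
type Price = { inputPerMTok: number; outputPerMTok: number };

// Worst-case token budget for one step of a flow.
type StepEstimate = { model: string; inputTokens: number; maxOutputTokens: number };

// Sum the projected dollar cost of every step before anything runs.
function estimateCostUsd(steps: StepEstimate[], prices: Record<string, Price>): number {
  let total = 0;
  for (const s of steps) {
    const p = prices[s.model];
    if (!p) throw new Error(`no price configured for model ${s.model}`);
    total += (s.inputTokens / 1e6) * p.inputPerMTok;
    total += (s.maxOutputTokens / 1e6) * p.outputPerMTok;
  }
  return total;
}

// Per-tenant guardrail: refuse to start a run whose projection exceeds the limit.
function withinBudget(estimateUsd: number, tenantLimitUsd: number): boolean {
  return estimateUsd <= tenantLimitUsd;
}
```

Using the maximum output tokens gives an upper bound rather than an expected cost, which suits a guardrail check.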

Sandbox runtimes

Side-effect declarations + tiered isolation. e2b built-in. Bring your own runtime.

Built for

Four wedges. One platform.

Healthcare & clinical

Air-gap mode. Safe-Harbor PII redaction. Patient consent + break-glass. Determinism mode for regulated decisions.

Coding & dev tooling

Repo-aware agents. Multi-runtime sandbox. Diff primitives. Cost-per-PR. Local-model fallback for offline work.

Marketing agencies

Multi-client workspace isolation. BrandKit (tone, banned phrases, disclaimers). Approval HITL. Per-client cost reporting.

Ops & SRE

Durable flows. Cron + webhook + CDC triggers. Cost heat map. Anomaly detection on traces. PagerDuty + Slack native.

Architecture

Thin layer. Strong contracts.

@agentskit/os-core stays under 15 KB gzipped. Everything else is independently installable. Use one piece without the desktop.

os-desktop · os-cli · os-flow · os-triggers
@agentskit/os-core: Zod contracts · event bus · principal/cap · errors · workspace model
os-security · os-marketplace · os-mcp-bridge · os-cloud-sync
AgentsKit (upstream): core · runtime · adapters · memory · tools · skills · rag · sandbox · eval
<15 KB core gzipped
<800 ms cold start
<15 MB installer size
<60 s time to first agent

Quick start

Six commands to your first agent.

Pre-flight cost estimates. Workspace lockfile. Docker deploy. All from the CLI.

  • init: Scaffold workspace + sane defaults
  • doctor: Diagnose env, providers, sandbox
  • run: Execute flow with run-mode + estimate
  • lock: Pin versions for reproducibility
  • deploy: Ship to docker / cloud target
~/projects/my-flow
# install once core+cli ship
pnpm add -D @agentskit/os-cli
# scaffold
pnpm agentskit-os init
# diagnose
pnpm agentskit-os doctor
# run with cost estimate first
pnpm agentskit-os run pr-review --mode preview --estimate
# lock + ship
pnpm agentskit-os lock
pnpm agentskit-os deploy --target docker

Roadmap

Eight milestones to 1.0.

Public process. Every milestone ships ADRs, RFCs, tests, docs. No surprises.

  1. M1: Core schemas + CLI alpha (In progress)
  2. M2: Desktop shell · FlowEditor · TraceViewer (Up next)
  3. M3: Flow engine · DAG · durable · HITL (Planned)
  4. M4: Triggers · MCP bridge v2 (Planned)
  5. M5: Marketplace · plugin host (Planned)
  6. M6: Observability · audit signing · vault (Planned)
  7. M7: Generative OS (NL → flow) (Planned)
  8. M8: Cloud sync · CRDT collab · 1.0 (Planned)

Coming soon

Desktop app. Marketplace. Cloud sync.

Get early access. Shape the contracts before 1.0. Enterprise pilots opening Q3 2026.

No spam. Updates roughly every milestone.