Pre-alpha · Milestone M1 · 223+ tests

The operating system
for AI agents.

Orchestrate, run, and govern multi-agent flows with contracts that don't drift. Self-host day one. Air-gap supported. Built on AgentsKit.

Join the waitlist

Desktop app coming soon · CLI alpha shipping in M1

ADRs published

RFCs in review

30+

LLM adapters

>90%

Core test coverage

MIT

Open license

Flow engine

DAG-native. Durable. Time-travel debuggable.

Compose agents into flows: compare, vote, debate, auction, blackboard. Pause for humans. Branch from any past step.

compare · vote · debate · auction · blackboard · checkpoint · replay · branch-from-step
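To make the "vote" topology concrete, here is a minimal sketch under stated assumptions. The `Agent` type, `tallyVotes`, and `vote` are hypothetical names invented for illustration, not the AgentsKit OS API. The sketch sends one input to several agents in parallel and keeps the most common answer.

```typescript
// Hypothetical names throughout; not the AgentsKit OS API.
type Agent = (input: string) => Promise<string>;

interface VoteResult {
  winner: string;
  tally: Map<string, number>;
}

// Count identical answers. On a tie, the answer that reached the top count first wins.
function tallyVotes(answers: string[]): VoteResult {
  if (answers.length === 0) throw new Error("vote needs at least one answer");
  const tally = new Map<string, number>();
  let winner = answers[0];
  for (const answer of answers) {
    const count = (tally.get(answer) ?? 0) + 1;
    tally.set(answer, count);
    if (count > (tally.get(winner) ?? 0)) winner = answer;
  }
  return { winner, tally };
}

// Fan the same input out to every agent in parallel, then vote on the replies.
async function vote(agents: Agent[], input: string): Promise<VoteResult> {
  const answers = await Promise.all(agents.map((agent) => agent(input)));
  return tallyVotes(answers);
}
```

Real agents rarely return byte-identical text, so a practical vote would likely compare a normalized or structured verdict rather than raw strings.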

Why it's different

Foundation over speed.

Existing agent platforms optimized for speed of shipping. The result: drift, lock-in, abandoned plugins. We optimize for the opposite.

Stable contracts

Zod at every boundary. SemVer strict. ADR before architecture. RFC before breaking changes. Backward-compat within a major.
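One way to read "backward-compat within a major" is as a version check like the one sketched below. This is an illustration only: `parseSemVer` and `isBackwardCompatible` are hypothetical helpers, and the sketch assumes plain MAJOR.MINOR.PATCH strings, ignoring pre-release tags and the looser rules SemVer applies to 0.x versions.

```typescript
interface SemVer {
  major: number;
  minor: number;
  patch: number;
}

// Parse a plain MAJOR.MINOR.PATCH string (assumption: no pre-release or build tags).
function parseSemVer(version: string): SemVer {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`not a plain MAJOR.MINOR.PATCH version: ${version}`);
  return { major: Number(match[1]), minor: Number(match[2]), patch: Number(match[3]) };
}

// True when code built against `required` can run on `provided`:
// same major, and `provided` is at least as new as `required`.
function isBackwardCompatible(required: string, provided: string): boolean {
  const req = parseSemVer(required);
  const got = parseSemVer(provided);
  if (req.major !== got.major) return false;
  if (got.minor !== req.minor) return got.minor > req.minor;
  return got.patch >= req.patch;
}
```

Under this reading, a plugin built against 1.2.0 keeps working on 1.3.1 but has to be re-validated against 2.0.0.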

Zero lock-in

30+ LLM adapters. Self-host day one. Air-gap supported. Workspace lockfile guarantees byte-reproducible runs across machines.
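One ingredient of byte-reproducibility is a lockfile that serializes to identical bytes on every machine. The sketch below shows the general idea with sorted-key JSON. The `canonicalize` helper and the lockfile fields in the usage note are assumptions for illustration; the actual workspace lockfile format is not described here.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted at every level, so the same data always
// produces the same bytes regardless of key insertion order.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  const entries = Object.keys(value)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
  return `{${entries.join(",")}}`;
}
```

Two hypothetical lockfiles such as `{ core: "0.1.0", adapters: { a: "1.0.0" } }` and `{ adapters: { a: "1.0.0" }, core: "0.1.0" }` then serialize to the same string, which makes them safe to hash and compare.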

Enterprise-native

Signed audit log (Merkle chain). Capability-based RBAC. Egress default-deny. SOC 2 / HIPAA / GDPR aligned, not bolted on.
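"Egress default-deny" means outbound calls are blocked unless a rule explicitly allows them. A minimal sketch of that rule follows. The `EgressPolicy` shape and the wildcard syntax are assumptions for illustration, not the product's policy format.

```typescript
// Hypothetical policy shape: each rule is an exact host or a "*.suffix" wildcard.
interface EgressPolicy {
  allowHosts: string[];
}

// Default-deny: a host is allowed only if some rule matches it.
// An empty policy therefore blocks everything.
function isEgressAllowed(policy: EgressPolicy, host: string): boolean {
  const target = host.toLowerCase();
  return policy.allowHosts.some((rule) => {
    const normalized = rule.toLowerCase();
    if (normalized.startsWith("*.")) {
      // "*.example.com" matches "api.example.com" but not "example.com" itself.
      return target.endsWith(normalized.slice(1));
    }
    return target === normalized;
  });
}
```

The design choice worth noting is the direction of the default: forgetting a rule causes a blocked call, never an unexpected outbound connection.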

Capabilities

Everything serious teams need.

Signed audit log

Merkle-chained, HSM-ready. Tamper-evident trails for regulated workloads.
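Tamper evidence in a chained log comes from each entry committing to the hash of the one before it. The sketch below shows only that chaining step. All names are hypothetical, signing and HSM integration are out of scope, and the default hash is a 32-bit FNV-1a-style stand-in chosen so the example runs without imports. It is not collision-resistant; a real log would use a cryptographic hash such as SHA-256.

```typescript
type HashFn = (data: string) => string;

// Stand-in hash (FNV-1a style over UTF-16 code units). NOT cryptographically secure.
const fnv1a: HashFn = (data) => {
  let h = 0x811c9dc5;
  for (let i = 0; i < data.length; i++) {
    h ^= data.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("0000000" + h.toString(16)).slice(-8);
};

interface AuditEntry {
  index: number;
  event: string;
  prevHash: string;
  hash: string;
}

const GENESIS = "00000000";

// Append an entry whose hash covers its index, the previous hash, and the event.
function appendEntry(log: AuditEntry[], event: string, hash: HashFn = fnv1a): AuditEntry[] {
  const index = log.length;
  const prevHash = index > 0 ? log[index - 1].hash : GENESIS;
  const entry: AuditEntry = { index, event, prevHash, hash: hash(`${index}|${prevHash}|${event}`) };
  return [...log, entry];
}

// Return the index of the first entry that fails verification, or -1 if the chain is intact.
function verifyChain(log: AuditEntry[], hash: HashFn = fnv1a): number {
  let prevHash = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const entry = log[i];
    const expected = hash(`${i}|${prevHash}|${entry.event}`);
    if (entry.index !== i || entry.prevHash !== prevHash || entry.hash !== expected) return i;
    prevHash = entry.hash;
  }
  return -1;
}
```

Editing any past event changes its expected hash, so verification fails at that entry unless every later entry is rewritten too. Signing the chain head is what would make such a rewrite detectable.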

OpenTelemetry gen_ai

Datadog, Honeycomb, Langfuse, New Relic, Grafana, PostHog — out of the box.

MCP bridge v2

Publish AgentsKit tools as MCP servers. Consume any MCP server. Bidirectional.

Generative OS

Natural language → agent, flow, trigger, or tool. Editable, never opaque.

Run modes

production · preview · dry-run · replay · simulate · deterministic. Pick the safety floor.

Multi-agent topologies

compare · vote · debate · auction · blackboard. ReAct loops. Speculative execution.

Pre-flight cost estimate

Token + dollar projection before run. Live counter during. Per-tenant guardrails.
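A pre-flight estimate can be as simple as multiplying projected tokens by a price table and comparing the total to a tenant limit. The sketch below assumes a hypothetical flow made of steps with known input sizes and output caps. The type names, the `model-a` identifier, and the prices in the usage note are invented for illustration.

```typescript
// All names and prices here are illustrative assumptions.
interface ModelPrice {
  inputPer1K: number; // USD per 1,000 input tokens
  outputPer1K: number; // USD per 1,000 output tokens
}

interface StepEstimate {
  model: string;
  inputTokens: number;
  maxOutputTokens: number; // worst case: the step uses its whole output cap
}

interface CostEstimate {
  tokens: number;
  usd: number;
}

// Sum worst-case tokens and dollars across every step of a flow.
function estimateCost(steps: StepEstimate[], prices: Record<string, ModelPrice>): CostEstimate {
  let tokens = 0;
  let usd = 0;
  for (const step of steps) {
    const price = prices[step.model];
    if (!price) throw new Error(`no price configured for model ${step.model}`);
    tokens += step.inputTokens + step.maxOutputTokens;
    usd += (step.inputTokens / 1000) * price.inputPer1K;
    usd += (step.maxOutputTokens / 1000) * price.outputPer1K;
  }
  return { tokens, usd };
}

// Per-tenant guardrail: refuse to start a run whose projection exceeds the limit.
function withinBudget(estimate: CostEstimate, tenantLimitUsd: number): boolean {
  return estimate.usd <= tenantLimitUsd;
}
```

With a made-up price of $0.50 per 1K input tokens and $1.50 per 1K output tokens, a step with 2,000 input tokens and a 1,000-token output cap projects to $2.50 before anything runs.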

Sandbox runtimes

Side-effect declarations + tiered isolation. e2b built-in. Bring your own runtime.
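Side-effect declarations and tiered isolation could fit together as follows: a tool declares what it touches, and the runtime picks the weakest tier that still contains those effects. The effect names, tier names, and mapping below are assumptions made up for this sketch, not the product's actual taxonomy.

```typescript
// Hypothetical effect and tier names, ordered from least to most isolated.
type SideEffect = "none" | "fs-read" | "fs-write" | "network" | "process";
type Tier = "in-process" | "worker" | "container";

const TIER_ORDER: Tier[] = ["in-process", "worker", "container"];

// Assumed mapping from each declared effect to the minimum tier that contains it.
const MIN_TIER: Record<SideEffect, Tier> = {
  "none": "in-process",
  "fs-read": "worker",
  "fs-write": "container",
  "network": "container",
  "process": "container",
};

// Pick the strictest tier demanded by any declared effect.
// A tool that declares nothing is treated as having no side effects.
function requiredTier(effects: SideEffect[]): Tier {
  let rank = 0;
  for (const effect of effects) {
    rank = Math.max(rank, TIER_ORDER.indexOf(MIN_TIER[effect]));
  }
  return TIER_ORDER[rank];
}
```

A stricter variant would treat an empty declaration as unknown and send it to the strongest tier, which may suit untrusted plugins better.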

Built for

Four wedges. One platform.

Healthcare & clinical

Air-gap mode. Safe-Harbor PII redaction. Patient consent + break-glass. Determinism mode for regulated decisions.

Coding & dev tooling

Repo-aware agents. Multi-runtime sandbox. Diff primitives. Cost-per-PR. Local-model fallback for offline work.

Marketing agencies

Multi-client workspace isolation. BrandKit (tone, banned phrases, disclaimers). Approval HITL. Per-client cost reporting.

Ops & SRE

Durable flows. Cron + webhook + CDC triggers. Cost heat map. Anomaly detection on traces. PagerDuty + Slack native.

Architecture

Thin layer. Strong contracts.

@agentskit/os-core stays under 15 KB gzipped. Everything else is independently installable. Use one piece without the desktop.

os-desktop

os-cli

os-flow

os-triggers

@agentskit/os-core

Zod contracts · event bus · principal/cap · errors · workspace model

os-security

os-marketplace

os-mcp-bridge

os-cloud-sync

AgentsKit (upstream)

core · runtime · adapters · memory · tools · skills · rag · sandbox · eval

<15 KB

Core gzipped

<800 ms

Cold start

<15 MB

Installer size

<60 s

Time to first agent

Quick start

Six commands to your first agent.

Pre-flight cost estimates. Workspace lockfile. Docker deploy. All from the CLI.

init: Scaffold workspace + sane defaults
doctor: Diagnose env, providers, sandbox
run: Execute flow with run-mode + estimate
lock: Pin versions for reproducibility
deploy: Ship to docker / cloud target

~/projects/my-flow

# install once core+cli ship
pnpm add -D @agentskit/os-cli
 
# scaffold
pnpm agentskit-os init
 
# diagnose
pnpm agentskit-os doctor
 
# run with cost estimate first
pnpm agentskit-os run pr-review --mode preview --estimate
 
# lock + ship
pnpm agentskit-os lock
pnpm agentskit-os deploy --target docker

Roadmap

Eight milestones to 1.0.

Public process. Every milestone ships ADRs, RFCs, tests, docs. No surprises.

M1: Core schemas + CLI alpha (In progress)
M2: Desktop shell · FlowEditor · TraceViewer (Up next)
M3: Flow engine · DAG · durable · HITL (Planned)
M4: Triggers · MCP bridge v2 (Planned)
M5: Marketplace · plugin host (Planned)
M6: Observability · audit signing · vault (Planned)
M7: Generative OS (NL → flow) (Planned)
M8: Cloud sync · CRDT collab · 1.0 (Planned)

Coming soon

Desktop app. Marketplace. Cloud sync.

Get early access. Shape the contracts before 1.0. Enterprise pilots opening Q3 2026.

No spam. Updates roughly every milestone.
