Bento — Harness as code

Meet your new loop

The REPL, evolved for agents.

Read: the event — a PR, an issue, a schedule
Eval: run the agent in an isolated sandbox
Print: post the result where the work happens
Loop: on every trigger, version-controlled

All in code, all in your repo, all on your terms.

How it works

.bento/pipelines/pr-review.yaml

trigger:
  github: pull_request.opened
agent: reviewer
skill: pr-review
output:
  github: review
guardrails:
  read_only: true
  timeout: 900

→ 14:02:11 pr.opened · dispatched · ✓ 437.2s

Start with a trigger

An event you already produce: a PR opening, an issue landing, a cron firing. Pick what kicks off the pipeline.

Pick an agent and skill

Choose who runs and the procedure they follow. Same personas, different playbooks, composed per pipeline.

Send the result somewhere

Post back to the surface that triggered it: a review, a comment, a Slack message. The work lands where you work.

Set the guardrails

Bound the run: a timeout, read-only mode, or an explicit tool allowlist. Start safe, then grant more.

Designed for context engineering.

Prior state, team knowledge, scoped identity, and other advanced features give you control over how context is assembled for each pipeline.

Iterate on outcomes in prose, and use enterprise tools to evaluate your agentic workflows and build trust in them.

Knowledge

Four ways knowledge reaches the agent.

Your wiki, blueprints, and recipes live in the repo. One knob decides how they land in a run — pushed into the prompt, listed for on-demand reading, mounted as files, or left to a tool.

.bento/pipelines/pr-review.yaml

knowledge:
  mode: qmd
  topK: 3
  maxTokens: 4000

BM25 retrieval — the top matches are pushed into the prompt as <context> excerpts.
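To make the retrieval step concrete, here is a minimal Python sketch of BM25 ranking followed by wrapping the top matches in <context> excerpts. It is an illustration only, not Bento's implementation: the function names, the tokenizer, and the k1 and b defaults are assumptions, and the maxTokens budget is not modeled.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_top_k(query, docs, k=3, k1=1.5, b=0.75):
    """Return up to k (doc_index, score) pairs ranked by Okapi BM25 score."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n if n else 0.0

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = set(tokenize(query))
    scored = []
    for idx, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, idx))

    # Highest score first; ties broken by original order. Drop non-matches.
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [(idx, score) for score, idx in ranked[:k] if score > 0]


def build_context(query, docs, top_k=3):
    """Wrap the top-ranked documents in <context> tags for prompt injection."""
    hits = bm25_top_k(query, docs, k=top_k)
    return "\n".join(f"<context>{docs[idx]}</context>" for idx, _ in hits)
```

Calling build_context("how do we deploy", docs) on a small list of wiki excerpts returns only the excerpts that share terms with the question, best match first.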

Features

pull_request.opened · Issue.assigned · 0 9 * * 1-5

↓ dispatch

Triggers

GitHub webhooks, Linear issues, cron schedules, MCP calls, URLs, or a manual fire.
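For readers unfamiliar with cron syntax, the schedule 0 9 * * 1-5 shown above means 09:00 on Monday through Friday. The Python sketch below checks a timestamp against such an expression. It is a simplified illustration, not Bento's scheduler: the helper names are hypothetical, only "*", single numbers, ranges, and comma lists are handled, and all five fields must match (standard cron treats day-of-month and day-of-week slightly differently when both are restricted).

```python
from datetime import datetime


def _field_matches(field, value):
    """Return True if one cron field ("*", "9", "1-5", "1,3") accepts value."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr, when):
    """Check a datetime against a five-field cron expression (simplified)."""
    minute, hour, day_of_month, month, day_of_week = expr.split()
    # Python counts Monday as 0; cron counts Sunday as 0.
    cron_dow = (when.weekday() + 1) % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day_of_month, when.day)
        and _field_matches(month, when.month)
        and _field_matches(day_of_week, cron_dow)
    )
```

With this sketch, cron_matches("0 9 * * 1-5", datetime(2024, 1, 1, 9, 0)) is true because that date is a Monday, while the same time on a Saturday does not match.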

route per pipeline

claude · sonnet-4.6

codex · gpt-5

Runtimes & models

Claude and Codex today — route each pipeline, or A/B them, to the runtime and model that fits.

ephemeral · cap-drop all

acme/api@main

$ git clone — fresh ✓

Sandboxes

Every run in an ephemeral container — Docker, Podman, Daytona, or just-bash. Repo cloned fresh, credentials scoped, host untouched.

guardrails

timeout: 900s

read_only: true

allowed_tools: [read, search]

Guardrails

Timeouts, token budgets, read-only mode, and tool allowlists — declared per pipeline.
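One way to picture how per-pipeline guardrails could be enforced is a check that runs before every tool call. The Python sketch below is hypothetical and does not describe Bento's internals: the Guardrails class, the check_tool_call function, and the WRITE_TOOLS set are invented for illustration, with defaults mirroring the values shown above.

```python
import time
from dataclasses import dataclass

# Hypothetical set of tools treated as mutating; not taken from Bento.
WRITE_TOOLS = frozenset({"write", "edit", "shell"})


@dataclass(frozen=True)
class Guardrails:
    timeout: float = 900.0  # seconds
    read_only: bool = True
    allowed_tools: frozenset = frozenset({"read", "search"})


def check_tool_call(guardrails, tool, started_at, now=None):
    """Return None if the call may proceed, else a short reason string."""
    now = time.monotonic() if now is None else now
    if now - started_at > guardrails.timeout:
        return "timeout"
    if tool not in guardrails.allowed_tools:
        return "tool not allowed"
    if guardrails.read_only and tool in WRITE_TOOLS:
        return "read-only"
    return None
```

Starting from a narrow allowlist and widening it later matches the "start safe, then grant more" approach: a denied call yields a reason that can be logged rather than silently dropped.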

run #12 · notes

run #13 · notes

run #14 ← prior runs injected

Memory

Per-target workspaces that persist across runs: cloned repo, prior-run notes, and history — injected into the next run.

{
  "cursor": 4821,
  "indexed": "1.2k"
}

persists across runs · 10 MB

Pipeline state

Persistent JSON per pipeline for cursors, indices, and metadata that outlive a run — off the prompt budget.
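As a rough picture of what persistent per-pipeline JSON could look like on disk, the Python sketch below loads and saves a small state file with a size cap. Everything here is an assumption for illustration: the load_state and save_state helpers and the file layout are hypothetical, and the 10 MB figure is interpreted as 10 × 1024 × 1024 bytes.

```python
import json
import os
import tempfile
from pathlib import Path

# Assumed interpretation of the 10 MB cap.
MAX_BYTES = 10 * 1024 * 1024


def load_state(path):
    """Return the saved state dict, or an empty dict on first run."""
    p = Path(path)
    if not p.exists():
        return {}
    return json.loads(p.read_text(encoding="utf-8"))


def save_state(path, state):
    """Serialize state to JSON and write it atomically, enforcing the cap."""
    data = json.dumps(state, sort_keys=True).encode("utf-8")
    if len(data) > MAX_BYTES:
        raise ValueError(f"state is {len(data)} bytes, over the {MAX_BYTES} byte cap")
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target
    # so a crash mid-write never leaves a half-written state file.
    fd, tmp_name = tempfile.mkstemp(dir=str(p.parent))
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp_name, p)
    except BaseException:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```

A run would typically call load_state at the start, advance a cursor such as the 4821 shown above, and call save_state before exiting, which keeps that bookkeeping out of the prompt.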

wiki · blueprint · recipe

↓

ask “how do we deploy?”

Deep Wiki

Your wiki, blueprints, and recipes — indexed and injected per run, or queried on demand. Agents work from your team’s knowledge.

trigger

run

turn

tool

Traces

Every trigger, run, and turn recorded. Inspect the prompt, replay any event, reconstruct a run turn by turn.

$ bento daemon start
✓ ready · :7890
$ bento trace 9f4e

CLI control plane

One bento CLI drives it all: daemon lifecycle, firing triggers, tracing runs, workbenches, tokens, doctor.

Own your harness

Own the harness, not the other way around.

The harness is just code in your repo. Switch providers whenever you like, run it on your own hardware, keep your data to yourself. Most agent platforms can't say that — your code on their cloud, your team on their models, your data in their logs.

Nothing to get locked into. Just your harness, your way.

Context engineering: engineering what an agent sees and when (prior state, team knowledge, scoped identity, assembled into each run) rather than hand-tuning a single prompt. Read more: Anthropic.
