Cookbook
Task-oriented recipes for common sandbox jobs - running model-generated code, wiring a sandbox into an agent loop, installing packages, parking a session between turns, and fanning out across many sandboxes at once. Each recipe is copy-paste ready in Python and JavaScript. If you are new here, start with the Quickstart and come back for patterns.
Token scopes used by these recipes: sandboxes:read and sandboxes:write.
Run model-generated code safely
The core use case: a model hands you code, you need its output but not its side effects. Write it into a fresh sandbox, run it, read back only the result. The context manager destroys the sandbox on exit, so nothing the code touched - files, processes, network state - outlives the call.
from orkestr import Sandbox
# A code string your model produced. Never exec this on your own host -
# run it in a throwaway sandbox and read back only the result.
generated = """
import statistics
nums = [4, 8, 15, 16, 23, 42]
print("mean:", statistics.mean(nums))
"""
with Sandbox.create(template="python-3.12") as sbx:
    sbx.files.write("/workspace/gen.py", generated)
    result = sbx.exec("python /workspace/gen.py")
    answer = result.stdout if result.exit_code == 0 else f"error: {result.stderr}"
print(answer)
# The sandbox is destroyed on block exit - nothing the code did survives.
Wire a sandbox into an agent loop
Give the model a single long-lived sandbox and let it run a sequence of commands, feeding each result back as context. Create once, exec many, terminate when the session ends. Every command, file write and lifecycle event is recorded - watch the run unfold on the sandbox's activity timeline in the console while the agent works.
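The loop below depends on two hooks that you supply: task_complete(history) and agent.next_command(history). How you implement them is up to you. As a minimal sketch for dry-running the loop without a model, the stand-ins here replay a fixed list of commands and cap the number of steps; ScriptedAgent and MAX_STEPS are hypothetical names for illustration, not part of the SDK.

```python
from dataclasses import dataclass, field

MAX_STEPS = 20  # hypothetical safety cap on tool calls per session


@dataclass
class ScriptedAgent:
    """Stand-in for a real model: replays a fixed list of shell commands."""
    script: list[str] = field(default_factory=list)

    def next_command(self, history: list[dict]) -> str:
        # One history entry is appended per executed command, so the
        # history length is the index of the next command to run.
        return self.script[len(history)]

    def done(self, history: list[dict]) -> bool:
        return len(history) >= len(self.script)


agent = ScriptedAgent(["python --version", "pip list"])


def task_complete(history: list[dict]) -> bool:
    # Stop when the script is exhausted or the step cap is reached.
    return agent.done(history) or len(history) >= MAX_STEPS
```

A real agent would replace next_command with a model call. Keeping a step cap in task_complete is still worthwhile, since it bounds the session even if the model never decides it is finished.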
from orkestr import Sandbox
# One sandbox for the whole agent session, reused across tool calls.
# restricted network lets the code pip-install and reach allowed APIs.
sbx = Sandbox.create(template="python-3.12", network="restricted")
try:
    history = []
    while not task_complete(history):
        # 1) your model decides the next shell command
        command = agent.next_command(history)
        # 2) run it in the sandbox, feed the result back to the model
        result = sbx.exec(command, timeout_seconds=120)
        history.append({
            "command": command,
            "stdout": result.stdout,
            "stderr": result.stderr,
            "exit_code": result.exit_code,
        })
finally:
    sbx.terminate()  # always free the sandbox when the session ends
Install packages with restricted egress
Use network="restricted" when the code needs to pull dependencies but you do not want to hand it open internet. Package registries, GitHub and the major LLM APIs are reachable through an allowlisting proxy; everything else is refused. Proxy-aware tools (pip, npm, curl, standard HTTP libraries) work with no setup.
from orkestr import Sandbox
with Sandbox.create(template="python-3.12", network="restricted") as sbx:
    # In restricted mode pip / npm / curl go through an allowlisting proxy:
    # package registries, GitHub and the major LLM APIs are reachable,
    # everything else is blocked. No proxy setup needed - HTTP_PROXY is
    # already set inside the sandbox.
    sbx.exec("pip install --quiet requests")
    sbx.files.write(
        "/workspace/check.py",
        "import requests; print(requests.get('https://api.github.com').status_code)",
    )
    print(sbx.exec("python /workspace/check.py").stdout)  # 200
Park a session between agent turns
For agents that work in bursts, pause the sandbox between turns to stop the compute meter and resume from the exact same state - installed packages, files, everything - minutes or hours later, even from a different process. pause() returns the sandbox id; persist it with your agent state and pass it to Sandbox.resume().
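The recipe below calls save_to_session and load_from_session, which stand for whatever storage your agent already uses. As a minimal sketch, assuming a single agent whose state fits in a local JSON file, they could look like this; the file name and the JSON layout are illustrative choices, not something the SDK prescribes.

```python
import json
from pathlib import Path

# Hypothetical location for the agent's session state.
SESSION_FILE = Path("agent_session.json")


def save_to_session(sandbox_id: str, path: Path = SESSION_FILE) -> None:
    """Persist the paused sandbox id so a later process can resume it."""
    path.write_text(json.dumps({"sandbox_id": sandbox_id}))


def load_from_session(path: Path = SESSION_FILE) -> str:
    """Return the saved sandbox id; raises FileNotFoundError if none was saved."""
    return json.loads(path.read_text())["sandbox_id"]
```

For more than one concurrent session, key the stored id by a session or user identifier in your database or Redis instead of a single file.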
from orkestr import Sandbox
# Turn 1: set up an environment, then park it to stop the compute meter.
sbx = Sandbox.create(template="node-22", timeout_seconds=3600)
sbx.exec("npm init -y && npm install lodash")
sandbox_id = sbx.pause() # snapshot taken, meter stops
save_to_session(sandbox_id) # your DB / Redis / agent memory
# Turn 2, minutes or hours later, possibly in another process:
sbx = Sandbox.resume(load_from_session())
out = sbx.exec("node -e \"console.log(require('lodash').VERSION)\"")
print(out.stdout)  # the installed deps are still there
Data in, artifact out
Upload input, run a script, read back the artifact it produced. The whole /workspace directory is yours to write to; the sandbox never sees your other inputs or outputs.
from orkestr import Sandbox
csv = "name,score\nada,91\nlinus,88\ngrace,95\n"
analyze = '''
import csv
rows = list(csv.DictReader(open("/workspace/scores.csv")))
top = max(rows, key=lambda r: int(r["score"]))
open("/workspace/winner.txt", "w").write(top["name"])
'''
with Sandbox.create(template="python-3.12") as sbx:
    sbx.files.write("/workspace/scores.csv", csv)
    sbx.files.write("/workspace/analyze.py", analyze)
    sbx.exec("python /workspace/analyze.py")
    print(sbx.files.read("/workspace/winner.txt"))  # grace
Stream a long-running command
For builds, test suites or training runs, stream output as it arrives instead of waiting for the whole thing to finish. Iterate to the final chunk to get the exit code - and always iterate to completion, since breaking early leaves the in-sandbox process running until its own timeout fires.
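If you also want the full transcript once the command ends, a small helper can consume the stream to the end and return what it collected. The sketch below runs without the SDK: Chunk is a hypothetical stand-in carrying only the fields the recipe reads (stream, data, is_final, exit_code), and the "stderr" stream name is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Chunk:
    """Stand-in with the fields the recipe reads from each streamed chunk."""
    stream: str                      # "stdout", or "stderr" (assumed name)
    data: str = ""
    is_final: bool = False
    exit_code: Optional[int] = None  # set on the final chunk


def drain(chunks: Iterable[Chunk]) -> tuple[str, str, Optional[int]]:
    """Consume the stream to the end; return (stdout, stderr, exit_code)."""
    out, err, exit_code = [], [], None
    for chunk in chunks:
        if chunk.stream == "stdout":
            out.append(chunk.data)
        elif chunk.stream == "stderr":
            err.append(chunk.data)
        if chunk.is_final:
            exit_code = chunk.exit_code
        # No break here: keep iterating until the stream is exhausted.
    return "".join(out), "".join(err), exit_code
```

With the real SDK you would pass the iterator returned by exec_stream straight to drain; an exit code of None then means the stream ended without a final chunk.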
from orkestr import Sandbox
with Sandbox.create(template="python-3.12") as sbx:
    sbx.files.write(
        "/workspace/build.py",
        "import time\nfor i in range(5):\n    print(f'step {i}', flush=True); time.sleep(1)",
    )
    for chunk in sbx.exec_stream("python /workspace/build.py"):
        if chunk.stream == "stdout":
            print(chunk.data, end="", flush=True)
        if chunk.is_final and chunk.exit_code != 0:
            raise RuntimeError("build failed")
Fan out across many sandboxes
Each sandbox is fully isolated, so running several at once is natural - evaluate N model candidates, test N branches, process N inputs in parallel. Stay within your plan's concurrency cap; check Sandbox.limits().max_concurrent before fanning out wide.
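One way to stay within the cap is to derive the pool size from it instead of hard-coding a number. The helper below is a plain-Python sketch; pick_workers and its reserve parameter are hypothetical names, and the cap is passed in as an integer so the function has no SDK dependency.

```python
def pick_workers(n_tasks: int, max_concurrent: int, reserve: int = 0) -> int:
    """Largest pool size that fits both the task count and the concurrency cap.

    reserve is a hypothetical knob: slots to leave free for other jobs
    running under the same token.
    """
    available = max(max_concurrent - reserve, 1)
    # ThreadPoolExecutor requires max_workers >= 1, so never return 0.
    return max(1, min(n_tasks, available))
```

In the recipe below you could then write max_workers=pick_workers(len(candidates), Sandbox.limits().max_concurrent) in place of the fixed 3.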
from concurrent.futures import ThreadPoolExecutor
from orkestr import Sandbox
def run_candidate(code: str) -> str:
    with Sandbox.create(template="python-3.12") as sbx:
        sbx.files.write("/workspace/c.py", code)
        return sbx.exec("python /workspace/c.py").stdout
# Evaluate several model candidates in parallel. Stay within your plan's
# concurrency cap - check Sandbox.limits().max_concurrent first.
with ThreadPoolExecutor(max_workers=3) as pool:
    outputs = list(pool.map(run_candidate, candidates))
Handle timeouts and limits
A timed-out command does not kill the sandbox - it stays alive so you can collect partial state before deciding what to do. Catch ExecTimeout for that, and PlanLimitError when you are out of concurrent sandboxes or monthly budget. See the SDK reference for the full error hierarchy.
from orkestr import Sandbox, ExecTimeout, PlanLimitError
try:
    with Sandbox.create(template="python-3.12") as sbx:
        try:
            result = sbx.exec("python train.py", timeout_seconds=300)
        except ExecTimeout:
            # The command timed out but the sandbox is still alive -
            # grab partial state before the block exits and terminates it.
            logs = sbx.files.read("/workspace/train.log")
            raise
except PlanLimitError as e:
    # Out of concurrent sandboxes or monthly budget.
    print(f"hit a plan limit: {e}")
Production tips
- Prefer the context manager (with) / withTemp so a crash in your agent loop still terminates the sandbox and bounds your bill.
- Mint tokens scoped only to sandboxes:read / sandboxes:write for agent runtimes - a leaked scoped token cannot reach the rest of your account.
- Set a tight timeout_seconds on both the sandbox and each exec; agent-written commands hang more often than yours do.
- Call Sandbox.limits() once at startup to pick a size and concurrency that fit the running token's plan.
- Use pause() for idle sessions instead of keeping a sandbox running - a paused sandbox does not accrue compute.
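The timeout tip can be applied by giving each command the smaller of a fixed cap and the time left in the session. The sketch below is plain Python with no SDK calls; Deadline and exec_timeout are hypothetical names chosen for illustration.

```python
import time


class Deadline:
    """Tracks a session-wide time budget using a monotonic clock."""

    def __init__(self, budget_seconds: float):
        self._end = time.monotonic() + budget_seconds

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self._end - time.monotonic())

    def exec_timeout(self, per_command_cap: float) -> int:
        """Timeout for the next command: the cap or the time left, whichever is smaller."""
        return int(min(per_command_cap, self.remaining()))
```

In an agent loop you would pass deadline.exec_timeout(120) as the timeout_seconds of each exec, and end the session once it returns 0 instead of issuing another command.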
Next steps
- Python SDK reference - every method, parameter and error class
- MCP server - drive sandboxes straight from Claude Code, Cursor or any MCP client
- REST API reference - the raw wire format for any language