stitching together a personal agent cluster with a $12/mo LLM

i spent a weekend wiring dirge, a 12MB rust coding agent forked from zerostack, into weft, my own rust control plane that composes four services into an agent cluster. taught it to browse the web from inside a Firecracker microVM. wound up with 40 MCP tools, a custom 6.1 kernel, and a lot of late nights debugging things that should’ve been simple.

this is the honest version. 141 dirge sessions. about $30 in API costs. 15 years of accumulated unix knowledge deployed against a stack that fought back at every layer. here’s what actually happened.


the stack, briefly

two pieces of software:

dirge, a 12MB rust coding agent. TUI, markdown rendering, janet plugin system, tree-sitter code analysis. runs on ~8MB RAM at idle. forked from zerostack with features ported from maki and pi. i rebuilt it with every feature flag turned on and pointed it at my cluster.

weft, a rust control plane that i’ve been tinkering with. it composes four services: MosaicDB (elixir, semantic search + property graph), Zypi (elixir, Firecracker microVM sandbox), FlowEngine (rust, DAG workflow engine), and YAS-MCP (rust, OpenAPI-to-MCP bridge). plus a coordination layer called ribbon, an append-only ndjson log with a state machine for agent communication.

the idea: dirge is the cockpit, weft is the engine room. dirge’s bash commands get intercepted by a janet plugin and redirected to Firecracker microVMs. session memory persists in MosaicDB. agents coordinate through ribbon’s event log.


booting the cluster

weft is a hobby project and it shows in the best way: everything is simple enough to understand in an afternoon, but you do have to understand it. there’s no magic docker compose up that hides the internals. here’s what it took to get all three core services healthy:

$ curl -s http://localhost:8080/health
{
  "overall": "degraded",
  "services": [
    {"name": "mosaic",   "status": "healthy"},
    {"name": "zypi",     "status": "healthy"},
    {"name": "flowengine","status": "healthy"},
    {"name": "yas-mcp",  "status": "unreachable"}
  ]
}

yas-mcp is optional: it bridges OpenAPI specs to MCP tools, and i hadn’t deployed it yet. three out of four was plenty.
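a rough python sketch of how i read that health payload: treat yas-mcp as optional and only call the cluster blocked if a core service is unhealthy. the function name and the OPTIONAL set are made up for illustration; this is not how weft computes "overall".

```python
import json

# sample payload copied from the /health response above
HEALTH_JSON = """
{
  "overall": "degraded",
  "services": [
    {"name": "mosaic", "status": "healthy"},
    {"name": "zypi", "status": "healthy"},
    {"name": "flowengine", "status": "healthy"},
    {"name": "yas-mcp", "status": "unreachable"}
  ]
}
"""

# assumption: yas-mcp is the only service that may be missing
OPTIONAL = {"yas-mcp"}


def unhealthy_core_services(payload: str) -> list[str]:
    """Return the names of non-optional services that are not healthy."""
    health = json.loads(payload)
    return [
        svc["name"]
        for svc in health["services"]
        if svc["status"] != "healthy" and svc["name"] not in OPTIONAL
    ]


if __name__ == "__main__":
    blockers = unhealthy_core_services(HEALTH_JSON)
    print("ready" if not blockers else f"blocked by: {blockers}")
```

with the payload above the list comes back empty, which is the "three out of four was plenty" case.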

the zypi slim docker image had some gaps from earlier experiments. i’d missed ca-certificates in the builder stage so mix couldn’t reach hex.pm. the entrypoint called mix release but no release was configured; mix run --no-halt is simpler for dev anyway. and the rootfs wasn’t included in the slim image because i’d been building it separately, so i pulled the ubuntu 24.04 squashfs from the Firecracker CI, injected the zypi-agent go binary, and built the ext4 filesystem.

i’d left a comment in the MCP handler: “NOTE: Can’t read POST body”, basically a todo i’d been ignoring. rust 1.91 tightened the axum handler trait and State<AppState> alone couldn’t read the POST body. switching to State<AppState> + Bytes fixed it and suddenly 38 tools were responding to JSON-RPC. been meaning to get to that one.


the plugin that makes it interesting

dirge embeds janet, a clojure-like lisp, on a dedicated OS thread. plugins can intercept tool calls through seven lifecycle hooks. the key one is on-tool-start: it fires before every tool execution and can rewrite the arguments.

here’s the plugin that redirects bash to Firecracker:

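# on-tool-start hook: rewrite bash calls into a POST to zypi's /exec endpoint.
# extract-command is a helper from the rest of the plugin (not shown here).
# caveat: cmd is spliced in as-is, so unless extract-command escapes it,
# quotes inside the command will break the nested json.
(defn weft-sandbox-on-tool-start [ctx]
  (when (= (ctx :tool) "bash")
    (let [cmd (extract-command (ctx :args))]
      (when cmd
        (harness/mutate-input
          (string
            "{\"command\":\""
            "curl -s -X POST http://localhost:4000/exec"
            " -H 'Content-Type: application/json'"
            " -d '{\\\"cmd\\\":[\\\"sh\\\",\\\"-c\\\",\\\""
            cmd
            "\\\"],\\\"image\\\":\\\"ubuntu:24.04\\\",\\\"timeout\\\":25}'"
            "\"}"))))))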

40 lines of janet. the LLM calls bash 'rm -rf /', the plugin rewrites it to curl Zypi :4000/exec, a Firecracker VM boots, the command runs in an isolated kernel, and the output flows back through on-tool-end where stdout is extracted from the Zypi JSON response and swapped in with harness/replace-result. the LLM never knows it’s not running locally.
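the same rewrite is easier to read in python, where a json encoder and shlex do the escaping that the janet version does by hand. this is a sketch of the idea, not the plugin: rewrite_bash and extract_stdout are hypothetical names, and i’m assuming the zypi response is a json object with a "stdout" field.

```python
import json
import shlex

ZYPI_EXEC_URL = "http://localhost:4000/exec"  # endpoint used by the plugin above


def rewrite_bash(cmd: str) -> str:
    """Build the tool input that replaces the original bash call."""
    # inner json: the request body zypi receives
    body = json.dumps({
        "cmd": ["sh", "-c", cmd],
        "image": "ubuntu:24.04",
        "timeout": 25,
    })
    # shlex.quote keeps the body intact even if cmd contains quotes
    curl = (
        f"curl -s -X POST {ZYPI_EXEC_URL}"
        " -H 'Content-Type: application/json'"
        f" -d {shlex.quote(body)}"
    )
    # outer json: the arguments object handed back to the bash tool
    return json.dumps({"command": curl})


def extract_stdout(zypi_response: str) -> str:
    """Pull stdout out of the sandbox response (field name is an assumption)."""
    return json.loads(zypi_response).get("stdout", "")
```

letting the encoder handle both layers means a command with embedded quotes survives the round trip, which is the part that is easy to get wrong with string concatenation.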

there’s a passthrough list too: curl, wget, git, make, cargo, and docker all bypass the sandbox and run on the host. that way the agent can call weft’s MCP endpoints directly without going through a Firecracker VM that can’t reach localhost.

the VM runs linux 6.1.0 that i built from the amazon linux fork using firecracker’s microvm kernel config. smep, smap, ibrs, full spectre/meltdown mitigations. 230MB of ram, no /dev/kvm, no docker socket, no host process visibility. i added iptables rules to block recursive exec (the sandbox can’t call zypi’s API to spawn child VMs) and nat masquerade so the sandbox can reach the internet.


why dirge’s LSP integration matters

most coding agents treat your codebase as a pile of text files. they grep for symbols and hope the regex matches. dirge ships with tree-sitter for three languages (typescript, python, bash) and an LSP client that attaches to rust-analyzer, typescript-language-server, pyright, and clojure-lsp.

this means the agent doesn’t guess. when it calls find_definition on a function, it gets the exact byte range from the compiler’s own index. when it writes or edits a file, the LSP server gets a didChange notification, runs diagnostics, and any compile errors show up in the tool result as a <diagnostics> block before the agent even tries to compile. the agent fixes type errors on the same turn instead of writing broken code and discovering it later via cargo check.
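for reference, this is roughly what a didChange notification looks like on the wire. the message shape and the Content-Length framing come from the LSP spec; the helper names are hypothetical and this is not dirge’s client code.

```python
import json


def frame(message: dict) -> bytes:
    """Wrap a json-rpc message in the LSP base-protocol header."""
    body = json.dumps(message).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def did_change(uri: str, version: int, new_text: str) -> dict:
    """Full-document textDocument/didChange; no "id" because it is a notification."""
    return {
        "jsonrpc": "2.0",
        "method": "textDocument/didChange",
        "params": {
            "textDocument": {"uri": uri, "version": version},
            # a change without a range replaces the whole document
            "contentChanges": [{"text": new_text}],
        },
    }


if __name__ == "__main__":
    print(frame(did_change("file:///tmp/main.rs", 2, "fn main() {}\n")))
```

the server answers asynchronously with textDocument/publishDiagnostics, which is the data that ends up in the diagnostics block.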

the killer feature is that this works with any LSP-compatible language server. add a rust-analyzer config stanza and dirge gets go-to-definition, find-references, hover types, and document symbols for your entire cargo workspace. the semantic tools (list_symbols, get_symbol_body, find_callers, find_callees) use tree-sitter’s ast rather than regex, so they’re accurate for the languages they support.

for weft, this means dirge can navigate the rust workspace, jump from a tool handler in mcp/tools.rs to its definition in mcp/server.rs, list every function in the orchestrator module, or find every call site of circuit_breaker::with_circuit across the codebase. the agent builds an accurate mental model of the code without hallucinating file paths.

it’s not perfect. the lsp spawn failed on my machine because rust-analyzer timed out during the handshake (the weft workspace is large and the analyzer needs a moment). but when it works, it’s a different category of tool: the difference between “i think this function is in that file” and “the compiler says it’s at line 1833, column 14.”

browsing the web from inside a microVM

with the nat rules in place, the Firecracker sandbox can hit the internet. here’s dirge browsing example.com:

$ dirge --provider deepseek -p "curl http://example.com and report the title"

Title: Example Domain
First paragraph: This domain is for use in documentation examples...

the full path: dirge LLM calls bash → janet plugin intercepts → rewrites to Zypi exec → Firecracker VM boots (sub-second) → VM runs curl → nat masquerade through docker bridge → internet → response flows back through plugin → LLM sees clean output and parses the html.

there’s a chromium image (335mb) loaded in zypi that boots in 2.7 seconds. the browser automation tools (weft_browser_start, navigate, extract) are wired up in the codebase but still need the session api plumbed through. the simpler weft_browse tool works today: it runs a 50-line python scraper in the sandbox and pulls out titles, text, and links.
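a stdlib-only sketch of what a scraper like that can look like. this is not the actual weft_browse script; PageSummary and summarize are hypothetical names, and it only handles the title/text/links extraction, not the fetch.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Collect the title, visible text, and link targets of a page."""

    SKIP = {"script", "style"}  # contents of these tags are not visible text

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.links: list[str] = []
        self.text: list[str] = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        chunk = data.strip()
        if not chunk:
            return
        if self._in_title:
            self.title += chunk
        elif self._skip_depth == 0:
            self.text.append(chunk)


def summarize(html: str) -> dict:
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "text": " ".join(parser.text),
        "links": parser.links,
    }
```

returning a small dict instead of raw html keeps the tool result short, which matters when every byte of it lands in the LLM’s context.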


memory that survives restarts

mosaicdb stores arbitrary json under string labels. here’s storing mission parameters in one session and recalling them in another:

session a:

$ curl -X POST http://localhost:8080/api/memory/memo -d '{
  "label": "mission-parameters",
  "content": {
    "codename": "OPERATION-DIRGE-WEFT",
    "access_code": "JANET-FIRECRACKER-8842",
    "capabilities_verified": ["sandbox-exec", "memory-store", "memory-recall"]
  }
}'
→ $memo_mission_parameters stored (343B)

session b (fresh dirge session, no prior context):

$ curl -X POST http://localhost:8080/api/memory/expand -d '{"handle":"$memo_mission_parameters"}'
→ codename: OPERATION-DIRGE-WEFT
→ access_code: JANET-FIRECRACKER-8842
→ capabilities: sandbox-exec, memory-store, memory-recall

the LLM in session b had never seen these values. it queried mosaicdb through weft’s api, parsed the json, and reported everything correctly. that’s cross-session memory recall with no vector embeddings needed, just handles and a key-value store.
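the handle scheme is simple enough to model in a few lines. this is an in-memory stand-in, not mosaicdb: the label-to-handle rule is inferred from the single example above ("mission-parameters" became $memo_mission_parameters), and MemoStore is a hypothetical name.

```python
import json
import re


def handle_for(label: str) -> str:
    """Derive a handle from a label (rule guessed from one example)."""
    return "$memo_" + re.sub(r"[^A-Za-z0-9]+", "_", label)


class MemoStore:
    """Toy key-value store mirroring the memo and expand calls."""

    def __init__(self) -> None:
        self._memos: dict[str, str] = {}

    def memo(self, label: str, content: dict) -> str:
        handle = handle_for(label)
        # store serialized json, the way a real backend would persist it
        self._memos[handle] = json.dumps(content)
        return handle

    def expand(self, handle: str) -> dict:
        return json.loads(self._memos[handle])


if __name__ == "__main__":
    store = MemoStore()
    h = store.memo("mission-parameters", {"codename": "OPERATION-DIRGE-WEFT"})
    print(h, store.expand(h))
```

the point is that the second session only needs the handle string, not any of the first session’s context.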

the ribbon event log tracks every agent action:

AGENT       STATE         TASKS
dirge       ✅ completed   1
flowengine  ✅ completed   3
mosaic      ✅ completed   6
weft        ✅ completed   10
zypi        ✅ completed   9

42 tasks logged across the ecosystem. the state machine enforces submitted→working→committed→completed transitions and gives you hints when you try an invalid transition. it’s surprisingly satisfying to watch agents move through the lifecycle.
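a minimal sketch of that lifecycle, assuming the transitions are strictly linear and there are no failure or retry states (ribbon may well have more). the names are hypothetical and the hint wording is mine.

```python
# each state maps to the only state allowed to follow it
NEXT_STATE = {
    "submitted": "working",
    "working": "committed",
    "committed": "completed",
}


class InvalidTransition(Exception):
    """Raised with a hint when a transition is not allowed."""


def transition(current: str, target: str) -> str:
    expected = NEXT_STATE.get(current)
    if target == expected:
        return target
    if expected is None:
        hint = f"'{current}' has no outgoing transitions"
    else:
        hint = f"from '{current}' the only valid next state is '{expected}'"
    raise InvalidTransition(f"cannot go {current} -> {target}: {hint}")


if __name__ == "__main__":
    state = "submitted"
    for step in ("working", "committed", "completed"):
        state = transition(state, step)
    print(state)
```

putting the hint in the error message is what makes the log usable by an agent: it can read the failure and pick the right next state without a human in the loop.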


the loop: dirge churning through work

dirge has a headless loop mode. you point it at a markdown file with - [ ] checkboxes, give it a prompt, and it iterates:

dirge --loop \
  --loop-prompt 'Work through LOOP_PLAN.md one task at a time' \
  --loop-plan LOOP_PLAN.md \
  --loop-max 5 \
  --loop-run 'cargo test' \
  --provider deepseek

each iteration: reads the plan, picks an unchecked task, implements it with full tool access, marks it [x], runs the validation command, saves a transcript. the next iteration gets the updated plan plus a summary of what happened last time.
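the plan-file half of that loop fits in a few lines of python. this is a sketch of the checkbox handling only, with hypothetical function names; dirge’s loop module is rust and i’m not claiming it works exactly like this.

```python
import re

# matches "- [ ] task" (or "* [ ] task"), with optional indentation
UNCHECKED = re.compile(r"^(\s*[-*] )\[ \]( .*)$")


def next_task(plan: str):
    """Return (line_index, task_text) for the first unchecked box, or None."""
    for i, line in enumerate(plan.splitlines()):
        m = UNCHECKED.match(line)
        if m:
            return i, m.group(2).strip()
    return None


def mark_done(plan: str, index: int) -> str:
    """Flip the checkbox on the given line to [x]."""
    lines = plan.splitlines()
    lines[index] = UNCHECKED.sub(r"\1[x]\2", lines[index], count=1)
    return "\n".join(lines)


if __name__ == "__main__":
    plan = "- [x] set up repo\n- [ ] add a comment to main.rs"
    idx, task = next_task(plan)
    print(task)
    print(mark_done(plan, idx))
```

the plan file doubles as the loop’s state: when next_task returns None there is nothing left to do.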

i tried it with a simple task (add a comment to weft-core/src/main.rs) and it worked. the LLM called the edit tool, made the change, and the loop moved on. with a faster provider this would churn through a backlog of small fixes unattended.

there’s an off-by-one in the loop module where --loop-max 1 stops before running the agent. --loop-max 2 works around it. small thing, easy fix.


do we even need MCP? a deeper answer

dirge already has a bash tool. the LLM can run curl, grep, cat; it can do anything a shell script can. so why add 38 MCP tools on top of that?

the short answer: bash is a serial port into the OS. MCP is an API into a composed system.

when dirge runs bash 'curl http://localhost:8080/health', the janet plugin redirects it to a Firecracker VM. the VM runs curl, gets back json, and the LLM parses it. that’s three layers of indirection (janet → zypi → firecracker) just to make an http call. it works, but the LLM is doing string processing on raw json: brittle, slow, error-prone.

MCP changes the abstraction. instead of bash 'curl -X POST ...', the agent calls tools/call with {"name":"weft_health","arguments":{}}. the weft control plane receives a typed request, checks the circuit breaker, fans out to mosaicdb and zypi and flowengine, aggregates the results, and returns a structured response. the LLM doesn’t parse raw json; it reads typed content blocks.

the difference matters:

bash:  "curl -s http://localhost:4000/exec -d '{\"cmd\":[\"echo\",\"hello\"]}'"
       → raw json string → parse stdout field → hope nothing broke

MCP:   tools/call {name: "weft_sandbox_exec", arguments: {command: ["echo", "hello"]}}
       → json-rpc request → circuit breaker → zypi exec → typed response
       → content: [{type: "text", text: "hello"}]

both get the same result. one path is a shell command wrapped in three layers of escaping. the other is a function call.
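the MCP side of that comparison, sketched in python. the request and result shapes follow the MCP spec (json-rpc 2.0, method tools/call, a content array of typed blocks); the helper names are hypothetical and there is no transport here, only building and reading messages.

```python
import json


def tools_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build a json-rpc tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def text_of(response: dict) -> str:
    """Join the text blocks of a tools/call response, raising on errors."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "json-rpc error"))
    result = response["result"]
    text = "\n".join(
        block["text"]
        for block in result.get("content", [])
        if block.get("type") == "text"
    )
    if result.get("isError"):
        raise RuntimeError(text)
    return text


if __name__ == "__main__":
    req = tools_call(1, "weft_sandbox_exec", {"command": ["echo", "hello"]})
    print(json.dumps(req))
```

there are two distinct failure channels: a json-rpc error means the call itself failed, while isError means the tool ran and reported a failure.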

remote tooling changes what’s possible

once tools are remote, you get composition for free. weft_sandbox_exec isn’t just bash-in-a-vm; it’s a node in a flowengine dag. you can chain it:

weft_sandbox_exec("find . -name '*.rs'")       # node 1: discover files
  → weft_memory_store(results)                  # node 2: persist to graph
    → weft_workflow_run({nodes: [...]})         # node 3: run analysis dag
      → weft_memory_search("security issues")   # node 4: query the graph

four tool calls that the agent orchestrates. but flowengine can run them as a dag with parallelism, retry, and streaming events. the agent doesn’t need to know how; it just calls weft_workflow_run with a description and gets back the consolidated output.
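to make the parallelism point concrete, here is a toy scheduler using python’s graphlib. it only computes which nodes could run together in each wave; it is not flowengine, and the node names are labels i made up for the four steps above.

```python
from graphlib import TopologicalSorter

# node -> set of nodes it depends on (hypothetical names for the chain above)
CHAIN = {
    "discover_files": set(),
    "store_results": {"discover_files"},
    "run_analysis": {"store_results"},
    "search_issues": {"run_analysis"},
}


def run_in_waves(dag: dict) -> list[list[str]]:
    """Group nodes into waves; everything in one wave could run in parallel."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises CycleError if the graph is not a dag
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # a real engine would mark nodes done as they finish
    return waves


if __name__ == "__main__":
    for wave in run_in_waves(CHAIN):
        print(wave)
```

a straight chain gives one node per wave, so there is nothing to parallelize. the win shows up once two nodes share a dependency but not each other, for example two independent analyses over the same stored results.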

this is what makes the ecosystem interesting. dirge handles the conversation. weft handles the computation. the MCP boundary is where they shake hands.

what would make this really zypi

a few directions i want to explore:

tool discovery from the environment. right now dirge knows about weft’s tools because i told it about the endpoint. but what if the agent bootstraps by calling ribbon whoami, discovering which services it owns, then calling weft_mcp_status to list the connector tools registered in yas-mcp? the agent learns its own capabilities at runtime from the cluster topology.

yas-mcp as a tool factory. yas-mcp ingests openapi specs and generates MCP tools dynamically. point it at a new api (gmail, github, homeassistant) and 38 tools become 50, then 100. the agent doesn’t need new code. it calls tools/list and discovers what’s available.

sandbox-transparent execution. the janet plugin already makes bash commands run in firecracker without the LLM knowing. what if weft_browse does the same for http requests? what if weft_memory_search routes through a different backend depending on the query type? the agent calls the same tool name regardless of where the work happens.

multi-agent tool sharing. ribbon already tracks agent state. what if agent A publishes its available tools in the event log and agent B discovers them? “dirge can edit code and run tests. mosaic can search the knowledge graph. here’s what i need: find security issues in the codebase and store them in the graph.” the agents negotiate who does what based on declared capabilities.
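the tool-factory idea above is mostly a mapping exercise. a sketch, assuming each openapi operation becomes one tool with operationId as the name and the parameters folded into a json schema. the spec fragment is invented for illustration and this is not yas-mcp’s actual logic (it ignores request bodies, for one).

```python
# tiny hypothetical openapi fragment
SPEC = {
    "paths": {
        "/lights/{id}": {
            "get": {
                "operationId": "get_light",
                "summary": "read one light's state",
                "parameters": [
                    {"name": "id", "in": "path", "required": True,
                     "schema": {"type": "string"}},
                ],
            }
        }
    }
}

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def tools_from_openapi(spec: dict) -> list[dict]:
    """Turn each operation into an MCP-style tool definition."""
    tools = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            params = op.get("parameters", [])
            tools.append({
                "name": op.get("operationId") or f"{method}_{path}",
                "description": op.get("summary", ""),
                "inputSchema": {
                    "type": "object",
                    "properties": {p["name"]: p.get("schema", {}) for p in params},
                    "required": [p["name"] for p in params if p.get("required")],
                },
            })
    return tools


if __name__ == "__main__":
    print(tools_from_openapi(SPEC))
```

the name, description, and inputSchema fields are what a tools/list response carries per tool, so the agent side needs no changes when a new spec is added.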

MCP isn’t the interesting part. it’s just json-rpc with schemas. the interesting part is that it lets you build a system where tools compose, agents discover each other, and the LLM operates at the level of intent rather than shell commands.

what the timeline actually looked like

141 dirge sessions. ten of them were productive. the other 131 were some variation of “bash ‘curl -s http://localhost:8080/health’” returning empty output while i stared at the screen trying to figure out which of the five abstraction layers was swallowing the response.

the janet plugin had a silent failure mode. string/has-prefix? doesn’t exist in the embedded janet runtime that janetrs 0.8 ships. no error, no crash, the function just returned falsy and every bash command got sandboxed, including the curl commands that were supposed to query the weft api. the passthrough list never matched. took two hours of adding harness/notify debug lines and reading raw tracer output before i realized what was happening. the fix was (string/find prefix cmd) with a position check instead.
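the replacement check is the classic "find, then require position 0" trick. here it is in python for clarity; the real fix is janet, and should_passthrough is a hypothetical name. the tuple is the passthrough list from earlier in the post.

```python
PASSTHROUGH = ("curl", "wget", "git", "make", "cargo", "docker")


def has_prefix(prefix: str, cmd: str) -> bool:
    """Prefix test built from a substring search plus a position check."""
    # find returns -1 when absent, so only a match at index 0 counts
    return cmd.find(prefix) == 0


def should_passthrough(cmd: str) -> bool:
    """True if the command should run on the host instead of the sandbox."""
    cmd = cmd.lstrip()
    return any(has_prefix(p, cmd) for p in PASSTHROUGH)


if __name__ == "__main__":
    for c in ("curl -s http://localhost:8080/health", "echo curl", "rm -rf /"):
        print(c, should_passthrough(c))
```

a bare prefix match also lets through anything that merely starts with those letters (gitk, makefoo), so comparing the first whitespace-separated token would be stricter.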

the zypi container went through six rebuilds. first the slim dockerfile crashed because ca-certificates weren’t installed and mix couldn’t reach hex. then mix release failed because no release was configured, had to switch to mix run --no-halt. then the kernel was the ancient 4.14.174 from the firecracker quickstart guide, so i built linux 6.1 from the amazon linux fork. then the rootfs was missing entirely, so i downloaded the ubuntu 24.04 squashfs from the firecracker CI and injected the zypi-agent go binary by hand. then dns didn’t work inside the vm because /etc/resolv.conf was empty. then nat masquerade wasn’t configured so the sandbox couldn’t reach the internet. each fix revealed the next problem.

the mcp handler comment i’d left in the source, “NOTE: Can’t read POST body”, had been sitting there for weeks. one line fix. finally did it.

this is what systems programming looks like. every layer of the stack (rust axum extractors, elixir docker builds, firecracker kernel configs, janet lisp runtimes, iptables rules, docker networking) had to be right for the whole thing to work. when it finally did, when dirge ran whoami and the response came back root from a kernel 6.1 firecracker vm instead of jm from my laptop, that was worth the 141 sessions.

the sharp edges

deepseek is slow for tool-heavy work. each bash call takes 10-30 seconds through the sandbox redirect, and a complex iteration with five tool calls runs 3-5 minutes. openai or anthropic would be much snappier. the loop mechanism itself is solid; it just needs a provider that doesn’t rate-limit aggressively.

the kernel build belongs in CI. the 6.1 kernel builds from amazon linux’s fork but i’m doing it manually. a github action that produces the vmlinux artifact would make the docker build self-contained.

the docker situation is too manual. right now zypi needs privileged mode, /dev/kvm, /dev/net/tun, custom iptables rules, a downloaded rootfs, and an injected agent binary. each piece is simple but the assembly is fragile. a compose file with health checks would turn five manual steps into docker compose up. i need to write that.

janet’s embedded runtime is missing some stdlib. string/has-prefix? doesn’t exist in the embedded janet that janetrs 0.8 ships, which is why my passthrough list never matched. string/find with a position check works fine as a replacement. os/shell is also unavailable; the harness api replaces it with rust functions, which is fine once you know.


what’s working

the whole thing runs on a single machine. $12/mo for the deepseek api key. the rest is just compute and curiosity.


repos: github.com/allen-munsch/weft and github.com/allen-munsch/dirge. AGPL-3.0-only. keep the ideas free.