A live dashboard for an overnight coding agent

The thing stopping me from leaving a coding agent running overnight was never capability. It was trust. If I close the laptop and let an agent grind on a repo for six hours, I want to know what it's doing without tailing a terminal in bed. I want a board I can glance at and a text when something actually happens.

So before I pointed an agent at a real open-source repo, I set aside an afternoon to give it eyes. The agent did most of the wiring itself, so I didn't need the whole afternoon: total time from "I should build this" to a live, notify-me-on-key-events dashboard was under an hour, and the interesting part is how little code it took.

What makes "overnight" literal

I ran this with Jacq, an autonomous agent that runs tools locally but hands long tasks off to a cloud worker when you close your laptop. Two properties matter for unattended work:

Cloud handoff. "Overnight" is only real if the work survives me going to sleep. The laptop closing can't be the thing that kills the run.
A safety classifier on every tool call. Routine edits run on autopilot; destructive or irreversible actions pause for approval. That's not a nice-to-have for unattended runs; it's the whole reason you can walk away. But it also means you will get paged for an approval at some point, which changes how you design notifications (more on that below).

What it didn't ship with was a way for me to watch from across the house. That part's on me, and it's small.

Structured milestones, not a token firehose

An agent will happily stream you a wall of reasoning. That's the wrong thing to watch. I wanted discrete milestones — picked up issue, wrote a failing test, fix applied, suite green, PR opened — so a progress bar could fill up instead of a log scrolling by.

The contract is just a JSON event:

{
  "issue": "#29498",
  "phase": "test",
  "level": "success",
  "title": "New regression test passes; suite green",
  "pr_url": null
}

phase is one of setup · repro · fix · test · lint · pr · blocked · done. Those eight strings drive a per-issue progress rail and a set of stat tiles. That's the entire API surface the agent has to care about.
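Since the contract is that small, it's cheap to check at the door. Here is a minimal sketch of a validator, not the code running on the board. The function name validate_event is hypothetical, and the set of levels is an assumption: the post only ever shows "success" and "error".

```python
from typing import Any

# The eight phases named in the text.
PHASES = ("setup", "repro", "fix", "test", "lint", "pr", "blocked", "done")
# Assumption: only "success" and "error" appear in the post; the rest are guesses.
LEVELS = ("info", "success", "warning", "error")


def validate_event(ev: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the event is acceptable."""
    problems = []
    if ev.get("phase") not in PHASES:
        problems.append(f"unknown phase: {ev.get('phase')!r}")
    if ev.get("level") not in LEVELS:
        problems.append(f"unknown level: {ev.get('level')!r}")
    if not isinstance(ev.get("title"), str) or not ev["title"].strip():
        problems.append("title must be a non-empty string")
    if ev.get("pr_url") is not None and not isinstance(ev["pr_url"], str):
        problems.append("pr_url must be a string or null")
    return problems
```

Returning the problems instead of raising lets the receiver decide whether to reject a malformed event or store it anyway and flag it in the feed.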

The receiver is boring on purpose

A single FastAPI file. Events get appended to a JSONL file (one run, no schema migrations, no database to babysit) and pushed to any connected browser over Server-Sent Events. I picked SSE over WebSockets here because the data only flows one way and EventSource reconnects itself.

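import json
from datetime import datetime, timezone
from fastapi import Header, HTTPException, Request

# app, TOKEN, EVENTS_LOG, broadcast and notify are defined elsewhere in the same file.

@app.post("/api/event")
async def post_event(request: Request, x_token: str = Header(default="")):
    # x_token is read from the X-Token header; the check is skipped if TOKEN is empty
    if TOKEN and x_token != TOKEN:
        raise HTTPException(401, "bad token")
    ev = await request.json()
    ev["ts"] = datetime.now(timezone.utc).isoformat()   # server-side timestamp
    with open(EVENTS_LOG, "a") as f:
        f.write(json.dumps(ev) + "\n")
    broadcast(ev)                       # fan out to SSE subscribers
    # .get() so an event missing a field is still stored instead of raising KeyError
    if ev.get("phase") in {"pr", "blocked", "done"} or ev.get("level") == "error":
        notify(ev)                      # text / email on the events that matter
    return {"ok": True}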

The SSE endpoint is a queue per subscriber and a keepalive so proxies don't hang up on an idle stream:

@app.get("/api/stream")
async def stream():
    q = asyncio.Queue()
    subscribers.add(q)
    async def gen():
        try:
            while True:
                try:
                    ev = await asyncio.wait_for(q.get(), timeout=20)
                    yield f"data: {json.dumps(ev)}\n\n"
                except asyncio.TimeoutError:
                    yield ": keepalive\n\n"
        finally:
            subscribers.discard(q)
    return StreamingResponse(gen(), media_type="text/event-stream")
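The post handler calls broadcast(ev) without showing it. Here is one way it could look, as a sketch under assumptions rather than the code on the board: subscribers is a module-level set, and each queue is bounded (the endpoint above uses an unbounded asyncio.Queue()), so a stalled browser tab can't grow memory all night.

```python
import asyncio

# Assumption: one queue per connected browser, held in a module-level set.
subscribers: set[asyncio.Queue] = set()


def broadcast(ev: dict) -> None:
    """Offer the event to every subscriber without blocking the POST handler."""
    for q in list(subscribers):
        try:
            q.put_nowait(ev)
        except asyncio.QueueFull:
            # Only reachable with a bounded queue: drop the oldest undelivered
            # event so a slow client sees the most recent ones.
            q.get_nowait()
            q.put_nowait(ev)
```

With the unbounded queue from the endpoint, the except branch never fires and this reduces to a plain loop of put_nowait calls.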

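The front end is one EventSource and a reducer that maps the five rail phases to positions on the rail. setup, blocked, and done have no entry in STEP, so advanceRail receives undefined for those and has to tolerate it: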

const STEP = { repro: 0, fix: 1, test: 2, lint: 3, pr: 4 };
const es = new EventSource("/api/stream");
es.onmessage = (e) => {
  const ev = JSON.parse(e.data);
  if (ev.issue) advanceRail(ev.issue, STEP[ev.phase], ev.level);
  prependToFeed(ev);
};

That's the whole live board: stat tiles, a progress rail per issue, and a feed that updates the instant the agent does something.
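The same reducer can be run server-side over the JSONL log, for example to give a freshly opened tab its starting state. The post doesn't say how the board handles that, so treat this as a sketch under assumptions: the name rail_positions is hypothetical, and it assumes a rail only ever moves forward.

```python
import json
from typing import Iterable

# Same rail order as the front end's STEP table.
STEP = {"repro": 0, "fix": 1, "test": 2, "lint": 3, "pr": 4}


def rail_positions(lines: Iterable[str]) -> dict[str, int]:
    """Fold JSONL event lines into the furthest rail step reached per issue."""
    furthest: dict[str, int] = {}
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the log
        ev = json.loads(line)
        issue, step = ev.get("issue"), STEP.get(ev.get("phase"))
        if not issue or issue == "-" or step is None:
            continue  # setup/blocked/done and run-level events have no rail step
        furthest[issue] = max(step, furthest.get(issue, -1))
    return furthest
```

A file object iterates line by line, so `with open(EVENTS_LOG) as f: rail_positions(f)` would be enough to rebuild the rails from the log.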

Notifications that don't nag

The fast way to make a dashboard useless is to push every event to your phone. I only fan out on three phases — pr (it opened a draft PR), blocked (it needs my call on something), and done — plus any hard error, and I throttle to one push per 20 seconds. Email goes through SES; text goes through a small SMS helper I already had. Both are behind a toggle on the board itself, so I can flip text on when I want to be woken up and leave it on email otherwise.
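The filter-plus-throttle rule fits in a few lines. This is a minimal sketch, not the notifier behind the board: the names Throttle and should_notify are hypothetical, and it assumes a throttled event is simply dropped from the push channel (it still lands in the feed).

```python
import time
from typing import Callable, Optional

NOTIFY_PHASES = {"pr", "blocked", "done"}


class Throttle:
    """Allow at most one push per `interval` seconds."""

    def __init__(self, interval: float = 20.0,
                 clock: Callable[[], float] = time.monotonic):
        self.interval = interval
        self.clock = clock                      # injectable so it can be tested
        self._last: Optional[float] = None      # time of the last allowed push

    def allow(self) -> bool:
        now = self.clock()
        if self._last is not None and now - self._last < self.interval:
            return False
        self._last = now
        return True


def should_notify(ev: dict, throttle: Throttle) -> bool:
    """True for pr/blocked/done or any error, subject to the throttle."""
    wanted = ev.get("phase") in NOTIFY_PHASES or ev.get("level") == "error"
    return wanted and throttle.allow()
```

Checking the phase before the throttle matters: an uninteresting event never consumes the 20-second window.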

The blocked event is the important one, and it's a direct consequence of that safety classifier. A good unattended agent isn't one that never stops — it's one that stops at exactly the right moment and tells you why. Design your alerts around the human-in-the-loop beats, not just the wins.

Wiring the agent in was one line

Here's the part that still feels slightly unfair. Because the agent can run a shell, "integrate with my dashboard" isn't an SDK or a plugin. It's a function I dropped into its brief:

report() {  # report <phase> <issue|-> <level> <title> [pr_url]
  # Arguments are interpolated raw: a double quote in <title> would produce invalid JSON.
  curl -s -X POST "$DASH/api/event" \
    -H "X-Token: $TOKEN" -H "Content-Type: application/json" \
    -d "{\"phase\":\"$1\",\"issue\":\"$2\",\"level\":\"$3\",\"title\":\"$4\",\"pr_url\":\"$5\"}"
}

The brief tells it to call report at each milestone, with a couple of examples. That's the entire integration. No tool definitions, no registration, no schema package — the agent already knows how to use curl, so the cheapest possible interface is the right one.
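The shell function builds its JSON by string interpolation, so it relies on titles staying free of double quotes. If the agent is already running Python, an equivalent helper can let json.dumps do the escaping. This is a sketch, not what I gave the agent: it reuses the DASH and TOKEN environment variable names from the shell version, and build_payload is a hypothetical name.

```python
import json
import os
import urllib.request
from typing import Optional


def build_payload(phase: str, issue: str, level: str, title: str,
                  pr_url: Optional[str] = None) -> bytes:
    """Serialize one event; json.dumps escapes quotes and newlines in the title."""
    event = {"phase": phase, "issue": issue, "level": level,
             "title": title, "pr_url": pr_url}
    return json.dumps(event).encode("utf-8")


def report(phase: str, issue: str, level: str, title: str,
           pr_url: Optional[str] = None) -> int:
    """POST one event to the dashboard and return the HTTP status code."""
    req = urllib.request.Request(
        os.environ["DASH"] + "/api/event",
        data=build_payload(phase, issue, level, title, pr_url),
        headers={"X-Token": os.environ.get("TOKEN", ""),
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

A side effect is that an omitted pr_url goes out as null, matching the event shown earlier, where the shell version sends an empty string.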

What I'd change

JSONL is fine for a single run. If you want a history across runs, point it at Postgres and add a run_id. I deliberately didn't — ephemeral is correct for a demo board.
One throttle for everything is crude. A real version would separate "FYI" pushes from "I'm blocked, come look" pushes and only let the second category wake you up.
The board should show the diff. Right now a pr event links the PR; next iteration it inlines the patch so I can approve from my phone instead of opening GitHub.
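The two-tier notification idea could start as a plain classifier. This is a sketch of a direction, not something the board does today: the tier names and the channel routing are hypothetical, and treating a hard error as wake-worthy is my assumption.

```python
def urgency(ev: dict) -> str:
    """Classify an event as 'wake', 'fyi', or 'silent' (labels are hypothetical)."""
    if ev.get("phase") == "blocked" or ev.get("level") == "error":
        return "wake"       # needs a human decision
    if ev.get("phase") in {"pr", "done"}:
        return "fyi"        # good news that can wait until morning
    return "silent"


# Hypothetical routing: only 'wake' reaches SMS; 'fyi' goes to email.
CHANNELS = {"wake": ["sms", "email"], "fyi": ["email"], "silent": []}


def channels_for(ev: dict) -> list[str]:
    """Return the channels an event should be pushed to."""
    return CHANNELS[urgency(ev)]
```

Each tier could then get its own throttle, so a burst of FYI pushes can't delay the one message that says the agent is blocked.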

The honest takeaway for anyone evaluating this class of tool: the agents are already good. The leverage now is in the thin layer around them — observability, the right notification design, and an interface cheap enough that wiring it up is an afternoon, not a project. The board's live, it's pointed at a real repo, and the genuinely fun part is watching the rails fill in from across the room.