FILE 0x6A·CASS

Cass

A private LLM assistant built on the Claude Agent SDK, running unattended on my homelab. It is my personal AI staff — a chief of staff, a sysadmin, a researcher, and an inbox filter all at once. It is online right now, and it does not require my supervision.

What it does

I talk to Cass through an iOS app and a web UI at x.cwfrazier.com. It has access to my calendar, my mail, my finances (via Plaid), my home network, my homelab, my AWS account, my contacts, my iMessage and Signal archives, my browser sessions, and my voice line. It books haircuts, places phone calls, drafts blog posts, drains my todo list, restarts services when they fall over, and surfaces what I need to know about my day on a schedule I never have to maintain.

It is a personal AI that I built for myself, with the level of integration depth that you can only justify when the audience is exactly one person.

Why I built it

I wanted to find out what an assistant looks like when there is no privacy concern, no enterprise-grade access-control problem, no shared-tenant complexity, and a single user it can know everything about. I also wanted a place to do agent engineering at production stakes without paying production costs — a place to find out which decisions hold up when they have to run unsupervised for months.

Architecture

Runtime

Cass runs as a FastAPI service in a Proxmox LXC container (192.168.1.59), behind nginx with Let's Encrypt at x.cwfrazier.com. The Claude Agent SDK is the reasoning loop. The claude CLI bundled with the SDK authenticates against my Claude Max subscription via OAuth, which keeps the cost of the assistant essentially flat regardless of usage.

Python · FastAPI · Claude Agent SDK · uvicorn · systemd · nginx · Let's Encrypt · Proxmox / LXC

Persistent memory (MCP server)

I built a custom MCP server backed by DynamoDB that gives Cass long-term memory across sessions. It has typed topics, hot vs cold tiers (hot entries auto-load each session; cold entries are searched on demand), full-text search, and conversation indexing. New facts are written by Cass itself with no human in the loop — phone numbers, preferences, project state, debugging gotchas, family relationships, finances. The memory has grown to thousands of typed entries and is the single most valuable component of the system.
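To make the hot vs cold split concrete, here is a minimal sketch of the idea. It is not the actual MCP server: an in-memory list stands in for the DynamoDB table, and the names `MemoryEntry`, `MemoryStore`, `session_preload`, and `search` are hypothetical, chosen only for illustration. The substring match is a stand-in for real full-text search.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemoryEntry:
    topic: str          # typed topic, e.g. "preferences" or "debugging"
    text: str           # the fact itself
    tier: str = "cold"  # "hot" entries auto-load; "cold" entries are searched


@dataclass
class MemoryStore:
    # Hypothetical in-memory stand-in for the DynamoDB-backed table.
    entries: List[MemoryEntry] = field(default_factory=list)

    def write(self, topic: str, text: str, tier: str = "cold") -> MemoryEntry:
        """Store a new fact; the agent would call this with no human review."""
        if tier not in ("hot", "cold"):
            raise ValueError(f"unknown tier: {tier}")
        entry = MemoryEntry(topic, text, tier)
        self.entries.append(entry)
        return entry

    def session_preload(self) -> List[MemoryEntry]:
        """Return the hot entries that get loaded at the start of a session."""
        return [e for e in self.entries if e.tier == "hot"]

    def search(self, query: str, topic: Optional[str] = None) -> List[MemoryEntry]:
        """Naive search: every query term must appear in the entry text."""
        terms = query.lower().split()
        hits = []
        for e in self.entries:
            if topic is not None and e.topic != topic:
                continue
            haystack = e.text.lower()
            if all(term in haystack for term in terms):
                hits.append(e)
        return hits
```

In this sketch, a session would start by calling `session_preload()` once, then fall back to `search()` only when a question needs something outside the hot set.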

Auto-router

Not every request needs the full Claude Max path with the complete tool catalog. A lightweight router classifies each incoming message and dispatches it to one of three paths: the full agent (with the complete tool catalog), a cheap Haiku path (via the Anthropic API), or a direct API call for trivial completions. The router also handles rate-limit fallback: if the Claude Max session is exhausted, the same request re-runs through the cheap API path, and the user never sees a failure.
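The classify-then-dispatch flow with rate-limit fallback can be sketched as follows. This is an illustration under stated assumptions, not the real router: the keyword heuristic in `classify` is a placeholder for whatever classifier Cass actually uses, and `Route`, `RateLimited`, `TOOL_HINTS`, and the handler callables are all hypothetical names.

```python
from enum import Enum
from typing import Callable, Dict


class Route(Enum):
    FULL_AGENT = "full_agent"  # Claude Max session with the complete tool catalog
    HAIKU = "haiku"            # cheap model via the API
    DIRECT = "direct"          # trivial completion, no tools


class RateLimited(Exception):
    """Raised by a handler when its quota is exhausted (hypothetical)."""


# Assumption: a crude keyword list stands in for the real classifier.
TOOL_HINTS = frozenset({"call", "book", "restart", "email", "calendar", "schedule"})


def classify(message: str) -> Route:
    words = set(message.lower().split())
    if words & TOOL_HINTS:
        return Route.FULL_AGENT  # looks like it needs real tools
    if len(words) <= 6:
        return Route.DIRECT      # short and tool-free: treat as trivial
    return Route.HAIKU


def dispatch(message: str, handlers: Dict[Route, Callable[[str], str]]) -> str:
    route = classify(message)
    try:
        return handlers[route](message)
    except RateLimited:
        # Fallback: re-run the same request on the cheap path so the
        # caller gets an answer instead of an error.
        if route is Route.FULL_AGENT:
            return handlers[Route.HAIKU](message)
        raise
```

Only the full-agent path falls back here; a rate limit on the cheap path itself is re-raised, since there is nothing cheaper to fall back to.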

Tools (~30 custom)

calendar · commitments
todo + memory
contacts (iPhone mirror)
Plaid finance
mail (SES catch-all + DDB)
iMessage + Signal index
Teams + Slack scraping
Bitwarden credentials
AWS account ops
Proxmox + LXC restarts
browser driving (Playwright)
MacPilot (drives the Mac)
outbound voice (Amazon Connect)
SMS (Pinpoint, two-way)
memex hybrid search
job-hunt apply pipeline

Mail integration

Every Chester-owned domain has its MX pointed at AWS SES. Inbound mail lands in S3, kicks off a Lambda that writes structured rows into DynamoDB, classifies the message, and forwards a copy to Gmail as a write-only backup. Cass queries the DDB index for any email-related task — "did I get the appointment confirmation," "show me everything from my CPA in May," "summarize today's inbox." The mail layer doubles as the audit trail for any outbound Cass sends (BCC to a dedicated alias).
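The "structured rows" step can be illustrated with the standard library alone. The sketch below shows what a Lambda might do after fetching the raw message from S3; the S3 read and the DynamoDB write are omitted, and the function name `mail_to_row` and the row's field names are assumptions, not the real schema.

```python
from email import message_from_bytes, policy
from email.utils import parseaddr, parsedate_to_datetime


def mail_to_row(raw: bytes, s3_key: str) -> dict:
    """Turn a raw RFC 5322 message into a flat, index-friendly row."""
    msg = message_from_bytes(raw, policy=policy.default)

    # Prefer the plain-text part; fall back to an empty body if none exists.
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""

    date_header = msg["Date"]
    sent = parsedate_to_datetime(str(date_header)) if date_header else None

    return {
        # Hypothetical field names; fall back to the S3 key when no Message-ID.
        "message_id": str(msg["Message-ID"] or s3_key),
        "from_addr": parseaddr(str(msg["From"] or ""))[1].lower(),
        "to_addr": parseaddr(str(msg["To"] or ""))[1].lower(),
        "subject": str(msg["Subject"] or ""),
        "sent_at": sent.isoformat() if sent else None,
        "snippet": text.strip()[:200],  # short preview for listing queries
        "s3_key": s3_key,               # pointer back to the raw message
    }
```

A query like "show me everything from my CPA in May" then reduces to a filter on `from_addr` and `sent_at` rather than a scan of raw mail.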

Outbound voice and SMS

Two-way SMS goes through AWS End User Messaging on a registered toll-free number, with inbound replies routed back into the same Cass conversation that initiated the thread. Voice calls go through Amazon Connect; each call is recorded automatically, and 30 seconds after disconnect the recording lands in S3, where it can be transcribed or replayed. Cass uses these for real-world tasks: confirming reservations, calling the vet, dealing with utility companies.
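Routing a reply back to the conversation that started the thread comes down to remembering which conversation last texted each number. A minimal sketch of that bookkeeping, assuming a plain dictionary in place of whatever persistent store is really used; `SmsThreadRouter`, `normalize`, and the conversation IDs are hypothetical.

```python
from typing import Dict, Optional


def normalize(number: str) -> str:
    """Reduce a phone number to a comparable +digits form."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) == 10:
        # Assumption: bare 10-digit numbers are US/Canada numbers.
        digits = "1" + digits
    return "+" + digits


class SmsThreadRouter:
    def __init__(self) -> None:
        # Maps a normalized phone number to the conversation that texted it last.
        self._threads: Dict[str, str] = {}

    def record_outbound(self, to_number: str, conversation_id: str) -> None:
        """Remember which conversation initiated (or continued) this thread."""
        self._threads[normalize(to_number)] = conversation_id

    def route_inbound(self, from_number: str) -> Optional[str]:
        """Return the conversation a reply belongs to, or None if unknown."""
        return self._threads.get(normalize(from_number))
```

A reply from an unknown number returns `None`, which leaves the caller free to open a new conversation or drop the message.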

Memex — queryable second brain

Beneath Cass sits Memex, a retrieval index over my entire personal corpus: ~80,000 messages across iMessage, Signal, email, and Teams. It polls every minute, normalizes the results, writes raw JSONL to S3, and maintains a hybrid (BM25 + embedding) index. Cass queries Memex when I ask “what did Sheri say about the new building” or “find the last time we discussed the well pump.”
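The text above does not say how the BM25 and embedding results are combined. One common option is reciprocal rank fusion, sketched below as an assumption rather than as Memex's actual method. It takes the ranked ID lists produced by each retriever and merges them; the function name, the message IDs, and the constant `k=60` (a conventional default for this technique) are illustrative.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending; break ties by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Usage: hypothetical result lists from the keyword and embedding retrievers.
bm25_hits = ["m1", "m2", "m3"]
embedding_hits = ["m2", "m4", "m1"]
fused = reciprocal_rank_fusion([bm25_hits, embedding_hits])
```

Rank-based fusion avoids having to put BM25 scores and cosine similarities on a common scale, which is why it is a frequent first choice for hybrid search.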

Operating model

Why this is a thing other engineers should care about

Cass is the most honest portfolio piece I have, because it is the system I trust myself to. It is what I built when I had no client, no PRD, no committee — just the question “what is the smallest version of an AI agent platform that I would actually depend on every day.” The answers are baked into the architecture: persistent memory matters more than reasoning sophistication; tool depth matters more than tool count; the assistant's job is to take real action in the real world, not summarize web pages.

Stack summary

Claude Agent SDK · Python · FastAPI · MCP (custom) · DynamoDB · AWS Lambda · SES · Pinpoint / End User Messaging · Amazon Connect · Proxmox / LXC · nginx · Let's Encrypt · Route 53 · Playwright
