Cass
A private LLM assistant built on the Claude Agent SDK, running unattended on my homelab. It is my personal AI staff — a chief of staff, a sysadmin, a researcher, and an inbox filter all at once. It is online right now, and it does not require my supervision.
What it does
I talk to Cass through an iOS app and a web UI at x.cwfrazier.com. It has access to my calendar, my mail, my finances (via Plaid), my home network, my homelab, my AWS account, my contacts, my iMessage and Signal archives, my browser sessions, and my voice line. It books haircuts, places phone calls, drafts blog posts, drains my todo list, restarts services when they fall over, and surfaces what I need to know about my day on a schedule I never have to maintain.
It is a personal AI that I built for myself, with the level of integration depth that you can only justify when the audience is exactly one person.
Why I built it
I wanted to find out what an assistant looks like when there is no privacy concern, no enterprise-grade access-control problem, no shared-tenant complexity, and a single user it can know everything about. I also wanted a place to do agent engineering at production stakes without paying production costs — a place to find out which decisions hold up when they have to run unsupervised for months.
Architecture
Runtime
FastAPI on a Proxmox LXC container (192.168.1.59), behind nginx with Let's Encrypt at x.cwfrazier.com. The Claude Agent SDK is the reasoning loop. The bundled claude CLI ships with the SDK and authenticates against my Claude Max subscription via OAuth, which is how I keep the cost of the assistant essentially flat regardless of usage.
Persistent memory (MCP server)
I built a custom MCP server backed by DynamoDB that gives Cass long-term memory across sessions. It has typed topics, hot vs cold tiers (hot entries auto-load each session; cold entries are searched on demand), full-text search, and conversation indexing. New facts are written by Cass itself with no human in the loop — phone numbers, preferences, project state, debugging gotchas, family relationships, finances. The memory has grown to thousands of typed entries and is the single most valuable component of the system.
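To make the hot/cold split concrete, here is a minimal sketch of the idea. It is not the real server: an in-memory list stands in for DynamoDB, and `MemoryEntry`, `MemoryStore`, `session_preload`, and `search` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemoryEntry:
    topic: str          # typed topic, e.g. "preference" or "debugging"
    text: str
    tier: str = "cold"  # "hot" entries auto-load; "cold" entries are searched


@dataclass
class MemoryStore:
    """In-memory stand-in for the DynamoDB-backed store (assumption)."""
    entries: List[MemoryEntry] = field(default_factory=list)

    def write(self, topic: str, text: str, tier: str = "cold") -> None:
        self.entries.append(MemoryEntry(topic, text, tier))

    def session_preload(self) -> List[MemoryEntry]:
        # Hot entries are loaded into every new session up front.
        return [e for e in self.entries if e.tier == "hot"]

    def search(self, query: str, topic: Optional[str] = None) -> List[MemoryEntry]:
        # Naive full-text match: every query term must appear in the text.
        terms = query.lower().split()
        return [
            e for e in self.entries
            if (topic is None or e.topic == topic)
            and all(t in e.text.lower() for t in terms)
        ]


store = MemoryStore()
store.write("preference", "Prefers morning appointments", tier="hot")
store.write("debugging", "nginx reload fails if the cert path is wrong")
```

Calling `store.session_preload()` returns only the hot preference entry, while `store.search("nginx cert")` finds the cold debugging note on demand.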
Auto-router
Not every request needs the full Claude Max + full tools path. A lightweight router classifies the incoming message and dispatches to either the full agent (with the complete tool catalog), a cheap Haiku path (via the Anthropic API), or a direct API call for trivial completions. The router also handles rate-limit fallback: if the Claude Max session is exhausted, the same request seamlessly re-runs through the cheap API path, and the user never sees a failure.
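The routing and fallback logic can be sketched in a few lines. This is an illustration under stated assumptions, not the actual router: the keyword heuristic, the `RateLimited` exception, and the handler names are all hypothetical, and the real classifier is presumably smarter than a keyword match.

```python
from typing import Callable, Dict


class RateLimited(Exception):
    """Raised by the (hypothetical) full-agent path when its quota is spent."""


# Assumed heuristic: words that suggest the request needs tools.
TOOL_HINTS = ("book", "call", "restart", "email", "schedule", "deploy")


def classify(message: str) -> str:
    """Return "full", "haiku", or "direct" for an incoming message."""
    text = message.lower()
    if any(hint in text for hint in TOOL_HINTS):
        return "full"
    if len(text.split()) <= 6:
        return "direct"
    return "haiku"


def dispatch(message: str, handlers: Dict[str, Callable[[str], str]]) -> str:
    route = classify(message)
    if route == "full":
        try:
            return handlers["full"](message)
        except RateLimited:
            # Re-run the same request on the cheap path; the caller
            # only ever sees a normal reply.
            return handlers["haiku"](message)
    return handlers[route](message)


def exhausted_full_agent(message: str) -> str:
    # Simulates a full-agent session that has hit its rate limit.
    raise RateLimited("session quota used up")


HANDLERS = {
    "full": exhausted_full_agent,
    "haiku": lambda m: f"[haiku] {m}",
    "direct": lambda m: f"[direct] {m}",
}
```

With the simulated exhausted session, `dispatch("Restart the nginx service", HANDLERS)` is classified as a full-agent request, hits the rate limit, and is answered by the cheap handler instead.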
Tools (~30 custom)
Mail integration
Every Chester-owned domain has its MX pointed at AWS SES. Inbound mail lands in S3 and kicks off a Lambda that writes structured rows into DynamoDB, classifies the message, and forwards a copy to Gmail as a write-only backup. Cass queries the DDB index for any email-related task: "did I get the appointment confirmation," "show me everything from my CPA in May," "summarize today's inbox." The mail layer doubles as the audit trail for any outbound mail Cass sends (BCC to a dedicated alias).
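The step where the Lambda turns a raw message into a structured row might look roughly like the sketch below. It uses only the standard library; fetching the object from S3 and writing the row to DynamoDB are omitted. The row fields, the `classify` keyword rules, and the sample message are assumptions for illustration, not the real schema.

```python
from email import message_from_bytes, policy
from email.utils import parseaddr


def classify(subject: str) -> str:
    # Toy keyword classifier standing in for the real one (assumption).
    s = subject.lower()
    if "confirm" in s or "appointment" in s:
        return "appointment"
    if "invoice" in s or "receipt" in s:
        return "finance"
    return "other"


def email_to_row(raw: bytes, s3_key: str) -> dict:
    """Parse a raw RFC 5322 message into a flat, indexable row."""
    msg = message_from_bytes(raw, policy=policy.default)
    subject = str(msg["Subject"] or "")
    return {
        "message_id": str(msg["Message-ID"] or s3_key),
        "sender": parseaddr(str(msg["From"] or ""))[1].lower(),
        "subject": subject,
        "date": str(msg["Date"] or ""),
        "category": classify(subject),
        "s3_key": s3_key,  # pointer back to the raw message in S3
    }


SAMPLE = (
    b"From: Vet Clinic <Front.Desk@example.com>\r\n"
    b"To: me@example.org\r\n"
    b"Subject: Appointment confirmation for Tuesday\r\n"
    b"\r\n"
    b"See you at 9am.\r\n"
)
row = email_to_row(SAMPLE, "inbound/abc123")
```

Keeping the S3 key in each row means the index stays small while the full message remains retrievable when a task needs the body.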
Outbound voice and SMS
Two-way SMS goes through AWS End User Messaging on a registered toll-free number, with inbound replies routed back into the same Cass conversation that initiated the thread. Voice calls go through Amazon Connect; the call is recorded automatically and the recording lands in S3 30 seconds after disconnect, where it can be transcribed or replayed. Cass uses these for real-world tasks — confirming reservations, calling the vet, dealing with utility companies.
Memex — queryable second brain
Beneath Cass sits Memex, a hybrid retrieval index over my entire personal corpus: ~80,000 messages across iMessage, Signal, email, and Teams. It polls every minute, normalizes what it finds, writes raw JSONL to S3, and maintains a hybrid (BM25 + embedding) index. Cass queries Memex when I ask “what did Sheri say about the new building” or “find the last time we discussed the well pump.”
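How the BM25 and embedding results are merged is not described here. One common way to combine two ranked lists is reciprocal rank fusion, sketched below as an assumption rather than as Memex's actual method. The message IDs and the constant `k = 60` are illustrative.

```python
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so items
    ranked well by both the keyword and the embedding search rise to the top.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["m3", "m1", "m2"]       # hypothetical keyword-search ranking
embedding_hits = ["m1", "m4", "m3"]  # hypothetical vector-search ranking
fused = reciprocal_rank_fusion([bm25_hits, embedding_hits])
```

Here `m1` and `m3` appear in both lists and lead the fused ranking, ahead of `m4` and `m2`, which each appear in only one.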
Operating model
- Always on. Cass is reachable 24/7 from my phone, my Mac, or any browser. It has been the front door to my digital life for months.
- Acts on its own. Cron tasks let Cass initiate work without me — daily digests, morning briefings, garden sprinkler weather watch, escalation handling from Evangeline (my girlfriend's assistant, a separate Cass deployment on her household tenant).
- Cheap to run. Total monthly cost is dominated by the Claude Max subscription. Bedrock spend is sub-$10/mo. Lambda + DynamoDB are pennies. The homelab is sunk cost.
- Self-restartable. Cass has a /admin/restart endpoint that drains in-flight work before SIGTERM and lets systemd bring it back; deploys are safe to run even mid-conversation.
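The drain-then-restart behavior can be sketched with asyncio. This is a minimal illustration, not the actual endpoint: the `Drainer` class and its method names are hypothetical, and the SIGTERM step is left as a comment so the example is safe to run.

```python
import asyncio
from typing import List, Set, Tuple


class Drainer:
    """Tracks in-flight work so a restart can wait for it to finish."""

    def __init__(self) -> None:
        self.accepting = True
        self._in_flight: Set[asyncio.Task] = set()

    def submit(self, coro) -> asyncio.Task:
        if not self.accepting:
            raise RuntimeError("draining: not accepting new work")
        task = asyncio.create_task(coro)
        self._in_flight.add(task)
        task.add_done_callback(self._in_flight.discard)
        return task

    async def drain(self, timeout: float = 30.0) -> bool:
        # Stop taking new work, then wait for what is already running.
        self.accepting = False
        if not self._in_flight:
            return True
        _, pending = await asyncio.wait(set(self._in_flight), timeout=timeout)
        return not pending  # True when everything finished in time


async def demo() -> Tuple[bool, List[str]]:
    finished: List[str] = []

    async def job(name: str) -> None:
        await asyncio.sleep(0.01)
        finished.append(name)

    drainer = Drainer()
    drainer.submit(job("a"))
    drainer.submit(job("b"))
    drained = await drainer.drain(timeout=1.0)
    # A real endpoint would now signal its own process, for example
    # os.kill(os.getpid(), signal.SIGTERM), and let systemd restart it.
    return drained, sorted(finished)
```

Running `asyncio.run(demo())` shows both jobs completing before the point where the process would be signaled.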
Why this is a thing other engineers should care about
Cass is the most honest portfolio piece I have, because it is the system I rely on myself. It is what I built when I had no client, no PRD, no committee, just the question “what is the smallest version of an AI agent platform that I would actually depend on every day?” The answers are baked into the architecture: persistent memory matters more than reasoning sophistication; tool depth matters more than tool count; the assistant's job is to take real action in the real world, not summarize web pages.