The AWS bill that tripled (and how I didn't notice for six days)
The email from AWS arrived on a Tuesday. Subject line: "Your estimated charges have exceeded your billing alert threshold."
The threshold was $100. The bill was at $312 and it was only the 9th of the month.
I'd spun up an r6g.xlarge to run a batch job the previous Thursday. The job finished in two hours. I forgot to terminate the instance.
Six days of r6g.xlarge at $0.2016/hr: $29.03. The real problem was the NAT gateway I'd left on alongside it — $45 of data processing I hadn't expected. Then the CloudWatch logs that were aggregating to S3. Then the S3 Intelligent-Tiering that kicked in on a bucket I didn't know existed.
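As a quick check on that first figure, here is the arithmetic spelled out. The rate and duration are the ones quoted above; nothing else is assumed.

```python
# Worked arithmetic for the forgotten instance, using the figures quoted above.
HOURLY_RATE_USD = 0.2016  # r6g.xlarge hourly rate as quoted in the post
DAYS_RUNNING = 6

hours = DAYS_RUNNING * 24
instance_cost = round(HOURLY_RATE_USD * hours, 2)

print(f"{hours} hours x ${HOURLY_RATE_USD}/hr = ${instance_cost}")
```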
None of these was alarming on its own. Together they came to $212 more than I expected.
The thing that failed wasn't AWS. It was my monitoring.
The billing alert problem
I had set up a billing alert. It fired. That's a success on paper.
But a billing alert only tells you the damage after it's done. You set a threshold at $100, and you find out when you've crossed it. You don't find out which service caused it, or when it started, or whether it's still running.
A threshold alert is a circuit breaker. It's not a smoke alarm.
What I actually needed was: tell me the day a service's cost increases significantly, before it compounds.
What AWS Cost Anomaly Detection does (and doesn't)
AWS has a built-in answer to this: Cost Anomaly Detection. I tried it.
What it does well: machine learning-based anomaly detection that learns your baseline over time. For large, complex accounts with many services and unpredictable traffic patterns, it's genuinely good.
What it doesn't do well: for small accounts (1-3 engineers, <$500/mo), it frequently generates false positives early on and doesn't give you a plain-language daily summary. AWS doesn't charge extra for the detection itself, so the cost is your attention rather than your bill.
More importantly: the notification format is dense. You get a JSON blob with confidence intervals and anomaly impact ranges. It's accurate, but it's not actionable without reading it carefully.
For a solo dev who wants to know "did anything weird happen yesterday?", it's overkill and then some.
What I built instead
CostWatch is a simpler version of the same idea:
- Every morning at 06:00 UTC, assume a read-only IAM role in your account
- Fetch yesterday's cost per service (Cost Explorer API)
- Compare against the 7-day rolling average
- If any service came in more than 50% above that average AND more than $5 above it in absolute terms, send an email
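The fetch and averaging steps in the list above could be sketched roughly like this. It is an illustration, not CostWatch's actual code: `fetch_daily_costs` and `yesterday_vs_average` are hypothetical names, the choice of the `UnblendedCost` metric is an assumption, and so is the reading of "7-day rolling average" as the seven days before yesterday. The only real API used is the Cost Explorer `get_cost_and_usage` call, made through a client that is passed in. Pagination via `NextPageToken` is ignored for brevity.

```python
from collections import defaultdict
from datetime import date, timedelta


def fetch_daily_costs(ce_client, start: date, end: date) -> dict:
    """Return {service: {iso_day: cost_usd}} for start <= day < end."""
    response = ce_client.get_cost_and_usage(
        # Cost Explorer treats End as exclusive.
        TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
        Granularity="DAILY",
        Metrics=["UnblendedCost"],  # assumption: the post does not name a metric
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
    costs = defaultdict(dict)
    for day in response["ResultsByTime"]:
        day_start = day["TimePeriod"]["Start"]
        for group in day["Groups"]:
            service = group["Keys"][0]
            amount = group["Metrics"]["UnblendedCost"]["Amount"]  # returned as a string
            costs[service][day_start] = float(amount)
    return dict(costs)


def yesterday_vs_average(costs: dict, today: date, window: int = 7) -> dict:
    """Return {service: (yesterday_cost, average_of_the_window_days_before_it)}."""
    yesterday = (today - timedelta(days=1)).isoformat()
    # Assumption: the baseline excludes yesterday itself.
    baseline_days = [(today - timedelta(days=n)).isoformat() for n in range(2, window + 2)]
    summary = {}
    for service, by_day in costs.items():
        # Days with no recorded spend count as $0 in the average.
        average = sum(by_day.get(d, 0.0) for d in baseline_days) / window
        summary[service] = (by_day.get(yesterday, 0.0), average)
    return summary
```

With boto3 installed, `ce_client` would be something like `boto3.client("ce", region_name="us-east-1")`, and the fetch window would be `today - timedelta(days=8)` through `today`.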
The $5 floor matters. Without it, a service that went from $0.01/day to $0.02/day would trigger an alert. The combination of percentage threshold and absolute threshold means you only hear about things worth knowing.
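A minimal sketch of that two-part rule, assuming the $5 floor applies to the increase over the average rather than to the day's total. `check_service` and `Anomaly` are hypothetical names, and flagging a service that has no baseline at all is my own choice, not something the rule above specifies.

```python
from typing import NamedTuple, Optional

PCT_THRESHOLD = 0.50      # "more than 50%"
ABS_THRESHOLD_USD = 5.00  # "more than $5"


class Anomaly(NamedTuple):
    service: str
    cost: float
    average: float
    pct_increase: float  # 1.63 means +163%


def check_service(service: str, cost: float, average: float) -> Optional[Anomaly]:
    """Return an Anomaly only if BOTH thresholds are exceeded, else None."""
    increase = cost - average
    if increase <= ABS_THRESHOLD_USD:
        return None  # too small in dollars to matter, whatever the percentage
    if average <= 0:
        # No baseline spend: the percentage is undefined, so report it as infinite.
        return Anomaly(service, cost, average, float("inf"))
    pct_increase = increase / average
    if pct_increase <= PCT_THRESHOLD:
        return None  # large in dollars but within normal variation for this service
    return Anomaly(service, cost, average, pct_increase)


# The one-cent example doubles in percentage terms but is filtered by the floor.
print(check_service("Tiny service", 0.02, 0.01))
print(check_service("Amazon EC2", 47.82, 18.20))
```

The second call is flagged, since $47.82 is both more than $5 and more than 50% above $18.20. The first call returns `None`.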
The email looks like this:
CostWatch anomaly alert — Production
Yesterday's spend had unexpected spikes:
Amazon EC2: $47.82 (7-day avg: $18.20, +163%)
AWS Data Transfer: $12.44 (7-day avg: $4.80, +159%)
No JSON, no confidence intervals, no machine learning required. Just: this service cost more than usual yesterday.
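Producing lines in that shape takes very little code. The sketch below reproduces the per-service lines from the sample email. `format_line` and `format_alert_body` are hypothetical names, and sorting by yesterday's cost is an assumption based on the order in the sample.

```python
def format_line(service: str, cost: float, average: float) -> str:
    """One line per flagged service, in the same shape as the sample email."""
    if average <= 0:
        return f"{service}: ${cost:.2f} (no spend in the previous 7 days)"
    pct = (cost - average) / average * 100
    return f"{service}: ${cost:.2f} (7-day avg: ${average:.2f}, {pct:+.0f}%)"


def format_alert_body(anomalies) -> str:
    """anomalies: iterable of (service, cost, average) tuples."""
    ordered = sorted(anomalies, key=lambda a: a[1], reverse=True)  # most expensive first
    lines = ["Yesterday's spend had unexpected spikes:"]
    lines += [format_line(service, cost, average) for service, cost, average in ordered]
    return "\n".join(lines)


print(format_alert_body([
    ("AWS Data Transfer", 12.44, 4.80),
    ("Amazon EC2", 47.82, 18.20),
]))
```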
The cross-account IAM setup
The part I spent the most time on was making the account connection trustworthy.
CostWatch uses STS AssumeRole with an external ID to access Cost Explorer in your account. The IAM role is read-only: five Cost Explorer (ce:) actions, nothing else. No EC2, no S3, no IAM read. The external ID (cw-{your-user-id}) guards against the confused deputy problem, where someone else who knows your role ARN asks CostWatch to read your account on their behalf.
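On the calling side, the AssumeRole step might look like the sketch below. It is an illustration under assumptions: `assume_costwatch_role`, `external_id_for`, and the session name are hypothetical, and the STS client is passed in so the function can be exercised without AWS credentials. The `assume_role` parameters and the shape of the `Credentials` response are the standard STS ones.

```python
def external_id_for(user_id: str) -> str:
    """External ID in the cw-{your-user-id} format described above."""
    return f"cw-{user_id}"


def assume_costwatch_role(sts_client, role_arn: str, user_id: str) -> dict:
    """Assume the customer's read-only role and return temporary credentials.

    The returned keys match the keyword arguments boto3.client() accepts.
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName="costwatch-daily-check",  # hypothetical session name
        ExternalId=external_id_for(user_id),      # must match the role's trust policy
        DurationSeconds=900,                      # shortest session STS allows
    )
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

With boto3 installed, usage would be along the lines of `boto3.client("ce", region_name="us-east-1", **assume_costwatch_role(boto3.client("sts"), role_arn, user_id))`. If the external ID does not match the trust policy, STS rejects the call with an access-denied error.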
You deploy a CloudFormation snippet that creates the role. The stack output is the role ARN, which you paste into the CostWatch dashboard. CostWatch validates it immediately by doing a test sts:AssumeRole — if your trust policy is wrong, you get an error message with the exact issue.
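For a sense of what such a template contains, here is a sketch that builds one as a Python dict and prints it as JSON, which CloudFormation accepts. This is not the actual CostWatch snippet. The resource name, the account ID, and the policy name are placeholders. Only `ce:GetCostAndUsage` is listed because the full set of five actions isn't spelled out here.

```python
import json


def build_template(costwatch_account_id: str, external_id: str) -> dict:
    """Build a CloudFormation template for a cross-account, read-only role."""
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Placeholder principal: the account that will call sts:AssumeRole.
            "Principal": {"AWS": f"arn:aws:iam::{costwatch_account_id}:root"},
            "Action": "sts:AssumeRole",
            # The role can only be assumed when this external ID is supplied.
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }
    permissions = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["ce:GetCostAndUsage"],  # illustrative subset, not the full list
            "Resource": "*",
        }],
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Read-only Cost Explorer role (illustrative sketch)",
        "Resources": {
            "CostWatchReadOnlyRole": {
                "Type": "AWS::IAM::Role",
                "Properties": {
                    "AssumeRolePolicyDocument": trust_policy,
                    "Policies": [{
                        "PolicyName": "CostExplorerReadOnly",
                        "PolicyDocument": permissions,
                    }],
                },
            },
        },
        # The stack output is the role ARN to paste into the dashboard.
        "Outputs": {
            "RoleArn": {"Value": {"Fn::GetAtt": ["CostWatchReadOnlyRole", "Arn"]}},
        },
    }


print(json.dumps(build_template("123456789012", "cw-your-user-id"), indent=2))
```

Because the template creates an IAM role, deploying it requires acknowledging the `CAPABILITY_IAM` capability.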
One-time setup. Two minutes.
When this is overkill
If you're running a production service with hundreds of thousands of dollars per month in AWS spend, use AWS Cost Anomaly Detection. It's better for that problem.
CostWatch is for the case where you want a simple, plain-English email that tells you what changed yesterday. No console navigation required, no threshold tuning, no machine learning bootstrapping period.
Free tier: 1 account, weekly Monday digest. Solo ($5/mo): daily anomaly emails. Team ($12/mo): 5 accounts, daily emails.
The self-hosted version is a SAM deploy to your own account — same code, no monthly fee, you just run the Lambda yourself.
The r6g.xlarge incident cost me $212. CostWatch would have caught it on day two.