Sixty-five lines ahead of git, in production
I went to deploy a fix to a backend Lambda and stopped halfway
through because the file I was about to overwrite was 65 lines
ahead of the repo. Hot-patches I had applied weeks earlier had
never been committed back. A normal serverless deploy would
have rolled them back.
What was happening
The deployed cronCheckStatus Lambda had cron_check_status.php
with code that wasn't in the repo's main branch:
- a HOME_TIED_SOURCES constant for filtering home-tied passive sources when the user is away
- getUserHomeStatus() integration to suppress those alerts
- a 24-hour per-source-set dedupe lock to stop hourly Sensor Offline retriggers
All of that had been deployed by patching the existing Lambda
zip in place — extract the zip, modify the file, re-zip, push
via aws lambda update-function-code. Quick, surgical, and a
landmine for future-me because nothing in source control knows
about it.
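For reference, the in-place patch described above looks roughly like the sketch below. The helper names (fetch_zip, patch_zip, push_zip) are made up for illustration and are not from the original incident; the AWS CLI calls are the standard get-function and update-function-code commands, and the sketch assumes curl, zip, and unzip are installed.

```shell
#!/usr/bin/env bash
# Sketch of an in-place Lambda hot-patch. Function names are hypothetical.

# Download the currently deployed bundle (needs AWS CLI credentials).
# Usage: fetch_zip FUNCTION_NAME OUT_ZIP
fetch_zip() {
  local url
  url=$(aws lambda get-function --function-name "$1" \
        --query 'Code.Location' --output text) || return 1
  curl -fsS -o "$2" "$url"
}

# Replace one file inside an existing bundle zip with a locally fixed copy.
# Usage: patch_zip ZIP PATH_IN_ZIP FIXED_FILE
patch_zip() {
  local zip_path inner="$2" fixed="$3" stage rc
  # Make the zip path absolute, because zip runs from a staging directory.
  zip_path="$(cd "$(dirname "$1")" && pwd)/$(basename "$1")" || return 1
  stage=$(mktemp -d) || return 1
  # Stage the fixed file under the same relative path it has in the bundle;
  # zip then adds or replaces that entry and leaves the others untouched.
  mkdir -p "$stage/$(dirname "$inner")" &&
    cp "$fixed" "$stage/$inner" &&
    (cd "$stage" && zip -q "$zip_path" "$inner")
  rc=$?
  rm -rf "$stage"
  return $rc
}

# Push the patched bundle to the live function (needs AWS CLI credentials).
# Usage: push_zip FUNCTION_NAME ZIP
push_zip() {
  aws lambda update-function-code \
    --function-name "$1" \
    --zip-file "fileb://$2"
}
```

Chained together (fetch_zip, then patch_zip, then push_zip), the whole thing takes a minute or two. Nothing in that chain touches git, which is exactly how the drift accumulates.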
What I found
Two other files, api.php and api_extensions.php, were fine: those had
gone out through normal serverless deploy cycles, so the repo
and the deployed bundle matched. The drift was only on
cron_check_status.php, which had been hot-patched twice in
quick succession during incidents.
The fix
Reverse the drift before the next deploy:
# 1. Download the live Lambda's zip
aws lambda get-function \
  --function-name checkonmine-prod-cronCheckStatus \
  --query 'Code.Location' --output text \
  | xargs curl -s -o /tmp/live.zip
# 2. Extract the file that drifted
unzip -p /tmp/live.zip public/cron_check_status.php \
  > /tmp/live-cron_check_status.php
# 3. Diff against repo
diff -u public/cron_check_status.php /tmp/live-cron_check_status.php
# 4. Copy the live version into the repo
cp /tmp/live-cron_check_status.php public/cron_check_status.php
# 5. Commit with a "sync: recover hot-patches from deployed Lambda"
# message that explicitly names which files and which fixes
git add public/cron_check_status.php
git commit -m "sync: recover hot-patches from deployed Lambda"
# 6. NOW you can deploy new changes
serverless deploy --stage prod
Post-sync I also ran md5 checksums across the deployed Lambdas
to confirm zero drift on api.php,
cron_check_status.php, and dynamodb_service.php:
# Repo checksums, for comparison with the deployed copies below
md5sum public/api.php public/cron_check_status.php public/dynamodb_service.php
for fn in checkonmine-prod-{api,cronCheckStatus,cronRiskMonitor}; do
  aws lambda get-function --function-name "$fn" \
    --query 'Code.Location' --output text \
    | xargs curl -s -o "/tmp/$fn.zip"
  for f in public/api.php public/cron_check_status.php public/dynamodb_service.php; do
    unzip -p "/tmp/$fn.zip" "$f" | md5sum
  done
done
All identical, and all matching the repo.
What I'd do differently
The honest fix is "don't hot-patch production." But in practice, during an incident, copying a fixed file into the live zip and pushing is sometimes the right move — the incident is the priority, the commit can wait an hour.
The wrong move is letting that hour become a month. Two changes make this less painful:
- A scheduled job (weekly is fine) that diffs each deployed Lambda's source against the corresponding git ref and pages me if they differ. Same script as above, just looped.
- A team rule that hot-patches get a git commit within the same ticket. Even if the commit is "TODO: this is on prod, need to reconcile before next deploy," at least the drift is visible.
I've added the first one to my homelab cron. The second one I just keep promising myself.
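A minimal sketch of what that scheduled drift check could look like follows. It is not the actual cron script: the function names (drifted_files, fetch_live_bundle, check_function) are hypothetical, and it assumes the repo directory passed in is already checked out at the git ref that was last deployed.

```shell
#!/usr/bin/env bash
# Sketch of a drift check between a deployed Lambda bundle and a repo checkout.
# All function names here are hypothetical.

# Print the path of every listed file whose content differs between two trees.
# Usage: drifted_files LIVE_DIR REPO_DIR FILE...
drifted_files() {
  local live_dir="$1" repo_dir="$2" f
  shift 2
  for f in "$@"; do
    # cmp -s is silent; it exits non-zero if the files differ or one is missing.
    if ! cmp -s "$live_dir/$f" "$repo_dir/$f"; then
      printf '%s\n' "$f"
    fi
  done
}

# Download and unpack the deployed bundle (needs AWS CLI, curl, unzip).
# Usage: fetch_live_bundle FUNCTION_NAME DEST_DIR
fetch_live_bundle() {
  local fn="$1" dest="$2" url
  url=$(aws lambda get-function --function-name "$fn" \
        --query 'Code.Location' --output text) || return 1
  curl -fsS -o "$dest/bundle.zip" "$url" || return 1
  unzip -q -o "$dest/bundle.zip" -d "$dest/src"
}

# Check one function. Returns 0 when clean, 1 on drift (paths are printed),
# 2 when the bundle could not be fetched.
# Usage: check_function FUNCTION_NAME REPO_DIR FILE...
check_function() {
  local fn="$1" repo_dir="$2" tmp drift
  shift 2
  tmp=$(mktemp -d) || return 2
  fetch_live_bundle "$fn" "$tmp" || { rm -rf "$tmp"; return 2; }
  drift=$(drifted_files "$tmp/src" "$repo_dir" "$@")
  rm -rf "$tmp"
  [ -z "$drift" ] && return 0
  printf 'DRIFT in %s:\n%s\n' "$fn" "$drift"
  return 1
}
```

From cron, a call such as `check_function checkonmine-prod-cronCheckStatus /path/to/repo public/cron_check_status.php` returns non-zero on drift, so it can be chained with `||` to whatever paging hook is available. Keeping the comparison (drifted_files) separate from the download means the comparison can be tested locally without AWS credentials.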