Why the sleep-window learner kept collapsing to the wrong four hours

April 24, 2026 · aws, lambda, anomaly-detection, debugging

The v1 sleep-window learner for a side-project safety app produced a sleep window of 01:00–05:00 for me. My actual sleep is more like 23:00–07:00. The pre-alert cron was firing at 05:00 and 06:05 and I was tired of pulling my phone out from under the pillow to silence it.

What was happening

v1 scored each hour of the day (0..23) using

rawScore[h] = 0.4 * activityScore[h] + 0.6 * gapScore[h]

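then searched windows of duration 4..12 hours and picked the window with the lowest mean score. The lowest mean over a 4-hour window almost always beats the lowest mean over an 8-hour window, because the 4-hour window can fit entirely inside the deepest valley while the 8-hour window has to include the shoulders. Put another way, an 8-hour window is two adjacent 4-hour windows and its mean is the average of their two means, so it can never score lower than the better of its halves.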

So the learner collapsed to the narrowest possible quiet stretch. For me, that was 01:00–05:00, the dead-quiet middle of my real sleep, and the alerting threshold treated 05:00 to 07:00 as awake hours.
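
The collapse is easy to reproduce on made-up data. The sketch below is not the app's code: the hourly scores are synthetic (lower means quieter), and `meanWindow` is a hypothetical helper written for this example. It runs a v1-style search, where any duration from 4 to 12 hours is allowed and the lowest mean wins.

```php
<?php
// Synthetic hourly scores, lower = quieter. Hypothetical data, not real logs.
// Real sleep is 23:00-07:00; the deepest valley is 01:00-05:00.
$rawScore = array_fill(0, 24, 0.80);                     // awake hours
foreach ([23, 0, 5, 6] as $h) { $rawScore[$h] = 0.20; }  // shoulders of sleep
foreach ([1, 2, 3, 4] as $h)  { $rawScore[$h] = 0.05; }  // deepest valley

// Mean score of the window starting at $start, wrapping past midnight.
function meanWindow(array $scores, int $start, int $duration): float
{
    $sum = 0.0;
    for ($i = 0; $i < $duration; $i++) {
        $sum += $scores[($start + $i) % 24];
    }
    return $sum / $duration;
}

// v1-style search: every duration from 4 to 12 hours, lowest mean wins.
$best = null;
foreach (range(4, 12) as $duration) {
    foreach (range(0, 23) as $start) {
        $mean = meanWindow($rawScore, $start, $duration);
        if ($best === null || $mean < $best['mean']) {
            $best = ['start' => $start, 'duration' => $duration, 'mean' => $mean];
        }
    }
}

$end = ($best['start'] + $best['duration']) % 24;
printf("v1 picks %02d:00-%02d:00 (%dh)\n", $best['start'], $end, $best['duration']);
printf("mean of the true 23:00-07:00 window: %.3f\n", meanWindow($rawScore, 23, 8));
```

On this data the search settles on 01:00–05:00 with a mean of 0.05, while the true 8-hour window scores 0.125 and never gets a look in.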

What I found

Two compounding issues:

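Scoring by mean rewards short windows. The duration sweep ran, but the metric defeated its purpose: a longer window almost never scored better than the short one inside it.
A single noisy hour in the middle of sleep (a 3am pee-trip ping) was enough to distort the boundary heavily, because nothing smoothed the per-hour scores.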

The fix

v2 changes both:

// 3-hour rolling mean of the raw score — one anomalous hour
// no longer distorts the boundary
for ($h = 0; $h < 24; $h++) {
    $sleepScore[$h] = (
        $rawScore[($h - 1 + 24) % 24] +
        $rawScore[$h] +
        $rawScore[($h + 1) % 24]
    ) / 3.0;
}

// Score windows by SUM, not mean. Bias toward 8h windows.
foreach (range(6, 10) as $duration) {
    foreach (range(0, 23) as $start) {
        $score = sumWindow($sleepScore, $start, $duration);
        $score *= 1 - abs($duration - 8) * 0.08;
        // remember best
    }
}
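
For a version that runs end to end, here is a rough sketch under stated assumptions. The hourly values are synthetic, `sumWindow` is my guess at a circular sum (the snippet above calls it without showing it), and the "remember best" step keeps the highest score. That last part assumes the score is oriented so that higher means more sleep-like, which is what the bias multiplier needs, since it shrinks the score of every non-8h duration (x0.84 for 6h and 10h, x0.92 for 7h and 9h).

```php
<?php
// Hypothetical "sleepiness" per hour, higher = more sleep-like (an assumption).
// Real sleep is 23:00-07:00, with one noisy hour at 03:00.
$rawScore = array_fill(0, 24, 0.20);                     // awake hours
foreach ([23, 0, 5, 6] as $h) { $rawScore[$h] = 0.80; }  // lighter sleep
foreach ([1, 2, 4] as $h)     { $rawScore[$h] = 0.95; }  // deep sleep
$rawScore[3] = 0.35;                                     // the 3am ping

// 3-hour circular rolling mean, same shape as the loop in the post.
$sleepScore = [];
for ($h = 0; $h < 24; $h++) {
    $sleepScore[$h] = (
        $rawScore[($h - 1 + 24) % 24] +
        $rawScore[$h] +
        $rawScore[($h + 1) % 24]
    ) / 3.0;
}

// Assumed helper: sum of $duration hourly scores from $start, wrapping at 24.
function sumWindow(array $scores, int $start, int $duration): float
{
    $sum = 0.0;
    for ($i = 0; $i < $duration; $i++) {
        $sum += $scores[($start + $i) % 24];
    }
    return $sum;
}

// Search 6..10h windows; the highest biased sum wins (assumed orientation).
$best = null;
foreach (range(6, 10) as $duration) {
    foreach (range(0, 23) as $start) {
        $score = sumWindow($sleepScore, $start, $duration);
        $score *= 1 - abs($duration - 8) * 0.08;
        if ($best === null || $score > $best['score']) {
            $best = ['start' => $start, 'duration' => $duration, 'score' => $score];
        }
    }
}

$end = ($best['start'] + $best['duration']) % 24;
printf("raw[3]=%.2f smoothed[3]=%.2f\n", $rawScore[3], $sleepScore[3]);
printf("v2 picks sleep_start=%d sleep_end=%d\n", $best['start'], $end);
```

On this data the 03:00 dip is pulled from 0.35 up to 0.75 by its neighbours, and the search lands on sleep_start=23, sleep_end=7. The nearest challengers are the 9-hour windows, which collect a larger raw sum but lose it to the 0.92 multiplier.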

Also added a grace ramp at wake time. The old code went binary ("in sleep" → 10h threshold, "out of sleep" → 2h threshold) at the exact hour sleep_end_hour. So at 07:01, a user who had been quiet since midnight (a perfectly normal 7-hour gap) was suddenly compared to a 2-hour threshold and pre-alerted. The fix is a 1-hour linear ramp from sleep threshold down to active threshold:

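// Hours since wake time, wrapped into [0, 24) so a window that crosses
// midnight (23 -> 7) is still treated as sleep at 23:30.
$hoursIntoMorning = fmod($nowHour - $sleepEndHour + 24.0, 24.0);
$sleepLength = ($sleepEndHour - $sleepStartHour + 24) % 24;
if ($hoursIntoMorning >= 24.0 - $sleepLength) {
    $threshold = $sleepThreshold; // still inside the sleep window
} elseif ($hoursIntoMorning < 1.0) {
    $t = $hoursIntoMorning; // 0..1 across the ramp hour
    $threshold = $sleepThreshold + $t * ($activeThreshold - $sleepThreshold);
} else {
    $threshold = $activeThreshold;
}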
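
To see the ramp as numbers, here is a small self-contained sketch. The function name `alertThreshold`, its parameter names and the `$rampHours` argument are illustrative, not the app's API. The 10h and 2h thresholds are the ones quoted above, and the midnight wraparound is my assumption about how a 23:00–07:00 window should be handled.

```php
<?php
// Allowed hours of silence for a given time of day.
// $nowHour may be fractional, e.g. 7.5 for 07:30. All names are illustrative.
function alertThreshold(
    float $nowHour,
    int $sleepStartHour,
    int $sleepEndHour,
    float $sleepThreshold = 10.0,
    float $activeThreshold = 2.0,
    float $rampHours = 1.0
): float {
    // Hours since wake time, wrapped into [0, 24) so 23:00-07:00 works.
    $sinceWake = fmod($nowHour - $sleepEndHour + 24.0, 24.0);
    $sleepLength = ($sleepEndHour - $sleepStartHour + 24) % 24;

    if ($sinceWake >= 24.0 - $sleepLength) {
        return $sleepThreshold;          // still inside the sleep window
    }
    if ($sinceWake < $rampHours) {
        $t = $sinceWake / $rampHours;    // 0..1 across the ramp
        return $sleepThreshold + $t * ($activeThreshold - $sleepThreshold);
    }
    return $activeThreshold;             // ordinary waking hours
}

foreach ([6.5, 7.0, 7.25, 7.5, 8.0, 23.5] as $hour) {
    printf("%5.2f -> %.1fh\n", $hour, alertThreshold($hour, 23, 7));
}
```

With a 23 → 7 window this gives 10h at 06:30 and 07:00, 8h at 07:15, 6h at 07:30, 2h at 08:00, and 10h again at 23:30. For the user who has been quiet since midnight, the gap first exceeds the ramped threshold at 07:20 rather than 07:00 (solve 7 + t = 10 - 8t for t = 1/3 of an hour).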

Validated by forcing a re-learn against my 90 days of activity. v2 produced sleep_start=23, sleep_end=7, which matched a quick Python simulation I'd written against the same data.

What I'd do differently

The v1 mistake was using the same metric for ranking that I used for filtering. Search over duration with a duration-insensitive metric and you get the shortest viable duration every time. If you want to compare windows of different lengths, you have to either fix the length and compare positions, or use a metric that scales with length (sum) plus an explicit prior for the length you want (my 1 - |duration-8| * 0.08 bias).

I also should have shipped the grace ramp the day I shipped the window. The binary cliff at wake time is obvious in retrospect, but it was not obvious to me at the time.