Push notifications only firing on the final assistant message

FILE 0x41·PUSH NOTIFICATIONS ONLY FIRING ON THE FINAL ASSISTANT MESSAG

April 21, 2026 · push, agents, claude-code

My homelab assistant runs as a Claude Code agent in the background. When it finishes a long task, it pushes a notification to my phone with the result. The notifications worked, but they only ever covered the last message of a multi- step turn. All the intermediate progress updates the agent streamed along the way were silently dropped.

What was happening

The push pipeline hooked into the Claude Code Stop event, which fires once per turn (when the agent decides it's done). The handler called extract_last_assistant_text() and pushed that. If the agent's turn looked like:

1. "I'll check the logs now…"
2. <tool: read logs>
3. "Found three errors, fixing the first…"
4. <tool: edit file>
5. "Done. Here's the summary."

…only message 5 ever became a push. Messages 1 and 3 — which are actually the most useful for "is this thing making progress?" — never reached the phone. The hook had no idea they'd happened.

What I found

Stop is the wrong event for live progress. It fires once per turn by design. Streaming progress updates are emitted by the agent during the turn, between tool calls, and they hit the backend's message store as they arrive.

Three options, in order of complexity:

Move the push logic into the message-insert path. Each time a new assistant message lands in storage, fire a push if the message is non-empty and the parent conversation is one the user has push enabled for. This is the right answer; it couples push notification to the canonical source of new assistant text.
Poll the messages table. A small watcher loop that queries for new assistant rows since the last seen id and pushes each one. Simpler but uses extra resources and adds latency.
Fire push per-message during streaming. Hook into the streaming layer directly. More invasive and harder to dedupe across reconnects.

Option 1 won. The push code lives in one place (the message store), it gets called by every code path that adds an assistant message, and the per-conversation push-enabled bit is just a column.

The fix

async def add_assistant_message(conv_id, text, *, final=False):
    msg_id = await _insert(conv_id, "assistant", text)
    await touch(conv_id)
    conv = await get_conversation(conv_id)
    if conv.push_enabled and text.strip():
        await push.notify(
            conv.user_id,
            title=conv.title or "Cass",
            body=text[:200],
            conv_id=conv_id,
            msg_id=msg_id,
            silent=not final,  # progress = silent push, final = banner
        )
    return msg_id

The silent= flag is the small refinement that makes the new behavior tolerable. Mid-turn updates land as silent pushes — the app updates its state in the background but the phone doesn't buzz. The final-of-turn message is a banner. So now I see live progress when I have the app open, and I still get exactly one buzz per finished task.

What I'd do differently

The Stop-event hook was the easiest thing to wire up in the beginning. It also encoded an assumption — "the only thing worth notifying about is the final answer" — that turned out to be wrong as soon as I started running multi-step tasks. The whole point of long-running agent work is that you want to know during it, not just after.

I'd add a contract test for this kind of pipeline next time: a fixture turn with three intermediate assistant messages and one final, and an assertion that I see four pushes (three silent, one banner). Without that, the regression to "only the last message" is impossible to catch until you notice you've missed three status updates in a row.