ONLINE Alert (Node Returned to Operation)
Why do you receive this message?
The ONLINE alert is sent when the system detects that the node returned to operation after a failure period. You receive it when:
- The node responds to HTTP queries again
- Status changes from OFFLINE to ONLINE
- The system successfully retrieved metrics from the node
✅ Good news!
The node is working again. Check if it's synchronized and if everything works normally.
What does the message contain?
- Recovery image - Sent as the first message
- Status - ✅ ONLINE – Responding to requests
- Detection time - Exact timestamp of return to operation
- Failure duration (if available):
  - Duration - time from last_down_at to now
  - Last known OFFLINE event - when the node stopped working
- Summary - Information about node return to operation
- Recommendations - Check synchronization and stability
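The failure duration listed above is the time elapsed between last_down_at and the moment the node is detected as ONLINE again. A minimal sketch of that calculation follows, assuming last_down_at is stored as a timezone-aware datetime (or None when no OFFLINE event is known). The function name format_failure_duration and the output format are illustrative and not taken from the actual monitor code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def format_failure_duration(
    last_down_at: Optional[datetime],
    now: Optional[datetime] = None,
) -> Optional[str]:
    """Return a human-readable failure duration, or None if it is unknown.

    Hypothetical helper: the real alert may format the duration differently.
    """
    if last_down_at is None:
        # No recorded OFFLINE event, so the duration is "not available".
        return None
    if now is None:
        now = datetime.now(timezone.utc)
    # Clamp to zero in case of clock skew between the two timestamps.
    seconds = max(0, int((now - last_down_at).total_seconds()))
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}h {minutes}m {secs}s"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"


down = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
up = down + timedelta(hours=1, minutes=2, seconds=3)
print(format_failure_duration(down, up))
```

Returning None when last_down_at is missing matches the "(if available)" wording: the alert can simply omit the duration lines in that case.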
How should you react?
- Check synchronization:
  - Open the monitoring panel
  - Check if the node is synchronized (block difference ≤ 5)
- Check logs (if the failure was long):
  journalctl -u redbelly.service --since "1 hour ago"
- Check stability:
  - Is the node staying ONLINE?
  - Are there frequent ONLINE/OFFLINE switches (flapping)?
- If the node frequently switches:
  - Check system resources (CPU, RAM, disk)
  - Check logs for errors
  - Check network connection
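The synchronization and stability checks above can also be scripted. The sketch below assumes you already have the node's block height, a reference network block height, and a list of recent ONLINE/OFFLINE observations. All function names and the flapping threshold are hypothetical. Only the block difference limit of 5 comes from this page.

```python
from typing import Sequence


def is_synchronized(node_block: int, network_block: int, max_lag: int = 5) -> bool:
    """True when the block difference is within the allowed lag (5 on this page)."""
    return abs(network_block - node_block) <= max_lag


def count_status_changes(history: Sequence[bool]) -> int:
    """Count ONLINE/OFFLINE switches in a chronological list of statuses."""
    return sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)


def is_flapping(history: Sequence[bool], max_changes: int = 3) -> bool:
    """Flag flapping when the status switched more than max_changes times.

    max_changes is an assumed threshold; tune it to your monitoring window.
    """
    return count_status_changes(history) > max_changes


recent = [True, False, True, False, True]  # True = ONLINE, False = OFFLINE
print(is_synchronized(node_block=1000, network_block=1004))
print(count_status_changes(recent), is_flapping(recent))
```

If is_flapping reports True, continue with the resource, log, and network checks listed above.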
Sending Logic
| Element | Details |
|---|---|
| Trigger | Status change from OFFLINE → ONLINE in the background_monitor() function |
| Check frequency | Every 15 seconds (each monitoring cycle) |
| Conditions | • previous_online_status != is_online • is_online == True • Telegram alerts enabled • Uses last_down_at to calculate failure time |
| Format | First image, then text message in Markdown |
| Duplicate prevention | Alert sent only on status change |
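The conditions in the table can be read as a single boolean check evaluated once per monitoring cycle. The following is a minimal sketch under that reading, not the actual background_monitor() implementation. The names should_send_online_alert, telegram_enabled, and simulate are hypothetical, while previous_online_status and is_online are taken from the table.

```python
from typing import List, Sequence


def should_send_online_alert(
    previous_online_status: bool,
    is_online: bool,
    telegram_enabled: bool,
) -> bool:
    """Apply the table's conditions: status changed, now ONLINE, alerts enabled."""
    status_changed = previous_online_status != is_online
    return status_changed and is_online and telegram_enabled


def simulate(statuses: Sequence[bool], telegram_enabled: bool = True) -> List[int]:
    """Return the cycle indexes at which an ONLINE alert would be sent.

    statuses[0] is the initial state; each later entry is one monitoring cycle.
    """
    alerts = []
    previous = statuses[0]
    for cycle, is_online in enumerate(statuses[1:], start=1):
        if should_send_online_alert(previous, is_online, telegram_enabled):
            alerts.append(cycle)
        previous = is_online  # remember the state for the next cycle
    return alerts


# ONLINE, two OFFLINE cycles, recovery, stays ONLINE, OFFLINE, recovery
print(simulate([True, False, False, True, True, False, True]))  # [3, 6]
```

Because the check compares the current status with the previous one, a node that stays ONLINE across consecutive cycles produces no further alerts, which is the duplicate prevention described in the last table row.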