
ONLINE Alert (Node Returned to Operation)

Why do you receive this message?

The ONLINE alert is sent when the system detects that the node returned to operation after a failure period. You receive it when:

  • The node responds to HTTP queries again
  • Status changes from OFFLINE to ONLINE
  • The system successfully retrieves metrics from the node

✅ Good news!

The node is working again. Check if it's synchronized and if everything works normally.

What does the message contain?

  • Recovery image - Sent as the first message
  • Status - ✅ ONLINE – Responding to requests
  • Detection time - Exact timestamp of return to operation
  • Failure duration (if available):
    • Duration - time from last_down_at to now
    • Last known OFFLINE event - when the node stopped working
  • Summary - Information about node return to operation
  • Recommendations - Check synchronization and stability
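
The failure duration described above (time from last_down_at to now) could be computed along these lines. This is a minimal sketch, not the monitor's actual code: the function name format_failure_duration and the "Xh Ym Zs" output format are assumptions made for illustration; only last_down_at comes from this page.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def format_failure_duration(last_down_at: Optional[datetime],
                            now: datetime) -> Optional[str]:
    """Return the outage length as 'Xh Ym Zs', or None if last_down_at is unknown.

    Hypothetical helper: the real alert may format the duration differently.
    """
    if last_down_at is None:
        # Matches the "(if available)" note: no duration without a known start.
        return None
    # Clamp to zero so clock skew never produces a negative duration.
    total_seconds = max(0, int((now - last_down_at).total_seconds()))
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}h {minutes}m {seconds}s"


if __name__ == "__main__":
    went_down = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    came_back = went_down + timedelta(hours=1, minutes=5, seconds=30)
    print(format_failure_duration(went_down, came_back))
```

Returning None when last_down_at is missing lets the caller omit the duration lines from the message instead of printing a misleading value.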

How should you react?

  1. Check synchronization:

    • Open the monitoring panel
    • Check if the node is synchronized (block difference ≤ 5)
  2. Check logs (if the failure was long):

    journalctl -u redbelly.service --since "1 hour ago"
  3. Check stability:

    • Is the node staying ONLINE?
    • Are there frequent ONLINE/OFFLINE switches (flapping)?
  4. If the node switches state frequently:

    • Check system resources (CPU, RAM, disk)
    • Check logs for errors
    • Check network connection
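
To make "frequent ONLINE/OFFLINE switches" concrete, flapping can be detected by counting status changes inside a sliding time window. The sketch below is illustrative only: the FlapDetector class, the 10-minute window, and the threshold of 4 transitions are assumptions, not values taken from the monitoring system.

```python
from collections import deque
from typing import Deque, Optional


class FlapDetector:
    """Flags flapping when too many status changes occur within a time window.

    Hypothetical helper; window and threshold are illustrative defaults.
    """

    def __init__(self, window_seconds: float = 600.0, max_transitions: int = 4):
        self.window_seconds = window_seconds
        self.max_transitions = max_transitions
        self._transitions: Deque[float] = deque()  # timestamps of status changes
        self._last_status: Optional[bool] = None

    def observe(self, timestamp: float, is_online: bool) -> bool:
        """Record one monitoring result; return True if the node is flapping."""
        if self._last_status is not None and is_online != self._last_status:
            self._transitions.append(timestamp)
        self._last_status = is_online
        # Drop transitions that have fallen out of the sliding window.
        while self._transitions and timestamp - self._transitions[0] > self.window_seconds:
            self._transitions.popleft()
        return len(self._transitions) >= self.max_transitions


if __name__ == "__main__":
    detector = FlapDetector()
    # One observation every 15 seconds, alternating ONLINE/OFFLINE.
    for cycle, status in enumerate([True, False, True, False, True]):
        print(cycle * 15, status, detector.observe(cycle * 15, status))
```

A node that trips this check is a candidate for the resource, log, and network checks listed in step 4.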

Sending Logic

| Element | Details |
| --- | --- |
| Trigger | Status change from OFFLINE → ONLINE in the background_monitor() function |
| Check frequency | Every 15 seconds (each monitoring cycle) |
| Conditions | • previous_online_status != is_online • is_online == True • Telegram alerts enabled • Uses last_down_at to calculate failure time |
| Format | Image first, then a text message in Markdown |
| Duplicate prevention | Alert sent only on status change |
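
The conditions above can be read as a single boolean check evaluated once per monitoring cycle. The following is a rough sketch under that reading, not the actual body of background_monitor(): the function names should_send_online_alert and alert_cycles and the alerts_enabled parameter are hypothetical, while previous_online_status and is_online are the names used on this page.

```python
from typing import List, Sequence


def should_send_online_alert(previous_online_status: bool,
                             is_online: bool,
                             alerts_enabled: bool) -> bool:
    """Return True only on an OFFLINE -> ONLINE transition with alerts enabled."""
    status_changed = previous_online_status != is_online
    return status_changed and is_online and alerts_enabled


def alert_cycles(statuses: Sequence[bool], alerts_enabled: bool = True) -> List[int]:
    """Return the cycle indices at which an ONLINE alert would be sent.

    Hypothetical simulation helper: statuses[i] is the node status seen in
    cycle i; the first entry only seeds the previous status.
    """
    fired = []
    previous = statuses[0]
    for cycle, current in enumerate(statuses[1:], start=1):
        if should_send_online_alert(previous, current, alerts_enabled):
            fired.append(cycle)
        previous = current  # remember the status for the next cycle
    return fired


if __name__ == "__main__":
    history = [True, False, False, True, True, False, True]
    print(alert_cycles(history))
```

Because the check requires a status change, a node that stays ONLINE across consecutive cycles produces no further alerts, which is the duplicate prevention described in the table.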