High Latency Alert
Why do you receive this message?
The high latency alert is sent when the system detects that the node is responding too slowly. You receive it when all of the following are true:
- P95 latency (the 95th percentile of response time) over the last 60 minutes reaches or exceeds a threshold:
  - WARNING: ≥ 1000 ms (1 second)
  - CRITICAL: ≥ 2000 ms (2 seconds)
- The node is ONLINE but responding slowly
- There are enough samples from the last 60 minutes to calculate P95
⚠️ Warning! High latency may indicate performance problems that can lead to synchronization issues or node failure.
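The trigger conditions can be illustrated with a minimal sketch. It is not the monitoring system's actual code: the nearest-rank percentile method, the `MIN_SAMPLES` value, and the function names `p95` and `classify` are assumptions made for illustration. Only the two thresholds come from this page.

```python
from typing import Optional, Sequence

WARNING_MS = 1000.0   # WARNING threshold stated on this page
CRITICAL_MS = 2000.0  # CRITICAL threshold stated on this page
MIN_SAMPLES = 20      # assumption: the page does not state the minimum sample count


def p95(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile (assumed method; the real one may interpolate)."""
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n, giving a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def classify(samples_ms: Sequence[float], node_online: bool) -> Optional[str]:
    """Return "CRITICAL", "WARNING", or None when no latency alert applies."""
    # The latency alert only applies to an ONLINE node with enough data.
    if not node_online or len(samples_ms) < MIN_SAMPLES:
        return None
    value = p95(samples_ms)
    if value >= CRITICAL_MS:
        return "CRITICAL"
    if value >= WARNING_MS:
        return "WARNING"
    return None
```

With 100 samples from the last 60 minutes, of which 10 are 2500 ms and the rest 100 ms, the nearest-rank P95 is 2500 ms, so `classify` returns `"CRITICAL"`.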
What does the message contain?
- Alert image (IMG_0500.png) - Sent as the first message
- Severity level:
  - ⚠️ WARNING - Action recommended (P95 ≥ 1000ms)
  - 🚨 CRITICAL - Action required (P95 ≥ 2000ms)
- Metrics details:
  - P95 Latency - value in milliseconds
  - Threshold - the threshold that was exceeded
  - Window - last 60 minutes
  - Normal Range - 50-300 ms (expected range)
- Possible causes:
  - High CPU load
  - Disk I/O wait (slow disk)
  - Network overload
  - RPC overload (too many queries)
  - Memory pressure (lack of memory)
- Recommended actions - Specific diagnostic commands
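As a rough illustration of how these fields could be assembled into the text part of the alert, here is a small sketch. The `LatencyAlert` class, the `format_alert` function, and the exact line layout are hypothetical; the real message wording may differ. The field names, icons, and causes are taken from the list above, and the recommended actions are left out for brevity.

```python
from dataclasses import dataclass
from typing import Tuple

ICONS = {"WARNING": "⚠️", "CRITICAL": "🚨"}
ACTIONS = {"WARNING": "Action recommended", "CRITICAL": "Action required"}
CAUSES = [
    "High CPU load",
    "Disk I/O wait (slow disk)",
    "Network overload",
    "RPC overload (too many queries)",
    "Memory pressure (lack of memory)",
]


@dataclass
class LatencyAlert:
    """Hypothetical container for the fields listed above."""
    severity: str                       # "WARNING" or "CRITICAL"
    p95_ms: float                       # measured P95 latency
    threshold_ms: float                 # threshold that was exceeded
    window_minutes: int = 60
    normal_range_ms: Tuple[int, int] = (50, 300)


def format_alert(alert: LatencyAlert) -> str:
    """Build the text body; the layout is an assumption, not the real template."""
    low, high = alert.normal_range_ms
    lines = [
        f"{ICONS[alert.severity]} *{alert.severity}* - {ACTIONS[alert.severity]}",
        f"P95 Latency: {alert.p95_ms:.0f} ms",
        f"Threshold: {alert.threshold_ms:.0f} ms",
        f"Window: last {alert.window_minutes} minutes",
        f"Normal Range: {low}-{high} ms",
        "Possible causes:",
    ]
    lines += [f"- {cause}" for cause in CAUSES]
    return "\n".join(lines)
```

For example, `format_alert(LatencyAlert("CRITICAL", 2350.4, 2000))` produces a message whose first line starts with 🚨 and whose second line reads `P95 Latency: 2350 ms`.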
How should you react?
- Check CPU load:

      top
      # or
      htop

  Look for processes consuming a lot of CPU

- Check disk I/O:

      iostat -x 1 10

  Check if there are write/read problems

- Check network:

      ping google.com
      mtr google.com

  Check network latency

- Check system logs:

      journalctl -u redbelly.service --since today

- Check application logs:

      tail -f /var/log/redbelly/rbn_logs/*.log

- Check memory:

      free -h

  Check if there are memory problems (swap usage)

- If the problem persists:
  - Consider restarting the node
  - Check if there are network problems
  - Contact the Redbelly community on Discord
Sending Logic
| Element | Details |
|---|---|
| Trigger | check_latency_alerts() function called every 5 minutes |
| Check frequency | Every 5 minutes (20 monitoring cycles × 15 seconds) |
| Time window | Last 60 minutes of data |
| Thresholds | • WARNING: P95 ≥ 1000ms • CRITICAL: P95 ≥ 2000ms |
| Cooldown | • 30 minutes between alerts (to avoid spam) • Bypassed if latency increases by >50% |
| Conditions | • Enough data (P95 calculable) • P95 exceeds threshold • Outside cooldown or significant degradation |
| Format | First image (IMG_0500.png), then text message in Markdown |
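The cooldown rule in the table can be sketched as follows. This is an illustration under stated assumptions, not the body of `check_latency_alerts()`: the function name `should_send` and its parameters are hypothetical, and the sketch assumes the ">50%" increase is measured against the P95 value reported in the previous alert, which the table does not specify.

```python
from typing import Optional

CYCLE_SECONDS = 15                 # length of one monitoring cycle (from the table)
CYCLES_PER_CHECK = 20              # cycles between latency checks (from the table)
CHECK_INTERVAL_SECONDS = CYCLE_SECONDS * CYCLES_PER_CHECK  # 300 s = 5 minutes
COOLDOWN_SECONDS = 30 * 60         # 30 minutes between alerts
DEGRADATION_FACTOR = 1.5           # ">50%" increase; the baseline is an assumption


def should_send(
    p95_ms: float,
    threshold_ms: float,
    now: float,
    last_sent_at: Optional[float],
    last_sent_p95_ms: Optional[float],
) -> bool:
    """Decide whether to send an alert, given the time and P95 of the previous one."""
    if p95_ms < threshold_ms:
        return False               # threshold not exceeded
    if last_sent_at is None or last_sent_p95_ms is None:
        return True                # no previous alert, so no cooldown applies
    if now - last_sent_at >= COOLDOWN_SECONDS:
        return True                # cooldown has expired
    # Inside the cooldown: send only on significant degradation
    # (assumed to mean more than 50% above the previously reported P95).
    return p95_ms > last_sent_p95_ms * DEGRADATION_FACTOR
```

For instance, ten minutes after an alert that reported 1200 ms, a new P95 of 1300 ms is suppressed, while a P95 of 1900 ms (more than 1.5 × 1200 ms) is sent despite the cooldown.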