check_and_reboot
An automated network card recovery tool written in Julia. It monitors network connectivity and reboots the system to recover a hung or stuck network card.
Overview
This project consists of two monitoring scripts that detect network failures and automatically reboot the system to recover the network card:
- check_router_reboot.jl - Monitors router connectivity via ICMP ping to detect when the network stack becomes unresponsive
- check_yiem_website_reboot.jl - Monitors website availability via HTTP requests as an additional network health indicator
Both scripts run continuously in the background, performing periodic health checks and automatically rebooting the system if consecutive failures exceed a configured threshold. The reboot serves to reset the network card and restore network connectivity.
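The consecutive-failure rule described above can be sketched as a small pure function. This is a minimal illustration, not code from the scripts: the name record_check is hypothetical, and the comparison `fails >= fails_to_reboot` is an assumption based on the "1/3" counter shown in the log output.

```julia
# Hypothetical helper, not taken from the scripts: updates the failure counter
# after one health check and reports whether a reboot is due.
function record_check(consecutive_fails::Int, reachable::Bool; fails_to_reboot::Int = 3)
    # A successful check resets the counter; a failed one increments it.
    fails = reachable ? 0 : consecutive_fails + 1
    # Assumption: the reboot fires once the counter reaches the threshold.
    return (consecutive_fails = fails, should_reboot = fails >= fails_to_reboot)
end

# Three failed checks in a row reach the default threshold of 3.
state = record_check(0, false)
state = record_check(state.consecutive_fails, false)
state = record_check(state.consecutive_fails, false)
println(state)
```

Keeping the decision separate from the ping or HTTP call makes the threshold logic easy to test without touching the network.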
Features
- Continuous Monitoring: Runs indefinitely with configurable check intervals
- Multi-attempt Verification: Retries failed checks with backoff before declaring failure
- State Persistence: Maintains state in a JSON file across restarts
- Cooldown Period: Prevents rapid repeated reboots after a reboot event
- Cross-Platform Support: Works on Linux, macOS, and Windows with appropriate reboot commands
- Broadcast Notifications: Sends system-wide notifications on events (via wall, logger, or platform equivalents)
- Log Rotation: Automatically limits log file to last 100 entries to prevent unbounded growth
- Dry Run Mode: Test configuration without triggering actual reboots
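As an illustration of the multi-attempt verification feature, the sketch below retries a check a fixed number of times with a pause between tries. The function name check_with_retries and its keyword arguments are hypothetical; the actual scripts may structure their retries differently.

```julia
# Hypothetical helper: runs `check` up to `attempts` times, sleeping
# `backoff_secs` between tries, and returns true on the first success.
function check_with_retries(check::Function; attempts::Int = 3, backoff_secs::Real = 60)
    for attempt in 1:attempts
        ok = try
            check()
        catch
            false   # treat an exception (for example a timeout) as a failed attempt
        end
        ok === true && return true
        # Only wait when another attempt is still to come.
        attempt < attempts && sleep(backoff_secs)
    end
    return false
end

# Usage: a stand-in check that fails once, then succeeds.
calls = Ref(0)
flaky_check() = (calls[] += 1) >= 2
println(check_with_retries(flaky_check; attempts = 3, backoff_secs = 0))
```

Passing the check as a function lets the same retry loop serve both the ping monitor and the HTTP monitor.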
Configuration
Router Monitor Configuration (check_router_reboot.jl)
const ROUTER_IP = "192.168.88.1" # Target router IP address
const TIMEOUT_SECS = 30 # Request timeout in seconds
const ATTEMPTS_PER_CHECK = 1 # Number of ping attempts per check
const BACKOFF_BETWEEN_ATTEMPTS = 1 # Seconds between retry attempts
const FAILS_TO_REBOOT = 3 # Consecutive failures before reboot
const COOLDOWN_AFTER_REBOOT_SECS = 600 # Minimum seconds between reboots
const DRY_RUN = false # true = log only; false = perform actual reboots
const CHECK_INTERVAL_SECS = 60 # Check interval in seconds
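To show how ROUTER_IP and TIMEOUT_SECS might feed a single ping probe, here is a rough sketch. The helper names ping_command and router_reachable are hypothetical, and the flag meanings listed in the comments are assumptions about the stock ping tool on each platform; verify them against the local ping before relying on this.

```julia
# Hypothetical helper: builds a single-probe ping command for the current OS.
# Flag meanings are assumptions about the stock ping tools:
#   Linux (iputils): -c count, -W reply timeout in seconds
#   macOS:           -c count, -W reply timeout in milliseconds
#   Windows:         -n count, -w reply timeout in milliseconds
function ping_command(ip::AbstractString, timeout_secs::Integer)
    if Sys.iswindows()
        return `ping -n 1 -w $(timeout_secs * 1000) $ip`
    elseif Sys.isapple()
        return `ping -c 1 -W $(timeout_secs * 1000) $ip`
    else
        return `ping -c 1 -W $timeout_secs $ip`
    end
end

# Returns true when ping exits with status 0.
# A ping binary that cannot be started counts as a failed check.
function router_reachable(ip::AbstractString; timeout_secs::Integer = 30)
    cmd = pipeline(ping_command(ip, timeout_secs); stdout = devnull, stderr = devnull)
    try
        return success(cmd)
    catch
        return false
    end
end

println(ping_command("192.168.88.1", 30))
```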
Website Monitor Configuration (check_yiem_website_reboot.jl)
const URL = "https://www.yiem.cc" # Target URL to monitor
const TIMEOUT_SECS = 30 # Request timeout in seconds
const ATTEMPTS_PER_CHECK = 3 # Number of HTTP attempts per check
const BACKOFF_BETWEEN_ATTEMPTS = 60 # Seconds between retry attempts
const FAILS_TO_REBOOT = 3 # Consecutive failures before reboot
const COOLDOWN_AFTER_REBOOT_SECS = 600 # Minimum seconds between reboots
const DRY_RUN = false # true = log only; false = perform actual reboots
const CHECK_INTERVAL_SECS = 60 # Check interval in seconds
Usage
Running Manually
julia check_router_reboot.jl
julia check_yiem_website_reboot.jl
Running at System Boot (Crontab)
Add the following to root's crontab (sudo crontab -e):
@reboot /usr/local/bin/juliar /path/to/check_router_reboot.jl >> /var/log/check_reboot.log 2>&1
@reboot /usr/local/bin/juliar /path/to/check_yiem_website_reboot.jl >> /var/log/check_reboot.log 2>&1
Note: The crontab entries call juliar, a symlink to the Julia binary for root (separate from the user's Julia installation).
Required Dependencies
Install the required Julia packages:
julia -e 'using Pkg; Pkg.add(["HTTP", "Dates", "JSON"])'
Files
| File | Description |
|---|---|
| check_router_reboot.jl | Router ping monitor with auto-reboot |
| check_yiem_website_reboot.jl | Website HTTP monitor with auto-reboot |
| check_and_reboot_state.json | State persistence file (generated) |
| check_router_reboot_log.txt | Router monitor log file |
| check_website_reboot_log.txt | Website monitor log file |
State File
The state is stored in check_and_reboot_state.json with the following structure:
{
"consecutive_fails": 0,
"last_reboot_datetime": "2026-03-11T10:00:00"
}
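The last_reboot_datetime value is an ISO 8601 timestamp, which the Dates standard library parses directly. The sketch below shows one way the cooldown could be derived from it. The function cooldown_remaining is hypothetical, and treating `nothing` as "no reboot recorded yet" is an assumption about how the scripts represent a fresh state.

```julia
using Dates

# Hypothetical helper: seconds of cooldown left, given the stored timestamp string.
# `nothing` stands for "no reboot recorded yet" (an assumption about the state file).
function cooldown_remaining(last_reboot::Union{AbstractString,Nothing}, now_dt::DateTime;
                            cooldown_secs::Integer = 600)
    last_reboot === nothing && return 0
    # Subtracting two DateTime values yields milliseconds; convert to whole seconds.
    elapsed_secs = Dates.value(now_dt - DateTime(last_reboot)) ÷ 1000
    return max(0, cooldown_secs - elapsed_secs)
end

# 309 seconds after the recorded reboot, 291 seconds of a 600-second cooldown remain.
println(cooldown_remaining("2026-03-11T10:00:00", DateTime(2026, 3, 11, 10, 5, 9)))
```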
Log Output
Logs are written to both the console and the respective log files with timestamps:
[2026-03-11T10:05:09.123] Starting check loop. Checking router 192.168.88.1 every 60 seconds.
[2026-03-11T10:05:09.456] 192.168.88.1 is reachable; resetting consecutive failure counter.
[2026-03-11T10:06:09.789] 192.168.88.1 is unreachable (last result: no response). Consecutive fails: 1/3.
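A log line in this format, combined with the 100-entry limit listed under Features, could be produced roughly as follows. Both helper names are hypothetical, and rewriting the whole file on each call is a simplifying assumption; the scripts may trim their logs differently.

```julia
using Dates

# Hypothetical helper: formats one log line in the bracketed timestamp style.
format_log_line(msg::AbstractString, ts::DateTime = now()) =
    "[" * Dates.format(ts, dateformat"yyyy-mm-dd\THH:MM:SS.sss") * "] " * msg

# Hypothetical helper: prints the line, appends it to `path`, and keeps only
# the newest `max_entries` lines so the file cannot grow without bound.
function log_message(path::AbstractString, msg::AbstractString; max_entries::Integer = 100)
    line = format_log_line(msg)
    println(line)                       # console copy
    lines = isfile(path) ? readlines(path) : String[]
    push!(lines, line)
    kept = lines[max(1, end - max_entries + 1):end]
    open(path, "w") do io
        foreach(l -> println(io, l), kept)
    end
    return length(kept)
end

# Usage with a temporary file standing in for the real log path.
demo_log = tempname()
log_message(demo_log, "Starting check loop.")
```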
Reboot Commands
The scripts automatically select the appropriate reboot command based on the operating system:
- Linux: sudo systemctl reboot or sudo reboot
- macOS: sudo shutdown -r now
- Windows: shutdown /r /t 0
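The per-platform selection can be sketched with Julia's Sys.iswindows and Sys.isapple predicates. The helpers reboot_commands and trigger_reboot are hypothetical names, and trying the Linux commands in order as a fallback is an assumption about how the "or" above is resolved. The sketch defaults to dry-run so that running it does not reboot the machine.

```julia
# Hypothetical helper: candidate reboot commands for the current OS, in the
# order they would be tried.
function reboot_commands()
    if Sys.iswindows()
        return [`shutdown /r /t 0`]
    elseif Sys.isapple()
        return [`sudo shutdown -r now`]
    else
        return [`sudo systemctl reboot`, `sudo reboot`]
    end
end

# With dry_run = true the first command is only reported, never executed.
# Returns true only if a reboot command was actually started.
function trigger_reboot(; dry_run::Bool = true)
    for cmd in reboot_commands()
        if dry_run
            println("DRY RUN: would run ", cmd)
            return false
        end
        try
            run(cmd)
            return true
        catch err
            println("Reboot command failed: ", err)   # fall through to the next candidate
        end
    end
    return false
end

trigger_reboot(dry_run = true)
```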
Safety Features
- Cooldown Period: After a reboot, the script waits COOLDOWN_AFTER_REBOOT_SECS seconds before performing another check
- Consecutive Failures: Requires multiple consecutive failures before triggering a reboot
- Dry Run Mode: Set DRY_RUN = true to test without actually rebooting