# check_and_reboot

An automated monitoring and system recovery tool written in Julia that checks network connectivity and website availability, with automatic reboot capabilities when failures are detected.

## Overview

This project consists of two monitoring scripts:

- **[`check_router_reboot.jl`](check_router_reboot.jl)** - Monitors router connectivity via ICMP ping
- **[`check_yiem_website_reboot.jl`](check_yiem_website_reboot.jl)** - Monitors website availability via HTTP requests

Both scripts run continuously in the background, performing periodic health checks and automatically rebooting the system if consecutive failures exceed a configured threshold.

## Features

- **Continuous Monitoring**: Runs indefinitely with configurable check intervals
- **Multi-attempt Verification**: Retries failed checks with backoff before declaring failure
- **State Persistence**: Maintains state in a JSON file across restarts
- **Cooldown Period**: Prevents rapid repeated reboots after a reboot event
- **Cross-Platform Support**: Works on Linux, macOS, and Windows with appropriate reboot commands
- **Broadcast Notifications**: Sends system-wide notifications on events (via `wall`, `logger`, or platform equivalents)
- **Log Rotation**: Automatically limits each log file to the last 100 entries to prevent unbounded growth
- **Dry Run Mode**: Test configuration without triggering actual reboots

## Configuration

### Router Monitor Configuration (`check_router_reboot.jl`)

```julia
const ROUTER_IP = "192.168.88.1"         # Target router IP address
const TIMEOUT_SECS = 30                  # Request timeout in seconds
const ATTEMPTS_PER_CHECK = 3             # Number of ping attempts per check
const BACKOFF_BETWEEN_ATTEMPTS = 60      # Seconds between retry attempts
const FAILS_TO_REBOOT = 3                # Consecutive failures before reboot
const COOLDOWN_AFTER_REBOOT_SECS = 600   # Minimum seconds between reboots
const DRY_RUN = true                     # Set false to enable actual reboots
const CHECK_INTERVAL_SECS = 60           # Check interval in seconds
```

### Website Monitor Configuration (`check_yiem_website_reboot.jl`)

```julia
const URL = "https://www.yiem.cc"        # Target URL to monitor
const TIMEOUT_SECS = 30                  # Request timeout in seconds
const ATTEMPTS_PER_CHECK = 3             # Number of HTTP attempts per check
const BACKOFF_BETWEEN_ATTEMPTS = 60      # Seconds between retry attempts
const FAILS_TO_REBOOT = 3                # Consecutive failures before reboot
const COOLDOWN_AFTER_REBOOT_SECS = 600   # Minimum seconds between reboots
const DRY_RUN = false                    # Set false to enable actual reboots
const CHECK_INTERVAL_SECS = 60           # Check interval in seconds
```

## Usage

### Running Manually

```bash
julia check_router_reboot.jl
julia check_yiem_website_reboot.jl
```

### Running at System Boot (Crontab)

Add the following to root's crontab (`sudo crontab -e`):

```
@reboot /usr/local/bin/juliar /path/to/check_router_reboot.jl >> /var/log/check_reboot.log 2>&1
@reboot /usr/local/bin/juliar /path/to/check_yiem_website_reboot.jl >> /var/log/check_reboot.log 2>&1
```

**Note**: The crontab entries use `juliar`, which is a symlink to Julia for root (separate from the user's Julia installation).

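For orientation, the retry settings from the Configuration section (`ATTEMPTS_PER_CHECK`, `BACKOFF_BETWEEN_ATTEMPTS`) could drive a single check roughly as sketched below. This is a minimal illustration, not the code in the scripts: `check_with_retries` and `ping_once` are hypothetical names, the ping flags assume Linux or macOS, and `TIMEOUT_SECS` is not modeled.

```julia
# Hypothetical helper (not taken from the scripts): call `probe` up to
# `attempts` times and sleep `backoff_secs` seconds between failed attempts.
# Returns true as soon as one attempt succeeds, false if every attempt fails.
function check_with_retries(probe::Function; attempts::Int = 3, backoff_secs::Real = 60)
    for attempt in 1:attempts
        ok = try
            probe() === true
        catch
            false  # any exception (missing binary, DNS error, ...) counts as a failed attempt
        end
        ok && return true
        # Only back off when another attempt will follow.
        attempt < attempts && sleep(backoff_secs)
    end
    return false
end

# Hypothetical probe: one ICMP echo request. `-c 1` is the Linux/macOS flag;
# Windows ping uses different options.
ping_once(ip::AbstractString) =
    success(pipeline(`ping -c 1 $ip`; stdout = devnull, stderr = devnull))
```

With the router defaults, a call might look like `check_with_retries(() -> ping_once("192.168.88.1"); attempts = 3, backoff_secs = 60)`.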
### Required Dependencies

Install the required Julia packages:

```bash
julia -e 'using Pkg; Pkg.add(["HTTP", "Dates", "JSON"])'
```

## Files

| File | Description |
|------|-------------|
| [`check_router_reboot.jl`](check_router_reboot.jl) | Router ping monitor with auto-reboot |
| [`check_yiem_website_reboot.jl`](check_yiem_website_reboot.jl) | Website HTTP monitor with auto-reboot |
| [`check_and_reboot_state.json`](check_and_reboot_state.json) | State persistence file (generated) |
| [`check_router_reboot_log.txt`](check_router_reboot_log.txt) | Router monitor log file |
| [`check_website_reboot_log.txt`](check_website_reboot_log.txt) | Website monitor log file |

## State File

The state is stored in [`check_and_reboot_state.json`](check_and_reboot_state.json) with the following structure:

```json
{
  "consecutive_fails": 0,
  "last_reboot_datetime": "2026-03-11T10:00:00"
}
```

## Log Output

Logs are written to both the console and the respective log files with timestamps:

```
[2026-03-11T10:05:09.123] Starting check loop. Checking router 192.168.88.1 every 60 seconds.
[2026-03-11T10:05:09.456] 192.168.88.1 is reachable; resetting consecutive failure counter.
[2026-03-11T10:06:09.789] 192.168.88.1 is unreachable (last result: no response). Consecutive fails: 1/3.
```

## Reboot Commands

The scripts automatically select the appropriate reboot command based on the operating system:

- **Linux**: `sudo systemctl reboot` or `sudo reboot`
- **macOS**: `sudo shutdown -r now`
- **Windows**: `shutdown /r /t 0`

## Safety Features

1. **Cooldown Period**: After a reboot, the script waits `COOLDOWN_AFTER_REBOOT_SECS` seconds before performing another check
2. **Consecutive Failures**: Requires multiple consecutive failures before triggering a reboot
3. **Dry Run Mode**: Set `DRY_RUN = true` to test without actually rebooting
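
The three safety features can be pictured as one decision step per check. The sketch below is an assumption about how they might fit together, not the scripts' actual code: `MonitorState`, `in_cooldown`, `reboot_command`, and `handle_check!` are hypothetical names, the state is kept in memory instead of being read from and written to the JSON file, and the cooldown is interpreted as a minimum time between reboots.

```julia
using Dates

# Hypothetical in-memory mirror of the two fields in the state file.
mutable struct MonitorState
    consecutive_fails::Int
    last_reboot_datetime::Union{DateTime, Nothing}
end

# True while fewer than `cooldown_secs` seconds have passed since the last reboot.
function in_cooldown(state::MonitorState, now_dt::DateTime, cooldown_secs::Integer)
    state.last_reboot_datetime === nothing && return false
    elapsed_ms = Dates.value(now_dt - state.last_reboot_datetime)
    return elapsed_ms < cooldown_secs * 1000
end

# Choose a reboot command for the current OS, following the Reboot Commands list.
function reboot_command()
    Sys.iswindows() && return `shutdown /r /t 0`
    Sys.isapple() && return `sudo shutdown -r now`
    return `sudo systemctl reboot`
end

# Apply one check result to the state and return the action taken as a Symbol.
# Assumption: a dry run still records the reboot time so the cooldown path
# can be exercised without rebooting.
function handle_check!(state::MonitorState, reachable::Bool;
                       now_dt::DateTime = now(),
                       fails_to_reboot::Integer = 3,
                       cooldown_secs::Integer = 600,
                       dry_run::Bool = true)
    if reachable
        state.consecutive_fails = 0
        return :ok
    end
    state.consecutive_fails += 1
    state.consecutive_fails < fails_to_reboot && return :failed
    in_cooldown(state, now_dt, cooldown_secs) && return :cooldown
    state.consecutive_fails = 0
    state.last_reboot_datetime = now_dt
    dry_run && return :dry_run
    run(reboot_command())
    return :reboot
end
```

Because `dry_run` defaults to `true`, calling `handle_check!(MonitorState(0, nothing), false)` three times on the same state returns `:failed`, `:failed`, and then `:dry_run` without running any reboot command.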