From 758b5ebad2302662b042b705761e75f6334eca54 Mon Sep 17 00:00:00 2001
From: narawat
Date: Wed, 11 Mar 2026 17:08:28 +0700
Subject: [PATCH] add readme

---
 README.md | 129 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 124 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 821137e..226824f 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,124 @@
-
-
-
-@reboot /usr/local/bin/juliar /home/ton/docker-programs/check_yiem_website_reboot/check_yiem_website_reboot.jl >> /var/log/check_reboot.log 2>&1
-# *** juliar is root's julia (sudo crontab -e) but I symlinked to juliar because I want to seperate it from user's julia
\ No newline at end of file
+# check_and_reboot
+
+An automated monitoring and system recovery tool written in Julia that checks network connectivity and website availability, with automatic reboot capabilities when failures are detected.
+
+## Overview
+
+This project consists of two monitoring scripts:
+
+- **[`check_router_reboot.jl`](check_router_reboot.jl)** - Monitors router connectivity via ICMP ping
+- **[`check_yiem_website_reboot.jl`](check_yiem_website_reboot.jl)** - Monitors website availability via HTTP requests
+
+Both scripts run continuously in the background, performing periodic health checks and automatically rebooting the system if consecutive failures exceed a configured threshold.
+
+## Features
+
+- **Continuous Monitoring**: Runs indefinitely with configurable check intervals
+- **Multi-attempt Verification**: Retries failed checks with backoff before declaring failure
+- **State Persistence**: Maintains state in a JSON file across restarts
+- **Cooldown Period**: Prevents rapid repeated reboots after a reboot event
+- **Cross-Platform Support**: Works on Linux, macOS, and Windows with appropriate reboot commands
+- **Broadcast Notifications**: Sends system-wide notifications on events (via `wall`, `logger`, or platform equivalents)
+- **Log Rotation**: Automatically limits the log file to the last 100 entries to prevent unbounded growth
+- **Dry Run Mode**: Test configuration without triggering actual reboots
+
+## Configuration
+
+### Router Monitor Configuration (`check_router_reboot.jl`)
+
+```julia
+const ROUTER_IP = "192.168.88.1"        # Target router IP address
+const TIMEOUT_SECS = 30                 # Request timeout in seconds
+const ATTEMPTS_PER_CHECK = 3            # Number of ping attempts per check
+const BACKOFF_BETWEEN_ATTEMPTS = 60     # Seconds between retry attempts
+const FAILS_TO_REBOOT = 3               # Consecutive failures before reboot
+const COOLDOWN_AFTER_REBOOT_SECS = 600  # Minimum seconds between reboots
+const DRY_RUN = true                    # Set false to enable actual reboots
+const CHECK_INTERVAL_SECS = 60          # Check interval in seconds
+```
+
+### Website Monitor Configuration (`check_yiem_website_reboot.jl`)
+
+```julia
+const URL = "https://www.yiem.cc"       # Target URL to monitor
+const TIMEOUT_SECS = 30                 # Request timeout in seconds
+const ATTEMPTS_PER_CHECK = 3            # Number of HTTP attempts per check
+const BACKOFF_BETWEEN_ATTEMPTS = 60     # Seconds between retry attempts
+const FAILS_TO_REBOOT = 3               # Consecutive failures before reboot
+const COOLDOWN_AFTER_REBOOT_SECS = 600  # Minimum seconds between reboots
+const DRY_RUN = false                   # Set false to enable actual reboots
+const CHECK_INTERVAL_SECS = 60          # Check interval in seconds
+```
+
+## Usage
+
+### Running Manually
+
+```bash
+julia check_router_reboot.jl
+julia check_yiem_website_reboot.jl
+```
+
+### Running at System Boot (Crontab)
+
+Add the following to root's crontab (`sudo crontab -e`):
+
+```
+@reboot /usr/local/bin/juliar /path/to/check_router_reboot.jl >> /var/log/check_reboot.log 2>&1
+@reboot /usr/local/bin/juliar /path/to/check_yiem_website_reboot.jl >> /var/log/check_reboot.log 2>&1
+```
+
+**Note**: The crontab entries call `juliar`, a symlink to root's Julia installation that keeps it separate from the user's Julia.
+
+### Required Dependencies
+
+Install the required Julia packages:
+
+```bash
+julia -e 'using Pkg; Pkg.add(["HTTP", "Dates", "JSON"])'
+```
+
+## Files
+
+| File | Description |
+|------|-------------|
+| [`check_router_reboot.jl`](check_router_reboot.jl) | Router ping monitor with auto-reboot |
+| [`check_yiem_website_reboot.jl`](check_yiem_website_reboot.jl) | Website HTTP monitor with auto-reboot |
+| [`check_and_reboot_state.json`](check_and_reboot_state.json) | State persistence file (generated) |
+| [`check_router_reboot_log.txt`](check_router_reboot_log.txt) | Router monitor log file |
+| [`check_website_reboot_log.txt`](check_website_reboot_log.txt) | Website monitor log file |
+
+## State File
+
+The state is stored in [`check_and_reboot_state.json`](check_and_reboot_state.json) with the following structure:
+
+```json
+{
+  "consecutive_fails": 0,
+  "last_reboot_datetime": "2026-03-11T10:00:00"
+}
+```
+
+## Log Output
+
+Logs are written to both the console and the respective log files with timestamps:
+
+```
+[2026-03-11T10:05:09.123] Starting check loop. Checking router 192.168.88.1 every 60 seconds.
+[2026-03-11T10:05:09.456] 192.168.88.1 is reachable; resetting consecutive failure counter.
+[2026-03-11T10:06:09.789] 192.168.88.1 is unreachable (last result: no response). Consecutive fails: 1/3.
+```
+
+## Reboot Commands
+
+The scripts automatically select the appropriate reboot command based on the operating system:
+
+- **Linux**: `sudo systemctl reboot` or `sudo reboot`
+- **macOS**: `sudo shutdown -r now`
+- **Windows**: `shutdown /r /t 0`
+
+## Safety Features
+
+1. **Cooldown Period**: After a reboot, the script waits `COOLDOWN_AFTER_REBOOT_SECS` seconds before performing another check
+2. **Consecutive Failures**: Requires multiple consecutive failures before triggering a reboot
+3. **Dry Run Mode**: Set `DRY_RUN = true` to test without actually rebooting
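
Note on the Safety Features section: the way the failure threshold, the cooldown, and the dry-run flag interact could be sketched roughly as follows. This is a minimal illustration and not code taken from either script. The function `decide` and its return symbols are hypothetical, the constants reuse the names and example values from the Configuration section, and the cooldown is read as "minimum seconds between reboots" (the wording of the configuration comment), which is an assumption about the scripts' behavior.

```julia
using Dates

# Values copied from the README's configuration examples.
const FAILS_TO_REBOOT = 3
const COOLDOWN_AFTER_REBOOT_SECS = 600
const DRY_RUN = true

# Hypothetical helper: given the result of one health check and the persisted
# state, return (action, new_consecutive_fails).
# action is one of :ok, :wait, :cooldown, :dry_run, :reboot.
function decide(check_ok::Bool, consecutive_fails::Int,
                last_reboot::Union{DateTime,Nothing}, now_dt::DateTime)
    # A successful check resets the failure counter.
    check_ok && return (:ok, 0)

    fails = consecutive_fails + 1
    # Below the threshold: record the failure and keep checking.
    fails < FAILS_TO_REBOOT && return (:wait, fails)

    # Threshold reached: refuse to reboot again while still in the cooldown window.
    if last_reboot !== nothing
        elapsed_secs = Dates.value(now_dt - last_reboot) / 1000  # milliseconds -> seconds
        elapsed_secs < COOLDOWN_AFTER_REBOOT_SECS && return (:cooldown, fails)
    end

    # In dry-run mode the reboot is only reported, never executed.
    return (DRY_RUN ? :dry_run : :reboot, fails)
end

# Example calls with a fixed reference time.
t = DateTime(2026, 3, 11, 10, 0, 0)
println(decide(true,  2, nothing, t))          # healthy check
println(decide(false, 0, nothing, t))          # first failure
println(decide(false, 2, t - Minute(5), t))    # threshold hit, rebooted 5 minutes ago
println(decide(false, 2, t - Minute(20), t))   # threshold hit, cooldown expired
```

Keeping the decision in a pure function that takes the current time as an argument makes the threshold and cooldown rules testable without pinging anything or rebooting the machine.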