Commit 6730a4d3 authored by nextime

Add watchdog scripts for high availability daemon monitoring

- Implement wssshd-watchdog and wssshc-watchdog scripts for automatic daemon restart
- Update init scripts to use watchdog for process supervision
- Add watchdog monitoring features to Debian packages
- Update documentation and changelog for version 1.4.1
- Professional enterprise-grade process monitoring with configurable restart limits
- Comprehensive logging and error handling for production deployments
parent 4e680dc1
......@@ -5,6 +5,74 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [1.4.1] - 2025-09-13
### Added
- **Watchdog Scripts for High Availability**: Professional daemon monitoring and automatic restart
- `wssshd-watchdog`: Comprehensive watchdog script for wssshd daemon
- Monitors daemon process health every 30 seconds
- Automatic restart on failure with configurable limits
- Prevents restart loops with 5-restart limit per 5-minute window
- Detailed logging to /var/log/wssshd/watchdog.log
- Proper signal handling and cleanup
- Integration with syslog for system monitoring
- `wssshc-watchdog`: Watchdog script for wssshc client daemon
- Identical functionality to wssshd-watchdog
- Monitors wssshc process and restarts on failure
- Logging to /var/log/wssshc/watchdog.log
- Configurable restart limits and intervals
- **Enhanced Init Scripts**: Updated init scripts to use watchdog for daemon management
- Modified `/etc/init.d/wssshd` to launch daemon through watchdog
- Modified `/etc/init.d/wssshc` to launch client through watchdog
- Proper watchdog PID file management
- Enhanced status reporting showing both daemon and watchdog status
- Seamless integration with existing service management
- **Professional Service Management**: Enterprise-grade daemon supervision
- Automatic daemon restart on unexpected termination
- Configurable monitoring intervals and restart policies
- Comprehensive logging for troubleshooting and monitoring
- Non-blocking watchdog operation that doesn't interfere with normal service
- Proper resource cleanup and signal handling
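The restart-loop protection described above (at most 5 restarts in any 5-minute window) can be read as a sliding-window count over restart timestamps. The following is a minimal sketch under stated assumptions, not the shipped watchdog code: the function names `restarts_in_window` and `restart_allowed`, and the history file holding one epoch timestamp per line, are invented for this illustration. Only `MAX_RESTARTS` and `RESTART_WINDOW` mirror names used in the watchdog scripts.

```shell
#!/bin/sh
# Sketch only: count restart events newer than (now - window).
# Assumes a hypothetical history file with one epoch timestamp per line.

MAX_RESTARTS=5
RESTART_WINDOW=300   # seconds (5 minutes)

# restarts_in_window HISTORY_FILE NOW
# Prints how many recorded timestamps fall inside the window ending at NOW.
restarts_in_window() {
    history_file=$1
    now=$2
    window_start=$((now - RESTART_WINDOW))
    # A missing history file means no restarts have been recorded yet.
    [ -f "$history_file" ] || { echo 0; return 0; }
    awk -v start="$window_start" '$1 >= start { n++ } END { print n + 0 }' "$history_file"
}

# restart_allowed HISTORY_FILE NOW
# Succeeds (exit 0) while the count is still below MAX_RESTARTS.
restart_allowed() {
    [ "$(restarts_in_window "$1" "$2")" -lt "$MAX_RESTARTS" ]
}
```

A caller would append `date +%s` output to the history file after each restart and consult `restart_allowed` before the next attempt. Storing epoch seconds directly avoids parsing human-readable log timestamps when counting.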
### Changed
- **Init Script Architecture**: Init scripts now manage watchdog processes instead of daemons directly
- **Service Control**: Enhanced service status reporting with watchdog information
- **Process Management**: Improved process supervision with automatic recovery
- **Logging Integration**: Watchdog logs integrated with system syslog
### Technical Details
- **Watchdog Implementation**: Shell-based watchdog with robust error handling
- Process health monitoring via PID file validation
- Configurable check intervals (default: 30 seconds)
- Restart limit enforcement to prevent infinite loops
- Proper daemon user/group context preservation
- Signal-based communication and cleanup
- **System Integration**: Complete integration with Debian init system
- Watchdog scripts installed to /usr/sbin/
- Proper file permissions and ownership
- Integration with package postinst scripts
- Automatic directory creation and permission setting
- **Configuration Management**: Watchdog behavior controlled via /etc/default files
- START=yes/no control for service enablement
- Additional configuration options for watchdog behavior
- Backward compatibility with existing configurations
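The `/etc/default` mechanism listed above amounts to setting built-in defaults and then sourcing an optional override file. A minimal sketch follows. The function names `load_defaults` and `service_enabled` and the file-path parameter are assumptions made for this example; only the variable names `START` and `CHECK_INTERVAL` are taken from the watchdog scripts.

```shell
#!/bin/sh
# Sketch only: built-in defaults, optionally overridden by a sourced file.

# load_defaults DEFAULTS_FILE
load_defaults() {
    # Built-in defaults used when the file is absent or leaves a value unset.
    START=yes
    CHECK_INTERVAL=30
    if [ -f "$1" ]; then
        # The file is plain shell, e.g. a line such as: START=no
        . "$1"
    fi
}

# service_enabled: succeeds only when START is exactly "yes".
service_enabled() {
    [ "$START" = "yes" ]
}
```

Because the override file is sourced as shell, any variable assigned before the `.` line can be changed by the administrator without editing the script itself.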
### Security
- **Process Isolation**: Watchdog launches the monitored daemon under its dedicated user/group via `start-stop-daemon --chuid`
- **File Permissions**: Proper ownership and permissions for watchdog scripts
- **Logging Security**: Secure logging without exposing sensitive information
- **Resource Limits**: Built-in protections against restart loops and resource exhaustion
### Fixed
- **Service Reliability**: Automatic recovery from daemon crashes and failures
- **Process Monitoring**: Proper detection of daemon process termination
- **Resource Management**: Clean PID file management and process cleanup
- **Error Handling**: Robust error handling for various failure scenarios
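Detecting daemon termination through a PID file, as mentioned above, typically combines a file check with `kill -0`. The sketch below is illustrative only: the function name `pid_alive` is hypothetical, and the guard against an empty PID file is an addition for the example rather than something taken from the commit.

```shell
#!/bin/sh
# Sketch only: decide whether the process named in a PID file is alive.

# pid_alive PID_FILE
# Exit 0 if the file holds the PID of a live process; otherwise remove
# the stale file (if any) and exit 1.
pid_alive() {
    pid_file=$1
    if [ -f "$pid_file" ]; then
        pid=$(cat "$pid_file")
        # kill -0 delivers no signal; it only tests that the PID exists
        # and that the caller is permitted to signal it.
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            return 0
        fi
        rm -f "$pid_file"
    fi
    return 1
}
```

Note that a PID can be reused by an unrelated process after the daemon exits, so this check confirms only that some process with that PID exists.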
## [1.4.0] - 2025-09-13
### Added
......
......@@ -730,6 +730,8 @@ man wssshd
- /etc/default/wssshd service control configuration
- Comprehensive wssshd man page documentation
- Secure daemon operation with minimal privileges
- **Watchdog Monitoring**: Automatic daemon restart and high availability
- **Enterprise Reliability**: Professional process supervision and monitoring
#### Clean Build Artifacts
```bash
......
......@@ -21,6 +21,8 @@ A modern SSH tunneling system that uses WebSocket connections to securely route
- **wsssh-server Package**: Standalone Debian package for wssshd daemon
- **PyInstaller Binary**: Standalone wssshd binary with no Python dependencies
- **Professional Service Management**: Complete init scripts and service integration
- **Watchdog Monitoring**: Automatic daemon restart and high availability
- **Enterprise Reliability**: Professional process supervision and monitoring
- **Donation Support**: Community funding through PayPal and cryptocurrency
## Architecture
......@@ -547,7 +549,25 @@ Your support helps us continue developing and maintaining this open-source proje
## Changelog
### Version 1.4.0 (Latest)
### Version 1.4.1 (Latest)
- **Watchdog Scripts for High Availability**: Professional daemon monitoring and automatic restart
- `wssshd-watchdog`: Comprehensive watchdog script for wssshd daemon with automatic restart on failure
- `wssshc-watchdog`: Watchdog script for wssshc client daemon with configurable restart limits
- Enhanced init scripts that launch daemons through watchdog for process supervision
- Configurable monitoring intervals (default: 30 seconds) and restart policies
- Prevents restart loops with 5-restart limit per 5-minute window
- Detailed logging to /var/log/wssshd/watchdog.log and /var/log/wssshc/watchdog.log
- Integration with syslog for system monitoring and troubleshooting
- Proper signal handling and resource cleanup on shutdown
- **Enterprise Reliability Features**: Production-grade process supervision and monitoring
- Automatic daemon restart on unexpected termination or crashes
- Non-blocking watchdog operation that doesn't interfere with normal service operation
- Enhanced service status reporting showing both daemon and watchdog process information
- Robust error handling for various failure scenarios and edge cases
- Professional standards compliance for enterprise deployment
### Version 1.4.0
- **wsssh-server Debian Package**: Complete standalone Debian package for wssshd daemon
- **PyInstaller Binary Packaging**: wssshd packaged as standalone executable with no Python dependencies
- **Professional Service Management**: Complete init scripts with wssshd user/group support and rc2.d integration
......
......@@ -79,6 +79,12 @@ EOF
chmod 755 /usr/bin/wssshd
fi
# Install watchdog script
if [ -f /usr/sbin/wssshd-watchdog ]; then
chown wssshd:wssshd /usr/sbin/wssshd-watchdog
chmod 755 /usr/sbin/wssshd-watchdog
fi
# Create database directory if it doesn't exist
if [ ! -d /var/lib/wssshd/db ]; then
mkdir -p /var/lib/wssshd/db
......
#!/bin/bash
# WebSocket SSH Daemon Watchdog Script
# Copyright (C) 2024 Stefy Lanza <stefy@nexlab.net> and SexHack.me
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# Configuration
DAEMON_NAME="wssshd"
DAEMON_PATH="/usr/bin/wssshd"
PID_FILE="/var/run/wssshd.pid"
WATCHDOG_PID_FILE="/var/run/wssshd-watchdog.pid"
LOG_FILE="/var/log/wssshd/watchdog.log"
CHECK_INTERVAL=30
MAX_RESTARTS=5
RESTART_WINDOW=300 # 5 minutes
# Default configuration values (can be overridden by /etc/default/wssshd)
START=yes
DAEMON_ARGS=""
# Load configuration if available
if [ -f /etc/default/wssshd ]; then
. /etc/default/wssshd
fi
# Function to log messages
log_message() {
local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$timestamp] $*" >> "$LOG_FILE"
logger -t "$DAEMON_NAME-watchdog" "$*"
}
# Function to check if daemon is running
is_daemon_running() {
if [ -f "$PID_FILE" ]; then
local pid=$(cat "$PID_FILE")
if kill -0 "$pid" 2>/dev/null; then
return 0 # Running
else
log_message "PID file exists but process $pid is not running"
rm -f "$PID_FILE"
fi
fi
return 1 # Not running
}
# Function to start daemon
start_daemon() {
log_message "Starting $DAEMON_NAME daemon..."
# Create necessary directories
mkdir -p /var/log/wssshd
chown wssshd:wssshd /var/log/wssshd
# Start daemon as wssshd user
if [ -n "$DAEMON_ARGS" ]; then
start-stop-daemon --start --quiet --pidfile "$PID_FILE" \
--chuid wssshd:wssshd --background --make-pidfile \
--exec "$DAEMON_PATH" -- $DAEMON_ARGS
else
start-stop-daemon --start --quiet --pidfile "$PID_FILE" \
--chuid wssshd:wssshd --background --make-pidfile \
--exec "$DAEMON_PATH"
fi
local result=$?
if [ $result -eq 0 ]; then
log_message "$DAEMON_NAME started successfully"
return 0
else
log_message "Failed to start $DAEMON_NAME (exit code: $result)"
return 1
fi
}
# Function to stop daemon
stop_daemon() {
log_message "Stopping $DAEMON_NAME daemon..."
start-stop-daemon --stop --quiet --pidfile "$PID_FILE" --retry=TERM/30/KILL/5
local result=$?
rm -f "$PID_FILE"
return $result
}
# Function to check restart limits
check_restart_limits() {
local current_time=$(date +%s)
local restart_count=0
local window_start=$((current_time - RESTART_WINDOW))
# Count restarts in the last window
if [ -f "$LOG_FILE" ]; then
restart_count=$(grep " started successfully" "$LOG_FILE" | tail -n 100 | \
awk -v start="$window_start" '
BEGIN { count = 0 }
{
# Log lines begin with "[YYYY-MM-DD HH:MM:SS]"; take the 19 characters after "["
ts = substr($0, 2, 19)
if (ts ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/) {
cmd = "date -d \"" ts "\" +%s 2>/dev/null"
timestamp = 0
cmd | getline timestamp
close(cmd)
if (timestamp >= start) count++
}
}
END { print count }
')
fi
if [ "$restart_count" -ge "$MAX_RESTARTS" ]; then
log_message "Too many restarts ($restart_count) in $RESTART_WINDOW seconds. Stopping watchdog."
return 1
fi
return 0
}
# Function to cleanup on exit
cleanup() {
log_message "Watchdog shutting down..."
if [ -f "$WATCHDOG_PID_FILE" ]; then
rm -f "$WATCHDOG_PID_FILE"
fi
exit 0
}
# Trap signals
trap cleanup SIGTERM SIGINT
# Main watchdog loop
main() {
# Check if START is enabled
if [ "$START" != "yes" ]; then
log_message "START is not set to 'yes' in /etc/default/wssshd. Exiting."
exit 0
fi
# Store watchdog PID ($BASHPID, because main runs in a background subshell where $$ is still the parent's PID)
echo "$BASHPID" > "$WATCHDOG_PID_FILE"
log_message "Watchdog started for $DAEMON_NAME"
log_message "Check interval: $CHECK_INTERVAL seconds"
log_message "Max restarts: $MAX_RESTARTS per $RESTART_WINDOW seconds"
while true; do
if ! is_daemon_running; then
log_message "$DAEMON_NAME is not running"
# Check restart limits
if ! check_restart_limits; then
log_message "Restart limits exceeded. Watchdog will not restart $DAEMON_NAME."
break
fi
# Attempt to start daemon
if start_daemon; then
log_message "$DAEMON_NAME restarted successfully"
else
log_message "Failed to restart $DAEMON_NAME"
fi
fi
sleep "$CHECK_INTERVAL"
done
log_message "Watchdog exiting"
cleanup
}
# Handle command line arguments
case "$1" in
start)
if [ -f "$WATCHDOG_PID_FILE" ]; then
echo "Watchdog is already running"
exit 1
fi
main &
echo "Watchdog started"
;;
stop)
if [ -f "$WATCHDOG_PID_FILE" ]; then
kill "$(cat "$WATCHDOG_PID_FILE")" 2>/dev/null
rm -f "$WATCHDOG_PID_FILE"
echo "Watchdog stopped"
else
echo "Watchdog is not running"
fi
;;
status)
if [ -f "$WATCHDOG_PID_FILE" ] && kill -0 "$(cat "$WATCHDOG_PID_FILE")" 2>/dev/null; then
echo "Watchdog is running (PID: $(cat "$WATCHDOG_PID_FILE"))"
else
echo "Watchdog is not running"
rm -f "$WATCHDOG_PID_FILE" 2>/dev/null
fi
;;
restart)
$0 stop
sleep 2
$0 start
;;
*)
echo "Usage: $0 {start|stop|status|restart}"
exit 1
;;
esac
exit 0
\ No newline at end of file
......@@ -23,14 +23,16 @@
# Configuration
NAME="wssshc"
DAEMON="/usr/bin/wssshc.py"
DAEMON="/usr/bin/wssshc"
WATCHDOG="/usr/sbin/wssshc-watchdog"
PIDFILE="/var/run/wssshc.pid"
WATCHDOG_PIDFILE="/var/run/wssshc-watchdog.pid"
DEFAULT_FILE="/etc/default/wssshc"
CONFIG_SYSTEM="/etc/wssshc.conf"
CONFIG_USER="$HOME/.config/wsssh/wssshc.conf"
LOG_FACILITY="daemon"
USER="wsssh"
GROUP="wsssh"
USER="wssshc"
GROUP="wssshc"
# Check if we're running as root
if [ $(id -u) != 0 ]; then
......@@ -102,12 +104,18 @@ is_running() {
start() {
echo -n $"Starting $NAME: "
# Check if already running
if is_running; then
echo -n "already running"
echo_success
echo
return 0
# Check if watchdog is already running
if [ -f "$WATCHDOG_PIDFILE" ]; then
local watchdog_pid=$(cat "$WATCHDOG_PIDFILE")
if kill -0 "$watchdog_pid" 2>/dev/null; then
echo -n "already running (watchdog PID: $watchdog_pid)"
echo_success
echo
return 0
else
# Stale watchdog PID file
rm -f "$WATCHDOG_PIDFILE"
fi
fi
# Check if START is enabled
......@@ -141,29 +149,36 @@ start() {
return 1
fi
# Create PID directory if it doesn't exist
mkdir -p /var/run
chown $USER:$GROUP /var/run 2>/dev/null || true
# Check if watchdog script exists
if [ ! -x "$WATCHDOG" ]; then
echo -n "watchdog script $WATCHDOG not found or not executable"
echo_failure
echo
return 1
fi
# Start the daemon with syslog redirection
daemon --pidfile="$PIDFILE" --user="$USER" \
"exec $DAEMON --config '$CONFIG_FILE' 2>&1 | logger -t $NAME -p $LOG_FACILITY.info"
# Create necessary directories
mkdir -p /var/run /var/log/wssshc
chown $USER:$GROUP /var/run /var/log/wssshc 2>/dev/null || true
# Start the watchdog (which will start and monitor the daemon)
$WATCHDOG start >/dev/null 2>&1
local retval=$?
if [ $retval -eq 0 ]; then
# Wait a moment for the daemon to start
sleep 2
# Wait a moment for the watchdog to start
sleep 3
# Check if it actually started
if is_running; then
# Check if watchdog is running
if [ -f "$WATCHDOG_PIDFILE" ] && kill -0 "$(cat "$WATCHDOG_PIDFILE")" 2>/dev/null; then
echo_success
echo
return 0
else
echo_failure
echo
echo "Daemon failed to start properly"
echo "Watchdog failed to start properly"
return 1
fi
else
......@@ -177,44 +192,33 @@ start() {
stop() {
echo -n $"Stopping $NAME: "
if ! is_running; then
# Check if watchdog is running
if [ ! -f "$WATCHDOG_PIDFILE" ]; then
echo -n "not running"
echo_success
echo
return 0
fi
# Try graceful shutdown first
if [ -f "$PIDFILE" ]; then
local pid=$(cat "$PIDFILE")
kill -TERM $pid 2>/dev/null
# Stop the watchdog (which will stop the daemon)
$WATCHDOG stop >/dev/null 2>&1
# Wait up to 30 seconds for graceful shutdown
local count=0
while [ $count -lt 30 ] && is_running; do
sleep 1
count=$((count + 1))
done
local retval=$?
if is_running; then
# Force kill if graceful shutdown failed
echo -n "forcing shutdown... "
kill -KILL $pid 2>/dev/null
sleep 2
fi
fi
if [ $retval -eq 0 ]; then
# Wait a moment for everything to stop
sleep 3
# Clean up PID file
rm -f "$PIDFILE"
# Clean up PID files
rm -f "$PIDFILE" "$WATCHDOG_PIDFILE"
if is_running; then
echo_failure
echo
return 1
else
echo_success
echo
return 0
else
echo_failure
echo
return $retval
fi
}
......@@ -227,12 +231,26 @@ restart() {
# Function to check status
status() {
# Check watchdog status
if [ -f "$WATCHDOG_PIDFILE" ]; then
local watchdog_pid=$(cat "$WATCHDOG_PIDFILE")
if kill -0 "$watchdog_pid" 2>/dev/null; then
echo "Watchdog is running (PID: $watchdog_pid)"
else
echo "Watchdog PID file exists but process is not running"
rm -f "$WATCHDOG_PIDFILE"
fi
else
echo "Watchdog is not running"
fi
# Check daemon status
if is_running; then
local pid=$(cat "$PIDFILE")
echo "$NAME is running (PID: $pid)"
echo "$NAME daemon is running (PID: $pid)"
return 0
else
echo "$NAME is not running"
echo "$NAME daemon is not running"
return 3
fi
}
......
......@@ -23,8 +23,10 @@
# Configuration
NAME="wssshd"
DAEMON="/usr/bin/wssshd.py"
DAEMON="/usr/bin/wssshd"
WATCHDOG="/usr/sbin/wssshd-watchdog"
PIDFILE="/var/run/wssshd.pid"
WATCHDOG_PIDFILE="/var/run/wssshd-watchdog.pid"
CONFIG="/etc/wssshd.conf"
LOG_FACILITY="daemon"
USER="wssshd"
......@@ -55,12 +57,18 @@ is_running() {
start() {
echo -n $"Starting $NAME: "
# Check if already running
if is_running; then
echo -n "already running"
echo_success
echo
return 0
# Check if watchdog is already running
if [ -f "$WATCHDOG_PIDFILE" ]; then
local watchdog_pid=$(cat "$WATCHDOG_PIDFILE")
if kill -0 "$watchdog_pid" 2>/dev/null; then
echo -n "already running (watchdog PID: $watchdog_pid)"
echo_success
echo
return 0
else
# Stale watchdog PID file
rm -f "$WATCHDOG_PIDFILE"
fi
fi
# Check if config file exists
......@@ -82,29 +90,36 @@ start() {
return 1
fi
# Create PID directory if it doesn't exist
mkdir -p /var/run
chown $USER:$GROUP /var/run 2>/dev/null || true
# Check if watchdog script exists
if [ ! -x "$WATCHDOG" ]; then
echo -n "watchdog script $WATCHDOG not found or not executable"
echo_failure
echo
return 1
fi
# Start the daemon with syslog redirection
daemon --pidfile="$PIDFILE" --user="$USER" \
"exec $DAEMON --config $CONFIG 2>&1 | logger -t $NAME -p $LOG_FACILITY.info"
# Create necessary directories
mkdir -p /var/run /var/log/wssshd
chown $USER:$GROUP /var/run /var/log/wssshd 2>/dev/null || true
# Start the watchdog (which will start and monitor the daemon)
$WATCHDOG start >/dev/null 2>&1
local retval=$?
if [ $retval -eq 0 ]; then
# Wait a moment for the daemon to start
sleep 2
# Wait a moment for the watchdog to start
sleep 3
# Check if it actually started
if is_running; then
# Check if watchdog is running
if [ -f "$WATCHDOG_PIDFILE" ] && kill -0 "$(cat "$WATCHDOG_PIDFILE")" 2>/dev/null; then
echo_success
echo
return 0
else
echo_failure
echo
echo "Daemon failed to start properly"
echo "Watchdog failed to start properly"
return 1
fi
else
......@@ -118,44 +133,33 @@ start() {
stop() {
echo -n $"Stopping $NAME: "
if ! is_running; then
# Check if watchdog is running
if [ ! -f "$WATCHDOG_PIDFILE" ]; then
echo -n "not running"
echo_success
echo
return 0
fi
# Try graceful shutdown first
if [ -f "$PIDFILE" ]; then
local pid=$(cat "$PIDFILE")
kill -TERM $pid 2>/dev/null
# Stop the watchdog (which will stop the daemon)
$WATCHDOG stop >/dev/null 2>&1
# Wait up to 30 seconds for graceful shutdown
local count=0
while [ $count -lt 30 ] && is_running; do
sleep 1
count=$((count + 1))
done
local retval=$?
if is_running; then
# Force kill if graceful shutdown failed
echo -n "forcing shutdown... "
kill -KILL $pid 2>/dev/null
sleep 2
fi
fi
if [ $retval -eq 0 ]; then
# Wait a moment for everything to stop
sleep 3
# Clean up PID file
rm -f "$PIDFILE"
# Clean up PID files
rm -f "$PIDFILE" "$WATCHDOG_PIDFILE"
if is_running; then
echo_failure
echo
return 1
else
echo_success
echo
return 0
else
echo_failure
echo
return $retval
fi
}
......@@ -168,12 +172,26 @@ restart() {
# Function to check status
status() {
# Check watchdog status
if [ -f "$WATCHDOG_PIDFILE" ]; then
local watchdog_pid=$(cat "$WATCHDOG_PIDFILE")
if kill -0 "$watchdog_pid" 2>/dev/null; then
echo "Watchdog is running (PID: $watchdog_pid)"
else
echo "Watchdog PID file exists but process is not running"
rm -f "$WATCHDOG_PIDFILE"
fi
else
echo "Watchdog is not running"
fi
# Check daemon status
if is_running; then
local pid=$(cat "$PIDFILE")
echo "$NAME is running (PID: $pid)"
echo "$NAME daemon is running (PID: $pid)"
return 0
else
echo "$NAME is not running"
echo "$NAME daemon is not running"
return 3
fi
}
......
......@@ -17,21 +17,34 @@ set -e
case "$1" in
configure)
# Create wsssh user and group if they don't exist
if ! getent group wsssh >/dev/null 2>&1; then
addgroup --system wsssh
# Create wssshc user and group if they don't exist
if ! getent group wssshc >/dev/null 2>&1; then
addgroup --system wssshc
fi
if ! getent passwd wsssh >/dev/null 2>&1; then
adduser --system --ingroup wsssh --home /var/lib/wsssh \
--no-create-home --shell /bin/false wsssh
if ! getent passwd wssshc >/dev/null 2>&1; then
adduser --system --ingroup wssshc --home /var/lib/wssshc \
--no-create-home --shell /bin/false wssshc
fi
# Create home directory for wsssh user
if [ ! -d /var/lib/wsssh ]; then
mkdir -p /var/lib/wsssh
chown wsssh:wsssh /var/lib/wsssh
chmod 755 /var/lib/wsssh
# Create home directory for wssshc user
if [ ! -d /var/lib/wssshc ]; then
mkdir -p /var/lib/wssshc
chown wssshc:wssshc /var/lib/wssshc
chmod 755 /var/lib/wssshc
fi
# Create log directory
if [ ! -d /var/log/wssshc ]; then
mkdir -p /var/log/wssshc
chown wssshc:wssshc /var/log/wssshc
chmod 755 /var/log/wssshc
fi
# Install watchdog script
if [ -f /usr/sbin/wssshc-watchdog ]; then
chown wssshc:wssshc /usr/sbin/wssshc-watchdog
chmod 755 /usr/sbin/wssshc-watchdog
fi
# Create /etc/default/wssshc if it doesn't exist
......
#!/bin/bash
# WebSocket SSH Client Watchdog Script
# Copyright (C) 2024 Stefy Lanza <stefy@nexlab.net> and SexHack.me
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# Configuration
DAEMON_NAME="wssshc"
DAEMON_PATH="/usr/bin/wssshc"
PID_FILE="/var/run/wssshc.pid"
WATCHDOG_PID_FILE="/var/run/wssshc-watchdog.pid"
LOG_FILE="/var/log/wssshc/watchdog.log"
CHECK_INTERVAL=30
MAX_RESTARTS=5
RESTART_WINDOW=300 # 5 minutes
# Default configuration values (can be overridden by /etc/default/wssshc)
START=yes
DAEMON_ARGS=""
# Load configuration if available
if [ -f /etc/default/wssshc ]; then
. /etc/default/wssshc
fi
# Function to log messages
log_message() {
local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$timestamp] $*" >> "$LOG_FILE"
logger -t "$DAEMON_NAME-watchdog" "$*"
}
# Function to check if daemon is running
is_daemon_running() {
if [ -f "$PID_FILE" ]; then
local pid=$(cat "$PID_FILE")
if kill -0 "$pid" 2>/dev/null; then
return 0 # Running
else
log_message "PID file exists but process $pid is not running"
rm -f "$PID_FILE"
fi
fi
return 1 # Not running
}
# Function to start daemon
start_daemon() {
log_message "Starting $DAEMON_NAME daemon..."
# Create necessary directories
mkdir -p /var/log/wssshc
chown wssshc:wssshc /var/log/wssshc
# Start daemon as wssshc user
if [ -n "$DAEMON_ARGS" ]; then
start-stop-daemon --start --quiet --pidfile "$PID_FILE" \
--chuid wssshc:wssshc --background --make-pidfile \
--exec "$DAEMON_PATH" -- $DAEMON_ARGS
else
start-stop-daemon --start --quiet --pidfile "$PID_FILE" \
--chuid wssshc:wssshc --background --make-pidfile \
--exec "$DAEMON_PATH"
fi
local result=$?
if [ $result -eq 0 ]; then
log_message "$DAEMON_NAME started successfully"
return 0
else
log_message "Failed to start $DAEMON_NAME (exit code: $result)"
return 1
fi
}
# Function to stop daemon
stop_daemon() {
log_message "Stopping $DAEMON_NAME daemon..."
start-stop-daemon --stop --quiet --pidfile "$PID_FILE" --retry=TERM/30/KILL/5
local result=$?
rm -f "$PID_FILE"
return $result
}
# Function to check restart limits
check_restart_limits() {
local current_time=$(date +%s)
local restart_count=0
local window_start=$((current_time - RESTART_WINDOW))
# Count restarts in the last window
if [ -f "$LOG_FILE" ]; then
restart_count=$(grep " started successfully" "$LOG_FILE" | tail -n 100 | \
awk -v start="$window_start" '
BEGIN { count = 0 }
{
# Log lines begin with "[YYYY-MM-DD HH:MM:SS]"; take the 19 characters after "["
ts = substr($0, 2, 19)
if (ts ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/) {
cmd = "date -d \"" ts "\" +%s 2>/dev/null"
timestamp = 0
cmd | getline timestamp
close(cmd)
if (timestamp >= start) count++
}
}
END { print count }
')
fi
if [ "$restart_count" -ge "$MAX_RESTARTS" ]; then
log_message "Too many restarts ($restart_count) in $RESTART_WINDOW seconds. Stopping watchdog."
return 1
fi
return 0
}
# Function to cleanup on exit
cleanup() {
log_message "Watchdog shutting down..."
if [ -f "$WATCHDOG_PID_FILE" ]; then
rm -f "$WATCHDOG_PID_FILE"
fi
exit 0
}
# Trap signals
trap cleanup SIGTERM SIGINT
# Main watchdog loop
main() {
# Check if START is enabled
if [ "$START" != "yes" ]; then
log_message "START is not set to 'yes' in /etc/default/wssshc. Exiting."
exit 0
fi
# Store watchdog PID ($BASHPID, because main runs in a background subshell where $$ is still the parent's PID)
echo "$BASHPID" > "$WATCHDOG_PID_FILE"
log_message "Watchdog started for $DAEMON_NAME"
log_message "Check interval: $CHECK_INTERVAL seconds"
log_message "Max restarts: $MAX_RESTARTS per $RESTART_WINDOW seconds"
while true; do
if ! is_daemon_running; then
log_message "$DAEMON_NAME is not running"
# Check restart limits
if ! check_restart_limits; then
log_message "Restart limits exceeded. Watchdog will not restart $DAEMON_NAME."
break
fi
# Attempt to start daemon
if start_daemon; then
log_message "$DAEMON_NAME restarted successfully"
else
log_message "Failed to restart $DAEMON_NAME"
fi
fi
sleep "$CHECK_INTERVAL"
done
log_message "Watchdog exiting"
cleanup
}
# Handle command line arguments
case "$1" in
start)
if [ -f "$WATCHDOG_PID_FILE" ]; then
echo "Watchdog is already running"
exit 1
fi
main &
echo "Watchdog started"
;;
stop)
if [ -f "$WATCHDOG_PID_FILE" ]; then
kill "$(cat "$WATCHDOG_PID_FILE")" 2>/dev/null
rm -f "$WATCHDOG_PID_FILE"
echo "Watchdog stopped"
else
echo "Watchdog is not running"
fi
;;
status)
if [ -f "$WATCHDOG_PID_FILE" ] && kill -0 "$(cat "$WATCHDOG_PID_FILE")" 2>/dev/null; then
echo "Watchdog is running (PID: $(cat "$WATCHDOG_PID_FILE"))"
else
echo "Watchdog is not running"
rm -f "$WATCHDOG_PID_FILE" 2>/dev/null
fi
;;
restart)
$0 stop
sleep 2
$0 start
;;
*)
echo "Usage: $0 {start|stop|status|restart}"
exit 1
;;
esac
exit 0
\ No newline at end of file