
THORChain Bare-Metal Validator  —  WireGuard Monitoring

D5 Sammy

2023-09-08 — 3 min read

    Node Setup

With the intent of increasing the reliability of the VPN tunnels used by THORChain Validators, here is a solution for monitoring hanging WireGuard connections.

Sometimes a WireGuard connection hangs, stays unresponsive, and needs a restart to come back to life. When a VPN connection hangs, the Validator becomes unreachable via its public IP, which results in the RPC and BRF columns showing as BAD in dashboards.

I haven't been able to find the root cause yet; maybe WireGuard doesn't like having multiple interfaces up at once.

I have created a simple script that monitors the network connections every 10 minutes and restarts them when required.

Create Script

sudo nano /root/wgcycle.sh

Copy the following, replacing the list of IPs and interfaces accordingly:

#!/bin/bash
# This script monitors WireGuard network connections and
# restarts the service when a connection is hanging. By D5Sammy.

declare -A Tunnels    # associative array: peer IP -> interface name
declare -a Restarts   # interfaces already restarted during this run

# Define Network Connections Here
# With the IP at the other end of the WireGuard Tunnel
# and the Name of the WireGuard Interface:

Tunnels[10.10.1.1]=wg1
Tunnels[10.10.2.1]=wg2
Tunnels[10.10.3.1]=wg3
Tunnels[10.10.4.1]=wg4

echo "Checking list of ${#Tunnels[*]} tunnels."

for ip in "${!Tunnels[@]}"
do
    inf=${Tunnels[$ip]}
    echo -n "-> Ping $ip on $inf... "

    # One probe with a 5-second timeout, so a dead tunnel cannot stall the run
    if ping -c 1 -W 5 "$ip" > /dev/null 2>&1; then
        echo "Available!"
    else
        echo -n "Not available... "

        # Restart each interface at most once per run
        if [[ " ${Restarts[*]} " =~ " ${inf} " ]]; then
            echo "Interface $inf was already restarted!"
        else
            echo -n "Restarting $inf... "
            Restarts+=("$inf")
            systemctl restart "wg-quick@$inf"
            echo "Done!"
        fi
    fi
done

echo "Run Completed, ${#Restarts[*]} Restarts Required (${Restarts[*]})."
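
A single lost ping is enough for the script above to restart a tunnel. If that turns out to be too aggressive, one option is to require several failed probes in a row before declaring a tunnel down. The sketch below is only an illustration: the function name tunnel_is_down, the PING_CMD variable, and the default of 3 attempts are assumptions, not part of the original script.

```shell
# Hypothetical helper: report a tunnel as down only after several failed probes.
# PING_CMD is an assumption added so the probe command can be swapped out.
PING_CMD=${PING_CMD:-ping}

# Usage: tunnel_is_down <ip> [attempts]
# Returns 0 (true) if every probe failed, 1 as soon as one probe succeeds.
tunnel_is_down() {
    local ip="$1"
    local attempts="${2:-3}"
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$PING_CMD" -c 1 -W 2 "$ip" > /dev/null 2>&1; then
            return 1    # got a reply, the tunnel is alive
        fi
        i=$((i + 1))
    done
    return 0            # no reply to any probe
}
```

Inside the loop, a test such as `if tunnel_is_down "$ip"; then` could take the place of the plain ping check, at the cost of a slightly longer run when a tunnel really is down.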

Make Script Executable

sudo chmod +x /root/wgcycle.sh

Try Script

sudo /root/wgcycle.sh

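The output should look like this (the order of the lines may vary, since Bash does not guarantee the iteration order of associative arrays):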

Checking list of 4 tunnels.
-> Ping 10.10.1.1 on wg1... Available!
-> Ping 10.10.2.1 on wg2... Available!
-> Ping 10.10.3.1 on wg3... Not available... Restarting wg3... Done!
-> Ping 10.10.4.1 on wg4... Available!
Run Completed, 1 Restarts Required (wg3).
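
A failed ping does not say whether the tunnel itself is hung or the remote host is simply down. WireGuard keeps a last-handshake timestamp per peer, which `wg show <interface> latest-handshakes` prints as a public key followed by a Unix timestamp (0 if no handshake has happened yet). The sketch below lists peers whose handshake is older than a threshold. The function name stale_peers and the 180-second default are assumptions chosen for illustration, and the check is only meaningful while traffic or keepalives are flowing through the tunnel.

```shell
# Hypothetical helper: list peers whose last handshake is too old.
# Reads "<public-key> <unix-timestamp>" lines on stdin, as printed by
# `wg show <interface> latest-handshakes`.
# Usage: stale_peers <now-epoch> [max-age-seconds]
stale_peers() {
    local now="$1"
    local max_age="${2:-180}"    # assumed threshold, adjust as needed
    local key ts
    while read -r key ts; do
        # Skip blank or incomplete lines
        [ -n "$key" ] && [ -n "$ts" ] || continue
        # A timestamp of 0 means the peer never completed a handshake
        if [ "$ts" -eq 0 ] || [ $((now - ts)) -gt "$max_age" ]; then
            echo "$key"
        fi
    done
}

# Example (needs root and a configured interface):
#   wg show wg3 latest-handshakes | stale_peers "$(date +%s)"
```

If a tunnel fails the ping but its handshake is recent, the problem is more likely on the remote host than in WireGuard itself, which may help narrow down the root cause.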

Create Service

sudo nano /etc/systemd/system/wgcycle.service

Copy the following:

[Unit]
Description=WireGuard Connection Restarting Service
After=network.target

[Service]
Type=oneshot
User=root
ExecStart=/root/wgcycle.sh

[Install]
WantedBy=multi-user.target

Create Timer

sudo nano /etc/systemd/system/wgcycle.timer

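Copy the following (the OnCalendar expression fires every 10 minutes, at minutes 0, 10, 20, 30, 40, and 50 of each hour):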

[Unit]
Description=WireGuard Connection Restarting Timer

[Timer]
OnCalendar=*:0/10:0

[Install]
WantedBy=timers.target

Operate Service

Starting the Timer

sudo systemctl enable wgcycle.timer
sudo systemctl start wgcycle.timer

Monitoring the Service status

sudo systemctl status wgcycle.timer wgcycle.service

Display Service Logs

journalctl -u wgcycle.service -n 100 --no-pager
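
Once the timer has been running for a while, the logs can show which tunnel hangs most often. The sketch below counts restarts per interface by looking for the "Restarting <interface>..." text that the script prints. The function name count_restarts is an assumption, and the parsing depends on the script's message format staying as shown earlier.

```shell
# Hypothetical helper: count restarts per interface from wgcycle log lines.
# Reads log lines on stdin and prints "<interface> <count>", sorted by name.
count_restarts() {
    # Keep only the interface name from "Restarting <iface>..." messages.
    # The three literal dots avoid matching the unit description
    # "WireGuard Connection Restarting Service." that systemd also logs.
    sed -n 's/.*Restarting \([^ .]*\)\.\.\..*/\1/p' |
        awk '{ n[$1]++ } END { for (i in n) print i, n[i] }' |
        sort
}

# Example:
#   journalctl -u wgcycle.service --no-pager | count_restarts
```

An interface that shows up far more often than the others is a good place to start looking for the underlying problem.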

Impact of Network Interface change on MicroK8s

By default, MicroK8s monitors for any change of network interface and restarts itself whenever it detects an IP change, in order to refresh its certificates. This causes all Kube pods to go into the Unknown state, which makes our Validator unresponsive for a few minutes and can lead to corruption of a chain daemon. A WireGuard restart triggers this because it momentarily removes and re-adds an IP on the host.

Configure MicroK8s not to refresh its certificates, so that it doesn't restart every time a WireGuard interface is restarted.

sudo touch /var/snap/microk8s/current/var/lock/no-cert-reissue
kubectl scale deployments --replicas=0 --all -n c0
sudo microk8s stop
sudo microk8s start
kubectl scale deployments --replicas=1 --all -n c0
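
Since restarting a WireGuard interface without this lock file in place can take the whole node down for a few minutes, it may be worth checking for the file before any restart. The sketch below is a small guard under that assumption. The function name cert_reissue_disabled is hypothetical, and the default path is the one used in the touch command above.

```shell
# Hypothetical guard: succeed only if MicroK8s certificate reissue is disabled.
# The default path is the lock file created with `touch` above;
# pass another path as the first argument to override it.
cert_reissue_disabled() {
    local lock="${1:-/var/snap/microk8s/current/var/lock/no-cert-reissue}"
    [ -e "$lock" ]
}

# Example use before restarting an interface:
#   if ! cert_reissue_disabled; then
#       echo "Warning: no-cert-reissue lock is missing, MicroK8s may restart."
#   fi
```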

Hope this helps improve node connectivity.
