Monitoring your validator
Tail your log file

Watch the latest entries in your log file for errors. The log moves fast, so you may need to stop the tail, read, and then restart it. Here are several ways to run it.

When the validator first starts up, it asks the entrypoint in the start script, "What IP address do you see me at?" It then asks the entrypoint whether the required ports are open for it to use. You can search for this exchange during the startup sequence with the last tail option below, which greps 10 lines before and 50 lines after the search term so you can see whether you are reachable. There is a lot of good connection debugging info in this search (see screenshot below).

Run as the xand user:

    su xand
    cd

Follow the log (adjust the path if your start script logs elsewhere):

    tail -f ~/xand-validator.log

Follow the log and filter for specific terms:

    tail -f ~/xand-validator.log | grep "search terms"

This will restart your validator, then grep the log file for the initial connection info:

    sudo systemctl restart xand.service && tail -f ~/xand-validator.log | grep --color=always -B 10 -A 50 "Checking that tcp ports"

mon.sh

Create a monitor script to easily run the monitor command. Run as the xand user.

Create a blank file named mon.sh in your home dir with an editor:

    cd

nano:

    nano mon.sh

vim:

    vim mon.sh

Copy the following line into the file, correcting your ledger path if needed. Save and exit.

    xandeum-validator --ledger ~/ledger monitor

Make the file executable:

    chmod +x mon.sh

Run the monitor from your home dir:

    ./mon.sh

Note: Press Enter to drop a line so you can compare old values, and press Ctrl+C to exit the monitor command.

catchup.sh

Create a catchup script that compares your machine to the RPC that you are connected to.

Create a blank file named catchup.sh in your home dir with an editor:

    cd

nano:

    nano catchup.sh

vim:

    vim catchup.sh

Copy one of the following commands into the file. Save and exit.

Note: If you use the alt method for catchup because localhost is not working for you, be sure to grab your validator ID pubkey using solana-keygen pubkey ~/validator-keypair.json

    xandeum catchup -k ~/validator-keypair.json --our-localhost --follow --verbose

Alt:

    xandeum catchup --url https://api.devnet.xandeum.com:8899 <validator-id-pubkey> --follow --verbose

Make the file executable:

    chmod +x catchup.sh

Run the catchup script from your home dir:

    ./catchup.sh

Note: Press Enter to drop a line so you can compare old values, and press Ctrl+C to exit the catchup command.

Watchtower

Note: Xandeum watchtower is an optional monitoring system that runs on a separate computer and alerts you in your own personal Discord. It requires setup that is not shown in this guide.

Run as the xand user without sudo.

Watchtower should be run from a remote computer that is running 24/7. It works by asking the RPC node whether your validator passes all the sanity checks. It can be added as a service or run in a tmux window that never closes. You will need the software compiled to the point that xandeum -V works after a reboot.

You will need to create a Discord or Slack channel with a webhook to make this work. Telegram, PagerDuty, and Twilio are also supported.

This example script checks every <interval> seconds and alerts to Discord and Slack if <unhealthy-threshold> failures occur in a row. With the values used in the script below (an interval of 300 seconds and a threshold of 3), that is about 900 seconds. Multiple scripts can run with different validator identities and feed the same alert channel; use the name suffix to uniquely identify which machine is failing.

Note: If our RPC node goes down or is unreachable from your location, you will get false positives that your machine is down.

This flag can be added if desired:

    --ignore-http-bad-gateway

It ignores HTTP 502 Bad Gateway errors from the JSON RPC URL. This flag can help reduce false positives, at the expense of no alerting should a Bad Gateway error be a side effect of the real problem.

Create the script file.

nano:

    nano watchtower-alerts.sh

vim:

    vim watchtower-alerts.sh

Add the following text to the file and modify it for your needs. The shebang line must be the first line of the file, and the last option must not end with a backslash.

    #!/bin/sh
    # Webhooks that watchtower posts alerts to; replace with your own.
    export DISCORD_WEBHOOK=https://discord.com/api/webhooks/xxxxxxx/yyyyyyyyyy
    export SLACK_WEBHOOK=https://hooks.slack.com/services/xxxxxxxxx/yyyyyyyy/zzzzzz
    # Check every 300 seconds; alert after 3 consecutive failures.
    exec xandeum-watchtower \
      --url https://api.devnet.xandeum.com:8899/ \
      --validator-identity <validator-id> \
      --name-suffix <hostname> \
      --interval 300 \
      --unhealthy-threshold 3 \
      --minimum-validator-identity-balance 3

Make the file executable:

    chmod +x watchtower-alerts.sh

Run the script! You may want this running in a tmux session so it stays active when you close your terminal (see Learn tmux).

    ./watchtower-alerts.sh

From the script, we can see that 3 errors in a row, checked at 300-second intervals, will trigger the alert to be sent to our Discord webhook.

Consider next: Zabbix installation
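
To make the watchtower alert timing above more concrete, here is a minimal shell sketch of the "failures in a row" arithmetic. It does not call xandeum-watchtower and is not taken from it; alert_delay_seconds and first_alert_check are hypothetical helper names introduced only for illustration, and the ok/fail sequence is made-up sample input.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of xandeum-watchtower.

# Approximate upper bound, in seconds, between a problem starting and the
# alert firing: interval multiplied by the consecutive-failure threshold.
alert_delay_seconds() {
    interval="$1"
    threshold="$2"
    echo $((interval * threshold))
}

# Given a threshold and a sequence of check results (ok or fail), print the
# 1-based number of the check on which an alert would fire, or 0 if the
# threshold of consecutive failures is never reached.
first_alert_check() {
    threshold="$1"
    shift
    streak=0
    n=0
    for result in "$@"; do
        n=$((n + 1))
        if [ "$result" = "fail" ]; then
            streak=$((streak + 1))
        else
            streak=0    # any healthy check resets the run of failures
        fi
        if [ "$streak" -ge "$threshold" ]; then
            echo "$n"
            return 0
        fi
    done
    echo 0
}

alert_delay_seconds 300 3                          # prints 900
first_alert_check 3 fail fail ok fail fail fail    # prints 6
```

The second call shows why an intermittent failure does not alert: the healthy check in the third position resets the count, so the alert only fires on the sixth check. A lower threshold or shorter interval alerts sooner but is more exposed to the false positives mentioned above.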