Monitoring your validator

Tail your log file

Watch the latest entries in your log file for errors. The log moves fast, so you may need to stop the tail, read the output, and then restart it. Here are several ways to run it.

When the validator first starts up, it asks the entrypoint named in the start script, "What IP address do you see me at?" It then asks the entrypoint whether the ports it requires are open. To check whether you are reachable, search the startup sequence with the last tail option below, which greps 10 lines before and 50 lines after the search term. This search returns a lot of useful connection debugging info. See the screenshot below.

Run as xand user

CLI

CLI

CLI
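As a minimal sketch of the tailing options described above, assuming the validator writes to a single log file: the path `$HOME/validator.log` and the helper names `follow_log`, `follow_errors`, and `recent_errors` are placeholders for illustration and are not taken from this guide.

```shell
#!/bin/bash
# Assumption: placeholder log path; point LOG at your validator's actual log file.
LOG="${LOG:-$HOME/validator.log}"

# Follow new entries as they are written (ctrl+c to stop and read).
follow_log() {
  tail -f "$LOG"
}

# Follow only the lines that mention errors or warnings.
follow_errors() {
  tail -f "$LOG" | grep --line-buffered -iE 'error|warn'
}

# Print matching lines from the last N entries (default 200) and return,
# which is easier to read than a fast-moving live tail.
recent_errors() {
  tail -n "${1:-200}" "$LOG" | grep -iE 'error|warn'
}
```

Source the file and call one of the functions, for example `recent_errors 500`. The two `follow_` helpers run until interrupted.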


This will restart your validator then grep the log file for initial connection info:

CLI
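The context search (10 lines before, 50 lines after) can be sketched as follows. This is only an illustration: the log path and the helper name `startup_context` are hypothetical, and the search term is left for you to supply because it depends on what your validator logs during its startup port checks.

```shell
#!/bin/bash
# Assumptions: LOG is a placeholder path; the search term is supplied by you
# and should be a phrase that appears in the startup connection checks.
LOG="${LOG:-$HOME/validator.log}"

# Show 10 lines before and 50 lines after each match of the search term.
startup_context() {
  grep -B 10 -A 50 -- "$1" "$LOG"
}

# Usage, after restarting the validator:
#   startup_context "your search term"
```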

Correct ports are reachable




mon.sh

Create a monitor script to easily run the monitor command

Run as xand user

Create a blank file named mon.sh in your home dir with an editor.

CLI

nano
vim


Copy the code block into the file, correcting your ledger path if needed. Save and exit.

CLI
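A sketch of what such a script might contain, under stated assumptions: it uses the `monitor` subcommand of the upstream Solana validator binary, and both the binary name `solana-validator` and the ledger path `$HOME/ledger` are guesses that you should correct for your install. The helper name `run_monitor` is hypothetical.

```shell
#!/bin/bash
# Assumptions: binary name and ledger path follow upstream Solana tooling
# and a default home-dir layout; correct both for your machine.
VALIDATOR_BIN="${VALIDATOR_BIN:-solana-validator}"
LEDGER="${LEDGER:-$HOME/ledger}"

# Start the live monitor for the validator that owns this ledger.
run_monitor() {
  "$VALIDATOR_BIN" --ledger "$LEDGER" monitor
}

# Only run when the binary is actually on PATH.
if command -v "$VALIDATOR_BIN" >/dev/null 2>&1; then
  run_monitor
else
  echo "$VALIDATOR_BIN not found on PATH" >&2
fi
```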


Make the file executable

CLI


Run the monitor from home dir

CLI


NOTE: Press enter to drop a line to compare old values and press ctrl+c to exit the monitor command.



catchup.sh

Create a catchup script that compares your machine to the RPC that you are connected to.

Create a blank file named catchup.sh in your home dir with an editor.

CLI

nano
vim


Copy the code block into the file. Save and exit.

Note: If you use the ALT method for catchup because localhost is not working for you, be sure to get your validator identity pubkey with solana-keygen pubkey ~/validator-keypair.json



CLI
ALT
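The two variants can be sketched as below, assuming the upstream `solana catchup` command is available in your install. The helper name `run_catchup` and the `alt` argument are hypothetical; the keypair path is the one mentioned in the note above.

```shell
#!/bin/bash
# Assumption: the `solana` and `solana-keygen` CLIs are on PATH, as in
# upstream Solana tooling. KEYPAIR is the path used in the note above.
KEYPAIR="${KEYPAIR:-$HOME/validator-keypair.json}"

# Compare this validator's slot with the RPC it is configured to use.
#   run_catchup        uses the node on localhost
#   run_catchup alt    looks the node up by its identity pubkey instead
run_catchup() {
  if [ "${1:-local}" = "alt" ]; then
    solana catchup "$(solana-keygen pubkey "$KEYPAIR")"
  else
    solana catchup --our-localhost
  fi
}

# Only run when the CLI is actually on PATH.
if command -v solana >/dev/null 2>&1; then
  run_catchup "${1:-local}"
else
  echo "solana CLI not found on PATH" >&2
fi
```

With this layout, `./catchup.sh` would use the localhost method and `./catchup.sh alt` would fall back to the pubkey method.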


Make the file executable

CLI


Run the catchup script from home dir

CLI


NOTE: Press enter to drop a line to compare old values and press ctrl+c to exit the catchup command.



Watchtower

NOTE: xandeum-watchtower is an optional monitoring system that runs on a separate computer and alerts you in your own Discord server. It requires setup that is not shown in this guide.



Run as xand user without sudo



Watchtower should be run from a remote computer that is up 24/7. It works by asking the RPC node whether your validator passes all the sanity checks. It can be added as a service or run in a tmux window that never closes. You will need the software compiled to the point that xandeum -V works after a reboot. You will also need a Discord or Slack channel with a webhook. Telegram, PagerDuty, and Twilio are also supported. The example script checks every <interval> seconds and alerts Discord and Slack after <unhealthy-threshold> consecutive failures, i.e. after 900 seconds with a 300-second interval and a threshold of 3.

Multiple scripts can run with different --validator-identity values and feed the same alert channel. Use --name-suffix to identify which machine is failing.



NOTE: If our RPC node goes down or is unreachable from your location, you will get false positives reporting that your machine is down. To reduce them, you can add this flag if desired:

--ignore-http-bad-gateway

Ignore HTTP 502 Bad Gateway errors from the JSON RPC URL. This flag can help reduce false positives, at the expense of no alerting should a Bad Gateway error be a side effect of the real problem



nano
vim


Add text to file and modify for your needs:

text

CLI


Run the script! You may want to run it in a tmux session so it stays active when you close your terminal. Learn Tmux

CLI


The script shows that 3 errors in a row, checked at 300-second intervals, will trigger an alert to our Discord webhook.
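The time before an alert fires is the check interval multiplied by the failure threshold. A small sketch of that arithmetic, using the values from the example; the helper name `alert_delay` is hypothetical, and the mapping of these values to the watchtower's `--interval` and `--unhealthy-threshold` options is an assumption based on upstream solana-watchtower.

```shell
#!/bin/bash
# Seconds of continuous failure before an alert fires:
# check interval (seconds) multiplied by consecutive failures required.
alert_delay() {
  echo $(( $1 * $2 ))
}

INTERVAL=300            # seconds between checks, as in the example
UNHEALTHY_THRESHOLD=3   # consecutive failures before alerting

echo "Alert after about $(alert_delay "$INTERVAL" "$UNHEALTHY_THRESHOLD") seconds of failures"
```

A shorter interval or lower threshold alerts sooner but is more likely to fire on brief network hiccups.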



Next, consider Zabbix Installation