Node analysis

StakePool's architecture is built around data collection and visualization applications, plus alert generation to reduce reaction time when a blockchain validator machine fails.

These applications fall into two categories: applications running on a centralized server and applications running on a validator machine.

• Applications Running on a Centralized Server

◦ Reverse Proxy Server (NGINX) - Centralizes the receipt of HTTPS requests on a domain exposed to the web and distributes them to the applications present on the server. It uses Let's Encrypt and certbot to provide a valid certificate for the application.

◦ Grafana Server - Application that centralizes visualization of the data collected via Prometheus and Loki.

▪ Login via GitHub - Provides integration for using GitHub account credentials without having to manually register users in Grafana.

▪ Alert Rule Creation - Allows creating query-based alert rules.

▪ Log Centralization - Allows viewing logs from various sources in a single application (Integration with Loki).

◦ Loki - Centralizes logs in a single application and makes them easy to retrieve by running LogQL queries.

◦ Alertmanager - Alert management via configuration files and integration with Grafana.

◦ Prometheus - Application that collects and stores metrics as time series data: metric values are stored with the timestamp at which they were recorded, alongside optional key-value pairs called labels. It provides flexible queries through the PromQL language.

◦ Collect Metrics Cosmos Server - HTTP server application that receives data from the “Collect Metrics Cosmos Client” application, persists the received data in a local SQLite3 database and exposes it on the /metrics endpoint to be scraped by Prometheus.

◦ PM2 - Application for managing processes in a production environment.

◦ Alert Discord and Slack API - Allows generating customized alerts for the Discord and Slack platforms, integrated with Alertmanager.

• Applications Running on a Validator Machine

◦ Collect Metrics Cosmos Client - Application that extracts missed-block information, converts it into a JSON object and sends it to the “Collect Metrics Cosmos Server” application.

◦ Promtail - Application that collects logs from the machine and sends them to the Loki application.

◦ Node Exporter - Provides machine metrics on the /metrics endpoint for Prometheus collection.
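To make the client/server metrics flow above concrete, here is a minimal sketch of the server side's storage-and-exposition idea: persisting received missed-block data in SQLite3 and rendering it in the Prometheus text exposition format. The table, metric and label names here are illustrative assumptions, not the real application's schema.

```python
# Sketch of the "Collect Metrics Cosmos Server" idea: store received
# missed-block counts in SQLite3 and render them in the Prometheus
# text exposition format for a /metrics endpoint.
# Table/metric/label names are assumptions for illustration only.
import sqlite3

def store_missed_blocks(conn, validator, missed):
    # Upsert the latest missed-block count for a validator
    conn.execute(
        "CREATE TABLE IF NOT EXISTS missed_blocks "
        "(validator TEXT PRIMARY KEY, missed INTEGER)"
    )
    conn.execute(
        "INSERT INTO missed_blocks (validator, missed) VALUES (?, ?) "
        "ON CONFLICT(validator) DO UPDATE SET missed = excluded.missed",
        (validator, missed),
    )
    conn.commit()

def render_metrics(conn):
    # Build the text body Prometheus would scrape from /metrics
    lines = [
        "# HELP cosmos_missed_blocks Missed blocks per validator",
        "# TYPE cosmos_missed_blocks gauge",
    ]
    for validator, missed in conn.execute(
        "SELECT validator, missed FROM missed_blocks ORDER BY validator"
    ):
        lines.append('cosmos_missed_blocks{validator="%s"} %d' % (validator, missed))
    return "\n".join(lines) + "\n"

conn = sqlite3.connect(":memory:")
store_missed_blocks(conn, "validator-1", 3)
print(render_metrics(conn))
```

In the real setup, a scrape of this endpoint by Prometheus is what makes the data available to Grafana panels and alert rules.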

7 - Create the service file

sudo vi /etc/systemd/system/promtail.service

Put the following content in the file and save it:

[Unit]
Description=Promtail Service
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/promtail/promtail-linux-amd64 -config.file=/usr/local/bin/promtail/promtail-local-config.yaml

[Install]
WantedBy=multi-user.target
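The unit file above points to /usr/local/bin/promtail/promtail-local-config.yaml. A minimal configuration in the shape of Grafana's example config might look like the following; the Loki address (monitoring-server:3100) is an assumption and must be replaced with your centralized server's host name:

```yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  # Loki endpoint on the centralized server (replace the host name)
  - url: http://monitoring-server:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*log
```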

8 - Reload daemons

sudo systemctl daemon-reload

9 - Start promtail

sudo systemctl start promtail

10 - Check that promtail is really running

sudo systemctl status promtail

11 - Enable the service at boot

sudo systemctl enable promtail

Loki query-based dashboards

1 - In the "Dashboards" menu, click "Browse"

2 - Click the "New Dashboard" button or click an existing dashboard

3 - Click the "Add new panel" button

4 - Select the Loki datasource

5 - Define the content filter in the Loki query
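As an example, assuming Promtail ships logs with a job="varlogs" label (as in Grafana's example Promtail config), a Loki query filtering error lines, and a metric form of it that can be graphed in the panel, might look like:

```logql
{job="varlogs"} |= "error"

count_over_time({job="varlogs"} |= "error" [5m])
```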

Grafana Alerts

1 - Access the Alerting menu and then click Admin

2 - Click the "Add Alertmanager" button

3 - Set the URL where your external Alertmanager instance is installed and click the "Add alertmanagers" button

4 - Access the panel inside the dashboard to which you want to add the alert, right-click the panel and then click "Edit"

5 - Click the "Alert" tab and then the "Create alert rule from this panel" button to create an alert with a direct link to this panel.

6 - Define the alert name, the folder, the function, the query to use and the threshold that will trigger the alert

7 - Set the "evaluate every" and "for" fields, add a custom label if necessary, and then click "Save and exit"
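For validator machines, a common starting point is an alert on the `up` metric, which Prometheus sets to 0 when a scrape target stops responding. Using the job name from the Prometheus configuration shown later in this document:

```promql
# Fires for any target in the job that stops responding to scrapes
up{job="Validators OR Sentry"} == 0
```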

Install Nginx and enable HTTPS connections with Let's Encrypt

Install Nginx:

sudo apt install nginx

Enable Nginx at boot:

sudo systemctl enable nginx

Install Certbot (snap version):

sudo snap install core; sudo snap refresh core
sudo snap install --classic certbot
sudo ln -s /snap/bin/certbot /usr/bin/certbot
sudo certbot --nginx
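Certbot sets up automatic certificate renewal; the renewal process can be verified with a dry run:

```shell
sudo certbot renew --dry-run
```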

Configure /etc/nginx/sites-enabled/default :

location / {
                # Proxy all requests to Grafana. The default try_files
                # directive must be removed here, otherwise requests
                # would return 404 before ever reaching the proxy.
                proxy_pass http://localhost:3000;
        }


        location /api/live {
                proxy_http_version 1.1;
                proxy_set_header Upgrade $http_upgrade;
                proxy_set_header Connection "Upgrade";
                proxy_set_header Host $http_host;
                proxy_pass http://localhost:3000/;
        }

        location /grafana/ {
                proxy_pass         http://localhost:3000/;
proxy_set_header   Host $host;
                proxy_max_temp_file_size 0;
        }

Configure /etc/grafana/grafana.ini :

[server]

root_url = %(protocol)s://%(domain)s/grafana/
serve_from_sub_path = true

Restart nginx and grafana-server services :

sudo systemctl restart grafana-server
sudo systemctl restart nginx

Configuring names in hosts file

1 - Edit /etc/hosts file

Define the names that will be used for each host and add them to the /etc/hosts file. Note that the IP address comes first, followed by the host name, as in the following example:

sudo nano /etc/hosts

10.0.0.1 nameofhost1
10.0.0.2 nameofhost2
10.0.0.3 nameofhost3

2 - Test the configuration with the ping command

ping nameofhost1

Something like this will appear:

PING nameofhost1 (10.0.0.1) 56(84) bytes of data.
64 bytes from nameofhost1 (10.0.0.1): icmp_seq=1 ttl=64 time=0.012 ms
64 bytes from nameofhost1 (10.0.0.1): icmp_seq=2 ttl=64 time=0.042 ms

If the IP address returned by the command matches the name you put in the /etc/hosts file, the configuration is correct.

Now, in the Prometheus configuration, you can use the host name instead of the IP address:

  - job_name: "Validators OR Sentry"
    static_configs:
      - targets: ["nameofhost1:9100","nameofhost2:9100","nameofhost3:9100"]

GitHub OAuth2 Authentication

1 - Register your application with GitHub

To enable GitHub OAuth2 you must register your application with GitHub. GitHub will generate a client ID and a secret key for you to use. When you create the application you will need to specify a callback URL. Specify it as:

https://myrealdomain.com/grafana/login/github

Access your organization

Access the Settings of your organization

Access the Developer settings section and select "OAuth Apps"

Click "New OAuth App"

Define the OAuth app info

Get the Client ID and generate a new client secret

Store your Client ID and Client secret for the Grafana configuration

2 - Set root_url in grafana.ini file

sudo nano /etc/grafana/grafana.ini
[server]
root_url = https://myrealdomain.com/grafana
serve_from_sub_path = true

3 - Enable GitHub in Grafana in grafana.ini file

sudo nano /etc/grafana/grafana.ini
[auth.github]
enabled = true
allow_sign_up = true
client_id = YOUR_GITHUB_APP_CLIENT_ID
client_secret = YOUR_GITHUB_APP_CLIENT_SECRET
scopes = user:email,read:org
auth_url = https://github.com/login/oauth/authorize
token_url = https://github.com/login/oauth/access_token
api_url = https://api.github.com/user
# space-delimited organization names
allowed_organizations = organization1 organization2

4 - Restart grafana-server service

sudo systemctl restart grafana-server

5 - After logging out, a GitHub login option appears on the login screen

Conclusion: real-time log analysis and monitoring are an effective way to keep validator uptime high.
