Node analysis
StakePool's architecture consists of applications that collect and visualize data and that generate alerts, with the goal of reducing reaction time when a blockchain validator machine fails.
These applications fall into two categories: applications running on a centralized server and applications running on a validator machine.
• Applications running on a centralized server
◦ Reverse Proxy Server (NGINX) - Receives HTTPS requests on a domain exposed to the web and distributes them to the applications running on the server. It uses Let's Encrypt and certbot to provide a valid TLS certificate.
◦ Grafana Server - Application that centralizes the visualization of data collected via Prometheus and Loki.
▪ Login via GitHub - Lets users sign in with their GitHub account credentials, so users do not have to be registered manually in Grafana.
▪ Creating Alert Rules - Allows creating query-based alert rules.
▪ Log Centralization - Allows viewing logs from various sources in a single application (integration with Loki).
◦ Loki - Centralizes logs in a single application and makes them easier to retrieve through LogQL queries.
◦ Alertmanager - Manages alerts through configuration files and integrates with Grafana.
◦ Prometheus - Application that collects and stores metrics as time series data. Each metric is stored with the timestamp at which it was recorded, along with optional key-value pairs called labels. It provides flexible queries through the PromQL language.
◦ Collect Metrics Cosmos Server - HTTP server application that receives data from the “Collect Metrics Cosmos Client” application, persists it in a local SQLite3 database, and exposes it at the /metrics endpoint to be collected by Prometheus.
◦ PM2 - Application for managing processes in a production environment.
◦ Alert Discord and Slack API - Generates customized alerts for the Discord and Slack platforms, integrated with Alertmanager.
• Applications running on a validator machine
◦ Collect Metrics Cosmos Client - Application that extracts information about lost blocks, converts it into a JSON object, and sends it to the “Collect Metrics Cosmos Server” application.
◦ Promtail - Application that collects logs from the machine and sends them to Loki.
◦ Node Exporter - Provides machine metrics at the /metrics endpoint for Prometheus to collect.
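The flow described for the “Collect Metrics Cosmos Server” (receive a JSON report, persist it in SQLite3, expose it at /metrics) can be sketched roughly as follows. This is a minimal illustration, not the actual StakePool implementation: the table name `missed_blocks`, the JSON fields `validator` and `missed_blocks`, and the metric name `cosmos_missed_blocks` are all assumptions, and an in-memory database stands in for the local SQLite3 file. Only the Prometheus text exposition format itself is standard.

```python
import json
import sqlite3


def create_store() -> sqlite3.Connection:
    """Open an in-memory SQLite database (a real service would use a file)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE missed_blocks ("
        "validator TEXT PRIMARY KEY, missed INTEGER NOT NULL)"
    )
    return conn


def ingest(conn: sqlite3.Connection, payload: str) -> None:
    """Persist one JSON report; the newest report per validator wins."""
    report = json.loads(payload)
    conn.execute(
        "INSERT OR REPLACE INTO missed_blocks (validator, missed) VALUES (?, ?)",
        (report["validator"], int(report["missed_blocks"])),
    )
    conn.commit()


def render_metrics(conn: sqlite3.Connection) -> str:
    """Render the stored rows in the Prometheus text exposition format."""
    lines = [
        "# HELP cosmos_missed_blocks Missed blocks reported per validator.",
        "# TYPE cosmos_missed_blocks gauge",
    ]
    rows = conn.execute(
        "SELECT validator, missed FROM missed_blocks ORDER BY validator"
    )
    for validator, missed in rows:
        lines.append(f'cosmos_missed_blocks{{validator="{validator}"}} {missed}')
    return "\n".join(lines) + "\n"


# Hypothetical reports, as a client on a validator machine might send them.
conn = create_store()
ingest(conn, json.dumps({"validator": "validator-01", "missed_blocks": 3}))
ingest(conn, json.dumps({"validator": "validator-02", "missed_blocks": 0}))
print(render_metrics(conn))
```

In a real deployment the string returned by `render_metrics` would be served by the HTTP handler for /metrics, and Prometheus would scrape it on its usual interval.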
7 - Create service file
Put this content in the file and save
8 - Reload daemons
9 - Start the promtail service
10 - Check that promtail is running
11 - Enable the service at boot
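As an illustration of steps 7 to 11, the sketch below renders a possible systemd unit file and prints the matching `systemctl` commands. The original service file content is not reproduced here, so everything in the unit is an assumption: the binary path `/usr/local/bin/promtail`, the config path `/etc/promtail/config.yml`, and the unit name `promtail` are hypothetical and should be adjusted to the actual installation.

```python
from string import Template

# Hypothetical unit file; paths and options are assumptions, not the original content.
UNIT_TEMPLATE = Template("""\
[Unit]
Description=Promtail log shipper
After=network.target

[Service]
ExecStart=$binary -config.file=$config
Restart=on-failure

[Install]
WantedBy=multi-user.target
""")


def render_unit(binary: str, config: str) -> str:
    """Fill in the promtail binary and config paths."""
    return UNIT_TEMPLATE.substitute(binary=binary, config=config)


# systemctl commands corresponding to steps 8 to 11, in order.
STEPS = [
    ["systemctl", "daemon-reload"],       # 8 - reload daemons
    ["systemctl", "start", "promtail"],   # 9 - start the service
    ["systemctl", "status", "promtail"],  # 10 - check that it is running
    ["systemctl", "enable", "promtail"],  # 11 - enable at boot
]

unit_text = render_unit("/usr/local/bin/promtail", "/etc/promtail/config.yml")
print(unit_text)
for command in STEPS:
    print("sudo " + " ".join(command))
```

The script only prints the unit text and the commands; it does not write to /etc/systemd/system or run anything, so it is safe to try on any machine.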
Loki query-based dashboards
1 - In the "Dashboards" menu, click "Browse"
2 - Click the "New Dashboard" button or open an existing dashboard
3 - Click the "Add new panel" button
4 - Select the Loki datasource
5 - Define the content filter in the Loki query
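To make step 5 more concrete, here is a small sketch of how a LogQL log query is put together: a stream selector in braces, optionally followed by a `|=` line filter that keeps only lines containing a given text. The label names and values (`job`, `host`, `varlogs`, `validator-01`) are hypothetical; the real ones depend on the labels that Promtail attaches to the logs. The helper does not escape quotes inside values, which a production version would need to do.

```python
def logql_query(labels: dict[str, str], contains: str = "") -> str:
    """Build a LogQL log query: a stream selector plus an optional line filter."""
    selector = ", ".join(
        f'{name}="{value}"' for name, value in sorted(labels.items())
    )
    query = "{" + selector + "}"
    if contains:
        # `|=` keeps only log lines that contain the given text.
        query += f' |= "{contains}"'
    return query


# Hypothetical labels; use the ones configured in Promtail.
print(logql_query({"job": "varlogs", "host": "validator-01"}, contains="error"))
# → {host="validator-01", job="varlogs"} |= "error"
```

The resulting string can be pasted into the query field of the panel once the Loki datasource is selected.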
Grafana Alerts
1 - Access the Alert menu and then click Admin
2 - Click the "Add Alertmanager" button
3 - Set the URL where your external alertmanager instance is installed and click on the "Add alertmanagers" button
4 - Open the dashboard panel to which you want to add the alert, right-click on the panel, and then click "Edit"
5 - Click the "Alert" tab and then the "Create alert rule from this panel" button to create an alert with a direct link to this panel
6 - Define the name of the alert, the folder, the function, the query, and the limit that will be used for the alert
7 - Set the "evaluate every" and "for" fields, add a custom label if necessary, and then click "Save and exit"
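The interaction between "evaluate every" and "for" in step 7 is easy to misread, so here is a simplified model of it. The rule is evaluated once per "evaluate every" interval, and it only starts firing after the condition has held continuously for at least the "for" duration; until then it is pending. This is a sketch of that behavior under the assumption of a plain "value above limit" condition, not Grafana's actual implementation, and the function and parameter names are made up for the example.

```python
def alert_states(samples, threshold, evaluate_every, pending_for):
    """Return the alert state after each evaluation.

    samples: one query result per evaluation, spaced `evaluate_every` seconds apart.
    pending_for: the "for" duration in seconds.
    """
    states = []
    breached_for = None  # seconds the condition has held so far; None if not breached
    for value in samples:
        if value > threshold:
            breached_for = 0 if breached_for is None else breached_for + evaluate_every
            states.append("Firing" if breached_for >= pending_for else "Pending")
        else:
            breached_for = None  # any normal evaluation resets the timer
            states.append("Normal")
    return states


# Evaluate every 60 s, fire only after the limit is exceeded for 120 s.
print(alert_states([0, 5, 5, 5, 0, 5], threshold=3, evaluate_every=60, pending_for=120))
# → ['Normal', 'Pending', 'Pending', 'Firing', 'Normal', 'Pending']
```

A longer "for" value filters out short spikes at the cost of later notifications; a value of 0 makes the alert fire on the first evaluation that exceeds the limit.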
Install Nginx and enable HTTPS connections with Let's Encrypt
Enable Nginx at boot:
Install the snap version of Certbot:
Configure /etc/nginx/sites-enabled/default:
Configure /etc/grafana/grafana.ini:
Restart the nginx and grafana-server services:
Configuring names in hosts file
1 - Edit the /etc/hosts file
Define the name that will be used for each host and add it to the /etc/hosts file, as in the following example:
2 - Test the configuration with the ping command
Something like this will appear:
If the IP address returned by the command matches the one you assigned to that name in the /etc/hosts file, the configuration is correct.
Now you can use the host name instead of the IP address in the Prometheus configuration.
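The check performed with ping can also be pictured as a lookup in the hosts file: each line maps an IP address to one or more names. The sketch below parses /etc/hosts-style text and resolves a name. The host names and the addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical placeholders, not values from the original setup.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each host name in /etc/hosts-style text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)  # the first entry for a name wins
    return table


# Hypothetical entries; replace them with your own hosts.
SAMPLE = """\
127.0.0.1   localhost
192.0.2.10  validator-01   # hypothetical validator machine
192.0.2.20  monitoring-server
"""

hosts = parse_hosts(SAMPLE)
print(hosts["validator-01"])
# → 192.0.2.10
```

With such an entry in place, a Prometheus scrape target could be written with the name instead of the address, for example `validator-01:9100` if Node Exporter listens on its default port.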
GitHub OAuth2 Authentication
1 - Register your application with GitHub
To enable GitHub OAuth2, you must register your application with GitHub. GitHub generates a client ID and a client secret for you to use. When you create the application, you need to specify a callback URL. Specify this as the callback:
Open your organization
Open the Settings of your organization
Open the Developer section and select OAuth Apps
Click "New OAuth App"
Fill in the OAuth app information
Copy the Client ID and generate a new client secret
Store your Client ID and client secret for the Grafana configuration
2 - Set root_url in grafana.ini file
3 - Enable GitHub in Grafana in grafana.ini file
4 - Restart grafana-server service
5 - After logging out, the login page shows an option to sign in with GitHub
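As a rough illustration of steps 2 and 3, the sketch below renders the relevant grafana.ini sections with Python's `configparser`. The keys under `[auth.github]` follow Grafana's documented GitHub OAuth2 settings, but the domain `grafana.example.com` and the credential placeholders are assumptions to be replaced with your own values. The helper `callback_url` reflects the callback path Grafana documents for GitHub (`/login/github` appended to the root URL); verify it against the documentation for your Grafana version.

```python
import configparser
import io


def github_oauth_settings(root_url: str, client_id: str, client_secret: str) -> str:
    """Render the [server] and [auth.github] sections of grafana.ini."""
    config = configparser.ConfigParser(interpolation=None)
    config["server"] = {"root_url": root_url}
    config["auth.github"] = {
        "enabled": "true",
        "allow_sign_up": "true",
        "client_id": client_id,
        "client_secret": client_secret,
        "scopes": "user:email,read:org",
        "auth_url": "https://github.com/login/oauth/authorize",
        "token_url": "https://github.com/login/oauth/access_token",
        "api_url": "https://api.github.com/user",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


def callback_url(root_url: str) -> str:
    """Callback URL to register with GitHub for a given Grafana root URL."""
    return root_url.rstrip("/") + "/login/github"


# Placeholder values; never commit a real client secret.
ini_text = github_oauth_settings(
    "https://grafana.example.com", "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET"
)
print(ini_text)
print(callback_url("https://grafana.example.com/"))
```

The printed sections are meant to be merged into the existing /etc/grafana/grafana.ini rather than to replace it, followed by the restart in step 4.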
Conclusion: Real-time monitoring of logs and metrics, combined with alerts, shortens the reaction time when a validator machine fails and is therefore an effective way to keep uptime high.