
Monitoring Integration with Prometheus and Alerts


Introduction

In this guide, we will set up a Python script that polls an API and exposes the result as a metric for Prometheus. We will also configure a Prometheus alert that fires when a specific signer and ID appear in the API's list of missing signers.


1: Installing Dependencies

Install the requests library to call the API and prometheus_client to expose the metrics to Prometheus:

pip install requests prometheus_client

Step 1.1: Create the Monitoring Script

import requests
from prometheus_client import start_http_server, Gauge
import time

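# The gauge is a 0/1 flag, not a count: 1 means the target signer/ID is reported missing.
missing_signers_gauge = Gauge('missing_signers', '1 if the target signer and ID are reported missing, else 0')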

target_signer = "<YOUR-SIGNER>"
target_id = "<YOUR-IDVALIDATOR>"

def fetch_missing_signers():
    """Fetch the missing-signers list from the API; return [] on any error."""
    url = 'https://monitor.vn.stakepool.dev.br/missing_signers'
    try:
        # A timeout keeps the polling loop from hanging if the API stops responding.
        response = requests.get(url, timeout=10)
        response.raise_for_status()

        try:
            return response.json()
        except ValueError:
            print(f"Error decoding JSON: malformed response. Content: {response.text}")
            return []

    except requests.exceptions.HTTPError as err:
        print(f"HTTP error: {err}")
    except requests.exceptions.RequestException as err:
        print(f"Request error: {err}")

    return []

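def update_metrics():
    missing_signers = fetch_missing_signers()

    if isinstance(missing_signers, list):
        # Use .get() so an entry without the expected keys does not raise KeyError.
        target_found = any(
            isinstance(signer, dict)
            and signer.get('signer') == target_signer
            and signer.get('ID') == target_id
            for signer in missing_signers
        )

        # 1 = target signer/ID is reported missing, 0 = not reported.
        # Note: a failed fetch returns [] and is therefore also reported as 0.
        missing_signers_gauge.set(1 if target_found else 0)
    else:
        print("Invalid response format. Expected a list.")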

if __name__ == "__main__":
    start_http_server(8000)
    while True:
        update_metrics()
        time.sleep(60)
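
The matching step can also be isolated as a pure function, which makes it possible to exercise the logic without calling the API or starting the exporter. The following is a minimal sketch, assuming the API returns a list of objects with "signer" and "ID" keys, as the script expects. The function name is_target_missing and the sample values are hypothetical.

```python
def is_target_missing(entries, signer, validator_id):
    """Return True if any entry matches both the signer and the ID."""
    # Anything other than a list is treated as "no match".
    if not isinstance(entries, list):
        return False
    return any(
        isinstance(entry, dict)
        and entry.get("signer") == signer
        and entry.get("ID") == validator_id
        for entry in entries
    )


if __name__ == "__main__":
    # Hypothetical payload for illustration only, not real API output.
    sample = [
        {"signer": "0xaaa", "ID": "1"},
        {"signer": "0xbbb", "ID": "2"},
        {"unexpected": "shape"},
    ]
    print(is_target_missing(sample, "0xbbb", "2"))
    print(is_target_missing(sample, "0xbbb", "1"))
```

The comparison is type-sensitive: if the API happens to return the ID as a number while target_id is a string, the two will never match, so it is worth checking one real response before relying on the alert.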

2: Install and Configure Prometheus

Step 2.1: Configure Prometheus to Scrape Metrics

Edit prometheus.yml to add the Python script as a scrape target. If the file already has a scrape_configs section, add the job to that section instead of creating a second one:

scrape_configs:
  - job_name: 'missing_signers'
    static_configs:
      - targets: ['localhost:8000']
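
Before relying on Prometheus to scrape the target, it can help to confirm that the script is actually serving the metric on port 8000. The sketch below reads the text output of the exporter and extracts the missing_signers value. It assumes the script is running locally on the port used above; the helper names read_gauge and check_exporter are hypothetical.

```python
import urllib.request


def read_gauge(exposition_text, metric_name):
    """Return the value of an unlabelled metric from Prometheus text output, or None."""
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == metric_name:
            return float(parts[1])
    return None


def check_exporter(url="http://localhost:8000/metrics"):
    """Fetch the exporter page and return the current missing_signers value."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return read_gauge(resp.read().decode("utf-8"), "missing_signers")

# Usage (requires the monitoring script to be running):
#     print(check_exporter())
```

A return value of None would suggest the metric is not being exported under the expected name.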

3: Configure Prometheus Alerting Rule

Create a new file named alert.rules and define the alerting rule for when the specific signer and ID are missing:

groups:
  - name: missing_signer_alerts
    rules:
    - alert: MissingSignerAlert
      expr: missing_signers == 1
      for: 1m
      labels:
        severity: critical
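
The "for: 1m" clause means the expression has to stay true across evaluations for one minute before the alert moves from pending to firing. The sketch below is a simplified model of that behavior, not Prometheus code. The function name alert_states is hypothetical, and the 30-second spacing between evaluations in the usage example is an assumption for illustration.

```python
def alert_states(samples, for_seconds=60):
    """Simplified model of a Prometheus alert with a `for` clause.

    samples: list of (timestamp_seconds, value) pairs, one per rule evaluation.
    Returns one state per evaluation: "inactive", "pending" or "firing".
    """
    states = []
    active_since = None  # time the expression first became true
    for ts, value in samples:
        if value == 1:  # mirrors `expr: missing_signers == 1`
            if active_since is None:
                active_since = ts
            held = ts - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
    return states


if __name__ == "__main__":
    # Hypothetical evaluations every 30 seconds.
    print(alert_states([(0, 1), (30, 1), (60, 1), (90, 0)]))
```

In this model a single evaluation where the signer is no longer missing resets the timer, so a brief flap does not produce a critical alert.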
    

Step 3.1: Update Prometheus Configuration to Include Alerts

Register the alert.rules file with Prometheus by adding a rule_files section to prometheus.yml:

rule_files:
  - "alert.rules"

Step 3.2: Restart Prometheus

systemctl restart prometheus
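
After the restart, one way to confirm that the rule was loaded is to query the Prometheus HTTP API at /api/v1/rules. The sketch below looks up MissingSignerAlert in that response. It assumes Prometheus is reachable at http://localhost:9090, which may differ in your setup; the helper names find_alert_rule and fetch_rules are hypothetical.

```python
import json
import urllib.request


def find_alert_rule(payload, alert_name):
    """Return the rule dict named alert_name from a /api/v1/rules response, or None."""
    for group in payload.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            if rule.get("type") == "alerting" and rule.get("name") == alert_name:
                return rule
    return None


def fetch_rules(base_url="http://localhost:9090"):
    """Query Prometheus for its loaded rules (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/rules", timeout=5) as resp:
        return json.load(resp)

# Usage (requires a running Prometheus):
#     rule = find_alert_rule(fetch_rules(), "MissingSignerAlert")
#     print(rule["state"] if rule else "rule not loaded")
```

If the lookup returns None, the rule_files entry or the rule file itself is the first place to check.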

4: Creating a New Panel in Grafana

Follow these steps to create a panel in Grafana.

Steps:

  1. On the Grafana dashboard, click the "+" icon in the left sidebar and select "Dashboard".

  2. Click on "Add New Panel".

Defining the Query for the missing_signers Metric

With the panel created, the next step is to define the query to display the missing_signers metric.

Steps:

  1. In the Query section (just below "Metric"), select Prometheus as the data source.

  2. In the query field, enter the following query:

missing_signers

Configuring the Alert

Steps:

  1. In the Grafana panel, click the Alert icon (next to the panel title).

  2. Configure the alert condition. For example, trigger the alert when the value of missing_signers equals 1 (which means the specific signer is missing).
