Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. Originally developed at SoundCloud, Prometheus has become a cornerstone of modern monitoring, particularly in cloud-native environments. This blog provides a detailed guide to installing and configuring Prometheus, explains its data model and metrics collection process, introduces the Prometheus Query Language (PromQL), and walks through an example of setting up a simple metrics collection from a sample application.
Installing Prometheus
Before diving into the details of Prometheus, let's start with the installation process. Prometheus is easy to install and runs on various platforms, including Linux, macOS and Windows.
Step 1 :- Download Prometheus
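First, download a Prometheus release from the official Prometheus website. The commands below use v2.28.1 as an example, so substitute the current version number if a newer release is available. For Linux, you can use the following commands :-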
wget https://github.com/prometheus/prometheus/releases/download/v2.28.1/prometheus-2.28.1.linux-amd64.tar.gz
tar xvfz prometheus-2.28.1.linux-amd64.tar.gz
cd prometheus-2.28.1.linux-amd64
Step 2 :- Configure Prometheus
Prometheus is configured via a YAML file, typically named prometheus.yml. Below is an example configuration file :-
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds.

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
In this configuration :-
The global section sets the scrape interval to 15 seconds, meaning Prometheus will scrape metrics from each configured target every 15 seconds.
The scrape_configs section defines the targets from which Prometheus will collect metrics. In this case, it will scrape metrics from the Prometheus server itself (localhost:9090) and from a Node Exporter running on the same machine (localhost:9100).
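To make the structure concrete, here is a minimal Python sketch that mirrors the example configuration as a plain dict and lists the URL each target would be scraped at. The dict literal and the scrape_urls helper are illustrative and not part of Prometheus. The http scheme and the /metrics path are the defaults Prometheus applies when a job does not set scheme or metrics_path.

```python
# Illustrative only: the example prometheus.yml expressed as a Python dict.
CONFIG = {
    "global": {"scrape_interval": "15s"},
    "scrape_configs": [
        {"job_name": "prometheus", "static_configs": [{"targets": ["localhost:9090"]}]},
        {"job_name": "node", "static_configs": [{"targets": ["localhost:9100"]}]},
    ],
}


def scrape_urls(config):
    """Return (job_name, url) pairs, applying the default scheme and path."""
    pairs = []
    for job in config["scrape_configs"]:
        scheme = job.get("scheme", "http")          # Prometheus default
        path = job.get("metrics_path", "/metrics")  # Prometheus default
        for static in job.get("static_configs", []):
            for target in static["targets"]:
                pairs.append((job["job_name"], f"{scheme}://{target}{path}"))
    return pairs


if __name__ == "__main__":
    for job_name, url in scrape_urls(CONFIG):
        print(job_name, url)
```

Running it prints one line per job and target, which is a quick way to see which endpoints a configuration implies.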
Step 3 :- Start Prometheus
To start Prometheus, navigate to the directory where Prometheus is installed and run the following command :-
./prometheus --config.file=prometheus.yml
Prometheus will start and be accessible at http://localhost:9090. You can open this URL in your web browser to access the Prometheus UI, where you can run queries, inspect scrape targets, view alerting rules and more.
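If you prefer to check from a script rather than a browser, the following sketch polls the server's /-/healthy management endpoint. The helper names health_url and is_healthy are made up for this example, and the base URL assumes the default port used above.

```python
import urllib.error
import urllib.request


def health_url(base_url):
    """Build the URL of the Prometheus health endpoint from a base URL."""
    return base_url.rstrip("/") + "/-/healthy"


def is_healthy(base_url, timeout=2.0):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print(is_healthy("http://localhost:9090"))
```

The function returns False rather than raising when the server is not reachable, so it can be used in a simple retry loop while Prometheus starts up.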
Understanding Prometheus Data Model
Prometheus collects and stores metrics as time-series data. This data model is highly flexible and allows for powerful querying and analysis.
Time-Series Data
A time-series is a sequence of data points indexed by time. In Prometheus, each data point (called a sample) consists of :-
Timestamp :- The time at which the data point was collected.
Value :- The actual metric value (e.g. CPU usage, memory usage).
Metric Names and Labels
Prometheus uses a combination of metric names and labels to identify each time-series uniquely :-
Metric Name :- A string that identifies the type of metric (e.g. http_requests_total, cpu_usage_seconds_total).
Labels :- Key-value pairs that provide additional information about the metric. Labels allow for multi-dimensional data collection and querying. For example, an http_requests_total metric might have labels such as method="GET" and handler="/api".
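The identity rule described above can be sketched in a few lines of Python. This is a toy in-memory model, not how Prometheus stores data internally; TinyStore and series_key are hypothetical names used only for illustration.

```python
from collections import defaultdict


def series_key(name, labels):
    """A series is identified by its metric name plus its full label set."""
    return (name, frozenset(labels.items()))


class TinyStore:
    """Toy store: one list of (timestamp, value) samples per series."""

    def __init__(self):
        self.series = defaultdict(list)

    def append(self, name, labels, timestamp, value):
        self.series[series_key(name, labels)].append((timestamp, value))


if __name__ == "__main__":
    store = TinyStore()
    get_api = {"method": "GET", "handler": "/api"}
    post_api = {"method": "POST", "handler": "/api"}
    store.append("http_requests_total", get_api, 1000, 5.0)
    store.append("http_requests_total", get_api, 1015, 9.0)   # same series, new sample
    store.append("http_requests_total", post_api, 1000, 2.0)  # different label set
    print(len(store.series))
```

The two GET samples land in the same series, while the POST sample creates a second one, because changing any label value yields a different time-series.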
Collecting Metrics
Prometheus collects metrics by scraping HTTP endpoints that expose metrics in a specific text format. These endpoints are known as targets. Targets are typically instrumented applications or services that expose metrics via an HTTP endpoint (usually /metrics).
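As a rough illustration of what a target returns, the sketch below renders a counter in the Prometheus text exposition format and wraps it in a small HTTP handler using only the Python standard library. The REQUESTS data, the port, and the helper names are assumptions for this example; real applications would normally use an official client library instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application counters, keyed by (method, handler).
REQUESTS = {("GET", "/api"): 5, ("POST", "/api"): 2}


def render_metrics(counts):
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for (method, handler), value in sorted(counts.items()):
        lines.append(
            f'http_requests_total{{method="{method}",handler="{handler}"}} {value}'
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUESTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve /metrics until interrupted (port 8000 is an arbitrary choice)."""
    HTTPServer(("localhost", port), MetricsHandler).serve_forever()


if __name__ == "__main__":
    print(render_metrics(REQUESTS), end="")
```

Calling serve() starts the endpoint; adding localhost:8000 as a target in prometheus.yml would then let Prometheus scrape it.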
Introduction to Prometheus Query Language (PromQL)
PromQL is the powerful query language used by Prometheus to retrieve and manipulate time-series data. It allows users to perform complex queries, aggregations and functions on the collected metrics.
Basic PromQL Syntax
PromQL queries typically consist of :-
Selectors :- Used to select specific time-series based on metric names and labels.
Operators and Functions :- Used to perform operations and transformations on the selected time-series.
Example 1 :- Selecting a Metric
To select all time-series with the metric name http_requests_total, you can use the following query :-
http_requests_total
Example 2 :- Filtering by Labels
To keep only the http_requests_total series whose method label equals GET, you can use :-
http_requests_total{method="GET"}
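To show what the selectors in Examples 1 and 2 do conceptually, here is a small Python sketch that applies equality matchers to a list of series. It covers only the = matcher (PromQL also has !=, =~ and !~), and the SERIES data and function names are invented for the example.

```python
# Hypothetical series, each given as (metric_name, labels).
SERIES = [
    ("http_requests_total", {"method": "GET", "handler": "/api"}),
    ("http_requests_total", {"method": "POST", "handler": "/api"}),
    ("node_cpu_seconds_total", {"cpu": "0", "mode": "idle"}),
]


def matches(labels, matchers):
    """True if every equality matcher is satisfied by the series labels."""
    return all(labels.get(key) == value for key, value in matchers.items())


def select(series, name, matchers=None):
    """Return the label sets of series with this name and matching labels."""
    matchers = matchers or {}
    return [
        labels
        for metric_name, labels in series
        if metric_name == name and matches(labels, matchers)
    ]


if __name__ == "__main__":
    print(select(SERIES, "http_requests_total"))                     # like Example 1
    print(select(SERIES, "http_requests_total", {"method": "GET"}))  # like Example 2
```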
Example 3 :- Applying Functions
To calculate the per-second average rate of increase of the http_requests_total counter over the last 5 minutes, you can use the rate function :-
rate(http_requests_total[5m])
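The arithmetic behind rate can be approximated as follows. This is a simplified sketch, not the actual Prometheus implementation: it handles counter resets but skips the extrapolation to the window boundaries that Prometheus performs, and simple_rate is a hypothetical name.

```python
def simple_rate(samples):
    """Per-second average increase over a window of (timestamp, value) samples.

    Simplification: a drop in value is treated as a counter reset, and no
    extrapolation to the window edges is attempted.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            # After a reset the counter restarts from zero, so count `cur` itself.
            increase += cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


if __name__ == "__main__":
    # Made-up samples scraped every 15 seconds: (unix_seconds, counter_value).
    window = [(0, 100.0), (15, 130.0), (30, 160.0), (45, 190.0), (60, 220.0)]
    print(simple_rate(window))
```

With the sample window above, the counter grows by 120 over 60 seconds, so the sketch reports 2.0 requests per second.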
Aggregation Operators
PromQL supports various aggregation operators that allow you to aggregate metrics across multiple dimensions. Some common aggregation operators include :-
sum : Calculates the sum of values.
avg : Calculates the average of values.
max : Finds the maximum value.
min : Finds the minimum value.
Example 4 :- Aggregating Metrics
To calculate the total number of HTTP requests across all methods (and every other label combination), you can use the sum operator :-
sum(http_requests_total)
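Conceptually, sum collapses many series into one value, or into one value per group when a by clause such as sum by (method) (http_requests_total) is used. The sketch below mimics that behaviour in Python; the sample values and the sum_by helper are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical current values of three http_requests_total series.
SAMPLES = [
    ({"method": "GET", "handler": "/api"}, 5.0),
    ({"method": "POST", "handler": "/api"}, 2.0),
    ({"method": "GET", "handler": "/home"}, 3.0),
]


def sum_by(samples, by=()):
    """Sum values, grouping by the listed label names (no names = one total)."""
    totals = defaultdict(float)
    for labels, value in samples:
        group = tuple((name, labels.get(name, "")) for name in by)
        totals[group] += value
    return dict(totals)


if __name__ == "__main__":
    print(sum_by(SAMPLES))                  # like sum(http_requests_total)
    print(sum_by(SAMPLES, by=("method",)))  # like sum by (method) (http_requests_total)
```

Grouping by no labels gives a single total, while grouping by method keeps one result per distinct method value.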
Example :- Setting Up a Simple Metrics Collection
Let's walk through an example of setting up a simple metrics collection from a sample application using Prometheus and Node Exporter.
Step 1 :- Install Node Exporter
Node Exporter is a Prometheus exporter that exposes system metrics such as CPU, memory and disk usage. To install and start Node Exporter, use the following commands :-
wget https://github.com/prometheus/node_exporter/releases/download/v1.2.2/node_exporter-1.2.2.linux-amd64.tar.gz
tar xvfz node_exporter-1.2.2.linux-amd64.tar.gz
cd node_exporter-1.2.2.linux-amd64
./node_exporter
Node Exporter will start and expose metrics at localhost:9100/metrics.
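To get a feel for what that endpoint returns, the following sketch parses individual lines of the text exposition format. It is deliberately simplified (no escaped quotes inside label values, no timestamps), the sample lines are made up, and parse_line is a hypothetical helper rather than an official parser.

```python
import re

# metric_name, optional {labels}, then the sample value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_line(line):
    """Parse one sample line into (name, labels, value); None for comments or blanks."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    name, raw_labels, value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(value)


if __name__ == "__main__":
    # Made-up lines in the shape Node Exporter produces.
    text = """\
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
node_memory_MemTotal_bytes 1.6e+10
"""
    for raw in text.splitlines():
        print(parse_line(raw))
```

To try it against a running exporter, the text can be fetched with urllib.request.urlopen("http://localhost:9100/metrics") and fed to parse_line line by line.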
Step 2 :- Configure Prometheus to Scrape Node Exporter
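Make sure the Node Exporter target is listed under scrape_configs in your prometheus.yml configuration file. If you used the example configuration from the installation section, this job is already present :-
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']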
Restart Prometheus to apply the new configuration.
Step 3 :- Verify Metrics Collection
Open the Prometheus web UI at localhost:9090 and navigate to the Targets page (localhost:9090/targets). You should see the Node Exporter target listed and marked as "UP".
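The same check can be scripted against the Prometheus HTTP API, which serves target status at /api/v1/targets. The sketch below parses an abridged, hand-written response body; a real response contains many more fields, and target_health is a hypothetical helper name.

```python
import json

# Abridged, hand-written example of an /api/v1/targets response body.
SAMPLE_RESPONSE = """
{"status": "success",
 "data": {"activeTargets": [
   {"labels": {"job": "prometheus", "instance": "localhost:9090"}, "health": "up"},
   {"labels": {"job": "node", "instance": "localhost:9100"}, "health": "down"}
 ]}}
"""


def target_health(body):
    """Map each active target's instance label to its reported health."""
    payload = json.loads(body)
    return {
        target["labels"]["instance"]: target["health"]
        for target in payload["data"]["activeTargets"]
    }


if __name__ == "__main__":
    for instance, health in target_health(SAMPLE_RESPONSE).items():
        print(instance, health)
```

Against a live server, the body would come from urllib.request.urlopen("http://localhost:9090/api/v1/targets").read() instead of the sample string.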
Step 4 :- Query Metrics in Prometheus
Now that Prometheus is collecting metrics from Node Exporter, you can start querying the metrics using PromQL.
Example 5 :- CPU Usage
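To query the cumulative CPU time, you can use the following PromQL query :-
node_cpu_seconds_total
This returns a counter of the seconds each CPU core has spent in each mode (for example user, system and idle), so it is usually wrapped in rate() to see current usage.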
Example 6 :- Memory Usage
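To query the total memory on the system, you can use :-
node_memory_MemTotal_bytes
This returns the total physical memory installed on the system, in bytes. It is a capacity figure rather than usage; on Linux, subtracting node_memory_MemAvailable_bytes from it gives an estimate of the memory in use.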
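Queries like these can also be run from a script through the instant-query endpoint of the Prometheus HTTP API (/api/v1/query). The sketch below is one possible way to do that with the standard library; the helper names are made up, the base URL assumes the default port, and the two expressions are common derived queries rather than anything specific to this setup.

```python
import json
import urllib.parse
import urllib.request


def query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def instant_query(base_url, promql, timeout=5.0):
    """Run an instant query and return (labels, value) pairs from a vector result."""
    with urllib.request.urlopen(query_url(base_url, promql), timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    # Each vector item carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in payload["data"]["result"]]


# Example expressions (assumptions for illustration, valid on Linux hosts).
CPU_BUSY_PERCENT = '100 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100'
MEMORY_USED_BYTES = "node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes"

if __name__ == "__main__":
    print(query_url("http://localhost:9090", CPU_BUSY_PERCENT))
    print(query_url("http://localhost:9090", MEMORY_USED_BYTES))
```

With Prometheus running, instant_query("http://localhost:9090", MEMORY_USED_BYTES) would return one (labels, value) pair per matching series.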
Step 5 :- Create a Simple Dashboard in Grafana
To visualize the collected metrics, you can use Grafana, a popular open-source platform for monitoring and observability.
Install Grafana :- Download and install Grafana from the Grafana website. Start the Grafana server and access the web UI at localhost:3000.
Add Prometheus as a Data Source :-
In the Grafana web UI, navigate to Configuration -> Data Sources -> Add data source.
Select Prometheus from the list of data sources.
Enter the Prometheus server URL (http://localhost:9090) and save the data source.
Create a Dashboard :-
Navigate to Dashboards -> New Dashboard.
Add a new panel to the dashboard.
Select the Prometheus data source.
Enter a PromQL query to retrieve data, such as node_cpu_seconds_total.
Customize the visualization and save the dashboard.
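The data source step can also be automated. Grafana exposes an HTTP API for creating data sources (POST /api/datasources), and the sketch below builds a request body for it. Treat this as an assumption-laden outline: the helper name is invented, authentication (an API token or admin credentials) is required but not shown, and the exact set of accepted fields may vary between Grafana versions.

```python
import json


def prometheus_datasource(name="Prometheus", url="http://localhost:9090"):
    """Build a JSON-serialisable body describing a Prometheus data source."""
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        "access": "proxy",   # Grafana's backend makes the requests
        "isDefault": True,
    }


if __name__ == "__main__":
    # This body would be sent to http://localhost:3000/api/datasources
    # with a Content-Type: application/json header and valid credentials.
    print(json.dumps(prometheus_datasource(), indent=2))
```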
Conclusion
Setting up Prometheus for metrics collection is a crucial first step in building a robust monitoring system. In this blog, we covered the installation and configuration of Prometheus, explored its data model and metrics collection process, introduced the Prometheus Query Language (PromQL), and demonstrated how to set up a simple metrics collection using Node Exporter.
By following this guide, you now have a foundational understanding of Prometheus and its capabilities. Stay tuned for more insights and practical tips to enhance your monitoring setup.