Monitoring Linux Hosts with Node Exporter, Prometheus & Grafana — Step-by-Step

A practical guide to install node_exporter on hosts, configure Prometheus as the metrics scraper, and visualize metrics in Grafana (including provisioning and example queries).

Table of contents

Overview
Prerequisites
1) Install node_exporter on hosts
2) Configure Prometheus to scrape node_exporter
3) Configure Grafana & add Prometheus datasource
4) Dashboards & example PromQL
5) Alerts (Prometheus Alertmanager or Grafana)
Appendix: Docker Compose quickstart
Best practices & troubleshooting

Overview

This post shows a minimal, production-ready approach to host-level observability using:

node_exporter — exposes OS and hardware metrics (CPU, memory, disk, network) via /metrics on port 9100.
Prometheus — scrapes metrics from node_exporter and stores time-series data.
Grafana — visualizes metrics using Prometheus as the datasource; dashboards show host health and trends.

Prerequisites

Linux hosts (Ubuntu/CentOS) where you can install node_exporter (or run it as a container).
A server for Prometheus and Grafana (can be one VM or container host).
Network connectivity: Prometheus must be able to reach HOST:9100.
Optional: firewall rules allowing port 9100 from the Prometheus server only.

Tip: For security, restrict node_exporter access to only Prometheus (via firewall or by binding to a private interface).

1) Install node_exporter on each host

Here’s a typical systemd-based installation on an Ubuntu host (replace version as needed):

# Download and unpack (1.7.0 shown; check the releases page for the current version)
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvf node_exporter-1.7.0.linux-amd64.tar.gz
sudo mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/
# Create systemd unit
sudo tee /etc/systemd/system/node_exporter.service >/dev/null <<'EOF'
[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
User=nobody
ExecStart=/usr/local/bin/node_exporter --collector.textfile.directory=/var/lib/node_exporter/textfile_collector
Restart=always

[Install]
WantedBy=multi-user.target
EOF
# Create the textfile collector directory referenced in the unit
sudo mkdir -p /var/lib/node_exporter/textfile_collector
# Start the service and enable it at boot
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
# Verify
curl -s http://localhost:9100/metrics | head -n 20

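The --collector.textfile.directory flag makes node_exporter read custom metrics from *.prom files in that directory, which is useful for exporting values from scripts or cron jobs. Run the service as an unprivileged account: User=nobody works, but a dedicated node_exporter user is preferable.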
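
As a rough illustration of the textfile collector, the following Python sketch renders one gauge in the Prometheus text format and writes it atomically into a collector directory. The metric name demo_jobs_pending, the queue label, and both helper functions are made up for this example. The collector only picks up files ending in .prom, so the temporary file is ignored until it is renamed.

```python
import os
import tempfile
from typing import Dict, Optional


def render_metric(name: str, value: float, help_text: str,
                  labels: Optional[Dict[str, str]] = None) -> str:
    """Render a single gauge in the Prometheus text exposition format.

    Simplification: label values are not escaped, so they must not
    contain quotes, backslashes, or newlines.
    """
    label_str = ""
    if labels:
        pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name}{label_str} {value}\n"
    )


def write_prom_file(directory: str, filename: str, body: str) -> str:
    """Write the file atomically so the exporter never reads a partial file."""
    final_path = os.path.join(directory, filename)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    # mkstemp creates the file with mode 0600; the exporter's user must read it.
    os.chmod(tmp_path, 0o644)
    # Renaming within one filesystem replaces the target in a single step.
    os.replace(tmp_path, final_path)
    return final_path
```

A cron job could call render_metric("demo_jobs_pending", 3, "Pending jobs", {"queue": "mail"}) and pass the result to write_prom_file with the textfile collector directory; the value then shows up on the next scrape of /metrics.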

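If you run node_exporter as a Docker container instead, publish port 9100 and mount the textfile collector directory if you use it. A container only sees the host's filesystems if they are mounted in; the upstream README suggests mounting / read-only and pointing --path.rootfs at that mount.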

2) Configure Prometheus to scrape node_exporter

Prometheus configuration example (prometheus.yml). Add a job that scrapes all your nodes:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node_exporter'
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets:
        - 'host1.example.com:9100'
        - 'host2.example.com:9100'
    # Optional relabeling to simplify instance label
    relabel_configs:
      - source_labels: [__address__]
        regex: '([^:]+):.*'
        target_label: instance
        replacement: '$1'

After updating prometheus.yml, validate it with promtool check config prometheus.yml, restart Prometheus, and check the Targets page (http://PROM_SERVER:9090/targets) to confirm that the node exporters show as UP.
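
To see what the optional relabel rule in the node_exporter job does, here is a small Python sketch that applies the same regex to a few addresses. relabel_instance is a helper invented for this illustration and is not part of Prometheus; it uses re.fullmatch because Prometheus anchors relabel regexes at both ends.

```python
import re

# Same pattern as in the relabel_configs entry of the node_exporter job.
RELABEL_REGEX = re.compile(r"([^:]+):.*")


def relabel_instance(address: str) -> str:
    """Return the instance label the rule would produce for an __address__."""
    match = RELABEL_REGEX.fullmatch(address)
    # If the regex does not match, the rule is skipped and the instance
    # label keeps its default, which is the full address.
    return match.group(1) if match else address


for addr in ("host1.example.com:9100", "10.0.0.5:9100", "[::1]:9100"):
    print(addr, "->", relabel_instance(addr))
```

The last address shows a limitation: a bracketed IPv6 target is cut at its first colon, so this rule only suits hostnames and IPv4 addresses.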

Dynamic service discovery

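If you run in a cloud environment, replace static_configs with a service discovery mechanism such as ec2_sd_configs, gce_sd_configs, or kubernetes_sd_configs. On Kubernetes, either write kubernetes_sd_configs by hand or use the Prometheus Operator, which generates the scrape configuration from ServiceMonitor and PodMonitor resources.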

3) Configure Grafana & add Prometheus datasource

Two ways to add the Prometheus datasource:

Option A — UI (Quick)

Open the Grafana web UI (e.g., http://GRAFANA_HOST:3000) and log in (default admin/admin; change the password when prompted).
Go to Configuration → Data Sources → Add data source (in recent Grafana versions: Connections → Data sources).
Select Prometheus, set the URL to http://PROMETHEUS_HOST:9090, and click Save & Test.

Option B — Provisioning (recommended for automation)

Create a YAML file at /etc/grafana/provisioning/datasources/prometheus.yaml:

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: false

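Restart Grafana and the datasource is created automatically, which suits repeatable infrastructure-as-code deployments. The url above assumes Prometheus is reachable under the hostname prometheus, as in the Docker Compose quickstart in the appendix; adjust it to your environment. With editable: false, the datasource cannot be changed from the UI.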
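
If the Prometheus URL differs between environments, the provisioning file can be generated rather than edited by hand. The sketch below is one possible way to do that in Python; render_datasource and the example URL are assumptions for illustration, and the template simply repeats the fields from the YAML above.

```python
DATASOURCE_TEMPLATE = """\
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: {url}
    isDefault: true
    editable: false
"""


def render_datasource(prometheus_url: str) -> str:
    """Fill the Prometheus URL into the provisioning template."""
    if not prometheus_url.startswith(("http://", "https://")):
        raise ValueError("Prometheus URL must start with http:// or https://")
    # Drop a trailing slash so the rendered URL has one consistent form.
    return DATASOURCE_TEMPLATE.format(url=prometheus_url.rstrip("/"))


print(render_datasource("http://prom.internal:9090/"))
```

A deployment script could write the returned text to /etc/grafana/provisioning/datasources/prometheus.yaml before restarting Grafana.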

4) Dashboards & example PromQL queries

You can import community dashboards (search “Node Exporter Full” on grafana.com) or create custom panels. Below are useful PromQL queries for common panels:

CPU Usage (per instance)

100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
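
To make the arithmetic behind this query concrete, here is a hedged Python sketch. It assumes two consecutive samples of the idle counter per CPU core, which is roughly what irate works from (it uses the last two samples in the window). The function name and the sample numbers are invented for illustration.

```python
from typing import Dict


def cpu_usage_percent(prev_idle: Dict[str, float],
                      curr_idle: Dict[str, float],
                      elapsed_seconds: float) -> float:
    """Mirror 100 - avg(idle rate) * 100 for one instance.

    prev_idle and curr_idle map a CPU core to its idle-seconds counter.
    """
    # Per-core idle rate: idle seconds gained per wall-clock second (0..1).
    idle_rates = [
        (curr_idle[cpu] - prev_idle[cpu]) / elapsed_seconds
        for cpu in curr_idle
    ]
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


# Two cores sampled 15 seconds apart: core 0 was idle for 12 s, core 1 for 9 s.
usage = cpu_usage_percent(
    {"0": 1000.0, "1": 2000.0},
    {"0": 1012.0, "1": 2009.0},
    15.0,
)
print(f"CPU usage: {usage:.1f}%")
```

Here the cores were idle 80% and 60% of the time, so the average idle share is 70% and the panel would show about 30% usage.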

Memory Used (bytes)

node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes

Memory Usage %

(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
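
The same percentage can be computed directly from the text that node_exporter serves on /metrics, which is handy for a quick sanity check of a panel. The sketch below is a minimal parser for unlabeled metrics only; both function names and the sample values are assumptions for illustration.

```python
from typing import Dict, Set


def parse_gauges(metrics_text: str, wanted: Set[str]) -> Dict[str, float]:
    """Extract unlabeled metrics by exact name from exposition text."""
    values = {}
    for line in metrics_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, rest = line.partition(" ")
        if name in wanted and rest:
            # The first field after the name is the value; an optional
            # timestamp may follow and is ignored here.
            values[name] = float(rest.split()[0])
    return values


def memory_used_percent(metrics_text: str) -> float:
    """Mirror (MemTotal - MemAvailable) / MemTotal * 100."""
    gauges = parse_gauges(
        metrics_text,
        {"node_memory_MemTotal_bytes", "node_memory_MemAvailable_bytes"},
    )
    total = gauges["node_memory_MemTotal_bytes"]
    available = gauges["node_memory_MemAvailable_bytes"]
    return (total - available) / total * 100


sample = """\
# TYPE node_memory_MemTotal_bytes gauge
node_memory_MemTotal_bytes 8e+09
# TYPE node_memory_MemAvailable_bytes gauge
node_memory_MemAvailable_bytes 2e+09
"""
print(memory_used_percent(sample))
```

With 8 GB total and 2 GB available, the result is 75% used.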

Disk Used % (root)

100 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} * 100)
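
One detail worth knowing about this query: node_filesystem_avail_bytes is the space available to unprivileged users, so blocks reserved for root count as used. The Python sketch below shows the arithmetic and, as an assumption about how such values are typically obtained, derives the two inputs from os.statvfs. The function names are invented for this example.

```python
import os


def disk_used_percent(size_bytes: float, avail_bytes: float) -> float:
    """Mirror 100 - (avail / size * 100)."""
    return 100 - (avail_bytes / size_bytes * 100)


def local_disk_used_percent(path: str = "/") -> float:
    """Compute the same figure for a local mount point via statvfs."""
    st = os.statvfs(path)
    size = st.f_blocks * st.f_frsize
    # f_bavail excludes root-reserved blocks; f_bfree would include them.
    avail = st.f_bavail * st.f_frsize
    return disk_used_percent(size, avail)


# A 100 GiB filesystem with 25 GiB available to normal users.
gib = 1024 ** 3
print(disk_used_percent(100 * gib, 25 * gib))
```

If the same filesystem had 30 GiB free including reserved blocks, a calculation based on node_filesystem_free_bytes would report 70% instead of 75%.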

Load (1m)

node_load1{instance="$instance"}

Network Bytes In/Out

rate(node_network_receive_bytes_total{device!="lo"}[5m])
rate(node_network_transmit_bytes_total{device!="lo"}[5m])
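
The network metrics are counters that only grow, apart from resets such as a reboot, which is why they are wrapped in rate(). The following Python sketch approximates that calculation for a list of samples, including reset handling. It is a simplification: the real rate() also extrapolates to the edges of the time window, which is left out here, and per_second_rate is a name invented for this example.

```python
from typing import List, Optional, Tuple


def per_second_rate(samples: List[Tuple[float, float]]) -> Optional[float]:
    """Average per-second increase of a counter.

    samples holds (timestamp_seconds, counter_value) pairs, oldest first.
    """
    if len(samples) < 2:
        return None  # a rate needs at least two samples
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            increase += curr - prev
        else:
            # The counter dropped, so it was reset; count from zero again.
            increase += curr
    return increase / elapsed


steady = [(0, 1000), (15, 2500), (30, 4000)]
print(per_second_rate(steady))  # bytes per second
```

For the steady series, 3000 bytes arrive in 30 seconds, giving 100 bytes per second. Multiply by 8 if a panel should show bits per second.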

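When creating dashboards, group panels by purpose (CPU, Memory, Disk, Network, Processes) and add a host selector variable so that one dashboard serves every host. The $instance used in the load query above is such a variable; in Grafana it can be populated with a query like label_values(node_uname_info, instance).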

5) Alerts (Prometheus Alertmanager or Grafana)

Two common approaches:

Prometheus rules + Alertmanager: Define Prometheus alerting rules and route alerts via Alertmanager to email/Slack/PagerDuty.
Grafana alerting: Grafana’s unified alerting can evaluate Prometheus queries and send notifications. This centralizes alert management in Grafana.

Example Prometheus alert rule (high CPU)

groups:
- name: node_alerts
  rules:
  - alert: HighCPUUsage
    # rate() rather than irate(): an alert should not hinge on a single pair of samples
    expr: (100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)) > 85
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: "High CPU on {{ $labels.instance }}"
      description: "CPU usage is > 85% for more than 2 minutes."

Place rules in a file (e.g., node_rules.yml) and reference it in prometheus.yml under rule_files. Configure Alertmanager endpoints in prometheus.yml under alerting.
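
The for: 2m clause in the rule means the expression must stay true across consecutive evaluations for two minutes before the alert fires; until then it is pending. The Python sketch below simulates that state machine under the 15s evaluation_interval used earlier. alert_states and the sample CPU values are made up for illustration.

```python
from typing import List


def alert_states(values: List[float], threshold: float,
                 for_seconds: float, eval_interval: float) -> List[str]:
    """Return inactive, pending, or firing for each rule evaluation."""
    states = []
    active_since = None  # time at which the expression first became true
    for index, value in enumerate(values):
        now = index * eval_interval
        if value > threshold:
            if active_since is None:
                active_since = now
            held = now - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            # Any evaluation below the threshold resets the timer.
            active_since = None
            states.append("inactive")
    return states


# CPU above 85% for nine evaluations in a row, 15 seconds apart.
print(alert_states([90.0] * 9, threshold=85, for_seconds=120, eval_interval=15))
```

The first eight evaluations are pending and the ninth, 120 seconds after the first, is firing. A single dip below the threshold resets the timer, which is what keeps short spikes from paging anyone.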

Appendix: Docker Compose quickstart (Prometheus + Grafana + Node Exporter)

Simple docker-compose.yml for local testing:

version: '3.7'
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./rules/:/etc/prometheus/rules/:ro
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - ./provisioning/:/etc/grafana/provisioning/:ro
    ports:
      - "3000:3000"

  node-exporter-host1:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"
    command:
      - '--collector.textfile.directory=/var/lib/node_exporter/textfile_collector'
    volumes:
      - ./textfile_collector/:/var/lib/node_exporter/textfile_collector:ro

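Note: In production, run node_exporter on every host you want to monitor. A single containerized exporter next to Prometheus, as above, is only suitable for lab or test setups. In this Compose setup Prometheus reaches the exporter by its service name, so the scrape target is node-exporter-host1:9100.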

Best practices & troubleshooting

Scrape intervals: 15s is common for host metrics; shorter intervals increase scrape load and storage use.
Security: Limit access to node_exporter (firewall or private network). Use mTLS or a proxy if exposing metrics across untrusted networks.
Labeling: Use meaningful labels (environment, role, datacenter) so queries can aggregate effectively.
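Retention & downsampling: Prometheus keeps 15 days of data by default (--storage.tsdb.retention.time). Plan retention explicitly, and use remote_write to a system such as Thanos, Cortex, or VictoriaMetrics when you need long-term storage or downsampling.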
Textfile collector: Use scripts to export custom metrics into the textfile directory for application-specific metrics.
Troubleshooting: If metrics don’t appear, check Prometheus targets page, verify node_exporter is reachable (curl host:9100/metrics), and inspect firewall/security groups.
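
The targets check from the troubleshooting item can also be scripted, since Prometheus serves the same information as JSON at /api/v1/targets. The Python sketch below parses such a response and lists targets that are not up, together with the last scrape error. unhealthy_targets and the sample payload are invented for this example; the field names follow the Prometheus HTTP API.

```python
import json
from typing import List, Tuple


def unhealthy_targets(api_response: str) -> List[Tuple[str, str]]:
    """Return (scrape URL, last error) for every target that is not up."""
    payload = json.loads(api_response)
    problems = []
    for target in payload["data"]["activeTargets"]:
        if target["health"] != "up":
            problems.append((target["scrapeUrl"], target.get("lastError", "")))
    return problems


# Shortened example of what the API returns; real responses carry more fields.
sample_response = json.dumps({
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://host1.example.com:9100/metrics",
             "health": "up", "lastError": ""},
            {"scrapeUrl": "http://host2.example.com:9100/metrics",
             "health": "down", "lastError": "connection refused"},
        ]
    },
})

for url, error in unhealthy_targets(sample_response):
    print(f"{url}: {error}")
```

Against a live server, the response body could be fetched with urllib.request.urlopen("http://PROM_SERVER:9090/api/v1/targets") and passed to the same function.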
