Service configuration

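This setup uses cadvisor, node-exporter, prometheus, and grafana to monitor the runtime resource usage of Docker containers and the host machine. The containers are run with docker compose; the configuration file compose.yaml is as follows: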

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - 9090:9090
    command:
      - --config.file=/etc/prometheus/prometheus.yml
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    depends_on:
      - cadvisor
      - node-exporter
    networks:
      - my-net
  
  cadvisor:
    image: m.daocloud.io/gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    ports:
      - 8080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    devices:
      - /dev/kmsg
    privileged: true
    networks:
      - my-net

  node-exporter:
    image: quay.io/prometheus/node-exporter:latest
    container_name: node-exporter
    ports:
      - 9100:9100
    command:
      - --path.rootfs=/host
    volumes:
      - /:/host:ro,rslave
    networks:
      - my-net
    
  grafana:
    image: grafana/grafana-enterprise:latest
    container_name: grafana
    ports:
      - 3000:3000
    volumes:
      - ./grafana-data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=123456
    depends_on:
      - prometheus
    networks:
      - my-net

networks:
  my-net:
    driver: bridge

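Images under gcr.io cannot be pulled directly from within mainland China, so the cadvisor image name carries the mirror prefix m.daocloud.io.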

In the configuration above, the local directory ./grafana-data that is mounted into the grafana container under volumes must be writable by "other" users.
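The permission requirement above can be met before the first start. The following is a minimal sketch, assuming it is run from the directory that contains compose.yaml and that the directory does not need tighter permissions than "other: write":

```shell
# Create the Grafana data directory next to compose.yaml
# (the path comes from the volumes entry of the grafana service).
mkdir -p ./grafana-data

# Grant write permission to "other" users so that the process inside
# the grafana container can create its files in the bind mount.
chmod o+w ./grafana-data

# Print the resulting mode for a quick visual check.
ls -ld ./grafana-data
```

A narrower alternative would be to change the owner of the directory to the UID the grafana image runs as, instead of opening it to all users; which UID that is depends on the image version, so it should be checked rather than assumed.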

The prometheus server configuration file prometheus.yml is as follows:

global:
  scrape_interval:     1s # Scrape targets every second (the Prometheus default is 1m).
  evaluation_interval: 1s # Evaluate rules every second (the Prometheus default is 1m).

  # Attach these extra labels to all timeseries collected by this Prometheus instance.
  external_labels:
    monitor: 'test-monitor'

rule_files:
#  - 'prometheus.rules.yml'

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 1s
    static_configs:
      - targets: ['10.211.55.8:9090']

  - job_name: 'node'
    scrape_interval: 1s
    static_configs:
      - targets:
        - 10.211.55.8:9100

  - job_name: 'docker'
    scrape_interval: 1s
    static_configs:
      - targets:
        - 10.211.55.8:8080

The addresses under targets must be changed to the actual addresses of the services that expose the metrics.
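Changing the three target addresses by hand is easy to get wrong, so the edit can be scripted. The sketch below assumes all three exporters run on the same host and are published on ports 9090, 9100, and 8080 as in compose.yaml; set_target_host is a hypothetical helper name, not part of Prometheus, and the address in the usage note is only an example.

```shell
# Hypothetical helper: point every scrape target in a Prometheus
# config file at a single host.
# Usage: set_target_host <host-or-ip> [config-file]
set_target_host() {
  host="$1"
  file="${2:-prometheus.yml}"
  # Replace any IPv4 address that precedes one of the three ports used
  # in this setup, keeping the port. The original file is kept as a
  # .bak copy next to the edited one.
  sed -E -i.bak "s/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:(9090|9100|8080)/${host}:\1/g" "$file"
}
```

For example, `set_target_host 192.168.1.20` rewrites prometheus.yml in the current directory. The host value is inserted into a sed expression as-is, so it should be a plain IP address or hostname without `/` or `&` characters.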

Running the services

Starting the services

sudo docker compose up -d
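  1. After the services start, open http://<ip>:8080 to reach the cadvisor web UI, which shows resource usage for the containers and the host. (screenshot: cadvisor)

  2. Open http://<ip>:9090 to reach the prometheus web UI, where Prometheus expressions can be entered in the expression bar. For example, start by querying the container_start_time_seconds metric, which records each container's start time as a Unix timestamp in seconds. Use the name="<container_name>" matcher to select a specific container by name; the name corresponds to the container_name parameter in the Docker Compose configuration. For example, container_start_time_seconds{name="redis"} shows the start time of the redis container. (screenshot: prometheus)

  3. Open http://<ip>:3000 to reach the grafana UI. The first visit requires logging in: the username is admin and the password is the value of the GF_SECURITY_ADMIN_PASSWORD environment variable set when the grafana container was started. After a successful login the grafana home page is shown. (screenshot: grafana)
     Under Data sources, add a data source of type prometheus and specify the prometheus server URL; set the other options as needed. (sample screenshot: datasources)
     To configure Dashboards, download a ready-made JSON file from the official site and import it directly. (screenshot: dashboards)
     For example, choosing the Docker and system monitoring template gives the final result. (screenshot: dashboards_ok)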
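The checks in the list above can also be run from a terminal. The following is a rough sketch, assuming curl is installed and the ports are published as in compose.yaml; probe_urls, check_services, prom_query, and the MONITOR_HOST variable are hypothetical names introduced here for illustration. The health paths are the ones these services are generally known to expose, so they should be verified against the versions actually deployed.

```shell
# Host running the containers. Override it when the services are not
# local, e.g. MONITOR_HOST=192.168.1.20 (a hypothetical address).
MONITOR_HOST="${MONITOR_HOST:-localhost}"

# Print one probe URL per service, using the ports from compose.yaml.
probe_urls() {
  host="$1"
  echo "http://${host}:8080/healthz"      # cadvisor health endpoint
  echo "http://${host}:9100/metrics"      # node-exporter metrics page
  echo "http://${host}:9090/-/healthy"    # prometheus health endpoint
  echo "http://${host}:3000/api/health"   # grafana health endpoint
}

# Probe each URL and report whether it answered without an HTTP error
# within 5 seconds.
check_services() {
  probe_urls "$1" | while read -r url; do
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}

# Run an instant PromQL query through the Prometheus HTTP API and
# print the raw JSON response.
prom_query() {
  curl -fsS -G "http://$1:9090/api/v1/query" --data-urlencode "query=$2"
}
```

Running `check_services "$MONITOR_HOST"` prints one OK or FAIL line per service. Because container_start_time_seconds is a Unix timestamp, the number of seconds a container has been running can be queried with `prom_query "$MONITOR_HOST" 'time() - container_start_time_seconds{name="redis"}'`, where redis stands for whatever container_name is of interest.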

Stopping the services

sudo docker compose down