====== [HOWTO] Ceph+grafana+prometheus ======

^  Documentation  ^|
^Name:| [HOWTO] Ceph+grafana+prometheus |
^Description:| How to setup ceph with prometheus and grafana for advanced statistics |
^Modification date :| 24/12/2021 |
^Owner:|dodger|
^Notify changes to:|Owner |
^Tags:|ceph, object storage |
^Scalate to:|The_fucking_bofh|

====== Documentation ======
  * [[https://docs.ceph.com/en/nautilus/mgr/dashboard/?#enabling-the-embedding-of-grafana-dashboards|Ceph Dashboard + grafana integration]]
  * [[https://docs.ceph.com/en/latest/mgr/dashboard/#dashboard-grafana|Same as previous but for latest version of ceph (has additional info)]]
  * [[https://docs.ceph.com/en/nautilus/mgr/prometheus/|Ceph+Prometheus official documentation]]
  * [[https://github.com/ceph/ceph/tree/master/monitoring/grafana/dashboards|Official ceph grafana dashboards]]
  * [[https://dev.to/ingoleajinkya/ceph-cluster-monitoring-using-prometheus-and-grafana-472i|Non-official howto]]

====== Pre-Requisites ======
===== Prometheus node exporter =====
From the salt-master:

  export THEHOSTNAME='avmlp-os*'
  salt "${THEHOSTNAME}" test.ping
  salt "${THEHOSTNAME}" pkg.install golang-github-prometheus-node-exporter
  salt "${THEHOSTNAME}" service.start node_exporter
  salt "${THEHOSTNAME}" service.enable node_exporter
  salt "${THEHOSTNAME}" service.status node_exporter

Check:

  salt "${THEHOSTNAME}" cmd.run "netstat -nap | egrep 9100 | egrep LISTEN"

Obtain the list of nodes for configuring prometheus to scrape the ''node_exporter'':

  salt "${THEHOSTNAME}" service.status node_exporter | grep "^${THEHOSTNAME}" | awk -F\: '{print "\047"$1":9100\047,"}'

Example:

  root@avmlm-salt-001 /home/bofher/scripts/nutanix_buster $ salt "${THEHOSTNAME}" service.status node_exporter | grep "^${THEHOSTNAME}" | awk -F\: '{print "\047"$1":9100\047,"}'
  'bvmlm-osd-001.ciberterminal.net:9100',
  'bvmlm-osd-019.ciberterminal.net:9100',
  'bvmlm-osd-013.ciberterminal.net:9100',
  'bvmlm-osm-003.ciberterminal.net:9100',
  'bvmlm-osd-005.ciberterminal.net:9100',
  'bvmlm-oslb-001.ciberterminal.net:9100',
  'bvmlm-osd-010.ciberterminal.net:9100',
  'bvmlm-osd-003.ciberterminal.net:9100',
  'bvmlm-osd-020.ciberterminal.net:9100',
  'bvmlm-osfs-003.ciberterminal.net:9100',
  'bvmlm-osd-002.ciberterminal.net:9100',
  'bvmlm-osm-001.ciberterminal.net:9100',
  'bvmlm-osm-004.ciberterminal.net:9100',
  'bvmlm-osd-015.ciberterminal.net:9100',
  'bvmlm-osd-018.ciberterminal.net:9100',
  'bvmlm-osgw-001.ciberterminal.net:9100',
  'bvmlm-osd-017.ciberterminal.net:9100',
  'bvmlm-osd-011.ciberterminal.net:9100',
  'bvmlm-osd-007.ciberterminal.net:9100',
  'bvmlm-osgw-004.ciberterminal.net:9100',
  'bvmlm-osgw-003.ciberterminal.net:9100',
  'bvmlm-osd-006.ciberterminal.net:9100',
  'bvmlm-osfs-004.ciberterminal.net:9100',
  'bvmlm-osm-002.ciberterminal.net:9100',
  'bvmlm-osd-008.ciberterminal.net:9100',
  'bvmlm-osfs-002.ciberterminal.net:9100',
  'bvmlm-osfs-001.ciberterminal.net:9100',
  'bvmlm-osd-004.ciberterminal.net:9100',
  'bvmlm-oslb-002.ciberterminal.net:9100',
  'bvmlm-osd-012.ciberterminal.net:9100',
  'bvmlm-osd-009.ciberterminal.net:9100',
  'bvmlm-osgw-002.ciberterminal.net:9100',
  'bvmlm-osd-014.ciberterminal.net:9100',
  'bvmlm-osm-005.ciberterminal.net:9100',
  'bvmlm-osnx-002.ciberterminal.net:9100',
  'bvmlm-osd-016.ciberterminal.net:9100',

===== Prometheus =====
Bare minimal install instructions:

  cat >/etc/yum.repos.d/prometheus.repo<

Prometheus setup, add scrape config for ceph, for example, in dev with larry:

  # my global config
  global:
    scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
    evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
    # scrape_timeout is set to the global default (10s).
  # Alertmanager configuration
  alerting:
    alertmanagers:
    - static_configs:
      - targets:
        # - alertmanager:9093
  # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
  rule_files:
    # - "first_rules.yml"
    # - "second_rules.yml"
  # A scrape configuration containing exactly one endpoint to scrape:
  # Here it's Prometheus itself.
  scrape_configs:
    # The job name is added as a label `job=` to any timeseries scraped from this config.
    - job_name: 'prometheus'
      # metrics_path defaults to '/metrics'
      # scheme defaults to 'http'.
      static_configs:
      - targets: ['0.0.0.0:9090']
    - job_name: 'ceph-larry'
      static_configs:
      - targets: ['larry.ciberterminal.net:9283']
    - job_name: 'node-exporter'
      static_configs:
      - targets: [
          'bvmlm-osd-001.ciberterminal.net:9100',
          'bvmlm-osd-019.ciberterminal.net:9100',
          'bvmlm-osd-013.ciberterminal.net:9100',
          'bvmlm-osm-003.ciberterminal.net:9100',
          'bvmlm-osd-005.ciberterminal.net:9100',
          'bvmlm-oslb-001.ciberterminal.net:9100',
          'bvmlm-osd-010.ciberterminal.net:9100',
          'bvmlm-osd-003.ciberterminal.net:9100',
          'bvmlm-osd-020.ciberterminal.net:9100',
          'bvmlm-osfs-003.ciberterminal.net:9100',
          'bvmlm-osd-002.ciberterminal.net:9100',
          'bvmlm-osm-001.ciberterminal.net:9100',
          'bvmlm-osm-004.ciberterminal.net:9100',
          'bvmlm-osd-015.ciberterminal.net:9100',
          'bvmlm-osd-018.ciberterminal.net:9100',
          'bvmlm-osgw-001.ciberterminal.net:9100',
          'bvmlm-osd-017.ciberterminal.net:9100',
          'bvmlm-osd-011.ciberterminal.net:9100',
          'bvmlm-osd-007.ciberterminal.net:9100',
          'bvmlm-osgw-004.ciberterminal.net:9100',
          'bvmlm-osgw-003.ciberterminal.net:9100',
          'bvmlm-osd-006.ciberterminal.net:9100',
          'bvmlm-osfs-004.ciberterminal.net:9100',
          'bvmlm-osm-002.ciberterminal.net:9100',
          'bvmlm-osd-008.ciberterminal.net:9100',
          'bvmlm-osfs-002.ciberterminal.net:9100',
          'bvmlm-osfs-001.ciberterminal.net:9100',
          'bvmlm-osd-004.ciberterminal.net:9100',
          'bvmlm-oslb-002.ciberterminal.net:9100',
          'bvmlm-osd-012.ciberterminal.net:9100',
          'bvmlm-osd-009.ciberterminal.net:9100',
          'bvmlm-osgw-002.ciberterminal.net:9100',
          'bvmlm-osd-014.ciberterminal.net:9100',
          'bvmlm-osm-005.ciberterminal.net:9100',
          'bvmlm-osnx-002.ciberterminal.net:9100',
          'bvmlm-osd-016.ciberterminal.net:9100'
        ]

We will
restart and check after setting up the rest of the elements :-)

===== grafana =====
  * Grafana working

I didn't set it up myself, so I can't give instructions here xD\\
\\
Additional setup for grafana to work with ceph:

  --- grafana.ini 2021-12-24 10:38:20.669668776 +0100
  +++ grafana.ini.orig 2021-12-24 12:36:44.083311253 +0100
  @@ -185,7 +185,6 @@
   # set to true if you want to allow browsers to render Grafana in a ,