[HOWTO] Ceph+grafana+prometheus

Documentation
Name: [HOWTO] Ceph+grafana+prometheus
Description: How to set up Ceph with Prometheus and Grafana for advanced statistics
Modification date : 24/12/2021
Owner:dodger
Notify changes to:Owner
Tags:ceph, object storage
Escalate to:Thefuckingbofh

Documentation

Pre-Requisites

Prometheus node exporter

From the salt-master:

export THEHOSTNAME='avmlp-os*'
salt "${THEHOSTNAME}" test.ping
salt "${THEHOSTNAME}" pkg.install golang-github-prometheus-node-exporter
salt "${THEHOSTNAME}" service.start node_exporter
salt "${THEHOSTNAME}" service.enable node_exporter
salt "${THEHOSTNAME}" service.status node_exporter

Check:

salt "${THEHOSTNAME}" cmd.run "netstat -nap | egrep 9100 | egrep LISTEN"
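The check above matches any line containing 9100, which can also be a PID or a port such as 19100. A stricter variant could look like the sketch below. `port_is_listening` is a hypothetical helper, not part of the original setup, and it assumes the usual `netstat -nap` / `ss -ltn` layout where the local address ends in `:PORT` followed by whitespace.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the original how-to):
# reads netstat/ss style output on stdin and succeeds only when a LISTEN
# socket is bound to exactly the given port.
port_is_listening() {
    port="$1"
    # Keep only listening sockets, then require ":<port>" followed by whitespace
    # so that 19100 or a PID of 9100 do not count as a match.
    grep LISTEN | grep -Eq ":${port}[[:space:]]"
}

# Example against live output (run on the node itself):
#   netstat -nap | port_is_listening 9100 && echo "node_exporter is listening"
```

The same stricter pattern can be pasted into the salt one-liner in place of the plain `egrep 9100`.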

Obtain the list of nodes for configuring prometheus to scrape the ''node_exporter'':
<code bash>
salt "${THEHOSTNAME}" service.status node_exporter | grep "^${THEHOSTNAME}" | awk -F\: '{print "\047"$1":9100\047,"}'
</code>

Example:
<code bash>
root@avmlm-salt-001 /home/bofher/scripts/nutanixbuster $ salt "${THEHOSTNAME}" service.status node_exporter | grep "^${THEHOSTNAME}" | awk -F\: '{print "\047"$1":9100\047,"}'
'bvmlm-osd-001.ciberterminal.net:9100',
'bvmlm-osd-019.ciberterminal.net:9100',
'bvmlm-osd-013.ciberterminal.net:9100',
'bvmlm-osm-003.ciberterminal.net:9100',
'bvmlm-osd-005.ciberterminal.net:9100',
'bvmlm-oslb-001.ciberterminal.net:9100',
'bvmlm-osd-010.ciberterminal.net:9100',
'bvmlm-osd-003.ciberterminal.net:9100',
'bvmlm-osd-020.ciberterminal.net:9100',
'bvmlm-osfs-003.ciberterminal.net:9100',
'bvmlm-osd-002.ciberterminal.net:9100',
'bvmlm-osm-001.ciberterminal.net:9100',
'bvmlm-osm-004.ciberterminal.net:9100',
'bvmlm-osd-015.ciberterminal.net:9100',
'bvmlm-osd-018.ciberterminal.net:9100',
'bvmlm-osgw-001.ciberterminal.net:9100',
'bvmlm-osd-017.ciberterminal.net:9100',
'bvmlm-osd-011.ciberterminal.net:9100',
'bvmlm-osd-007.ciberterminal.net:9100',
'bvmlm-osgw-004.ciberterminal.net:9100',
'bvmlm-osgw-003.ciberterminal.net:9100',
'bvmlm-osd-006.ciberterminal.net:9100',
'bvmlm-osfs-004.ciberterminal.net:9100',
'bvmlm-osm-002.ciberterminal.net:9100',
'bvmlm-osd-008.ciberterminal.net:9100',
'bvmlm-osfs-002.ciberterminal.net:9100',
'bvmlm-osfs-001.ciberterminal.net:9100',
'bvmlm-osd-004.ciberterminal.net:9100',
'bvmlm-oslb-002.ciberterminal.net:9100',
'bvmlm-osd-012.ciberterminal.net:9100',
'bvmlm-osd-009.ciberterminal.net:9100',
'bvmlm-osgw-002.ciberterminal.net:9100',
'bvmlm-osd-014.ciberterminal.net:9100',
'bvmlm-osm-005.ciberterminal.net:9100',
'bvmlm-osnx-002.ciberterminal.net:9100',
'bvmlm-osd-016.ciberterminal.net:9100',
</code>

===== Prometheus =====

Bare minimal install instructions:
<code bash>
# The heredoc delimiter is quoted so the shell does not expand
# $releasever and $basearch; yum has to see them literally.
cat >/etc/yum.repos.d/prometheus.repo <<'EOF'
[prometheus]
name=prometheus
baseurl=https://packagecloud.io/prometheus-rpm/release/el/$releasever/$basearch
repo_gpgcheck=1
enabled=1
gpgkey=https://packagecloud.io/prometheus-rpm/release/gpgkey
       https://raw.githubusercontent.com/lest/prometheus-rpm/master/RPM-GPG-KEY-prometheus-rpm
gpgcheck=1
metadata_expire=300
EOF

yum install prometheus2.x86_64 \
    apache_exporter.x86_64 \
    collectd_exporter.x86_64 \
    consul_exporter.x86_64 \
    elasticsearch_exporter.x86_64 \
    graphite_exporter.x86_64 \
    haproxy_exporter.x86_64 \
    kafka_exporter.x86_64 \
    memcached_exporter.x86_64 \
    mysqld_exporter.x86_64 \
    nginx_exporter.x86_64 \
    node_exporter.x86_64 \
    postgres_exporter.x86_64 \
    process_exporter.x86_64 \
    pushgateway.x86_64 \
    rabbitmq_exporter.x86_64 \
    redis_exporter.x86_64 \
    sachet.x86_64 \
    smokeping_prober.x86_64 \
    snmp_exporter.x86_64 \
    statsd_exporter.x86_64 \
    thanos.x86_64

systemctl start prometheus
systemctl enable prometheus
systemctl status prometheus
</code>

Prometheus setup: add a scrape config for ceph. For example, in dev with larry:
<file yaml prometheus.yml>
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label job=<job_name> to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['0.0.0.0:9090']

  - job_name: 'ceph-larry'
    static_configs:
    - targets: ['larry.ciberterminal.net:9283']

  - job_name: 'node-exporter'
    static_configs:
    - targets: [
        'bvmlm-osd-001.ciberterminal.net:9100',
        'bvmlm-osd-019.ciberterminal.net:9100',
        'bvmlm-osd-013.ciberterminal.net:9100',
        'bvmlm-osm-003.ciberterminal.net:9100',
        'bvmlm-osd-005.ciberterminal.net:9100',
        'bvmlm-oslb-001.ciberterminal.net:9100',
        'bvmlm-osd-010.ciberterminal.net:9100',
        'bvmlm-osd-003.ciberterminal.net:9100',
        'bvmlm-osd-020.ciberterminal.net:9100',
        'bvmlm-osfs-003.ciberterminal.net:9100',
        'bvmlm-osd-002.ciberterminal.net:9100',
        'bvmlm-osm-001.ciberterminal.net:9100',
        'bvmlm-osm-004.ciberterminal.net:9100',
        'bvmlm-osd-015.ciberterminal.net:9100',
        'bvmlm-osd-018.ciberterminal.net:9100',
        'bvmlm-osgw-001.ciberterminal.net:9100',
        'bvmlm-osd-017.ciberterminal.net:9100',
        'bvmlm-osd-011.ciberterminal.net:9100',
        'bvmlm-osd-007.ciberterminal.net:9100',
        'bvmlm-osgw-004.ciberterminal.net:9100',
        'bvmlm-osgw-003.ciberterminal.net:9100',
        'bvmlm-osd-006.ciberterminal.net:9100',
        'bvmlm-osfs-004.ciberterminal.net:9100',
        'bvmlm-osm-002.ciberterminal.net:9100',
        'bvmlm-osd-008.ciberterminal.net:9100',
        'bvmlm-osfs-002.ciberterminal.net:9100',
        'bvmlm-osfs-001.ciberterminal.net:9100',
        'bvmlm-osd-004.ciberterminal.net:9100',
        'bvmlm-oslb-002.ciberterminal.net:9100',
        'bvmlm-osd-012.ciberterminal.net:9100',
        'bvmlm-osd-009.ciberterminal.net:9100',
        'bvmlm-osgw-002.ciberterminal.net:9100',
        'bvmlm-osd-014.ciberterminal.net:9100',
        'bvmlm-osm-005.ciberterminal.net:9100',
        'bvmlm-osnx-002.ciberterminal.net:9100',
        'bvmlm-osd-016.ciberterminal.net:9100'
      ]
</file>

We will restart and check after setting up the rest of the elements :-)

===== Grafana =====

  * A working Grafana installation

I haven't set it up myself, so I can't give instructions here xD

Additional setup for grafana to work with ceph. Note that the diff was taken from the modified ''grafana.ini'' against the original file, so the lines marked with ''-'' are the ones to add to the stock configuration:
<code diff>
--- grafana.ini 2021-12-24 10:38:20.669668776 +0100
+++ grafana.ini.orig 2021-12-24 12:36:44.083311253 +0100
@@ -185,7 +185,6 @@

 # set to true if you want to allow browsers to render Grafana in a <frame>, <iframe>, <embed> or <object>. default is false.
 ;allow_embedding = false
-allow_embedding = true

 # Set to true if you want to enable http strict transport security (HSTS) response header.
 # This is only sent when HTTPS is enabled in this configuration.
@@ -308,16 +307,12 @@
 [auth.anonymous]
 # enable anonymous access
 ;enabled = false
-enabled = true

 # specify organization name that should be used for unauthenticated users
 ;org_name = Main Org.
-;org_name = ciberterminal.net
-org_name = ciberterminal DEMO

 # specify role for unauthenticated users
 ;org_role = Viewer
-org_role = Viewer

 #################################### Github Auth ##########################
 [auth.github]
</code>
But you'll need the following plugins for grafana:
<code bash>
grafana-cli plugins install vonage-status-panel
grafana-cli plugins install grafana-piechart-panel
</code>
Import all of the official dashboards :-)
Here you have some nice one-liners to simplify the process:
<code bash>
# Fetch the dashboards directory listing (saved as a file named "dashboards")
wget "https://github.com/ceph/ceph/tree/master/monitoring/grafana/dashboards"
# Extract the .json file names from the listing and download each raw file
for i in $(cat dashboards| egrep json |egrep "dashboard" | awk -F\" '{print $6}' | egrep ".json") ; do wget "https://raw.githubusercontent.com/ceph/ceph/master/monitoring/grafana/dashboards/${i}" ; done
# Validate every downloaded file as JSON
for i in *json ; do cat ${i} | jq . >/dev/null && echo "### OK ${i}" || echo "@@@ KO ${i}" ; done
</code>

And import them with the web UI (I couldn't import them through the API).
You'll also have to set up prometheus as a data source for grafana and set up the prometheus server:

====== Instructions ======

Following the official documentation, on any of the ceph admin nodes:
<code bash>
ceph mgr module enable prometheus
ceph config set mgr mgr/prometheus/server_port 9283
ceph config set mgr mgr/prometheus/server_addr 0.0.0.0
ceph config set mgr mgr/prometheus/scrape_interval 15
ceph dashboard set-grafana-api-url http://avvmld-graf-001.ciberterminal.net:3000/
ceph dashboard set-grafana-api-ssl-verify False
</code>

You must change the grafana URL according to your setup.
Check:
<code bash>
bvmlm-osm-001 /home/bofher # ceph config dump | egrep -v "KEY"
WHO   MASK LEVEL    OPTION                          VALUE                                    RO
  mgr      advanced mgr/dashboard/GRAFANA_API_URL   https://grafana-bavel.ciberterminal.net/ *
  mgr      advanced mgr/prometheus/scrape_interval  15                                       *
  mgr      advanced mgr/prometheus/server_addr      0.0.0.0                                  *
  mgr      advanced mgr/prometheus/server_port      9283                                     *
bvmlm-osm-001 /home/bofher # ceph mgr services
{
    "dashboard": "https://bvmlm-osm-002.ciberterminal.net:8443/",
    "prometheus": "http://bvmlm-osm-002.ciberterminal.net:9283/"
}
</code>
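To confirm that the module really serves metrics, the URL reported by `ceph mgr services` can be pulled out and queried. The following is a minimal sketch: `prometheus_url_from_services` is a hypothetical helper name, and it assumes the JSON layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the original how-to):
# reads the JSON printed by `ceph mgr services` on stdin and prints the
# value of the "prometheus" key, without requiring jq.
prometheus_url_from_services() {
    sed -n 's/.*"prometheus"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example on an admin node (the reported URL already ends in "/"):
#   url="$(ceph mgr services | prometheus_url_from_services)"
#   curl -s "${url}metrics" | grep -c '^ceph_'
```

A non-zero count of `ceph_` lines suggests the exporter is answering on the active manager.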
haproxy configuration so that it magically balances to the working monitor server running the dashboard & prometheus modules:
<code yaml>
# Frontend for the prometheus scraper
frontend httpweb *:9283
    mode http
    default_backend ceph_prometheus

backend ceph_prometheus
    mode http
    option httpchk GET /
    http-check expect status 200
    server monscraper1 bvmlm-osm-001.ciberterminal.net:9283 check verify none
    server monscraper2 bvmlm-osm-002.ciberterminal.net:9283 check verify none
    server monscraper3 bvmlm-osm-003.ciberterminal.net:9283 check verify none
    server monscraper4 bvmlm-osm-004.ciberterminal.net:9283 check verify none
    server monscraper5 bvmlm-osm-005.ciberterminal.net:9283 check verify none
</code>
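The health check in the haproxy backend can be reproduced by hand when debugging which monitor gets the traffic. The sketch below is only an illustration: `http_status` and `healthy_backends` are hypothetical helper names, and it assumes `curl` is installed on the machine running it.

```shell
#!/bin/sh
# Hypothetical helpers (assumptions, not from the original how-to).

# Prints the HTTP status code returned for a URL (000 when unreachable).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 3 "$1"
}

# Prints every URL that answers 200, the same condition as
# "http-check expect status 200" in the haproxy backend.
healthy_backends() {
    for url in "$@"; do
        if [ "$(http_status "$url")" = "200" ]; then
            echo "$url"
        fi
    done
}

# Example:
#   healthy_backends http://bvmlm-osm-001.ciberterminal.net:9283/ \
#                    http://bvmlm-osm-002.ciberterminal.net:9283/
```

Keeping the status lookup in its own function makes it easy to swap `curl` for another client.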
Go and restart prometheus to begin scraping ceph:
<code bash>
systemctl restart prometheus
systemctl status prometheus
</code>

Check the targets on prometheus: http://avmlm-prom-001:9090/targets (change the prometheus server…)

== Need more instructions? RTFM! ==

====== For NX nodes (nginx) ======

Add firewall rules:
<code bash>
firewall-cmd --permanent --zone=public --add-rich-rule='rule family=ipv4 source address=10.40.3.64/32 port port=9100 protocol=tcp accept'
firewall-cmd --zone=public --add-rich-rule='rule family=ipv4 source address=10.40.3.64/32 port port=9100 protocol=tcp accept'
</code>

====== Final thoughts ======

linux/ceph/howtos/ceph_grafana_prometheus.txt · Last modified: 2022/02/11 11:36 by 127.0.0.1