0%
监控Flink任务情况
下载软件
1 2 3 4
| grafana node_exporter prometheus pushgateway
|
Flink方面修改
1 2 3 4 5 6 7 8
| # 复制opt/flink-metrics-prometheus到lib目录 # 修改conf/flink-conf.yaml metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter metrics.reporter.promgateway.host: hadoop01 metrics.reporter.promgateway.port: 9091 metrics.reporter.promgateway.jobName: myJob metrics.reporter.promgateway.randomJobNameSuffix: true metrics.reporter.promgateway.deleteOnShutdown: false
|
prometheus.yml
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
| scrape_configs: - job_name: 'prometheus' static_configs: - targets: ['hadoop01:9090'] labels: instance: 'prometheus' - job_name: 'linux' static_configs: - targets: ['hadoop01:9100'] labels: instance: 'hadoop01' - job_name: 'pushgateway' static_configs: - targets: ['hadoop01:9091'] labels: instance: 'pushgateway'
|
启动
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
| # pushgateway ./pushgateway & ./pushgateway --web.enable-lifecycle --web.enable-admin-api &
# node_exporter ./node_exporter &
# prometheus ./prometheus --config.file=./conf/prometheus.yml &
# grafana ./bin/grafana-server web & username/password: admin/admin
# 注意版本问题
|
针对pushgateway的优化
1 2 3 4
| # 版本为1.0.1,低版本并不会主动去清除group信息 # 哪怕是很久没有进行push数据了,也不会清除 # 需要自己写脚本定时去清除所有的group信息 curl -X PUT http://hadoop01:9091/api/v1/admin/wipe
|
图表的设置