Add MegaCLI collector #18

discordianfish · 2014-07-08T14:40:32Z

This collector exports the following metrics:

raid_drive_temperature: drive temperature
raid_drive_count: drive error and event counters
raid_adapter_disk_presence: disk presence per adapter

I still have to see if everything is working as expected, but feel free to review already :)

juliusv · 2014-07-08T14:56:31Z

collector/megacli.go

Fahrenheit, Celsius, Kelvin? ;) Include unit suffix please.

I knew it! :)
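
To illustrate the unit-suffix request above, here is a minimal sketch rather than the code from this PR. It assumes MegaCLI prints a line of the form `Drive Temperature :36C (96.80 F)`; the function name `parseDriveTemperatureCelsius` and the metric name `raid_drive_temperature_celsius` are hypothetical and chosen only to show the unit in the name.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// driveTempRe matches a line such as "Drive Temperature :36C (96.80 F)".
// The exact line format is an assumption about MegaCLI's output.
var driveTempRe = regexp.MustCompile(`^Drive Temperature\s*:\s*(\d+)C`)

// parseDriveTemperatureCelsius extracts the Celsius reading from one line
// and returns an error if the line does not carry a temperature.
func parseDriveTemperatureCelsius(line string) (float64, error) {
	m := driveTempRe.FindStringSubmatch(line)
	if m == nil {
		return 0, fmt.Errorf("no drive temperature in %q", line)
	}
	return strconv.ParseFloat(m[1], 64)
}

func main() {
	c, err := parseDriveTemperatureCelsius("Drive Temperature :36C (96.80 F)")
	if err != nil {
		panic(err)
	}
	// Hypothetical metric name with an explicit unit suffix.
	fmt.Printf("raid_drive_temperature_celsius %g\n", c)
}
```

With the unit in the metric name, nobody reading a graph has to guess between Fahrenheit, Celsius, and Kelvin.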

juliusv · 2014-07-08T15:02:45Z

👍 otherwise, though I admit I didn't look too closely since it seems like quite a specialized collector module :)

discordianfish · 2014-07-08T16:58:35Z

@juliusv Well, it's not a beauty: the megacli output is really, really ugly, but it's the only way to get RAID stats for the most common hardware RAID controllers. These are the same RAID controllers you guys are using, by the way, so it could come in handy for you as well.

beorn7 · 2014-07-09T12:44:41Z

collector/megacli.go

You want CounterOpts. (Sorry, other way round, updated my previous comment.)

CounterOpts instead of GaugeOpts... (didn't it get my update?)
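
As background for the counter-versus-gauge remark above: error and event counts only ever grow, which is what makes them counters. The following is a stdlib-only sketch, not code from this PR, of how a consumer might compute the growth of such a value across scrapes. The function `increase` and the sample data are hypothetical, and the reset handling is a simplification of what a monitoring system does.

```go
package main

import "fmt"

// increase returns how much a monotonically increasing counter grew across
// the samples. A drop between two samples is treated as a counter reset
// (for example after a restart), so the new value counts in full.
func increase(samples []float64) float64 {
	var total float64
	for i := 1; i < len(samples); i++ {
		if samples[i] >= samples[i-1] {
			total += samples[i] - samples[i-1]
		} else {
			total += samples[i]
		}
	}
	return total
}

func main() {
	// Hypothetical media error counts for one drive over four scrapes.
	mediaErrors := []float64{2, 2, 5, 1}
	fmt.Println(increase(mediaErrors))
}
```

A gauge such as a temperature has no such guarantee, so a drop in its value is an ordinary reading and not a reset.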

beorn7 · 2014-07-09T12:48:11Z

@juliusv is confident this is good to go. I just discovered the small inconsistency above.


beorn7 · 2014-07-09T12:56:48Z

👍


juliusv reviewed Jul 8, 2014

beorn7 reviewed Jul 9, 2014

Add MegaCLI collector

f47abc5

This collector exports the following metrics:

raid_drive_temperature: drive temperature
raid_drive_count: drive error and event counters
raid_adapter_disk_presence: disk presence per adapter

discordianfish added a commit that referenced this pull request Jul 9, 2014

Merge pull request #18 from prometheus/add-megaraid-metrics

50c6691

Add MegaCLI collector

discordianfish merged commit 50c6691 into master Jul 9, 2014

discordianfish deleted the add-megaraid-metrics branch July 9, 2014 12:56

pgier pushed a commit to pgier/node_exporter that referenced this pull request Jan 15, 2019

Merge pull request prometheus#18 from simonpasquier/fix-ocp-builds-fo…

45b0f5c

…r-promu Install promu package for OCP multistage builds

Avimitin mentioned this pull request Feb 25, 2022

Fail to test with version 1.3.1 under RISC-V #2296

Open

xuyixin1996 mentioned this pull request Jun 2, 2022

[ipvs collector] Propose to disable by default due to performance concern #2388

Closed

philipgough pushed a commit to philipgough/node_exporter that referenced this pull request Jan 8, 2025

updated component version to 2.7.0 (prometheus#18)

27030e2

Signed-off-by: dislbenn <[email protected]> Signed-off-by: dislbenn <[email protected]>
