Sunday, August 24, 2014

Alert definition templates in plugin (descriptor)s?

Hey,

This is a more general question and not tied to a specific version of RHQ (but it may become part of a future version of RHQ and/or RHQ-alerts).

Do you feel it would make sense to provide alert templates inside the plugin (descriptor), so that the plugin writer can pre-define alert definitions for certain resource types / metrics? Plugin writers know best which alert definitions and conditions make sense, so they should be able to pre-define them.

This idea would probably work best with relative metrics such as "disk is 90% full", as opposed to absolute values, which probably depend a lot more on the concrete customer scenario (e.g. alerting on heap usage over 64MB may be good for small installations, but not for large ones).
In the future with RHQ-alerts, it should also be possible to compare two metrics with each other, which will allow you to say "if metric(disk usage) > 90% of metric(disk capacity) then ...".
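
To make the relative-versus-absolute distinction a bit more concrete, here is a minimal Python sketch. It is not RHQ or RHQ-alerts code: the `AlertTemplate` class, the `ratio_above` and `value_above` helpers, the resource type names and the metric keys are all hypothetical and only illustrate how a pre-defined condition that compares two metrics might look.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

# Hypothetical types for illustration only; none of this is RHQ or RHQ-alerts API.
Metrics = Mapping[str, float]
Condition = Callable[[Metrics], bool]


@dataclass(frozen=True)
class AlertTemplate:
    """An alert definition a plugin writer could ship for a resource type."""
    name: str
    resource_type: str
    condition: Condition

    def fires(self, metrics: Metrics) -> bool:
        return self.condition(metrics)


def ratio_above(numerator: str, denominator: str, threshold: float) -> Condition:
    """Relative condition: compares two metrics with each other."""
    def check(metrics: Metrics) -> bool:
        total = metrics[denominator]
        # Never fire (and never divide by zero) when the capacity is 0.
        return total > 0 and metrics[numerator] / total > threshold
    return check


def value_above(metric: str, limit: float) -> Condition:
    """Absolute condition: compares one metric with a fixed value."""
    return lambda metrics: metrics[metric] > limit


# Relative template: "metric(disk usage) > 90% of metric(disk capacity)".
disk_almost_full = AlertTemplate(
    name="Disk almost full",
    resource_type="File System",
    condition=ratio_above("disk_used_bytes", "disk_capacity_bytes", 0.90),
)

# Absolute template: a fixed 64MB limit that only suits small installations.
heap_over_64mb = AlertTemplate(
    name="Heap above 64MB",
    resource_type="JVM",
    condition=value_above("heap_used_bytes", 64 * 1024 * 1024),
)

small_disk = {"disk_used_bytes": 95e9, "disk_capacity_bytes": 100e9}
large_disk = {"disk_used_bytes": 9.5e12, "disk_capacity_bytes": 10e12}
print(disk_almost_full.fires(small_disk), disk_almost_full.fires(large_disk))
```

In this sketch the relative template behaves the same for the 100GB disk and the 10TB disk, whereas the fixed 64MB heap limit would fire on practically every large installation. That is why relative conditions seem like the better candidates for shipping inside a plugin.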

I've scribbled the idea down on the Wintermute page on the RHQ wiki.

If you think this is useful, please respond here or on the wiki page. Ideally, add a specific example.

1 comment:

Elias Ross said...

First of all, predefined alert definitions are a great idea.

But there are a lot of questions. I'm not sure how alert notifications should work by default, but here's what I do in most cases:
1. Use a custom alerting configuration called 'tagged alerting': you tag a resource group with an email address or name.
2. Or just do nothing, because our dashboard captures it.

In general, alerting is a hassle to copy between test and production environments, and predefined alert definitions would be a way to do that better.