
June 7, 2024


Grafana Provisioned Alerting for Effective Observability

Implementing a consistent and reliable alerting system across a sprawling organization is a significant challenge for just about any engineering team. For example, diverse infrastructures across different teams and numerous team-specific customizations may not translate well when investigating specific incidents. Inconsistent alerting practices can eventually lead to alert fatigue, as teams are flooded with alerts that are neither relevant nor actionable. These issues can even render the entire alerting system ineffective.

However, Grafana’s Provisioned Alerting feature can be an effective way to address these issues. This feature allows you to systematically manage your alerting components by providing modularity for each alerting component. It also enables the importing and exporting of custom alert rules, contact points, notification policies, mute timings, and templates across different Grafana instances. This flexibility allows large organizations to quickly set up alerting resources that are common across different teams, thereby saving time and reducing the margin of human error inherently involved in the process.

Here are three ways you can import alerting resources into your Grafana instance:

The following is an example of how you can use the alert provisioning API to create an alert rule:

Make an API call to create an alert, with the alert rule in the request body. The JSON below is an example of a Host Out of Memory (OOM) alert rule. Note that it cannot be used as-is against your Grafana instance, because it assumes you have a folder with the UID OpsVerse-Alerts and a Prometheus data source with the UID metrics. Edit the JSON to match your setup.

{
  "orgID": 1,
  "folderUID": "OpsVerse-Alerts",
  "ruleGroup": "host-alerts",
  "title": "HostOutOfMemoryAlert",
  "condition": "C",
  "data": [
    {
      "refId": "A",
      "queryType": "",
      "relativeTimeRange": { "from": 600, "to": 0 },
      "datasourceUid": "metrics",
      "model": {
        "editorMode": "code",
        "expr": "node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100",
        "hide": false,
        "intervalMs": 1000,
        "legendFormat": "__auto",
        "maxDataPoints": 43200,
        "range": true,
        "refId": "A"
      }
    },
    {
      "refId": "B",
      "queryType": "",
      "relativeTimeRange": { "from": 600, "to": 0 },
      "datasourceUid": "__expr__",
      "model": {
        "conditions": [
          {
            "evaluator": { "params": [], "type": "gt" },
            "operator": { "type": "and" },
            "query": { "params": [ "B" ] },
            "reducer": { "params": [], "type": "last" },
            "type": "query"
          }
        ],
        "datasource": { "type": "__expr__", "uid": "__expr__" },
        "expression": "A",
        "hide": false,
        "intervalMs": 1000,
        "maxDataPoints": 43200,
        "reducer": "last",
        "refId": "B",
        "settings": { "mode": "" },
        "type": "reduce"
      }
    },
    {
      "refId": "C",
      "queryType": "",
      "relativeTimeRange": { "from": 600, "to": 0 },
      "datasourceUid": "__expr__",
      "model": {
        "conditions": [
          {
            "evaluator": { "params": [ 5 ], "type": "lt" },
            "operator": { "type": "and" },
            "query": { "params": [ "C" ] },
            "reducer": { "params": [], "type": "last" },
            "type": "query"
          }
        ],
        "datasource": { "type": "__expr__", "uid": "__expr__" },
        "expression": "B",
        "hide": false,
        "intervalMs": 1000,
        "maxDataPoints": 43200,
        "refId": "C",
        "type": "threshold"
      }
    }
  ],
  "noDataState": "OK",
  "execErrState": "Error",
  "for": "2m",
  "annotations": {
    "description": "Host out of memory (instance {{ $labels.node }})\nMemory Left : {{ $values.B }} %",
    "summary": "Node memory is filling up (< 5% left)"
  },
  "labels": {
    "alerttype": "opsverse",
    "severity": "warning"
  },
  "isPaused": false,
  "notification_settings": null
}
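The API call itself could look something like the following minimal Python sketch. It assumes the rule JSON above has been loaded into a dictionary, that your Grafana version exposes the alert provisioning endpoint at /api/v1/provisioning/alert-rules, and that you authenticate with a service account token. The function names, the base URL, and the token are placeholders for illustration, not values from this post.

```python
import json
import urllib.request

# Alert provisioning endpoint for alert rules (check the API docs for your Grafana version).
PROVISIONING_PATH = "/api/v1/provisioning/alert-rules"


def build_create_rule_request(base_url, token, rule):
    """Build (but do not send) the POST request that creates one alert rule.

    base_url and token are placeholders: your Grafana URL and a service
    account token with permission to write alert rules.
    """
    body = json.dumps(rule).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + PROVISIONING_PATH,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def create_rule(base_url, token, rule):
    """Send the request and return the rule as stored by Grafana."""
    request = build_create_rule_request(base_url, token, rule)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To use it, save the rule JSON to a file, load it with json.load, and pass the resulting dictionary to create_rule along with your own Grafana URL and token. Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body before anything reaches a live instance.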

The alert should then appear in your Grafana instance with a Provisioned label.

Alerting resources can be exported via the UI as well as the API. The UI exports alert resources in Terraform format. To export an alert in provisioning file format, use the Alerting HTTP API endpoints.
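As a rough illustration of the API export, the sketch below builds a GET request for a single alert rule. It assumes the export endpoint /api/v1/provisioning/alert-rules/{uid}/export and a format query parameter accepting yaml, json, or hcl, which recent Grafana versions document; verify both against the API reference for your version. The function name, rule UID, base URL, and token are hypothetical.

```python
import urllib.parse
import urllib.request

# Export formats assumed to be supported; confirm against your Grafana version.
SUPPORTED_FORMATS = ("yaml", "json", "hcl")


def build_export_request(base_url, token, rule_uid, fmt="yaml"):
    """Build (but do not send) a GET request that exports one alert rule."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    # Percent-encode the UID so it is safe to place in the URL path.
    uid = urllib.parse.quote(rule_uid, safe="")
    query = urllib.parse.urlencode({"format": fmt})
    url = f"{base_url.rstrip('/')}/api/v1/provisioning/alert-rules/{uid}/export?{query}"
    return urllib.request.Request(
        url=url,
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )


def export_rule(base_url, token, rule_uid, fmt="yaml"):
    """Send the request and return the exported rule as text."""
    request = build_export_request(base_url, token, rule_uid, fmt)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

The text returned by export_rule can be written to a file and checked into version control, then imported into another Grafana instance.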

To learn more about how to streamline your alerts and save valuable time, check out the links and detailed documentation we posted in this blog. Feel free to also contact our experts who can help take your alerting systems to the next level.

Written by Shivtej Narake
