Version: Next

Grafana High Availability

Grafana can be configured for high availability (HA) so that the application can remain operational if a pod, node, or backing database goes down.

Turning on high availability for Grafana changes how the application operates: it turns on unified alerting mode, in which the Grafana pods communicate with each other. Unified alerting attempts to remove duplicate alerts as much as possible, but some duplicates will still get through so that alerting still occurs if one or more pods fail. Maintainers should be proactive and set up their alerting tools to deduplicate alerts when Grafana HA is on.
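One way a downstream tool could deduplicate is to remember which alerts it has already forwarded and drop repeats that arrive shortly afterwards. The following is a minimal sketch, not part of the module: it assumes alerts arrive through a Grafana webhook contact point whose payload carries an `alerts` list with `fingerprint` and `status` fields, and `AlertDeduplicator` and its five-minute window are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta, timezone


class AlertDeduplicator:
    """Drops alerts already forwarded within a time window (hypothetical helper)."""

    def __init__(self, window=timedelta(minutes=5)):
        self.window = window
        self._seen = {}  # (fingerprint, status) -> time the alert was first forwarded

    def filter_new(self, payload, now=None):
        """Return the alerts in a webhook payload that were not seen recently."""
        now = now or datetime.now(timezone.utc)
        # Forget entries older than the window so memory stays bounded.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.window}
        fresh = []
        for alert in payload.get("alerts", []):
            # Assumption: fingerprint identifies the alert, status is firing/resolved.
            key = (alert.get("fingerprint"), alert.get("status"))
            if key in self._seen:
                continue  # likely a duplicate sent by another Grafana pod
            self._seen[key] = now
            fresh.append(alert)
        return fresh


dedup = AlertDeduplicator()
payload = {"alerts": [{"fingerprint": "abc123", "status": "firing"}]}
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(len(dedup.filter_new(payload, now=t0)))                          # 1
print(len(dedup.filter_new(payload, now=t0 + timedelta(seconds=10))))  # 0
print(len(dedup.filter_new(payload, now=t0 + timedelta(minutes=10))))  # 1
```

Keying on both fingerprint and status lets a "resolved" notification through even when the matching "firing" one was just forwarded. The in-memory dictionary only works for a single receiver process; a receiver that is itself replicated would need shared storage.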

Requirements

  • Enable the Grafana IaC module, which creates an RDS instance

Optional: turn on Grafana RDS clustering for multi-AZ RDS HA

Grafana IaC

To enable HA, turn on the grafana module, either by turning on the module directly or by setting grafana_inputs.high_availability to true.

To enable HA for just Grafana, set high_availability to true in grafana_inputs.

Optional: you can also specify multiple entries in rds_instances here for database HA.

locals {
  ...
  grafana_inputs = {
    # If enabled, this sets high availability for Grafana; the higher-level high_availability input takes priority
    high_availability = true
    rds_instances = {
      primary = {} # AZ can be set per instance
      # secondary = {}
      # additional replicas can be added (third = {})
    }
  }
  ...
}

Alternatively, turn on the module directly. This also applies the settings to Grafana in the cluster:

modules {
  ...
  grafana = true
  ...
}

Grafana HA can be enabled on existing clusters. In addition to giving Grafana a database backend, it adds settings to the Helm chart that maintainers should be aware of so that they are merged in without issue:

grafana:
  values:
    headlessService: true
    autoscaling:
      enabled: true
    podDisruptionBudget:
      apiVersion: "policy/v1"
      minAvailable: 1
    grafana.ini:
      database:
        type: postgres
        host: "${db_host}:${db_port}"
        name: grafana
        user: "${db_name}"
      unified_alerting:
        enabled: true
      alerting:
        enabled: false
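Before enabling HA on an existing cluster, it can help to check whether any custom Helm values clash with the settings listed above. The sketch below is illustrative only: `find_conflicts` is a hypothetical helper, `HA_VALUES` is a hand-copied subset of the keys under `values` (the templated database fields are left out), and how the module actually merges overrides is not specified here.

```python
# Subset of the HA-managed values shown above, copied by hand (assumption:
# these sit under grafana.values; templated database fields are omitted).
HA_VALUES = {
    "headlessService": True,
    "autoscaling": {"enabled": True},
    "podDisruptionBudget": {"apiVersion": "policy/v1", "minAvailable": 1},
    "grafana.ini": {
        "database": {"type": "postgres"},
        "unified_alerting": {"enabled": True},
        "alerting": {"enabled": False},
    },
}


def find_conflicts(managed, overrides, path=""):
    """Return '/'-separated paths where overrides differ from managed values."""
    conflicts = []
    for key, value in overrides.items():
        if key not in managed:
            continue  # not managed by the HA setup, so it cannot clash
        here = f"{path}/{key}" if path else key
        if isinstance(value, dict) and isinstance(managed[key], dict):
            # Both sides are mappings: compare them key by key.
            conflicts += find_conflicts(managed[key], value, here)
        elif value != managed[key]:
            conflicts.append(here)
    return conflicts


# Hypothetical existing overrides maintained for a cluster.
custom = {
    "autoscaling": {"enabled": False, "maxReplicas": 4},
    "grafana.ini": {
        "alerting": {"enabled": True},
        "server": {"domain": "example.com"},
    },
}
print(find_conflicts(HA_VALUES, custom))
# ['autoscaling/enabled', 'grafana.ini/alerting/enabled']
```

Paths use `/` as the separator because `grafana.ini` itself contains a dot. Keys that only appear in the custom values, such as `maxReplicas`, are ignored since they do not touch anything HA sets.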