6 changes: 2 additions & 4 deletions README.md
@@ -4,7 +4,7 @@

## About this project

A set of Plutono and Perses dashboards and Prometheus alerting rules combined with playbooks to ensure effective operations of Kubernetes.
A set of Perses dashboards and Prometheus alerting rules combined with playbooks to ensure effective operations of Kubernetes.

# Content

@@ -23,8 +23,6 @@ kubernetes-operations
├── alerts Prometheus alerts for kubernetes.
├── dashboards Plutono dashboards for visualizing key metrics.
├── perses-dashboards Perses dashboards for visualizing key metrics.
└── Chart.yaml Helm chart manifest.
@@ -40,14 +38,14 @@ The content of the repository can be installed independently or as part of the [
|-----|------|---------|-------------|
| dashboards.create | bool | `true` | Enables ConfigMap resources with dashboards to be created |
| dashboards.persesSelectors | list | `[{"name":"perses.dev/resource","value":"\"true\""}]` | Label selectors for the Perses dashboards to be picked up by Perses. |
| dashboards.plutonoSelectors | list | `[{"name":"plutono-dashboard","value":"\"true\""}]` | Label selectors for the Plutono dashboards to be picked up by Plutono. |
| global.commonLabels | object | `{}` | Common labels to add to all resources # |
| prometheusRules.NodeInMaintenance | object | `{"label":"maintenance_state","value":"in-maintenance"}` | The label value pair that marks a Kubernetes node as 'in maintenance' |
| prometheusRules.additionalRuleAnnotations | object | `{}` | Additional annotations for PrometheusRule alerts |
| prometheusRules.additionalRuleLabels | string | `nil` | Additional labels for PrometheusRule alerts # This is useful for adding additional labels such as "support_group" or "service" for the routing of alerts to each rule |
| prometheusRules.annotations | object | `{}` | Annotations for PrometheusRules |
| prometheusRules.create | bool | `true` | Enables PrometheusRule resources to be created |
| prometheusRules.disabled | object | `{}` | Disabled PrometheusRule alerts |
| prometheusRules.kubeLabels | list | `[]` | Enrich pod- and deployment-level alert expressions with labels from kube_pod_labels / kube_deployment_labels. Provide a list of kube-state-metrics label names to include in group_left(). Affects: KubernetesPodRestartingTooMuch, KubePodNotReady (join on pod+namespace), KubernetesDeploymentReplicasMismatch (join on namespace+deployment). |
| prometheusRules.labels | object | `{}` | Labels for PrometheusRules |
| prometheusRules.ruleSelectors | string | `nil` | Label selectors for the Prometheus rules to be picked up by Prometheus. |
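
As a minimal sketch of how the `prometheusRules.kubeLabels` value described in the table might be set, the values override below lists two label names. The names `label_team` and `label_app_kubernetes_io_name` are hypothetical and assume that kube-state-metrics is configured to expose those labels on `kube_pod_labels` and `kube_deployment_labels`. The per-alert `for` and `severity` keys follow the `dig` lookups used in the alert templates of this chart.

```yaml
# Illustrative values override; not part of this change.
prometheusRules:
  create: true
  # Hypothetical label names: replace them with labels that your
  # kube-state-metrics instance actually exports.
  kubeLabels:
    - label_team
    - label_app_kubernetes_io_name
  # Optional per-alert overrides, read by the templates via `dig`.
  KubernetesPodRestartingTooMuch:
    for: 30m
    severity: critical
```

Leaving `kubeLabels` at its default of `[]` is expected to keep the alert expressions without any additional label join.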

4 changes: 1 addition & 3 deletions README.md.gotmpl
@@ -4,7 +4,7 @@

## About this project

A set of Plutono and Perses dashboards and Prometheus alerting rules combined with playbooks to ensure effective operations of Kubernetes.
A set of Perses dashboards and Prometheus alerting rules combined with playbooks to ensure effective operations of Kubernetes.

# Content

@@ -23,8 +23,6 @@ kubernetes-operations
├── alerts Prometheus alerts for kubernetes.
├── dashboards Plutono dashboards for visualizing key metrics.
├── perses-dashboards Perses dashboards for visualizing key metrics.
└── Chart.yaml Helm chart manifest.
7 changes: 3 additions & 4 deletions charts/kubernetes-operations/Chart.yaml
@@ -3,15 +3,14 @@

apiVersion: v2
name: kubernetes-operations
version: 1.2.11
description: A set of Plutono dashboards and Prometheus alerting rules combined with playbooks to ensure effective operations of Kubernetes.
version: 1.3.0
description: A set of Perses dashboards and Prometheus alerting rules combined with playbooks to ensure effective operations of Kubernetes.
maintainers:
  - name: richardtief
    email: richard.tief@sap.com
  - name: trouaux
keywords:
  - Helm Chart
  - Kubernetes operations
  - Plutono Dashboards
  - Perses Dashboards
  - Prometheus Alerting
  - Alert Rules
@@ -56,7 +56,9 @@ groups:

{{- if not (.Values.prometheusRules.disabled.KubernetesPodRestartingTooMuch | default false) }}
- alert: KubernetesPodRestartingTooMuch
expr: sum by(pod, namespace, container) (rate(kube_pod_container_status_restarts_total[15m])) > 0
expr: |
sum by(pod, namespace, container) (rate(kube_pod_container_status_restarts_total[15m]))
{{- include "kubernetes-operations.kubePodLabelsJoin" . }} > 0
for: {{ dig "KubernetesPodRestartingTooMuch" "for" "1h" .Values.prometheusRules }}
labels:
severity: {{ dig "KubernetesPodRestartingTooMuch" "severity" "warning" .Values.prometheusRules }}
@@ -93,6 +95,7 @@ groups:
==
0
)
{{- include "kubernetes-operations.kubeDeploymentLabelsJoin" . }}
for: {{ dig "KubernetesDeploymentReplicasMismatch" "for" "10m" .Values.prometheusRules }}
labels:
severity: {{ dig "KubernetesDeploymentReplicasMismatch" "severity" "warning" .Values.prometheusRules }}
@@ -116,6 +119,7 @@ groups:
* on(node) group_left()
kube_node_status_condition{condition="Ready",status="true"}==1
)
{{- include "kubernetes-operations.kubePodLabelsJoin" . }}
for: {{ dig "KubePodNotReady" "for" "30m" .Values.prometheusRules }}
labels:
severity: {{ dig "KubePodNotReady" "severity" "warning" .Values.prometheusRules }}
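
To illustrate what the `kubePodLabelsJoin` include is meant to achieve, the fragment below sketches how `KubernetesPodRestartingTooMuch` might render with `kubeLabels: [label_team]`. This is an assumption-laden sketch: the helper definitions are not part of the visible diff, so the join clause is inferred from the README description (join on pod+namespace, label names passed to `group_left()`), and `label_team` is a hypothetical label name.

```yaml
# Assumed rendering; the helper's real output may differ.
- alert: KubernetesPodRestartingTooMuch
  expr: |
    sum by(pod, namespace, container) (rate(kube_pod_container_status_restarts_total[15m]))
    * on(pod, namespace) group_left(label_team) kube_pod_labels > 0
  for: 1h              # default taken from the `dig` call in the template
  labels:
    severity: warning  # default taken from the `dig` call in the template
```

Because `kube_pod_labels` carries the value 1, multiplying by it would leave the restart rate unchanged while copying the listed labels onto the alert series.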