🕵️ Grafana with logs and traces
Estimated time to read: 3 minutes
When talking about monitoring, we usually distinguish three types of data:
- Metrics
- Logs
- Traces
We have already seen how to display metrics in Grafana using Prometheus, but Grafana can also display logs and traces.
Adding the Loki and Tempo data sources
To do this, we will configure Grafana so that it can connect to Loki and Tempo.
Objectives:
- If not already done, the data source for Prometheus (here)
- A data source to connect to the Loki instance (the storage for our logs)
  - URL: http://loki:3100
- A data source to connect to the Tempo instance (the storage for traces)
  - URL: http://tempo:3200
  - Trace to metrics: "prometheus"
  - Trace to logs: "loki"
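The same settings can also be expressed as data rather than clicked through the UI. The sketch below only builds the request bodies one might send to Grafana's `POST /api/datasources` endpoint; it makes no network call. The `uid` values, the `access: proxy` mode, and the assumption that the Prometheus data source has the UID `prometheus` are illustrative choices, not taken from this workshop's setup. The `tracesToLogsV2` and `tracesToMetrics` keys are the Tempo data source settings behind the "Trace to logs" and "Trace to metrics" sections.

```python
import json


def datasource_payloads():
    """Build example request bodies for the Loki and Tempo data sources."""
    loki = {
        "name": "loki",
        "uid": "loki",  # assumption: UID chosen for this example
        "type": "loki",
        "access": "proxy",
        "url": "http://loki:3100",
    }
    tempo = {
        "name": "tempo",
        "uid": "tempo",  # assumption: UID chosen for this example
        "type": "tempo",
        "access": "proxy",
        "url": "http://tempo:3200",
        "jsonData": {
            # Trace to logs: jump from a span to the matching Loki logs.
            "tracesToLogsV2": {"datasourceUid": "loki"},
            # Trace to metrics: jump from a span to Prometheus queries.
            # Assumption: the Prometheus data source UID is "prometheus".
            "tracesToMetrics": {"datasourceUid": "prometheus"},
        },
    }
    return [loki, tempo]


if __name__ == "__main__":
    for payload in datasource_payloads():
        print(json.dumps(payload, indent=2))
```

Configuring the data sources in the Grafana UI, as described above, gives the same result; the payload form is mostly useful when the setup has to be repeated.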
Here, logs are handled by Grafana Loki and traces by Grafana Tempo. We will import the following dashboard to work with logs and traces.
Observability Sample.json
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"label": "prometheus",
"description": "",
"type": "datasource",
"pluginId": "prometheus",
"pluginName": "Prometheus"
},
{
"name": "DS_LOKI",
"label": "loki",
"description": "",
"type": "datasource",
"pluginId": "loki",
"pluginName": "Loki"
},
{
"name": "DS_TEMPO",
"label": "tempo",
"description": "",
"type": "datasource",
"pluginId": "tempo",
"pluginName": "Tempo"
}
],
"__elements": {},
"__requires": [
{
"type": "panel",
"id": "bargauge",
"name": "Bar gauge",
"version": ""
},
{
"type": "grafana",
"id": "grafana",
"name": "Grafana",
"version": "11.6.0"
},
{
"type": "panel",
"id": "logs",
"name": "Logs",
"version": ""
},
{
"type": "datasource",
"id": "loki",
"name": "Loki",
"version": "1.0.0"
},
{
"type": "datasource",
"id": "prometheus",
"name": "Prometheus",
"version": "1.0.0"
},
{
"type": "panel",
"id": "stat",
"name": "Stat",
"version": ""
},
{
"type": "panel",
"id": "table",
"name": "Table",
"version": ""
},
{
"type": "datasource",
"id": "tempo",
"name": "Tempo",
"version": "11.6.0-pre"
}
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"description": "Observe Spring Boot application with three pillars of observability: Traces (Tempo), Metrics (Prometheus), Logs (Loki) on Grafana through OpenTelemetry and OpenMetrics.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [],
"panels": [
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 28,
"panels": [],
"title": "Metrics",
"type": "row"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 3,
"w": 4,
"x": 0,
"y": 1
},
"id": 4,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"percentChangeColorMode": "standard",
"reduceOptions": {
"calcs": [
"range"
],
"fields": "",
"values": false
},
"showPercentChange": false,
"textMode": "auto",
"wideLayout": true
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": true,
"expr": "sum(http_server_request_duration_seconds_bucket{service_name=\"$app_name\", http_status_code=~\".*\", http_route!~\"$url_filter_regex\"})",
"interval": "",
"legendFormat": "{{span_kind}}",
"range": true,
"refId": "A"
}
],
"title": "Total Requests",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"mappings": [],
"noValue": "0",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 3,
"w": 3,
"x": 4,
"y": 1
},
"id": 22,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"percentChangeColorMode": "standard",
"reduceOptions": {
"calcs": [
"range"
],
"fields": "",
"values": false
},
"showPercentChange": false,
"textMode": "auto",
"wideLayout": true
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": true,
"expr": "sum(http_server_request_duration_seconds_bucket{service_name=\"$app_name\", http_response_status_code=~\"2.*\", http_route!~\"$url_filter_regex\"})",
"interval": "",
"legendFormat": "",
"range": true,
"refId": "A"
}
],
"title": "Total 2XX",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"mappings": [],
"noValue": "0",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 3,
"w": 3,
"x": 7,
"y": 1
},
"id": 31,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"percentChangeColorMode": "standard",
"reduceOptions": {
"calcs": [
"range"
],
"fields": "",
"values": false
},
"showPercentChange": false,
"textMode": "auto",
"wideLayout": true
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": true,
"expr": "sum(http_server_request_duration_seconds_bucket{service_name=\"$app_name\", http_response_status_code=~\"3.*\", http_route!~\"$url_filter_regex\"})",
"interval": "",
"legendFormat": "",
"range": true,
"refId": "A"
}
],
"title": "Total Exceptions ( 3XX )",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"mappings": [],
"noValue": "0",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 3,
"w": 3,
"x": 10,
"y": 1
},
"id": 32,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"percentChangeColorMode": "standard",
"reduceOptions": {
"calcs": [
"range"
],
"fields": "",
"values": false
},
"showPercentChange": false,
"textMode": "auto",
"wideLayout": true
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": true,
"expr": "sum(http_server_request_duration_seconds_bucket{service_name=\"$app_name\", http_response_status_code=~\"4.*\", http_route!~\"$url_filter_regex\"})",
"interval": "",
"legendFormat": "",
"range": true,
"refId": "A"
}
],
"title": "Total Exceptions ( 4XX )",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"mappings": [],
"noValue": "0",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 3,
"w": 3,
"x": 13,
"y": 1
},
"id": 33,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"percentChangeColorMode": "standard",
"reduceOptions": {
"calcs": [
"range"
],
"fields": "",
"values": false
},
"showPercentChange": false,
"textMode": "auto",
"wideLayout": true
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": true,
"expr": "sum(duration_count{service_name=\"$app_name\", http_response_status_code=~\"5.*\", http_route!~\"$url_filter_regex\"})",
"interval": "",
"legendFormat": "",
"range": true,
"refId": "A"
}
],
"title": "Total Exceptions ( 5XX )",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "continuous-GrYlRd"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "ms"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 4
},
"id": 6,
"options": {
"displayMode": "lcd",
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": false
},
"maxVizHeight": 300,
"minVizHeight": 10,
"minVizWidth": 0,
"namePlacement": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"range"
],
"fields": "",
"values": false
},
"showUnfilled": true,
"sizing": "auto",
"valueMode": "color"
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": true,
"expr": "sum by(http_route)(http_server_request_duration_seconds_sum{service_name=\"$app_name\", http_route!~\"$url_filter_regex\"}) / \r\nsum by(http_route)(http_server_request_duration_seconds_count{service_name=\"$app_name\", http_route!~\"$url_filter_regex\"})",
"interval": "",
"legendFormat": "{{ span_kind}}",
"range": true,
"refId": "A"
}
],
"title": "Requests Average Duration",
"type": "bargauge"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "blue",
"mode": "fixed"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 4
},
"id": 37,
"options": {
"displayMode": "lcd",
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": false
},
"maxVizHeight": 300,
"minVizHeight": 10,
"minVizWidth": 0,
"namePlacement": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"range"
],
"fields": "",
"values": false
},
"showUnfilled": true,
"sizing": "auto",
"valueMode": "color"
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": true,
"expr": "http_server_request_duration_seconds_count{service_name=\"$app_name\", http_route!~\"$url_filter_regex\"}",
"interval": "",
"legendFormat": "{{ http_route }}",
"range": true,
"refId": "A"
}
],
"title": "Number of requests",
"type": "bargauge"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "blue",
"mode": "fixed"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 13
},
"id": 24,
"options": {
"displayMode": "basic",
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": false
},
"maxVizHeight": 300,
"minVizHeight": 10,
"minVizWidth": 0,
"namePlacement": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"range"
],
"fields": "",
"values": false
},
"showUnfilled": true,
"sizing": "auto",
"valueMode": "color"
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": true,
"expr": "http_server_request_duration_seconds_sum{service_name=\"$app_name\", http_response_status_code=~\"2.*\", http_route!~\"$url_filter_regex\"}",
"interval": "",
"legendFormat": "{{http_route}}",
"range": true,
"refId": "A"
}
],
"title": "Number of 2xx Requests",
"type": "bargauge"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "red",
"mode": "fixed"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 13
},
"id": 36,
"options": {
"displayMode": "basic",
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": false
},
"maxVizHeight": 300,
"minVizHeight": 10,
"minVizWidth": 0,
"namePlacement": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"range"
],
"fields": "",
"values": false
},
"showUnfilled": true,
"sizing": "auto",
"valueMode": "color"
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": true,
"expr": "http_server_request_duration_seconds_sum{service_name=\"$app_name\", http_response_status_code!~\"2.*\", http_route!~\"$url_filter_regex\"}",
"interval": "",
"legendFormat": "{{http_route}}",
"range": true,
"refId": "A"
}
],
"title": "Number of none 2xx Requests",
"type": "bargauge"
},
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 22
},
"id": 30,
"panels": [],
"title": "Logs / Traces",
"type": "row"
},
{
"datasource": {
"type": "loki",
"uid": "${DS_LOKI}"
},
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 23
},
"id": 2,
"options": {
"dedupStrategy": "exact",
"enableInfiniteScrolling": false,
"enableLogDetails": true,
"prettifyLogMessage": false,
"showCommonLabels": false,
"showLabels": false,
"showTime": true,
"sortOrder": "Descending",
"wrapLogMessage": false
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "loki",
"uid": "${DS_LOKI}"
},
"editorMode": "code",
"expr": "{application=\"$app_name\"} | json | line_format \"{{.severity}}\\t{{.body}}\"",
"hide": false,
"queryType": "range",
"refId": "A"
}
],
"title": "Log of All Spring Boot Apps",
"type": "logs"
},
{
"datasource": {
"type": "tempo",
"uid": "${DS_TEMPO}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"cellOptions": {
"type": "auto"
},
"inspect": false
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 26,
"w": 24,
"x": 0,
"y": 31
},
"id": 26,
"options": {
"cellHeight": "sm",
"footer": {
"countRows": false,
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"pluginVersion": "11.6.0",
"targets": [
{
"datasource": {
"type": "tempo",
"uid": "${DS_TEMPO}"
},
"filters": [
{
"id": "5578df66",
"operator": "=",
"scope": "span",
"value": [],
"valueType": "int"
},
{
"id": "span-name",
"operator": "=",
"scope": "span",
"tag": "name",
"value": [],
"valueType": "string"
},
{
"id": "service-name",
"operator": "=",
"scope": "resource",
"tag": "service.name",
"value": [
"$app_name"
],
"valueType": "string"
}
],
"groupBy": [
{
"id": "8dcc5ceb",
"scope": "span"
}
],
"hide": false,
"limit": 20,
"query": "$trace_id",
"queryType": "traceqlSearch",
"refId": "A",
"tableType": "traces"
}
],
"title": "Traces",
"type": "table"
}
],
"refresh": "",
"schemaVersion": 41,
"tags": [],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"definition": "label_values(http_server_request_duration_seconds_sum,service_name)",
"includeAll": false,
"label": "Application Name",
"name": "app_name",
"options": [],
"query": {
"qryType": 1,
"query": "label_values(http_server_request_duration_seconds_sum,service_name)",
"refId": "PrometheusVariableQueryEditor-VariableQuery"
},
"refresh": 1,
"regex": "",
"type": "query"
},
{
"current": {
"text": "/management/health|/management/info",
"value": "/management/health|/management/info"
},
"description": "Regular expression for excluse URL to monitor",
"label": "URL Filter Regex",
"name": "url_filter_regex",
"options": [
{
"selected": true,
"text": "/management/health|/management/info",
"value": "/management/health|/management/info"
}
],
"query": "/management/health|/management/info",
"type": "textbox"
},
{
"current": {},
"datasource": {
"type": "tempo",
"uid": "${DS_TEMPO}"
},
"definition": "",
"description": "",
"label": "Span Names",
"name": "trace_id",
"options": [],
"query": {
"label": "kind",
"refId": "TempoDatasourceVariableQueryEditor-VariableQuery",
"type": 1
},
"refresh": 1,
"regex": "",
"type": "query"
}
]
},
"time": {
"from": "now-30m",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "Observability Sample",
"uid": "aeap3x0fe720wf",
"version": 2,
"weekStart": ""
}
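The `__inputs` section at the top of this export is what Grafana uses at import time to ask which data source should replace each `${DS_...}` placeholder. As a rough illustration, the sketch below builds the kind of body the import endpoint (`POST /api/dashboards/import`) accepts. The helper name `build_import_payload`, the `datasource_uids` mapping, and the use of the data source UID as `value` are assumptions made for this example; importing through the Grafana UI remains the simplest path.

```python
def build_import_payload(dashboard, datasource_uids):
    """Map every entry of the export's __inputs to a concrete data source.

    dashboard: the parsed dashboard JSON (a dict).
    datasource_uids: hypothetical mapping from plugin id to data source UID.
    """
    inputs = []
    for item in dashboard.get("__inputs", []):
        inputs.append({
            "name": item["name"],          # e.g. DS_LOKI
            "type": item["type"],          # "datasource"
            "pluginId": item["pluginId"],  # e.g. "loki"
            "value": datasource_uids[item["pluginId"]],
        })
    return {"dashboard": dashboard, "overwrite": True, "inputs": inputs}


if __name__ == "__main__":
    # Excerpt of the __inputs section shown above.
    sample = {
        "__inputs": [
            {"name": "DS_LOKI", "label": "loki", "description": "",
             "type": "datasource", "pluginId": "loki", "pluginName": "Loki"},
        ],
        "title": "Observability Sample",
    }
    print(build_import_payload(sample, {"loki": "loki"})["inputs"])
```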
This dashboard introduces two new visualizations.
Logs
The first is the Logs visualization, which displays the results of our queries to Loki.
Use the Explain query feature to get help on what this query does.

To go further with log configuration, see here.
We use Loki as the log store, but the same can be done with Elasticsearch, InfluxDB, or other backends.
Hiding INFO logs
Here we will modify the query to hide INFO-level logs.
Spoiler: the solution is here
We simply add a filter on the level label to exclude the value INFO.
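As a textual version of that change, here is a minimal sketch that takes the panel's LogQL query and inserts a label filter right after the `| json` stage. The helper name `exclude_level` is hypothetical. The label name `level` follows the text above; depending on how the logs are shipped, the field extracted by `| json` may be called `severity` instead, in which case pass `label="severity"`.

```python
# Query used by the "Log of All Spring Boot Apps" panel of the dashboard.
BASE_QUERY = '{application="$app_name"} | json | line_format "{{.severity}}\\t{{.body}}"'


def exclude_level(query, level="INFO", label="level"):
    """Insert a `label != "level"` filter right after the `| json` stage."""
    parser_stage = "| json"
    head, sep, tail = query.partition(parser_stage)
    if not sep:
        raise ValueError("query has no '| json' stage")
    return f'{head}{sep} | {label} != "{level}"{tail}'


if __name__ == "__main__":
    print(exclude_level(BASE_QUERY))
```

The filter is placed after `| json` because the label must be extracted before it can be compared, and before `line_format` so that excluded lines are never formatted.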

Traces
Traces let you visualize a sequence of events.
A trace is made up of N spans, each of which corresponds to an action.
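To make the trace/span relationship concrete, here is a small, purely illustrative model. It is not the OpenTelemetry data model or API: the class names, fields, and sample values are all invented for this example. Each span records its parent, so the spans of one trace form a tree whose root is the incoming request.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    span_id: str
    name: str
    start_ms: int
    end_ms: int
    parent_id: Optional[str] = None  # None marks the root span

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


@dataclass
class Trace:
    trace_id: str
    spans: List[Span] = field(default_factory=list)

    def root(self) -> Span:
        """Return the span that has no parent."""
        return next(s for s in self.spans if s.parent_id is None)

    def children(self, span_id: str) -> List[Span]:
        """Return the direct children of the given span."""
        return [s for s in self.spans if s.parent_id == span_id]


# Hypothetical trace: one HTTP request, one service call, one SQL query.
trace = Trace("4bf92f35", [
    Span("a1", "GET /orders", 0, 120),
    Span("b2", "OrderService.find", 10, 110, parent_id="a1"),
    Span("c3", "SELECT orders", 20, 90, parent_id="b2"),
])

if __name__ == "__main__":
    root = trace.root()
    print(root.name, root.duration_ms, "ms")
    for child in trace.children(root.span_id):
        print("  ", child.name, child.duration_ms, "ms")
```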
Here we use the OpenTelemetry agent to instrument the Java applications. The agent's automatic instrumentation defines the spans.
Spans can also be defined in the application code using the OpenTelemetry (OTel) library, which is available in several languages.
Displaying traces does not require a dedicated visualization: a table is enough. The idea here is to list all the traces that have been collected.

The query can be configured to keep only certain traces. The builder makes it easy to create filters without necessarily mastering the Tempo syntax. More details on this screen are available in Grafana's official TraceQL documentation.

Filtering on error traces only
We will duplicate the visualization and add a filter to display only traces with a status code other than 200.
Spoiler: the solution is here
or directly in TraceQL: {span.http.response.status_code!=200}
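The TraceQL expression above can be combined with the service filter already used by the Traces panel. The sketch below assembles such a query as a string; the helper name `non_200_traces_query` is hypothetical, and `$app_name` is the dashboard variable defined in the JSON above.

```python
def non_200_traces_query(service_name, ok_status=200):
    """Build a TraceQL query for spans of a service whose status is not ok_status."""
    return (
        f'{{resource.service.name="{service_name}" && '
        f'span.http.response.status_code!={ok_status}}}'
    )


if __name__ == "__main__":
    # Using the dashboard variable keeps the panel tied to the selected app.
    print(non_200_traces_query("$app_name"))
```

Note that this keeps every span whose status differs from 200, so redirects (3xx) are included as well as client and server errors.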
Exploring traces
To go further with traces, click the link on the trace ID or span ID to open the trace detail page.
This page shows the spans that make up the trace and the processing time of each span, which is quite handy for diagnosing problems in production:

🛫 Next step: Infinity plugin & API ➡️