Building a Complete Grafana LGTM Observability Platform with Docker Compose

What is LGTM?

LGTM stands for Loki, Grafana, Tempo, and Mimir. In this guide the “M” is covered by Prometheus, which stores the metrics. Together they form an open-source toolkit for monitoring your applications and infrastructure.

Think of it as a control center where you can:

  • See all your logs in one place (Loki)
  • View beautiful dashboards of your system’s performance (Grafana)
  • Track requests as they move through your system (Tempo)
  • Monitor key metrics like CPU, memory usage, and more (Prometheus)

Why Should You Care?

  • Find Problems Faster: When something breaks, quickly see what happened
  • Understand Your System: Get insights into how your application performs
  • Plan Better: Use real data to make decisions about scaling or optimizing
  • Learn by Doing: Perfect project to understand modern monitoring techniques

What You’ll Build

This guide shows you how to create a complete monitoring system using Docker Compose. When finished, you’ll have the setup described below.

[Figure: LGTM Architecture]

How it works: Your applications send data to the OpenTelemetry Collector, which forwards each signal to its specialized storage system: metrics to Prometheus, logs to Loki, and traces to Tempo. Grafana brings everything together in one dashboard.

Ready-to-Use Sample Repository

For convenience, you can use or reference a complete sample repository containing all the configuration files and setup described in this guide: https://github.com/samzhu/grafana-otel-lgtm-stack

This repository includes all the configuration files for:

  • OpenTelemetry Collector
  • Prometheus
  • Loki
  • Tempo
  • Grafana (with pre-configured datasources and dashboards)

You can either clone this repository to get started immediately or follow the step-by-step guide below to understand each component.

# Clone the repository
git clone https://github.com/samzhu/grafana-otel-lgtm-stack.git
cd grafana-otel-lgtm-stack

# Start the stack
docker compose up -d

Prerequisites

  • Basic knowledge of command line and Docker
  • Docker and Docker Compose installed on your computer
  • A terminal to run commands

Installing Docker (for Absolute Beginners)

If you don’t have Docker installed yet, install it by following the official instructions for your operating system.

Docker Compose is included with Docker Desktop. On Linux, you might need to install it separately.

Quick Start Guide

Step 1: Setting Up Your Environment

Create a new directory for your project:

mkdir grafana-lgtm
cd grafana-lgtm

Inside this directory, create the configuration files described in the “Configuration Files Explained” section below.

Step 2: Starting the System

# Start all services
docker compose up -d

# Check if everything is running
docker compose ps

Step 3: Accessing Your New Monitoring System

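Open your browser and visit Grafana at http://localhost:3000. The other services and their ports are listed in the next section.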

Congratulations! You now have a professional-grade monitoring system running on your computer.

What’s Included in Your Monitoring System

Main Components

  1. Grafana (http://localhost:3000)

    • The dashboard where you view everything
    • Pre-configured to show logs, metrics, and traces
  2. Prometheus (http://localhost:9090)

    • Stores performance metrics
    • Keeps track of numbers like CPU usage, memory consumption, etc.
  3. Loki (http://localhost:3100)

    • Stores logs from your applications
    • Lets you search through logs efficiently
  4. Tempo (http://localhost:3200)

    • Stores traces of requests through your system
    • Helps you understand how requests flow through your applications
  5. OpenTelemetry Collector (http://localhost:4318)

    • Central receiver for all monitoring data
    • Routes each type of data to the right storage system
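
One way to confirm that the components above are up is to poll each one's readiness endpoint. The sketch below is a hypothetical helper: the names `READINESS_ENDPOINTS` and `checkStack` are illustrative and not part of the sample repository. It assumes Node.js 18 or later (for the built-in `fetch`) and the default ports listed above. The OpenTelemetry Collector is left out because the configuration in this guide does not enable a health endpoint for it.

```javascript
// Readiness endpoints exposed by each service on its default port.
const READINESS_ENDPOINTS = {
  grafana: 'http://localhost:3000/api/health',
  prometheus: 'http://localhost:9090/-/ready',
  loki: 'http://localhost:3100/ready',
  tempo: 'http://localhost:3200/ready',
};

// Poll every endpoint once and report a short status string per service.
// `fetchImpl` is injectable so the function can be exercised without a running stack.
async function checkStack(fetchImpl = fetch) {
  const results = {};
  for (const [name, url] of Object.entries(READINESS_ENDPOINTS)) {
    try {
      const response = await fetchImpl(url);
      results[name] = response.ok ? 'ready' : `HTTP ${response.status}`;
    } catch (error) {
      // Connection refused, DNS failure, and similar network errors land here.
      results[name] = `unreachable (${error.message})`;
    }
  }
  return results;
}

// Usage (not run automatically):
//   checkStack().then((status) => console.table(status));
```

Loki and Tempo can take a short while to report ready after `docker compose up -d`, so a non-ready status right after startup is not necessarily a problem.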

Understanding OTLP (OpenTelemetry Protocol)

Throughout this guide, you’ll see “OTLP” mentioned. This is the OpenTelemetry Protocol - a standardized way for applications to send monitoring data. Think of it as a common language that your applications speak to tell the monitoring system what’s happening.
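
To make the “common language” idea concrete, here is a minimal sketch of what a single log record looks like when encoded as OTLP/HTTP JSON and posted to the collector. The function names, the scope name `manual-test`, and the sample service name and message are assumptions made for illustration. In practice an OpenTelemetry SDK or agent builds and sends these payloads for you.

```javascript
// Build a minimal OTLP/HTTP JSON body containing one log record.
function buildOtlpLogPayload(serviceName, message, nowMs = Date.now()) {
  return {
    resourceLogs: [
      {
        // The resource describes who produced the data.
        resource: {
          attributes: [
            { key: 'service.name', value: { stringValue: serviceName } },
          ],
        },
        scopeLogs: [
          {
            scope: { name: 'manual-test' }, // hypothetical scope name
            logRecords: [
              {
                // OTLP JSON encodes 64-bit timestamps as decimal strings.
                timeUnixNano: (BigInt(nowMs) * 1000000n).toString(),
                severityNumber: 9, // 9 is INFO on the OTLP severity scale
                severityText: 'INFO',
                body: { stringValue: message },
              },
            ],
          },
        ],
      },
    ],
  };
}

// POST the payload to the collector's OTLP/HTTP logs endpoint.
// Requires Node.js 18+ for the built-in fetch; a 2xx status means it was accepted.
async function sendTestLog(baseUrl = 'http://localhost:4318') {
  const response = await fetch(`${baseUrl}/v1/logs`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildOtlpLogPayload('my-nodejs-app', 'hello from OTLP')),
  });
  return response.status;
}

// Usage (not run automatically):
//   sendTestLog().then((status) => console.log('collector answered', status));
```

Metrics and traces follow the same pattern with their own top-level fields and the `/v1/metrics` and `/v1/traces` paths.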

Configuration Files Explained

Let’s look at the main configuration files you’ll need to create:

Docker Compose File (docker-compose.yml)

This file tells Docker how to set up all the services:

services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    container_name: otel-collector
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml:ro
    command: ["--config", "/etc/otel-collector-config.yaml"]
    ports:
      - "4318:4318" # OTLP HTTP
      - "9464:9464" # Prometheus metrics
    depends_on:
      - prometheus
      - loki
      - tempo

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    ports:
      - "9090:9090"

  loki:
    image: grafana/loki:latest
    container_name: loki
    user: "0:0"
    volumes:
      - ./loki-local-config.yaml:/etc/loki/local-config.yaml:ro
      - loki-data:/tmp/loki
    command: -config.file=/etc/loki/local-config.yaml
    ports:
      - "3100:3100"

  tempo:
    image: grafana/tempo:latest
    container_name: tempo
    user: "0:0"
    volumes:
      - ./tempo-local.yaml:/etc/tempo/tempo-local.yaml:ro
      - tempo-data:/tmp/tempo
    command: ["-config.file=/etc/tempo/tempo-local.yaml"]
    ports:
      - "3200:3200"

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - ./grafana/datasources.yml:/etc/grafana/provisioning/datasources/datasources.yml:ro
      - ./grafana/dashboards.yml:/etc/grafana/provisioning/dashboards/dashboards.yml:ro
      - ./grafana/dashboards/:/var/lib/grafana/dashboards:ro
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
      - loki
      - tempo

volumes:
  loki-data: {}
  tempo-data: {}

OpenTelemetry Collector Configuration (otel-collector-config.yaml)

The collector receives data from your applications and sends it to the right place:

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch: {}

exporters:
  # Exposes received metrics on port 9464 for Prometheus to scrape
  prometheus:
    endpoint: "0.0.0.0:9464"
    resource_to_telemetry_conversion:
      enabled: true

  # The otlphttp exporter appends /v1/logs to this URL; Loki serves OTLP under /otlp
  otlphttp/loki:
    endpoint: "http://loki:3100/otlp"

  # Plain-text gRPC to Tempo inside the Compose network
  otlp/tempo:
    endpoint: "tempo:4317"
    tls:
      insecure: true

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]

Prometheus Configuration (prometheus.yml)

Prometheus collects metrics from the OpenTelemetry Collector:

global:
  scrape_interval: 5s
scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:9464']
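
To check that Prometheus is actually scraping the collector, you can query the built-in `up` metric for the `otel-collector` job through the Prometheus HTTP API. The following is a minimal sketch; the helper names `buildUpQueryUrl` and `allTargetsUp` are illustrative, and it assumes Prometheus is reachable on its default port.

```javascript
// Build the Prometheus HTTP API URL for an instant query on `up`,
// which is 1 when the last scrape of a target succeeded and 0 otherwise.
function buildUpQueryUrl(job, baseUrl = 'http://localhost:9090') {
  const url = new URL('/api/v1/query', baseUrl);
  url.searchParams.set('query', `up{job="${job}"}`);
  return url.toString();
}

// Return true when the response contains at least one series
// and every series reports the value "1".
function allTargetsUp(body) {
  const series = body?.data?.result ?? [];
  return series.length > 0 && series.every((s) => s.value[1] === '1');
}

// Usage (not run automatically, needs Node.js 18+ for fetch):
//   const body = await (await fetch(buildUpQueryUrl('otel-collector'))).json();
//   console.log(allTargetsUp(body));
```

An empty result usually means the job name in the query does not match `job_name` in prometheus.yml, while a value of 0 means Prometheus cannot reach `otel-collector:9464`.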

Loki Configuration (loki-local-config.yaml)

Loki stores and indexes your logs:

auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2022-10-24
      # Structured metadata (used by OTLP ingestion) needs the tsdb index and schema v13
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  allow_structured_metadata: true

Tempo Configuration (tempo-local.yaml)

Tempo stores trace data that shows how requests move through your system:

server:
  http_listen_port: 3200
  # Tempo's internal gRPC server; kept off 4317 so it does not clash with the OTLP receiver
  grpc_listen_port: 9095

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          # Listen on all interfaces so the collector container can reach Tempo
          endpoint: "0.0.0.0:4317"
        http:
          endpoint: "0.0.0.0:4318"

ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1

compactor:
  compaction:
    block_retention: 1h

storage:
  trace:
    backend: local
    wal:
      path: /tmp/tempo/wal
    local:
      path: /tmp/tempo/blocks

Grafana Configuration

Data Sources (grafana/datasources.yml)

This tells Grafana where to find your data:

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true

  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    isDefault: false

  - name: Tempo
    type: tempo
    access: proxy
    url: http://tempo:3200
    isDefault: false

Dashboard Provider (grafana/dashboards.yml)

This tells Grafana where to find dashboard definitions:

apiVersion: 1
providers:
  - name: 'Local Dashboards'
    orgId: 1
    folder: ''
    type: file
    disableDeletion: false
    updateIntervalSeconds: 5
    options:
      path: /var/lib/grafana/dashboards

Sample Dashboard (grafana/dashboards/springboot-observability.json)

Create this directory and file to include a sample dashboard:

mkdir -p grafana/dashboards

{
  "uid": "springboot-observability",
  "title": "Spring Boot OTEL Demo Dashboard",
  "time": {
    "from": "now-5m",
    "to": "now"
  },
  "panels": [
    {
      "type": "timeseries",
      "title": "JVM Heap Memory Used",
      "datasource": "Prometheus",
      "targets": [
        {
          "expr": "process_runtime_jvm_memory_usage{service_name=\"springboot-otel-demo\", type=\"heap\"}"
        }
      ],
      "fieldConfig": {
        "defaults": {
          "unit": "bytes"
        }
      },
      "gridPos": {
        "x": 0,
        "y": 0,
        "w": 12,
        "h": 8
      }
    },
    {
      "type": "logs",
      "title": "Application Logs (springboot-otel-demo)",
      "datasource": "Loki",
      "targets": [
        {
          "expr": "{service_name=\"springboot-otel-demo\"}"
        }
      ],
      "gridPos": {
        "x": 0,
        "y": 8,
        "w": 24,
        "h": 8
      }
    }
  ]
}

Connecting Your Applications

To get the most value from your monitoring system, you need to connect your applications to it. Here’s how to do it for some common languages:

For Java Applications

Add the OpenTelemetry Java Agent when starting your application:

# Download the agent (if you haven't already)
curl -L -o opentelemetry-javaagent.jar https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar

# Run your application with the agent
java -javaagent:opentelemetry-javaagent.jar \
-Dotel.exporter.otlp.endpoint=http://localhost:4318 \
-Dotel.resource.attributes=service.name=my-java-app \
-jar your-application.jar

For Node.js Applications

Install OpenTelemetry packages:

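npm install @opentelemetry/sdk-node @opentelemetry/sdk-metrics @opentelemetry/sdk-logs @opentelemetry/exporter-trace-otlp-http @opentelemetry/exporter-metrics-otlp-http @opentelemetry/exporter-logs-otlp-http

Add this code to your application, before any other module is loaded:

const { NodeSDK } = require('@opentelemetry/sdk-node');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');
const { BatchLogRecordProcessor } = require('@opentelemetry/sdk-logs');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-http');
const { OTLPLogExporter } = require('@opentelemetry/exporter-logs-otlp-http');

const sdk = new NodeSDK({
  serviceName: 'my-nodejs-app',
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces'
  }),
  // NodeSDK takes a metric reader, which wraps the exporter and exports periodically
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://localhost:4318/v1/metrics'
    })
  }),
  // Logs go through a log record processor
  // (older SDK versions use the singular `logRecordProcessor` option instead)
  logRecordProcessors: [
    new BatchLogRecordProcessor(new OTLPLogExporter({
      url: 'http://localhost:4318/v1/logs'
    }))
  ]
});

sdk.start();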

Viewing Your Data

  1. Open Grafana at http://localhost:3000
  2. Login with username admin and password admin
  3. Click “Explore” in the left sidebar to query your data
  4. Select a data source (Prometheus, Loki, or Tempo) and start exploring

Basic Query Examples

Here are some simple queries to get you started:

Prometheus (for metrics)

# Show all metrics for a specific service
{service_name="my-java-app"}

# Show CPU usage
process_cpu_usage{service_name="my-java-app"}

Loki (for logs)

# Show all logs for a specific service
{service_name="my-java-app"}

# Show only log lines that contain the text "error" (case-sensitive)
{service_name="my-java-app"} |= "error"
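
The same LogQL queries can be run outside Grafana through Loki's HTTP API, which is handy for scripted checks. The sketch below assumes Loki's default port; `buildLokiQueryUrl` and `extractLogLines` are hypothetical helper names. When no time range is given, Loki searches roughly the last hour.

```javascript
// Build a URL for Loki's range-query endpoint from a LogQL expression.
function buildLokiQueryUrl(logql, { limit = 20, baseUrl = 'http://localhost:3100' } = {}) {
  const url = new URL('/loki/api/v1/query_range', baseUrl);
  url.searchParams.set('query', logql);
  url.searchParams.set('limit', String(limit));
  return url.toString();
}

// Flatten a "streams" response into plain log lines.
// Each stream carries `values` as [timestampNanoseconds, line] pairs.
function extractLogLines(body) {
  const streams = body?.data?.result ?? [];
  return streams.flatMap((stream) => stream.values.map(([, line]) => line));
}

// Usage (not run automatically, needs Node.js 18+ for fetch):
//   const url = buildLokiQueryUrl('{service_name="my-java-app"} |= "error"');
//   const lines = extractLogLines(await (await fetch(url)).json());
//   lines.forEach((line) => console.log(line));
```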

Tempo (for traces)

Usually, you’ll click on a trace ID in a log or metric to see a trace. But you can also search:

  • By service name
  • By duration (to find slow requests)
  • By trace ID (if you know it)
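
The three search options above map onto Tempo's HTTP API: a TraceQL search for service name and duration, and a direct lookup by trace ID. The sketch below only builds the request URLs. The helper names are illustrative, and it assumes a Tempo version with TraceQL search enabled, listening on the default port.

```javascript
// Build a TraceQL expression for one service, optionally keeping only
// traces with spans slower than `minDuration` (for example '500ms').
function buildTraceQl(serviceName, minDuration) {
  const conditions = [`resource.service.name = "${serviceName}"`];
  if (minDuration) {
    conditions.push(`duration > ${minDuration}`);
  }
  return `{ ${conditions.join(' && ')} }`;
}

// URL for Tempo's search endpoint, passing the TraceQL expression as `q`.
function buildTempoSearchUrl(traceQl, baseUrl = 'http://localhost:3200') {
  const url = new URL('/api/search', baseUrl);
  url.searchParams.set('q', traceQl);
  return url.toString();
}

// URL for fetching a single trace when its ID is already known.
function buildTraceByIdUrl(traceId, baseUrl = 'http://localhost:3200') {
  return new URL(`/api/traces/${traceId}`, baseUrl).toString();
}

// Usage (not run automatically, needs Node.js 18+ for fetch):
//   const url = buildTempoSearchUrl(buildTraceQl('my-java-app', '500ms'));
//   console.log(await (await fetch(url)).json());
```

The same TraceQL expression can be pasted into the query editor of the Tempo data source in Grafana's Explore view.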

Common Tasks and Commands

Restarting or Resetting Everything

# Restart all services
docker compose restart

# Completely reset (deletes all data)
docker compose down -v
docker compose up -d

Checking for Problems

If something isn’t working right:

# View logs from all services
docker compose logs

# View logs from just one service (e.g., Grafana)
docker compose logs grafana

Glossary for Beginners

  • Logs: Text messages your applications produce, like “User logged in” or “Error occurred”
  • Metrics: Numbers that measure performance, like CPU usage or request count
  • Traces: Records of a request as it travels through your system
  • Dashboard: A screen showing charts and data visualizations
  • OpenTelemetry: A standard way to collect and send monitoring data
  • Docker: A tool for running applications in containers
  • Docker Compose: A tool for running multiple containers together

Next Steps

Once you’re comfortable with the basics:

  1. Create custom dashboards for your applications
  2. Set up alerts to notify you when problems occur
  3. Add more applications to your monitoring system
  4. Learn PromQL (Prometheus Query Language) for advanced metrics analysis
  5. Explore more OpenTelemetry features

Troubleshooting Common Issues

Services Won’t Start

If some services fail to start:

  1. Make sure all your configuration files are correctly named and placed
  2. Check for port conflicts - another application might be using the same ports
  3. Check the logs with docker compose logs [service-name]

Can’t See Data in Grafana

If Grafana is running but you don’t see data:

  1. Verify all services are running with docker compose ps
  2. Check if data sources are correctly configured in Grafana
  3. Make sure your application is correctly sending data to the collector
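
For the second step, the list of provisioned data sources can also be checked through Grafana's HTTP API instead of the UI. This is a minimal sketch under a few assumptions: the default `admin`/`admin` credentials from the Compose file, the data source names from grafana/datasources.yml, and Node.js 18 or later. The helper names are hypothetical.

```javascript
// Data source names provisioned by grafana/datasources.yml in this guide.
const EXPECTED_DATASOURCES = ['Prometheus', 'Loki', 'Tempo'];

// Fetch the data source list from Grafana using HTTP basic authentication.
async function listDatasources(baseUrl = 'http://localhost:3000', user = 'admin', password = 'admin') {
  const auth = Buffer.from(`${user}:${password}`).toString('base64');
  const response = await fetch(`${baseUrl}/api/datasources`, {
    headers: { Authorization: `Basic ${auth}` },
  });
  if (!response.ok) {
    throw new Error(`Grafana returned HTTP ${response.status}`);
  }
  return response.json();
}

// Return the expected names that are absent from Grafana's list.
function missingDatasources(datasources, expected = EXPECTED_DATASOURCES) {
  const present = new Set(datasources.map((ds) => ds.name));
  return expected.filter((name) => !present.has(name));
}

// Usage (not run automatically):
//   listDatasources().then((list) => console.log('missing:', missingDatasources(list)));
```

If a name is reported missing, check that grafana/datasources.yml is mounted at the path given in the Compose file and look at `docker compose logs grafana` for provisioning errors.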

OpenTelemetry Issues

If your application connects but data doesn’t appear:

  1. Confirm you’re using the correct endpoint URL (http://localhost:4318)
  2. Check for firewall or network issues blocking connections
  3. Verify your application is properly configured with the correct service name
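
A quick way to separate application problems from pipeline problems is to send one hand-made span straight to the collector and then look it up in Tempo by its trace ID. The sketch below assumes the collector's OTLP/HTTP endpoint from this guide and Node.js 18 or later. The function names, the `connectivity-check` scope and span names, and the 50 ms duration are made up for illustration.

```javascript
const { randomBytes } = require('node:crypto');

// Build a minimal OTLP/HTTP JSON body containing a single span.
function buildOtlpTracePayload(serviceName, spanName, nowMs = Date.now()) {
  const endNano = BigInt(nowMs) * 1000000n;
  const startNano = endNano - 50000000n; // pretend the span lasted 50 ms
  return {
    resourceSpans: [
      {
        resource: {
          attributes: [
            { key: 'service.name', value: { stringValue: serviceName } },
          ],
        },
        scopeSpans: [
          {
            scope: { name: 'connectivity-check' },
            spans: [
              {
                traceId: randomBytes(16).toString('hex'), // 32 hex characters
                spanId: randomBytes(8).toString('hex'), // 16 hex characters
                name: spanName,
                kind: 1, // SPAN_KIND_INTERNAL
                startTimeUnixNano: startNano.toString(),
                endTimeUnixNano: endNano.toString(),
              },
            ],
          },
        ],
      },
    ],
  };
}

// POST the span to the collector and return the HTTP status plus the trace ID.
async function sendTestSpan(baseUrl = 'http://localhost:4318') {
  const payload = buildOtlpTracePayload('my-java-app', 'connectivity-check');
  const response = await fetch(`${baseUrl}/v1/traces`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  const { traceId } = payload.resourceSpans[0].scopeSpans[0].spans[0];
  return { status: response.status, traceId };
}

// Usage (not run automatically):
//   sendTestSpan().then(console.log);
```

If the request succeeds and the trace ID shows up in Tempo, the pipeline works and the issue is more likely in the application's own OpenTelemetry configuration.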

Reference