OpenTelemetry is an open-source observability framework that captures three types of signals from your web apps, APIs, and services: logs, metrics, and traces.
Choosing the OpenTelemetry (OTLP) source type when creating a data stream turns Keboola into a drop-in OTLP/HTTP endpoint. Any official OpenTelemetry SDK or collector can export directly to Keboola, and the incoming telemetry data lands in Storage — queryable alongside your business data.
Your web apps, APIs, and services already generate telemetry data. Typically, this data lives in a dedicated monitoring tool (e.g., Datadog, New Relic, Grafana) while your business data lives in Keboola. When you need to understand how application performance affects business outcomes, you have to export data, build custom pipelines, or switch between tools.
With OTLP source support in Data Streams, your telemetry and business data live side by side in Keboola Storage. This lets you:
The source detail page displays the OTLP endpoint URL along with a copy button and a ready-to-paste environment variable snippet.
To connect any OpenTelemetry SDK or collector, set two environment variables:
export OTEL_EXPORTER_OTLP_ENDPOINT="<your-stream-endpoint>"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
Replace <your-stream-endpoint> with the endpoint URL shown on the source detail page. Once these variables are set, any official OpenTelemetry SDK will automatically pick them up and begin exporting telemetry data to your Keboola project.
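For context on what an SDK does with that value: under the OpenTelemetry exporter specification, an OTLP/HTTP exporter treats `OTEL_EXPORTER_OTLP_ENDPOINT` as a base URL and appends a per-signal path (`v1/traces`, `v1/metrics`, `v1/logs`). The sketch below mirrors that rule in plain Python so you can see which URLs the requests should go to. The `signal_urls` helper and the `example.invalid` fallback are illustrative only, and the code assumes the endpoint carries no query string.

```python
import os

# Per-signal paths that OTLP/HTTP exporters append to the base endpoint.
SIGNAL_PATHS = {"traces": "v1/traces", "metrics": "v1/metrics", "logs": "v1/logs"}


def signal_urls(base_endpoint: str) -> dict[str, str]:
    """Return the URL each signal is exported to for a given base endpoint."""
    base = base_endpoint.rstrip("/")  # avoid a double slash when joining
    return {signal: f"{base}/{path}" for signal, path in SIGNAL_PATHS.items()}


if __name__ == "__main__":
    # Placeholder fallback; use the URL from the source detail page instead.
    endpoint = os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "https://example.invalid/stream")
    for signal, url in signal_urls(endpoint).items():
        print(f"{signal}: {url}")
```

If requests appear to reach a different path than expected, comparing it against this output is a quick first check.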
Records are typically available in Storage within about 15 seconds of ingestion. To confirm that data is flowing, check the Table statistics on the source detail page.
When you create an OTLP data stream, three destination tables are automatically created and pre-mapped:
| Table | Content |
|---|---|
| logs | Log records emitted by your applications (events, errors, warnings). |
| metrics | Metric data points (counters, gauges, histograms, etc.). |
| traces | Distributed trace spans with timing and context. |
Each table’s column mapping is pre-configured so the fields you query most often — such as service, severity, trace_id, host_name, k8s_pod_name, and deployment_environment — are stored as top-level columns rather than buried inside a JSON blob.
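Several of these columns resemble standard OpenTelemetry resource attributes (`host.name`, `k8s.pod.name`, `deployment.environment`), which can be set without code changes through the `OTEL_RESOURCE_ATTRIBUTES` environment variable. The sketch below builds a value for that variable in Python. The attribute-to-column correspondence (dots replaced by underscores) is an assumption inferred from the column names above, not documented behavior, and the sample values are made up.

```python
from urllib.parse import quote


def resource_attributes_env(attributes: dict[str, str]) -> str:
    """Format attributes as comma-separated key=value pairs for OTEL_RESOURCE_ATTRIBUTES."""
    # Percent-encode values so commas, equals signs, and spaces stay unambiguous.
    return ",".join(f"{key}={quote(value, safe='')}" for key, value in attributes.items())


def assumed_column(attribute_key: str) -> str:
    """Guess the Storage column for an attribute key (assumption: dots become underscores)."""
    return attribute_key.replace(".", "_")


# Hypothetical values, for illustration only.
attrs = {
    "deployment.environment": "production",
    "host.name": "web-1",
    "k8s.pod.name": "checkout-7d9f",
}
print(resource_attributes_env(attrs))
print([assumed_column(key) for key in attrs])
```

After exporting a few records, it is worth checking the destination tables to see whether the expected columns are actually populated.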
Key details:
The OTLP/HTTP endpoint is compatible with any OpenTelemetry SDK or collector. Below are quick-start snippets for popular languages and the OpenTelemetry Collector.
Install the OpenTelemetry SDK and OTLP exporter:
pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
Then set the environment variables before running your application:
export OTEL_EXPORTER_OTLP_ENDPOINT="<your-stream-endpoint>"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_SERVICE_NAME="my-python-service"
python my_app.py
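A missing or mistyped variable tends to fail quietly, because the SDK falls back to its defaults, so it can help to check the settings at startup. The following is a minimal sketch of such a check. `otel_config_problems` is a hypothetical helper, not part of the OpenTelemetry SDK.

```python
import os
from collections.abc import Mapping

EXPECTED_PROTOCOL = "http/protobuf"


def otel_config_problems(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems with the OTLP settings in env (empty if none)."""
    problems = []
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "")
    if not endpoint.startswith(("http://", "https://")):
        problems.append("OTEL_EXPORTER_OTLP_ENDPOINT is missing or is not an http(s) URL")
    protocol = env.get("OTEL_EXPORTER_OTLP_PROTOCOL", "")
    if protocol != EXPECTED_PROTOCOL:
        problems.append(
            f"OTEL_EXPORTER_OTLP_PROTOCOL should be {EXPECTED_PROTOCOL!r}, got {protocol!r}"
        )
    if not env.get("OTEL_SERVICE_NAME"):
        problems.append("OTEL_SERVICE_NAME is not set; spans will carry a default service name")
    return problems


if __name__ == "__main__":
    # Print a warning for each problem found in the current environment.
    for problem in otel_config_problems(os.environ):
        print(f"warning: {problem}")
```

Calling this at the top of `my_app.py`, before the SDK is configured, surfaces configuration mistakes before any telemetry is sent.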
Install the OpenTelemetry SDK and OTLP exporter:
npm install @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-proto @opentelemetry/exporter-metrics-otlp-proto
Set the environment variables:
export OTEL_EXPORTER_OTLP_ENDPOINT="<your-stream-endpoint>"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_SERVICE_NAME="my-node-service"
node --require @opentelemetry/sdk-node/register my_app.js
go get go.opentelemetry.io/otel \
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp \
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp
export OTEL_EXPORTER_OTLP_ENDPOINT="<your-stream-endpoint>"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_SERVICE_NAME="my-go-service"
If you already run an OpenTelemetry Collector, add Keboola as an OTLP/HTTP exporter in your collector configuration:
exporters:
  otlphttp/keboola:
    endpoint: "<your-stream-endpoint>"
    compression: gzip
service:
  pipelines:
    traces:
      exporters: [otlphttp/keboola]
    metrics:
      exporters: [otlphttp/keboola]
    logs:
      exporters: [otlphttp/keboola]
This approach lets you fan out telemetry to both your existing monitoring backend and Keboola simultaneously.
Once your telemetry data is in Keboola Storage, you can query it alongside any other table. For example, to estimate how API errors affected revenue, assuming your transactions table stores the trace_id of the request behind each order:
SELECT
  DATE(logs."timestamp") AS date,
  COUNT(DISTINCT transactions."order_id") AS lost_orders,
  SUM(transactions."amount") AS lost_revenue
FROM logs
JOIN transactions ON logs."trace_id" = transactions."trace_id"
WHERE logs."severity" = 'ERROR'
GROUP BY date
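One thing to watch with a join like this: a trace that logged several errors matches its transaction once per error row, so `SUM(amount)` counts that order more than once even though `COUNT(DISTINCT order_id)` does not. The sketch below reproduces the effect and one possible fix (deduplicating error traces before joining), using Python's built-in `sqlite3` as a stand-in for the workspace SQL engine. The tables and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logs ("timestamp" TEXT, "severity" TEXT, "trace_id" TEXT);
CREATE TABLE transactions ("order_id" TEXT, "amount" REAL, "trace_id" TEXT);
-- Trace t1 logged two errors; trace t2 logged none.
INSERT INTO logs VALUES
  ('2024-01-01 10:00:00', 'ERROR', 't1'),
  ('2024-01-01 10:00:01', 'ERROR', 't1'),
  ('2024-01-01 11:00:00', 'INFO',  't2');
INSERT INTO transactions VALUES ('o1', 50.0, 't1'), ('o2', 20.0, 't2');
""")

# Direct join: the single order on t1 is matched by both error rows.
naive = conn.execute("""
SELECT DATE(l."timestamp") AS day,
       COUNT(DISTINCT t."order_id") AS lost_orders,
       SUM(t."amount") AS lost_revenue
FROM logs AS l
JOIN transactions AS t ON l."trace_id" = t."trace_id"
WHERE l."severity" = 'ERROR'
GROUP BY 1
""").fetchall()

# Reduce logs to one row per (trace, day) before joining.
deduped = conn.execute("""
SELECT e.day,
       COUNT(DISTINCT t."order_id") AS lost_orders,
       SUM(t."amount") AS lost_revenue
FROM (
  SELECT DISTINCT "trace_id", DATE("timestamp") AS day
  FROM logs
  WHERE "severity" = 'ERROR'
) AS e
JOIN transactions AS t ON e."trace_id" = t."trace_id"
GROUP BY e.day
""").fetchall()

print(naive)    # revenue for o1 counted twice
print(deduped)  # revenue for o1 counted once
```

Whether deduplication is needed depends on how many error records your services emit per trace; checking a few trace IDs in the logs table should settle it.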
Track how deployments affect error rates by computing hourly error rates per service and environment, which you can then compare against your deployment timestamps:
SELECT
  traces."deployment_environment",
  traces."service",
  DATE_TRUNC('hour', traces."timestamp") AS hour,
  COUNT(*) AS total_spans,
  COUNT(CASE WHEN traces."status_code" = 'ERROR' THEN 1 END) AS error_spans,
  ROUND(100.0 * COUNT(CASE WHEN traces."status_code" = 'ERROR' THEN 1 END) / COUNT(*), 2) AS error_rate
FROM traces
WHERE traces."timestamp" >= DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2, 3
ORDER BY hour DESC
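To make the error-rate arithmetic concrete, here is the same grouping expressed as a small Python sketch: spans are bucketed by environment, service, and hour, and the rate is the percentage of spans in each bucket whose status is `ERROR`. The sample spans are invented, and the dictionary keys simply mirror the column names used in the query.

```python
from collections import defaultdict
from datetime import datetime


def hourly_error_rates(spans):
    """Group spans by (environment, service, hour) and return error rates in percent."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for span in spans:
        # Equivalent of DATE_TRUNC('hour', timestamp).
        hour = span["timestamp"].replace(minute=0, second=0, microsecond=0)
        key = (span["deployment_environment"], span["service"], hour)
        totals[key] += 1
        if span["status_code"] == "ERROR":
            errors[key] += 1
    return {key: round(100.0 * errors[key] / totals[key], 2) for key in totals}


# Hypothetical spans: three in the same hour, one of which failed.
sample = [
    {"deployment_environment": "production", "service": "checkout",
     "timestamp": datetime(2024, 1, 1, 10, 5), "status_code": "OK"},
    {"deployment_environment": "production", "service": "checkout",
     "timestamp": datetime(2024, 1, 1, 10, 20), "status_code": "ERROR"},
    {"deployment_environment": "production", "service": "checkout",
     "timestamp": datetime(2024, 1, 1, 10, 45), "status_code": "OK"},
]
rates = hourly_error_rates(sample)
print(rates)
```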
If your application uses LLM APIs, instrument them with OpenTelemetry to track token usage, latency, and costs alongside product metrics:
SELECT
  traces."service",
  traces."ai_model" AS model,
  COUNT(*) AS total_calls,
  AVG(traces."duration_ms") AS avg_latency_ms,
  SUM(traces."ai_total_tokens") AS total_tokens
FROM traces
WHERE traces."ai_model" IS NOT NULL
  AND traces."timestamp" >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY total_tokens DESC
The OTLP source detail page provides: