Tracing & Logging with OpenTelemetry
Overview
OpenTelemetry (OTEL) is an open-source observability framework that provides a set of APIs, libraries, agents, and instrumentation to capture and export telemetry data.
Core components
- Traces: Execution paths of requests through services, composed of multiple spans.
- Spans: Individual operations within a trace containing metadata like operation name and timestamps.
- Metrics: Quantitative performance data (CPU, memory, request counts).
- Logs: Event records providing context about operations and errors.
- Instrumentation: Process of adding code to collect telemetry data.
- Collector: Component that receives, processes, and exports telemetry data.
Orvanta integrates OTEL for centralized aggregation of traces and logs, enabling enhanced alerting, monitoring, and analysis beyond the built-in service logs.
Jaeger integration
Add to docker-compose.yml:

    jaeger:
      image: jaegertracing/jaeger:latest
      ports:
        - "16686:16686"
      expose:
        - 4317

This publishes the Jaeger UI on host port 16686 and exposes Jaeger's OTLP gRPC receiver on port 4317 to the other services in the Compose network.
Configuration
In Orvanta’s Instance Settings under the OTEL/Prom tab:
- Set the Jaeger endpoint to http://jaeger:4317.
- Configure the service name.
- Toggle the Tracing option.
Trace filtering tags
Available tags for searching traces:
- job_id: Job identifier
- root_job: Root job (flow) ID
- parent_job: Parent job ID
- flow_step_id: Workflow step ID
- script_path: Script path
- script_hash: Deployed script version hash
- workspace_id: Workspace name
- worker_id: Worker identifier
- language: Script language
- tag: Queue tag
- job_kind: Job type (script, flow, appscript, aiagent, preview, flowscript)
- trigger_kind: Trigger method (schedule, webhook, kafka, http, sqs)
- trigger: Trigger identifier
- created_by: User or system that started the job
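
The Jaeger UI accepts tag filters in logfmt form (space-separated key=value pairs). As a rough sketch, the Python helper below builds such a filter string from the tags listed above. `build_tag_filter` and `KNOWN_TAGS` are hypothetical names introduced for illustration, and the example values are made up.

```python
from typing import Mapping

# Tag names taken from the list above; anything else is rejected to catch typos.
KNOWN_TAGS = frozenset({
    "job_id", "root_job", "parent_job", "flow_step_id", "script_path",
    "script_hash", "workspace_id", "worker_id", "language", "tag",
    "job_kind", "trigger_kind", "trigger", "created_by",
})


def build_tag_filter(tags: Mapping[str, str]) -> str:
    """Render tags as a logfmt string of space-separated key=value pairs."""
    unknown = set(tags) - KNOWN_TAGS
    if unknown:
        raise ValueError(f"unknown trace tags: {sorted(unknown)}")
    parts = []
    for key, value in tags.items():
        # Quote empty values and values containing a space, quote, or '='
        # so that each pair stays a single logfmt token.
        if value == "" or any(ch in value for ch in ' "='):
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            value = f'"{escaped}"'
        parts.append(f"{key}={value}")
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical query: scheduled flow jobs only.
    print(build_tag_filter({"job_kind": "flow", "trigger_kind": "schedule"}))
```

The resulting string can be pasted into the Tags field of the Jaeger search form.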
OTEL trace context in jobs
When tracing is enabled, Orvanta exposes trace context as environment variables:
| Variable | Description |
|---|---|
| TRACEPARENT | W3C Trace Context header: 00-{trace_id}-{span_id}-01 |
| OTEL_TRACE_ID | Hex-encoded trace ID |
| OTEL_SPAN_ID | Hex-encoded span ID |
These are available in Python, Bash, Bun, Deno, Go, TypeScript, Rust, C#, Ruby, Nu, Java, and PHP jobs.
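
As an illustration, a Python job could read this trace context roughly as follows. The sketch relies only on the TRACEPARENT variable described above; `TraceContext` and `read_trace_context` are hypothetical names, not part of Orvanta.

```python
import os
import re
from typing import Mapping, NamedTuple, Optional

# W3C Trace Context layout: version-trace_id-span_id-flags, all lowercase hex.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def read_trace_context(env: Mapping[str, str] = os.environ) -> Optional[TraceContext]:
    """Return the job's trace context, or None if TRACEPARENT is absent or malformed."""
    match = _TRACEPARENT_RE.match(env.get("TRACEPARENT", ""))
    if match is None:
        return None
    _version, trace_id, span_id, flags = match.groups()
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


if __name__ == "__main__":
    ctx = read_trace_context()
    print(ctx.trace_id if ctx else "tracing is not enabled for this job")
```

Writing the trace ID into a job's own log lines is one way to correlate a log entry with the matching trace later.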
Metrics with Prometheus
Jaeger can generate time series metrics stored in Prometheus. Add to docker-compose.yml:

    prometheus:
      image: prom/prometheus:latest
      expose:
        - 9090
      volumes:
        - ./prometheus-config.yaml:/etc/prometheus/prometheus.yml
      command:
        - "--config.file=/etc/prometheus/prometheus.yml"

With prometheus-config.yaml:

    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    scrape_configs:
      - job_name: aggregated-trace-metrics
        static_configs:
          - targets: ['jaeger:8889']

Orvanta metrics export via OTLP
Enable the Metrics toggle in Instance Settings > OTEL/Prom to export operational metrics to any OTLP-compatible collector alongside traces and logs.
Exported metrics
| Metric | Type | Attributes |
|---|---|---|
| orvanta.queue.push_count | Counter | — |
| orvanta.queue.delete_count | Counter | — |
| orvanta.queue.pull_count | Counter | — |
| orvanta.queue.zombie_restart_count | Counter | — |
| orvanta.queue.zombie_delete_count | Counter | — |
| orvanta.queue.count | Gauge | tag |
| orvanta.queue.running_count | Gauge | tag |
| orvanta.worker.started | Counter | — |
| orvanta.worker.uptime | Gauge | worker |
| orvanta.worker.execution_count | Counter | tag |
| orvanta.worker.execution_duration | Histogram | tag |
| orvanta.worker.execution_failed | Counter | tag |
| orvanta.worker.busy | Gauge | worker |
| orvanta.worker.pull_duration | Histogram | worker, has_job |
| orvanta.db.pool.active | Gauge | — |
| orvanta.db.pool.idle | Gauge | — |
| orvanta.db.pool.max | Gauge | — |
| orvanta.health.db_latency | Gauge | — |
| orvanta.health.db_unresponsive | Gauge | — |
| orvanta.health.status | Gauge | phase |
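
Counters such as orvanta.worker.execution_count only ever increase, so they are usually read as the difference between two samples. The Python sketch below derives a failure ratio that way. It assumes that orvanta.worker.execution_count includes failed executions, which this page does not state; `failure_ratio` is a hypothetical helper.

```python
from typing import Optional


def failure_ratio(
    executions_start: float,
    executions_end: float,
    failures_start: float,
    failures_end: float,
) -> Optional[float]:
    """Fraction of executions that failed between two samples of the counters.

    Assumption: the execution counter also counts failed executions.
    """
    executed = executions_end - executions_start
    failed = failures_end - failures_start
    # A negative delta means a counter was reset (for example after a restart);
    # zero executions means there is nothing to divide by.
    if executed <= 0 or failed < 0:
        return None
    return failed / executed


if __name__ == "__main__":
    # Made-up samples of execution_count and execution_failed for one tag.
    print(failure_ratio(100, 150, 4, 9))
```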
Protocol selection
The OTEL/Prom settings tab provides a Protocol dropdown:
- grpc (default): Uses a tonic gRPC client against the OTLP gRPC endpoint (port 4317).
- http/protobuf: Uses an HTTP client against the OTLP HTTP endpoint (port 4318).
Use http/protobuf when gRPC is unsupported, for example by the collector or by a proxy in front of it.
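
The mapping between protocol and port can be sketched in Python as follows. `otlp_endpoint` is a hypothetical helper, the plain http scheme is an assumption, and whether the HTTP variant needs a path suffix is not stated here, so none is added.

```python
# Conventional OTLP ports for the two protocol options described above.
DEFAULT_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def otlp_endpoint(host: str, protocol: str = "grpc") -> str:
    """Build an endpoint URL for a collector host and an OTLP protocol."""
    try:
        port = DEFAULT_PORTS[protocol]
    except KeyError:
        raise ValueError(f"unsupported OTLP protocol: {protocol!r}") from None
    # Assumption: unencrypted http, as in the Compose examples on this page.
    return f"http://{host}:{port}"


if __name__ == "__main__":
    print(otlp_endpoint("otel-collector"))
    print(otlp_endpoint("otel-collector", "http/protobuf"))
```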
Tempo and Grafana integration
Use the example docker-compose.yml from the Orvanta repo, which includes the OpenTelemetry collector, Tempo, Loki, and Grafana.
OpenTelemetry Collector configuration
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317

    processors:
      batch:
        timeout: 5s

    exporters:
      otlphttp/loki:
        endpoint: http://loki:3100/otlp
        tls:
          insecure: true
      otlp/tempo:
        endpoint: http://tempo:4317
        tls:
          insecure: true

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp/tempo]
        logs:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp/loki]

Tempo configuration
    stream_over_http_enabled: true

    server:
      http_listen_port: 3200
      log_level: info

    query_frontend:
      search:
        duration_slo: 5s
        throughput_bytes_slo: 1.073741824e+09
        metadata_slo:
          duration_slo: 5s
          throughput_bytes_slo: 1.073741824e+09
      trace_by_id:
        duration_slo: 5s

    distributor:
      receivers:
        otlp:
          protocols:
            grpc:
              endpoint: "tempo:4317"

    ingester:
      max_block_duration: 5m

    compactor:
      compaction:
        block_retention: 1h

    metrics_generator:
      registry:
        external_labels:
          source: tempo
          cluster: orvanta
      storage:
        path: /var/tempo/generator/wal
        remote_write:
          - url: http://prometheus:9090/api/v1/write
            send_exemplars: true
      traces_storage:
        path: /var/tempo/generator/traces

    storage:
      trace:
        backend: local
        wal:
          path: /var/tempo/wal
        local:
          path: /var/tempo/blocks

    overrides:
      defaults:
        metrics_generator:
          processors: [service-graphs, span-metrics, local-blocks]
          generate_native_histograms: both

Loki configuration
    auth_enabled: false

    server:
      http_listen_port: 3100

    common:
      ring:
        instance_addr: 0.0.0.0
        kvstore:
          store: inmemory
      replication_factor: 1
      path_prefix: /tmp/loki

    schema_config:
      configs:
        - from: 2020-05-15
          store: tsdb
          object_store: filesystem
          schema: v13
          index:
            prefix: index_
            period: 24h

    storage_config:
      filesystem:
        directory: /tmp/loki/chunks

    limits_config:
      allow_structured_metadata: true

Prometheus configuration
    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: [ 'localhost:9090' ]
      - job_name: 'tempo'
        static_configs:
          - targets: [ 'tempo:3200' ]

Orvanta configuration
In Instance Settings > OTEL/Prom:
- Set the endpoint to http://otel-collector:4317.
- Toggle both Tracing and Logs options.
Grafana UI (port 3000)
- Traces: Use the Tempo datasource to search traces by Orvanta-set tags.
- Logs: Use the Loki datasource to view logs.
- Metrics: Use the Prometheus datasource; Tempo generates the following span metrics:
  - traces_spanmetrics_calls_total
  - traces_spanmetrics_latency
  - traces_spanmetrics_latency_bucket
  - traces_spanmetrics_latency_count
  - traces_spanmetrics_latency_sum
  - traces_spanmetrics_size_total
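
The traces_spanmetrics_latency_bucket series holds cumulative bucket counts, from which latency percentiles are estimated. In Grafana this is normally done with PromQL's histogram_quantile function. The Python sketch below is a simplified illustration of the same linear interpolation, not Prometheus's actual implementation; `estimate_quantile` and the sample buckets are hypothetical.

```python
import math
from typing import Sequence, Tuple


def estimate_quantile(q: float, buckets: Sequence[Tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with (math.inf, total).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must end with the +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            in_bucket = count - lower_count
            # The +Inf bucket has no finite width, so fall back to the
            # highest finite bound; also guard against empty buckets.
            if math.isinf(upper_bound) or in_bucket == 0:
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


if __name__ == "__main__":
    # Made-up cumulative counts for bucket bounds of 0.1 s, 0.5 s, 1 s, +Inf.
    sample = [(0.1, 10), (0.5, 30), (1.0, 40), (math.inf, 40)]
    print(estimate_quantile(0.5, sample))
```

Because values are assumed to be evenly spread within each bucket, the estimate is only as precise as the bucket boundaries allow.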