OTEL Receiver Configuration
OpenTelemetry (OTel) native receivers are integral components of the OpenTelemetry Collector, designed to collect telemetry data such as metrics, traces, and logs directly from supported applications or services. These native receivers understand and ingest data in the format emitted by specific technologies without requiring translation or external exporters.
Similarly, the Prometheus receiver in OpenTelemetry facilitates the collection of metrics from systems exposing data in the Prometheus exposition format, enabling seamless integration with existing Prometheus instrumentation.
Native OTEL Receivers
What Are Native OTEL Receivers?
Native OTEL receivers connect directly to applications or services (e.g., Redis, Jaeger, MySQL), collecting telemetry using the native protocols or APIs of those technologies. This direct integration simplifies observability by reducing the need for custom instrumentation.
Key Features
- Direct Integration: Native receivers connect directly to applications or services (like Redis, Jaeger, or MySQL) and collect telemetry data using the application’s native protocols or APIs.
- Automatic Data Collection: They simplify observability by automatically gathering relevant metrics or traces, reducing the need for custom instrumentation or additional exporters.
- Configuration via YAML: Receivers are configured using YAML files, specifying endpoints, authentication, and other parameters.
Common OTel Native Receivers
- Redis Receiver: Collects metrics from Redis instances by connecting to the Redis server and querying for statistics such as memory usage, command counts, and latency.
- Jaeger Receiver: Ingests trace data from Jaeger clients or agents, allowing the OpenTelemetry Collector to process and export traces to various backends.
- Other Examples: Receivers exist for many technologies, including Kafka, MongoDB, MySQL, and more.
Example Configurations
Redis Receiver:
receivers:
  redis:
    endpoint: "localhost:6379"
    password: ""
    collection_interval: 10s
Jaeger Receiver:
receivers:
  jaeger:
    protocols:
      grpc:
      thrift_http:
      thrift_compact:
      thrift_binary:
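A receiver collects data only when it is referenced in a pipeline under the Collector's service section. The sketch below shows how the Redis and Jaeger receivers above could be wired into metrics and traces pipelines; the batch processor and debug exporter are placeholders chosen for illustration and are not part of the examples above.
receivers:
  redis:
    endpoint: "localhost:6379"
  jaeger:
    protocols:
      grpc:

processors:
  batch:

exporters:
  debug:   # placeholder exporter; replace with the exporter for your backend

service:
  pipelines:
    metrics:
      receivers: [redis]
      processors: [batch]
      exporters: [debug]
    traces:
      receivers: [jaeger]
      processors: [batch]
      exporters: [debug]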
Prometheus Receiver in OpenTelemetry
What is the Prometheus Receiver?
The Prometheus receiver is designed to scrape metrics endpoints that expose data in the Prometheus format (typically via HTTP). It collects these metrics and makes them available for processing, transformation, and export to various backends supported by OpenTelemetry.
Why Use the Prometheus Receiver?
- Leverage Existing Instrumentation: Many applications and infrastructure components already expose metrics in Prometheus format. The Prometheus receiver allows you to reuse this instrumentation without modification.
- Unified Observability: By collecting Prometheus metrics alongside other telemetry data (traces, logs), you achieve a unified observability pipeline.
- Flexibility: Integrate with a wide range of exporters and backends supported by OpenTelemetry, beyond what Prometheus natively supports.
Example Configurations
Scraping Node Exporter Metrics:
Node Exporter is a Prometheus exporter that exposes hardware and OS metrics from *nix systems.
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'node'
          static_configs:
            - targets: ['localhost:9100']
Scraping Windows Exporter Metrics:
Windows Exporter exposes Windows system metrics in Prometheus format.
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'windows'
          static_configs:
            - targets: ['localhost:9182']
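Both jobs can also be combined under a single Prometheus receiver so that one Collector instance scrapes Node Exporter and Windows Exporter targets together; the sketch below simply merges the two examples above:
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'node'
          static_configs:
            - targets: ['localhost:9100']
        - job_name: 'windows'
          static_configs:
            - targets: ['localhost:9182']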
Exporter Management Options
Beyond scraping existing exporters, you can automate deployment of exporters such as Node Exporter or Windows Exporter:
- Automatic Download and Run: Enable the exporter from the configuration UI to download and run it from the OpsRamp portal on the default port.
- Custom Exporter Port: Specify a custom port via command-line arguments (see the example below).
- Custom Configuration: Provide a custom configuration file for the exporter.
This automation ensures consistent metrics collection, even if exporters are not pre-installed on target systems.
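For example, the commands below sketch how each exporter could be started on a non-default port; the port values 9200 and 9282 are arbitrary illustrations, and the --web.listen-address flag shown is the one used by recent exporter releases:
# Node Exporter on a custom port
./node_exporter --web.listen-address=":9200"

# Windows Exporter on a custom port
windows_exporter.exe --web.listen-address=":9282"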
Benefits
- Unified Observability: Collect telemetry from diverse sources using a standardized approach.
- Flexibility: Easily extend monitoring by enabling or configuring new receivers.
- Vendor Neutrality: OpenTelemetry is open-source and vendor-agnostic, suitable for varied environments.
Summary
OTel native receivers are essential for collecting telemetry data from various applications and services without additional dependencies. They provide a flexible, scalable, and unified approach to observability, making it easier to monitor and troubleshoot distributed systems.
Exporter Config File
You can maintain configurations for both Node Exporter and Windows Exporter in YAML files. These configurations specify settings such as the web endpoint, telemetry path, enabled and disabled collectors, and logging level. Customize these files as needed and reference them in your deployment or startup scripts.
Example Configurations
Node Exporter Configuration (node_exporter_config.yaml)
web:
  listen-address: ":9100"
  telemetry-path: "/metrics"
collectors:
  enabled:
    - cpu
    - meminfo
    - filesystem
    - netdev
    - loadavg
    - diskstats
    - time
    - uname
    - vmstat
    - stat
    - systemd
    - textfile
  disabled:
    - hwmon
    - mdadm
    - nfs
    - zfs
log:
  level: "info"
Windows Exporter Configuration (windows_exporter_config.yaml)
web:
  listen-address: ":9182"
  telemetry-path: "/metrics"
collectors:
  enabled:
    - cpu
    - cs
    - logical_disk
    - memory
    - net
    - os
    - service
    - system
    - textfile
    - time
  disabled:
    - iis
    - mssql
    - dns
    - hyperv
log:
  level: "info"
How Does the Agent Use These Files?
- The agent references these config files when starting the exporter, passing them via the appropriate command-line flag. For example:
- Node Exporter:
./node_exporter --config.file={agent_installed_path}/plugins/node_exporter_config.yaml
- Windows Exporter:
windows_exporter.exe --config.file="C:\path\to\windows_exporter_config.yaml"
- Customize the enabled/disabled collectors and other settings as per your monitoring requirements.
Agent Alert Definitions
In addition to collecting metrics, our system supports user-defined alert definitions. These alerts allow you to monitor specific conditions on your devices and receive notifications when thresholds are breached. Alert definitions are specified in YAML format, as shown below.
Template for Alert Definition
Below is a sample template for a single alert definition:
alertDefinitions:
  - name: alert_definition_name
    interval: alert_polling_time
    expr: promql_expression
    isAvailability: true
    warnOperator: operator_macro
    warnThreshold: str_threshold_value
    criticalOperator: operator_macro
    criticalThreshold: str_threshold_value
    alertSub: alert_subject
    alertBody: alert_description
Field Descriptions:
- name: A unique name for the alert definition.
- interval: The polling interval at which the alert definition runs, given as a time duration such as 1m, 5m, 15m, or 1h.
- expr: A valid PromQL query expression that computes the value used for alert generation.
- isAvailability: A boolean that indicates whether the alert definition should be considered for resource availability computation.
- warnOperator / criticalOperator: Operators used to compare metric values against thresholds. Supported operators:
  - GREATER_THAN_EQUAL
  - GREATER_THAN
  - EQUAL
  - NOT_EQUAL
  - LESS_THAN_EQUAL
  - LESS_THAN
  - EXISTS
  Note: The EXISTS operator alerts if any metric sample exists for the given PromQL expression, irrespective of thresholds.
- warnThreshold: The warning-level threshold value for the metric.
- criticalThreshold: The critical-level threshold value for the metric.
- alertSub / alertBody: The subject and body displayed for a warning or critical alert in the alert browser. Macros can be used to include dynamic values; the actual values replace the macros in the displayed alert. The following macros can be used when defining the alert subject and body (an illustrative example follows the list):
  - ${severity}
  - ${metric.name}
  - ${component.name}
  - ${metric.value}
  - ${threshold}
  - ${resource.name}
  - ${resource.uniqueid}
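For instance, an alert subject and body using these macros might look like the following (the wording is illustrative only):
alertSub: "${severity} - ${metric.name} threshold breached on ${resource.name}"
alertBody: "${metric.name} on ${resource.name} (${resource.uniqueid}) reported ${metric.value}, crossing the configured threshold of ${threshold}."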
User Configuration
By default, OpsRamp provides basic alert definitions for pods, nodes, and more. Users can customize alert definitions by editing the alert definitions section within the template.
Example configuration:
alertDefinitions:
  - name: "HighCPUUsage"
    interval: "1m"
    expr: "avg(rate(node_cpu_seconds_total[5m])) > 0.8"
    isAvailability: false
    warnOperator: "GREATER_THAN"
    warnThreshold: "0.7"
    criticalOperator: "GREATER_THAN"
    criticalThreshold: "0.8"
    alertSub: "High CPU Usage Alert"
    alertBody: "CPU usage is critically high on the system."
  - name: "MemoryUsage"
    interval: "2m"
    # Example expression (assumed): fraction of memory in use, based on node_exporter metrics
    expr: "(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes))"
    isAvailability: false
    warnOperator: "GREATER_THAN"
    warnThreshold: "0.85"
    criticalOperator: "GREATER_THAN"
    criticalThreshold: "0.9"
    alertSub: "High Memory Usage Alert"
    alertBody: "Memory usage is critically high on the system."
  - name: "DiskSpace"
    interval: "5m"
    expr: "(node_filesystem_free_bytes{fstype!~\"nfs|tmpfs|rootfs\"} / node_filesystem_size_bytes{fstype!~\"nfs|tmpfs|rootfs\"})"
    isAvailability: false
    warnOperator: "LESS_THAN"
    warnThreshold: "0.15"
    criticalOperator: "LESS_THAN"
    criticalThreshold: "0.1"
    alertSub: "Low Disk Space Alert"
    alertBody: "Disk space is critically low on the system."
You can remove or add new alerts using standard PromQL expressions.
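For example, an alert using the EXISTS operator described above could look like the following sketch; the alert name, the PromQL expression, and the omission of the warning-level fields are illustrative assumptions:
alertDefinitions:
  - name: "ReadOnlyFilesystem"
    interval: "5m"
    # Illustrative expression: fires when any filesystem is reported as read-only
    expr: "node_filesystem_readonly{fstype!~\"nfs|tmpfs|rootfs\"} == 1"
    isAvailability: false
    # Warning-level fields are omitted here on the assumption that they are optional;
    # include them if your agent version requires every field.
    criticalOperator: "EXISTS"
    criticalThreshold: "1"
    alertSub: "Read-only Filesystem Alert"
    alertBody: "A read-only filesystem was detected on ${resource.name}."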
Configure Availability
To configure resource availability, define an alert and set isAvailability to true. This alert definition will be used to compute the availability status of the resource. For example, to determine resource availability based on CPU usage:
alertDefinitions:
  - name: "HighCPUUsage"
    interval: "1m"
    expr: "avg(rate(node_cpu_seconds_total[5m])) > 0.8"
    isAvailability: true
    warnOperator: "GREATER_THAN"
    warnThreshold: "0.7"
    criticalOperator: "GREATER_THAN"
    criticalThreshold: "0.8"
    alertSub: "High CPU Usage Alert"
    alertBody: "CPU usage is critically high on the system."
If the alert HighCPUUsage is triggered at either warning or critical level, the availability of the resource will be considered down; otherwise, it will be up.