# Sample

## Synopsis
Reduces data volume by sampling log entries based on configurable rules.
## Schema

```yaml
- sample:
    rules: <rule[]>
    exclude_filters: <string[]>
    tag: <string>
    description: <text>
    if: <script>
    ignore_failure: <boolean>
    ignore_missing: <boolean>
    on_failure: <processor[]>
    on_success: <processor[]>
```
## Configuration
The following fields are used to define the processor:
| Field | Required | Default | Description |
|---|---|---|---|
| `rules` | N | - | List of sampling rules with filters and rates |
| `exclude_filters` | N | - | List of conditions that exclude matching events from sampling |
| `description` | N | - | Explanatory note |
| `if` | N | - | Condition to run |
| `ignore_failure` | N | `false` | Continue processing if sampling fails |
| `ignore_missing` | N | `false` | Skip if referenced fields don't exist |
| `on_failure` | N | - | Error handling processors |
| `on_success` | N | - | Success handling processors |
| `tag` | N | - | Identifier |
### Sampling Rule Object

Each rule in the `rules` array is an object with the following properties:

| Field | Required | Default | Description |
|---|---|---|---|
| `filter` | Y | - | Condition that determines which events this rule applies to |
| `sampling_rate` | Y | 2 | Keep 1 event for every N events (integer or string template) |
## Details
The processor reduces data volume through systematic sampling of log entries based on configurable rules: it keeps one event for every N matching events, where N is the specified sampling rate. This cuts volume efficiently while keeping the retained set statistically representative.
The processor adds a metadata field `_vmetric.sampled` to sampled events, showing the sampling rate that was applied (e.g. `10:1`). This is useful for adjusting statistics during analysis to account for sampling; for example, with a `10:1` rate, multiply observed counts by 10 to estimate the original event volume.
Rule-based sampling provides fine-grained control over which types of events are sampled and at what rates. This helps balance data volume with analytical needs by keeping all critical events while sampling high-volume, routine events.
Sampling inherently discards data, so use it with caution for critical events. Always use `exclude_filters` to preserve important events like errors, alerts, or security incidents that require 100% preservation regardless of volume.
## Examples
### Basic

Applying simple 1:10 sampling keeps every 10th event, reducing volume by 90%.
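A minimal sketch of what this might look like; the catch-all filter expression (`true`) is illustrative, as the exact condition syntax is an assumption here:

```yaml
- sample:
    tag: basic_sampling
    rules:
      # Hypothetical catch-all filter; actual condition syntax may differ.
      - filter: "true"
        sampling_rate: 10   # keep 1 in 10 events (90% reduction)
```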
### Conditional

Applying different sampling rates based on log level keeps all error/critical logs, 50% of warnings, 10% of info, and 1% of debug logs.
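A sketch under assumed conventions; the `log.level` field name and the comparison syntax in the filter expressions are illustrative:

```yaml
- sample:
    tag: level_based_sampling
    exclude_filters:
      # Assumed field name and syntax; errors and criticals bypass sampling.
      - "log.level == 'error' || log.level == 'critical'"
    rules:
      - filter: "log.level == 'warn'"
        sampling_rate: 2      # keep 1 in 2 (50%)
      - filter: "log.level == 'info'"
        sampling_rate: 10     # keep 1 in 10 (10%)
      - filter: "log.level == 'debug'"
        sampling_rate: 100    # keep 1 in 100 (1%)
```

Listing the error and critical levels under `exclude_filters` rather than as a rule guarantees they bypass sampling entirely, per the guidance above.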
### Dynamic

Using field values to determine the sampling rate applies configurable rates from system settings.
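A sketch assuming the string-template form of `sampling_rate` can reference a field on the event; the `{{ ... }}` template syntax and the `config.sampling_rate` field name are both assumptions:

```yaml
- sample:
    tag: dynamic_sampling
    rules:
      # Hypothetical template: read the rate from a configuration field
      # carried on the event instead of hard-coding it.
      - filter: "log.level == 'info'"
        sampling_rate: "{{ config.sampling_rate }}"
```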
### Service-Based

Sampling differently based on service tailors rates to each service's characteristics while preserving error data.
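A sketch assuming `service.name` and `log.level` field names and the filter syntax shown; all are illustrative:

```yaml
- sample:
    tag: service_sampling
    exclude_filters:
      - "log.level == 'error'"   # preserve error data from every service
    rules:
      - filter: "service.name == 'payments'"
        sampling_rate: 2         # business-critical service: keep 50%
      - filter: "service.name == 'frontend'"
        sampling_rate: 20        # high-volume, routine traffic: keep 5%
```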
### Complex

Comprehensive sampling keeps relevant data while significantly reducing traffic.
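A sketch combining the pieces above; the guard condition, the `url.path` and `event.category` field names, and the filter syntax are all assumptions, not confirmed behavior:

```yaml
- sample:
    tag: comprehensive_sampling
    description: Keep critical data in full while cutting routine volume
    ignore_failure: true
    exclude_filters:
      - "log.level == 'error' || log.level == 'critical'"
      - "event.category == 'security'"   # security incidents: 100% retention
    rules:
      - filter: "url.path == '/healthz'"
        sampling_rate: 1000              # health checks: keep 0.1%
      - filter: "log.level == 'debug'"
        sampling_rate: 100               # debug noise: keep 1%
      - filter: "true"                   # catch-all for remaining events
        sampling_rate: 10                # everything else: keep 10%
```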