Linux Datasets

Linux datasets define collection rules for Linux devices (dataset type linux). The collector types below, including file log collection, are configured through the dataset creation wizard — see Management. For the dataset and profile model, see Overview.

Collector Types

  • Linux System Events (linux_host_log_collector): File path input for the system log source.
  • Linux Audit Events (linux_audit_report_log_collector): File path input for the audit log source.
  • Linux Firewall Events (linux_firewall_log_collector): File path input for the firewall log source.

File Log Collection

The linux_file_log_collector collects lines from arbitrary log files on Linux hosts. It supports glob path expansion, lookback-based backfill, multiline parsing, include/exclude filtering, character-set decoding, and per-pipeline routing. Paths use Linux-style syntax (e.g. /var/log/myapp/*.log).
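
To make the glob path expansion concrete, the following minimal Python sketch shows which files a pattern such as /var/log/myapp/*.log would select under ordinary shell-style rules, where `*` stays within a single path component. It is only an illustration: the candidate paths are hypothetical, and the collector's own matching rules are assumed rather than confirmed here.

```python
from pathlib import PurePosixPath

# Pattern taken from the path example in the text above.
PATTERN = "/var/log/myapp/*.log"

# Hypothetical file paths, chosen only to illustrate the matching rules.
candidates = [
    "/var/log/myapp/app.log",
    "/var/log/myapp/worker.log",
    "/var/log/myapp/app.log.1",        # rotated file, does not end in .log
    "/var/log/myapp/archive/old.log",  # one directory deeper
    "/var/log/other/app.log",          # different directory
]


def matches(path: str, pattern: str = PATTERN) -> bool:
    # With an absolute pattern, PurePosixPath.match requires the whole path
    # to match, and "*" does not cross a "/" separator.
    return PurePosixPath(path).match(pattern)


matched = [p for p in candidates if matches(p)]
print(matched)
```

Under these assumed rules, only files directly inside /var/log/myapp whose names end in .log are selected. Rotated files such as app.log.1 would need a broader pattern.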

Note: File log datasets can be created through the dataset creation wizard (Linux File Logs) or configured via device YAML files under config/devices/ using the schema below.

Example

The following configuration collects a rotating application log on Linux, using regex-based multiline detection and an include filter for errors and warnings:

devices:
  - id: 123457
    name: app-linux
    type: linux
    status: true
    properties:
      file_log_concurrency: 2
    definitions:
      - name: linux_file_log_collector
        status: true
        inputs:
          - id: 1001
            name: Application Logs
            status: true
            properties:
              path: /var/log/myapp/*.log
              start_date: 300
              date_format: yyyy-MM-dd HH:mm:ss
              line_parser:
                type: regex
                regex: '^\d{4}-\d{2}-\d{2}'
                date_based: true
              filter_mode: include
              filter_rules:
                - type: regex
                  regex: '^(ERROR|WARN)'
              encoding: utf-8
              pipeline_name: my-pipeline
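
The multiline regex in this example can be illustrated with a short Python sketch. It assumes that a line matching the line_parser regex starts a new record and that any other line is appended to the previous record, which is a common convention but is not confirmed here as the collector's exact behavior. The function name group_records and the sample log lines are hypothetical.

```python
import re

# Same pattern as the line_parser regex in the example configuration.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def group_records(lines):
    """Group raw lines into records (assumed behavior, see lead-in).

    A line that begins with a date opens a new record; every other line
    is treated as a continuation of the record before it.
    """
    records = []
    for line in lines:
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += "\n" + line
    return records


# Hypothetical log lines: one entry with a stack trace, then a single-line entry.
sample = [
    "2024-01-15 10:00:00 ERROR request failed",
    "Traceback (most recent call last):",
    '  File "app.py", line 10, in handler',
    "2024-01-15 10:00:01 INFO request served",
]

records = group_records(sample)
for record in records:
    print(repr(record))
```

Grouping this way keeps a stack trace attached to the log entry that produced it, so later filtering sees the whole record rather than isolated lines.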