Normalization
Normalization is a critical stage that standardizes log data from diverse sources into consistent formats, enabling unified analysis across different logging systems and formats.
Log Format Standards
The processor supports several widely-used log formats:
Generic Formats
Format | Notation | Key Identifier | Layout Characteristics | Example Fields |
---|---|---|---|---|
Elastic Common Schema (ECS) | Dot notation with lowercase | @timestamp | Hierarchical structure | source.ip , network.direction |
Splunk Common Information Model (CIM) | Underscore with lowercase | _time | Flat structure | src_ip , network_direction |
Advanced Security Information Model (ASIM) | PascalCase | TimeGenerated | Explicit names | SourceIp , NetworkDirection |
Security-specific Formats
Format | Description | Key Identifier | Example Fields |
---|---|---|---|
Common Event Format (CEF) | ArcSight's standard format | rt (receiptTime) | networkUser , sourceAddress |
Log Event Extended Format (LEEF) | IBM QRadar's format | devTime | networkUser , srcAddr |
Common Security Log (CSL) | Microsoft Sentinel's format | TimeGenerated | NetworkUser , SourceAddress |
Format Detection
The processor can automatically detect source formats through certain characteristic fields, e.g.:
Context | Field | Format |
---|---|---|
Timestamp | @timestamp | ECS |
Timestamp | _time | CIM |
Timestamp | TimeGenerated | ASIM/CSL |
Security | rt | CEF |
Security | devTime | LEEF |
CSL detection | TimeGenerated + LogSeverity | CSL |
CSL detection | TimeGenerated only | ASIM |
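The detection rules in the table can be sketched in a few lines of Python. This is only an illustration of the lookup logic, not the processor's actual implementation: the function name `detect_format`, the flat-dictionary event shape, and the order in which fields are checked are all assumptions.

```python
from typing import Any, Mapping, Optional


def detect_format(event: Mapping[str, Any]) -> Optional[str]:
    """Guess the source format of a flat event, or return None if unknown.

    The checks mirror the characteristic fields listed in the table.
    The order of the checks is an assumption made for this sketch.
    """
    if "@timestamp" in event:
        return "ecs"
    if "_time" in event:
        return "cim"
    if "TimeGenerated" in event:
        # TimeGenerated alone is ambiguous: LogSeverity next to it
        # points to CSL, otherwise the event is treated as ASIM.
        return "csl" if "LogSeverity" in event else "asim"
    if "rt" in event:
        return "cef"
    if "devTime" in event:
        return "leef"
    return None
```

For example, `detect_format({"TimeGenerated": "2024-01-01T00:00:00Z", "LogSeverity": "5"})` yields `"csl"`, while the same event without `LogSeverity` yields `"asim"`.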
Conversion
Casing and Delimiters
Each format follows specific naming conventions:
Format | Example Fields |
---|---|
ECS | source.ip , event.severity |
CIM | src_ip , event_severity |
ASIM | SourceIp , EventSeverity |
CEF | sourceAddress , eventSeverity |
LEEF | srcAddr , evtSev |
CSL | SourceIP , EventSeverity |
Complex format conversions may impact performance.
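To illustrate the casing and delimiter rules, the sketch below splits a field name into words and rejoins them in dot, underscore, PascalCase, or camelCase style. The helper names are hypothetical, and the sketch covers only casing and delimiters. Abbreviated names such as `src_ip` or `srcAddr` need an explicit field mapping on top of this.

```python
import re
from typing import List


def split_words(name: str) -> List[str]:
    """Split a field name on dots, underscores, and case boundaries."""
    words: List[str] = []
    for part in re.split(r"[._]", name):
        # Match either an acronym run (e.g. "IP") or a capitalized/lowercase word.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z0-9]+", part))
    return [w.lower() for w in words if w]


def to_dot(name: str) -> str:
    """ECS-style: lowercase words joined by dots."""
    return ".".join(split_words(name))


def to_underscore(name: str) -> str:
    """CIM-style: lowercase words joined by underscores."""
    return "_".join(split_words(name))


def to_pascal(name: str) -> str:
    """ASIM-style: every word capitalized, no delimiter."""
    return "".join(w.capitalize() for w in split_words(name))


def to_camel(name: str) -> str:
    """CEF-style: first word lowercase, remaining words capitalized."""
    words = split_words(name)
    if not words:
        return ""
    return words[0] + "".join(w.capitalize() for w in words[1:])
```

With these helpers, `to_pascal("event.severity")` gives `EventSeverity` and `to_dot("NetworkDirection")` gives `network.direction`.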
Field Mapping
The same network fields carry different names in each format:
Format | Source IP | Destination IP | Direction |
---|---|---|---|
ECS | source.ip | destination.ip | network.direction |
CIM | src_ip | dest_ip | network_direction |
ASIM | SourceIp | DstIp | NetworkDirection |
CEF | src | dst | networkDirection |
LEEF | srcAddr | dstAddr | netDir |
CSL | SourceIp | DestinationIp | NetworkDirection |
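A mapping like the one in the table can be applied as a simple rename lookup. The sketch below assumes flat events and copies the field names from the table. The `FIELD_MAP` structure, its context keys, and the `translate_fields` function are hypothetical names chosen for illustration, not part of the processor.

```python
from typing import Any, Dict

# Field names per format, copied from the table above.
FIELD_MAP: Dict[str, Dict[str, str]] = {
    "source_ip": {
        "ecs": "source.ip", "cim": "src_ip", "asim": "SourceIp",
        "cef": "src", "leef": "srcAddr", "csl": "SourceIp",
    },
    "destination_ip": {
        "ecs": "destination.ip", "cim": "dest_ip", "asim": "DstIp",
        "cef": "dst", "leef": "dstAddr", "csl": "DestinationIp",
    },
    "direction": {
        "ecs": "network.direction", "cim": "network_direction",
        "asim": "NetworkDirection", "cef": "networkDirection",
        "leef": "netDir", "csl": "NetworkDirection",
    },
}


def translate_fields(
    event: Dict[str, Any], source_format: str, target_format: str
) -> Dict[str, Any]:
    """Rename known fields from source_format to target_format.

    Fields that are not in FIELD_MAP are passed through unchanged.
    """
    rename = {
        names[source_format]: names[target_format]
        for names in FIELD_MAP.values()
    }
    return {rename.get(key, key): value for key, value in event.items()}
```

For instance, translating `{"src": "10.0.0.1", "dst": "10.0.0.2"}` from `cef` to `ecs` produces `{"source.ip": "10.0.0.1", "destination.ip": "10.0.0.2"}`.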
Configuration
Basic
Convert from ECS to ASIM format:
normalize:
- source_format: ecs
- target_format: asim
Field-specific
Convert a specific network field:
normalize:
- field: network_data
- source_format: cef
- target_format: ecs
Auto-detection
Let the processor detect the source format:
normalize:
- target_format: cim
Best Practices
For data integrity, always validate transformed logs against originals, keep original fields when possible for debugging, and document format-specific transformations.
For performance, do the normalization early in the pipeline, cache results for lookup when possible, and monitor transformation overhead.
For error handling, use ignore_failure and implement fallback mechanisms. Also, test with diverse samples.
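The data-integrity and fallback advice can be pictured with a small wrapper around any conversion function. This is a sketch under stated assumptions, not a description of how ignore_failure behaves inside the processor: `normalize_safely`, the `normalize_error` field, and the `original` field are hypothetical names.

```python
from typing import Any, Callable, Dict

Event = Dict[str, Any]


def normalize_safely(event: Event, convert: Callable[[Event], Event]) -> Event:
    """Run convert on event, falling back to the untouched event on failure."""
    try:
        converted = convert(event)
    except Exception as exc:
        # Fallback: pass the event through unchanged and record the error.
        return {**event, "normalize_error": str(exc)}
    if len(converted) != len(event):
        # Simple integrity check: a rename should not drop or merge fields.
        return {**event, "normalize_error": "field count changed"}
    # Keep the original fields alongside the result for later debugging.
    return {**converted, "original": dict(event)}
```

Called with a conversion that raises, the wrapper returns the input event plus a `normalize_error` field, so downstream stages still receive the data.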