# Azure Data Explorer

## Synopsis

Creates an Azure Data Explorer (Kusto) target that ingests data directly into Azure Data Explorer tables. Supports multiple tables, custom schemas, and various file formats.
## Schema

```yaml
- name: <string>
  description: <string>
  type: azdx
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    tenant_id: <string>
    client_id: <string>
    client_secret: <string>
    endpoint: <string>
    database: <string>
    table: <string>
    schema: <string>
    type: <string>
    drop_unknown_stream_events: <boolean>
    flush_immediately: <boolean>
    timeout: <numeric>
    batch_size: <numeric>
    max_size: <numeric>
    field_format: <string>
    tables:
      - name: <string>
        schema: <string>
    debug:
      status: <boolean>
      dont_send_logs: <boolean>
```
## Configuration

The following fields are used to define the target:
Field | Required | Default | Description |
---|---|---|---|
name | Y | - | Target name |
description | N | - | Optional description |
type | Y | - | Must be azdx |
pipelines | N | - | Optional post-processor pipelines |
status | N | true | Enable/disable the target |
### Azure
Field | Required | Default | Description |
---|---|---|---|
tenant_id | N* | - | Azure tenant ID (required unless using managed identity) |
client_id | N* | - | Azure client ID (required unless using managed identity) |
client_secret | N* | - | Azure client secret (required unless using managed identity) |
endpoint | Y | - | Azure Data Explorer cluster endpoint |
database | Y | - | Target database name |
table | N | - | Default table name |
schema | N | - | Table schema definition for the default table |
type | N | parquet | Data format: `parquet`, `json`, `multijson`, or `avro` |
\* = Conditionally required: `tenant_id`, `client_id`, and `client_secret` may be omitted only when the target authenticates with a managed identity (see the field descriptions above).
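For example, on an Azure resource that has a managed identity assigned, the service principal fields can be left out entirely. A minimal sketch (with a hypothetical target name), assuming the target falls back to managed identity authentication when no credentials are provided:

```yaml
targets:
  - name: managed_identity_adx   # hypothetical example name
    type: azdx
    properties:
      # tenant_id, client_id, and client_secret omitted:
      # assumed to trigger managed identity authentication
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "system_events"
```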
### Ingestion Options
Field | Required | Default | Description |
---|---|---|---|
drop_unknown_stream_events | N | true | Ignore events for undefined tables |
flush_immediately | N | true | Send data to ADX without waiting for batch completion |
timeout | N | 30 | Connection timeout in seconds |
batch_size | N | 100000 | Maximum number of messages per batch |
max_size | N | 33554432 (32 MB) | Maximum batch size in bytes (0 for unlimited) |
field_format | N | - | Data normalization format. See the applicable Normalization section |
### Multiple Tables

You can define multiple tables to ingest data into:
```yaml
targets:
  - name: azdx_multiple_tables
    type: azdx
    properties:
      tables:
        - name: "security_logs"
          schema: "<schema definition>"
        - name: "system_logs"
          schema: "<schema definition>"
```
### Debug Options
Field | Required | Default | Description |
---|---|---|---|
debug.status | N | false | Enable debug logging |
debug.dont_send_logs | N | false | Process logs but don't send to ADX (testing) |
## Details
The Azure Data Explorer target supports ingesting data into multiple tables with different schemas. When the `SystemS3` field is present in a log, its value is used to route the message to the appropriate table.

The target automatically validates table existence before starting ingestion. Data is buffered locally until `batch_size` or `max_size` is reached, or until an explicit flush is triggered.

For tables not defined in the configuration, the target can discover them automatically from the database if they exist. The service principal must have appropriate permissions on the database and tables.
### Formats
Format | Description |
---|---|
json | Each log entry is written as a separate JSON line (JSONL format) |
multijson | All log entries are written as a single JSON array |
avro | Apache Avro format with schema |
parquet | Apache Parquet columnar format with schema (default) |
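To override the default Parquet output, set the `type` property. A brief sketch (with a hypothetical target name) that writes each event as a separate JSON line:

```yaml
targets:
  - name: jsonl_adx   # hypothetical example name
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "system_events"
      type: "json"   # one JSON object per line (JSONL)
```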
Consider cluster capacity when setting batch sizes and timeouts.

Messages with empty table names or undefined tables will be sent to the default table if specified, or dropped if `drop_unknown_stream_events` is enabled.
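A minimal sketch of that fallback arrangement, assuming a default table should catch events that match no defined table:

```yaml
targets:
  - name: fallback_adx   # hypothetical example name
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "catch_all"                  # default table for unmatched events
      drop_unknown_stream_events: false   # do not silently drop unknown-table events
      tables:
        - name: "security_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
```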
## Examples

### Basic

The minimum required configuration for Parquet ingestion:
```yaml
targets:
  - name: basic_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      table: "system_events"
```
### Multiple Tables

Configuration with multiple target tables:
```yaml
targets:
  - name: multi_table_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      type: "parquet"
      tables:
        - name: "security_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
        - name: "system_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
        - name: "application_events"
          schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
```
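As the examples show, a schema string is a comma-separated list of `column:type` pairs that should match the columns of the Kusto table.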
### High-Volume

Configuration optimized for high-volume ingestion:
```yaml
targets:
  - name: high_volume_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      type: "parquet"
      batch_size: 50000
      max_size: 536870912   # 512 MB
      timeout: 60
      flush_immediately: false
```
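Setting `flush_immediately: false` and raising the byte ceiling trades ingestion latency for fewer, larger ingestion operations, which generally puts less strain on the cluster; tune these values against your cluster's capacity.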
### With Debugging

Configuration with debug options:
```yaml
targets:
  - name: debug_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      debug:
        status: true
        dont_send_logs: true   # test mode: process logs but don't upload
```
### Normalized

Using field normalization before ingestion:
```yaml
targets:
  - name: normalized_adx
    type: azdx
    properties:
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "11111111-1111-1111-1111-111111111111"
      client_secret: "your-client-secret"
      endpoint: "https://cluster.region.kusto.windows.net"
      database: "logs"
      field_format: "asim"
```
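Here `asim` refers to Microsoft's Advanced Security Information Model (ASIM), the normalization schema commonly used with Azure Data Explorer and Microsoft Sentinel.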