
Azure Data Explorer


Synopsis

Creates an Azure Data Explorer (Kusto) target that ingests data directly into Azure Data Explorer tables.

Schema

- id: <numeric>
  name: <string>
  description: <string>
  type: azdx
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    tenant_id: <string>
    client_id: <string>
    client_secret: <string>
    endpoint: <string>
    database: <string>
    table: <string>
    type: <string>
    max_retry: <numeric>
    retry_interval: <numeric>
    max_size: <numeric>

Configuration

The following are the minimum requirements to define the target.

Field        Required  Default  Description
id           Y         -        Unique identifier
name         Y         -        Target name
description  N         -        Optional description
type         Y         -        Must be azdx
pipelines    N         -        Optional post-processor pipelines
status       N         true     Enable/disable the target

Azure

Field          Required  Default  Description
tenant_id      Y         -        Azure tenant ID
client_id      Y         -        Azure client ID
client_secret  Y         -        Azure client secret
endpoint       Y         -        Azure Data Explorer cluster endpoint
database       N         vmetric  Target database name
table          N         vmetric  Target table name
type           N         parquet  Data format: parquet, json, or ocf

Connection

Field           Required  Default  Description
max_retry       N         5        Maximum number of retry attempts
retry_interval  N         10       Base interval between retries, in seconds
max_size        N         0        Maximum batch size in bytes (0 = unlimited)

Details

The target supports three data formats:

Parquet (default) - Columnar storage format. Provides efficient compression and is best suited to analytical workloads.

JSON - Human-readable text format. Offers a flexible schema and is useful for debugging.

OCF (Avro) - Binary serialization format. Supports schema evolution and is efficient for streaming.
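
To make the compression claim concrete, here is a minimal sketch, assuming the pyarrow package and invented log-like records (nothing here comes from the product itself), that serializes the same rows as JSON and as Parquet and prints both sizes:

# Illustrative only: compares JSON and Parquet encodings of the same records.
# Requires: pip install pyarrow
import io
import json

import pyarrow as pa
import pyarrow.parquet as pq

# Hypothetical, repetitive log-like records (the repetition is exactly what
# columnar dictionary encoding and compression exploit).
records = [
    {"timestamp": f"2024-01-01T00:00:{i % 60:02d}Z", "level": "info", "message": "ok"}
    for i in range(1000)
]

# JSON: one self-describing text document per record.
json_size = sum(len(json.dumps(r).encode("utf-8")) for r in records)

# Parquet: columnar layout with dictionary encoding and compression.
buf = io.BytesIO()
pq.write_table(pa.Table.from_pylist(records), buf)
parquet_size = buf.getbuffer().nbytes

print(f"JSON: {json_size} bytes, Parquet: {parquet_size} bytes")

On repetitive, column-shaped data such as logs and metrics, the Parquet output is typically several times smaller than the JSON equivalent.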

The target automatically validates that the table exists before starting ingestion. Data is buffered locally until max_size is reached or an explicit flush is triggered. Retries use exponential backoff: the delay for a given attempt is retry_interval * 2^attempt seconds.
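
To illustrate, here is a minimal sketch of that schedule, assuming the attempt counter starts at 0 and that no jitter or cap is applied (the formula above specifies neither); the helper name is hypothetical:

def backoff_schedule(max_retry: int, retry_interval: float) -> list[float]:
    # Delays implied by retry_interval * 2^attempt for attempt = 0..max_retry-1.
    return [retry_interval * 2**attempt for attempt in range(max_retry)]

# With the defaults (max_retry: 5, retry_interval: 10) this prints
# [10, 20, 40, 80, 160] -- the final wait alone is over 2.5 minutes.
print(backoff_schedule(5, 10))

With the high-volume example below (max_retry: 10, retry_interval: 30), the last delay grows to 30 * 2^9 = 15360 seconds, which is why the following warning matters.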

Warning: Consider cluster capacity when setting retry intervals.

The service principal must have appropriate permissions on the database and table.
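
The ingestion itself is typically covered by the Database Ingestor role. Below is a minimal sketch, assuming the azure-kusto-data Python package and an interactive admin login, of granting that role to the target's service principal with a Kusto management command; the cluster URL, database name, and IDs are the placeholders used in the examples on this page:

# Sketch only: grants the Database Ingestor role to the target's service
# principal. Assumes `pip install azure-kusto-data` and that the caller has
# admin rights on the target database.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

CLUSTER = "https://cluster.region.kusto.windows.net"
TENANT_ID = "00000000-0000-0000-0000-000000000000"
CLIENT_ID = "11111111-1111-1111-1111-111111111111"

kcsb = KustoConnectionStringBuilder.with_interactive_login(CLUSTER)
client = KustoClient(kcsb)

# Kusto management command: add the app as a database-level ingestor.
client.execute_mgmt(
    "logs",
    f".add database logs ingestors ('aadapp={CLIENT_ID};{TENANT_ID}')",
)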

Warning: Large batch sizes may impact memory usage during ingestion.

Examples

Basic

The minimum required configuration for Parquet ingestion:

- id: 1
  name: basic_adx
  type: azdx
  properties:
    tenant_id: "00000000-0000-0000-0000-000000000000"
    client_id: "11111111-1111-1111-1111-111111111111"
    client_secret: "your-client-secret"
    endpoint: "https://cluster.region.kusto.windows.net"

Custom

Configuration with specific database and table:

- id: 2
  name: custom_adx
  type: azdx
  properties:
    tenant_id: "00000000-0000-0000-0000-000000000000"
    client_id: "11111111-1111-1111-1111-111111111111"
    client_secret: "your-client-secret"
    endpoint: "https://cluster.region.kusto.windows.net"
    database: "logs"
    table: "system_events"
    type: "json"

High-Volume

Configuration optimized for high-volume ingestion:

- id: 3
  name: high_volume_adx
  type: azdx
  properties:
    tenant_id: "00000000-0000-0000-0000-000000000000"
    client_id: "11111111-1111-1111-1111-111111111111"
    client_secret: "your-client-secret"
    endpoint: "https://cluster.region.kusto.windows.net"
    type: "parquet"
    max_retry: 10
    retry_interval: 30
    max_size: 536870912 # 512MB

Normalized

Using data normalization before ingestion:

- id: 4
  name: normalized_adx
  type: azdx
  pipelines:
    - normalize_logs
  properties:
    tenant_id: "00000000-0000-0000-0000-000000000000"
    client_id: "11111111-1111-1111-1111-111111111111"
    client_secret: "your-client-secret"
    endpoint: "https://cluster.region.kusto.windows.net"
    format: "ecs" # Normalize to Elastic Common Schema