When integrating on-premises log sources with Microsoft Sentinel, Terraform provides a powerful way to automate the deployment of necessary resources. However, not all Azure objects that are needed in the stack are fully supported by the official Azure Resource Manager 'azurerm' Terraform provider. This limitation requires us to leverage the Azure 'azapi' provider to configure certain aspects of the setup.
In this article, we will explore how to use Terraform to ship logs from:
- Linux and Windows hosts using the Azure Monitor Agent (AMA) and Data Collection Rules (DCRs)
- Custom log paths to capture additional data sources
By combining Terraform’s infrastructure-as-code capabilities with the flexibility of Azure APIs, we can ensure a scalable and automated approach to log ingestion from on-premises environments into Sentinel.
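Before diving in, both providers need to be declared in the configuration. A minimal sketch of the provider setup follows; the version constraints are examples only, and the object-style body used for the custom table further down assumes one of the newer azapi releases that accept an HCL object for body:

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.90" # example constraint, adjust to your environment
    }
    azapi = {
      source  = "Azure/azapi"
      version = ">= 2.0" # example constraint, adjust to your environment
    }
  }
}

provider "azurerm" {
  features {}
}

provider "azapi" {}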
NOTE: This article will not cover the installation and configuration of the Arc agent. It assumes that an Arc-enabled machine already exists and that the Azure Monitor Agent (AMA) has been installed as an extension via the Azure portal.
We also assume you already have a Log Analytics Workspace.
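The snippets further down reference the workspace as azurerm_log_analytics_workspace.log-analytics-workspace, i.e. a workspace managed in the same configuration. If your workspace lives outside of Terraform, a data source lookup works just as well; a minimal sketch with placeholder names (swap the workspace references to data.azurerm_log_analytics_workspace.existing.id if you go this route):

data "azurerm_log_analytics_workspace" "existing" {
  name                = "law-sentinel" # placeholder: your existing workspace name
  resource_group_name = "rg-sentinel"  # placeholder: its resource group
}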
First off, let’s start with a resource group to keep everything tidy:
resource "azurerm_resource_group" "onpremise-logs" {
name = "rg-onpremise-logs"
location = var.location
}
Ingesting custom logs requires a Data Collection Endpoint (DCE). You can skip this step if you’re not ingesting any of the following log types:
- IIS Logs
- Windows Firewall Logs
- Text Logs
- JSON Logs
- Prometheus Metrics (Container Insights)
resource "azurerm_monitor_data_collection_endpoint" "dce_custom_logs" {
name = "dce-custom-logs"
location = var.location
resource_group_name = azurerm_resource_group.onpremise-logs.name
description = "DCE used to ingest DHCP logs from on-premise Windows Servers"
}
One object that isn’t supported by the azurerm provider is the creation of new custom Log Analytics tables. Fortunately, the azapi provider can handle this part. It allows you to interact directly with the Azure Resource Manager by sending raw JSON payloads to the API, essentially mimicking what an ARM template does behind the scenes. This makes it possible to provision unsupported resources declaratively through Terraform.
So, let’s use azapi to create the custom Log Analytics table:
resource "azapi_resource" "onprem_logs_table" {
name = "OnPremLogs_CL"
parent_id = azurerm_log_analytics_workspace.log-analytics-workspace.id
type = "Microsoft.OperationalInsights/workspaces/tables@2023-09-01"
body = {
properties = {
plan = "Analytics"
schema = {
name = "OnPremLogs_CL"
columns = [
{
name = "IPAddress"
type = "string"
},
{
name = "EventTime"
type = "datetime"
},
{
name = "Message"
type = "string"
},
{
name = "ObjectType"
type = "string"
},
{
name = "AccessedBy"
type = "string"
},
{
name = "TimeGenerated"
type = "datetime"
}
]
}
}
}
}
Note: All custom table names need to end with “_CL”, according to Microsoft documentation. I would also advise creating the table in an initial terraform apply so the rest of the code picks it up correctly (or you can add a “depends_on” to the Data Collection Rule to make sure the table exists before the DCR points to it).
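If you go the depends_on route, it is a single argument inside the DCR resource defined below, pointing at the azapi table resource above:

  # Added inside the azurerm_monitor_data_collection_rule resource:
  depends_on = [azapi_resource.onprem_logs_table]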
Now that we have our custom table for logs to flow into, let’s define the Data Collection Rule that will receive the logs (in my case, a Python script was pushing JSON logs towards the endpoint) and map the data into the correct columns:
resource "azurerm_monitor_data_collection_rule" "dcr_custom_logs" {
location = var.location
name = "dcr-custom-logs"
resource_group_name = azurerm_resource_group.onpremise-logs.name
data_collection_endpoint_id = azurerm_monitor_data_collection_endpoint.dce_custom_logs.id
description = "Data Collection Rule for Custom Logs"
stream_declaration {
stream_name = "Custom-OnPremLogs_CL"
column {
name = "TimeGenerated"
type = "datetime"
}
column {
name = "IPAddress"
type = "string"
}
column {
name = "EventTime"
type = "datetime"
}
column {
name = "Message"
type = "string"
}
column {
name = "ObjectType"
type = "string"
}
column {
name = "AccessedBy"
type = "string"
}
}
data_flow {
streams = ["Custom-OnPremLogs_CL"]
destinations = ["OnPrem-Workspace"]
output_stream = "Custom-OnPremLogs_CL"
}
destinations {
log_analytics {
workspace_resource_id = azurerm_log_analytics_workspace.log-analytics-workspace.id
name = "OnPrem-Workspace"
}
}
}
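Since, in my case, an external Python script calls the Logs Ingestion API to push JSON logs, it can be handy to surface the two values such a script needs: the DCE ingestion URL and the DCR immutable ID. A sketch of two optional outputs, reading exported attributes of the resources above:

output "logs_ingestion_endpoint" {
  description = "DCE URL that the log-shipping script pushes custom logs to"
  value       = azurerm_monitor_data_collection_endpoint.dce_custom_logs.logs_ingestion_endpoint
}

output "dcr_immutable_id" {
  description = "Immutable ID of the DCR, required when calling the Logs Ingestion API"
  value       = azurerm_monitor_data_collection_rule.dcr_custom_logs.immutable_id
}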
Don’t forget that Azure provides a set of pre-defined data sources you can use in a Data Collection Rule, for example:
Linux Syslog:
data_sources {
  syslog {
    name           = "onprem-syslog"
    streams        = ["Microsoft-Syslog"]
    facility_names = ["*"]
    log_levels     = ["*"]
  }
}
Collect Windows Events via XPath queries:
data_sources {
  windows_event_log {
    x_path_queries = [
      "Microsoft-Windows-Dhcp-Server/Operational!*",
      "Microsoft-Windows-Dhcp-Server/AuditLog!*",
      "Microsoft-Windows-Dhcp-Server/FilterNotifications!*"
    ]
    name    = "win-onprem-dhcp-logs"
    streams = ["Microsoft-Event"]
  }
}
Collect logs from a custom log file (requires a custom table with a schema like the one we created above):
data_sources {
  log_file {
    name          = "dhcp-logs"
    format        = "text"
    streams       = ["Custom-Table_CL"]
    file_patterns = ["C:\\Windows\\System32\\Dhcp\\Dhcp*.log"]

    settings {
      text {
        record_start_timestamp_format = "ISO 8601"
      }
    }
  }
}
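Whichever of these data_sources blocks you add, remember that the same DCR also needs a matching data_flow for that stream, otherwise nothing is routed to the workspace. A minimal sketch, reusing the OnPrem-Workspace destination defined earlier:

data_flow {
  streams      = ["Microsoft-Syslog"]
  destinations = ["OnPrem-Workspace"]
}

data_flow {
  streams      = ["Microsoft-Event"]
  destinations = ["OnPrem-Workspace"]
}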
Finally, to connect your Arc-enabled on-premises hosts to the rule, we use a Data Collection Rule Association:
resource "azurerm_monitor_data_collection_rule_association" "dcra_srvexample11" {
name = "srvexample11"
target_resource_id = "/subscriptions/<subscription id>/resourceGroups/<resource group name>/providers/Microsoft.HybridCompute/machines/srvexample11"
data_collection_rule_id = azurerm_monitor_data_collection_rule.dcr_custom_logs.id
description = "DCR Association rule for srvexample11"
}
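If you would rather not hardcode the full resource ID, recent versions of the azurerm provider include an azurerm_arc_machine data source that can resolve it for you; a sketch with placeholder values:

data "azurerm_arc_machine" "srvexample11" {
  name                = "srvexample11"
  resource_group_name = "<resource group name>" # resource group of the Arc-enabled machine
}

target_resource_id can then be set to data.azurerm_arc_machine.srvexample11.id in the association above.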
Throughout this process, I came across various guides and documentation that described around 90 percent of the components needed for this setup. Most of them cover data sources, data collection rules, and endpoints, but they all state that the custom table must be created manually or deployed via a script.
By adding the missing piece, the table creation using the azapi provider, the entire pipeline becomes fully Terraform-managed. This enables a reproducible, scalable, and automated solution for ingesting on-premises logs into Microsoft Sentinel without any manual intervention.