The default log driver in our ECS stack is CloudWatch, although some clients ask us to change it to another one, such as Splunk.

The Splunk log driver is not enabled by default in the ECS agent running on the EC2 instances.

To enable Splunk, we need to change the /etc/ecs/ecs.config file, adding the new logging driver. Our ECS Terraform module has an option to append extra commands to the instance user data. See the example below:


module "ecs_apps" {
source = "git::https://github.com/DNXLabs/terraform-aws-ecs.git?ref=4.5.2"
name = local.workspace["cluster_name"]
instance_type_1 = local.workspace["instance_type_1"]
instance_type_2 = local.workspace["instance_type_2"]
instance_type_3 = local.workspace["instance_type_3"]
vpc_id = data.aws_vpc.selected.id
private_subnet_ids = data.aws_subnet_ids.private.ids
public_subnet_ids = data.aws_subnet_ids.public.ids
secure_subnet_ids = data.aws_subnet_ids.secure.ids
certificate_arn = data.aws_acm_certificate.domain_host.arn
on_demand_percentage = local.workspace["on_demand_percentage"]
asg_min = local.workspace["asg_min"]
asg_max = local.workspace["asg_max"]
asg_memory_target = local.workspace["asg_memory_target"]
alarm_sns_topics = local.workspace["alarm_sns_topics"]
lb_access_logs_bucket = "example-${local.workspace["account_name"]}-elb-logs"
throughput_mode = local.workspace["efs_throughput_mode"]
provisioned_throughput_in_mibps = local.workspace["efs_provisioned_throughput"]
enable_schedule = local.workspace["cluster_schedule_enable"]
schedule_cron_start = local.cluster_schedule_cron_start
schedule_cron_stop = local.cluster_schedule_cron_stop
userdata = file("./extra_userdata")
}



Inside the app stack, create a new file called 'extra_userdata' with the following content:


echo ECS_AVAILABLE_LOGGING_DRIVERS='["splunk","awslogs"]' >> /etc/ecs/ecs.config

You will need to create two SSM parameters to store the Splunk URL and TOKEN. These values will be used by the log configuration to send the logs to the Splunk endpoint.


resource "aws_ssm_parameter" "TOKEN" {
name = "/app/${local.workspace["account_name"]}/${local.workspace["cluster_name"]}/splunk/TOKEN"
type = "SecureString"
value = " "
lifecycle {
ignore_changes = [value]
}
}
resource "aws_ssm_parameter" "URL" {
name = "/app/${local.workspace["account_name"]}/${local.workspace["cluster_name"]}/splunk/URL"
type = "SecureString"
value = " "
lifecycle {
ignore_changes = [value]
}
}
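Since Terraform only creates placeholder values and ignores later changes, the real Splunk HEC URL and token need to be set out of band, for example with the AWS CLI. The parameter names and values below are placeholders; adjust them to match your account and cluster:

# Placeholders only: replace the names and values with your own
aws ssm put-parameter \
  --name "/app/<account_name>/<cluster_name>/splunk/URL" \
  --type SecureString \
  --value "https://splunk-hec.example.com:8088" \
  --overwrite

aws ssm put-parameter \
  --name "/app/<account_name>/<cluster_name>/splunk/TOKEN" \
  --type SecureString \
  --value "<splunk-hec-token>" \
  --overwrite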

After applying these changes, go to the AWS console > EC2 > Launch Templates, click on the Launch Template ID, and change the default version to the new version that contains the extra user data.
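If you prefer the CLI, the same change can be made with aws ec2 modify-launch-template; the template ID and version number below are placeholders:

aws ec2 modify-launch-template \
  --launch-template-id lt-0123456789abcdef0 \
  --default-version 2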

New EC2 instances will come up with the new configuration. Therefore, one option is to terminate the currently running EC2 instances and wait for the Auto Scaling group to replace them.
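Instead of terminating instances by hand, an Auto Scaling instance refresh performs the same replacement in a rolling fashion; the group name below is hypothetical:

aws autoscaling start-instance-refresh \
  --auto-scaling-group-name <cluster-asg-name>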


To avoid downtime, I'd recommend you ssh into the running instances and run:

echo ECS_AVAILABLE_LOGGING_DRIVERS='["splunk","awslogs"]' >> /etc/ecs/ecs.config

stop ecs
start ecs


These commands will stop the ECS agent container and bring up a new one with the updated configuration.
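The stop/start commands above apply to the Amazon Linux 1 ECS-optimized AMI, where the agent is managed by upstart. If your instances run the Amazon Linux 2 ECS-optimized AMI, the agent is a systemd service instead:

sudo systemctl restart ecs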

Finally, go to the task definition of the application that will send the logs to Splunk and update its log configuration:


        "logConfiguration": {
"logDriver": "splunk",
"options": {
"splunk-index": "${CLUSTER_NAME}",
"splunk-format": "json",
"tag": "{{.ImageName}}/{{.Name}}/{{.ID}}"
},
"secretOptions": [
{
"name": "splunk-url",
"valueFrom": "arn:aws:ssm:${AWS_DEFAULT_REGION}:${AWS_ACCOUNT_ID}:parameter/app/${AWS_ACCOUNT_NAME}/${AWS_ENV}/splunk/URL"
},
{
"name": "splunk-token",
"valueFrom": "arn:aws:ssm:${AWS_DEFAULT_REGION}:${AWS_ACCOUNT_ID}:parameter/app/${AWS_ACCOUNT_NAME}/${AWS_ENV}/splunk/TOKEN"
}
]
        }
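Because the secrets are SecureString SSM parameters, the task execution role must be allowed to read them (plus kms:Decrypt if they are encrypted with a customer-managed KMS key). A minimal Terraform sketch, assuming you manage the execution role yourself; the role name is hypothetical:

data "aws_iam_policy_document" "splunk_ssm_read" {
  statement {
    actions = ["ssm:GetParameters"]
    resources = [
      aws_ssm_parameter.URL.arn,
      aws_ssm_parameter.TOKEN.arn,
    ]
  }
}

resource "aws_iam_role_policy" "splunk_ssm_read" {
  name   = "splunk-ssm-read"
  role   = "my-task-execution-role" # hypothetical role name, replace with yours
  policy = data.aws_iam_policy_document.splunk_ssm_read.json
}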



Trigger a new deployment of the application and check whether the new container comes up. In case of errors, check the cluster events.
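The same check can be done from the CLI; the cluster, service, and container instance identifiers below are placeholders:

# Recent service events (deployment and placement errors show up here)
aws ecs describe-services \
  --cluster <cluster-name> \
  --services <service-name> \
  --query 'services[0].events[:10]'

# The container instances should now advertise the splunk logging-driver capability
aws ecs describe-container-instances \
  --cluster <cluster-name> \
  --container-instances <container-instance-arn> \
  --query 'containerInstances[0].attributes[?name==`com.amazonaws.ecs.capability.logging-driver.splunk`]'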



Credits:

Jeremias Roma