How to implement log aggregation for AWS Lambda

By Yan Cui

During the execution of a Lambda function, whatever you write to stdout (for example, using console.log in Node.js) will be captured by Lambda and sent to CloudWatch Logs asynchronously in the background. And it does this without adding any overhead to your function execution time.

You can find all the logs for your Lambda functions in CloudWatch Logs. There is a unique log group for each function. Each log group then consists of many log streams, one for each concurrently executing instance of the function.

You can send logs to CloudWatch Logs yourself via the PutLogEvents operation. Or you can send them to your preferred log aggregation service such as Splunk or Elasticsearch.

But, remember that everything has to be done during a function’s invocation. If you make additional network calls during the invocation, then you’ll pay for that additional execution time. Your users would also have to wait longer for the API to respond.

These extra network calls might only add 10–20ms per invocation. But you have microservices, and a single user action can involve several API calls. Those 10–20ms per API call can compound and add over 100ms to your user-facing latency, which is enough to reduce sales by 1% according to Amazon.

So, don’t do that!

Instead, process the logs from CloudWatch Logs after the fact.

In the CloudWatch Logs console, you can select a log group and choose to stream the data directly to Amazon’s hosted Elasticsearch service.

This is very useful if you’re using the hosted Elasticsearch service already. But if you’re still evaluating your options, then give this post a read before you decide on the AWS-hosted Elasticsearch.

You can also stream the logs to a Lambda function instead. There are even a number of Lambda function blueprints for pushing CloudWatch Logs to other log aggregation services already.

Clearly this is something a lot of AWS’s customers have asked for.

You can find blueprints for shipping CloudWatch Logs to Sumologic, Splunk and Loggly out of the box.

You can use these blueprints to help you write a Lambda function that’ll ship CloudWatch Logs to your preferred log aggregation service. But here are a few more things to keep in mind.

Whenever you create a new Lambda function, it’ll create a new log group in CloudWatch logs. You want to avoid a manual process for subscribing log groups to your log shipping function.

Instead, enable CloudTrail, and then setup an event pattern in CloudWatch Events to invoke another Lambda function whenever a log group is created.

You can do this one-off setup in the CloudWatch console.

Match the CreateLogGroup API call in CloudWatch Logs and trigger a subscribe-log-group Lambda function. This function would subscribe the new log group to the log shipping function.

If you’re working with multiple AWS accounts, then you should avoid making the setup a manual process. With the Serverless framework, you can setup the event source for this subscribe-log-group function in the serverless.yml.

Another thing to keep in mind is that you need to avoid subscribing the log group for the ship-logs function to itself. It’ll create an infinite invocation loop and that’s a painful lesson that you want to avoid.

One more thing.

By default, when Lambda creates a new log group for your function, the retention policy is set to Never Expire. This is overkill, as the data storage cost can add up over time. It’s also unnecessary if you’re shipping the logs elsewhere already!

By default, logs for your Lambda functions are kept in CloudWatch Logs forever

We can apply the same technique above and add another Lambda function to automatically update the retention policy to something more reasonable.

Here’s a Lambda function for auto-updating the log retention policy to 30 days.

If you already have lots of existing log groups, then consider writing one-off scripts to update them all. You can do this by recursing through all log groups with the DescribeLogGroups API call.

If you’re interested in applying these techniques yourself, I have put together a simple demo project for you. If you follow the instructions in the README and deploy the functions, then all the logs for your Lambda functions would be delivered to Logz.io.