The Dark Side of AWS Lambda
Earlier this week, I walked into the office and started checking our daily logs. Everything looked fine except there were a few failed Lambda deploys to development overnight. Time to dig in.
What came next was a realization of the dark side of using Lambdas heavily at a company that also follows a strong CI/CD process.
Let’s back up a bit.
Rapid Development Through CI/CD
At Fluidity, we use highly templated services to accelerate development and maintain standards across our code. Within DevOps, this gives me a common framework for guiding any project from idea to production using our existing automated tooling. We’ve seen great success in launching services from idea to production in as little as 48 hours using these templates. It works, and it works really well for us.
As a part of our CI/CD process, we follow the GitFlow model. This means that every commit to the `develop` branch is pushed out to our development environment automatically. This is the secondary cause of the failed Lambda deploys that I saw.
The primary cause is versioning. We use the Serverless Framework for developing Lambda applications, and by default it publishes a new version of each Lambda function on every deploy. Each version keeps its own copy of the deployment package. Our average Lambda function is about 60MB. Yes, we could likely optimize this, but that would only have forestalled the underlying problem we encountered.
When you couple CI/CD with rapid development and Lambda functions, you get many versions. Hundreds even. And Lambda code storage is limited to 75GB by default. We hit that limit, and we hit it hard. After two years of CI/CD-driven development, our lack of version cleanup led to complete gridlock in our development process.
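To see how quickly this adds up, here is a rough back-of-the-envelope sketch in Python. The 60MB package size and the 75GB limit are the numbers quoted above. The function count of 20 is a made-up figure for illustration only, and the sketch assumes 1GB = 1024MB.

```python
LIMIT_GB = 75          # Lambda code storage limit quoted above
AVG_PACKAGE_MB = 60    # our average function package size


def max_stored_packages(avg_package_mb: float, limit_gb: float = LIMIT_GB) -> int:
    """Total number of packages of the given size that fit under the limit."""
    return int((limit_gb * 1024) // avg_package_mb)


def versions_per_function(avg_package_mb: float, function_count: int,
                          limit_gb: float = LIMIT_GB) -> int:
    """Versions each function can accumulate if storage is shared evenly."""
    return max_stored_packages(avg_package_mb, limit_gb) // function_count


if __name__ == "__main__":
    # 20 functions is a hypothetical count, not a figure from our account.
    print(max_stored_packages(AVG_PACKAGE_MB))
    print(versions_per_function(AVG_PACKAGE_MB, 20))
```

Under those assumptions, the limit is reached after about 1,280 stored packages, or 64 versions each across 20 functions. With a branch that deploys on every commit, that is not a lot of headroom.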
That’s a bad day for a deploy
Version 1 (Last modified 2 years ago)
It turns out that we had been storing code for our functions from up to 2 years ago. Some of the functions were at version 100. While it is reasonable to keep a few older versions for rollback, storing code that’s already stored in GitHub history just seems excessive. That’s 90+ artifacts per function sitting around unneeded.
According to the Lambda documentation, AWS allows you to delete specific versions of functions that are no longer in use. For our use case, it would have been simple to create a bash script to remove all the old versions. But there had to be a better way to tie the CI/CD process to the cleanup.
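For reference, a one-off cleanup of that kind might look roughly like the sketch below. It is written in Python with boto3 rather than bash, it is not the script we ran, and the function names `select_stale_versions` and `prune_function` are hypothetical. It assumes AWS credentials are already configured and that none of the versions being deleted need to be protected from deletion, for example because an alias points at them.

```python
from typing import Iterable, List


def select_stale_versions(versions: Iterable[str], keep: int = 5) -> List[str]:
    """Return the numbered versions to delete, oldest first, keeping the newest `keep`."""
    # $LATEST is not a published version and is never a deletion candidate.
    numbered = sorted(int(v) for v in versions if v != "$LATEST")
    if keep <= 0:
        return [str(v) for v in numbered]
    return [str(v) for v in numbered[:-keep]]


def prune_function(function_name: str, keep: int = 5, dry_run: bool = True) -> List[str]:
    """List a function's versions and delete the stale ones unless dry_run is set."""
    import boto3  # imported lazily so the selection logic works without AWS access

    client = boto3.client("lambda")
    paginator = client.get_paginator("list_versions_by_function")
    versions = [
        item["Version"]
        for page in paginator.paginate(FunctionName=function_name)
        for item in page["Versions"]
    ]
    stale = select_stale_versions(versions, keep)
    if not dry_run:
        for version in stale:
            client.delete_function(FunctionName=function_name, Qualifier=version)
    return stale
```

Calling `prune_function("my-function")` with the default `dry_run=True` only reports which versions would be removed, which is a sensible first step before deleting anything. The function name here is a placeholder.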
In 2016, The Burning Monk released their `janitor-lambda` function. This was a good reference point for us, but we did not want to use a non-version-controlled Lambda function simply to clean up all the older function versions we had. Tying the cleanup to the deployment of a specific application was what we envisioned.
Enter Serverless Plugins
The Serverless community has a ton of plugins that can be used to solve the most common problems with developing Lambda-based applications. Luckily for us, one of these, `serverless-prune-plugin`, can automatically prune stale versions of an application when `sls deploy` is run. Jackpot.
Using a quick and easy installation command, we were able to remove all of the stale versions of our Lambda functions, except the last 5 versions kept for rollback scenarios.
Here are the steps to follow:
- Install the plugin with `sls plugin install -n serverless-prune-plugin`
- Modify your `serverless.yml` file to include the following:

      custom:
        prune:
          automatic: true
          number: 5

- Use the [AirSwap Serverless Orb for CircleCI](https://circleci.com/orbs/registry/orb/airswap/serverless) for easy deployment in your CircleCI file, or simply run `sls deploy` in the CI/CD tool of your choice.
We drastically reduced the code storage space we used by pruning old versions automatically. The deploys take a few seconds longer now with the pruning step after deployment, but the results are worth it.
We cut over 75GB of stale code
Seeing these impressive results, I applied the `serverless-prune-plugin` to my own repositories and side projects. I personally cut another 35GB of code storage space in my own AWS account.
If you use the Serverless Framework, we highly suggest adding this plugin to your own code. Clean up those stale versions automatically, so you can keep deploying quickly and without hitting your code storage limit.
We will be merging in this new auto-pruning feature to our production code in the coming weeks. We can’t wait for the space savings that we achieve in our production accounts!