AWS Lambda is a managed serverless compute platform from AWS, designed largely for event-driven architectures. It runs your code in response to events, such as a REST call to Amazon API Gateway or a new file added to an S3 bucket.
A serverless compute platform means that you do not need to provision or manage servers. AWS handles all administration of the compute resources.
Lambda functions are invoked in a secure and isolated execution environment whose parameters you specify with the Lambda function. For example, you can configure between 512 MB (the default) and 10,240 MB (10 GB) of ephemeral storage, exposed as the /tmp directory, for each Lambda function. Ephemeral storage means that the location is not intended for durable storage. The same /tmp directory can be reused by multiple subsequent invocations that land on the same execution environment, and it is recreated only when an entirely new execution environment is created.
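The reuse of /tmp across invocations can be sketched as a simple cache. This is a minimal illustration, not a definitive pattern: the `SCRATCH_DIR` override, the file name `reference-data.bin`, and the helper `build_reference_data` are all hypothetical names introduced here, and the "expensive" step is a placeholder.

```python
import os

# Lambda exposes ephemeral storage at /tmp. The SCRATCH_DIR override is an
# assumption added here so the sketch can also run outside Lambda.
SCRATCH_DIR = os.environ.get("SCRATCH_DIR", "/tmp")
CACHE_FILE = "reference-data.bin"  # hypothetical file name


def build_reference_data():
    """Placeholder for an expensive step (download, decompression, ...)."""
    return b"expensive-result"


def load_reference_data(scratch_dir=SCRATCH_DIR):
    """Return (data, reused). `reused` is True when a previous invocation
    in the same execution environment already wrote the file."""
    path = os.path.join(scratch_dir, CACHE_FILE)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read(), True
    data = build_reference_data()
    with open(path, "wb") as f:
        f.write(data)
    return data, False


def handler(event, context):
    # On a cold start the file is built; warm invocations may find it in /tmp.
    data, reused = load_reference_data()
    return {"bytes": len(data), "reused_tmp": reused}
```

Because a new execution environment starts with an empty /tmp, the code must always be able to rebuild the file. The cache is an optimization, never a source of truth.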
For durable storage from Lambda functions, AWS offers these options:
- Amazon S3 – an elastically scalable object storage service. The key disadvantage of the service for certain use cases is that you cannot append data to a file like in a file system. Instead, you have to store an entirely new version of the object. In addition, while still very low, the latency of access to objects is higher than in Amazon EFS.
- Amazon EFS – a fully managed elastic, shared file system. You can mount EFS volumes in Lambdas and the file system grows or shrinks as you add or delete files. Its key advantage is low latency of access and the ability to append to existing files.
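The difference in append behaviour between the two services can be sketched as follows. This is an illustration under assumptions: `mount_path` stands for whatever local mount path you configured for the EFS access point, and `s3_client` is expected to behave like a boto3 S3 client (`get_object`, `put_object`, and `exceptions.NoSuchKey`). The function names are hypothetical.

```python
import os


def append_to_efs(mount_path, name, line):
    """Append one line to a file under an EFS mount path.
    The mount path is whatever you configured on the function (assumption)."""
    path = os.path.join(mount_path, name)
    with open(path, "a", encoding="utf-8") as f:  # ordinary file-system append
        f.write(line + "\n")
    return path


def append_to_s3(s3_client, bucket, key, line):
    """Emulate an append on S3: download the whole object, add the line,
    and upload the result as an entirely new version of the object."""
    try:
        existing = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    except s3_client.exceptions.NoSuchKey:
        existing = b""  # first write: start from an empty object
    updated = existing + (line + "\n").encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=updated)
```

Note that the S3 variant transfers the full object on every call and is not atomic: two concurrent writers can overwrite each other's update. In a real function you would typically create the client once with `boto3.client("s3")` outside the handler.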
To deploy additional libraries required by Lambda functions, you can bundle them in the deployment archive or move them to a Lambda layer. Each Lambda function can use up to 5 layers. A .zip deployment package is limited to 50 MB when zipped and uploaded directly, and to 250 MB unzipped, a total that includes all layers. The key advantage of Lambda layers is that the same layer can be shared among a number of Lambda functions.
When to use which storage option?
For distribution of code libraries used by a Lambda function, use a Lambda layer if the same library is used by a number of other Lambda functions.
For temporary data processed by a Lambda function:
- If the total volume is expected to be less than 10 GB, use the Lambda function's ephemeral /tmp storage, keeping in mind that the same directory can be reused by subsequent invocations of the same function.
- Otherwise, use Amazon EFS or Amazon S3, depending on the nature of the storage requirements.
For durable storage of data processed by a Lambda function, use:
- Amazon EFS if you require the lowest latency and/or the ability to append to files.
- Amazon S3 in any other scenario, since it's substantially cheaper.
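The guidance above can be condensed into a small decision helper. This is only a sketch of the rules as stated in this article: the function name, parameter names, and return labels are hypothetical, and applying the EFS-versus-S3 rule to oversized temporary data is an assumption, since the text only says the choice depends on the storage requirements.

```python
EPHEMERAL_LIMIT_GB = 10  # upper bound for /tmp quoted in this article


def choose_storage(purpose, size_gb=0.0, needs_append=False,
                   needs_low_latency=False, shared_by_functions=False):
    """Map a storage need to one of the options discussed above.

    purpose: "library", "temporary", or "durable" (hypothetical labels).
    """
    if purpose not in ("library", "temporary", "durable"):
        raise ValueError(f"unknown purpose: {purpose!r}")
    if purpose == "library":
        # Layers pay off when several functions share the same library.
        return "lambda-layer" if shared_by_functions else "deployment-archive"
    if purpose == "temporary" and size_gb < EPHEMERAL_LIMIT_GB:
        return "ephemeral-tmp"
    # Durable data, or temporary data too large for /tmp (assumption:
    # the same EFS-versus-S3 rule is applied to both cases).
    return "efs" if (needs_append or needs_low_latency) else "s3"
```

For instance, `choose_storage("temporary", size_gb=2)` selects the ephemeral /tmp directory, while `choose_storage("durable", needs_append=True)` selects EFS.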