6/02/2016

AWS Lambda and Python

What is AWS Lambda?

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume - there is no charge when your code is not running.
So, you can build a scalable application without managing servers. Something like a microservice without servers. Serverless!

Hence, your function must be written in a stateless style. If you want to store something somewhere, you can connect to S3, Redshift, DynamoDB, etc. AWS Lambda will start to execute your code within milliseconds.
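
To make the stateless style concrete, here is a minimal sketch of a handler that keeps nothing in memory between invocations and writes its result to S3 instead. The bucket name, the key layout and the "id" field of the event are made up for illustration; the optional s3 argument is only there so the function can be exercised without AWS.

```python
import json


def make_s3_client():
    # Imported lazily so the module loads even where boto3 is not installed.
    import boto3
    return boto3.client("s3")


def handler(event, context, s3=None):
    # Lambda calls this with (event, context) only; s3 is a test hook.
    if s3 is None:
        s3 = make_s3_client()
    # Hypothetical bucket and key layout, not taken from any real setup.
    key = "results/{}.json".format(event["id"])
    body = json.dumps({"status": "done"}).encode("utf-8")
    # The result lives in S3, not in a module-level variable.
    s3.put_object(Bucket="my-example-bucket", Key=key, Body=body)
    return key
```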

I am not going to write a step-by-step tutorial. You will find here only a handful of essential facts.

Limits

Let's look at the limits (all of them are defaults; you can ask AWS to increase them).
  • You have access to an ephemeral disk limited to 512 MB (you can write to /tmp only).
  • Your function must finish within 300 seconds (this is the default maximum; you can also set a shorter timeout, e.g. 10 seconds); otherwise AWS Lambda will terminate it.
  • The zip package with your function must be smaller than 50 MB.
  • The unzipped package must be smaller than 250 MB.
  • Memory: you can choose from 128 MB to 1536 MB (the more memory you choose, the more you pay; see pricing details).
Tip: remember to remove unused files from /tmp. AWS Lambda can run your function in "the same box" again (if you call it many times in a short period), so you do not want to get the error: "Sorry. Not enough space".
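
One way to follow the tip is to do all scratch work in a private directory and remove it in a finally block. This is only a sketch: the file name and the 1 KB payload are placeholders, and returning the directory path is there just to make the cleanup easy to verify.

```python
import os
import shutil
import tempfile


def handler(event, context):
    # mkdtemp creates a fresh directory under the system temp dir (/tmp on Lambda).
    workdir = tempfile.mkdtemp()
    try:
        path = os.path.join(workdir, "scratch.bin")
        with open(path, "wb") as fh:
            fh.write(b"x" * 1024)  # placeholder for real work
        size = os.path.getsize(path)
    finally:
        # Runs on success and on error, so a reused box does not fill up.
        shutil.rmtree(workdir, ignore_errors=True)
    return {"size": size, "workdir": workdir}
```
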
The Lambda free tier does not automatically expire at the end of your 12 month AWS Free Tier term, but is available to both existing and new AWS customers indefinitely.
What does it mean? It means that you get 1M free requests and 400,000 GB-seconds of compute time per month. That can reduce your costs a lot.
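
A GB-second is one second of execution with 1 GB of memory configured, so the free compute time depends on the memory setting. The arithmetic can be sketched as follows (the helper name is mine; 1 GB is taken as 1024 MB):

```python
FREE_GB_SECONDS = 400000.0  # monthly free compute time, as stated above


def free_seconds(memory_mb):
    # GB-seconds = (memory in GB) * (duration in seconds),
    # so free seconds = free GB-seconds / memory in GB.
    return FREE_GB_SECONDS / (memory_mb / 1024.0)
```

With 128 MB this gives 3,200,000 free seconds per month; with 1536 MB it gives roughly 266,667.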

The flow

  1. Upload your code to AWS Lambda
  2. Set up your code (when to trigger your code, etc.)
  3. Something triggers your code
  4. You pay only for the compute time you use 

Asynchronous and Synchronous

AWS Lambda can work in two modes. In asynchronous mode, AWS Lambda is triggered by events in AWS services, and the caller does not wait for a result. In synchronous mode, you trigger the code and wait for the result.
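
When you invoke a function yourself through boto3, the mode is chosen with the InvocationType parameter of the Lambda client's invoke call. Below is a small sketch; the wrapper function and its wait flag are my own naming, and client is expected to be boto3.client("lambda") or anything with the same invoke method.

```python
import json


def invoke(client, function_name, payload, wait=True):
    # "RequestResponse" blocks until the function returns (synchronous);
    # "Event" queues the invocation and returns at once (asynchronous).
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse" if wait else "Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    if not wait:
        # No result is available here, only the HTTP status of the request.
        return response["StatusCode"]
    # In synchronous mode the result comes back as a readable stream.
    return json.loads(response["Payload"].read().decode("utf-8"))
```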

Python

Which versions of Python are supported?
Lambda provides a Python 2.7-compatible runtime to execute your Lambda functions. Lambda will include the latest AWS SDK for Python (boto3) by default.
Can I use packages with AWS Lambda in Python?
Yes. You can use pip to install any Python packages you need; they have to be included in the zip package you upload together with your code.
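
As a rough sketch of the packaging step: assuming the dependencies were already installed next to your code (for example with pip's -t option pointing at the source directory), the script below zips that directory and checks the result against the size limits listed earlier. The function name and the error message are mine, and zip_path should live outside source_dir so the archive does not include itself.

```python
import os
import zipfile

MAX_ZIP_BYTES = 50 * 1024 * 1024        # limit for the zip package
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024  # limit for the uncompressed content


def build_package(source_dir, zip_path):
    unzipped = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full = os.path.join(root, name)
                unzipped += os.path.getsize(full)
                # Store paths relative to source_dir so the handler
                # module ends up at the root of the archive.
                zf.write(full, os.path.relpath(full, source_dir))
    zipped = os.path.getsize(zip_path)
    if zipped > MAX_ZIP_BYTES or unzipped > MAX_UNZIPPED_BYTES:
        raise ValueError("package exceeds the Lambda size limits")
    return zipped, unzipped
```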

Programming Model

Handler - you have to specify a handler function. When something invokes your Lambda function, AWS Lambda executes your code by calling the handler function.

def handler_function(event, context): 
    # event - some event data (usually dict)
    # context - runtime information 
    return 'ok'

The Context Object - here you will find information such as the function name, version, memory limit, etc.
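
For illustration, a handler that simply reports a few of the context attributes could look like this. The attribute names are the ones exposed by the Lambda Python context object; the shape of the returned dict is my own choice.

```python
def handler(event, context):
    # Collect some runtime information from the context object.
    return {
        "function_name": context.function_name,
        "function_version": context.function_version,
        "memory_limit_in_mb": context.memory_limit_in_mb,
        "aws_request_id": context.aws_request_id,
        # Milliseconds left before Lambda terminates this invocation.
        "remaining_ms": context.get_remaining_time_in_millis(),
    }
```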

Logging - you can use print or logging. Both write to CloudWatch Logs, but with logging you get more information, such as a timestamp and log level. For example, you can turn on DEBUG on dev, but only WARN on production.
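
One possible way to get DEBUG on dev and WARN on production, sketched under the assumption that you invoke the function through aliases named DEV and PROD (aliases are described in the Versioning and Aliases section): when a function is invoked through an alias, the alias is the last segment of context.invoked_function_arn. The mapping and the level_for helper are hypothetical.

```python
import logging

logger = logging.getLogger()

# Hypothetical mapping from alias name to log level.
LEVELS = {"DEV": logging.DEBUG, "PROD": logging.WARNING}


def level_for(invoked_function_arn):
    # A qualified ARN has 8 colon-separated parts; the last one is the
    # alias (or version). An unqualified ARN has only 7.
    parts = invoked_function_arn.split(":")
    alias = parts[7] if len(parts) == 8 else None
    return LEVELS.get(alias, logging.INFO)


def handler(event, context):
    logger.setLevel(level_for(context.invoked_function_arn))
    logger.debug("event: %s", event)           # emitted only at DEBUG level
    logger.warning("something worth a look")   # emitted at every level used here
    return "ok"
```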

Exceptions - if your code raises an exception, AWS Lambda will notify you about it. In synchronous mode, AWS Lambda returns JSON with information about the exception. In asynchronous mode, you will find the exception in the logs.
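
A small sketch of a handler that signals a problem by raising. The exception class and the "name" field are invented for the example. With a synchronous invoke, the caller gets back JSON containing errorMessage, errorType and stackTrace for an exception like this one.

```python
class ValidationError(Exception):
    """Raised when the incoming event lacks required data."""


def handler(event, context):
    if "name" not in event:
        # Lambda reports this to the caller (sync) or to the logs (async).
        raise ValidationError("missing 'name' in event")
    return "hello " + event["name"]
```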

Versioning and Aliases

You do not want to put your code directly on production (I assume), so you need versioning. It works very smoothly:

[Figure: Versioning and Aliases]

Versions are immutable (you cannot change them), but aliases are mutable. You can think of an alias as a pointer to a particular version. So, your PROD alias can point at Version 1 (stable), while your DEV alias points at Version 5 (in progress). Remember, you cannot change a version; you have to create a new one. But you can change an alias.
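
The same idea through boto3, as a sketch: publish_version freezes the current code as a new immutable version, and update_alias moves an existing alias to it (a new alias would need create_alias first). The promote function is my own wrapper, and client is expected to be boto3.client("lambda").

```python
def promote(client, function_name, alias):
    # Freeze the current code as a new, immutable version.
    published = client.publish_version(FunctionName=function_name)
    version = published["Version"]
    # Move the mutable alias (e.g. "PROD") so that it points at that version.
    client.update_alias(
        FunctionName=function_name,
        Name=alias,
        FunctionVersion=version,
    )
    return version
```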

Upload code to AWS Lambda

Several packages make it easier to deploy and update your function. I have been using only one of them: lambda-uploader. I have not had any problems with this package.

Asynchronous mode

Let's say that you have a bucket (S3) where clients can put files, and you want to perform some operations after someone has uploaded a file. You can use AWS Lambda for this. Implement the logic in your package and upload it to AWS Lambda. After that, create an alias and add an event source for the function. You want the function to be called when someone uploads a new file to the bucket, so the event source might look like this:

[Figure: Add event source]
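
On the code side, a handler for such an S3 event could start like the sketch below. It only extracts the bucket name and object key from each record; what to do with the file is up to you. Object keys arrive URL-encoded in the event, hence unquote_plus (the import fallback covers the Python 2.7 runtime).

```python
try:
    from urllib.parse import unquote_plus  # Python 3
except ImportError:
    from urllib import unquote_plus  # Python 2.7


def handler(event, context):
    uploaded = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in the event (a space arrives as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
    return uploaded
```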

Next, when someone puts a file in the bucket, AWS Lambda will invoke the function and your code will do something! In the monitoring tab you can see stats from the last 24 hours (CloudWatch has more data, of course):

[Figure: Monitoring]

You can create alarms for errors, so if something bad happens, you can intervene.

If you want, you can also use AWS Lambda as a cron. You set the schedule in the event source.
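
The schedule itself is written as an expression such as rate(5 minutes) or cron(0 12 * * ? *). On the code side, a scheduled invocation delivers an event with "detail-type" set to "Scheduled Event" and a "time" field; a minimal handler sketch (the return message is just an example) could be:

```python
from datetime import datetime


def handler(event, context):
    # Guard against being wired to a different kind of event source.
    if event.get("detail-type") != "Scheduled Event":
        raise ValueError("not a scheduled event")
    # "time" is a UTC timestamp such as 2016-06-02T12:00:00Z.
    fired_at = datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%SZ")
    return "tick at {}".format(fired_at.isoformat())
```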


Synchronous mode

Now suppose you want to call a Lambda function and get a result. For that, you can use a synchronous invoke: you receive a response when Lambda finishes executing (or you receive JSON with an error). You can use API Gateway and Lambda for this. Here is a framework in BETA stage (2016/05/31) if you are interested.

Here you will find instructions on how to create an API. If you need a tutorial (API Gateway + Lambda), I have found two which are interesting:

You can also use events from Kinesis, CloudWatch, DynamoDB and SNS to call AWS Lambda.


AWS Lambda is an excellent choice if you want to perform a specific action from time to time. You have to remember that Lambda is stateless. Therefore, you have to keep any state somewhere else (if you need it). You do not have to pay for a machine which has to operate all month. Your code runs? You pay.
