The other day I was having a conversation with a colleague about an asynchronous file-hashing operation that triggers off new objects uploaded to an S3 bucket. At one point we were talking throughput. The design has a notification configuration that sends the S3 events into an SQS queue for processing. This means for the first minute we have five Lambda functions each processing a file one at a time (batch size of 1: this is an implementation decision, and for the sake of this article we won’t get into larger batch sizes), then at the second minute 65, the third 125, and so on.
The napkin math of this discussion assumed a 1 GB average file size and an ideal 100 MBps throughput. At 10 seconds per file, that’s 6 files per minute per Lambda function, so during a scale-up we could expect to process 30 files in the first minute, 390 in the second, and 750 in the third. Our current implementation lets us hash through 1,170 of these (admittedly on the higher end) 1 GB files within 3 minutes.
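The scale-up arithmetic above can be sanity-checked in a few lines:

```python
FILES_PER_MIN = 6  # 1 GB file at 100 MBps = 10 s per file

# Concurrent Lambda functions in minutes 1-3 (5 to start, then +60 per minute)
functions_per_minute = [5, 65, 125]

total = 0
for minute, functions in enumerate(functions_per_minute, start=1):
    processed = functions * FILES_PER_MIN
    total += processed
    print(f"minute {minute}: {processed} files")

print(f"total: {total} files")  # total: 1170 files
```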
That is, if we actually get 100 MBps.
What can we actually expect?
The rest of my afternoon became focused on finding out what the realistic throughput for this code would be. I stripped the service’s Lambda function down to the key pieces and preloaded an S3 bucket with four test files at sizes we commonly expect coming into the system: 100 MB, 500 MB, 1 GB, and 5 GB.
Here’s the code:
Now came the testing portion. I needed to see how this code performs not only at different memory settings (remember: the memory setting also allocates CPU and network IO to our functions), but also at different chunk sizes streamed from S3 for each object. The code above uses the chunk size to stream X bytes of the S3 object into memory, update the hash digest with them, and discard them before moving on to the next chunk. This keeps our actual memory utilization very low. In fact, the above hashing operation works at 128 MB of memory for the Lambda function, even if the execution time isn’t great.
At this point I must inform you all that I did my testing like a barbarian of old by clicking “Invoke” in the console after changing my memory and chunk settings. If you’re looking to do performance testing, I recommend you go check out the AWS Lambda Power Tuning project for this. It’s pretty great.
The table below is the data I recorded as part of this effort. A few things to note that limit this data and make it incomplete:
- I only performed two runs at each configuration. This is a very limited data set, and there’s clearly environmental variance between executions that affected the times. These could have been leveled out, and outliers dropped, had I obtained a larger data set.
- With only one run being performed at a time, I have no clear indication whether a large number of parallel read operations on different files would impact read speeds. My assumption is no, but that is an assumption.
- While it would be possible to multi-thread this workflow, and potentially multi-process it at much higher memory settings, I don’t see the benefit for the added code complexity. Plus, splitting the download across threads likely won’t increase read speeds from S3, as there would then be multiple streams competing for bandwidth.
We’ll pick up on my thought process on the other side of this table.
| File Size (MB) | Memory Used (MB) | First Run (Seconds) | Second Run (Seconds) | Avg Speed (MBps) |
| --- | --- | --- | --- | --- |
| **128 MB Memory / 1 MB Chunk Size** | | | | |
| 100 | 82 | 4.91 | 7.49 | 16.13 |
| 500 | 82 | 27.26 | 30.74 | 17.24 |
| 1000 | 82 | 55.7 | 74.82 | 15.32 |
| 5000 | 82 | 329.68 | 329.68 | 15.17 |
| **128 MB Memory / 10 MB Chunk Size** | | | | |
| 100 | 108 | 5.18 | 7.64 | 15.60 |
| 500 | 108 | 21.26 | 24.96 | 21.64 |
| 1000 | 108 | 43.32 | 49.82 | 21.47 |
| 5000 | 108 | 217.84 | 240.16 | 21.83 |
| **256 MB Memory / 10 MB Chunk Size** | | | | |
| 100 | 108 | 2.65 | 2.51 | 38.76 |
| 500 | 108 | 9.8 | 10.8 | 48.54 |
| 1000 | 108 | 20.32 | 21.48 | 47.85 |
| 5000 | 108 | 118.7 | 99.08 | 45.92 |
| **256 MB Memory / 20 MB Chunk Size** | | | | |
| 100 | 138 | 2.64 | 2.4 | 39.68 |
| 500 | 138 | 9.74 | 9.92 | 50.86 |
| 1000 | 138 | 19.58 | 19.92 | 50.63 |
| 5000 | 138 | 99.42 | 97.44 | 50.80 |
| **256 MB Memory / 50 MB Chunk Size** | | | | |
| 100 | 245 | 3.55 | 2.71 | 31.95 |
| 500 | 245 | 12.68 | 12.52 | 39.68 |
| 1000 | 245 | 25.54 | 25.4 | 39.26 |
| 5000 | 245 | 128.06 | 127.5 | 39.13 |
| **512 MB Memory / 20 MB Chunk Size** | | | | |
| 100 | 137 | 1.5 | 1.16 | 75.19 |
| 500 | 137 | 5.38 | 5.34 | 93.28 |
| 1000 | 137 | 13.76 | 13.8 | 72.57 |
| 5000 | 137 | 69.74 | 69.74 | 71.69 |
| **512 MB Memory / 50 MB Chunk Size** | | | | |
| 100 | 245 | 2.07 | 1.73 | 52.63 |
| 500 | 245 | 6.74 | 6.52 | 75.41 |
| 1000 | 245 | 13.66 | 14.3 | 71.53 |
| 5000 | 245 | 68.78 | 70.21 | 71.95 |
| **1024 MB Memory / 20 MB Chunk Size** | | | | |
| 100 | 137 | 1.2 | 1.1 | 86.96 |
| 500 | 137 | 6.6 | 5.29 | 84.10 |
| 1000 | 137 | 14.57 | 13.93 | 70.18 |
| 5000 | 137 | 72.76 | 69.57 | 70.26 |
| **1024 MB Memory / 50 MB Chunk Size** | | | | |
| 100 | 246 | 1.33 | 1.21 | 78.74 |
| 500 | 246 | 6.52 | 6.53 | 76.63 |
| 1000 | 246 | 14.46 | 14.61 | 68.80 |
| 5000 | 246 | 72.65 | 72.69 | 68.80 |
| **2048 MB Memory / 20 MB Chunk Size** | | | | |
| 100 | 138 | 1.09 | 1.06 | 93.02 |
| 500 | 138 | 5.33 | 5.35 | 93.63 |
| 1000 | 138 | 13.89 | 13.91 | 71.94 |
| 5000 | 138 | 69.69 | 69.56 | 71.81 |
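For reference, the Avg Speed column is just the file size divided by the mean of the two runs:

```python
# e.g. the 1000 MB row at 128 MB memory / 1 MB chunks:
size_mb, run1, run2 = 1000, 55.70, 74.82
avg_speed = size_mb / ((run1 + run2) / 2)
print(round(avg_speed, 2))  # 15.32
```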
I started off at the default 128 MB and, thanks to a typo, 1 MB chunks (I thought I had written 10000000 😅 ). The smaller chunk size means we’re making many, many more requests to S3, so increasing it to 10 MB is a simple way to improve performance. At the higher chunk size we’re getting close to utilizing all the available memory, so we can’t increase it again at this setting.
I think it should be said that anyone deploying Lambda functions should default their memory setting to 256 MB to start. The leap in performance is clear no matter what you’re doing, and at per-millisecond billing there’s no reason not to go for it.
With the additional memory overhead I decided to see what would happen if I 5x’d the chunk size. While within the limit, performance actually decreased. Dropping the chunk size to 20 MB revealed a sweet spot (someone help me here, but I know I’ve heard the 20 MB number used in a few other places within AWS for chunking/in-memory caching) where we can now consistently get ~50 MBps reads from S3.
At 512 MB of memory and the 20 MB chunk size we’ve hit the optimal settings across object sizes: a 70+ MBps baseline with variance up to 90+ MBps.
If I were to do more intensive performance testing, I would focus here. Increasing memory to 1024 MB and 2048 MB improved read speeds for < 1 GB objects, but not for the ≥ 1 GB ones. I also tested 50 MB chunks at 512 MB and 1024 MB, but it again resulted in performance hits.
It might be tempting to look at the speed increases for < 1 GB files and say the function should run at the higher memory to burn through those faster, but the timing difference is insignificant in our context: 1.06 seconds at 2048 MB vs 1.5 seconds at our “optimal” 512 MB for 100 MB objects.
I say it that way because this system isn’t expected to deal with constant, high-volume ingress of objects to our bucket. Ingress will be inconsistent and spiky at certain points in a monthly cycle. Now, if I were expecting high-volume ingress at a more constant rate, I might find the increase warranted. ~3,400 100 MB objects per hour vs ~2,400 is a very different kind of measurement.
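Those hourly figures fall straight out of the per-object times in the table:

```python
# 100 MB objects: seconds per object -> objects per hour
for memory_mb, seconds in [(512, 1.5), (2048, 1.06)]:
    print(memory_mb, round(3600 / seconds))  # 512 -> 2400, 2048 -> 3396
```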
I hope you all enjoyed coming along for this little journey. Perhaps some day I’ll come back to it and put it through some proper performance tuning analysis.