The trials and tribulations of running statically linked 32 bit binaries in AWS Lambda

Foreword

AWS Lambda is absolutely wonderful. The ability to deploy code and have it triggered by a variety of sources without having to worry (too much) about capacity scaling, retry logic, cascading failure, etc. is invaluable if you’re trying to build reliable systems on tight timelines.

A fantastic use case for Lambda that we at CloudCall have been working on is the transcoding of call recordings from raw RTP (in a variety of codecs) into mp3 files stored in S3. This is a great model because capturing raw RTP as it flows across our network is lightweight and can be done on reasonably low powered servers by means of a network tap. Converting the RTP into mp3, on the other hand, is incredibly CPU intensive, so Lambda is an excellent platform for it.

Performance

Our current Lambda transcoder is doing pretty well: it processes over 500,000 recordings per day, regularly runs 130 concurrent transcodes, and makes call recordings available to customers an average of 23 seconds after their call ends! Hopefully the new version will be even better 🙂

Binaries in Lambda

It may surprise you to learn that you can run more than AWS’s prescribed languages in Lambda. You can upload external programmes (binaries) compiled to run on an x86_64 Linux distribution and shell out to them from your Lambda code. For example, in Python:

import subprocess
subprocess.call(['some-external-programme', 'param'])

Statically compiled binaries are better as they don’t rely on a multitude of libraries existing within the Lambda environment, though it is usually possible to ship any missing shared libraries alongside your Lambda function code.
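
As a slightly fuller sketch, here’s how a handler might shell out to a bundled binary and capture its output. The run_tool helper, the event field "param" and the location of some-external-programme are illustrative assumptions, not our real transcoder; LAMBDA_TASK_ROOT is the environment variable Lambda sets to the directory holding your function code.

```python
import os
import subprocess


def run_tool(command, timeout_seconds=60):
    """Run an external program; return (exit_code, stdout, stderr) as text."""
    completed = subprocess.run(
        command,
        capture_output=True,      # collect stdout/stderr instead of inheriting them
        text=True,                # decode the output to str
        timeout=timeout_seconds,  # raises subprocess.TimeoutExpired if exceeded
    )
    return completed.returncode, completed.stdout, completed.stderr


def handler(event, context):
    # Assumption: the binary sits at the top level of the deployment package.
    task_root = os.environ.get("LAMBDA_TASK_ROOT", ".")
    binary = os.path.join(task_root, "some-external-programme")
    exit_code, stdout, stderr = run_tool([binary, event["param"]])
    return {"exit_code": exit_code, "stdout": stdout, "stderr": stderr}
```

Capturing stderr and the exit code, rather than only calling the binary, makes failures much easier to spot in the function’s logs.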

The Headache

We use a third-party tool to convert RTP to mp3. This tool runs fine on every Linux OS that one could wish for, including Amazon Linux on EC2… but not Lambda. On Lambda, the tool segfaulted whenever it was run. You can’t run strace or gdb on Lambda due to limitations of the container environment it runs in, so tracking down the root cause wasn’t an easy task. AWS support weren’t overly helpful: they kept insisting that Lambda was the same as Amazon Linux 2 on EC2 and that the problem lay with the binary we were trying to run, which could be debugged on EC2. Clearly this wasn’t the case.
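
A crash like this is easy to miss from Python, because subprocess.call() just hands back a number. On POSIX systems a negative return code -N means the child was killed by signal N, so a segfault shows up as -11 (SIGSEGV) on Linux. Here is a small sketch for turning that into something readable in your logs; describe_exit is a made-up helper name.

```python
import signal


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable description."""
    if returncode < 0:
        # Negative values mean "terminated by signal number -returncode".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Not a signal this platform has a name for.
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"
```

For example, passing the result of subprocess.call([...]) to describe_exit() and printing it would have told us straight away that the tool was dying on a signal rather than exiting with an error.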

Narrowing it down

We managed to get a core dump out of Lambda, which looked like this:

(gdb) bt
0 0x0817a242 in __syscall_error ()
1 0x081a0c6a in _dl_discover_osversion ()
2 0x0817a035 in __libc_start_main ()
3 0x08048171 in _start ()

_dl_discover_osversion() is a function implemented in glibc that is run at application startup, before your own code, to determine the kernel version that the app is running on. It does this with a uname() system call. If you strace an application which is statically compiled with gcc against glibc, you’ll see the call, which confirms it:

phil@phil-vm:~# strace ./hello
execve("./hello", ["./hello"], 0x7ffd95f1d000 /* 18 vars */) = 0
brk(NULL)                               = 0x823000
brk(0x8241c0)                           = 0x8241c0
arch_prctl(ARCH_SET_FS, 0x823880)       = 0
uname({sysname="Linux", nodename="phil-vm", ...}) = 0
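
As far as we understand it, glibc then packs the leading numeric components of the release string returned by uname() into a single integer, which it compares against the minimum kernel version the binary was built for. The following is a rough Python approximation of that idea, not glibc’s actual code; kernel_version_number is a name invented for this sketch.

```python
import re


def kernel_version_number(release):
    """Pack 'major.minor.patch' from a kernel release string into one integer.

    Missing components count as zero, so '4.14' is treated as 4.14.0.
    """
    match = re.match(r"([0-9]+)(?:\.([0-9]+))?(?:\.([0-9]+))?", release)
    if match is None:
        raise ValueError(f"no version number in {release!r}")
    major, minor, patch = (int(part) if part else 0 for part in match.groups())
    return (major << 16) | (minor << 8) | patch
```

On a Unix machine you can feed it the real value with kernel_version_number(os.uname().release) after importing os. If the underlying uname() call itself fails, there is no release string to parse at all, which is consistent with the __syscall_error frame at the top of our backtrace.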

The third-party tool we were using was compiled with a very old version of gcc against an equally old version of glibc. To try to rule out the tool itself being the issue, we wrote a very simple application in C:

#include <stdio.h>
int main() {
    printf("Hello World!\n");
    return 0;
}

Compiling this on a fully updated 32 bit (i386) version of Debian 10 produced a working 32 bit binary:

phil@phil-vm:~# gcc -static -o hello hello.c
phil@phil-vm:~# file hello
hello: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.18, BuildID[sha1]=d8b4df1c0abb643b7d9901a2a1ba4878a704afbe, not stripped
phil@phil-vm:~# ./hello
Hello World!

BUT… it didn’t work on Lambda. Compiling the same source on a 64 bit (AMD64) version of Debian 10 produced a binary that did work on Lambda.
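
When you’re juggling builds from different machines, it’s worth double-checking which flavour you have before uploading it. The fifth byte of an ELF file (EI_CLASS) is 1 for a 32 bit binary and 2 for a 64 bit one, so a few lines of Python are enough. This is a minimal sketch; the function names are our own invention.

```python
ELF_MAGIC = b"\x7fELF"


def elf_bits_from_header(header):
    """Return 32 or 64 for an ELF header, or None if it isn't recognisable."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None
    # Byte 4 is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64.
    return {1: 32, 2: 64}.get(header[4])


def elf_bits(path):
    """Read the first bytes of the file at `path` and classify it."""
    with open(path, "rb") as binary_file:
        return elf_bits_from_header(binary_file.read(5))
```

Calling elf_bits("./hello") on the two builds above should return 32 for the i386 one and 64 for the AMD64 one.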

We strongly suspect that the uname() system call fails on Lambda when it is called by a 32 bit binary. This may be because of a bug in Lambda or in glibc, but it’s really difficult to diagnose in an environment as limited as Lambda.

The Solution

The solution came from a user of the AWS Developer Forums. QEMU (an open source machine emulator and virtualiser) provides statically compiled user-mode emulators that run non-native applications on a variety of host architectures. If you ship the 64 bit build of its qemu-i386-static binary in your Lambda package and run your 32 bit binary through it, your binary should run as desired.

To do this:

  • Download the qemu-user-static Debian package from an appropriate mirror at https://packages.debian.org/sid/amd64/qemu-user-static/download
  • Extract the .deb with ar as follows: ar x qemu-user-static_5.0-14_amd64.deb
  • Extract the resulting data.tar.xz as follows: tar -xvf data.tar.xz
  • Copy the resulting ./usr/bin/qemu-i386-static into your Lambda function and run it from Python as follows:

import subprocess
subprocess.call(['/path/to/qemu-i386-static', '/path/to/your-32-bit-application', 'params'])
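
If you call the tool from more than one place, it can help to wrap the emulator prefix in a helper so the rest of your code doesn’t need to know about it. This is only a sketch: emulated_command and run_32bit are hypothetical names, and the paths are placeholders just like the ones above.

```python
import subprocess


def emulated_command(binary, args, qemu="/path/to/qemu-i386-static"):
    """Build the argument list that runs `binary` through the emulator."""
    return [qemu, binary, *args]


def run_32bit(binary, args, qemu="/path/to/qemu-i386-static", timeout_seconds=300):
    """Run a 32 bit binary under the emulator and return its stdout as bytes."""
    completed = subprocess.run(
        emulated_command(binary, args, qemu),
        capture_output=True,
        timeout=timeout_seconds,
    )
    # Raises subprocess.CalledProcessError on a non-zero exit or a fatal signal.
    completed.check_returncode()
    return completed.stdout
```

For example, run_32bit('/path/to/your-32-bit-application', ['params']) runs the same command as the snippet above, but fails loudly if the tool crashes.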

The Outcome?

Aside from learning some new things about Lambda, we’re nearly ready to get our new version of Lambda transcoding out the door! 🙂
