Solving Flare-On 2016 Challenge 1 with Angr

It took me way too long to finally get around to publishing this post, but here it is.

Last fall, FireEye announced their 3rd annual Flare-On challenge. It’s sort of a recruiting thing for them, I think – if you solve all the challenges (and especially if you solve them quickly), sounds like you might get an interview/job offer – after all, “The challenge runs the gamut of skills we believe are necessary to succeed on the FLARE team”.1

Shortly after the contest was over, FireEye published solutions for all the challenges. As you can see, the solution for Challenge 1 is not particularly complex, at least so far as CTF challenges go… but, it does require you, the reverse-engineer, to actually understand what the challenge1.exe2 program is doing.

At the time, I was particularly interested in Angr, a symbolic execution and binary analysis framework. Having spent a lot of time reversing binaries and manually figuring out how to get from program entry points to interesting places in the program, I found the idea that you can tell Angr “figure out the conditions that get me from here to there” both intriguing and appealing.3 If you open challenge1.exe in IDA, you’ll see that it’s a great candidate for Angr: a short main() function with two possible outcomes (“Correct!” or “Wrong password”). So let’s just tell Angr to get us executing down the “Correct!” path, right? And then print out the relevant condition(s) that got us there.4

[Image: main() of challenge1.exe in IDA (/images/2017/07/challenge1_main_ida.PNG)]
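
To make the “from here to there” idea a bit more concrete, here is a toy sketch in plain Python. It is not Angr and not real symbolic execution: the control-flow graph, the block names, and the `explore`/`solve` helpers are all made up for illustration, and a brute-force loop over one byte stands in for the constraint solver. It only shows the shape of the approach: walk toward a "find" block, drop any path that hits an "avoid" block, and solve the branch conditions collected along the way.

```python
# Toy control-flow graph: block name -> list of (branch condition, successor).
# Each condition is a predicate over a single input byte. All names here
# are hypothetical; none of this comes from challenge1.exe.
CFG = {
    "entry": [(lambda b: b > 0x40, "check2"), (lambda b: b <= 0x40, "wrong")],
    "check2": [(lambda b: b % 2 == 1, "correct"), (lambda b: b % 2 == 0, "wrong")],
    "correct": [],
    "wrong": [],
}


def explore(cfg, start, find, avoid):
    """Depth-first search for a path to `find` that never enters `avoid`.

    Returns the branch conditions collected along that path, or None.
    """
    stack = [(start, [])]
    while stack:
        block, conds = stack.pop()
        if block == find:
            return conds
        if block == avoid:
            continue  # abandon this path, like Angr's `avoid`
        for cond, succ in cfg[block]:
            stack.append((succ, conds + [cond]))
    return None


def solve(conds):
    """Brute-force stand-in for a solver: first byte satisfying every condition."""
    for b in range(256):
        if all(cond(b) for cond in conds):
            return b
    return None


conds = explore(CFG, "entry", find="correct", avoid="wrong")
answer = solve(conds)
print(hex(answer))  # 0x41: the smallest odd byte greater than 0x40
```

The real thing differs in the important ways: Angr derives the conditions from the binary's instructions instead of a hand-written table, and hands them to Z3 instead of looping over every possible value.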

Here’s the reasonably-heavily commented solution:

#!/usr/bin/env python2
import angr

angr.path_group.l.setLevel('DEBUG')


def writefile_hook(state):
    print('writefile hook')
    # Just return "success"
    # https://msdn.microsoft.com/en-us/library/windows/desktop/aa365747(v=vs.85).aspx
    state.regs.eax = 1


def readfile_hook(state):
    print('readfile hook')
    # Just return "success"
    # https://msdn.microsoft.com/en-us/library/windows/desktop/aa365467(v=vs.85).aspx
    state.regs.eax = 1


def buffer_hook(state):
    print('buffer hook')
    # Store the symbolic address of our symbolic 'user_input' buffer in
    # eax, which is where the real 'user_input' buffer would be if the program
    # was _actually_ executing.
    # http://angr.io/api-doc/simuvex.html?highlight=store#simuvex.storage.memory.SimMemory.store
    state.memory.store(state.regs.eax, user_input)
    # print state.memory.load(state.regs.eax)


def malloc_hook(state):
    print('malloc_hook')
    # Just return a concrete value.
    # I think I picked this mostly at random, and it worked.
    # Thankfully, malloc is only called once in `challenge1.exe`,
    # otherwise we might have to do something fancier here.
    state.regs.eax = 0xC0000000

# Load the project. Don't load libraries, for multiple reasons:
# * Speed: https://docs.angr.io/docs/speed.html
# * CLE, Angr's binary loader, doesn't support
#   Windows binaries super well: https://github.com/angr/cle
p = angr.Project('challenge1.exe', load_options={'auto_load_libs': False})

# Need to hook WriteFile, since we're not loading any libraries.
# Even if we were, we wouldn't want to make Angr symbolically execute
# all that anyway...
p.hook(0x401457, writefile_hook, length=6)

# Need to hook ReadFile, since we're not loading any libraries.
# Even if we were, we wouldn't want to make Angr symbolically execute
# all that anyway...
p.hook(0x401473, readfile_hook, length=6)

# Need to hook the location where the pointer to the user's input is
# passed as an argument to the encoding function. Our hook will
# replace whatever is there with a pointer to our symbolic
# 'user_input' buffer.
p.hook(0x401480, buffer_hook, length=6)

# Need to hook malloc, mostly for speed reasons -- without it,
# I'm not sure if this would ever complete.
p.hook(0x401283, malloc_hook, length=5)

# Note: For all hooks, the first argument is the address of the
# instruction to begin the hook at. The second argument is the
# hook function, which gets executed when the address (first
# argument) is hit. The length is the number of bytes to hook.
# I've set it to the length of the instruction at that address.
# I'm frankly not sure what happens if you don't set the length.
# To get the length, I just set the 'number of opcode bytes (graph)'
# option under IDA's General settings to 8. When you're in graph mode,
# IDA will display the instruction's bytes. Count them to get the
# length of the instruction.

# Note the starting `addr` -- had to bypass the initialization and loading
# stuff because it caused Angr errors...
initial_state = p.factory.blank_state(addr=0x40143C)

# Create our symbolic user_input buffer, which we'll use when we make Z3 solve
# for the correct password! The length 0x80 was chosen because I didn't expect
# the password to be super short (i.e. less than 10 chars or something --
# that'd be too easy to brute force), but I also didn't expect it to be so
# long that someone wouldn't be able to type it in a reasonable amount of time.
user_input = initial_state.se.BVS("user_input", 8 * 0x80)

# Make an initial path from our state...
initial_path = p.factory.path(initial_state)

# Create a path group from that path...
path_group = p.factory.path_group(initial_state)

# Tell Angr where to go, and where to avoid (one is "Correct!", the
# other is "Wrong password").
path_group.explore(find=(0x4014AE,), avoid=(0x4014C7,))

# Once the "Correct!" path is found...
found = path_group.found[0]

# Print the flag!
print('FLAG: {0}'.format(found.state.se.any_str(user_input)))
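
The script above targets the Python 2 era Angr API (`path_group`, `state.se`, `any_str`), which later releases renamed. Below is a rough sketch of the same approach written against the newer names (`simulation_manager`, `state.solver`, `solver.eval(..., cast_to=bytes)`). Treat it as a starting point rather than a verified port: the addresses and hook lengths are copied from the script above, I have not re-run it against a current Angr release, and `trim_at_nul` and `solve_challenge1` are hypothetical helper names. `trim_at_nul` also assumes that whatever follows the password in the 0x80-byte buffer starts with a NUL byte.

```python
import logging

# Addresses and sizes copied from the original script above.
WRITEFILE_CALL = 0x401457
READFILE_CALL = 0x401473
BUFFER_ARG = 0x401480
MALLOC_CALL = 0x401283
START_ADDR = 0x40143C
FIND_ADDR = 0x4014AE   # "Correct!"
AVOID_ADDR = 0x4014C7  # "Wrong password"
INPUT_LEN = 0x80


def trim_at_nul(raw):
    """Hypothetical helper: keep the bytes before the first NUL and decode them."""
    return raw.split(b'\x00', 1)[0].decode('ascii', errors='replace')


def solve_challenge1(binary_path='challenge1.exe'):
    """Sketch of the original solution using the renamed Angr APIs."""
    import angr  # imported lazily so trim_at_nul works without Angr installed

    logging.getLogger('angr').setLevel(logging.INFO)
    p = angr.Project(binary_path, load_options={'auto_load_libs': False})
    state = p.factory.blank_state(addr=START_ADDR)
    user_input = state.solver.BVS('user_input', 8 * INPUT_LEN)

    def return_success(st):
        # WriteFile/ReadFile: just report success.
        st.regs.eax = 1

    def malloc_hook(st):
        # Same arbitrary concrete address as the original script.
        st.regs.eax = 0xC0000000

    def buffer_hook(st):
        # Overwrite the input buffer with our symbolic bytes.
        st.memory.store(st.regs.eax, user_input)

    p.hook(WRITEFILE_CALL, return_success, length=6)
    p.hook(READFILE_CALL, return_success, length=6)
    p.hook(BUFFER_ARG, buffer_hook, length=6)
    p.hook(MALLOC_CALL, malloc_hook, length=5)

    simgr = p.factory.simulation_manager(state)
    simgr.explore(find=FIND_ADDR, avoid=AVOID_ADDR)
    if not simgr.found:
        return None
    # In the newer API, entries in `found` are states, not paths.
    raw = simgr.found[0].solver.eval(user_input, cast_to=bytes)
    return trim_at_nul(raw)
```

With Angr installed and `challenge1.exe` in the working directory, calling `print('FLAG: {0}'.format(solve_challenge1()))` would be the equivalent of the last line of the original script.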

It takes about 15 minutes to solve on an i7-6700HQ. If you have any suggestions for efficiency improvements, I’m all ears.

I did use Angr in a Sandstorm iPython/Jupyter notebook to help prototype the solution, which was pretty nice. I haven’t updated that Sandstorm package since that post… I should do that. >_>

Footnotes


  1. Something to keep in mind as this fall rolls around, if you’re looking for a job. ↩︎

  2. You can download all the challenge binaries here: http://flare-on.com/files/Flare-On3_Challenges.zip ↩︎

  3. Of course there is a great deal of value in spending time doing this manually – you’ll learn a lot about how programs and computers work! But after a while, you will probably start to wonder if any parts of that problem solving process can be automated/abstracted… ↩︎

  4. Sounds maybe sorta easy, right? But really, at least in this case, we’re just choosing to invest brain cycles in understanding how Angr works, instead of the similar amount of brain cycles it would take to understand the program’s actual password encoding algorithm… but I think this is a worthwhile investment. :) ↩︎