reverse engineering: for fun's sake

Serious buffer tracking PoC progress

August 13, 2019

Foreword

Well I’ll be damned (or maybe I don’t believe in God, who cares if I do?). This whole buffer tracking PoC has been a right pain. While coding it up I went through stages of thing, “No, it’s impossible” to “Ha, this is so easy”. Mostly the impossible one…

But I do have a clear way forward now. I have a partial PoC working for libc memcpy copies. Manual naive copies in C still cause issues, but I have a way forward for that.

I have come to realise i suck at RE… But it’s fun to blog so I’ll keep on going.

Aim

I’ll report a TL;DR, some embarrassing things about how I thought this could be done initially…, and where the PoC is at as of now.

Plus I’ve uploaded all of the code as it stands now, so the adventurous can take a look and even verify if super keen.

TL;DR

It turns out the way to do this is to work what registers depend on.

Consider the instruction:

add esi, [rax + 0x10]

From the assembly, it is clear esi depends on rax. But the PCode for this instruction is not so straight forward:

        00125724 03 70 10        ADD        ESI,dword ptr [RAX + 0x10]
                                                      $U620:8 = INT_ADD RAX, 16:8
                                                      $U1f50:4 = LOAD ram($U620)
                                                      CF = INT_CARRY ESI, $U1f50
                                                      $U1f50:4 = LOAD ram($U620)
                                                      OF = INT_SCARRY ESI, $U1f50
                                                      $U1f50:4 = LOAD ram($U620)
                                                      ESI = INT_ADD ESI, $U1f50
                                                      RSI = INT_ZEXT ESI
                                                      SF = INT_SLESS ESI, 0:4
                                                      ZF = INT_EQUAL ESI, 0:4

However, observe PcodeOps have at most one output and at least one input. It can be concluded that the output of a PcodeOp depends on its inputs.

So?

Well, an instruction is a list of PcodeOps, you can start at the bottom (at an output register) and go back and progressively work out all of the dependants until you get to a register again.

Solved? No.

As you can see in the add instruction, the address which is loaded for the add is an equation based on rax and other things.

The progress reported here has dependency tracking but not equation calculations.

How not to do it…

Just taking instruction inputs and output and doing a heuristic is so very limited. Plus the edge cases…

I mean imagine the add instruction above, you can get the inputs but you don’t know how 0x10 relates to rax.

This applies the same for plainly taking the PCode inputs and outputs too.

Tracking dependencies

Dependencies without equations is super simple, consider following which finds the earliest dependency for a given register:

def get_deps(pcodes):
    
    deps = []
    
    dinputs = [y for y in pcodes[0].getInputs()]
    deps.extend(dinputs)
    for dpcode in pcodes[1:]:
        doutput = dpcode.getOutput()
        if doutput is not None and (not deps or doutput in deps):
            dinputs = [y for y in dpcode.getInputs()]
            deps.extend(dinputs)
            
    return deps

def update_reg(program, instruction, reg):
    
    deps = []
    pcodes = [x for x in instruction.getPcode()]
    for x, pcode in enumerate(pcodes):
        output = pcode.getOutput()
        if output is not None and output.register:
            deps.extend(get_deps(pcodes[x:0:-1]))
    
    for dep in deps:
        if dep.register and program.getRegister(dep) == reg:
            return program.getRegister(dep)

So simple ay? It took me a long time to get to this point, it’s just embarrassing, but hey, I report it here and maybe I save someone 5mines somewhere.

However, the next steps in the PoC is doing dependencies with equations, which is less simple… Though I think quite achievable.

Tracking register dependencies through a basic block

So, we arrive at a watch point, what do we do? My basic thought is to update the register of interest as dependencies change:

NOOP = {
    'op': None,
}
OPEND = {
    'op': 'end',
}
def wp_track(file_name=None, start_offset=None):
    
    program = get_program(file_name)
    if program is None:
        return NOOP
    
    instruction = Programs.get_instruction(file_name, start_offset)
    if instruction is None:
        return NOOP
    
    instruction = offset_inst(instruction, -1)
    if instruction is None:
        return NOOP
    
    reg = load_reg(program, instruction)
    print('load_reg: %s' % reg)
    if reg is None:
        return NOOP
    
    while True:
        instruction = instruction.getNext()
        if is_branch(instruction):
            return {
                'op': 'branch',
                'args': [instruction.getAddress().getOffset()],
            }
            
        if is_end(instruction):
            return OPEND
        
        sreg = store_reg(program, instruction, reg)
        if sreg is not None:
            return {
                'op': 'store',
                'args': [
                    sreg.getName(),
                    instruction.getAddress().getOffset()
                ],
            }
            
        reg = update_reg(program, instruction, reg)

Not much to it really.

The repos

Getting there

So I don’t think this is a full PoC because the equations are tracked and calculated. That is my next piece of work, I don’t think it will be too bad actually, but we’ll see.


Dan Farrell

Written by Dan Farrell who lives and works in Seattle tinkering away on firmware. To subscribe send an email to subscribe@re-ffs.com.