reverse engineering: for fun's sake

Introducing a dynamic/programmatic SRE technique

July 14, 2019

Foreword

Ok, we’re getting closer to doing some SRE. Sorry it’s taking a while. I really do procrastinate quite badly, plus I have a full-time job.

The bad news: no SRE is actually done here.

The good news: I’m introducing a new(*) technique.

* Well, new to me. I used to do static analysis. It was just when I got to the point of combining static and dynamic analysis that I had to leave the industry. Well, maybe I could have stayed in the industry, but a lot changed for me in a short period of time and I decided to become a Software Engineer.

Aim

This time I want to provide the basis for the simplest dynamic/programmatic SRE I can think of. As a bonus it will be used when I tackle the illegitimate software.

In particular I’m going to propose an algorithm that combines GDB debugging and programmatic static analysis in Ghidra to track when and where memory is copied to another memory location.

I won’t actually implement it in this post. I have to get a wriggle on and modify PyGDB and make a Ghidra script for this. But, really, does it matter how I implement it? As long as I describe how it works anyone can implement it.

Why is this useful?

Spoiler alert: the illegitimate software that I will tackle has hidden stolen keys that are used to decrypt data. I don’t know where the keys are, but I know where the encrypted data is.

I can track the encrypted data to the decrypt algorithm. Giving me the keys? No, giving me a key that has probably already been expanded. Last time I looked there were many decrypt functions around and the code was obfuscated, so I couldn’t even find the actually executed decryption code!

Spoiler alert: once I know where the decrypt algorithm is, I can use reverse execution (which GDB supports, and which can be driven from its Python extensions) to find where the key originates.

That’s the game, folks. I think this could be useful in SRE of malware too.

The algorithm

Requirements:

  • A reference to the data

Let data_addr be the address of the data of interest

Set read watchpoint at data_addr (GDB's rwatch, since a plain watch only fires on writes)

When watchpoint hit
    Use Ghidra API to get destination register of data_addr access

    Let src_reg be the instruction destination register
    Let current_pc be the current program counter

    While true
        Let inst be the instruction at current_pc

        If src_reg is copied to memory
            Add comment to Ghidra about copy
            Return new memory address
        Else if src_reg is copied to new destination register
            Update src_reg with new destination register
        Else if src_reg is overwritten by something else
            Stop, the data was discarded

        Step GDB (this follows jumps whichever way they go)
        Update current_pc

As you can see, the algorithm is using dynamic debugging with GDB and programmatic SRE with Ghidra.
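
To make the loop concrete, here is a minimal Python sketch of just the register-tracking part, run over a hand-written trace instead of a live GDB session. The `Inst` record, its `kind` values and the addresses in `TRACE` are all made up for illustration. A real version would fill them in from Ghidra's decoding and GDB's register values. Like the pseudocode, it follows a single register, and it also gives up if that register gets overwritten, which is my assumption about what a sensible stop condition looks like.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Inst:
    """Hypothetical, already-decoded instruction (not a Ghidra or GDB type)."""
    addr: int                       # address of the instruction
    kind: str                       # "mov_reg", "store" or "other"
    src: Optional[str] = None       # register the instruction reads
    dst: Optional[str] = None       # register the instruction writes
    mem_addr: Optional[int] = None  # effective address written by a store


def track_copy(trace: Iterable[Inst], start_reg: str) -> Optional[Tuple[int, int]]:
    """Follow start_reg through the executed instructions.

    Returns (instruction address, destination memory address) for the first
    store of the tracked register, or None if the data is lost first.
    """
    reg = start_reg
    for inst in trace:
        if inst.kind == "store" and inst.src == reg:
            return inst.addr, inst.mem_addr  # data copied to memory
        if inst.kind == "mov_reg" and inst.src == reg:
            reg = inst.dst                   # data moved to another register
        elif inst.dst == reg:
            return None                      # tracked register overwritten
    return None


# Toy trace in execution order: the data starts out in rax.
TRACE = [
    Inst(0x1000, "other", dst="rbx"),               # unrelated work
    Inst(0x1004, "mov_reg", src="rax", dst="rcx"),  # rax -> rcx
    Inst(0x1008, "other"),                          # a taken jump
    Inst(0x1020, "store", src="rcx", mem_addr=0x7FFC0010),
]

if __name__ == "__main__":
    print(track_copy(TRACE, "rax"))
```

With the toy trace, `track_copy(TRACE, "rax")` follows rax into rcx and reports the store at 0x1020. Under GDB the trace would instead be a generator that single-steps with `gdb.execute("stepi")` and decodes whatever sits at `gdb.selected_frame().pc()`.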

But it is quite simple, and could easily be countered. I guess I’m hoping this technique isn’t on people’s minds right now for obfuscated software. For legit software this should be pretty reliable.

Obvious limitation

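Watchpoints are actually hardware breakpoints. These are a scarce resource: in fact, from what I understand, x86 has only four of them (debug registers DR0 to DR3), and each one covers at most 8 aligned bytes! GDB can fall back to software watchpoints, but those single-step the program and can't catch reads.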
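
To get a feel for how quickly that budget runs out, here is a small Python model of the constraint. It assumes the x86-64 rules as I understand them: four debug address registers, each covering 1, 2, 4 or 8 bytes at an address aligned to that size. The function name and the greedy splitting are my own illustration, not GDB's actual allocation code.

```python
def debug_registers_needed(addr: int, length: int, max_len: int = 8) -> int:
    """Count the aligned 1/2/4/8-byte chunks needed to cover [addr, addr + length).

    max_len is the widest chunk one register can watch (assumed 8 on x86-64,
    4 on 32-bit x86) and must be a power of two.
    """
    count = 0
    end = addr + length
    while addr < end:
        size = max_len
        # Shrink the chunk until it is aligned and fits in what is left.
        while addr % size != 0 or addr + size > end:
            size //= 2
        addr += size
        count += 1
    return count


if __name__ == "__main__":
    for start in (0x1000, 0x1001):
        print(hex(start), debug_registers_needed(start, 16))
```

An aligned 16-byte buffer fits in two registers under this model, but the same buffer starting one byte later needs five chunks, which is already more than the four available.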

So that’s it

All I have to do now is code it up and provide a POC… Not sure how long that will take, but it should be fun.


Dan Farrell

Written by Dan Farrell who lives and works in Seattle tinkering away on firmware. To subscribe send an email to subscribe@re-ffs.com.