Progress on buffer tracking PoC
July 29, 2019
Foreword
I’m making progress on the buffer tracking PoC, but I want all of the code to be portable: the PyGDB plugin for this need not be x86-specific. That has made things a little fiddly, but I am making progress, which I report here.
Things done so far:
- Added PyGDB watchpoint support
- Added Ghidra/PyGDB API
- Added ghidraremote PyGDB plugin
- Started buffertracker PyGDB plugin
- Proved to myself that Ghidra provides enough PCode/Instruction APIs to get this done
So far, a PyGDB script can get symbols from Ghidra (including user-created symbols), set a watchpoint, and run an action when the watchpoint is hit.
What remains:
- Support matching symbols from loaded libraries between PyGDB and Ghidra
- Coding up the algorithm for finding the next address for a watchpoint
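For the first remaining item, one plausible approach is to rebase addresses: Ghidra reports addresses relative to the image base it chose at import, while the debugger sees the library wherever the loader mapped it. This is an assumption about how the matching could work, not something the plugin does yet. The helper name `rebase_address` and the example base addresses are hypothetical.

```python
def rebase_address(ghidra_addr, ghidra_image_base, runtime_base):
    """Translate an address from Ghidra's view of a module to its runtime address.

    All three arguments are plain integers. How the two bases are obtained
    is left out here and is an assumption of this sketch.
    """
    offset = ghidra_addr - ghidra_image_base
    if offset < 0:
        raise ValueError('address 0x%x is below image base 0x%x'
                         % (ghidra_addr, ghidra_image_base))
    return runtime_base + offset


# Hypothetical values: Ghidra imported the library at 0x100000,
# the loader mapped it at 0x7ffff7dd0000.
runtime_addr = rebase_address(0x101234, 0x100000, 0x7ffff7dd0000)
print('Runtime address: 0x%012x' % runtime_addr)
```

The same offset arithmetic works in the other direction, for turning an address seen in the debugger back into one that can be looked up in Ghidra.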
Aim
The aim of this post is to show, with some snippets, that a programmatic, architecture-independent method for buffer tracking is feasible.
High level view
The high-level idea is to make it possible to script finding a buffer address based on reverse engineering, and then automatically follow the buffer through execution.
For example, this script for PyGDB should not change much from now:
#!/usr/bin/env python3
import sys
import os
#sys.path.append(os.getcwd())
import PyGDB
PyGDB.gdb = gdb
from PyGDB.bp import BPStopper
from PyGDB.env import Env
from PyGDB.bp_mgr import BPMgr
from PyGDB.plugins.ghidraremote import ghidraremote
from PyGDB.plugins.buffertracker import buffertracker
from PyGDB.fn_find import find_function, find_symbol
main_bp = BPStopper('__libc_start_main')
gdb.execute('run')
main_bp.enabled = False
env = Env()
client = ghidraremote.initialise()
tracker = None
first_malloc_ret = client.do_get_symbol_address(name='first_malloc')['address']
if first_malloc_ret is None:
    print('Could not find: first_malloc in Ghidra')
else:
    print('First malloc: %012x' % first_malloc_ret)
    tracker = buffertracker.initialise_on_ret(client, first_malloc_ret)
bp_mgr = BPMgr()
bp_mgr.run()
print(tracker)
client.disconnect()
This shows the script getting the address just after the malloc call from Ghidra, where it was found by static analysis, and setting up a buffer tracker based on it.
Setting up a watchpoint
Then, in the tracker, we have the initialiser:
def initialise_on_ret(client, ret_addr):
    '''Initialises the BufferTracker based on a ret value

    Args:
        client (CmdClient): Ghidra client
        ret_addr (int): Where the ret is valid
    '''
    tracker = BufferTracker(client)
    tracker.start_ret(ret_addr)
    return tracker
And how it gets started:
def start_ret(self, ret_addr):
    '''Sets up the tracker for the ret value at ret_addr'''
    self.bp_mgr.add(ret_addr, on_stops=[self.on_start_ret])

def on_start_ret(self):
    '''Callback for hitting start_ret ret_addr'''
    buffer_addr = self.abi.get_return()
    print('Buffer address: 0x%012x' % buffer_addr)
    self.update_wp(buffer_addr)

def update_wp(self, addr):
    '''Updates the buffer watch point

    Args:
        addr (int): the buffer address
    '''
    if self.wp_addr is not None:
        self.bp_mgr.remove(self.wp_addr, on_stops=[self.on_wp])
    self.wp_addr = addr
    self.bp_mgr.add(self.wp_addr, on_stops=[self.on_wp],
                    _type=self.bp_mgr.WP)
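The call to `self.abi.get_return()` is where architecture independence comes in. Below is a minimal sketch of how such an ABI layer could look, assuming a table keyed by architecture name. The class name `SimpleABI`, the table keys, and the `read_register` callable are all hypothetical and are not the plugin's actual code; the register names are the integer return registers of the usual C calling conventions.

```python
# Integer return-value register per architecture, following the usual
# C calling conventions. The keys are this sketch's own labels.
RETURN_REGISTERS = {
    'x86_64': 'rax',
    'x86': 'eax',
    'aarch64': 'x0',
    'arm': 'r0',
}


class SimpleABI:
    """Hypothetical ABI helper; not the plugin's actual class."""

    def __init__(self, arch, read_register):
        if arch not in RETURN_REGISTERS:
            raise ValueError('unsupported architecture: %s' % arch)
        self.arch = arch
        # read_register is any callable mapping a register name to an int.
        # Inside GDB it could wrap gdb.selected_frame().read_register().
        self.read_register = read_register

    def get_return(self):
        '''Returns the integer return value of the call that just finished'''
        return self.read_register(RETURN_REGISTERS[self.arch])


# Stand-in register file so the sketch runs outside a debugger.
fake_registers = {'rax': 0x5555deadb000, 'x0': 0xaaaab000}
abi = SimpleABI('x86_64', fake_registers.__getitem__)
print('Buffer address: 0x%012x' % abi.get_return())
```

Keeping the register lookup behind a callable means the tracker logic never names a register directly, so supporting another architecture should only need a new table entry.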
Basic Ghidra code
Adding remote APIs to Ghidra is quite straightforward:
def test(a=None, b=None, c=None):
    ret = {}
    ret['entry_point'] = getFunctionContaining(currentAddress).getEntryPoint()\
        .getOffset()
    return ret

def get_symbol_address(name=None):
    '''Finds the symbol address

    Args:
        name (str): name of the symbol
    Returns:
        dict: return value
    '''
    ret = {}
    if not name:
        return ret
    symbol_table = currentProgram.getSymbolTable()
    symbols = symbol_table.getSymbols(name)
    symbol = None
    for x in symbols:
        if symbol is not None:
            raise RuntimeError('More than one symbol named "%s"' % name)
        symbol = x
    if symbol is None:
        ret['address'] = None
    else:
        ret['address'] = symbol.getProgramLocation().getAddress().getOffset()
    return ret

server = CmdServer()
server.add_command('test', test)
server.add_command('get_symbol_address', get_symbol_address)
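`CmdServer` itself is not shown in this post. As a rough idea of what `add_command` implies, here is a minimal sketch of a name-to-function dispatcher that takes JSON requests. The class name `TinyCmdServer`, the request shape, and the `handle` method are assumptions made for illustration; the real server's transport and wire format may well differ.

```python
import json


class TinyCmdServer:
    """Hypothetical stand-in for CmdServer: maps command names to callables."""

    def __init__(self):
        self.commands = {}

    def add_command(self, name, func):
        self.commands[name] = func

    def handle(self, request_text):
        """Runs one JSON request and returns a JSON reply string.

        Assumed request shape: {"cmd": "<name>", "args": {...}}
        """
        request = json.loads(request_text)
        func = self.commands.get(request.get('cmd'))
        if func is None:
            return json.dumps({'error': 'unknown command'})
        # Keyword arguments line up with the name=None style used above.
        return json.dumps(func(**request.get('args', {})))


def get_symbol_address(name=None):
    # Stub with a made-up symbol table, in place of the Ghidra lookup.
    return {'address': {'first_malloc': 0x401136}.get(name)}


server = TinyCmdServer()
server.add_command('get_symbol_address', get_symbol_address)
reply = server.handle(
    '{"cmd": "get_symbol_address", "args": {"name": "first_malloc"}}')
print(reply)
```

Returning plain dicts from each command, as the Ghidra functions above do, keeps them easy to serialise whatever the transport ends up being.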
Up next
I’m really hoping to have this buffer tracking done for the next post, and I’ll try to provide a complete example that you can run yourself.
Written by Dan Farrell who lives and works in Seattle tinkering away on firmware. To subscribe send an email to subscribe@re-ffs.com.