Notes: * Hi everyone! * My name is datalocaltmp, or datalocal"temp". * In today's talk, titled "A Ghidra Visualization is worth a thousand GDB Breakpoints", I'll be describing how I've enhanced my reverse engineering and debugging tasks through code coverage visualizations. * Thanks to BSides Montreal for the space and all the hard work they've done gathering everyone!
Notes: * Independent security researcher focused on mobile * Previously dedicated to researching privacy within mobile apps and featured in TechCrunch as "theappanalyst" * Some notable bounty programs I've worked with include Bird Scooters, Biden Campaign App, Ring Cameras, Match.com. * Nowadays I focus on mobile platform security and in particular reverse engineering the native layer * Virtual Reality Enthusiast
Notes: * My goal today is to introduce you to the world of code coverage visualizations, all within 25 minutes. I'll try to do this in three parts: * First, I'll describe a scenario where you would have to do traditional debugging with gdb and a decompiler, and highlight some of the shortcomings. * Next, I'll show you how to generate these visualizations, specifically for a common Android app (in this instance Facebook Messenger). * Then I'll finish by revisiting the first scenario using code coverage visualizations, and hopefully illustrate their benefit.
Notes: * To start, I'd like to talk about when you would conduct traditional debugging or reverse engineering. * Often debugging and reversing happens when you have a binary and you want to understand what it's doing but don't have the source code to guide you. * For example: corporate mobile apps, imported libraries, malware. * This process generally consists of static and dynamic components: * Static: decompiling * Dynamic: debugging
Notes: * Because I want to have an example that we can actually demo the coverage generation against later, I've written a bit of a toy scenario for the Meta Quest 2. * Let's say you're a game dev who wants to interface with a utilities library on the Quest 2. * Your game calls a function that returns a process's name given a pid, and you're hoping to use this to catch cheaters. * You go ahead and test out your game and...
Notes: * Your anti-cheat engine crashes the game, all because you wanted to get the process name using that utilities library. * A couple of notes on this crash dump: * 1. We can see that we're running this from the command line; that's a toy program I've written, which we'll use later to generate coverage. * 2. We can see where the program crashed, so we can get underway setting our breakpoints. * And this is when we'd start the debugging process.
Notes: * So finally we can say the general process will look like this: 1. Decompile the binary and examine the getProcessName function. 2. Run your game with the debugger attached. 3. Set a breakpoint on the crashing function. 4. Iterate until you understand the bug.
Notes: * This is going to take a lot of iterations to get to the root of the bug. * During the decompilation you might be tempted down a path that has no relevance to the bug at hand. * Finally, note taking is prone to human error; you might write down the wrong offset and investigate something useless. * So in general it'd be really nice if we could quickly collect the ground truth regarding what's executed.
Notes: * And we can, thanks to all these fine tools. * Due to time, the cliff notes are that: * Lighthouse generates the coverage information by using Frida to inspect each assembly instruction executed. * Cartographer loads the coverage information into Ghidra for visualization. * Note that I had to patch Cartographer to work with the older coverage output produced by Lighthouse; that patched Cartographer is available on my git.
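Notes: * As a rough sketch of what sits between the two tools, the snippet below reads a coverage log and counts executed blocks per module. It assumes the file follows the drcov version 2 layout (a text header with a seven-column module table, then a "BB Table: N bbs" line, then packed 8-byte entries); whether the file used in this talk matches that layout exactly is an assumption, and `parse_drcov` and `blocks_per_module` are hypothetical helper names, not part of Lighthouse or Cartographer.

```python
import struct
from collections import Counter


def parse_drcov(data: bytes):
    """Split a drcov v2 log into a module table and a list of executed blocks."""
    marker = b"BB Table: "
    table_at = data.index(marker)            # the text header ends here
    line_end = data.index(b"\n", table_at)   # end of the "BB Table: N bbs" line
    bb_count = int(data[table_at + len(marker):line_end].split()[0])

    modules = {}
    in_table = False
    for line in data[:table_at].decode("utf-8", "replace").splitlines():
        if line.startswith("Columns:"):
            in_table = True                  # module rows follow this line
        elif in_table and line.strip():
            # Assumed columns: id, base, end, entry, checksum, timestamp, path
            fields = [f.strip() for f in line.split(",", 6)]
            modules[int(fields[0])] = fields[-1]   # module id -> path

    # Each entry: uint32 offset from module base, uint16 size, uint16 module id.
    body = data[line_end + 1: line_end + 1 + bb_count * 8]
    blocks = list(struct.iter_unpack("<IHH", body))
    return modules, blocks


def blocks_per_module(modules, blocks):
    """Count executed blocks per module path."""
    counts = Counter(modules.get(mod_id, "?") for _, _, mod_id in blocks)
    return dict(counts)
```

Calling `parse_drcov(open(path, "rb").read())` on a real log would give the module list to pick from; a truncated or differently versioned file will make this sketch raise rather than guess.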
Notes: * So let's hop into it and generate some coverage. * I've picked Facebook Messenger since it's somewhat like a lorem ipsum of applications. * I've picked an arbitrary function that runs when we open a message. * It's worth noting that the library we'll look at is one of hundreds of dependencies in Messenger, * and this example really could easily extend to something like mobile game development.
Notes: * I'll have a short video after this, but the general process is: * You run frida-server on the device executing the program. * Note that it can help to first determine when the function is called using frida-trace. * Navigate within the Lighthouse tool and run their coverage generator for the application. * You may crash the process if there are too many threads. * Note that this process only works for Android apps (hint: it wouldn't work for the previous Quest example). * Finally, load the output coverage file into the Cartographer Ghidra extension.
Notes: * So this is our first attempt at following the process that I outlined above.
Notes: * This time we'll do the same thing but narrow our focus to the thread which is executing the function we're interested in.
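Notes: * A minimal sketch of how the two runs (all threads, then one thread) might be assembled as command lines for Lighthouse's Frida coverage script. The flags `-o`, `-t` and `-D` are as recalled from that script's help output and should be confirmed with `--help`; the script path, pid, thread id and device id are hypothetical placeholders, and `drcov_command` is an illustrative helper, not part of Lighthouse.

```python
def drcov_command(target, outfile="frida-cov.log", thread_ids=(), device=None,
                  script="frida-drcov.py"):
    """Build an argv list for the coverage script (flags are assumptions)."""
    argv = ["python3", script, "-o", outfile]
    if device:
        argv += ["-D", device]        # select a specific device by id
    for tid in thread_ids:
        argv += ["-t", str(tid)]      # restrict tracing to these threads
    argv.append(target)               # process name or pid
    return argv


# First attempt: trace every thread of a hypothetical pid.
all_threads = drcov_command("12345")

# Second attempt: trace only the (hypothetical) thread running our function.
one_thread = drcov_command("12345", outfile="thread.log", thread_ids=[12399])
```

Either list can be passed to `subprocess.run` once frida-server is running on the device; narrowing to one thread is what keeps the process from being overwhelmed.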
Notes: * So now, what does it look like to load that coverage data into Ghidra?
Notes: * Here we are at the function we've just generated coverage for
Notes: * We load the coverage data and select the module we're interested in. * I've modified the coverage data to subtly highlight the specific module we're loading.
Notes: * O
Notes: * And now we can answer the question I posed earlier: which blocks within the blue and red highlighted area are executed? * Compared to the classic process of using GDB breakpoints, we immediately know the ground truth for program execution.
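Notes: * Stripped of the highlighting, the question reduces to a range lookup: is a block's start offset inside any executed range? The sketch below shows that check on made-up offsets; `covered` and `mark_blocks` are hypothetical names, and this is not how Cartographer itself is implemented.

```python
def covered(addr, executed_ranges):
    """True if addr lies inside any executed [start, start + size) range."""
    return any(start <= addr < start + size for start, size in executed_ranges)


def mark_blocks(block_starts, executed_ranges):
    """Map each candidate block start offset to True (executed) or False."""
    return {addr: covered(addr, executed_ranges) for addr in block_starts}
```

A range check is used instead of exact matching because a traced block can span more than one of the decompiler's basic blocks.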
Notes: * So what if we want to generate coverage for processes that aren't apps, or aren't running on a rooted device (i.e., we can't run frida-server)?