Reverse Engineering Is Not Just for Hackers

We spend a lot of time putting apps together, but when was the last time you pulled one apart? If we can streamline the process of looking inside a compiled application then we’re more likely to employ it to answer questions and teach us valuable lessons we can apply to our work. This talk will present simple real-world examples for maximum practical benefit using some of the ever improving set of reverse engineering tools for Android. There’s an incredible amount that can be learned from taking things apart!


Intro (0:00)

I’m Jon Reeve and this is “Reverse Engineering Is Not Just For Hackers.” If you go to a stock photo website and search for hacking, the images show something malicious or some sort of malware research. I feel differently about this and that’s why I created this talk.

Why Take Things Apart? (1:19)

Consider the Large Hadron Collider (LHC) at CERN, which smashes particles apart to see what’s inside. Anything, from the cutting edge of technology to a washing machine, can teach you something when you take it apart. An application shouldn’t be any different, and we don’t necessarily need the source code to be able to see the source code.

For instance, you might wonder why an app is asking for a specific permission. Perhaps an app that wants to do something funny with your camera also seems to want to read all your contacts and send them somewhere. Maybe it wants to send SMS to other things that cost you money, and you can’t see any reason why.

Perhaps an app is crashing, so it might be your app, but it might be a third party library, something for analytics or ad networks, something that you might not have the source for. Maybe it’s not your app, but you’re a developer, and maybe you’re running the newest release of Android 0, and this app that you really like is crashing, maybe you can provide a little bit more info and help the developer fix this a bit sooner. Maybe there’s just something cool in this app that you want to see.

I work on both contract and freelance, and a lot of the time when I’m brought in on a project, teams have a lot of ideas from looking at other apps and saying they want to do something similar. Obviously, you wouldn’t necessarily want to copy these things, but it’s worth seeing how it was achieved.

Maybe it’s a nice visual effect; maybe they’ve managed to do something that you didn’t think was possible–they’ve managed to get the camera API to behave if it’s a Samsung device, for example. Maybe they’ve just done something and you’re thinking, Well, there’s a whole lot of libraries for this, what are the most popular ones? What did people use for this?

You can actually look at popular applications, open them up, and see the package structure and which libraries they’re using. I’m certainly not saying reverse engineer people’s apps and rip them off, but it doesn’t hurt to have a look at what other people are doing, rather than reinventing everything yourself, before writing your own app.

Get the APK (4:19)

The first thing we’d have to do is get the Android Package Kit (APK).

$ adb shell pm list packages -f -3

This is a simple way you could list all the packages on your device. The -f is for the file name, which shows you what the APK file is at, and the -3 refers to 3rd party. This is basically all the installed packages on your device.

$ adb pull "$(adb shell pm path $1 | cut -d : -f 2 | tr -d ‘\015’)"

Alternatively, you could use something like this horrendous one-liner to pull a very specific application’s APK. (It’s particularly awful because of ADB newlines, and this has to do some translating of that, and it’s pretty nasty.) If you have this in a little Bash script, it’s kind of useful.

You could also grab APKs from other places. There are sites that host APKs, but it’s worth mentioning that if you download an APK from somewhere on the internet, you don’t really know what you’re installing. There are certain verifications in place with something like the Google Play Store. You would need to make sure that the APK was signed, and be aware that you might be violating some terms of service.

Get more development news like this

AAPT (5:45)

After you have the APK, one of the most basic things you could do with it is use aapt, which is included in the Android SDK. Using aapt will tell you an awful lot of information if you let it.

$ aapt
Android Asset Packaging Tool
Usage:
 aapt l[ist] [-v] [-a] file.{zip,jar,apk}
 	List contents of Zip-compatible archive.
 aapt d[ump] [--values] [--include-meta-data] WHAT file.{apk} [asset [asset ...]]
 strings 		Print the contents of the resource table string pool in the APK.
 badging 		Print the label and icon for the app declared in APK.
 permissions 	Print the permissions from the APK.
 resources 		Print the resource table from the APK.
 configurations Print the configurations in the APK.
 xmltree 		Print the compiled xmls in the given assets.
 xmlstrings 	Print the strings of the given compiled xml assets.
aapt p[ackage] [-d][-f][-m][-u][-v][-x][-z][-M AndroidManifest.xml] \
	...
 Package the android resources. It will read assets and resources that are supplied with the -M -A -S or raw-files-dir arguments. The -J -P -F and -R options control which files are output.
aapt r[emove] [-v] file.{zip,jar,apk} file1 [file2 ...]
 Delete specified files from Zip-compatible archive.
aapt a[dd] [-v] file.{zip,jar,apk} file1 [file2 ...]
 Add specified files to Zip-compatible archive.

This is the stuff that it can do, which is very extensive. The dump part will allow you to get an awful lot of information about an APK. (You could also do this with GUI tools.)

I find things like strings particularly useful. If, you were running this on your own app, this would all you to quickly see if there’s anything obvious that you might have left in the open. It’s so simple to just list all the strings in an app. Maybe you left a private API key sitting in there and it’s very readable. Maybe you just want to see the basic permissions it uses: you could get the badging, and then the permissions. It’s very handy.

AAPT Examples

$ aapt dump badging Mysterious.apk

This will get you some basic info like the badging; it’ll tell you an awful lot of basic stuff to start with.

$ aapt dump strings Mysterious.apk

This will get interesting strings, as I mentioned.

$ aapt dump xmltree Mysterious.apk AndroidManifest.xml

This will dump the whole manifest as an XML tree. Obviously the manifest has been turned into a binary representation in the XML, so it kind of recreates the XML tree, but this is what Android will read.

The APK (7:17)

What else is in the APK? There’s all these common directories:

  • assets/ Raw files, anything, even dynamically loaded code
  • lib/ Native code libraries
  • META-INF/ Certificate, signature and file hashes, to verify origin and integrity
  • res/ Non-compiled resources
  • AndroidManifest.xml Binary XML version of manifest
  • classes.dex Dalvik Executable - All the classes for the Dalvik VM
  • resources.arsc Compiled resources
  • * (other)

META-INF

The META-INF directory is interesting because you can put a lot of things in there. The signature files have to go in there, and those signature files represent the signature of the APK.

There’s kind of a recursion problem here. If you sign your APK and then you put the signature file into this directory, then you would’ve changed the signature for your APK by doing this. So this directory then had to be excluded from what is signed, which means you could modify things in this directory, and it wouldn’t modify the signature of the APK.

This is an interesting little aside, that you could change files in here without invalidating a signature. Basically, it contains the signatures, and you can use these if you download a 3rd party APK to verify that it is what you think it is.

Other Contents

The assets could be anything. Interestingly, if you are dealing with malware, it might actually contain some sort of slightly obfuscated, almost steganography item sort of hidden in plain sight, so that it doesn’t look like code. You could have things in here that would be dynamically loaded code. Malware sometimes does this kind of thing.

Classes.dex contains all of your Java classes that have been turned into Dalvik classes. It’s worth mentioning that nowadays, with the whole switch to Android runtime (ART) and ahead-of-time compilation, and switching back to partly just-in-time, some of these things may not be used in quite the same way but they all have to still be present in an APK, because that format is fixed. So while they can add to it, you really can’t remove any of these. Even if they’re not fulling being used, they have to be there, and if you want to reverse engineer, these things will still be there.

In the lib/ directory are the native code libraries. This is a bit more in-depth reverse engineering that we aren’t going to go over in this talk, because it’s a lot more tricky.

Other Tools (10:30)

Beyond AAPT, getting into more interesting tools, this is the kind of thing I used to do:

#!/bin/bash
unzip -d zip-out "$1"
java -jar AXMLPrinter2.jar zip-out/AndroidManifest.xml > AndroidManifest.xml
/opt/dex2jar-0.0.9.15/d2j-dex2jar.sh “$1" # creates “${1%.apk}-dex2jar.jar”
mkdir cfr-extracted && /opt/cfr/cfr.sh “${1%.apk}-dex2jar.jar” --outputdir java-out
java -jar /opt/smali/baksmali-2.0.6.jar -o smali-out zip-out/classes.dex

Don’t worry about this, it’s terribly in-depth. I was running a whole bunch of tools, it’s very imperfect, doing things like dex2jar. These things kind of mangle what you’re trying to decompile and you’re not really looking at the best version of it.

All of that’s pretty much irrelevant nowadays. That is how we used to do things back in Java 1.5 or early Android versions like Cupcake. Now you could use something like Apktool.

Apktool

So for instance, just running this:

$ apktool d target.apk

With the d meaning “dump.” It will basically get all the same stuff that enormous amount of script did above, and this is a good starting point. This will give you an awful lot of information. It’ll give you the manifest, resources, several XMLs and Baksmaling classes.dex.

Smali, for anyone who’s not familiar, is an assembly language for Dalvik bytecode. It’s a way that you can read it, rather than just reading the bytecode, but it’s just about as close to the bytecode as you could get. This means that it’s very easy to reverse to Smali. Baksmali being the undoing of this, turns it into something that you can read.

It takes some getting used to, but reading it gets a lot easier. It also means that you can modify that and recompile it and actually recreate an app, whereas if you try to go all the way to Java, chances of getting it back into an app are somewhat limited if you’ve made changes.

You would use apktool to answer the question we started with: “How did they do that?”

After you get all your basic resources laid out, you can look through for the effect you want to mimic. For example, you could find a rounded image view. This will give me the effect I’m looking for. In our example they used a library, so I can then go and use that library too and that would save me the effort of reinventing this.

As I was mentioning, Smali is not super readable, but you can see what the fields are, and things like some inheritance.

Debugging

Maybe you want to see why something was crashing. Well then, this sometimes works. It depends on the APK, how much it’s been messed with, but obviously people don’t necessarily want you to do this to their apps. But if this does work, it’s amazing.

You can take an app that has been built in a non-debug version, you can rebuild it, just using Apktool, re-sign it (it doesn’t have to be signed with the same key, you can just sign it with your own key), and install it on your own device, and you can run that app with a debugger attached. Obviously assuming that there’s no other measures to stop you running a debugger.

Now, this could be kind of dodgy, but say this is the release version of your own application, and the release version is exhibiting some strange behavior and you cannot attach a debugger. Well, the debugged version might have various other changes. Maybe you want the release version to be debuggable. Maybe it’s an old build you can’t easily recreate. This kind of thing can be done:

$ apktool d -d -o SomeApp SomeApp.apk
...
$ apktool b -d SomeApp
...
$ jarsigner -verbose -sigalg SHA1withRSA -digestalg SHA1 -keystore release-key.keystore SomeApp.apk release_key_alias_name

Androguard (15:02)

This is even more impressive. Androguard is an interactive collection of tools. It’s Python-based, which means it’s quite easy for scripting and things, very modular, and very pluggable, a kind of library of little tools, and it takes the form of an interactive Python shell. It includes its own Dalvik decompiler, which is actually pretty impressive. The site doesn’t seem to update, as far as I can tell, but the project still gets changes, still gets commits on GitHub, to deal with new complications in APKs and things.

$ python androlyze.py -s

So you can just run something like this and it starts a little interactive show. You can say, “analyze this APK,” and it will output these three things: a, d, and dx.

$ python androlyze.py -s
Androlyze version 3.0
In [1]: a, d, dx = AnalyzeAPK(“/Users/jon/Desktop/target.apk")
In [2 ]:

You might want to give them better names, but I’m just abbreviating. You’ve got an object that represents the APK, you’ve got an object that represents the Dalvik VM formats–this is essentially like the .dex file–and you’ve got an object that represents an analysis, which is all sorts of things. It’s pre-analyzed for you.

So then you could say, “on the APK, what is the main activity?” This is a simple example, but it has all sorts of things here that you can interrogate.

$ python androlyze.py -s
Androlyze version 3.0
In [1]: a, d, dx = AnalyzeAPK(“/Users/jon/Desktop/target.apk")
In [2 ]: a, d, dx
Out [2]:
(<androguard.core.bytecodes.apk.APK at 0x10a62c350>,
 <androguard.core.bytecodes.dvm.DalvikVMFormat at 0x10d7e7850>,
 <androguard.core.analysis.analysis.uVMAnalysis at 0x11b80dad0>)
In [3 ]: a.get_main_activity()
Out [3]: u'com.example.app.ui.MainHomeActivity'

For a slightly more involved example, I can take the one that represents the dex, and say, “give me a class, give me the source code,” and that will, on the fly, decompile that using the decompiler that it has built in. You can switch it to use another one, but the one it’s got built in is one of the best.

In [4 ]: d.CLASS_Lcom_example_app_ui_MainHomeActivity.source()

You could tell it to show the permissions, and you can get documentation on any of these methods. The question mark will tell you what it does, tell you what the parameters are needed, so it’s very interactive.

If you show the permissions, it will tell you, for instance, where they might be used. So you could then go and look at those places in the source code, and you can browse around with this tool. It’s very useful for browsing an APK if there’s something very specific you want to know. And having an interactive session for this is very good.

In [5]: show_Permissions?
Signature: show_Permissions(dx)
Docstring:
Show where permissions are used in a specific application
:param dx : the analysis virtual machine
:type dx: a :class:`VMAnalysis` object
File: 	/opt/androguard-2.0/androguard/core/analysis/analysis.py
Type: 	function

In [6]: show_Permissions(dx)
android.permission.READ_CONTACTS :
R ['Landroid/provider/ContactsContract;', 'AUTHORITY_URI', 'Landroid/net/Uri;'] (0x0) --->
Lcom/android/ex/chips/BaseRecipientAdapter$DirectoryListQuery;-><clinit>()V
R ['Landroid/provider/ContactsContract$CommonDataKinds$Email;', 'CONTENT_FILTER_URI',
'Landroid/net/Uri;'] (0x118) ---> Lcom/android/ex/chips/Queries;-><clinit>()V
R ['Landroid/provider/ContactsContract$CommonDataKinds$Phone;', 'CONTENT_FILTER_URI',
'Landroid/net/Uri;'] (0x88) ---> Lcom/android/ex/chips/Queries;-><clinit>()V
R ['Landroid/provider/ContactsContract$CommonDataKinds$Email;', 'CONTENT_URI', 'Landroid/net/
Uri;'] (0x11c) ---> Lcom/android/ex/chips/Queries;-><clinit>()V
R ['Landroid/provider/ContactsContract$CommonDataKinds$Phone;', 'CONTENT_URI', 'Landroid/net/
Uri;'] (0x8c) ---> Lcom/android/ex/chips/Queries;-><clinit>()V

ClassyShark (17:49)

Nowadays, ClassyShark is pretty big. I think when I first wrote this, this didn’t actually exist, but this is a very useful tool published by a Google engineer, and it doesn’t have the most amazing user interface, but it has command line interface, which is really handy. You can even use it in build tools and things, script it in there to tell you certain things when you build.

You can use it on your own projects. It’s very easy to just open up an APK and have a look at some basic info, what your method count is, and this kind of thing. So you could check .dex, method counts, and if you’re worried about having to start multi-dexing. You could check the package structure and see if your obfuscation is actually working, which I’ll get to in a moment. You could just check the size of the thing and so on and so forth.

And it opens just about anything. So that’s really handy. You can just throw a .dex file in there from inside an APK. You can throw an .aar from a library or the native library opens, which is kind of weird and not often kind of used. You can chuck an APK obviously, .jar, class files, whatever. And it’ll open it up and show you some info. It’s very handy.

raredare2 (19:20)

Raredare2 is a scriptable hex editor. It’s not the most friendly thing, but it’s organically evolved into a whole kind of framework with lots of things that people have contributed and plugged into it.

It can support just about anything. It supports all sorts of architectures, not just Android. It wasn’t really just built for Android, but it does support Android. It’s an open source project and has been built for Android, so you can actually run this on your device and chuck a little APK at it and see the output that it generates.

It’s not the most friendly thing because it generates a whole bunch of web content, so you can browse some webpages with the analysis, but it’s usable on your device, which is kind of cool.

Additional Tools (20:15)

There are a whole bunch of other ones that are on the Play Store, things like JaDX. It’s kind old, and the user interface isn’t very friendly, but it’s useful. Show Java, which can use JaDX or a different Java decompiler. Dexplorer does similar things. These are all applications available on the Play Store and you can open up APKs on your device without ever touching your laptop.

Back on your laptop, you’ve got things like Santoku. It’s a Linux distro that is based on Lubuntu, which is the lighter version of Ubuntu. It has a whole bunch of tools that are all pre-installed on this distro for security and reverse engineering and things in general, which includes Android. So you’ve got a whole bunch of things like Androguard already installed and set up and ready to go from the command line, which is quite handy. This means you don’t have to mess up your own machine as well. You can just run this from a VM or bootable USB. Just looking at the list of tools on this distro will show you a lot of the best Android reverse engineering tools that you could put on your own machine.

There are also professional tools that cost a lot of money, like IDA. If you really want to do this seriously, and if you do have a lot of money to spend on it, maybe a bigger company backing you, and maybe security is a really big deal for your app, then you’d get into things like this. IDA Pro has been around forever. It’s not just for Android. It goes way back, and it’s pretty much the king of disassembly and debugging tools. And it now supports Dalvik. It’s like an IDE for disassembly. So you can make notes as you go, make changes, and it can rebuild, but it is commercial, and it’s not cheap.

CodeInspect was free in alpha until recently, but now unfortunately it’s out of alpha; it’s not free anymore. It uses something called Jimple. I mentioned Smali and Baksmali doing this thing that will disassemble Delvik. Jasmin is the popular assembly language for Java specifically. CodeInspect uses another one called Jimple, which I think was mostly built around the idea of a static analysis of things, so it then uses a framework called Soot to analyze Jimple stuff which means that, in theory, people will plug stuff in that will do analysis of an APK for you, find certain vulnerabilities, certain patterns and things, like you would have static analysis on your code. It allows you to debug the app, it allows you to do runtime analysis and navigate around, and rename fields and methods and things, which is very cool if the app is obfuscated. Unfortunately it is based on Eclipse. Not sure why they chose that, but it does mean that it’s kind of ugly.

JEB / JEB2 is another one that costs money. It’s a Dalvik-to-Java source decompiler and is interactive, so you can browse around and rename things. It’s a subscription-based thing.

Security (25:08)

This is a classic quote about security:

“The only truly secure system is one that is powered off, cast in a block of concrete and sealed in a lead-lined room with armed guards–and even then I have my doubts.” – Eugene H. Spafford

You can’t really fully secure your app. It’s a trade-off between the amount of time you want to put in, how much you need to protect things, and how much people are going to want to attack the things that you’re trying to protect. If something is truly secret, don’t put it in the app, because it really is very easy to reverse engineer an app.

I really recommend that you do try reverse engineering apps, including your own apps. You can see an awful lot of things that you can learn from, and to do it on your own apps to see what other people might be able to see in your app. If something really needs to be secret, honestly it just can’t be in the app, because there’s nothing that couldn’t be undone.

At the very least, if you care about security, you can turn on obfuscation and minification, it’s really no effort at all. It is worth checking the results of that, because just switching it on without seeing that it has done a good job can sometimes mean that it’s not really obfuscating as much as you think.

If you have money, and it’s a big deal, and it’s worth money to protect certain things, you can use things like DexGuard, which will cost you, but it is like ProGuard up a level. They’ve done some really crazy tricks in DexGuard which make it very hard to reverse engineer those APKs. The thing is, because it’s a popular tool, other tools emerge that specifically target undoing what it does, and so it’s kind of constantly a race, but it will at least keep away a casual observer most of the time.

SQLCipher is for encrypting your data, but again, these just make it harder to immediately see something. You can still undo this. Again, if it really is important, don’t put it in the app.

Reverse engineer your app to see what you can see. Make sure you’re comfortable with it. If you’re putting private things into your strings, it’s quite easy to tell, and if they’re there, you want to see it before someone else does.

Security Example

For instance, if I have an app that has obfuscation switched on, but then I’ve said that I need to keep the activities, and I need to keep the activity, its full name, and I need to keep its package. Because it’s in the manifest, and I’m not using DexGuard, I’m just using ProGuard, so I need that to be intact.

Now, as is quite common, I might organize my activity into a package by feature, and if I’ve done that, then maybe it’s going to keep this entire package structure just to keep the activity name, and now somebody opening up my obfuscated application actually gets quite a lot of the info about the kind of structure of my app, and that really helps them start to browse around it. I can immediately discount, I don’t really care about this package, and I do want the preview, that’s the thing I’m looking for.

So with this, it would be so simple to just move all the activities to a separate package, if this is important. It depends on what your priorities are. Things that are not going to be obfuscated could be in a whole mirrored package structure next to this one, and it would allow all of these things to be hidden away. So it’s just worth being aware of that. You can group the public things in a totally different package structure. More importantly, actually see what it does and make sure you are comfortable with it.

Wrap Up (29:58)

Android Hacker’s Handbook is a fantastic book on a lot of the things I talked about, and a lot more.

The CodeInspect tool that I mentioned was in a talk from Droidcon Berlin, which is very interesting to watch to see all the stuff that they can do.

Here’s a GitHub link from DEFCON23. Anything that’s at DEFCON is generally a bit more in-depth but very interesting and this was a really, really good talk on reverse engineering.

This is a link that gives you the info on how to use Androguard. It’s just the documentation.

Thank you!

Next Up: Realm Java is Open-Source!

General link arrow white

Jon Reeve

A freelance mobile developer and amateur coffee fanatic with more than 7 years experience in Android (still got that G1!), and over 10 years previous in everything from JEE and Swing to C & C++. Previously worked in the automotive and broadcast industries, then at Shazam, and nowadays hopping all over the place looking for challenges. Passionate about clean code, tests and TDD, and taking things apart to see how they work.

Transcribed by Joseph Buelow
Edited by Curtis Chen