Beginning Mac Hacking

It was the late 90s - the internet was new, every rock band had a token DJ, and I was taking up space in either the university bar or the pentium-1 filled computer labs where I was learning to be social and anti-social respectively. During the anti-social times I would read inspiring tutorials and essays regularly released by a hacker named Fravia.

He was a very mystical fellow, and spoke about reverse engineering with a sense of grand importance and just a pinch of spirituality - all very enticing to a nerdy youngster like myself. His site eventually moved away from cracking and into "search lore" - avoiding the evilness of google (way before it was cool to avoid google), finding stuff using gopher, and many other weird techniques.

I went searching for his old tutorials (via duckduckgo, naturally) and found that Fravia died in 2009. I learned a lot from his work - not about software cracking, but about the hacker spirit and "reality cracking". So as a kind of homage I decided to sit back, sip on a Martini-Wodka, and have a look at reverse engineering software on my shiny new Mac.

tl;dr: Well, you miss out then.

The target

The first thing to do was to find something to crack: I'd recently downloaded a small desktop utility (I'm not going to say which one... but I'll refer to it as "Spannr" for the purpose of the article) that pops up a "BUY ME NOW!" screen when it loads, and also at random intervals during use. "By applying some ol' Fravia lessons," I reasoned, "I should be able to get rid of those pesky messages without doing the bit where I give my credit card details."

Now, before we continue: If your immediate thoughts are either "Oh W00t! I can scam some free shit!" or "Hey, that's not very fair on the hard-working devs that rely on that income to feed their orphaned children" then it's probably best that you know something now: you don't got what it takes to be a hacker. It's not in you.

Don't worry about it - that's not a bad thing to be sure... having the "hacker spirit" means spending waaay too much of your time on things that normal people look at and say "but why?". It's best you head back to reading the startup stories on Hacker News.

The steps

Now that I've clarified that "having the hacker spirit" means being condescending to normal people, we can move on to hacking on a Mac. We should also be aware that playing with a binary file without the source code falls into the "difficult" category of programming.

If you'd like a non-vehicle based analogy: cracking is like taking a several hundred-thousand piece jigsaw puzzle and being asked to swap out a tiny section of the final picture for a new design, with the constraint: you're not allowed to put any parts back together yourself - you just have to find the correct few pieces, paint them a new colour, and put them back in the box. Sounds fun! Let's get on with it...

I've already said I'm not telling you the name of the application - this isn't an article about releasing warez, it's about figuring out how stuff works - so what follows is more of a story than a tutorial. It's going to be up to you to test this stuff out, and see what you can break, or fix. Thankfully, the ideas and techniques will work on a huge percentage of programs out there.

  1. Research the target
  2. Disassemble the application into assembly code
  3. Set interesting looking breakpoints in a debugger
  4. Find the easiest place to hack it
  5. Implement our crack in machine code
  6. Make our hack permanent

The tools

We'll need a few things before we begin. Mostly just standard (or fairly standard) Unix utilities:

  • file - tells you the file type of a given file.
  • nm* - displays the name list (symbol table) of an object.
  • otool* - disassembler - displays specified parts of object files or libraries.
  • gdb* - The GNU Debugger - for "watching" a program as it executes.
  • otx - front-end to otool that gives some extra info.
  • class-dump - dump a binary's class structure - including method signatures.
  • Hex Fiend - a hex editor for patching the binary.

(items with a * are mandatory. The rest just make things easier!)
Pretty nifty - everything important is already on our system! With the tools in place it was time to find out a bit more about our binary friend...

Divining information

Fravia always stressed the importance of never "hacking blindly" and learning to "feel" the code. He also advocated drinking good alcohol while hacking - a lesson I've held in my heart to this day. So to get a bit of a sense of what we're dealing with, use the file utility to find out what our target, "Spannr", is hiding.

cd /Applications/Spannr.app/Contents/MacOS
file Spannr
Spannr: Mach-O universal binary with 2 architectures
Spannr (for architecture i386):	Mach-O executable i386
Spannr (for architecture ppc7400):	Mach-O executable ppc
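
If you're wondering where file gets that list from: a universal binary starts with a small big-endian "fat header" that says how many architectures are inside and where each one lives in the file. Here's a rough Python sketch that reads it. The function name and the little CPU name table are my own inventions for illustration, and it assumes the file really is a universal binary (a single-architecture Mach-O has a different magic number and will be rejected).

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # big-endian magic at the start of a universal binary

# A handful of cputype values from <mach/machine.h>; not a complete list.
CPU_NAMES = {
    7: "i386",
    18: "ppc",
    0x01000007: "x86_64",
    0x01000012: "ppc64",
    0x0100000C: "arm64",
}


def list_architectures(data):
    """Return (name, file_offset, size) for each slice in a universal binary."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal (fat) binary")
    slices = []
    for i in range(count):
        # Each fat_arch entry is five 32-bit fields:
        # cputype, cpusubtype, offset, size, align
        cputype, _subtype, offset, size, _align = struct.unpack_from(
            ">5I", data, 8 + 20 * i
        )
        slices.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return slices
```

You'd call it with something like list_architectures(open("Spannr", "rb").read()). The offsets are worth a glance: they tell you where each architecture's code begins inside the file, which matters later when you go hunting for bytes in a hex editor.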

Ok, we have a couple of architectures buried in there. We'll run with the i386, if we need to choose I guess. Instructions are going to look different if you are using a 32 bit or a 64 bit machine - but you can figure it out. The next tool on our list is nm. According to its man page, nm is a tool that will "display the name list (symbol table) of each object file in the argument list". What does that mean?! Let's just run it against our target and find out hey?

nm -arch i386 Spannr
000264ae s  stub helpers
00004b68 t +[AppController initialize]
00014cf4 t +[NSScreen(Extensions) screenContainingFrame:]
00014b87 t +[NSScreen(Extensions) screenContainingMouseCursor]
000188a3 t +[XMLicensingWindowController nibName]
00019043 t +[XMLicensingWindowController sharedLicensingWindowController]
00002f89 t -[AppController showLicensingWindow:]
00003e5a t -[AppController showLicensingWindowOnMainThread]
00004209 t -[AppController showOverlay]
00002c3a t -[AppController showUniversalAccessDialog]
0000481f t -[AppController toggleOverlay:]
000048a4 t -[AppController toggleOverlayViaHotkey]
0001888a t -[AppController(Licensing) sharedLicensingWindowController]
0001873c t -[AppController(Licensing) showLicensingWindow:withTimeoutDuration:]
0001822a t -[AppController(Licensing) verifyLicense]
....

Whoa boy, looks like hacking got lots easier since the 90s! We get a list of every method and property in the executable - including seductive names like verifyLicense, isLicensed, and setIsLicensed. Well, there goes the "divining" part of this exercise - might as well get on with the hacking. I'm assuming that not every binary is going to be loaded with symbols, so we'll just call this beginners' luck: though I tried a half-dozen other apps and they all looked pretty much the same.

Now is a great time for you to check out a couple of binaries for yourself. Try a bunch of different apps: big ones, little ones, native ones, crappy ones. Pick something interesting, then read on...

Handy hint: Don't forget that you can pipe the output of all of these command line tools to a text file by adding > filename.txt. For example: nm Spannr > woot.txt. You can also open it straight up in TextMate (if you use it) with | mate.
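
Once you have the symbol list in a text file, a few lines of Python will sift it for the juicy names. This is only a sketch: the function name and the default keyword list are my own guesses at what "interesting" looks like, and it assumes the three-column "address type name" layout shown above.

```python
def interesting_symbols(nm_output, keywords=("licens", "regist", "serial")):
    """Pick out nm lines whose symbol name contains one of the keywords."""
    hits = []
    for line in nm_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # e.g. undefined symbols, which have no address column
        address, kind, name = parts
        try:
            address = int(address, 16)
        except ValueError:
            continue  # first column was not a hex address
        if any(keyword in name.lower() for keyword in keywords):
            hits.append((address, kind, name))
    return hits
```

Feed it the contents of woot.txt and you get back a short list of (address, type, name) tuples to start poking at.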

Hunting for treasure

Names can be deceiving, so next we'll disassemble the target and have a squiz at some machine code. The otool utility will turn our bucket of bytes (the application) into assembly language mnemonics - so instead of seeing the bytes b8 01 00 00 00 we will see the mnemonic mov $0x1, %eax (which means: move the number 1 into the register called "eax"). Not as nice to read as CoffeeScript, but prettier than binary.

otool -tvV Spannr

Spannr:
(__TEXT,__text) section
start:
00002af0	pushl	$0x00
00002af2	movl	%esp,%ebp
00002af4	andl	$0xf0,%esp
00002af7	subl	$0x10,%esp
00002afa	movl	0x04(%ebp),%ebx
00002afd	movl	%ebx,0x00(%esp)
00002b01	leal	0x08(%ebp),%ecx
00002b04	movl	%ecx,0x04(%esp)
00002b08	addl	$0x01,%ebx
...

To make things a little easier on the eye, try using the otx tool - it uses otool under the hood, but does some "demunging" of names and makes the code more readable. Even so... this ain't your Atari's machine code. Your best bet is to just dive in and start looking around.

Dodgy machine code primer: The first number you see on each line (e.g. 00002af0) is the address of the instruction - kind of like a line number. Following that, each line of code is made up of instructions (pushl, movl, addl), values ($0x00, $0xf0) and registers (%esp, %eax, %ebx).

Values can be pointers to memory locations, or simple integer numbers. Registers are like variables: but there are only a few of them - so you'll see a lot of pushing and popping and moving the values of the registers around.

Finally, instructions are the low level commands that all programs are made from: "move a value to a register", "do a logic 'and' to a register and a value", "add two values" and so on. If you write a small program in C, then disassemble it you'll start to thank your lucky stars we don't have to do all our programming in assembler.

First idea: everybody jump!

So we have a few method names that are interesting to us. I personally like the sound of this isLicensed method - it has a boolean result (You can check this if you disassembled with otx or if you run the file through class-dump) so it's likely that we can apply the age-old

when the code says IF BAD REGISTRATION, GOTO HELL change it to IF _NOT_ BAD REGISTRATION, GOTO HELL

cracking technique - usually just patching a machine code JNE (jump if not equal) to a JE (jump if equal).
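
To make that a bit more concrete: on x86 the short conditional jumps are two bytes long (an opcode from 0x70 to 0x7f, then a one-byte distance), and each condition sits right next to its opposite, so JNE is 0x75 and JE is 0x74. Flipping the lowest bit of the opcode inverts the test. Here's a tiny Python sketch of the idea; the function name is made up, and it deliberately ignores the longer "near" jump encodings.

```python
def invert_short_jump(instruction):
    """Invert the condition of a 2-byte short conditional jump."""
    if len(instruction) != 2 or not 0x70 <= instruction[0] <= 0x7F:
        raise ValueError("not a short conditional jump")
    opcode, distance = instruction[0], instruction[1]
    # Opposite conditions differ only in the lowest bit of the opcode,
    # e.g. 0x75 (JNE) <-> 0x74 (JE). The jump distance stays the same.
    return bytes([opcode ^ 1, distance])
```

So a "jne" encoded as 75 10 becomes 74 10, which is "je" to the very same place.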

Unfortunately, searching for all the occurrences of isLicensed (easiest to see in the otx output) shows that this method is called in a few different places - and that means we'd have to patch them all.

Additionally, if we apply this type of crack the program will never set itself as "registered" - it just gets tricked when it goes to kick us out. Better would be to actually break the name/key system... which I think occurs inside the method verifyLicense.

Second idea: real cracking

A quick once-over of the verifyLicense method doesn't look good: My assembler is rusty at best - but this lengthy bit of code is comprised of a whole bunch of string manipulation, calls to crypto libraries and private keys and... and it looks pretty hairy - it's using some kind of third-party crypto system that figures out if your name/key is valid and is beyond me for now. But this is most certainly the place to be if you wanted to create a keygen - you'd just have to figure out what every call did and how it calculates the correct results. Just.

Third idea: the lazy way out

In the end I decided to combine the Fravia "code feeling" approach with my own personal "laziness" approach: rather than target the return point of the isLicensed function, we will just force the function to always return true - no matter where it is called from. Sure, the program is not reeealy cracked, but then we only have to patch one place - and as long as the popup message is gone, who cares?

You'll have to make these kinds of decisions on a target-by-target basis: setting breakpoints, running the program, and using it as usual - figuring out where things are called and where might be the best place to patch.

It's alive!

Enough of this dead list reading - it's finally time to fire up the debugger and test out our ideas on the running program. A debugger (like gdb) will load and run an application, but lets you stop the execution any time and examine the current state. The host program doesn't even realise that time has stopped - so you can poke and prod, changing memory values and machine code instructions! When you continue the program running, all your changes are still in memory. Very cool.

gdb Spannr

This GDB was configured as "x86_64-apple-darwin"...
Reading symbols for shared libraries .............. 
... done
(gdb)

What's going on? Well, our debugger has loaded the program up, but it hasn't started running it yet. It awaits our command.

The most common thing to do is try and set breakpoints near the interesting bits of code we identified earlier. If the program stops when you think it should stop (for example, it would make sense that isLicensed should get called just before a popup opens) then you know you are in the right place. So set a breakpoint on the isLicensed method that we found above. You need to plug in the entire name - not just the method name:

(gdb) break [XMLicensingWindowController isLicensed]
Breakpoint 1 at 0x188bb
(gdb)

Excellent... the breakpoint is set at memory location 0x188bb (that's hexadecimal of course: champion of all bases)! At least the debugger knows about this function: I've had a couple of programs that don't seem to be able to break on symbol names (but you can always break on the actual address by doing break *0x188bb).

With our initial breakpoint in place, we can run the program (with the r command) and see if we hit it. For Spannr the popup box happens as the program loads so hopefully we'll get dropped back into gdb before we see the UI appear:

(gdb) r
Starting program: /Applications/Spannr.app/Contents/MacOS/Spannr 
Reading symbols for shared libraries 
.+++..++.+++++++....................
....................................
....................................
...... done

Breakpoint 1, 0x000188bb in -[XMLicensingWindowController isLicensed] ()

Oh bingo! The program started running, and then someone made a call to isLicensed - so our debugger stopped everything at memory location 0x000188bb. We have frozen time right at the start of the isLicensed function! Let's have a look around.

If the breakpoint you chose doesn't fire on start up, then the program should load as normal. It's up to you to get the breakpoint to trigger - perhaps you'll have to open the "registration" menu item, or even just use it for a while. If you can't get it to trigger, then it's back to the drawing board. You'll have to try a different breakpoint.

Inside the living beast

Now that we've gone all Matrix-y and stopped time, we can have a look at the instructions that live at the current memory location by using the disas command. This will "disassemble" the bytes that make up the program into assembly instructions, as we did above. If you use disas on its own, it will disassemble the function surrounding the current instruction, but you can also supply a different address if you want to see what other bits of code look like.

(gdb) disas

Dump of assembler code for function -[XML... isLicensed]:
0x000188b8 <-[XML... isLicensed]+0>:	push   %ebp
0x000188b9 <-[XML... isLicensed]+1>:	mov    %esp,%ebp
0x000188bb <-[XML... isLicensed]+3>:	mov    0x8(%ebp),%eax
0x000188be <-[XML... isLicensed]+6>:	movzbl 0x4c(%eax),%eax
0x000188c2 <-[XML... isLicensed]+10>:	leave  
0x000188c3 <-[XML... isLicensed]+11>:	ret    
End of assembler dump.

Wow, that is a very, very small function. Just a few instructions long - that should work in our favour. We can also look at the state of the registers as we've entered the function with info registers:

(gdb) info registers
eax            0x188b8	100536
ecx            0x1	1
edx            0x0	0
ebx            0x96c7a128	-1765301976
eip            0x188bb	0x188bb <-[XML... isLicensed]+3>
eflags         0x246	582
...

There are a dozen or so registers - but we'll only be concerned with the register eax. The eax register is a 32-bit general-purpose register that commonly stores the return value of a function. That sounds relevant to our interests.

Also, we can see that the eip (the "instruction pointer") register is pointing to location 0x188bb and shows us that that location is +3 bytes into the function. So the first 2 statements (the push %ebp and mov %esp,%ebp) are just some housekeeping. That means there are actually only 2 real instructions in this function before it exits!

We can step over those two instructions with the nexti (ni for short) command. This will cause the program to execute the instruction that eip points to. If we do it twice, then we are effectively at the end of the function - and can examine the registers again:

(gdb) info registers
eax            0x0	0
ecx            0x1	1
edx            0x0	0
ebx            0x96c7a128	-1765301976
...

Well there's your problem sonny... your eax register has changed from some crazy value to a 0. And a zero, as we all know, means "false". Therefore, isLicensed = false. Obviously we need to change the eax value to a 1. Somehow. The simple way to do that is to just change the register manually with set $eax=1. First, restart the program by entering r - this will get us back to the breakpoint at the start of the function. Again, step over our 2 instructions with ni. Now we can change the eax register:

(gdb) set $eax=1
(gdb) info registers
eax            0x1	1
ecx            0x1	1
edx            0x0	0

Ohh, looks good. Now continue running the program with the c command: the program UI springs to life, and.... no popup screen! FEEL THE POWER!

But don't get too excited - if we exit the program, restart it in the debugger, step over the two instructions again and have a look at the registers - oh, it's back to 0. Having to manually change the value every time is a bit more annoying than clicking away a popup - so we have to figure out a way to make our changes in code.

Writing some code

This bit is going to be tricky if you don't know any x86 assembler. You don't need to be an expert - but it helps to know some basics. For this program I know vaguely what I need to do: I need to get the value 1 into the register eax. This should look something like this: mov $0x1,%eax. "Move the value 1 into the register eax".

But how do I put this command into the code? I have no idea. In the olden days the debugger I used was called SoftICE and you could just flip into assembler mode and write the code. But I don't know how or if you can do this with gdb. My only alternative then was to find out what the assembled version would look like in bytes: So I scoured the dead listing from otx we did earlier until I found one! I searched for "1,%eax" and came up with this:

00003cdf  b801000000 movl $0x00000001,%eax

That's him, officer. The first column is the "line number" and the second is the raw bytes we need: b8 01 00 00 00.

That's 5 bytes right there that we need to insert into the existing code. But our function only has two instructions that we can change: mov 0x8(%ebp),%eax which is 3 bytes, and movzbl 0x4c(%eax),%eax which is 4 (you can see the byte counts of each instruction in the disassembly).
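
If counting bytes in your head after a few Martinis sounds risky, the sizes fall straight out of the addresses in the listing: each instruction's length is just the next address minus its own. A quick Python sketch (the function name is my own):

```python
def instruction_sizes(addresses):
    """Pair each address with its instruction length in bytes.

    The length is the gap to the next address, so the final address in
    the list gets no entry (we can't tell how long it is from this alone).
    """
    return [(addr, nxt - addr) for addr, nxt in zip(addresses, addresses[1:])]
```

Plugging in the six addresses from the isLicensed listing gives lengths of 1, 2, 3, 4 and 1 bytes, so the two instructions we care about (at 0x188bb and 0x188be) take up 3 + 4 = 7 bytes between them.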

Here's where you have to be REALLY careful. If we overwrite an instruction with another instruction with an incorrect number of bytes then we will start the program running from in the middle of another instruction's bytes - causing it (and every instruction after it) to become weird, corrupted new instructions. That will crash almost instantly.

So we have to fill up 7 bytes, but we only need 5 for our command. The answer? NOP! NOP (byte value 0x90) means "no operation" and it does absolutely nothing, but it uses 1 byte to do it! We just need to jam in an extra 2 NOPs, and we are good to go!
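
Here's the same arithmetic as a small Python sketch, in case you want to build patches for other values or other gap sizes. The helper names are hypothetical, and it only knows about this one instruction: b8 followed by a 4-byte little-endian number, which is the "mov value into eax" encoding we dug out of the listing.

```python
import struct

NOP = 0x90  # the one-byte "do nothing" instruction


def mov_eax_imm32(value):
    """Encode 'mov $value, %eax': opcode b8, then the value as 4 little-endian bytes."""
    return b"\xb8" + struct.pack("<I", value)


def pad_with_nops(code, total_length):
    """Put NOPs in front of code so the result is exactly total_length bytes."""
    if len(code) > total_length:
        raise ValueError("code does not fit in the space available")
    return bytes([NOP] * (total_length - len(code))) + code
```

pad_with_nops(mov_eax_imm32(1), 7) gives the seven bytes 90 90 b8 01 00 00 00. Putting the NOPs before or after the mov makes no difference to the result here; I've just kept them in front.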

So now we need to get our two NOPs (0x9090) and our mov instruction (0xb801000000) in the code. To change the instructions we can set bytes in the same way we changed the register manually. We'll set the bytes individually:

(gdb) set {char}0x188bb=0x90   # our NOPs
(gdb) set {char}0x188bc=0x90
(gdb) set {char}0x188bd=0xb8   # mov
(gdb) set {char}0x188be=0x01   # 4 bytes long "1" value
(gdb) set {char}0x188bf=0x00
(gdb) set {char}0x188c0=0x00
(gdb) set {char}0x188c1=0x00

I think there is a way to write bigger chunks at a time (rather than just each character), but the mov command is 5 bytes long, and I don't know how to set 5 bytes in one statement (anyone?). So I just kept it simple and stuck to {char}s. At any rate, the new commands have now been written:
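
Typing those out by hand gets old fast, so here's a lazy Python sketch that spits out one set {char} line per byte for any patch and start address. The function name is invented for this article; the commands it prints are in exactly the form used above.

```python
def gdb_poke_commands(address, patch):
    """Return one gdb 'set {char}' command for each byte of patch."""
    return [
        "set {char}0x%x=0x%02x" % (address + offset, value)
        for offset, value in enumerate(patch)
    ]
```

For example, print("\n".join(gdb_poke_commands(0x188bb, bytes.fromhex("9090b801000000")))) produces the seven lines above, ready to paste into gdb (or to save in a file and load with gdb's source command).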

(gdb) disas
Dump of assembler code for function -[XM... isLicensed]:
0x000188b8 <-[XML...  isLicensed]+0>:	push   %ebp
0x000188b9 <-[XML...  isLicensed]+1>:	mov    %esp,%ebp
0x000188bb <-[XML...  isLicensed]+3>:	nop    
0x000188bc <-[XML...  isLicensed]+4>:	nop    
0x000188bd <-[XML...  isLicensed]+5>:	mov    $0x1,%eax
0x000188c2 <-[XML...  isLicensed]+10>:	leave  
0x000188c3 <-[XML...  isLicensed]+11>:	ret    
End of assembler dump.
(gdb)

Seeing this makes me giggle just a lil' bit. We've actually changed the code of a running program - how cool is that?! If you removed our breakpoints now you could test the program and see that the popup now never pops up. The program is cracked.

I'm not 100% on the right way to do this for your own cracks: not being able to easily assemble commands seems crazy to me - and I'm sure there is a way to do it. I guess you could always compile your own asm projects, but that seems like a lot of work! (anyone?)

Making it stick

We are close now - our hard work is almost at an end. All that's left to do is to make our change permanent. This involves replacing the bytes that make up the old instructions, with the bytes that make up our new instructions. With a hex editor (I'm using Hex Fiend) all we have to do is patch the application with a search and replace - but for bytes instead of strings.

You have to be careful because the same pattern of bytes can turn up many times in a file - you have to make sure you patch the right one. To increase the chance of a unique match we'll search for 16 bytes from where we want to change (rather than just for the 7 bytes we changed).

To see the value of the bytes use the x/x address command. This handy lil' command has a few different formats: x/s address shows the value at the address as a string, and x/i address shows the machine code instruction at the address. You can also specify more units to view by adding an integer before the second x:

(gdb) x/4x 0x188bb   # 0x188bb is 100539 in decimal
0x188bb <-[XML... isLicensed]+3>:	0x01b89090	0xc9000000	0xe58955c3	0x8b10558b

Here we can see the 16 bytes located at address 0x188bb - the address of the start of our changes. We only changed 7 bytes so a bunch of the bytes are unchanged from the original file. Keep a copy of these bytes handy, then restart the program (r). Now our changes have been erased and the program is back to its original state. Dump the bytes at that address again:

(gdb) x/4x 0x188bb   # 0x188bb is 100539 in decimal
0x188bb <-[XML... isLicensed]+3>:	0x0f08458b	0xc94c40b6	0xe58955c3	0x8b10558b

That's the clean bytes. So now we have a before & after and we can go hunting for them in our hex editor. You can quit out of gdb (q) and fire up Hex Fiend. In the find dialog, enter the first bunch of our clean bytes: 0f08458b, and hit search!

Uh oh. 0 results. That's not good... what the heck is going on here? On a hunch I decided to search for the same bytes, but "byte flipped" so that every block is backwards: 8b45080f. Found a match! Oh, of course! It's the old big-endian versus little-endian business: x86 is little-endian, and gdb's x/x prints memory as 4-byte words with the most significant byte first, so the bytes inside each block come out in the reverse of the order they sit in the file. (Asking gdb for single bytes with x/16xb shows them in file order.) So instead of search/replacing the bytes we found above, we flip the values around:

clean bytes normal:     0f08458b c94c40b6 e58955c3 8b10558b
hacked bytes normal:    01b89090 c9000000 e58955c3 8b10558b

clean bytes flipped:    8b45080f b6404cc9 c35589e5 8b55108b
hacked bytes flipped:   9090b801 000000c9 c35589e5 8b55108b
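
Flipping by hand is a great way to introduce typos, so here's a Python sketch that does it for you. It takes the 32-bit words exactly as gdb's x/x prints them and packs each one little-endian, which puts the bytes back in the order they appear in the file. The function name is my own.

```python
import struct


def words_to_file_bytes(words):
    """Convert 32-bit words as printed by gdb's x/x into file-order bytes."""
    # "<I" packs each word least significant byte first, which is how
    # a little-endian x86 machine stores it in memory and on disk.
    return b"".join(struct.pack("<I", word) for word in words)
```

Calling .hex() on the result for the clean words gives 8b45080fb6404cc9c35589e58b55108b, the same as the flipped row above with the spaces taken out.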

Finally comes the time to patch. A time that you can really mess up your application. If you weren't already working on a copy, make one now! Put the clean flipped bytes in the "search" field and ensure that there is just 1 match in the file: if there is more than 1 you'll have to dump more of the bytes in gdb or just make sure the line number address looks correct (for us it's 0x188bb).

If you have 1 match, then that's it - just search and replace with the "hacked flipped bytes". Open up your program normally and enjoy your pop-free application.
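
If you'd rather not trust your mouse hand, the same search and replace is only a few lines of Python. This is a sketch, not a polished patcher: the function name is invented, and it simply refuses to do anything unless the clean bytes turn up exactly once.

```python
def patch_once(data, clean, hacked):
    """Replace clean with hacked in data, but only if clean occurs exactly once."""
    if len(clean) != len(hacked):
        raise ValueError("patch must be the same length as the bytes it replaces")
    matches = data.count(clean)
    if matches != 1:
        raise ValueError("expected exactly 1 match, found %d" % matches)
    return data.replace(clean, hacked)
```

To use it, read your copy of the binary with open(path, "rb").read(), pass in the clean and hacked flipped bytes (bytes.fromhex() will turn the hex strings into bytes), and write the result back out with open(path, "wb"). Work on a copy, same as with the hex editor.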

Done

I'm all out of fine alcohol and have moved on to the rough stuff, so it must be time to finish. Not too bad though: just a few hours from first search on the subject, to successful cracker! Though as I implied at the beginning, this isn't something you do to save a few bucks (unless you value your time at, like, 10 cents an hour) - it's something you do because it's something you shouldn't do. It's something you do because it helps you understand that our computers do not run on magic - you can get them to do whatever you want them to. And that's good fun. Good, gruelingly difficult, fun.

Thanks for all your help, Fravia.