A binary analysis, count me if you can
by @Jonathan Salwan - 2013-05-02Recently, I analysed a binary using the number of instructions executed to infer the internal binary conception. This will be the subject of this short blog post.
To begin, let's take for example a basic crackme program that needs a password on stdin. Usually, it relies on a simple string-compare function that returns as soon as the function sees a difference between your two strings. Here is what we have in x64 assembler:
Typically we have :
mov rcx, 16h mov rsi, rdx mov rdi, rax repe cmpsb
Or we can have also :
cmp ax, dx jz .1b
In those previous examples, when the first difference occurs, the loop breaks. It means as long as the comparison is correct, the number of instruction executed increases. Therefore, it is possible to guess the good password via the number of instruction executed.
This time let's take a "real" example with my VMNDH (NDH2k13 CTF - Crackme 500). This crackme implements a 16 bits virtual machine and some anti-debugging technics. The purpose of the challenge was to find the password giving you the "good-boy" message.
To make the debugging and the analysis harder the ELF header has been corrupted. If you are interested in knowing how you can read my old post "Linux process execution and the useless ELF header fields"
# nm ./crackme nm: ./crackme: File format not recognized # objdump -d ./crackme objdump: ./crackme: File format not recognized # gdb ./crackme GNU gdb (Gentoo 7.5.1 p2) 7.5.1 "./crackme": not in executable format: File format not recognized gdb-peda$ r Starting program: No executable file specified. gdb-peda$ quit # ./crackme Password: test Bad password
As I said, the virtual machine was written in C, and of course the interesting code was written our custom 16 bits language. This language is highly inspirated from two well known architectures: x86 and ARM. The source was
; Crackme 500 - NDH 2k13 #include inc/stdlib.inc ; strlen .label strlen xor r1, r1 xor r2, r2 ; printf .label printf mov r7, r0 call :strlen mov r2, r7 mov r3, r0 movb r1, #1 movb r0, #4 syscall ret ; bad password .label badpassword movl r0, :badpwd call :printf ret ; good password .label goodpassword movl r0, :goodpwd call :printf ret ; checkserial .label checkserial mov r7, r0 movl r6, :key call :strlen cmpb r0, #9 jz :sok0 call :badpassword end .label sok0 mov r0, [r7] mov r1, [r6] xor r0, r1 cmpb r0, #x53 jz :sok1 call :badpassword end .label sok1 inc r7 inc r6 mov r0, [r7] mov r1, [r6] xor r0, r1 cmpb r0, #x5b jz :sok2 call :badpassword end .label sok2 inc r7 inc r6 mov r0, [r7] mov r1, [r6] xor r0, r1 cmpb r0, #x4b jz :sok3 call :badpassword end .label sok3 inc r7 inc r6 mov r0, [r7] mov r1, [r6] xor r0, r1 cmpb r0, #x29 jz :sok4 call :badpassword end .label sok4 inc r7 inc r6 mov r0, [r7] mov r1, [r6] xor r0, r1 cmpb r0, #x52 jz :sok5 call :badpassword end .label sok5 inc r7 inc r6 mov r0, [r7] mov r1, [r6] xor r0, r1 cmpb r0, #x76 jz :sok6 call :badpassword end .label sok6 inc r7 inc r6 mov r0, [r7] mov r1, [r6] xor r0, r1 cmpb r0, #x5a jz :sok7 call :badpassword end .label sok7 inc r7 inc r6 mov r0, [r7] mov r1, [r6] xor r0, r1 cmpb r0, #x49 jz :endok call :badpassword end .label endok xor r0, r0 ret ; main .label main movl r0, :str1 call :printf subb sp, #32 mov r2, sp movb r1, #0 movb r3, #31 movb r0, #3 syscall mov r0, r2 call :checkserial call :goodpassword end .label key .db 0x12, 0x21, 0x02, 0x19, 0x25, 0x34, 0x29, 0x11 .label str1 .db "Password: ",0x00 .label goodpwd .db "Good password",0x0a,0x00 .label badpwd .db "Bad password",0x0a,0x00
The crackme waits for a password on stdin and compares all characters with constante values.
This source was compiled to the NDH architecture via our personal compiler. The goal of the
challenge was to reverse the virtual machine and to understand how it works to find the secret
password. You can find a great write-up written by @w4kfu on his
blog. Now we will see how it was possible to
solve this challenge in less than 10 minutes with that instruction counting
black-magic!
To count the number of instruction executed, I used Pin developped by Intel. Pin is a dynamic binary instrumentation framework for the IA-32 and x86-64 instruction-set architectures. In the Pin samples, we can find a 'pintool' counting the instructions (./source/tools/ManualExamples/inscount0.cpp).
// The running count of instructions is kept here // make it static to help the compiler optimize docount static UINT64 icount = 0; // This function is called before every instruction is executed VOID docount() { icount++; } // Pin calls this function every time a new instruction is encountered VOID Instruction(INS ins, VOID *v) { // Insert a call to docount before every instruction, no arguments are passed INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END); }
To know the length of the password, we run Pin with the inscount0 pintool. See what happens:
# ../pin -t ./inscount0.so -- ./crackme <<< 'a' >> /dev/null ; cat inscount.out Count 160524 # ../pin -t ./inscount0.so -- ./crackme <<< 'aa' >> /dev/null ; cat inscount.out Count 163320 # ../pin -t ./inscount0.so -- ./crackme <<< 'aaa' >> /dev/null ; cat inscount.out Count 166116 # ../pin -t ./inscount0.so -- ./crackme <<< 'aaaa' >> /dev/null ; cat inscount.out Count 168912 # ../pin -t ./inscount0.so -- ./crackme <<< 'aaaaa' >> /dev/null ; cat inscount.out Count 171708 # ../pin -t ./inscount0.so -- ./crackme <<< 'aaaaaa' >> /dev/null ; cat inscount.out Count 174504 # ../pin -t ./inscount0.so -- ./crackme <<< 'aaaaaaa' >> /dev/null ; cat inscount.out Count 177300 # ../pin -t ./inscount0.so -- ./crackme <<< 'aaaaaaaa' >> /dev/null ; cat inscount.out Count 183021
At each step, we can see the difference between the instruction counters is 2796
; excepted with a 8 characters long password.
>>> 163320 - 160524 2796 >>> 166116 - 163320 2796 >>> 168912 - 166116 2796 >>> 171708 - 168912 2796 >>> 174504 - 171708 2796 >>> 177300 - 174504 2796 >>> 183021 - 177300 5721
We can say that the length of the password is 8 characters. Now let's see what happens if we find the first character of the password:
# ../pin -t ./inscount0.so -- ./crackme <<< 'a_______' >> /dev/null ; cat inscount.out Count 183021 # ../pin -t ./inscount0.so -- ./crackme <<< 'b_______' >> /dev/null ; cat inscount.out Count 183021 # ../pin -t ./inscount0.so -- ./crackme <<< 'c_______' >> /dev/null ; cat inscount.out Count 183021 [...] # ../pin -t ./inscount0.so -- ./crackme <<< '>_______' >> /dev/null ; cat inscount.out Count 183020 # ../pin -t ./inscount0.so -- ./crackme <<< '@_______' >> /dev/null ; cat inscount.out Count 183020 # ../pin -t ./inscount0.so -- ./crackme <<< 'A_______' >> /dev/null ; cat inscount.out Count 187059 # ../pin -t ./inscount0.so -- ./crackme <<< 'B_______' >> /dev/null ; cat inscount.out Count 183020 # ../pin -t ./inscount0.so -- ./crackme <<< 'C_______' >> /dev/null ; cat inscount.out Count 183020
We can see for each step, the number of instructions executed is ~183021
, excepted when the good
characters is found. See the A
character, at this step, we can see 187059
instructions executed,
this is 4038
instructions more than another step. So, We can imagine that the first good character
is A
. Now let's go with the second character.
[...] # ../pin -t ./inscount0.so -- ./crackme <<< 'Ay______' >> /dev/null ; cat inscount.out Count 187058 # ../pin -t ./inscount0.so -- ./crackme <<< 'Az______' >> /dev/null ; cat inscount.out Count 191097 # ../pin -t ./inscount0.so -- ./crackme <<< 'A0______' >> /dev/null ; cat inscount.out Count 187058 [...]
It's the same thing, when the second character is z
, we have 4039
instructions executed more than another step.
Ok, now we can say that we match the good character, we have ~4030
instructions executed more than the wrong character.
Now, it's easy to find the password - we do simple bruteforce. Here the number of combinations is (26*2+10) * 8 = 496
instead of (26*2+10) ** 8 = 218 340 105 584 896
.
import sys import commands if __name__ == "__main__": pwd = "________" base = 0x2e off = 0x00 sav = 0x00 while pwd.find("Good Password") == -1: pwd = pwd[:off] + chr(base) + pwd[off+1:]; cmd = "./pin -t ./inscount0.so -- ./crackme <<< %s > /dev/null; cat inscount.out" %(pwd) res = int(commands.getstatusoutput(cmd)[1].split("Count")[1]) print "insert('%s') = %d ins" %(pwd, res) if sav == 0x00: sav = res if res - sav > 200: off += 1 if off >= len(pwd): break base = 0x2d sav = 0 base += 1 sav = res print "The password is %s" %(pwd) sys.exit(0)
Let's run this script and wait some minutes.
# ./ins_counts_bf.py insert('._______') = 182899 ins insert('/_______') = 182899 ins insert('0_______') = 182899 ins insert('1_______') = 182899 ins insert('2_______') = 182899 ins insert('3_______') = 182899 ins insert('?_______') = 182899 ins [...] insert('@_______') = 182899 ins insert('A_______') = 186938 ins insert('A.______') = 186937 ins insert('A/______') = 186937 ins insert('Az._____') = 190975 ins [...] insert('Az/_____') = 190975 ins insert('Az0_____') = 190975 ins insert('Az1_____') = 190975 ins insert('AzH_____') = 190975 ins [...] insert('AzI_____') = 195014 ins insert('AzI.____') = 195014 ins insert('AzI/____') = 195014 ins insert('AzI0____') = 199052 ins insert('AzI0.___') = 199051 ins insert('AzI0/___') = 199051 ins insert('AzI0w___') = 203089 ins [...] insert('AzI0w.__') = 203089 ins insert('AzI0w/__') = 203089 ins insert('AzI0w0__') = 203089 ins insert('AzI0wA__') = 203089 ins [...] insert('AzI0wB__') = 207128 ins insert('AzI0wB._') = 207127 ins insert('AzI0wB/_') = 207127 ins insert('AzI0wBs_') = 211166 ins [...] insert('AzI0wBs.') = 211165 ins insert('AzI0wBs/') = 211165 ins insert('AzI0wBs0') = 211165 ins insert('AzI0wBsW') = 211165 ins [...] insert('AzI0wBsX') = 215072 ins The password is AzI0wBsX # ./crackme Password: AzI0wBsX Good password
In conclusion, we saw It was possible to solve this challenge with a black-box approach just by analyzing the execution flow. The password is found with only 336 tests (executed in less than 5 minutes). The full dump is here. But you may ask you why 5 minutes ? Because Pin instruments all the instructions of the program, that's why the bruteforce takes some time. Of course this technic doesn't always work for example, if the input_user is hashed or if the string compare function does not break at the first wrong character.