Following the lessons from Didier, I will try to share my experience in reversing.

Didier suggests this exercise in his last comment:

#include <stdio.h>

int f(int a) {
    int b=5;
    char c;
    c = a + b;
    return c;
}

int main(int argc, char **argv) {
    int a = 3;
    int b;
    b = f(a);   
    printf("Result: %d\n", b);
}

I complicated it a bit, but I think it will be easy. If I can do it, anybody can ;) I use Dev-C++ as the IDE, with gcc as the compiler. This detail is important; we will see why later. After compiling our example, we open it with IDA Pro 4.9. We press Ctrl+F12 to display the graph of function calls, and we get this:


WTF????? We can see calls to functions like malloc, fflush, fprintf, ... that we didn't use in our program. So what are all those calls? In IDA's Names window, go to start.


.text:00401220                 public start
.text:00401220 start           proc near
.text:00401220 var_14          = dword ptr -14h
.text:00401220 var_8           = dword ptr -8
.text:00401220                 push    ebp
.text:00401221                 mov     ebp, esp
.text:00401223                 sub     esp, 8
.text:00401226                 mov     [esp+8+var_8], 1
.text:0040122D                 call    ds:__set_app_type
.text:00401233                 call    sub_401100
.text:00401238                 nop
.text:00401239                 lea     esi, [esi+0]
.text:00401240                 push    ebp
.text:00401241                 mov     ebp, esp
.text:00401243                 sub     esp, 8
.text:00401246                 mov     [esp+14h+var_14], 2
.text:0040124D                 call    ds:__set_app_type
.text:00401253                 call    sub_401100
.text:00401258                 nop
.text:00401259                 lea     esi, [esi+0]
.text:00401259 start           endp

Here we have the first function call, __set_app_type. This is typical startup code for a C/C++ program that uses MSVCRT.DLL. So it seems gcc adds its own code to make our programs more "windowsly". But the really interesting call is sub_401100 (00401233 call sub_401100). If we go to the graph, we can see that sub_401100 calls other functions, and if we follow the execution path in the graph, we can see a final call to sub_401260. Let's see what this function does:

.text:00401260                 push    ebp
.text:00401261                 mov     ecx, ds:atexit
.text:00401267                 mov     ebp, esp
.text:00401269                 pop     ebp
.text:0040126A                 jmp     ecx
.text:0040126A sub_401260      endp

It is a small stub that loads the address of atexit and jumps to it, so it seems to register an exit handler. We continue tracing the program statically, working backwards from this last call towards our start function. The previous call comes from sub_4013B0:

.text:004013B0 sub_4013B0      proc near               ; CODE XREF: sub_4012A0+13↑p
.text:004013B0 var_8           = dword ptr -8
.text:004013B0                 push    ebp
.text:004013B1                 mov     ebp, esp
.text:004013B3                 push    ebx
.text:004013B4                 sub     esp, 4
.text:004013B7                 mov     eax, ds:dword_404020
.text:004013BC                 test    eax, eax
.text:004013BE                 jnz     short loc_4013F6
.text:004013C0                 mov     eax, ds:dword_4018C0
.text:004013C5                 mov     ebx, 1
.text:004013CA                 mov     ds:dword_404020, ebx
.text:004013D0                 cmp     eax, 0FFFFFFFFh
.text:004013D3                 jz      short loc_4013FA

This piece of disassembled code is a bit confusing. dword_404020 and dword_4018C0 look like global data: dword_404020 is tested and then set to 1, so it probably works as an "already done" flag. I think the function is preparing the atexit handler, maybe passing the function offset.

Going back to the cross-referenced function (sub_4012A0), we get:


Finally we arrive at our original main function. But again, the compiler has inserted its optimizations:


.text:004012A0 var_8           = dword ptr -8
.text:004012A0 var_4           = dword ptr -4
.text:004012A0                 push    ebp
.text:004012A1                 mov     eax, 10h
.text:004012A6                 mov     ebp, esp
.text:004012A8                 sub     esp, 8          ; char *
.text:004012AB                 and     esp, 0FFFFFFF0h

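Here we see the main function prologue. First of all, the program reserves 8 bytes (two 4-byte slots). The instruction mov eax, 10h (16 decimal) seems useless, or at least does not seem to be part of our original program; it is probably an argument for the sub_401710 call that follows. The last statement, and esp, 0FFFFFFF0h, clears the low four bits of esp, which aligns the stack pointer down to a 16-byte boundary.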

.text:004012AE                 call    sub_401710
.text:004012B3                 call    sub_4013B0

The call to 0x004013B0 (call sub_4013B0) was already analyzed above. But we have another new call, to 0x00401710:

.text:00401716                 cmp     eax, 1000h
.text:0040171B                 jb      short loc_40172D
.text:0040171D                 sub     ecx, 1000h
.text:00401723                 or      dword ptr [ecx], 0
.text:00401726                 sub     eax, 1000h
.text:0040172B                 jmp     short loc_401716

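Again, this is more code added by the compiler: a loop that moves ecx down in 1000h-byte (4 KB) steps and touches memory at each step, which looks like a stack-probe helper. We can ignore it and go back to main: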

.text:004012B8                 mov     [esp+8+var_8], offset aResultD ; "Result: %d\n"
.text:004012BF                 mov     eax, 8
.text:004012C4                 mov     [esp+8+var_4], eax
.text:004012C8                 call    printf
.text:004012CD                 leave
.text:004012CE                 retn
.text:004012CE sub_4012A0      endp

And here is the final result. We can see mov eax, 8 just before the printf call. But our function f is missing. Why? Because the optimizer has calculated the result (5 + 3) at compile time and decided not to include a call to our function in the final program.

In the next "lesson" I'll try to introduce function calls using scanf to avoid compiler optimizations.
