Scratchpad


We will start a series of lessons where we compile some very simple C programs and disassemble them.

Download these:

  • Install the compiler and the disassembler with the default options, but don't start the disassembler IDA Pro yet.

Compile this empty C program (rem001.c) with Borland's C++ compiler:

main(int argc, char **argv)
{
}

You will need the BIN directory of the Borland C++ compiler in your PATH:

PATH=%PATH%;c:\Borland\BCC55\Bin
bcc32 -Ic:\Borland\BCC55\Include -Lc:\Borland\BCC55\Lib rem001.c

This will produce the executable rem001.exe.

Start IDA Pro and click OK.

Accept the EULA.

Select New.

Open the rem001 executable.

Click OK.

Wait for the end of the autoanalysis.


IDA Pro should show you this, starting at .text:00401150:

.text:00401150 ; int __cdecl main(int argc,const char **argv,const char *envp)
.text:00401150 _main           proc near               ; DATA XREF: .data:004090D0
.text:00401150
.text:00401150 argc            = dword ptr  8
.text:00401150 argv            = dword ptr  0Ch
.text:00401150 envp            = dword ptr  10h
.text:00401150
.text:00401150                 push    ebp
.text:00401151                 mov     ebp, esp
.text:00401153                 pop     ebp
.text:00401154                 retn
.text:00401154 _main           endp

.text is the section of the PE (Portable Executable) file format where the Borland compiler puts the code. Read about PE here (http://en.wikipedia.org/wiki/Portable_Executable), and also the MSDN articles referenced there.

The first line is a comment, indicated by ;

_main is a label used to reference the address of the main function, 00401150. near indicates that the referenced address is in the same segment (here, the .text section). argc, argv and envp are constants.

The first assembly instruction that produces code is push ebp, which is one byte long. It pushes the ebp register onto the stack. You should read about assembly; look at the books referenced here (http://en.wikipedia.org/wiki/Assembly_language). For now, read about CPU architecture, registers and the stack [1].

Pushing ebp onto the stack saves its value, so that ebp can be used for other things. mov ebp, esp copies the content of register esp to register ebp; esp is the stack pointer. pop ebp (address 00401153) retrieves the saved value from the stack, and retn returns from the main function to its caller.

You'll usually find this code at the beginning of functions:

.text:00401150                 push    ebp
.text:00401151                 mov     ebp, esp

It's the prologue: it saves the caller's ebp and sets up ebp for this function before the actual code starts executing.

And this is the epilogue:

.text:00401153                 pop     ebp
.text:00401154                 retn

You'll find it after the actual code, it restores the registers and returns.

There is no actual code to execute since our main function is empty.


Questions:

Q: Does IDA always point to the beginning of a program post-analysis?

A: We usually speak about entry points: this is the real start of the program. IDA Pro will point to the entry point, unless it finds another start structure like the main function in a C program.

Q: Are the contents of ebp cleared when it is popped?

A: The pop instruction retrieves a 32-bit value from the top of the stack and stores it in register ebp. The previous value of ebp is overwritten.

Q: What is the significance of the values 8, 0Ch, 10h assigned to argc, argv, envp respectively?

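A: They can be used as offsets (relative to ebp) to access the values of argc, argv and envp. After the prologue, [ebp] holds the saved ebp and [ebp+4] holds the return address, so the first argument is found at [ebp+8]. As these arguments are not referenced in the C code, no real assembler code was emitted that uses these constants.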
