IT Anawer: Understanding empty main()'s translation into assembly

Hi, could somebody please explain what GCC is doing for this piece of code? What is it initializing? The original code is:

#include <stdio.h>
int main()
{

}

And it was translated to:

    .file "test1.c"
    .def ___main; .scl 2; .type 32; .endef
    .text
.globl _main
    .def _main; .scl 2; .type 32; .endef
_main:
    pushl %ebp
    movl %esp, %ebp
    subl $8, %esp
    andl $-16, %esp
    movl $0, %eax
    addl $15, %eax
    addl $15, %eax
    shrl $4, %eax
    sall $4, %eax
    movl %eax, -4(%ebp)
    movl -4(%ebp), %eax
    call __alloca
    call ___main
    leave
    ret

I would be grateful if a compiler/assembly guru got me started by explaining the stack, register, and section initializations. I can't make head or tail of the code.

EDIT: I am using gcc 3.4.5, and the command line is gcc -S test1.c

Thank You, kunjaan.

From stackoverflow

Well, I don't know much about GAS, and I'm a little rusty on Intel assembly, but it looks like it's initializing main's stack frame.

If you take a look, ___main is called like an ordinary function and is probably performing initializations. Then, as main's body is empty, it executes the leave instruction to tear down the stack frame, and ret returns to the function that called main.

From http://en.wikibooks.org/wiki/X86_Assembly/GAS_Syntax#.22hello.s.22_line-by-line:

This line declares the "_main" label, marking the place that is called from the startup code.
```
    pushl   %ebp
    movl    %esp, %ebp
    subl    $8, %esp
```
These lines save the value of EBP on the stack, then move the value of ESP into EBP, then subtract 8 from ESP. The "l" on the end of each opcode indicates that we want to use the version of the opcode that works with "long" (32-bit) operands.
```
    andl    $-16, %esp
```
This code "and"s ESP with 0xFFFFFFF0, aligning the stack with the next lowest 16-byte boundary. (necessary when using SIMD instructions, not useful here)
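To make the mask concrete, here is a minimal C sketch of what that AND does to a stack pointer value. The function name align_down_16 is made up for illustration. In 32-bit two's complement, -16 is 0xFFFFFFF0, so the AND clears the low four bits and rounds the value down to a multiple of 16.

```c
#include <stdint.h>

/* Hypothetical helper: mirrors `andl $-16, %esp` on a plain integer. */
uint32_t align_down_16(uint32_t esp)
{
    /* (uint32_t)-16 is 0xFFFFFFF0, so the four lowest bits are cleared. */
    return esp & (uint32_t)-16;
}
```

A value that is already a multiple of 16 comes back unchanged; anything else moves down to the next lower boundary, which is the safe direction on a stack that grows downward.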
```
    movl    $0, %eax
    movl    %eax, -4(%ebp)
    movl    -4(%ebp), %eax
```
This code moves zero into EAX, then moves EAX into the memory location EBP-4, which is in the temporary space we reserved on the stack at the beginning of the procedure. Then it moves the memory location EBP-4 back into EAX; clearly, this is not optimized code.
```
    call    __alloca
    call    ___main
```
These functions are part of the C library setup. Since we are calling functions in the C library, we probably need these. The exact operations they perform vary depending on the platform and the version of the GNU tools that are installed.

Here's a useful link.

http://unixwiz.net/techtips/win32-callconv-asm.html
Here's a good step-by-step breakdown of a simple main() function as compiled by GCC, with lots of detailed info: GAS Syntax (Wikibooks)

For the code you pasted, the instructions break down as follows:
- First four instructions (pushl through andl): set up a new stack frame
- Next five instructions (movl through sall): compute a value in eax. It works out to ((0 + 15 + 15) >> 4) << 4 = 16, which looks like a size rounded up to a multiple of 16 for the __alloca call below rather than a return value
- Next two instructions (both movl): store the computed value in a temporary variable on the stack and load it straight back
- Next two instructions (both call): invoke the C library init functions
- leave instruction: tears down the stack frame
- ret instruction: returns to caller (the outer runtime function, or perhaps the kernel function that invoked your program)
It looks like GCC treats main() specially and feels free to insert CRT initialization code into it. I just confirmed that I get the exact same assembly listing from MinGW GCC 3.4.5 here, with your source text.

The command line I used is:
```
gcc -S emptymain.c
```
Interestingly, if I change the name of the function to qqq() instead of main(), I get the following assembly:
```
        .file   "emptymain.c"
        .text
.globl _qqq
        .def    _qqq;      .scl    2;      .type   32;     .endef
_qqq:
        pushl   %ebp
        movl    %esp, %ebp
        popl    %ebp
        ret
```
which makes much more sense for an empty function with no optimizations turned on.

It would really help to know what gcc version you are using and what libc. It looks like you have a very old gcc version or a strange platform or both. What's going on is some strangeness with calling conventions. I can tell you a few things:

Save the frame pointer on the stack according to convention:

pushl       %ebp
movl        %esp, %ebp

Reserve 8 bytes below the frame pointer, and round the stack pointer down to a multiple of 16 (probably so that 16-byte SSE operands placed on the stack are aligned):

subl        $8, %esp
andl        $-16, %esp

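Through an insane song and dance, compute 16 in eax (0 + 15 + 15 = 30, shifted right and then left by 4). This does not look like main's return value; it appears to be the size handed to __alloca below: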

movl        $0, %eax
addl        $15, %eax
addl        $15, %eax
shrl        $4, %eax
sall        $4, %eax
movl        %eax, -4(%ebp)
movl        -4(%ebp), %eax

Call __alloca, which on MinGW takes a size in eax and extends the stack by that many bytes (GNU-ism):

call        __alloca

Call ___main, which runs startup initialization such as global constructors before main's own body (more GNU-ism):

call        ___main

Restore the frame and stack pointers:

leave

Return:

ret

Here's what happens when I compile the very same source code with gcc 4.3 on Debian Linux:

        .file   "main.c"
        .text
        .p2align 4,,15
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .ident  "GCC: (Debian 4.3.2-1.1) 4.3.2"
        .section        .note.GNU-stack,"",@progbits

And I break it down this way:

Tell the debugger and other tools the source file:

        .file   "main.c"

Code goes in the text section:

        .text

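Align the start of the function on a 16-byte (2^4) boundary, but only if that takes at most 15 bytes of padding: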

        .p2align 4,,15

main is an exported function:

.globl main
        .type   main, @function

main's entry point:

main:

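Save the address just above the return address in ecx, align the stack on a 16-byte boundary, and push a copy of the return address (as far as I can tell, so the realigned frame still looks conventional to debuggers):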

        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)

Save frame pointer using standard convention:

        pushl   %ebp
        movl    %esp, %ebp

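Save ecx in the frame and pop it right back (ecx holds the only record of the stack pointer from before the alignment; with an empty body the save and the restore end up adjacent):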

        pushl   %ecx
        popl    %ecx

Restore the frame pointer and the stack pointer:

        popl    %ebp
        leal    -4(%ecx), %esp

Return:

ret

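More info for the linker and other tools (the size of main, the compiler version string, and a note that this code does not need an executable stack):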

        .size   main, .-main
        .ident  "GCC: (Debian 4.3.2-1.1) 4.3.2"
        .section        .note.GNU-stack,"",@progbits
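To see why ecx matters in that listing, here is a rough C sketch that follows only esp and ecx through the gcc 4.3 prologue and epilogue. The function name esp_after_main is made up, and memory contents are not modeled; the assumption is simply that each pop returns what the matching push stored. After the andl, esp no longer says where the stack was on entry, so ecx is the only way back.

```c
#include <stdint.h>

/* Hypothetical trace of esp through the gcc 4.3 main shown above. */
uint32_t esp_after_main(uint32_t esp_at_entry)
{
    uint32_t esp = esp_at_entry;
    uint32_t ecx = esp + 4;   /* leal 4(%esp), %ecx                 */
    esp &= (uint32_t)-16;     /* andl $-16, %esp                    */
    esp -= 4;                 /* pushl -4(%ecx)  (return address)   */
    esp -= 4;                 /* pushl %ebp                         */
                              /* movl %esp, %ebp leaves esp alone   */
    esp -= 4;                 /* pushl %ecx                         */
    esp += 4;                 /* popl %ecx  (same value comes back) */
    esp += 4;                 /* popl %ebp                          */
    esp = ecx - 4;            /* leal -4(%ecx), %esp                */
    return esp;               /* value of esp when ret executes     */
}
```

Whatever the alignment did in between, the last leal puts esp back exactly where it was on entry, so ret finds the original return address.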

By the way, main is special and magical; when I compile

int f(void) {
  return 17;
}

I get something slightly more sane:

        .file   "f.c"
        .text
        .p2align 4,,15
.globl f
        .type   f, @function
f:
        pushl   %ebp
        movl    $17, %eax
        movl    %esp, %ebp
        popl    %ebp
        ret
        .size   f, .-f
        .ident  "GCC: (Debian 4.3.2-1.1) 4.3.2"
        .section        .note.GNU-stack,"",@progbits

There's still a ton of decoration, and we're still saving the frame pointer, moving it, and restoring it, which is utterly pointless, but the rest of the code makes sense.

kunjaan : I am using gcc 3.4.5. Which do you recommend?

kunjaan : What are those .scl 2; .type 32; .endef after main, btw?

I should preface all my comments by saying I am still learning assembly.

I will ignore the section initialization. An explanation for the section initialization and basically everything else I cover can be found here: http://en.wikibooks.org/wiki/X86_Assembly/GAS_Syntax

The ebp register is the stack frame base pointer, hence the BP. It stores a pointer to the base of the current stack frame.

The esp register is the stack pointer. It holds the memory location of the top of the stack. Each time we push something on the stack, esp is updated so that it always points to the address of the top of the stack.

So ebp points to the base and esp points to the top. So the stack looks like:
```
esp -----> 000a3   fa
           000a4   21
           000a5   66
           000a6   23
ebp -----> 000a7   54
```
If you push e4 on the stack this is what happens:
```
esp -----> 000a2   e4
           000a3   fa
           000a4   21
           000a5   66
           000a6   23
ebp -----> 000a7   54
```
Notice that the stack grows toward lower addresses; this fact will be important below.
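The diagrams above can be mimicked with a small C toy, offered only as a sketch. The struct and function names (toy_stack, toy_push, toy_pop) are made up, and it pushes single bytes to match the diagram, whereas a real pushl moves esp by 4 bytes at a time.

```c
#include <stdint.h>

#define STACK_BYTES 64

/* Toy model: a byte array for memory and an index standing in for esp. */
struct toy_stack {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;              /* index of the current top of the stack */
};

/* An empty stack has esp just past the highest address. */
void toy_init(struct toy_stack *s)
{
    s->esp = STACK_BYTES;
}

void toy_push(struct toy_stack *s, uint8_t value)
{
    s->esp -= 1;               /* the stack grows toward lower addresses */
    s->mem[s->esp] = value;    /* the new value becomes the top */
}

uint8_t toy_pop(struct toy_stack *s)
{
    uint8_t value = s->mem[s->esp];
    s->esp += 1;               /* popping moves esp back up */
    return value;
}
```

Pushing e4 after fa leaves esp one slot lower than before, and popping hands e4 back and restores esp, just as in the two pictures.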

The first two steps are known as the procedure prolog, or more commonly the function prolog; they prepare the stack for use by local variables. See the procedure prolog quote at the bottom.

In step 1 we save the pointer to the old stack frame on the stack by executing pushl %ebp. Since main is the first function called, I have no idea what the previous value of %ebp points to.

In step 2 we are entering a new stack frame because we are entering a new function (main). Therefore, we must set a new stack frame base pointer. We use the value in esp as the beginning of our stack frame.

Step 3 allocates 8 bytes of space on the stack. As mentioned above, the stack grows toward lower addresses, so subtracting 8 from esp moves the top of the stack by 8 bytes.

Step 4 aligns the stack. I've found different opinions on this, and I'm not really sure exactly why it is done. I suspect it is done so that data for wide (SIMD) instructions can be placed on the stack at aligned addresses.

http://gcc.gnu.org/ml/gcc/2008-01/msg00282.html

This code "and"s ESP with 0xFFFFFFF0, aligning the stack with the next lowest 16-byte boundary. An examination of Mingw's source code reveals that this may be for SIMD instructions appearing in the "_main" routine, which operate only on aligned addresses. Since our routine doesn't contain SIMD instructions, this line is unnecessary.

http://en.wikibooks.org/wiki/X86_Assembly/GAS_Syntax

Steps 5 through 11 seem to have no purpose to me. I couldn't find any explanation on Google. Could someone who really knows this stuff provide a deeper understanding? I've heard rumors that this stuff is used for C's exception handling.

Step 5 stores 0 in eax.

Steps 6 and 7 each add 15 (decimal) to eax for an unknown reason. In binary, eax = 01111 + 01111 = 11110 (30).

Step 8 shifts eax 4 bits to the right. eax = 00001, because the low four bits are shifted off the end: 00001 | 1110.

Step 9 shifts eax 4 bits to the left, so eax = 10000 (16).

Step 10 stores eax in the first 4 allocated bytes on the stack, at -4(%ebp), and step 11 loads that same value straight back into eax.
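The arithmetic in steps 5 through 9 can be replayed in C on an ordinary variable standing in for eax. This is only a sketch, and the function name replay_steps_5_to_9 is made up. The starting value is a parameter so you can see the pattern: add 15 twice, then clear the low four bits, which yields a multiple of 16. My assumption, not something the listing states, is that this is a size for the __alloca call that follows.

```c
#include <stdint.h>

/* Hypothetical replay of steps 5 to 9; `start` plays the role of the $0. */
uint32_t replay_steps_5_to_9(uint32_t start)
{
    uint32_t eax = start;  /* movl $0, %eax  (start is 0 in the listing) */
    eax += 15;             /* addl $15, %eax */
    eax += 15;             /* addl $15, %eax */
    eax >>= 4;             /* shrl $4, %eax  drops the low four bits */
    eax <<= 4;             /* sall $4, %eax  puts zeros back in their place */
    return eax;
}
```

With the 0 from the listing the result is 16, which is the value sitting in eax when __alloca is called.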

Steps 12 and 13 set up the C library.

We have reached the function epilogue, that is, the part of the function which returns esp and ebp to the state they were in before this function was called.

Step 14, leave sets esp to the value of ebp, moving the top of stack to the address it was before main was called. Then it sets ebp to point to the address we saved on the top of the stack during step 1.

Leave can just be replaced with the following instructions:
```
mov  %ebp, %esp
pop  %ebp
```
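As a sanity check on that equivalence, here is a toy C model of the prolog and of leave. It is a sketch under simplifying assumptions: memory is an array of 32-bit words, esp and ebp are word indexes rather than byte addresses, and every name (toy_cpu, toy_enter, toy_leave) is invented for the example.

```c
#include <stdint.h>

#define WORDS 32

struct toy_cpu {
    uint32_t mem[WORDS];  /* word-addressed toy memory        */
    uint32_t esp;         /* index of the top-of-stack word   */
    uint32_t ebp;         /* index of the frame base word     */
};

static void cpu_push(struct toy_cpu *c, uint32_t v)
{
    c->mem[--c->esp] = v;
}

static uint32_t cpu_pop(struct toy_cpu *c)
{
    return c->mem[c->esp++];
}

/* pushl %ebp; movl %esp, %ebp; then reserve room for locals */
void toy_enter(struct toy_cpu *c, uint32_t local_words)
{
    cpu_push(c, c->ebp);
    c->ebp = c->esp;
    c->esp -= local_words;
}

/* leave, spelled out as: movl %ebp, %esp; popl %ebp */
void toy_leave(struct toy_cpu *c)
{
    c->esp = c->ebp;
    c->ebp = cpu_pop(c);
}
```

Calling toy_enter and then toy_leave puts both esp and ebp back to the values they had before the call, no matter how many local words were reserved in between.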
Step 15 returns from the function.
```
1.    pushl       %ebp
2.    movl        %esp, %ebp
3.    subl        $8, %esp
4.    andl        $-16, %esp
5.    movl        $0, %eax
6.    addl        $15, %eax
7.    addl        $15, %eax
8.    shrl        $4, %eax
9.    sall        $4, %eax
10.   movl        %eax, -4(%ebp)
11.   movl        -4(%ebp), %eax
12.   call        __alloca
13.   call        ___main
14.   leave
15.   ret
```
Procedure Prolog:

The first thing a function has to do is called the procedure prolog. It first saves the current base pointer (ebp) with the instruction pushl %ebp (remember ebp is the register used for accessing function parameters and local variables). Now it copies the stack pointer (esp) to the base pointer (ebp) with the instruction movl %esp, %ebp. This allows you to access the function parameters as indexes from the base pointer. Local variables are always at a negative offset from ebp, such as -4(%ebp) for the first local variable; the return address is always at 4(%ebp); each parameter or argument is at N*4+4(%ebp), such as 8(%ebp) for the first argument; and the old ebp is at (%ebp).
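The offsets in that description can be written down as a tiny C sketch. The names below are made up, and it assumes every argument and local is exactly 4 bytes, which is the simple case the quote has in mind.

```c
/* Byte offsets from ebp in a conventional 32-bit frame (illustrative names). */
enum {
    SAVED_EBP_OFFSET   = 0,  /* old ebp sits at (%ebp)           */
    RETURN_ADDR_OFFSET = 4   /* return address sits at 4(%ebp)   */
};

/* Offset of the n-th argument, counting from 1: 8(%ebp), 12(%ebp), ... */
int arg_offset(int n)
{
    return 4 * n + 4;
}

/* Offset of the n-th 4-byte local, counting from 1: -4(%ebp), -8(%ebp), ... */
int local_offset(int n)
{
    return -4 * n;
}
```

So the -4(%ebp) slot used in steps 10 and 11 of the listing is simply the first local of main's frame.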

http://www.milw0rm.com/papers/52

A really great Stack Overflow thread exists which answers much of this question: http://stackoverflow.com/questions/499842/why-are-there-extra-instructions-in-my-gcc-output

A good reference on x86 machine code instructions can be found here: http://programminggroundup.blogspot.com/2007/01/appendix-b-common-x86-instructions.html

This is a lecture which contains some of the ideas used above: http://csc.colstate.edu/bosworth/cpsc5155/Y2006_TheFall/MySlides/CPSC5155_L23.htm

Here is another take on answering your question: http://www.phiral.net/linuxasmone.htm

None of these sources explain everything.


Thursday, May 5, 2011
