Understanding compilation stages – Preprocessor, Compiler, Assembler, Linker, Loader

When we compile a program on Linux with “gcc”, for example “gcc -o helloworld helloworld.c”, a single command produces an executable named “helloworld”. Behind the scenes, however, gcc runs the first four of the stages listed below. The fifth, the loader, comes into play only when the program is executed.

1) Preprocessor
2) Compiler
3) Assembler
4) Linker
5) Loader

1) Preprocessor – The C preprocessor is the macro preprocessor for the C language. It handles header file inclusion, macro expansion, conditional compilation, and line control. For example, suppose we write code like this:

#define TEST 5
printf("%d \n", TEST);

After the preprocessing step, the same code becomes:

printf("%d \n", 5);

In other words, the preprocessor scans the source for directives such as #define and #include, replaces each macro with its definition, and inserts the contents of each included header directly into the code.

2) Compiler – GCC : GNU project C and C++ compiler
————————————-

Help – “man gcc”

When you invoke GCC, it normally does preprocessing, compilation, assembly and linking.
The “overall options” allow you to stop this process at an intermediate stage.
For example, -E stops after preprocessing, -S stops after compilation proper and leaves
assembly code in a .s file, and -c says not to run the linker, so the output consists
of object files output by the assembler.

3) Assembler (as)
———————————-

Help – “man as”

GNU as is really a family of assemblers.
as is primarily intended to assemble the output of the GNU C compiler “gcc” for use by the linker “ld”.

If you are invoking as via the GNU C compiler, you can use the -Wa option to pass arguments through to the assembler.
The assembler arguments must be separated from each other (and the -Wa) by commas. For example:

gcc -c -g -O -Wa,-alh,-L file.c

This passes two options to the assembler: -alh (emit a listing to standard output with high-level and assembly source)
and -L (retain local symbols in the symbol table).

4) Linker – (ld – The GNU linker)
———————-

ld combines a number of object and archive files, relocates their data and ties up symbol references.
Usually the last step in compiling a program is to run ld.

The loader, described below, is not a compilation step. It is one of the first stages of program execution: at start-up, the loader maps the shared libraries the application needs into memory along with the application itself.

5) Loader
ld.so/ld-linux.so – dynamic linker/loader
——————————————

ld.so loads the shared libraries needed by a program, prepares the program to run, and then runs it.
Unless explicitly specified via the -static option to ld during compilation, all Linux programs are
incomplete and require further linking at run time.

In the next two posts we will see how these steps actually work when we compile the “helloworld.c” program:
1. understanding gcc compilation steps : linux compilation steps
2. from source code to executable : how executable is created during compilation on linux
