Understanding compilation stages – Preprocessor, Compiler, Assembler, Linker, Loader

When we compile a program on Linux with gcc, for example “gcc -o helloworld helloworld.c”, a single command creates an executable named “helloworld”. In the background, however, gcc runs the first four of the stages listed below. The fifth stage, the loader, comes into play only when the program is executed.

  1. Preprocessor
  2. Compiler
  3. Assembler
  4. Linker
  5. Loader

1) Preprocessor

The C preprocessor is the macro preprocessor for the C language. It provides header file inclusion, macro expansion, conditional compilation, and line control. For example, when we write code like the following,

#define TEST 5
printf("%d \n", TEST);

After the preprocessing step, the same code becomes:

printf("%d \n", 5);

That is, the preprocessor handles every directive such as #define and #include, and inserts the corresponding definitions and header contents directly into the source code.

2) Compiler – GCC : GNU project C and C++ compiler

Help – “man gcc”

When you invoke GCC, it normally does preprocessing, compilation, assembly and linking. The compilation stage proper translates the preprocessed source into assembly code. The “overall options” allow you to stop this process at an intermediate stage.
For example, the -c option says not to run the linker. Then the output consists of object files output by the assembler.

3) Assembler (as)

GNU as is really a family of assemblers.
“as” is primarily intended to assemble the output of the GNU C compiler “gcc” for use by the linker “ld”.

If you are invoking as via the GNU C compiler, you can use the -Wa option to pass arguments through to the assembler.
The assembler arguments must be separated from each other (and the -Wa) by commas. For example:

 $ gcc -c -g -O -Wa,-alh,-L file.c 

This passes two options to the assembler: -alh (emit a listing to standard output with high-level and assembly source)
and -L (retain local symbols in the symbol table).

4) Linker – ld – The GNU linker

ld combines a number of object and archive files, relocates their data and ties up symbol references.
Usually the last step in compiling a program is to run ld.

The loader, described below, is not a compilation step. It is one of the first stages of executing a program: at start-up, the loader loads the shared libraries that the application needs.

5) Loader – ld.so/ld-linux.so – dynamic linker/loader

ld.so loads the shared libraries needed by a program, prepares the program to run, and then runs it.
Unless explicitly specified via the -static option to ld during compilation, all Linux programs are incomplete and require further linking at run time.

In the next two posts we will look at how these steps actually work when we compile the “helloworld.c” program.
1. understanding gcc compilation steps : linux compilation steps
2. from source code to executable : how executable is created during compilation on linux
