A compiler is a software tool that translates source code written in a programming language into executable code that a computer can run. The process involves several stages, including lexical, syntax, and semantic analysis of the source code, after which the compiler generates the executable code. Compilers can be explained in simple terms, but understanding how they work and how translation proceeds is essential for software engineers and programmers. That is why we separated this topic from the previous post and gave it special attention.
The first good question you would probably have for us is: in what language is the compiler for the C programming language written? Looking back in history, early computers were programmed mostly in assembly language, and higher-level programming languages began to develop as the benefits of reusing software on different processors grew. The first higher-level programming language, Plankalkül, was proposed as early as 1943. Since then, several experimental compilers have been developed. The Fortran team, led by John Backus of IBM, introduced the first complete compiler in 1957. Since then, compilers have become increasingly complex as computer architectures have evolved.
Today, it is common practice to implement a compiler in the same language that it compiles. It is therefore natural to assume that the compiler for the C programming language is written in C, just as Roslyn, Microsoft's open-source compiler for C# and Visual Basic, is written largely in C#. However, to create the first C compiler, its creator Dennis Ritchie used the earlier programming language B, which was developed by Ken Thompson.
Dennis Ritchie later extended the B programming language into the C programming language, so the original C compiler was also written in B. Today we mostly use GCC version 14 or newer (this text was written in 2025), and you will not find any B code in it: GCC does not support the B programming language, which is long obsolete. GCC itself is written in a combination of the C and C++ programming languages, and it may also contain some parts written in other programming languages such as Objective-C and some newer ones.
Once you have written your C code in any text editor and saved it as a text file, you can call the C compiler to translate it into machine code so that your program can run. The compiler works on a translation unit, which consists of the source file together with the header files it references through #include directives. If your code is correct, the compiler creates an object file, which we recognize by the .o or .obj suffix; such object files are also called modules. The standard library of the C programming language is supplied already translated into machine language, so the standard functions we call in our programs do not have to be compiled again with every program.
It is important to note that a C source file is usually not translated into machine language in a single step. It is first translated into assembly language in a temporary file, which we recognize by the .s suffix; that file is then translated into machine language, after which the temporary file is deleted. The compiler translates each source file, together with all the header files it includes, into its own object file, i.e., module. The linker is then called to combine all object files and all used library functions into an executable file. Do not confuse this process with .NET technology: in .NET and the C# programming language, things work differently.
How the C Programming Language 'Understands' Your Code: A Journey Through the Compilation Stages
Let's remember: a compiler is a software tool that translates program code written in a high-level language, such as C, into machine code that a processor can execute. The main functions of a compiler are:
- Checking the syntax and semantics of the code.
- Translating from source code to intermediate code.
- Optimizing the code for better efficiency.
- Generating machine code specific to the processor and operating system.
The most widely used C compilers are:
- GCC - GNU Compiler Collection
- Clang
- MSVC - Microsoft Visual C++
- Intel C Compiler
1. Preprocessing
Processing directives (#include, #define, #ifdef, etc.).
Inserting the contents of header files (#include).
Replacing macros (#define).
Removing comments.
Conditional compilation (#ifdef, #endif).
2. Lexical Analysis
The source code is divided into the smallest logical units - tokens (keywords, identifiers, numbers, operators).
For example, the code:
int x = 10;
is broken down into tokens: int, x, =, 10, ;.
3. Syntax Analysis (Parsing)
- AST - Abstract Syntax Tree is generated, which shows the structure of the code.
- Checks the validity of the syntax (e.g., whether if is correctly written).
- If there are syntax errors, the compiler stops the process and reports errors.
4. Semantic Analysis
Checks the meaning of the code:
Are the data types compatible?
Are variables properly declared before use?
Is there a name conflict in the scope of variables?
This phase catches code that is syntactically valid but has no valid meaning, such as using a float variable as the controlling expression of a switch statement.
5. Intermediate Code Generation
The compiler creates intermediate code, which is not directly dependent on the processor architecture.
This intermediate code can be in the form of three-address instructions or SSA - Static Single Assignment form.
Example of intermediate code in three-address form for the statement x = 10; (t1 is a temporary introduced by the compiler):
t1 = 10
x = t1
6. Code Optimization
Improving code performance by removing redundant instructions. Optimization techniques include:
- Dead code elimination (code that is never executed).
- Loop optimization (e.g., loop unrolling).
- Function inlining (inline expansion).
- Elimination of redundant (repeated) expressions.
7. Code Generation
Transformation of the optimized intermediate code into machine code specific to the processor.
For example, for x = x + 5; on the x86 architecture, assuming that x is held in the EAX register, it might look like this:
ADD EAX, 5
8. Linking
- Linking object code (.o or .obj files) with libraries (libc, libm, etc.).
- Creating the final executable file (an .exe file on Windows or an ELF binary on Linux).
Linking can be:
- Static (all libraries are included in the executable file).
- Dynamic (the program uses external *.so or *.dll files at run time).
Compiling C Programs in Practice
All of this may seem complicated in theory, but in practice everything is simple. Just keep in mind that a C program compiled on the Linux operating system produces a Linux executable, which will not work on the Windows operating system. However, if you really need a Windows *.exe file, it is enough to install MinGW-w64 and use a different compiler command:
sudo apt-get install mingw-w64
x86_64-w64-mingw32-gcc -o program.exe program.c
These commands will compile your file to work on the Windows operating system, but the resulting file will not run on the Linux operating system, because the two operating systems use different binary formats. For a native Linux build, open your terminal and enter the following commands.
manuel@manuel-virtual-machine:~$ sudo apt-get update
manuel@manuel-virtual-machine:~$ sudo apt-get upgrade
manuel@manuel-virtual-machine:~$ clear
manuel@manuel-virtual-machine:~$ ls
manuel@manuel-virtual-machine:~$ cd tutorials
manuel@manuel-virtual-machine:/tutorials$ ls
manuel@manuel-virtual-machine:/tutorials$ cd c_tutorial
manuel@manuel-virtual-machine:/tutorials/c_tutorial$ nano program.c
Write the following C code into a file.
#include <stdio.h>

int main() {
    printf("Hello, World!\n");
    return 0;
}
Then, compile the program.c file.
manuel@manuel-virtual-machine:/tutorials/c_tutorial$ cat program.c
manuel@manuel-virtual-machine:/tutorials/c_tutorial$ clear
manuel@manuel-virtual-machine:/tutorials/c_tutorial$ gcc program.c -o program
manuel@manuel-virtual-machine:/tutorials/c_tutorial$ ls
manuel@manuel-virtual-machine:/tutorials/c_tutorial$ ./program
You will get the following result.
Hello, World!
You can also see the same example of compiling a file
with C code in the following video.