In this tutorial, I want to talk about two terms that are fundamental in computer science: "program" and "process". The two are often used interchangeably, but technically they are quite different. Let's begin by defining them and then look more closely at what each one involves:
- A program is a file containing a range of information that describes how to construct a "process" at run time [1].
- A process is an instance of an executing program [1].
These are the formal definitions. In my own words:
A program (or binary) is a file that contains machine code, metadata, and debug information (if the program was compiled with the -g flag). A process is an abstraction that the kernel creates when the program is run, and to which it allocates hardware resources such as RAM.
The first thing you should know about is the program format. Today, two of the most common formats that compilers generate are:
- ELF (Executable and Linkable Format)
- PE (Portable Executable)
ELF Format
ELF is the standard format on Linux and most UNIX-like systems, and PE is the standard format on Windows. I'm currently on Ubuntu 24.10 (x86_64), so I will explain and use the ELF format. As you might guess, I don't really know the PE format, but there is a good reference you can look at: "Practical Binary Analysis" by Dennis Andriesse.
Now that we've covered the program formats, let's look inside a program:

In general, an ELF file includes these parts:
- Executable header
- Program headers
- Sections
- Section headers
Every ELF file starts with an executable header, which is just a structured series of bytes telling you that it's an ELF file, what kind of ELF file it is, and where in the file to find all the other contents [2].
You can see the exact content of this header with the readelf command:
$ readelf -h ./copy
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Position-Independent Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1160
Start of program headers: 64 (bytes into file)
Start of section headers: 17496 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 13
Size of section headers: 64 (bytes)
Number of section headers: 37
Section header string table index: 36
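Here, the first four bytes of the Magic field (7f 45 4c 46) identify the file format: 0x7f followed by the ASCII characters "ELF". A Windows PE file starts with a different signature (the ASCII characters "MZ"). The next three magic bytes match the fields printed below them: 02 is the 64-bit class, 01 is little-endian data, and 01 is the ELF version. The other fields are self-explanatory.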
The code and data in an ELF file are logically divided into sections, and each section holds one kind of content. The sections are described by the section header table, which has one entry per section. Some sections hold the machine instructions that get executed, and others hold supporting information (like the symbol table used by a debugger). Let's look at the section headers:
$ readelf --sections --wide ./copy
There are 37 section headers, starting at offset 0x4458:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 0000000000000318 000318 00001c 00 A 0 0 1
[ 2] .note.gnu.property NOTE 0000000000000338 000338 000030 00 A 0 0 8
[ 3] .note.gnu.build-id NOTE 0000000000000368 000368 000024 00 A 0 0 4
[ 4] .note.ABI-tag NOTE 000000000000038c 00038c 000020 00 A 0 0 4
[ 5] .gnu.hash GNU_HASH 00000000000003b0 0003b0 000028 00 A 6 0 8
[ 6] .dynsym DYNSYM 00000000000003d8 0003d8 000180 18 A 7 1 8
[ 7] .dynstr STRTAB 0000000000000558 000558 0000d3 00 A 0 0 1
[ 8] .gnu.version VERSYM 000000000000062c 00062c 000020 02 A 6 0 2
[ 9] .gnu.version_r VERNEED 0000000000000650 000650 000030 00 A 7 1 8
[10] .rela.dyn RELA 0000000000000680 000680 0000d8 18 A 6 0 8
[11] .rela.plt RELA 0000000000000758 000758 0000d8 18 AI 6 24 8
[12] .init PROGBITS 0000000000001000 001000 00001b 00 AX 0 0 4
[13] .plt PROGBITS 0000000000001020 001020 0000a0 10 AX 0 0 16
[14] .plt.got PROGBITS 00000000000010c0 0010c0 000010 10 AX 0 0 16
[15] .plt.sec PROGBITS 00000000000010d0 0010d0 000090 10 AX 0 0 16
[16] .text PROGBITS 0000000000001160 001160 00043a 00 AX 0 0 16
[17] .fini PROGBITS 000000000000159c 00159c 00000d 00 AX 0 0 4
[18] .rodata PROGBITS 0000000000002000 002000 000040 00 A 0 0 4
[19] .eh_frame_hdr PROGBITS 0000000000002040 002040 000034 00 A 0 0 4
[20] .eh_frame PROGBITS 0000000000002078 002078 0000a8 00 A 0 0 8
[21] .init_array INIT_ARRAY 0000000000003d78 002d78 000008 08 WA 0 0 8
[22] .fini_array FINI_ARRAY 0000000000003d80 002d80 000008 08 WA 0 0 8
[23] .dynamic DYNAMIC 0000000000003d88 002d88 0001f0 10 WA 7 0 8
[24] .got PROGBITS 0000000000003f78 002f78 000088 08 WA 0 0 8
[25] .data PROGBITS 0000000000004000 003000 000010 00 WA 0 0 8
[26] .bss NOBITS 0000000000004020 003010 000010 00 WA 0 0 32
[27] .comment PROGBITS 0000000000000000 003010 00002b 01 MS 0 0 1
[28] .debug_aranges PROGBITS 0000000000000000 00303b 000030 00 0 0 1
[29] .debug_info PROGBITS 0000000000000000 00306b 000472 00 0 0 1
[30] .debug_abbrev PROGBITS 0000000000000000 0034dd 000184 00 0 0 1
[31] .debug_line PROGBITS 0000000000000000 003661 00017e 00 0 0 1
[32] .debug_str PROGBITS 0000000000000000 0037df 00031a 01 MS 0 0 1
[33] .debug_line_str PROGBITS 0000000000000000 003af9 00012f 01 MS 0 0 1
[34] .symtab SYMTAB 0000000000000000 003c28 000438 18 35 18 8
[35] .strtab STRTAB 0000000000000000 004060 00028c 00 0 0 1
[36] .shstrtab STRTAB 0000000000000000 0042ec 00016a 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), l (large), p (processor specific)
Sections are the main topic of this tutorial, so I want to explain the important ones in more detail:
- .init and .fini: The machine code in these sections runs before and after the main() function: .init holds initialization code that runs before main(), and .fini holds cleanup code that runs after main() returns. Don't forget that we don't write these sections ourselves. The compiler toolchain generates them to handle low-level (or hardware-specific) work. I don't know the exact details of what they do internally!
- .text: This is the main area we focus on in binary analysis and reverse engineering. It contains your code as machine instructions. If you disassemble it, you will see a surprising amount of machine code even for a small program, because the toolchain adds startup code (such as _start) around your own functions.
- .bss, .data and .rodata: These sections hold the variables and constants from your source code. .bss holds uninitialized (zero-initialized) static and global variables, and .data holds initialized static and global variables; both are writable (flags WA in the table above). .rodata is read-only (flag A only) and holds constant data such as string literals and globals defined with the const keyword.
- .dynamic: This is the "road map" for the kernel and dynamic linker when loading and setting up an ELF binary for execution.
- .init_array and .fini_array: These contain arrays of pointers to functions that are used as constructors and destructors. You may already know how to create them with a compiler-specific attribute, like this one: __attribute__((constructor)) void run_before_main(void);.
- .shstrtab, .symtab, .strtab, .dynsym and .dynstr: The .shstrtab section is simply an array of NUL-terminated strings that contain the names of all the sections in the binary. The .symtab section contains a symbol table and .strtab contains the symbol names. The .dynsym and .dynstr sections are analogous to .symtab and .strtab, except that they contain the symbols and strings needed for dynamic linking rather than static linking.
- .debug_*: These sections are used primarily by a debugger (such as GDB). They do not contain any machine code, just metadata about the program that the debugger can use later. If you don't compile the program with the -g option, these sections will not be there.
Below is a part of the .text section:
$ objdump -j .text -d ./copy
./copy: file format elf64-x86-64
(...)
Disassembly of section .text:
0000000000001249 <main>:
1249:f3 0f 1e fa endbr64
124d:55 push %rbp
124e:48 89 e5 mov %rsp,%rbp
1251:48 81 ec 40 04 00 00 sub $0x440,%rsp
1258:89 bd cc fb ff ff mov %edi,-0x434(%rbp)
125e:48 89 b5 c0 fb ff ff mov %rsi,-0x440(%rbp)
1265:64 48 8b 04 25 28 00 mov %fs:0x28,%rax
126c:00 00
126e:48 89 45 f8 mov %rax,-0x8(%rbp)
1272:31 c0 xor %eax,%eax
1274:83 bd cc fb ff ff 03 cmpl $0x3,-0x434(%rbp)
127b:75 24 jne 12a1 <main+0x58>
127d:48 8b 85 c0 fb ff ff mov -0x440(%rbp),%rax
1284:48 83 c0 08 add $0x8,%rax
1288:48 8b 00 mov (%rax),%rax
128b:48 8d 15 72 0d 00 00 lea 0xd72(%rip),%rdx # 2004 <_IO_stdin_used+0x4>
(...)
Here we see three columns that show the machine instructions of your program. On the left are the addresses of the machine instructions. While the program runs, the program counter (the RIP register on x86-64) tracks these addresses. Because this binary is position-independent, the addresses shown are offsets that are relocated when the program is loaded. In the center are the raw bytes that encode each instruction. On the right is the assembly-level representation (objdump uses AT&T syntax by default). When reverse engineering, we inspect this column to understand what the program does.
Virtual Memory Layout
Now that we've discussed the ELF program format, I will explain the virtual memory layout of a process (a running program instance). When you run the program, the kernel creates a memory layout like the one below:

I've already explained the .text, .data, .bss and .rodata sections. Apart from these, you see the stack and heap areas.
Both are memory areas that the process uses while it runs. The stack follows a LIFO (Last-In First-Out) model: each function call pushes a frame holding its return address and local variables, and the frame is popped when the function returns. The stack holds data, not the instructions themselves, and it grows toward lower addresses. The heap is completely different! It has no LIFO or FIFO order: blocks can be allocated and released in any order. The program uses this memory for dynamic allocation with the malloc()/free() function family, and it grows toward higher addresses. The upper boundary of the heap is called the program break, and it is moved with the brk() and sbrk() functions.
Up to this point, I wanted to give you a general overview of programs and processes. If you want to dive deeper, you can check the references below.
[1] Kerrisk, M., The Linux Programming Interface, No Starch Press, San Francisco, p. 113.
[2] Andriesse, D., Practical Binary Analysis, No Starch Press, San Francisco, p. 33.