Linux process execution and the useless ELF header fields

by @Jonathan Salwan - 2013-01-29

When a userland process calls execve(), it uses a software interrupt to enter the kernel. On x86, a software interrupt is raised with the INT instruction; the CPU then consults a table, the IDT (Interrupt Descriptor Table), to find out which routine it needs to call.

The only operand taken by the INT instruction is an index into this table. That means that when an 'int 80h' is executed, the CPU consults the IDT and executes the handler stored at index 0x80.
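
To make the lookup concrete, here is a toy user-space model in C. It is not kernel code: it only mimics one table indexed by the INT operand, plus the second table that the handler at vector 0x80 indexes with the number found in RAX (described next). All names (toy_idt, toy_int, the toy_sys_* functions) and the syscall numbers are made up for illustration.

```c
#include <stddef.h>

#define TOY_IDT_SIZE     256  /* x86 has 256 interrupt vectors */
#define TOY_NR_SYSCALLS  3    /* toy value; the real table is far larger */

/* Toy register file: only the register that carries the syscall number. */
struct toy_regs {
    unsigned long rax;
};

typedef long (*toy_syscall_fn)(void);
typedef long (*toy_idt_handler)(struct toy_regs *);

/* Hypothetical syscalls; the return values only identify which one ran. */
static long toy_sys_a(void)      { return 10; }
static long toy_sys_b(void)      { return 20; }
static long toy_sys_execve(void) { return 30; }

/* Second-level table, indexed by the number found in RAX. */
static const toy_syscall_fn toy_syscall_table[TOY_NR_SYSCALLS] = {
    toy_sys_a, toy_sys_b, toy_sys_execve,
};

/* Handler installed at vector 0x80: dispatch on regs->rax. */
static long toy_syscall_handler(struct toy_regs *regs)
{
    if (regs->rax >= TOY_NR_SYSCALLS)
        return -1;                      /* unknown syscall number */
    return toy_syscall_table[regs->rax]();
}

/* First-level table, indexed by the INT operand. */
static toy_idt_handler toy_idt[TOY_IDT_SIZE] = {
    [0x80] = toy_syscall_handler,
};

/* Models "int n": look up vector n and run whatever is stored there. */
long toy_int(unsigned char vector, struct toy_regs *regs)
{
    if (toy_idt[vector] == NULL)
        return -1;                      /* nothing installed at this vector */
    return toy_idt[vector](regs);
}
```

With rax set to 2, toy_int(0x80, &regs) ends up in toy_sys_execve; any other vector returns -1 because nothing is installed there in this model.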

In the Linux world, the syscall handler is stored at index 0x80. The handler then looks at the content of RAX to know which syscall to call. To do that, it uses the syscall table, and RAX is an index into this table. The following diagram shows what I just said:

After all these operations, the execve function is called. Several steps are then necessary before execution:

  1. Prepare binprm credentials
  2. Initialize binprm memory
  3. Prepare binprm
  4. Find which function to call to load/parse the binary
  5. Parse the ELF header
  6. Parse the program headers
  7. Set up MM & VMAs
  8. Start thread
  9. Free binprm

The main structure used during this process is 'linux_binprm'; it holds everything the kernel needs while loading the binary, including the description of the new process memory. The interesting fields of the structure are shown below.

v3.7.4/include/linux/binfmts.h

struct linux_binprm {
..
#ifdef CONFIG_MMU
      struct vm_area_struct *vma;
      unsigned long vma_pages;
#else
..
      struct mm_struct *mm;
..
      struct file * file;
      struct cred *cred;      /* new credentials */
..
      int argc, envc;
      const char * filename;  /* Name of binary as seen by procps */
      const char * interp;    /* Name of the binary really executed. Most
                                 of the time same as filename, but could be
..
};

To initialize the cred structure, the kernel calls the prepare_exec_creds function, which creates a new cred structure from the credentials of the calling process.

Before execution, the kernel needs to load and parse the binary. The load_elf_binary function is called; it checks that the ELF header is valid and also sets several pointers: code, brk and stack. If you want more information about ASLR, you can read this post. The VMAs are also set up; the following diagram represents the layout of the process memory.

Once the binprm struct has been fully filled in, the kernel can start the thread: start_thread() is called with the ELF entry point and the new stack pointer (bprm->p).

v3.7.4/fs/binfmt_elf.c load_elf_binary()

        current->mm->end_code = end_code;
        current->mm->start_code = start_code;
        current->mm->start_data = start_data;
        current->mm->end_data = end_data;
        current->mm->start_stack = bprm->p;
        ...
        start_thread(regs, elf_entry, bprm->p);
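
The header validation mentioned earlier can be approximated in user space. The sketch below is loosely modelled on the kind of checks done at the start of load_elf_binary; it is not a copy of the kernel code, it assumes a 64-bit ELF, and it leaves out the architecture check among others. The function name ehdr_looks_loadable is hypothetical.

```c
#include <elf.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if the header passes a few load-time style sanity checks.
 * Assumption: 64-bit ELF; the kernel performs more checks than these. */
bool ehdr_looks_loadable(const Elf64_Ehdr *eh)
{
    /* Magic bytes: 0x7f 'E' 'L' 'F' */
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
        return false;

    /* Only executables and shared objects (e.g. PIE) can be executed. */
    if (eh->e_type != ET_EXEC && eh->e_type != ET_DYN)
        return false;

    /* Program header entries must have the expected size... */
    if (eh->e_phentsize != sizeof(Elf64_Phdr))
        return false;

    /* ...and there must be at least one of them. */
    if (eh->e_phnum < 1)
        return false;

    /* e_shoff, e_shentsize, e_shnum and e_shstrndx are never read here. */
    return true;
}
```

Note that none of these checks touches a section-related field, so a header with garbage in them still passes.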

It's quite fun, because when the kernel loads an ELF binary it doesn't use the section headers (ElfX_Shdr) at all; it only needs the program headers (ElfX_Phdr) to set up the VMAs. From this, we can say that the following ElfX_Ehdr fields are kinda useless at load time: e_shoff, e_shentsize, e_shnum, e_shstrndx.
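
As a rough illustration of that point, the following user-space sketch walks the program header table of an in-memory ELF image and counts the PT_LOAD segments, which are the ones a loader maps into memory. It only reads e_phoff, e_phentsize and e_phnum. It assumes a 64-bit ELF, and count_load_segments is a hypothetical helper, not a kernel function.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Counts PT_LOAD segments in an in-memory 64-bit ELF image.
 * Returns -1 if the program header table does not fit in the buffer. */
int count_load_segments(const unsigned char *image, size_t size)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    int loads = 0;

    if (size < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);      /* copy to avoid unaligned access */

    if (eh.e_phentsize != sizeof ph)
        return -1;
    if (eh.e_phoff > size ||
        (size - eh.e_phoff) / sizeof ph < eh.e_phnum)
        return -1;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        memcpy(&ph, image + eh.e_phoff + i * sizeof ph, sizeof ph);
        if (ph.p_type == PT_LOAD)
            loads++;                    /* a segment to be mapped */
    }
    return loads;
}
```

The section header fields can hold any value without changing the result, since the function never looks at them.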

Now we can build a very simple anti-debug trick by overwriting these fields. Check it out:

$ cat test.c
#include <stdio.h>

int main(int ac, char **av)
{
  printf("Works\n");
  return 0;
}

$ ./test
Works

$ ./anti-debug-CleanSection ./test
[+] Binary size : 7832 octets
--- Step 1 ---
[+] Change section [DONE]
--- Step 2 ---
[+] Change elf header [DONE]
--- Step 3 ---
[+] Writting binary...
[+] Writting binary [DONE]

$ objdump -d ./test
objdump: ./test: File format not recognized

$ gdb ./test
[...]
"/home/jonathan/w/test": not in executable format: File format not recognized
gdb-peda$ r
No executable file specified.
Use the "file" or "exec-file" command.
gdb-peda$ quit

$ ./test
Works
$

You can find the sources of 'anti-debug' here.
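
For readers who just want the idea, here is a minimal sketch of the header part of the trick. It is not the original 'anti-debug' source, and that tool may well write different values or also modify the section table itself. The sketch assumes a 64-bit ELF and simply zeroes the four fields in place; the function names are hypothetical, and how objdump or gdb react depends on the values written.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Zeroes the section-related fields of a 64-bit ELF header. */
void wipe_section_fields(Elf64_Ehdr *eh)
{
    eh->e_shoff     = 0;
    eh->e_shentsize = 0;
    eh->e_shnum     = 0;
    eh->e_shstrndx  = 0;
}

/* Patches the file at `path` in place. Returns 0 on success, -1 on error. */
int wipe_section_fields_in_file(const char *path)
{
    Elf64_Ehdr eh;
    FILE *f = fopen(path, "r+b");
    int rc = -1;

    if (f == NULL)
        return -1;

    /* Only touch files that really are 64-bit ELF. */
    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64) {
        wipe_section_fields(&eh);
        /* Seek back to the start before writing the patched header. */
        if (fseek(f, 0L, SEEK_SET) == 0 &&
            fwrite(&eh, sizeof eh, 1, f) == 1)
            rc = 0;
    }
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}
```

Try it on a copy of the binary: the program headers are left untouched, so the file should still run.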