The Design and Implementation of Userland Exec by the grugq Abstract: This paper examines a mechanism to execute a binary without the use of the syscall execve(). The design for this mechanism is simple: clean up the address space; check for, and load, the dynamic linker; load the binary; set up the stack; locate the entry point, and transfer control of execution. One implemenation of this mechanism is used to illustrate this design. Introduction: Userland exec replaces the existing process image within the current address space with a new one. In this, userland exec mimics the behavior of the system call execve()*. However, because it operates in userland, the kernel process structures which describe the process image remain unchanged. This means that the process name reported by system utilities will be the old process name, etc. The ability to load a new process image without the direct aid of the kernel is important in many scenarios. For example: a program (e.g. shellcode) could load a binary off the wire and execute it without first creating a copy on disk; or, a program could extract a binary from an encrypted data store and execute it without creating a plain text image on the disk. Userland exec is useful for any situation where it is preferable not to create a file on the disk when executing a program. This paper will describe the design and implementation of the userland exec code. First it will discuss how execve() creates a new process image, and what this process image looks like (including the stack layout). Based on this knowledge the design of the userland exec code will be examined, using the userland exec implementation to illustrate concepts. *) This implementation of userland exec is specific to ELF binaries on i386 Linux systems, however the methodology outlined is universal across Unix-like systems. Background: When a Unix program wants to execute another program it will use the execve() system call to load the new program as a process image (creating the new process via fork() is ignored here). The man page for the exec family of function calls (execl() and the other wrappers for execve()) describes what happens: "[t]he exec family of functions replaces the current process image with a new process image." A more complete picture of what happens when a process image is replaced can be found in the execve() man page: execve() does not return on success, and the text, data, bss, and stack of the calling process are overwritten by that [sic] of the program loaded. Clearly, the program which calls execve() is taken out of memory and the new program loaded. When this happens, the .text and .data segments as well as the heap and the stack are replaced. Any implementation of userland exec would have to remove the old process' text, data and heap, and reset the stack. To design a userland exec implementation, a more detailed description of how execve() operates is required. This more detailed description of how a new process image is created from an ELF binary would be: First, execve() cleans the address space of the current process by unmapping all of the pages of memory which form the existing image. If the executable is dynamically linked -- that is, if it uses run time libraries provided by the system -- the dynamic linker is loaded into memory. The main ELF executable is then mapped into the process address space. The stack is initialized with the environmental variables and the other parameters to main(). The stack is also initialized with arguments for the dynamic linker to allow it to locate the binary in memory. Finally, the kernel determines the correct entry point: either the dynamic linker, if it is loaded, or the main binary, if it is not.The kernel then transfers execution to this entry point. This more detailed description provides a blueprint for the userland exec implementation. However, some specific details still need to be clarified: in particular the exact layout of the stack just after the image is created. The diagram below roughly illustrates this stack layout. ----------------------------------------- [ 0 ] <-- top of the stack [ envp strings ] [ 0 ] [ argv strings ] [ 0 ] [ auxv ] [ 0 ] [ envp ] [ 0 ] [ argv ] [ argc ] <-- stack pointer ----------------------------------------- At the top of the stack are the environmental and argument strings. The bottom of the stack starts with the argument count, then come the argument pointerarray and the environmental pointer array. In the middle is the Elf_aux vector. This array of data structures provides information to the dynamic linker about the process image, e.g. the location of the program header table. The dynamic linker is responsible for relocating the ELF binary, if required, as well as loading all the necessary libraries and performing run time symbol resolution. For more information on dynamic linkers see [Levine 1999], [grugq 2001] and [grugq & scut 2001]. Design: With a clear understanding of how execve() creates a new process image, it is possible to enumerate a list of steps that a userland implementation must follow. That list is: 1. Clean out the address space 2. If the binary is dynamically linked, load the dynamic linker 3. Load the binary 4. Initialize the stack 5. Determine the entry point (i.e. the dynamic linker or the main executable) 6. Transfer execution to the entry point Note: the second and third step are actually interchangeable. Implementation: Step 0, preserving arguments to main() When userland exec is invoked, it accepts arguments to the main() function of the new program. While building the new process image, userland exec will reset the stack and potentially damage those parameters. For this reason, the first thing that userland exec must do is preserve these arguments for later use. In this implementation, the arguments are stored in tables allocated with mmap(). The contents of these tables can then be copied back into the stack when it is being initialized for the new process image. Step 1, clean out the address space To prepare the address space for a new process image userland exec needs to clean out only certain address ranges, not the entire old image. The address range of primary importance is the one to which the new program binary has been relocated. On i386 Linux systems, this address is usually 0x08048000. The .text segment, and the .data segment, of the current process's binary will be mapped to this location. Following the .data segment will be the .bss, and following that, the heap. Userland exec must therefore map the old executable down, in addition to mapping down the old heap. Depending on the state of the program, the heap can be of varying size. The userland exec implementation must determine where the memory range which comprises the .text, .data and heap, begins and ends so that it can map the range down. There are several different ways of determining this range. In some instances, for a specific target, it is possible to hard-code the values. For a more generic approach, one approach might be to register a signal handler and start probing the address space, e.g. to determine the end of the heap. Using the hard coded start address of 0x08048000 and the probed heap size, userland exec could map down this range. In this implementation, userland exec uses the process maps file maintained by the kernel to determine the location and size of the process' .text and .data segments and the heap. Convieniently, under Linux, the first three lines of the /proc/self/maps file contain the .text segment, the .data segment and the heap. By mapping down the first three memory segments listed in the maps file userland exec can clean the range required for the new .text and .data segments, and the new heap. Step 2, check for, and load, the dynamic linker Determining whether a given ELF binary requires a dynamic linker, or not, is incredibly straightforward. The execve() man pages describe the exact method used by the kernel (and by userland exec) to check for a dynamic linker. If the executable is a dynamically-linked ELF executable, the interpreter named in the PT_INTERP segment is used to load the needed shared libraries. The dynamic linker is only required for those processes which utilize shared libraries from the system. If a binary requires dynamic linking, the path to the desired dynamic linker will be stored in a special program header segment: PT_INTERP. By searching the program header table of the new ELF image, userland exec can determine whether a dynamic linker is required, and, simultaneously, which dynamic linker to load. Loading the linker after this is trivial, requiring only an understanding of how ELF binaries are loaded into memory. The next section will discuss how this is done. Step 3, load the main binary The principle for loading an ELF binary is the same regardless of whether the binary is loaded from a memory buffer or off the disk. For each PT_LOAD segment in the program header table, the loader ensures that the ELF image from p_offset for p_filesz bytes is copied into a memory buffer (with p_flags permissions) at p_vaddr of p_memsz (aligned to p_align) bytes. That is, each PT_LOAD segment essentially says "this part of the ELF goes into this part of the memory image". The userland exec implementation scans the program header table and determines the total amount of memory the loaded ELF will occupy. This simply means aligning the p_memsz values of all the PT_LOAD segments using p_align, then adding them together. The userland exec code then allocates a memory range starting from the first PT_LOAD segment's p_vaddr for the total length previously calculated. With the memory now reserved, all that remains is to copy the PT_LOAD segments into their appropriate ranges and then set the segment permissions based on their p_flags. Step 4, initialize the stack With the main binary, and any dynamic linker, loaded into memory, the last major task which still remains is setting up the stack. The stack, at program start up, has a strict format examined briefly above. To initialize the stack, userland exec must reset the stack pointer. Performing this is simply a matter of calculating the size of the stack contents (the argv strings, environmental strings, auxv, etc. etc.) and subtracting this value from the stack base. With the space allocated for the new parameters, userland exec can begin creating the stack layout. To create the stack layout the userland exec implementation must extract the saved parameters for the main() function (argc, argv and envp) and copy them to the stack. This can trivially be accomplished using the saved parameters from step 0 and some simple pointer arithmetic. The Elf_Aux vector is the only data structure which requires some explanation. This vector contains data which the dynamic linker needs in order to integrate with the process image correctly. There are only a few key pieces of information required by the dynamic linker to perform its task. These are: its own load address (AT_BASE); the location of the program header table (AT_PHDR), as well as the number of program headers (AT_PHNUM); the size of a page of memory (AT_PAGESZ); any flags to alter its run time behavior (AT_FLAGS), and the entry point for the main executable (AT_ENTRY). Step 5, determine entry point Locating the correct entry point for the new process image is very straight forward. If a dynamic linker was loaded, then the entry point is the load address of the linker plus the e_entry value, otherwise it is the main ELF's e_entry. Step 6, transfer control of execution The final step is to start the new process image running by transferring control of execution to the entry point located during step 5. These six steps are all that are required to load an arbitrary ELF binary into an address space and execute it. Conclusion: It has been shown that the execve() system call can be mimicked in userland with a minimum of difficulty. Creating a new process image is a simple matter of: cleaning out the address space; checking for, and loading, the dynamic linker; loading the binary; initializing the stack; determining the entry point and, transferring control of execution. By following these six steps,a new process image can be created and run. Using this implementation as a start point, it is possible to create both smaller, and also more elaborate,userland exec implementations. History of userland exec: The author first developed a userland exec implementation in early 2002. The initial proof of concept implementation demonstrated that it was indeed possible to mimic execve() in userland. However, several limitations in the implementation made it unfit for public release. Recently, some people have posted incredibly crude mechanisms for executing a new binary in an address space occupied by another process image. Given the severe limitations of this technique, and the request of a friend for the old proof of concept implementation, the author felt compelled to implement a fully functional version of userland exec. Design and implementation took less than 48 hours to complete. Acknowledgments: As always, the creation of any work is not possible without the help of other individuals. Here, then, is a list of those who've been involved in some capacity or another in making userland exec: mammon_; gera; duke, and Grendel, PhD. Additionally thanks to those ppl who made .sg so enjoyable Halvar & Evelyn, SK, Saumil, Charl and Dave Aitel. And all the ppl who came to my talk. Bibliography: http://www.phrack.org/phrack/58/p58-0x05 grugq & scut, 2001 http://downloads.securityfocus.com/library/subversiveld.pdf grugq, 2001 http://www.iecc.com/linker/ levine, 1999