Reverse Engineering binary executables - Part 1
This is my attempt to reverse engineer binaries with hexdump and otool, and to understand how things work.
For this I’m going to use three Hello World binaries, one written in C, one in Go, and one in Rust, compiled on a MacBook Pro running macOS, whose kernel is built on Mach.
// C
#include <stdio.h>
int main(int argc, char** argv) {
    printf("Hello World");
    return 0;
}
// Go
package main
import "fmt"
func main() {
    fmt.Println("Hello World")
}
// Rust
fn main() {
    println!("Hello World!");
}
I’m going in blind with this, let’s see what we find.
Mach-O
First we have to learn about Mach-O.
Mach-O is the executable format used by macOS and other systems built on the Mach kernel.
Every executable we create will have the following layout:
- Headers
- Load Commands
- Data
Headers
Understanding the headers
struct mach_header_64 {
    uint32_t magic;           /* mach magic number identifier */
    cpu_type_t cputype;       /* cpu specifier */
    cpu_subtype_t cpusubtype; /* machine specifier */
    uint32_t filetype;        /* type of file */
    uint32_t ncmds;           /* number of load commands */
    uint32_t sizeofcmds;      /* the size of all the load commands */
    uint32_t flags;           /* flags */
    uint32_t reserved;        /* reserved */
};
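- cf fa ed fe is called the magic number, indicating the file format. The bytes are stored little-endian, so we read this as 0xfeedfacf
- ncmds and sizeofcmds describe the next part of the layout: the number of load commands and their total size in bytes
- filetype represents the type of Mach-O file (2 for an executable, 6 for a dylib). As an example, running otool on a dylib: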
❯ otool -h libeuler.so
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
0xfeedfacf 16777223 3 0x00 6 12 664 0x00100085
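To make the byte order and the filetype values concrete, here is a minimal Python sketch (Python is an arbitrary choice here, not part of the original tooling). It reads the four magic bytes as a little-endian integer and names the two file types seen in this post. FILETYPE_NAMES and describe_filetype are hypothetical helpers, and the table deliberately covers only those two values.

```python
# Decode the first four bytes of a Mach-O file as a little-endian integer.
magic_bytes = bytes.fromhex("cf fa ed fe")
magic = int.from_bytes(magic_bytes, "little")

# Hypothetical lookup table; only the two file types discussed in this post.
FILETYPE_NAMES = {2: "MH_EXECUTE", 6: "MH_DYLIB"}


def describe_filetype(filetype: int) -> str:
    """Return a readable name for a filetype value, if it is in the table."""
    return FILETYPE_NAMES.get(filetype, f"unknown ({filetype})")


print(hex(magic))
print(describe_filetype(2))
print(describe_filetype(6))
```

Any other filetype value falls through to the "unknown" branch rather than being guessed at.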
Checking the headers of the binaries with otool
Let’s read the headers with the help of otool
otool -h main
// C
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
0xfeedfacf 16777223 3 0x00 2 16 1368 0x00200085
// Go
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
0xfeedfacf 16777223 3 0x00 2 11 2512 0x00000000
// Rust
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
0xfeedfacf 16777223 3 0x00 2 17 2064 0x00a00085
Raw binary
Go
00000000 cf fa ed fe 07 00 00 01 03 00 00 00 02 00 00 00 |................|
00000010 0b 00 00 00 d0 09 00 00 00 00 00 00 00 00 00 00 |................|
C
00000000 cf fa ed fe 07 00 00 01 03 00 00 00 02 00 00 00 |................|
00000010 10 00 00 00 58 05 00 00 85 00 20 00 00 00 00 00 |....X..... .....|
Rust
00000000 cf fa ed fe 07 00 00 01 03 00 00 00 02 00 00 00 |................|
00000010 11 00 00 00 10 08 00 00 85 00 a0 00 00 00 00 00 |................|
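The hex below is the header of the Rust binary
cf fa ed fe 07 00 00 01 03 00 00 00 02 00 00 00
11 00 00 00 10 08 00 00 85 00 a0 00 00 00 00 00
Every field in the header is a 4-byte little-endian value:
- cf fa ed fe - magic (0xfeedfacf)
- 07 00 00 01 - cputype (0x01000007 = 16777223, x86_64)
- 03 00 00 00 - cpusubtype (3); otool prints the top byte of this field separately as caps (0x00)
- 02 00 00 00 - filetype (2, executable)
- 11 00 00 00 - ncmds (17)
- 10 08 00 00 - sizeofcmds (0x810 = 2064)
- 85 00 a0 00 - flags (0x00a00085)
- 00 00 00 00 - reserved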
Note: I read the header from the binary by hand, so there could be some errors in the above. If you want to understand the structure, I suggest mapping the hexdump of the header onto the struct field by field.
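Mapping the hexdump onto the struct can also be done mechanically. The following is a minimal sketch, assuming Python's struct module and the eight-field layout of mach_header_64 shown earlier. MachHeader64 and parse_header are hypothetical names introduced for illustration, and the input bytes are the first 32 bytes of the Rust binary from the raw dump.

```python
import struct
from collections import namedtuple

# Field names follow struct mach_header_64; the layout is eight 32-bit fields.
MachHeader64 = namedtuple(
    "MachHeader64",
    "magic cputype cpusubtype filetype ncmds sizeofcmds flags reserved",
)
HEADER_FORMAT = "<IiiIIIII"  # little-endian; cputype and cpusubtype are signed


def parse_header(raw: bytes) -> MachHeader64:
    """Unpack the first 32 bytes of a 64-bit Mach-O file."""
    size = struct.calcsize(HEADER_FORMAT)
    return MachHeader64(*struct.unpack(HEADER_FORMAT, raw[:size]))


# The first 32 bytes of the Rust binary, copied from the hexdump.
rust_header = bytes.fromhex(
    "cf fa ed fe 07 00 00 01 03 00 00 00 02 00 00 00 "
    "11 00 00 00 10 08 00 00 85 00 a0 00 00 00 00 00"
)
header = parse_header(rust_header)
print(header.ncmds, header.sizeofcmds, hex(header.flags))
```

The decoded ncmds, sizeofcmds and flags can then be compared against the otool -h output for the same binary.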
Load Commands
The Load Commands describe, among other things, the Segments of the binary and the Sections inside them. This is quite a huge topic, which I’ll cover in a separate blog post.
I’m skipping a lot of lines, and I’ll use the Load Commands from a single binary. Let’s explore some of the interesting ones here:
- Load command 0 is __PAGEZERO, the first segment of the executable file. It is located at virtual address 0 and grants no access permissions (maxprot and initprot are both 0), so dereferencing a NULL pointer crashes
- The __TEXT segment contains executable code and read-only data
  - The __text section holds the machine code
- The __DATA segment contains variables
- Loading of the dynamic linker and the shared libraries is also mentioned in the load commands (LC_LOAD_DYLINKER and LC_LOAD_DYLIB)
Load command 0
cmd LC_SEGMENT_64
cmdsize 72
segname __PAGEZERO
vmaddr 0x0000000000000000
vmsize 0x0000000100000000
fileoff 0
filesize 0
maxprot 0x00000000
initprot 0x00000000
nsects 0
flags 0x0
Load command 1
cmd LC_SEGMENT_64
cmdsize 472
segname __TEXT
vmaddr 0x0000000100000000
vmsize 0x0000000000001000
fileoff 0
filesize 4096
maxprot 0x00000005
initprot 0x00000005
nsects 5
flags 0x0
Section
sectname __text
segname __TEXT
addr 0x0000000100000f50
size 0x0000000000000031
offset 3920
align 2^4 (16)
reloff 0
nreloc 0
flags 0x80000400
reserved1 0
reserved2 0
Load command 8
cmd LC_LOAD_DYLINKER
cmdsize 32
name /usr/lib/dyld (offset 12)
Load command 13
cmd LC_LOAD_DYLIB
cmdsize 56
name /usr/lib/libSystem.B.dylib (offset 24)
time stamp 2 Thu Jan 1 05:30:02 1970
current version 1281.100.1
compatibility version 1.0.0
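The segment and section fields above are enough to translate a virtual address into a position in the file. A minimal sketch, assuming the usual relationship file offset = address - segment vmaddr + segment fileoff. vm_to_file_offset is a hypothetical helper name, and the constants are copied from the __TEXT segment and __text section printed above.

```python
def vm_to_file_offset(addr: int, seg_vmaddr: int, seg_fileoff: int) -> int:
    """Translate a virtual address inside a segment to an offset in the file."""
    return addr - seg_vmaddr + seg_fileoff


# Values taken from the __TEXT segment and its __text section.
TEXT_VMADDR = 0x0000000100000000
TEXT_FILEOFF = 0
TEXT_SECTION_ADDR = 0x0000000100000F50

offset = vm_to_file_offset(TEXT_SECTION_ADDR, TEXT_VMADDR, TEXT_FILEOFF)
print(offset, hex(offset))
```

The result agrees with the offset field otool reports for the __text section (3920), which is a quick sanity check that the fields are being read correctly.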
Shared libraries
> otool -L main
// C
main:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.100.1)
// Go
main:
/usr/lib/libSystem.B.dylib (compatibility version 0.0.0, current version 0.0.0)
// Rust
main:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.100.1)
/usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)
Note: The Rust binary seems to link against libresolv as well
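The version numbers otool prints are stored in the LC_LOAD_DYLIB command as packed 32-bit integers: two bytes of major version, one byte of minor, one byte of patch. A minimal sketch of the decoding follows. decode_dylib_version is a hypothetical helper, and the input bytes 01 64 01 05 and 00 00 01 00 are the ones that sit just before the libSystem path in the raw dump of the C binary.

```python
def decode_dylib_version(packed: int) -> str:
    """Split a packed 32-bit version into major.minor.patch."""
    major = packed >> 16
    minor = (packed >> 8) & 0xFF
    patch = packed & 0xFF
    return f"{major}.{minor}.{patch}"


# Both fields are little-endian 32-bit integers in the file.
current = int.from_bytes(bytes.fromhex("01 64 01 05"), "little")
compat = int.from_bytes(bytes.fromhex("00 00 01 00"), "little")
print(decode_dylib_version(current))
print(decode_dylib_version(compat))
```

In the Go binary the same two fields are all zero bytes, which lines up with the 0.0.0 versions otool reports for it.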
Raw data
Below are the raw bytes containing the references to __PAGEZERO and __TEXT, and the shared libraries
Go
00000020 19 00 00 00 48 00 00 00 5f 5f 50 41 47 45 5a 45 |....H...__PAGEZE|
00000030 52 4f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |RO..............|
00000040 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000060 00 00 00 00 00 00 00 00 19 00 00 00 78 02 00 00 |............x...|
00000070 5f 5f 54 45 58 54 00 00 00 00 00 00 00 00 00 00 |__TEXT..........|
00000080 00 00 00 01 00 00 00 00 00 70 16 00 00 00 00 00 |.........p......|
...
00000990 00 00 00 00 00 00 00 00 0e 00 00 00 20 00 00 00 |............ ...|
000009a0 0c 00 00 00 2f 75 73 72 2f 6c 69 62 2f 64 79 6c |..../usr/lib/dyl|
000009b0 64 00 00 00 00 00 00 00 0c 00 00 00 38 00 00 00 |d...........8...|
000009c0 18 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000009d0 2f 75 73 72 2f 6c 69 62 2f 6c 69 62 53 79 73 74 |/usr/lib/libSyst|
000009e0 65 6d 2e 42 2e 64 79 6c 69 62 00 00 00 00 00 00 |em.B.dylib......|
000009f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
C
00000020 19 00 00 00 48 00 00 00 5f 5f 50 41 47 45 5a 45 |....H...__PAGEZE|
00000030 52 4f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |RO..............|
00000040 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000060 00 00 00 00 00 00 00 00 19 00 00 00 d8 01 00 00 |................|
00000070 5f 5f 54 45 58 54 00 00 00 00 00 00 00 00 00 00 |__TEXT..........|
00000080 00 00 00 00 01 00 00 00 00 10 00 00 00 00 00 00 |................|
...
00000490 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000004a0 0e 00 00 00 20 00 00 00 0c 00 00 00 2f 75 73 72 |.... ......./usr|
000004b0 2f 6c 69 62 2f 64 79 6c 64 00 00 00 00 00 00 00 |/lib/dyld.......|
000004c0 1b 00 00 00 18 00 00 00 0b a2 a0 69 ef b8 32 81 |...........i..2.|
000004d0 87 55 72 32 fb 24 ba d3 32 00 00 00 20 00 00 00 |.Ur2.$..2... ...|
000004e0 01 00 00 00 00 0f 0a 00 04 0f 0a 00 01 00 00 00 |................|
000004f0 03 00 00 00 00 06 2c 02 2a 00 00 00 10 00 00 00 |......,.*.......|
00000500 00 00 00 00 00 00 00 00 28 00 00 80 18 00 00 00 |........(.......|
00000510 50 0f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |P...............|
00000520 0c 00 00 00 38 00 00 00 18 00 00 00 02 00 00 00 |....8...........|
00000530 01 64 01 05 00 00 01 00 2f 75 73 72 2f 6c 69 62 |.d....../usr/lib|
00000540 2f 6c 69 62 53 79 73 74 65 6d 2e 42 2e 64 79 6c |/libSystem.B.dyl|
00000550 69 62 00 00 00 00 00 00 26 00 00 00 10 00 00 00 |ib......&.......|
Rust
00000020 19 00 00 00 48 00 00 00 5f 5f 50 41 47 45 5a 45 |....H...__PAGEZE|
00000030 52 4f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |RO..............|
00000040 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000060 00 00 00 00 00 00 00 00 19 00 00 00 c8 02 00 00 |................|
00000070 5f 5f 54 45 58 54 00 00 00 00 00 00 00 00 00 00 |__TEXT..........|
00000080 00 00 00 00 01 00 00 00 00 40 02 00 00 00 00 00 |.........@......|
...
00000710 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000720 0e 00 00 00 20 00 00 00 0c 00 00 00 2f 75 73 72 |.... ......./usr|
00000730 2f 6c 69 62 2f 64 79 6c 64 00 00 00 00 00 00 00 |/lib/dyld.......|
00000740 1b 00 00 00 18 00 00 00 2f 73 d5 8d 39 42 31 ab |......../s..9B1.|
00000750 b8 94 b4 f9 ec d5 8e 55 32 00 00 00 20 00 00 00 |.......U2... ...|
00000760 01 00 00 00 00 0f 0a 00 04 0f 0a 00 01 00 00 00 |................|
00000770 03 00 00 00 00 06 2c 02 2a 00 00 00 10 00 00 00 |......,.*.......|
00000780 00 00 00 00 00 00 00 00 28 00 00 80 18 00 00 00 |........(.......|
00000790 60 11 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |`...............|
000007a0 0c 00 00 00 38 00 00 00 18 00 00 00 02 00 00 00 |....8...........|
000007b0 01 64 01 05 00 00 01 00 2f 75 73 72 2f 6c 69 62 |.d....../usr/lib|
000007c0 2f 6c 69 62 53 79 73 74 65 6d 2e 42 2e 64 79 6c |/libSystem.B.dyl|
000007d0 69 62 00 00 00 00 00 00 0c 00 00 00 38 00 00 00 |ib..........8...|
000007e0 18 00 00 00 02 00 00 00 00 00 01 00 00 00 01 00 |................|
000007f0 2f 75 73 72 2f 6c 69 62 2f 6c 69 62 72 65 73 6f |/usr/lib/libreso|
00000800 6c 76 2e 39 2e 64 79 6c 69 62 00 00 00 00 00 00 |lv.9.dylib......|
00000810 26 00 00 00 10 00 00 00 f0 9f 02 00 28 02 00 00 |&...........(...|
00000820 29 00 00 00 10 00 00 00 18 a2 02 00 d8 00 00 00 |)...............|
00000830 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
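The raw bytes of a segment load command can be unpacked the same way as the header. The sketch below assumes the segment_command_64 layout (cmd, cmdsize, a 16-byte segname, four 64-bit fields, then four 32-bit fields, 72 bytes in total). parse_segment is a hypothetical helper, and the input is bytes 0x20 through 0x67 of the C binary from the dump above.

```python
import struct

SEGMENT_FORMAT = "<II16sQQQQiiII"  # layout of struct segment_command_64


def parse_segment(raw: bytes) -> dict:
    """Unpack one LC_SEGMENT_64 load command into a dict of its fields."""
    names = ("cmd", "cmdsize", "segname", "vmaddr", "vmsize", "fileoff",
             "filesize", "maxprot", "initprot", "nsects", "flags")
    size = struct.calcsize(SEGMENT_FORMAT)
    fields = dict(zip(names, struct.unpack(SEGMENT_FORMAT, raw[:size])))
    # segname is a fixed 16-byte field padded with NUL bytes.
    fields["segname"] = fields["segname"].rstrip(b"\x00").decode("ascii")
    return fields


# Bytes 0x20 through 0x67 of the C binary, copied from the dump.
c_pagezero = bytes.fromhex(
    "19 00 00 00 48 00 00 00 5f 5f 50 41 47 45 5a 45 "
    "52 4f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00"
)
segment = parse_segment(c_pagezero)
print(segment["segname"], segment["cmdsize"], hex(segment["vmsize"]))
```

Since cmdsize gives the length of the current command, adding it to the command's starting offset gives the start of the next one, which is how the whole list of load commands can be walked.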
Data
The content of this section varies depending on what created the binary and how. I’ll go over each binary separately in detail later, but for now, some highlights:
C
This one is straightforward: I found the Hello World string within the binary
00000fa0 ff ff 48 65 6c 6c 6f 20 57 6f 72 6c 64 00 00 00 |..Hello World...|
I also found references to main and printf
00003000 11 23 00 51 00 00 00 00 11 40 64 79 6c 64 5f 73 |.#.Q.....@dyld_s|
00003010 74 75 62 5f 62 69 6e 64 65 72 00 51 72 00 90 00 |tub_binder.Qr...|
00003020 73 00 11 40 5f 70 72 69 6e 74 66 00 90 00 00 00 |s..@_printf.....|
00003030 00 01 5f 00 05 00 02 5f 6d 68 5f 65 78 65 63 75 |.._...._mh_execu|
00003040 74 65 5f 68 65 61 64 65 72 00 21 6d 61 69 6e 00 |te_header.!main.|
00003050 25 02 00 00 00 03 00 d0 1e 00 00 00 00 00 00 00 |%...............|
Go
Go has a build ID in the binary
00001000 ff 20 47 6f 20 62 75 69 6c 64 20 49 44 3a 20 22 |. Go build ID: "|
00001010 67 68 73 2d 76 57 69 35 5f 34 6f 39 54 6a 4f 4e |ghs-vWi5_4o9TjON|
00001020 77 30 62 66 2f 79 43 36 61 51 6a 67 62 31 35 6e |w0bf/yC6aQjgb15n|
00001030 57 51 45 70 6d 43 59 79 48 2f 6e 45 77 6f 67 77 |WQEpmCYyH/nEwogw|
00001040 6b 55 6b 4b 39 63 42 72 6e 6e 6d 55 4e 4b 2f 4c |kUkK9cBrnnmUNK/L|
00001050 4f 59 34 75 38 4a 52 53 6c 4b 54 47 50 4e 6e 78 |OY4u8JRSlKTGPNnx|
00001060 67 70 6b 22 0a 20 ff cc cc cc cc cc cc cc cc cc |gpk". ..........|
The binary also seems to record the source files used to create it, and those of its dependencies (in this case fmt).
...
00163670 68 65 6c 6c 6f 2d 77 6f 72 6c 64 2d 67 6f 2f 6d |hello-world-go/m|
00163680 61 69 6e 2e 67 6f 00 2f 55 73 65 72 73 2f 6e 69 |ain.go./Users/ni|
...
001636a0 74 61 6c 6c 73 2f 67 6f 6c 61 6e 67 2f 31 2e 31 |talls/golang/1.1|
001636b0 34 2e 36 2f 67 6f 2f 73 72 63 2f 66 6d 74 2f 73 |4.6/go/src/fmt/s|
001636c0 63 61 6e 2e 67 6f 00 2f 55 73 65 72 73 2f 6e 69 |can.go./Users/ni|
And the Hello World string. Interestingly, there don’t seem to be any bytes separating the nearby strings; Go strings carry an explicit length instead of a NUL terminator, so none are needed
000cd870 54 52 41 43 45 42 41 43 4b 48 65 6c 6c 6f 20 57 |TRACEBACKHello W|
000cd880 6f 72 6c 64 49 64 65 6f 67 72 61 70 68 69 63 4d |orldIdeographicM|
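The difference between the C and Go string layouts shows up if we search for NUL-terminated strings only. This is a minimal sketch of that idea, not how the strings tool itself works. nul_terminated_strings is a hypothetical helper, and the two inputs are the bytes from the C and Go dumps shown in this section.

```python
def nul_terminated_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable ASCII runs that are immediately followed by a NUL byte."""
    found, run = [], bytearray()
    for byte in data:
        if 0x20 <= byte < 0x7F:
            run.append(byte)
            continue
        # A run only counts if the byte that ends it is the NUL terminator.
        if byte == 0 and len(run) >= min_len:
            found.append(run.decode("ascii"))
        run.clear()
    return found


c_bytes = bytes.fromhex("ff ff 48 65 6c 6c 6f 20 57 6f 72 6c 64 00 00 00")
go_bytes = bytes.fromhex(
    "54 52 41 43 45 42 41 43 4b 48 65 6c 6c 6f 20 57 "
    "6f 72 6c 64 49 64 65 6f 67 72 61 70 68 69 63 4d"
)
print(nul_terminated_strings(c_bytes))
print(nul_terminated_strings(go_bytes))
```

The C bytes yield the Hello World string on its own, while the Go bytes yield nothing, because the run of text never hits a terminator. Recovering individual Go strings would need the lengths stored elsewhere in the binary.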
Rust
I expected this to be straightforward like the C binary, and yes, I found the Hello World!
00020a60 48 65 6c 6c 6f 20 57 6f 72 6c 64 21 0a 00 00 00 |Hello World!....|
This region of the binary also seems to be where other strings are stored, and I found a couple of error message strings here
00020aa0 61 6c 72 65 61 64 79 20 62 6f 72 72 6f 77 65 64 |already borrowed|
00020ab0 63 6f 6e 6e 65 63 74 69 6f 6e 20 72 65 73 65 74 |connection reset|
00020ac0 65 6e 74 69 74 79 20 6e 6f 74 20 66 6f 75 6e 64 |entity not found|
Summary
For a first pass over the binaries, we’ve found some really good stuff. There is still a lot to explore in the Load Commands and Data sections.