
Mach-O File Analysis - header

Original post: https://chihoc.com/mach-o-header/
Reposting is welcome; please credit the source. Thanks.

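I've recently wanted to play around with reverse engineering, and jumping straight to hooking is not the way to start. I splurged on a Hopper license (Hopper was recently updated to v4). Let's begin by writing the simplest possible demo and analyzing its executable.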

#include <stdio.h>
int main(int argc, char *argv[]) {
    printf("Hello World!\n");
    return 0;
}

The demo could hardly be simpler: a single main function and one line of output.

Compile it from the command line with xcrun clang hello.c, which produces the executable a.out.

Locate the resulting executable and drag it straight into Hopper for analysis.

Mach-O

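Mach-O is short for the Mach Object file format, a format used for executables, object code, dynamic libraries, and core dumps. As a replacement for the a.out format, Mach-O offers better extensibility and faster access to the information in the symbol table.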

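A typical Mach-O file consists of three parts:

· Header: specifies the file's target architecture and runtime environment, e.g. PPC, PPC64, IA-32, x86-64. It also stores the CPU type, the number of load commands, and so on.

· Load commands: specify the file's logical structure and its layout in virtual memory, including segment locations, the symbol table, the dynamic symbol table, and so on. Each load command carries metadata such as its command type, name, and location.

· Raw segment data: the raw data used by the segments. This is usually the largest part of the file.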

Definitions alone are dry, so let's walk through each part using the example above.

Mach-O header

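The C structure definitions for Mach-O can be found on the system under /usr/include/mach-o/.h.

Above is the Mach-O header as parsed by Hopper. The header's main job is to let the system quickly determine the runtime environment the Mach-O file targets.

The corresponding structures in loader.h are: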

/*
 * The 32-bit mach header appears at the very beginning of the object file for
 * 32-bit architectures.
 */
struct mach_header {
    uint32_t    magic;      /* mach magic number identifier */
    cpu_type_t  cputype;    /* cpu specifier */
    cpu_subtype_t   cpusubtype; /* machine specifier */
    uint32_t    filetype;   /* type of file */
    uint32_t    ncmds;      /* number of load commands */
    uint32_t    sizeofcmds; /* the size of all the load commands */
    uint32_t    flags;      /* flags */
};
/*
 * The 64-bit mach header appears at the very beginning of object files for
 * 64-bit architectures.
 */
struct mach_header_64 {
    uint32_t    magic;      /* mach magic number identifier */
    cpu_type_t  cputype;    /* cpu specifier */
    cpu_subtype_t   cpusubtype; /* machine specifier */
    uint32_t    filetype;   /* type of file */
    uint32_t    ncmds;      /* number of load commands */
    uint32_t    sizeofcmds; /* the size of all the load commands */
    uint32_t    flags;      /* flags */
    uint32_t    reserved;   /* reserved */
};
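As a rough illustration of how such a structure is filled in, the sketch below copies the first 32 bytes of a buffer into a header. It assumes a local mirror of mach_header_64 (the demo_ names are hypothetical, chosen so the code compiles without the macOS SDK; on macOS one would include <mach-o/loader.h> instead). It does no byte swapping, so it only makes sense when the file's byte order matches the host's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of mach_header_64 (assumption: cpu_type_t and
 * cpu_subtype_t are signed 32-bit integers, as on macOS). */
struct demo_mach_header_64 {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;
    uint32_t sizeofcmds;
    uint32_t flags;
    uint32_t reserved;
};

/* Copy the header from the start of buf into *out.
 * Returns 0 on success, -1 if the buffer is too short. */
int demo_read_header_64(const uint8_t *buf, size_t len,
                        struct demo_mach_header_64 *out)
{
    assert(buf != NULL && out != NULL);
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    return 0;
}
```

In practice buf would hold the first bytes of a.out, read with fread or mapped with mmap.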

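Let's go through the fields in order:

magic (magic number)

An integer. MH_MAGIC (0xfeedface) means the byte order the file was built with matches that of the host CPU; MH_CIGAM (0xcefaedfe) means the byte order is the opposite of the host's. Likewise, 64-bit files use MH_MAGIC_64 (0xfeedfacf) and MH_CIGAM_64 (0xcffaedfe).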

/* Constant for the magic field of the mach_header (32-bit architectures) */
#define MH_MAGIC    0xfeedface  /* the mach magic number */
#define MH_CIGAM    0xcefaedfe  /* NXSwapInt(MH_MAGIC) */
/* Constant for the magic field of the mach_header_64 (64-bit architectures) */
#define MH_MAGIC_64 0xfeedfacf /* the 64-bit mach magic number */
#define MH_CIGAM_64 0xcffaedfe /* NXSwapInt(MH_MAGIC_64) */
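A minimal sketch of how a reader might act on the magic value: it reports whether the header is the 32-bit or 64-bit form and whether the remaining fields need byte swapping. The DEMO_ constants repeat the values listed above; the function names are hypothetical and not part of loader.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same values as in the loader.h excerpt; the DEMO_ prefix is ours. */
#define DEMO_MH_MAGIC    0xfeedfaceu
#define DEMO_MH_CIGAM    0xcefaedfeu
#define DEMO_MH_MAGIC_64 0xfeedfacfu
#define DEMO_MH_CIGAM_64 0xcffaedfeu

/* Reverse the byte order of a 32-bit value. */
uint32_t demo_swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Returns 32 or 64 for a recognized Mach-O magic, 0 otherwise.
 * *swapped is set to 1 when the file's byte order is the opposite
 * of the host's, so every other header field must be swapped. */
int demo_classify_magic(uint32_t magic, int *swapped)
{
    assert(swapped != NULL);
    *swapped = (magic == DEMO_MH_CIGAM || magic == DEMO_MH_CIGAM_64);
    switch (magic) {
    case DEMO_MH_MAGIC:
    case DEMO_MH_CIGAM:
        return 32;
    case DEMO_MH_MAGIC_64:
    case DEMO_MH_CIGAM_64:
        return 64;
    default:
        return 0;
    }
}
```

When swapped comes back as 1, a field such as ncmds would be passed through demo_swap32 before use.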

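cputype (CPU type)

An integer identifying the CPU architecture the file is meant to run on, such as CPU_TYPE_ARM. The CPU_ARCH_ABI64 bit is OR-ed in for 64-bit variants.

cpusubtype (CPU subtype)

An integer identifying the specific CPU variant within that architecture, such as CPU_SUBTYPE_ARM_V6 or CPU_SUBTYPE_ARM_V7.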
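To make the split between the base CPU type and the ABI bit concrete, here is a small sketch. The constant values mirror those in <mach/machine.h> (x86 = 7, ARM = 12, PowerPC = 18, CPU_ARCH_ABI64 = 0x01000000); the struct and function names are hypothetical. Only the 64-bit ABI bit is handled, so other architecture bits are reported as not 64-bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CPU_ARCH_MASK    0xff000000u  /* top byte: ABI / capability bits */
#define DEMO_CPU_ARCH_ABI64   0x01000000u  /* 64-bit ABI */
#define DEMO_CPU_TYPE_X86     7u
#define DEMO_CPU_TYPE_ARM     12u
#define DEMO_CPU_TYPE_POWERPC 18u

struct demo_cpu_info {
    uint32_t    base;      /* cputype with the architecture bits cleared */
    int         is_64bit;  /* 1 when the top byte is exactly CPU_ARCH_ABI64 */
    const char *name;      /* short human-readable name */
};

/* Split a cputype value into its base type and 64-bit flag. */
void demo_decode_cputype(int32_t cputype, struct demo_cpu_info *out)
{
    uint32_t raw = (uint32_t)cputype;

    assert(out != NULL);
    out->base = raw & ~DEMO_CPU_ARCH_MASK;
    out->is_64bit = (raw & DEMO_CPU_ARCH_MASK) == DEMO_CPU_ARCH_ABI64;

    switch (out->base) {
    case DEMO_CPU_TYPE_X86:
        out->name = out->is_64bit ? "x86_64" : "i386";
        break;
    case DEMO_CPU_TYPE_ARM:
        out->name = out->is_64bit ? "arm64" : "arm";
        break;
    case DEMO_CPU_TYPE_POWERPC:
        out->name = out->is_64bit ? "ppc64" : "ppc";
        break;
    default:
        out->name = "unknown";
        break;
    }
}
```

The cpusubtype is then interpreted relative to the base type; for ARM, for example, CPU_SUBTYPE_ARM_V6 is 6 and CPU_SUBTYPE_ARM_V7 is 9.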

filetype (file type)

An integer identifying the type of file, such as MH_OBJECT, MH_EXECUTE, or MH_BUNDLE.

#define MH_OBJECT   0x1     /* relocatable object file */
#define MH_EXECUTE  0x2     /* demand paged executable file */
#define MH_FVMLIB   0x3     /* fixed VM shared library file */
#define MH_CORE     0x4     /* core file */
#define MH_PRELOAD  0x5     /* preloaded executable file */
#define MH_DYLIB    0x6     /* dynamically bound shared library */
#define MH_DYLINKER 0x7     /* dynamic link editor */
#define MH_BUNDLE   0x8     /* dynamically bound bundle file */
#define MH_DYLIB_STUB   0x9     /* shared library stub for static */
                    /*  linking only, no section contents */
#define MH_DSYM     0xa     /* companion file with only debug */
                    /*  sections */
#define MH_KEXT_BUNDLE  0xb     /* x86_64 kexts */
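A small lookup table is one way to turn the filetype value into a readable name when dumping a header. The sketch below covers only the constants listed above; demo_filetype_name is a hypothetical helper, not something loader.h provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a filetype value (0x1 .. 0xb) to its constant name.
 * Values outside that range yield "unknown". */
const char *demo_filetype_name(uint32_t filetype)
{
    static const char *const names[] = {
        NULL,             /* 0x0 is not a defined file type */
        "MH_OBJECT",      /* 0x1 */
        "MH_EXECUTE",     /* 0x2 */
        "MH_FVMLIB",      /* 0x3 */
        "MH_CORE",        /* 0x4 */
        "MH_PRELOAD",     /* 0x5 */
        "MH_DYLIB",       /* 0x6 */
        "MH_DYLINKER",    /* 0x7 */
        "MH_BUNDLE",      /* 0x8 */
        "MH_DYLIB_STUB",  /* 0x9 */
        "MH_DSYM",        /* 0xa */
        "MH_KEXT_BUNDLE"  /* 0xb */
    };
    const size_t count = sizeof names / sizeof names[0];

    if (filetype == 0 || filetype >= count)
        return "unknown";
    assert(names[filetype] != NULL);  /* every in-range slot is filled */
    return names[filetype];
}
```

For the a.out built from hello.c, the expected value would be 0x2, i.e. MH_EXECUTE.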

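ncmds (number of load commands)

An integer giving the number of load commands in the file.

sizeofcmds (size of the load commands)

An integer giving the total size, in bytes, of all the load commands.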
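Since the load commands follow the header directly, these two fields are enough to work out which byte range they occupy. The sketch below computes that range and applies two sanity checks. It assumes header sizes of 28 bytes (mach_header) and 32 bytes (mach_header_64), which follow from the structs above, and that every load command is at least 8 bytes (its cmd and cmdsize fields); the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the byte range [*start, *end) occupied by the load commands.
 * Returns 0 if the values are plausible for a file of file_size bytes,
 * -1 otherwise (the outputs are left untouched in that case). */
int demo_load_cmds_range(int is_64bit, uint32_t ncmds, uint32_t sizeofcmds,
                         uint64_t file_size, uint64_t *start, uint64_t *end)
{
    const uint64_t header_size = is_64bit ? 32u : 28u;
    const uint64_t min_cmd_size = 8u;  /* uint32_t cmd + uint32_t cmdsize */

    assert(start != NULL && end != NULL);

    /* ncmds commands cannot fit in fewer than ncmds * 8 bytes. */
    if ((uint64_t)ncmds * min_cmd_size > sizeofcmds)
        return -1;
    /* The commands must lie entirely inside the file. */
    if (header_size + sizeofcmds > file_size)
        return -1;

    *start = header_size;
    *end = header_size + sizeofcmds;
    return 0;
}
```

A parser would then walk from start to end, advancing by each command's cmdsize, and stop after ncmds commands.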

flags

An integer holding a series of bit flags that describe properties of the file, such as MH_NOUNDEFS and MH_INCRLINK.

#define MH_NOUNDEFS 0x1     /* the object file has no undefined
                       references */
#define MH_INCRLINK 0x2     /* the object file is the output of an
                       incremental link against a base file
                       and can't be link edited again */
#define MH_DYLDLINK 0x4     /* the object file is input for the
                       dynamic linker and can't be staticly
                       link edited again */
#define MH_BINDATLOAD   0x8     /* the object file's undefined
                       references are bound by the dynamic
                       linker when loaded. */
#define MH_PREBOUND 0x10        /* the file has its dynamic undefined
                       references prebound. */
#define MH_SPLIT_SEGS   0x20        /* the file has its read-only and
                       read-write segments split */
#define MH_LAZY_INIT    0x40        /* the shared library init routine is
                       to be run lazily via catching memory
                       faults to its writeable segments
                       (obsolete) */
#define MH_TWOLEVEL 0x80        /* the image is using two-level name
                       space bindings */
#define MH_FORCE_FLAT   0x100       /* the executable is forcing all images
                       to use flat name space bindings */
#define MH_NOMULTIDEFS  0x200       /* this umbrella guarantees no multiple
                       defintions of symbols in its
                       sub-images so the two-level namespace
                       hints can always be used. */
#define MH_NOFIXPREBINDING 0x400    /* do not have dyld notify the
                       prebinding agent about this
                       executable */
#define MH_PREBINDABLE  0x800           /* the binary is not prebound but can
                       have its prebinding redone. only used
                                           when MH_PREBOUND is not set. */
#define MH_ALLMODSBOUND 0x1000      /* indicates that this binary binds to
                                           all two-level namespace modules of
                       its dependent libraries. only used
                       when MH_PREBINDABLE and MH_TWOLEVEL
                       are both set. */ 
#define MH_SUBSECTIONS_VIA_SYMBOLS 0x2000/* safe to divide up the sections into
                        sub-sections via symbols for dead
                        code stripping */
#define MH_CANONICAL    0x4000      /* the binary has been canonicalized
                       via the unprebind operation */
#define MH_WEAK_DEFINES 0x8000      /* the final linked image contains
                       external weak symbols */
#define MH_BINDS_TO_WEAK 0x10000    /* the final linked image uses
                       weak symbols */

#define MH_ALLOW_STACK_EXECUTION 0x20000/* When this bit is set, all stacks 
                       in the task will be given stack
                       execution privilege.  Only used in
                       MH_EXECUTE filetypes. */
#define MH_ROOT_SAFE 0x40000           /* When this bit is set, the binary 
                      declares it is safe for use in
                      processes with uid zero */

#define MH_SETUID_SAFE 0x80000         /* When this bit is set, the binary 
                      declares it is safe for use in
                      processes when issetugid() is true */

#define MH_NO_REEXPORTED_DYLIBS 0x100000 /* When this bit is set on a dylib, 
                      the static linker does not need to
                      examine dependent dylibs to see
                      if any are re-exported */
#define MH_PIE 0x200000         /* When this bit is set, the OS will
                       load the main executable at a
                       random address.  Only used in
                       MH_EXECUTE filetypes. */
#define MH_DEAD_STRIPPABLE_DYLIB 0x400000 /* Only for use on dylibs.  When
                         linking against a dylib that
                         has this bit set, the static linker
                         will automatically not create a
                         LC_LOAD_DYLIB load command to the
                         dylib if no symbols are being
                         referenced from the dylib. */
#define MH_HAS_TLV_DESCRIPTORS 0x800000 /* Contains a section of type 
                        S_THREAD_LOCAL_VARIABLES */

#define MH_NO_HEAP_EXECUTION 0x1000000  /* When this bit is set, the OS will
                       run the main executable with
                       a non-executable heap even on
                       platforms (e.g. i386) that don't
                       require it. Only used in MH_EXECUTE
                       filetypes. */
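Because flags is a bit field, several of these constants are usually set at once, and decoding it means testing each bit in turn. The sketch below does that for a subset of the flags listed above; the table and the function name are hypothetical, and the value 0x00200085 used in the usage note is an illustrative one rather than a value read from the article's binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_flag {
    uint32_t    bit;
    const char *name;
};

/* Subset of the loader.h flags, in ascending bit order. */
static const struct demo_flag demo_flags[] = {
    { 0x1u,       "MH_NOUNDEFS" },
    { 0x2u,       "MH_INCRLINK" },
    { 0x4u,       "MH_DYLDLINK" },
    { 0x8u,       "MH_BINDATLOAD" },
    { 0x10u,      "MH_PREBOUND" },
    { 0x20u,      "MH_SPLIT_SEGS" },
    { 0x80u,      "MH_TWOLEVEL" },
    { 0x100u,     "MH_FORCE_FLAT" },
    { 0x8000u,    "MH_WEAK_DEFINES" },
    { 0x10000u,   "MH_BINDS_TO_WEAK" },
    { 0x20000u,   "MH_ALLOW_STACK_EXECUTION" },
    { 0x200000u,  "MH_PIE" },
    { 0x800000u,  "MH_HAS_TLV_DESCRIPTORS" },
    { 0x1000000u, "MH_NO_HEAP_EXECUTION" }
};

/* Store the names of the known flags set in `flags` into out[0..max-1].
 * Returns the number of names stored (at most max). */
size_t demo_flag_names(uint32_t flags, const char **out, size_t max)
{
    size_t n = 0;
    size_t i;

    assert(out != NULL || max == 0);
    for (i = 0; i < sizeof demo_flags / sizeof demo_flags[0]; i++) {
        if (n == max)
            break;
        if (flags & demo_flags[i].bit)
            out[n++] = demo_flags[i].name;
    }
    return n;
}
```

With flags = 0x00200085, the function stores MH_NOUNDEFS, MH_DYLDLINK, MH_TWOLEVEL, and MH_PIE, since 0x1 + 0x4 + 0x80 + 0x200000 = 0x200085.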

reserved

A reserved field, present only in mach_header_64.