I. A First Look at App Launch
1. Print order
Look at the code below first and try to predict the order in which the statements print.
#import <Foundation/Foundation.h>

@interface Person : NSObject
@end

@implementation Person
+ (void)load {
    printf("----------load-----------: %s\n", __func__);
}
@end

__attribute__((constructor)) void cc_func () {
    printf("--------cc_func----------: %s\n", __func__);
}

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        NSLog(@"Hello, World!");
    }
    return 0;
}
As you probably guessed, the output order is:
----------load-----------: +[Person load]
--------cc_func----------: cc_func
Dyld[40374:1115383] Hello, World!
The order is:
+load --> constructor function (__attribute__((constructor))) --> main()
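To check the "constructor before main" part of this order without an Objective-C runtime, a minimal C sketch can record which steps ran before main was entered. The names launch_log, log_step, steps_before_main, and print_launch_log are made up for this illustration, and the sketch assumes a compiler that supports __attribute__((constructor)), such as clang or gcc.

```c
#include <stdio.h>

/* Hypothetical names for illustration: launch_log, log_step,
   steps_before_main, print_launch_log. */
#define MAX_STEPS 8
static const char *launch_log[MAX_STEPS];
static int launch_log_count = 0;

static void log_step(const char *step) {
    if (launch_log_count < MAX_STEPS)
        launch_log[launch_log_count++] = step;
}

/* Runs when the image's initializers run, before main() is entered. */
__attribute__((constructor))
static void before_main(void) {
    log_step("constructor");
}

/* Call at the top of main(): returns how many steps were already logged. */
int steps_before_main(void) {
    return launch_log_count;
}

/* Prints the recorded steps in the order they ran. */
void print_launch_log(void) {
    for (int i = 0; i < launch_log_count; i++)
        printf("%d: %s\n", i, launch_log[i]);
}
```

Calling steps_before_main() as the first statement of main() should report that one step has already been logged, which matches the order observed above.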
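2. Before the main function
Does this seem puzzling?
Isn't main the entry function? Why doesn't main run first?
In practice, a whole series of steps has to happen before main is called.
The figure above shows the launch flow and its stages.
3. Breakpoint in the load method
Set a breakpoint in the load method, then print the call stack.
Output: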
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 8.1
* frame #0: 0x0000000100003e60 Dyld`+[Person load](self=Person, _cmd="load") at main.m:17:5
frame #1: 0x00007fff203ab4d6 libobjc.A.dylib`load_images + 1556
frame #2: 0x0000000100016527 dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
frame #3: 0x000000010002c794 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 474
frame #4: 0x000000010002a55f dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 191
frame #5: 0x000000010002a600 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
frame #6: 0x00000001000168b7 dyld`dyld::initializeMainExecutable() + 199
frame #7: 0x000000010001ceb8 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 8702
frame #8: 0x0000000100015224 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 450
frame #9: 0x0000000100015025 dyld`_dyld_start + 37
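4. Launch process (pre-main)
First, here is an overview of what happens before the main function (pre-main), just to get a general impression; the details are explored later.
5. Launch process (main)
The main function and the stages after it should already be familiar.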
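II. dyld
1. What is dyld?
dyld (the dynamic link editor) is the dynamic linker. It is a user-space program that Apple maintains as part of Darwin, located at /usr/lib/dyld, and it runs inside each launching process to load dynamic libraries.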
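2. What dyld does
- Links and loads the program. After an app is compiled and packaged into a Mach-O executable, dyld is responsible for linking it and loading it into memory at launch.
- Symbol binding. Almost every program on OS X is dynamically linked, so a Mach-O file contains many references to external libraries and symbols. These references have to be resolved and filled in at launch, and dyld does that work. This process is called symbol binding.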
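The idea of symbol binding can be sketched in plain C: every import slot in the program is filled with the address of the export that has the same name. This is only a conceptual model under simplifying assumptions (linear search, no two-level namespaces, no lazy binding); export_entry, import_slot, and bind_imports are hypothetical names, not dyld types.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types: a library's exported symbols and a program's imports. */
struct export_entry { const char *name; void *address; };
struct import_slot  { const char *name; void *bound;   };

/* Fill each import slot with the address of the matching export.
   Returns the number of imports that could not be resolved. */
int bind_imports(struct import_slot *imports, size_t import_count,
                 const struct export_entry *exports, size_t export_count) {
    int unresolved = 0;
    for (size_t i = 0; i < import_count; i++) {
        imports[i].bound = NULL;
        for (size_t j = 0; j < export_count; j++) {
            if (strcmp(imports[i].name, exports[j].name) == 0) {
                imports[i].bound = exports[j].address;
                break;
            }
        }
        if (imports[i].bound == NULL)
            unresolved++;
    }
    return unresolved;
}
```

An import with no matching export stays NULL and is counted as unresolved; a real loader would treat that as a launch error unless the symbol is weak.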
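3. dyld load flow
- How is dyld loaded?
- How is the program initialized?
In the earlier bt output, dyld has a _dyld_start function. It turns out to be implemented in assembly, so let's take a look.
Whenever a new process starts, the kernel sets the user-mode entry point to __dyld_start.
The call flow is shown in the following diagram:
4. _dyld_start
dyldStartup.s is assembly code; here is a brief look.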
#if __arm64__ && !TARGET_OS_SIMULATOR
.text
.align 2
.globl __dyld_start
__dyld_start:
mov x28, sp
and sp, x28, #~15 // force 16-byte alignment of stack
mov x0, #0
mov x1, #0
stp x1, x0, [sp, #-16]! // make aligned terminating frame
mov fp, sp // set up fp to point to terminating frame
sub sp, sp, #16 // make room for local variables
#if __LP64__
ldr x0, [x28] // get app's mh into x0
ldr x1, [x28, #8] // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
add x2, x28, #16 // get argv into x2
#else
ldr w0, [x28] // get app's mh into x0
ldr w1, [x28, #4] // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
add w2, w28, #8 // get argv into x2
#endif
adrp x3,___dso_handle@page
add x3,x3,___dso_handle@pageoff // get dyld's mh in to x4
mov x4,sp // x5 has &startGlue
// call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
bl __ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
mov x16,x0 // save entry point address in x16
...
As the comment shows, this calls dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue), which also appears in the backtrace from the previous section.
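The first few assembly instructions do two things: they force the stack pointer down to a 16-byte boundary, and they read the app's mach header, argc, and argv from the stack the kernel prepared. A rough C sketch of that logic follows, assuming a 64-bit layout where each stack slot is one uintptr_t; launch_args, read_launch_stack, and launch_arg are hypothetical names for this illustration.

```c
#include <stdint.h>

/* Round a stack pointer down to a 16-byte boundary, like
   `and sp, x28, #~15` in __dyld_start. */
uintptr_t align_down_16(uintptr_t sp) {
    return sp & ~(uintptr_t)15;
}

/* Hypothetical view of what __dyld_start reads from the initial 64-bit
   stack: [sp] = app mach header, [sp+8] = argc, sp+16 = start of argv. */
struct launch_args {
    uintptr_t        app_mh;
    long             argc;
    const uintptr_t *argv_words;   /* each word holds one char* */
};

struct launch_args read_launch_stack(const uintptr_t *initial_sp) {
    struct launch_args args;
    args.app_mh     = initial_sp[0];
    args.argc       = (long)initial_sp[1];
    args.argv_words = &initial_sp[2];
    return args;
}

/* Fetch argv[i] from the raw stack words. */
const char *launch_arg(const struct launch_args *args, long i) {
    return (const char *)args->argv_words[i];
}
```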
5. dyldbootstrap::start
This is the start function in the C++ namespace dyldbootstrap. The code is as follows:
Implementation in dyldInitialization.cpp
namespace dyldbootstrap {
...
//
// This is code to bootstrap dyld. This work in normally done for a program by dyld and crt.
// In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{
// Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);
// if kernel had to slide dyld, we need to fix up load sensitive locations
// we have to do this before using any global variables
rebaseDyld(dyldsMachHeader);
// kernel sets up env pointer to be just past end of agv array
const char** envp = &argv[argc+1];
// kernel sets up apple pointer to be just past end of envp array
const char** apple = envp;
while(*apple != NULL) { ++apple; }
++apple;
// set up random value for stack canary
__guard_setup(apple);
#if DYLD_INITIALIZER_SUPPORT
// run all C++ initializers inside dyld
runDyldInitializers(argc, argv, envp, apple);
#endif
_subsystem_init(apple);
// now that we are done bootstrapping dyld, call dyld's main
uintptr_t appsSlide = appsMachHeader->getSlide();
return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
}
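The function ends by calling dyld::_main(). Its first parameter is a macho_header, which will look familiar if you know the Mach-O structure. dyld exists to load Mach-O files, and that is already starting to show here.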
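The way start finds envp and apple is plain pointer arithmetic over one contiguous vector: argv entries, a NULL, the environment entries, a NULL, then the apple entries. A small C sketch of the same walk is below; launch_vectors and locate_vectors are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the pointer arithmetic in start():
   the vector is laid out as argv[0..argc-1], NULL, envp..., NULL,
   apple..., NULL. */
struct launch_vectors {
    const char **envp;
    const char **apple;
};

struct launch_vectors locate_vectors(int argc, const char **argv) {
    struct launch_vectors v;
    v.envp = &argv[argc + 1];      /* skip argv's terminating NULL */
    const char **p = v.envp;
    while (*p != NULL) { ++p; }    /* walk to envp's terminating NULL */
    v.apple = p + 1;               /* apple[] starts right after it */
    return v;
}
```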
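What the start function does
- Computes the slide from dyldsMachHeader and uses it to decide whether rebasing is needed (inside rebaseDyld)
- Runs mach_init() initialization (inside rebaseDyld)
- Sets up stack overflow protection (the stack canary, via __guard_setup)
- Computes the slide of appsMachHeader and calls dyld::_main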
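The slide and rebase steps in the list above can be pictured with a small C sketch: the slide is the difference between where an image actually landed and where it was linked to load, and rebasing adds that difference to every stored pointer into the image. The flat array of pointers is a simplifying assumption; real rebase information is encoded in the Mach-O file. compute_slide and rebase_pointers are hypothetical names.

```c
#include <stdint.h>
#include <stddef.h>

/* Slide = where the image actually landed minus where it was linked to load. */
intptr_t compute_slide(uintptr_t actual_load_addr, uintptr_t preferred_load_addr) {
    return (intptr_t)(actual_load_addr - preferred_load_addr);
}

/* Rebase: add the slide to every stored pointer that refers into the image.
   `pointers` is a hypothetical flat list of such values. */
void rebase_pointers(uintptr_t *pointers, size_t count, intptr_t slide) {
    if (slide == 0)
        return;                          /* loaded at the preferred address */
    for (size_t i = 0; i < count; i++)
        pointers[i] += (uintptr_t)slide;
}
```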
Next, let's focus on what dyld::_main does.
6. dyld::_main()
Implementation of dyld::_main
//
// Entry point for dyld. The kernel loads dyld and jumps to __dyld_start which
// sets up some registers and call this function.
//
// Returns address of main() in target program which __dyld_start jumps to
//
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide,
int argc, const char* argv[], const char* envp[], const char* apple[],
uintptr_t* startGlue)
{
if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
launchTraceID = dyld3::kdebug_trace_dyld_duration_start(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, (uint64_t)mainExecutableMH, 0, 0);
}
//Check and see if there are any kernel flags
dyld3::BootArgs::setFlags(hexToUInt64(_simple_getenv(apple, "dyld_flags"), nullptr));
#if __has_feature(ptrauth_calls)
// Check and see if kernel disabled JOP pointer signing (which lets us load plain arm64 binaries)
if ( const char* disableStr = _simple_getenv(apple, "ptrauth_disabled") ) {
if ( strcmp(disableStr, "1") == 0 )
sKeysDisabled = true;
}
else {
// needed until kernel passes ptrauth_disabled for arm64 main executables
if ( (mainExecutableMH->cpusubtype == CPU_SUBTYPE_ARM64_V8) || (mainExecutableMH->cpusubtype == CPU_SUBTYPE_ARM64_ALL) )
sKeysDisabled = true;
}
#endif
// Grab the cdHash of the main executable from the environment
uint8_t mainExecutableCDHashBuffer[20];
const uint8_t* mainExecutableCDHash = nullptr;
if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
unsigned bufferLenUsed;
if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
mainExecutableCDHash = mainExecutableCDHashBuffer;
}
getHostInfo(mainExecutableMH, mainExecutableSlide);
#if !TARGET_OS_SIMULATOR
// Trace dyld's load
notifyKernelAboutImage((macho_header*)&__dso_handle, _simple_getenv(apple, "dyld_file"));
// Trace the main executable's load
notifyKernelAboutImage(mainExecutableMH, _simple_getenv(apple, "executable_file"));
#endif
uintptr_t result = 0;
sMainExecutableMachHeader = mainExecutableMH;
sMainExecutableSlide = mainExecutableSlide;
...
return result;
}
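The code is long. Setting aside the parts that are not on the main path, the main flow is:
- Environment variable configuration
  - Set values according to the environment variables and get the current architecture
- Shared cache
  - Check whether the shared cache is enabled and whether it is mapped into the shared region
- Main program initialization
  - Call instantiateFromLoadedImage to instantiate an ImageLoader object
- Insert dynamic libraries
  - Iterate over the DYLD_INSERT_LIBRARIES environment variable and call loadInsertedDylib to load each entry
- Link the main program
- Link the dynamic libraries
- Weak symbol binding
- Run the initializers
- Find the main program's entry point, that is, the main function
Diagram: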
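1). dyld environment variables
- Get the cdHash of the main executable from the environment
- Get platform and architecture information from the Mach-O header
- Check and set environment variables: checkEnvironmentVariables(envp)
- Set defaults when DYLD_FALLBACK is empty: defaultUninitializedFallbackPaths(envp)
Related code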
// Line: 6366
// Grab the cdHash of the main executable from the environment
uint8_t mainExecutableCDHashBuffer[20];
const uint8_t* mainExecutableCDHash = nullptr;
if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
unsigned bufferLenUsed;
if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
mainExecutableCDHash = mainExecutableCDHashBuffer;
}
// Get the current architecture information from the Mach-O header
getHostInfo(mainExecutableMH, mainExecutableSlide);
// Line: 6453
CRSetCrashLogMessage("dyld: launch started");
// Set up the context from the executable's header, arguments, and so on
setContext(mainExecutableMH, argc, argv, envp, apple);
// Pickup the pointer to the exec path.
sExecPath = _simple_getenv(apple, "executable_path");
// Line: 6535
{
checkEnvironmentVariables(envp); // check and set environment variables
defaultUninitializedFallbackPaths(envp); // set defaults when DYLD_FALLBACK is empty
}
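Several of the lines above read values such as executable_path out of the apple vector, which is an array of "key=value" strings. The lookup can be sketched as follows. This is a stand-in written for illustration, not dyld's _simple_getenv implementation, and vector_getenv is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a "key=value" lookup such as
   _simple_getenv(apple, "executable_path"); not dyld's implementation. */
const char *vector_getenv(const char *const *vec, const char *key) {
    size_t key_len = strlen(key);
    for (; *vec != NULL; ++vec) {
        /* the key must match in full and be followed directly by '=' */
        if (strncmp(*vec, key, key_len) == 0 && (*vec)[key_len] == '=')
            return *vec + key_len + 1;   /* value starts after '=' */
    }
    return NULL;                          /* key not present */
}
```

Requiring the '=' right after the key keeps a lookup for "executable" from matching "executable_path".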
These can be configured by setting environment variables in the Xcode scheme; see dyld2.cpp for details.
dyld environment variables
struct EnvironmentVariables {
const char* const * DYLD_FRAMEWORK_PATH;
const char* const * DYLD_FALLBACK_FRAMEWORK_PATH;
const char* const * DYLD_LIBRARY_PATH;
const char* const * DYLD_FALLBACK_LIBRARY_PATH;
const char* const * DYLD_INSERT_LIBRARIES;
const char* const * LD_LIBRARY_PATH; // for unix conformance
const char* const * DYLD_VERSIONED_LIBRARY_PATH;
const char* const * DYLD_VERSIONED_FRAMEWORK_PATH;
bool DYLD_PRINT_LIBRARIES_POST_LAUNCH;
bool DYLD_BIND_AT_LAUNCH;
bool DYLD_PRINT_STATISTICS;
bool DYLD_PRINT_STATISTICS_DETAILS;
bool DYLD_PRINT_OPTS;
bool DYLD_PRINT_ENV;
bool DYLD_DISABLE_DOFS;
bool hasOverride;
...
};
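Examples:
- DYLD_PRINT_OPTS = YES, prints the command-line arguments
- DYLD_PRINT_ENV = YES, prints all environment variables
- OBJC_PRINT_LOAD_METHODS prints information about calls to the + (void)load methods of classes and categories
- OBJC_PRINT_INITIALIZE_METHODS prints information about calls to the + (void)initialize methods of classes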
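2). Shared cache (SharedCache)
An app may use many system dynamic libraries, such as UIKit and Foundation. Loading each of them only when its functionality is first needed after launch would be slow, so the system combines the dynamic libraries used by iOS ahead of time into a dynamic library cache and places this large cache file in an iOS system directory (/System/Library/Caches/com.apple.dyld/). This improves app launch performance, and that is the purpose of the shared cache.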
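Extracting dynamic libraries from the shared cache
Individual libraries can be extracted from the shared cache using launch-cache/dsc_extractor.cpp from the dyld source:
- Delete the `#if 0` and `#endif` lines
- Compile dsc_extractor.cpp
clang++ -o dsc_extractor dsc_extractor.cpp
- Run dsc_extractor
./dsc_extractor <path to the shared cache file> <output directory>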
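The shared-cache-related functions in the code are:
- checkSharedRegionDisable: checks whether the shared cache is enabled (it must be enabled on iOS)
- mapSharedCache: loads the shared cache
  - Load into the current process only: mapCachePrivate (the simulator supports only this mode)
  - If this is the first time the shared cache is loaded, map it: mapCacheSystemWide
  - If the shared cache has already been loaded, do nothing more
mapSharedCache --> loadDyldCache --> mapCachePrivate
                                 └-> mapCacheSystemWide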
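The three branches in the list above reduce to one small decision. The sketch below only models which branch is taken and performs no mapping; cache_action and choose_cache_action are hypothetical names, and the simulator check stands in for the TARGET_OS_SIMULATOR compile-time switch.

```c
#include <stdbool.h>

/* Hypothetical outcome labels for the branches described above. */
enum cache_action {
    CACHE_MAP_PRIVATE,      /* map into this process only */
    CACHE_REUSE_EXISTING,   /* already in the shared region: nothing more to map */
    CACHE_MAP_SYSTEM_WIDE   /* first process to load it: map into the shared region */
};

enum cache_action choose_cache_action(bool is_simulator, bool force_private,
                                      bool already_mapped) {
    if (is_simulator || force_private)
        return CACHE_MAP_PRIVATE;
    return already_mapped ? CACHE_REUSE_EXISTING : CACHE_MAP_SYSTEM_WIDE;
}
```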
Related code
// Line: 6584
// load shared cache
// Check whether the shared cache is enabled (required on iOS)
checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
#if TARGET_OS_SIMULATOR
if ( sSharedCacheOverrideDir)
mapSharedCache(mainExecutableSlide);
#else
// Check whether the shared cache is mapped into the shared region
mapSharedCache(mainExecutableSlide);
#endif
}
// Line: 4078
static void mapSharedCache(uintptr_t mainExecutableSlide)
{
dyld3::SharedCacheOptions opts;
opts.cacheDirOverride = sSharedCacheOverrideDir;
opts.forcePrivate = (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion);
#if __x86_64__ && !TARGET_OS_SIMULATOR
opts.useHaswell = sHaswell;
#else
opts.useHaswell = false;
#endif
opts.verbose = gLinkContext.verboseMapping;
// <rdar://problem/32031197> respect -disable_aslr boot-arg
// <rdar://problem/56299169> kern.bootargs is now blocked
opts.disableASLR = (mainExecutableSlide == 0) && dyld3::internalInstall(); // infer ASLR is off if main executable is not slid
loadDyldCache(opts, &sSharedCacheLoadInfo);
// update global state
if ( sSharedCacheLoadInfo.loadAddress != nullptr ) {
gLinkContext.dyldCache = sSharedCacheLoadInfo.loadAddress;
dyld::gProcessInfo->processDetachedFromSharedRegion = opts.forcePrivate;
dyld::gProcessInfo->sharedCacheSlide = sSharedCacheLoadInfo.slide;
dyld::gProcessInfo->sharedCacheBaseAddress = (unsigned long)sSharedCacheLoadInfo.loadAddress;
sSharedCacheLoadInfo.loadAddress->getUUID(dyld::gProcessInfo->sharedCacheUUID);
dyld3::kdebug_trace_dyld_image(DBG_DYLD_UUID_SHARED_CACHE_A, sSharedCacheLoadInfo.path, (const uuid_t *)&dyld::gProcessInfo->sharedCacheUUID[0], {0,0}, {{ 0, 0 }}, (const mach_header *)sSharedCacheLoadInfo.loadAddress);
}
}
// Line: 858
bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
results->loadAddress = 0;
results->slide = 0;
results->errorMessage = nullptr;
#if TARGET_OS_SIMULATOR
// simulator only supports mmap()ing cache privately into process
return mapCachePrivate(options, results);
#else
if ( options.forcePrivate ) {
// mmap cache into this process only
return mapCachePrivate(options, results);
}
else {
// fast path: when cache is already mapped into shared region
bool hasError = false;
if ( reuseExistingCache(options, results) ) {
hasError = (results->errorMessage != nullptr); // already loaded
} else {
// slow path: this is first process to load cache
hasError = mapCacheSystemWide(options, results); // first load
}
return hasError;
}
#endif
}
3). Main program initialization
- instantiateFromLoadedImage obtains the ImageLoader
- ImageLoaderMachO::instantiateMainExecutable creates the ImageLoader for the main program
- sniffLoadCommands reads the Mach-O file's load commands and performs various checks
Related code
// Line: 6860
CRSetCrashLogMessage(sLoadingCrashMessage);
// instantiate ImageLoader for main executable
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
gLinkContext.mainExecutable = sMainExecutable;
gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
// Line: 3092
// The kernel maps in main executable before dyld gets control. We need to
// make an ImageLoader* for the already mapped in main executable.
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
// try mach-o loader
// if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
addImage(image);
return (ImageLoaderMachO*)image;
// }
// throw "main executable not a known format";
}
// ImageLoaderMachO.cpp Line: 566
// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
//dyld::log("ImageLoader=%ld, ImageLoaderMachO=%ld, ImageLoaderMachOClassic=%ld, ImageLoaderMachOCompressed=%ld\n",
// sizeof(ImageLoader), sizeof(ImageLoaderMachO), sizeof(ImageLoaderMachOClassic), sizeof(ImageLoaderMachOCompressed));
bool compressed;
unsigned int segCount;
unsigned int libCount;
const linkedit_data_command* codeSigCmd;
const encryption_info_command* encryptCmd;
sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
// instantiate concrete class based on content of load commands
if ( compressed )
return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
else
#if SUPPORT_CLASSIC_MACHO
return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
throw "missing LC_DYLD_INFO load command";
#endif
}
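To give a feel for what a function like sniffLoadCommands does, here is a heavily simplified C sketch that walks a buffer of load commands, counts segments and dependent libraries, and notes whether an LC_DYLD_INFO command is present (the condition behind the compressed flag above). It defines its own small structs and constants, prefixed sk_ or SK_, so that it compiles anywhere; real code would use <mach-o/loader.h>, and the real function performs many more checks than shown here.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local, simplified definitions for illustration; real code would use
   <mach-o/loader.h>. The numeric values follow that header. */
#define SK_LC_REQ_DYLD       0x80000000u
#define SK_LC_LOAD_DYLIB     0x0cu
#define SK_LC_SEGMENT_64     0x19u
#define SK_LC_DYLD_INFO      0x22u
#define SK_LC_DYLD_INFO_ONLY (0x22u | SK_LC_REQ_DYLD)

struct sk_load_command { uint32_t cmd; uint32_t cmdsize; };

struct sniff_result {
    bool     ok;          /* false if a command runs past the command area */
    bool     compressed;  /* an LC_DYLD_INFO(_ONLY) command was seen */
    unsigned seg_count;
    unsigned lib_count;
};

struct sniff_result sniff_commands(const uint8_t *cmds, uint32_t ncmds,
                                   uint32_t sizeofcmds) {
    struct sniff_result r = { true, false, 0, 0 };
    uint32_t offset = 0;
    for (uint32_t i = 0; i < ncmds; i++) {
        struct sk_load_command lc;
        /* the command header must fit in what is left of the area */
        if (sizeofcmds - offset < sizeof lc) { r.ok = false; break; }
        memcpy(&lc, cmds + offset, sizeof lc);
        /* the whole command must fit too, and must not be truncated */
        if (lc.cmdsize < sizeof lc || lc.cmdsize > sizeofcmds - offset) {
            r.ok = false;
            break;
        }
        switch (lc.cmd) {
            case SK_LC_DYLD_INFO:
            case SK_LC_DYLD_INFO_ONLY: r.compressed = true; break;
            case SK_LC_SEGMENT_64:     r.seg_count++;       break;
            case SK_LC_LOAD_DYLIB:     r.lib_count++;       break;
            default:                                        break;
        }
        offset += lc.cmdsize;
    }
    return r;
}
```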
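4). Insert dynamic libraries
This step calls loadInsertedDylib for each library found in DYLD_INSERT_LIBRARIES, which makes it relevant to security attack and defense work. Internally the load path can look a dylib up through locations such as DYLD_ROOT_PATH, LD_LIBRARY_PATH, and DYLD_FRAMEWORK_PATH, although the LoadContext built in the code below turns off useSearchPaths, useFallbackPaths, and useLdLibraryPath for inserted libraries. The code signature is checked as well, and an invalid one raises an exception.
Related code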
// Line: 6974
// load any inserted libraries
// Load every library listed in DYLD_INSERT_LIBRARIES
if ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib)
loadInsertedDylib(*lib);
}
// record count of inserted libraries so that a flat search will look at
// inserted libraries, then main, then others.
sInsertedDylibCount = sAllImages.size()-1;
// Line: 5176
static void loadInsertedDylib(const char* path)
{
unsigned cacheIndex;
try {
LoadContext context;
context.useSearchPaths = false;
context.useFallbackPaths = false;
context.useLdLibraryPath = false;
context.implicitRPath = false;
context.matchByInstallName = false;
context.dontLoad = false;
context.mustBeBundle = false;
context.mustBeDylib = true;
context.canBePIE = false;
context.origin = NULL; // can't use @loader_path with DYLD_INSERT_LIBRARIES
context.rpath = NULL;
load(path, context, cacheIndex);
}
catch (const char* msg) {
if ( gLinkContext.allowInsertFailures )
dyld::log("dyld: warning: could not load inserted library '%s' into hardened process because %s\n", path, msg);
else
halt(dyld::mkstringf("could not load inserted library '%s' because %s\n", path, msg));
}
catch (...) {
halt(dyld::mkstringf("could not load inserted library '%s'\n", path));
}
}
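DYLD_INSERT_LIBRARIES is written as a colon-separated list of paths, and the loop above receives it already split into a NULL-terminated array. A minimal sketch of walking such a list directly is shown below; for_each_colon_entry and path_visitor are hypothetical names, and this is not how dyld itself parses the variable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one entry (not NUL-terminated)
   together with its length. */
typedef void (*path_visitor)(const char *path, size_t len, void *ctx);

/* Calls `visit` once per non-empty entry of a colon-separated list and
   returns how many entries were visited. `visit` may be NULL to only count. */
size_t for_each_colon_entry(const char *list, path_visitor visit, void *ctx) {
    size_t count = 0;
    const char *start = list;
    while (*start != '\0') {
        const char *end = strchr(start, ':');
        size_t len = (end != NULL) ? (size_t)(end - start) : strlen(start);
        if (len > 0) {
            if (visit != NULL)
                visit(start, len, ctx);
            count++;
        }
        if (end == NULL)
            break;
        start = end + 1;   /* continue after the colon */
    }
    return count;
}
```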
5). Link the main program
Related code
// Line: 6982
// link main executable
gLinkContext.linkingMainExecutable = true;
#if SUPPORT_ACCELERATE_TABLES
if ( mainExcutableAlreadyRebased ) {
// previous link() on main executable has already adjusted its internal pointers for ASLR
// work around that by rebasing by inverse amount
sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
}
#endif
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
sMainExecutable->setNeverUnloadRecursive();
if ( sMainExecutable->forceFlat() ) {
gLinkContext.bindFlat = true;
gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
}
6). Link the dynamic libraries
Related code
// Line: 6999
// link any inserted libraries
// do this after linking main executable so that any dylibs pulled in by inserted
// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
if ( sInsertedDylibCount > 0 ) {
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
image->setNeverUnloadRecursive();
}
if ( gLinkContext.allowInterposing ) {
// only INSERTED libraries can interpose
// register interposing info after all inserted libraries are bound so chaining works
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
image->registerInterposing(gLinkContext); // register interposing
}
}
}
7). Weak symbol binding
Related code
// Line: 7060
// apply interposing to initial set of images
for(int i=0; i < sImageRoots.size(); ++i) {
// apply interposing
sImageRoots[i]->applyInterposing(gLinkContext);
}
ImageLoader::applyInterposingToDyldCache(gLinkContext);
// Bind and notify for the main executable now that interposing has been registered
uint64_t bindMainExecutableStartTime = mach_absolute_time();
// Note this call:
sMainExecutable->recursiveBindWithAccounting(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true);
uint64_t bindMainExecutableEndTime = mach_absolute_time();
ImageLoaderMachO::fgTotalBindTime += bindMainExecutableEndTime - bindMainExecutableStartTime;
gLinkContext.notifyBatch(dyld_image_state_bound, false);
// Bind and notify for the inserted images now interposing has been registered
if ( sInsertedDylibCount > 0 ) {
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true, nullptr);
}
}
// <rdar://problem/12186933> do weak binding only after all inserted images linked
// weak symbol binding
sMainExecutable->weakBind(gLinkContext);
gLinkContext.linkingMainExecutable = false;
sMainExecutable->recursiveMakeDataReadOnly(gLinkContext);
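The registerInterposing and applyInterposing calls seen in the last two steps can be pictured as a table of (replacement, replacee) pairs that is applied to pointers that have already been bound. The C sketch below shows only that idea; interpose_tuple, apply_interposing, and the three sample functions are hypothetical, and dyld's real data structures differ.

```c
#include <stddef.h>

typedef int (*func_ptr)(void);

/* Hypothetical pair: calls bound to `replacee` should go to `replacement`. */
struct interpose_tuple { func_ptr replacement; func_ptr replacee; };

/* Rewrite already-bound pointers that match a registered replacee.
   Returns how many slots were redirected. */
size_t apply_interposing(func_ptr *bound_slots, size_t slot_count,
                         const struct interpose_tuple *tuples,
                         size_t tuple_count) {
    size_t changed = 0;
    for (size_t i = 0; i < slot_count; i++) {
        for (size_t j = 0; j < tuple_count; j++) {
            if (bound_slots[i] == tuples[j].replacee) {
                bound_slots[i] = tuples[j].replacement;
                changed++;
                break;
            }
        }
    }
    return changed;
}

/* Sample functions used to try the sketch out. */
int original_impl(void)    { return 1; }
int replacement_impl(void) { return 2; }
int unrelated_impl(void)   { return 3; }
```

Slots that point at functions with no registered replacement are left untouched.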
8). Run the initializers
Related code
// Line: 7087
CRSetCrashLogMessage("dyld: launch, running initializers");
#if SUPPORT_OLD_CRT_INITIALIZATION
// Old way is to run initializers via a callback from crt1.o
if ( ! gRunInitializersOldWay )
initializeMainExecutable();
#else
// run all initializers
initializeMainExecutable();
#endif
// Line: 1636
void initializeMainExecutable()
{
// record that we've reached this step
gLinkContext.startedInitializingMainExecutable = true;
// run initialzers for any inserted dylibs
ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
initializerTimes[0].count = 0;
const size_t rootCount = sImageRoots.size();
if ( rootCount > 1 ) {
for(size_t i=1; i < rootCount; ++i) {
sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
}
}
// run initializers for main executable and everything it brings up
sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
// register cxa_atexit() handler to run static terminators in all loaded images when this process exits
if ( gLibSystemHelpers != NULL )
(*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);
// dump info if requested
if ( sEnv.DYLD_PRINT_STATISTICS )
ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
// ImageLoader.cpp Line: 609
void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
uint64_t t1 = mach_absolute_time();
mach_port_t thisThread = mach_thread_self();
ImageLoader::UninitedUpwards up;
up.count = 1;
up.imagesAndPaths[0] = { this, this->getPath() };
processInitializers(context, thisThread, timingInfo, up);
context.notifyBatch(dyld_image_state_initialized, false);
mach_port_deallocate(mach_task_self(), thisThread);
uint64_t t2 = mach_absolute_time();
fgTotalInitTime += (t2 - t1);
}
// ImageLoader.cpp Line: 587
// <rdar://problem/14412057> upward dylib initializers can be run too soon
// To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
// have their initialization postponed until after the recursion through downward dylibs
// has completed.
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
uint32_t maxImageCount = context.imageCount()+2;
ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
ImageLoader::UninitedUpwards& ups = upsBuffer[0];
ups.count = 0;
// Calling recursive init on all images in images list, building a new list of
// uninitialized upward dependencies.
for (uintptr_t i=0; i < images.count; ++i) {
images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
}
// If any upward dependencies remain, init them.
if ( ups.count > 0 )
processInitializers(context, thisThread, timingInfo, ups);
}
// ImageLoader.cpp Line: 1595
// Recursively initialize an image and the images it depends on
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
recursive_lock lock_info(this_thread);
recursiveSpinLock(lock_info);
if ( fState < dyld_image_state_dependents_initialized-1 ) {
uint8_t oldState = fState;
// break cycles
fState = dyld_image_state_dependents_initialized-1;
try {
// initialize lower level libraries first
for(unsigned int i=0; i < libraryCount(); ++i) {
ImageLoader* dependentImage = libImage(i);
if ( dependentImage != NULL ) {
// don't try to initialize stuff "above" me yet
if ( libIsUpward(i) ) {
uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
uninitUps.count++;
}
else if ( dependentImage->fDepth >= fDepth ) {
dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
}
}
}
// record termination order
if ( this->needsTermination() )
context.terminationRecorder(this);
// let objc know we are about to initialize this image
uint64_t t1 = mach_absolute_time();
fState = dyld_image_state_dependents_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
// initialize this image
bool hasInitializers = this->doInitialization(context);
// let anyone know we finished initializing this image
fState = dyld_image_state_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_initialized, this, NULL);
if ( hasInitializers ) {
uint64_t t2 = mach_absolute_time();
timingInfo.addTime(this->getShortName(), t2-t1);
}
}
catch (const char* msg) {
// this image is not initialized
fState = oldState;
recursiveSpinUnLock();
throw;
}
}
recursiveSpinUnLock();
}
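The core of recursiveInitialization is "initialize what I depend on first, then myself, and mark my state before recursing so that cycles terminate". A stripped-down C sketch of that pattern is below. It leaves out locking, upward dependencies, depth checks, and notifications; image, image_state, and recursive_init are hypothetical names.

```c
#include <stddef.h>

#define MAX_DEPS 4

enum image_state { STATE_BOUND, STATE_INITIALIZING, STATE_INITIALIZED };

/* Hypothetical, simplified image record. */
struct image {
    const char      *name;
    enum image_state state;
    struct image    *deps[MAX_DEPS];
    size_t           dep_count;
};

/* Initialize dependencies first, then the image itself. The state is
   advanced before recursing so that dependency cycles terminate.
   `order` must have room for one entry per image. */
void recursive_init(struct image *img, const char **order, size_t *order_count) {
    if (img->state != STATE_BOUND)
        return;                           /* already in progress or done */
    img->state = STATE_INITIALIZING;      /* break cycles */
    for (size_t i = 0; i < img->dep_count; i++)
        recursive_init(img->deps[i], order, order_count);
    order[(*order_count)++] = img->name;  /* "run" this image's initializers */
    img->state = STATE_INITIALIZED;
}
```

With an app that depends on libobjc, which depends on libSystem, the recorded order is libSystem, libobjc, app, even if libSystem links back to the app.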
The notifySingle function
Related code
// dyld2.cpp Line: 985
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
...
if ( state == dyld_image_state_mapped ) {
// <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
// <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
if (!image->inSharedCache()
|| (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion)) {
dyld_uuid_info info;
if ( image->getUUID(info.imageUUID) ) {
info.imageLoadAddress = image->machHeader();
addNonSharedCacheImageUUID(info);
}
}
}
if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
uint64_t t0 = mach_absolute_time();
dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
// Note this line
(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
uint64_t t1 = mach_absolute_time();
uint64_t t2 = mach_absolute_time();
uint64_t timeInObjC = t1-t0;
uint64_t emptyTime = (t2-t1)*100;
if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
timingInfo->addTime(image->getShortName(), timeInObjC);
}
}
...
}
// Line: 4643
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
// record functions to call
sNotifyObjCMapped = mapped;
sNotifyObjCInit = init; // the assignment happens here
sNotifyObjCUnmapped = unmapped;
// call 'mapped' function with all images mapped so far
try {
notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
}
catch (const char* msg) {
// ignore request to abort during registration
}
// <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
ImageLoader* image = *it;
if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
}
}
}
// dyldAPIs.cpp line: 2188
// This function is provided only for objc to use at runtime
// To find the caller of _dyld_objc_notify_register, search the libobjc source
void _dyld_objc_notify_register(_dyld_objc_notify_mapped mapped,
_dyld_objc_notify_init init,
_dyld_objc_notify_unmapped unmapped)
{
dyld::registerObjCNotifiers(mapped, init, unmapped); // called here
}
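Searching the objc4 source for _dyld_objc_notify_register shows that it is called in _objc_init, with arguments passed in.
So sNotifyObjCInit is assigned objc's load_images, and load_images calls all the +load methods. notifySingle is the function that invokes this callback.
Note
The initialization call chain is fairly long, so it is not covered in detail here; the next section focuses on it.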
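The relationship between the two sides is an ordinary callback registration: the runtime hands the loader a function pointer once, and the loader calls it for each image at the right state. A minimal C sketch of that shape follows; every name in it is hypothetical and stands in for the real sNotifyObjCInit, _dyld_objc_notify_register, and load_images.

```c
#include <stddef.h>

enum img_state { IMG_DEPENDENTS_INITIALIZED, IMG_INITIALIZED };

typedef void (*init_callback)(const char *path);

/* Hypothetical stand-in for sNotifyObjCInit. */
static init_callback registered_init = NULL;

/* Runtime side, done once: hand the loader a function to call. */
void register_init_callback(init_callback cb) {
    registered_init = cb;
}

/* Loader side, per image: if a callback is registered and the image's
   dependents are initialized, call it (this is where +load would run). */
void notify_single(enum img_state state, const char *path) {
    if (state == IMG_DEPENDENTS_INITIALIZED && registered_init != NULL)
        registered_init(path);
}

/* Sample callback that only counts how often it was invoked. */
static int load_images_calls = 0;

void sample_load_images(const char *path) {
    (void)path;
    load_images_calls++;
}

int sample_load_images_count(void) {
    return load_images_calls;
}
```

Before anything is registered the notification does nothing, which is why the registration has to happen early, during library initialization.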
9). Main program entry
The entry-point code in dyld2.cpp is as follows:
// Line: 7104
#if TARGET_OS_OSX
if ( gLinkContext.driverKit ) {
result = (uintptr_t)sEntryOverride;
if ( result == 0 )
halt("no entry point registered");
*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
}
else
#endif
{
// find entry point for main executable
result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
if ( result != 0 ) {
// main executable uses LC_MAIN, we need to use helper in libdyld to call into main()
if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
else
halt("libdyld.dylib support not present for LC_MAIN");
}
else {
// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
*startGlue = 0;
}
}
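Leaving the DriverKit branch aside, the decision above is: prefer the entry from LC_MAIN and use a helper (the start glue) to call main and then exit; otherwise fall back to the LC_UNIXTHREAD entry without glue. A small C sketch of just that choice follows, with entry_choice and choose_entry as hypothetical names.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical result type for the entry-point decision. */
struct entry_choice {
    uintptr_t entry;      /* address that control is transferred to */
    bool      uses_glue;  /* true when a helper must call main() and then exit() */
};

/* lc_main_entry is 0 when the executable has no LC_MAIN command. */
struct entry_choice choose_entry(uintptr_t lc_main_entry,
                                 uintptr_t lc_unixthread_entry) {
    struct entry_choice c;
    if (lc_main_entry != 0) {
        c.entry = lc_main_entry;          /* modern path: LC_MAIN */
        c.uses_glue = true;
    } else {
        c.entry = lc_unixthread_entry;    /* legacy path: LC_UNIXTHREAD */
        c.uses_glue = false;
    }
    return c;
}
```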
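So how do we prove that main is called after the load method and the constructor function?
The simplest way is to use breakpoints.
The current breakpoint is in the load method.
Current backtrace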
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 8.1
* frame #0: 0x0000000100003e60 Dyld`+[Person load](self=Person, _cmd="load") at main.m:17:5
frame #1: 0x00007fff203ab4d6 libobjc.A.dylib`load_images + 1556
frame #2: 0x0000000100016527 dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
frame #3: 0x000000010002c794 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 474
frame #4: 0x000000010002a55f dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 191
frame #5: 0x000000010002a600 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
frame #6: 0x00000001000168b7 dyld`dyld::initializeMainExecutable() + 199
frame #7: 0x000000010001ceb8 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 8702
frame #8: 0x0000000100015224 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 450
frame #9: 0x0000000100015025 dyld`_dyld_start + 37
Continuing one step enters the function marked __attribute__((constructor)), void cc_func().
Finally, execution enters main().
Current backtrace
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 11.1
* frame #0: 0x0000000100003eb6 Dyld`main(argc=1, argv=0x00007ffeefbff3e8) at main.m:27:22
frame #1: 0x00007fff20528f3d libdyld.dylib`start + 1
frame #2: 0x00007fff20528f3d libdyld.dylib`start + 1
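So after the first two steps finish, control returns to _dyld_start, which then calls the main() function (through start in libdyld, as the backtrace shows).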
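III. Initialization Flow
We now have a clear picture of how an app is loaded, but is that the whole flow?
Of course not. We left a gap earlier, so now let's peel back the layers and work out the app's loading and initialization flow.
1. Review
Let's go back to the breakpoint approach used earlier and explore step by step.
bt output with the breakpoint in the +load method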
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 8.1
frame #0: 0x0000000100003e60 Dyld`+[Person load](self=Person, _cmd="load") at main.m:17:5
frame #1: 0x00007fff203ab4d6 libobjc.A.dylib`load_images + 1556
frame #2: 0x0000000100016527 dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
frame #3: 0x000000010002c794 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 474
frame #4: 0x000000010002a55f dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 191
frame #5: 0x000000010002a600 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
* frame #6: 0x00000001000168b7 dyld`dyld::initializeMainExecutable() + 199
frame #7: 0x000000010001ceb8 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 8702
frame #8: 0x0000000100015224 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 450
frame #9: 0x0000000100015025 dyld`_dyld_start + 37
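2. dyld start
Following the stack trace, we start tracing from _dyld_start.
The assembly shows that execution next reaches bootstrap::start,
which then calls dyld::_main.
After that, initializeMainExecutable() is called,
then ImageLoader::runInitializers,
then ImageLoader::processInit,
which internally calls ImageLoader::recursiveInitial.
Next we see notifySingle.
Tracing further up shows that the callback is registered in dyld's registerObjcNoti.
One level higher, _dyld_objc_notify_register
calls it internally.
At this point the trail goes cold; there is no further information.
Looking back, one function was skipped here: ImageLoader::doInitialization.
How is it implemented internally?
The doInit implementation
3. libSystem init
The only thing left is to look at the calls inside libSystem.
It calls a Dispatch function internally.
4. dispatch init
Tracing further down inside the Dispatch library leads to libdispatch_init().
Here we see the call to _objc_init.
Setting a symbolic breakpoint on _objc_init opens up a whole new view:
what gets called is objc's _objc_init.
5. objc init
Continuing the exploration in objc, we find the function that puzzled us earlier, _dyld_objc_notify_register.
It is called right here.
Aha.
At this point it becomes clear that notifySingle invokes a callback.
_objc_init passes load_images as the second argument, so when the notification fires, load_images runs.
Looking at the calls inside load_images explains why the Person load method gets called.
By now, dyld's loading and the app's initialization process should be much clearer.
The relationships between the libraries are shown in the figure:
Next we will analyze what _objc_init does.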