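Tags: App launch, dyld
This chapter walks through the dyld loading flow to show what happens under the hood before main runs.
Preparation: the dyld source (dyld-852), libdispatch-1271.120.2, Libsystem-1292.60.1, and objc4-818.2.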
1. dyld
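1.1 Introduction
- dyld (the dynamic link editor) is Apple's dynamic linker and an important part of Apple's operating systems. Once an app has been compiled and packaged into a Mach-O executable, dyld is responsible for linking and loading it.
- dyld is involved throughout app launch, including loading the dependent libraries and the main program. Any performance or launch-time optimization work will inevitably involve dyld.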
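1.2 History of dyld
1.2.1 dyld 1.0 (1996-2004)
- dyld 1 shipped in NeXTStep 3.3; before that, NeXT used static binaries. Its role was fairly limited at the time: dyld 1 was written before C++ dynamic libraries were in wide use on the system. C++ features such as static initializers work well in a static environment but can hurt performance in a dynamic one, so large C++ dynamic libraries forced dyld to do a great deal of work and slowed it down.
- Before Mac OS X 10.0 (Cheetah) shipped, one more feature was added: prebinding. Prebinding tries to find a fixed address for every dylib and application on the system. dyld then attempts to load everything at those addresses; if that succeeds, it edits the binaries of all the dylibs and programs to store the precomputed results. The next time everything is placed at the same addresses, no extra work is needed, which speeds up launch considerably. The downside is that binaries are edited on every launch, which is undesirable, not least for security.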
1.2.2 dyld 2 (2004-2017)
dyld 2 went through many iterations after its 2004 debut. Features that are familiar today, such as ASLR, code signing, and the shared cache, were all introduced in dyld 2.
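dyld 2.0 (2004-2007)
- dyld 2 arrived with Mac OS X Tiger.
- dyld 2 is a complete rewrite of dyld 1. It correctly supports C++ initializer semantics, and the Mach-O format was extended and dyld updated along with it, which made efficient C++ library support possible.
- dyld 2 has a complete dlopen and dlsym implementation (used mainly to load libraries dynamically and call functions in them) with correct semantics, so the legacy APIs were deprecated.
  - dlopen: opens a library and returns a handle.
  - dlsym: looks up the value of a symbol in an opened library.
  - dlclose: closes the handle.
  - dlerror: returns a string describing the most recent error from dlopen, dlsym, or dlclose.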
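- The design goal of dyld 2 was launch speed, so it performed only limited sanity checking; malicious programs were less of a concern at the time.
- dyld also had some security issues, so several features were later reworked to make dyld more secure on the platform.
- Because launch became so much faster, the amount of prebinding work could be reduced. Unlike the earlier approach of editing program binaries, only system libraries are prebound, and only during software updates. This is why a message such as "Optimizing system performance" can appear during an update: prebinding is being redone. That step has since come to cover other optimizations as well, but prebinding is where it started.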
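The four dl* calls listed above can be exercised with a small sketch. It assumes a POSIX system where the C library is already loaded into the process; lookupAtoi and demoLookup are hypothetical names chosen for illustration and are not part of dyld.

```cpp
#include <cassert>
#include <cstdio>
#include <dlfcn.h>  // dlopen, dlsym, dlclose, dlerror

// Signature of the C function looked up below.
using AtoiFn = int (*)(const char*);

// Hypothetical helper: find atoi among the images already loaded into this process.
AtoiFn lookupAtoi() {
    // A null path asks for a handle to the main program; lookups through that
    // handle also search the libraries that were loaded along with it.
    void* handle = dlopen(nullptr, RTLD_NOW);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return nullptr;
    }
    dlerror();  // clear any stale error before calling dlsym
    void* symbol = dlsym(handle, "atoi");
    const char* error = dlerror();
    if (error != nullptr) {
        std::fprintf(stderr, "dlsym failed: %s\n", error);
        symbol = nullptr;
    }
    dlclose(handle);  // releases the handle; the C library itself stays loaded
    return reinterpret_cast<AtoiFn>(symbol);
}

// Usage: call the function through the pointer returned by dlsym.
void demoLookup() {
    AtoiFn parse = lookupAtoi();
    assert(parse != nullptr);
    assert(parse("42") == 42);
}
```

On older Linux toolchains this may need to be linked with -ldl; on Apple platforms the dl* functions are part of libSystem.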
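dyld 2.x (2007-2017)
- Between 2007 and 2017 dyld 2 received many improvements and its performance increased significantly.
- First, a large number of architectures and platforms were added.
  - dyld 2 originally shipped on PowerPC; since then x86, x86_64, arm, arm64, and many sub-variants have been added.
  - iOS, tvOS, and watchOS were introduced, each requiring new dyld features.
- Security was strengthened in several ways.
  - Code signing was added.
  - ASLR (address space layout randomization) was added: each time a library is loaded it may land at a different address.
  - Bounds checking was added: Mach-O header fields are bounds-checked to prevent injection of malicious binary data.
- Performance was improved.
  - Prebinding could be eliminated and replaced by the shared cache.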
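ASLR
- ASLR is a computer security technique that guards against exploitation of memory corruption vulnerabilities. By randomizing where the key data regions of a process are placed in the address space, it prevents an attacker from reliably jumping to a known location in memory to reuse existing code.
- Linux added ASLR in kernel version 2.6.12.
- Apple introduced randomized address offsets for some libraries in Mac OS X Leopard 10.5 (released October 2007), but that implementation did not provide the full protection ASLR is defined to offer. Mac OS X Lion 10.7 extended ASLR support to all applications.
- Apple introduced ASLR in iOS 4.3.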
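The effect of ASLR on addresses can be written as simple arithmetic: the loader picks a random slide, and every link-time address is shifted by it. The sketch below only illustrates that arithmetic; the function names and the example addresses are made up and are not taken from dyld.

```cpp
#include <cassert>
#include <cstdint>

// slide = where the image actually landed - where the linker assumed it would be.
std::int64_t computeSlide(std::uint64_t actualLoadAddress, std::uint64_t preferredLoadAddress) {
    return static_cast<std::int64_t>(actualLoadAddress - preferredLoadAddress);
}

// Every address recorded at link time must be shifted by the same slide at run time.
std::uint64_t applySlide(std::uint64_t linkTimeAddress, std::int64_t slide) {
    return linkTimeAddress + static_cast<std::uint64_t>(slide);
}

// Usage with invented numbers: an image linked at 0x100000000 that was
// loaded at 0x104ca0000 has a slide of 0x4ca0000.
void demoSlide() {
    const std::int64_t slide = computeSlide(0x104ca0000ULL, 0x100000000ULL);
    assert(slide == 0x4ca0000LL);
    assert(applySlide(0x100005f24ULL, slide) == 0x104ca5f24ULL);
}
```

This is the same quantity that appears later as the slide argument passed from dyldbootstrap::start to dyld::_main.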
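Bounds checking
- Significant bounds checks were added for many fields of the Mach-O header, which prevents injection of malicious binary data.
Shared cache
- The shared cache was first introduced in iOS 3.1 and Mac OS X Snow Leopard, where it completely replaced prebinding.
- The shared cache is a single file that contains most of the system dylibs. Because the dylibs are merged into one file, they can be optimized together.
  - All text segments (__TEXT) and data segments (__DATA) are rearranged and the entire symbol table is rewritten. This shrinks the file so that each process maps only a small number of regions, and it allows data segments to be packed, which saves a lot of RAM.
  - In essence it is a prelinker for dylibs. The RAM savings are substantial: on the order of 500 MB to 1 GB on a typical iOS system.
  - Data structures used by dyld and the Objective-C runtime can be pregenerated, so that work does not have to happen at launch. This saves further RAM and time.
- On macOS the shared cache is generated locally on the machine; running against the dyld shared cache greatly improves system performance.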
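1.2.3 dyld 3 (2017-present)
- dyld 3 is a new dynamic linker announced at WWDC 2017. It completely rethinks dynamic linking and was set to become the default for most macOS system programs: on the 2017 Apple OS platforms, all system programs use dyld 3 by default.
- dyld 3 first appeared in iOS 11 in 2017, where it was used mainly to optimize system libraries.
- In iOS 13, dyld 3 replaced dyld 2 across the board. Because dyld 3 is fully compatible with dyld 2 and exposes the same API, in most cases developers can make the transition without any extra adaptation work.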
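2. Using the bt backtrace to see where app launch begins
Set a breakpoint in the +load method and use bt to print the backtrace.
Running the program shows that launch begins at _dyld_start inside dyld.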
int main(int argc, char * argv[]) {
NSString * appDelegateClassName;
@autoreleasepool {
// Setup code that might create autoreleased objects goes here.
appDelegateClassName = NSStringFromClass([AppDelegate class]);
}
return UIApplicationMain(argc, argv, nil, appDelegateClassName);
}
__attribute__((constructor)) void ypyFunc(){
printf("arrived: %s \n",__func__);
}
@interface ViewController ()
@end
@implementation ViewController
+ (void)load{
NSLog(@"%s",__func__);
}
- (void)viewDidLoad {
[super viewDidLoad];
// Do any additional setup after loading the view.
}
@end
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x0000000104ca5f24 002-应用程加载分析`+[ViewController load](self=ViewController, _cmd="load") at ViewController.m:17:5
frame #1: 0x00000001aafd735c libobjc.A.dylib`load_images + 984
frame #2: 0x0000000104e0a190 dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 448
frame #3: 0x0000000104e1a0d8 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 512
frame #4: 0x0000000104e18520 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 184
frame #5: 0x0000000104e185e8 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 92
frame #6: 0x0000000104e0a658 dyld`dyld::initializeMainExecutable() + 216
frame #7: 0x0000000104e0eeb0 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 4400
frame #8: 0x0000000104e09208 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 396
frame #9: 0x0000000104e09038 dyld`_dyld_start + 56
(lldb)
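The test program above relies on two kinds of code that run before main: +load methods and functions marked __attribute__((constructor)). The constructor half can be reproduced without Objective-C. The following is a minimal sketch, assuming a GCC or Clang toolchain; startupLog, earlyHook, and Registrar are hypothetical names used only here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Function-local static, so the log exists no matter which initializer runs first.
std::vector<std::string>& startupLog() {
    static std::vector<std::string> log;
    return log;
}

// Placed in the image's initializer list and called before main().
__attribute__((constructor)) static void earlyHook() {
    startupLog().push_back("constructor");
}

// A C++ static initializer; its constructor also runs before main().
struct Registrar {
    Registrar() { startupLog().push_back("static-init"); }
};
static Registrar gRegistrar;

// Usage: by the time ordinary code runs, both entries have been recorded.
void demoStartupOrder() {
    assert(startupLog().size() == 2);
}
```

The sketch deliberately makes no claim about the relative order of the two entries; it only shows that both have run by the time main is reached.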
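3. _dyld_start flow analysis
Search the dyld-852 source for _dyld_start and look at the arm64 variant. It is implemented in assembly, and the comment in the assembly shows that it calls dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue), which is a C++ function.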
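3.1 dyldbootstrap::start source
Search the source for dyldbootstrap to find the namespace, then look for the start function in that file. Its key step is the return statement, which calls dyld's _main function. The macho_header argument is the Mach-O header: the file that dyld loads is a Mach-O file, which is the executable file format. A Mach-O file is made up of four parts: the Mach-O header, the load commands, the sections, and other data. MachOView can be used to inspect an executable.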
//
// This is code to bootstrap dyld. This work in normally done for a program by dyld and crt.
// In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{
// Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);
// if kernel had to slide dyld, we need to fix up load sensitive locations
// we have to do this before using any global variables
rebaseDyld(dyldsMachHeader);
// kernel sets up env pointer to be just past end of agv array
const char** envp = &argv[argc+1];
// kernel sets up apple pointer to be just past end of envp array
const char** apple = envp;
while(*apple != NULL) { ++apple; }
++apple;
// set up random value for stack canary
__guard_setup(apple);
#if DYLD_INITIALIZER_SUPPORT
// run all C++ initializers inside dyld
runDyldInitializers(argc, argv, envp, apple);
#endif
_subsystem_init(apple);
// now that we are done bootstrapping dyld, call dyld's main
uintptr_t appsSlide = appsMachHeader->getSlide();
return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
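The pointer arithmetic that start() uses to find envp and apple depends on the layout the kernel sets up: argv, a NULL terminator, the environment strings, another NULL, and then the apple strings. A minimal sketch of that walk follows; findEnvp and findApple are hypothetical helper names, and the array in the usage function is synthetic.

```cpp
#include <cassert>
#include <cstddef>

// envp starts just past the NULL that terminates argv.
const char** findEnvp(int argc, const char** argv) {
    assert(argv != nullptr && argc >= 0);
    return &argv[argc + 1];
}

// apple starts just past the NULL that terminates envp.
const char** findApple(const char** envp) {
    assert(envp != nullptr);
    const char** cursor = envp;
    while (*cursor != nullptr) {
        ++cursor;
    }
    return cursor + 1;
}

// Usage on a hand-built vector that imitates the kernel's layout.
void demoVectorLayout() {
    const char* vec[] = {
        "app", "-flag", nullptr,          // argv (argc == 2)
        "HOME=/root", "LANG=C", nullptr,  // envp
        "executable_path=/app", nullptr   // apple
    };
    const char** envp = findEnvp(2, vec);
    const char** apple = findApple(envp);
    assert(envp == &vec[3]);
    assert(apple == &vec[6]);
}
```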
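3.2 dyld::_main source analysis
The implementation of dyld::_main is very long, roughly 600 lines. Readers who are unfamiliar with the dyld loading flow can work backwards from the return value of _main; that approach is not covered in detail here. _main does the following:
- [Step 1: set up the conditions (environment, platform, version, paths, host information)] Set values based on environment variables and determine the current architecture.
- [Step 2: load the shared cache] Check whether the shared cache is enabled and whether it has been mapped into the shared region; it holds libraries such as UIKit and CoreFoundation.
- [Step 3: instantiate the main program] Call instantiateFromLoadedImage to create an ImageLoader object.
- [Step 4: insert dynamic libraries] Iterate over the DYLD_INSERT_LIBRARIES environment variable and call loadInsertedDylib to load each entry.
- [Step 5: link the main program]
- [Step 6: link the dynamic libraries]
- [Step 7: bind weak symbols]
- [Step 8: run the initializers]
- [Step 9: find the main program's entry point, that is, main] Read the LC_MAIN entry from the load commands; if there is none, fall back to LC_UNIXTHREAD. This leads to the familiar main function.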
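Step 9 comes down to walking the load commands and checking for one of two command types. The sketch below shows that walk over a plain byte buffer. The LoadCommand struct is a simplified stand-in for the load_command struct in <mach-o/loader.h>, and findEntryKind and appendCommand are hypothetical helpers, not dyld functions; the two constants use the values defined in that header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint32_t kLC_UNIXTHREAD = 0x5;
constexpr std::uint32_t kLC_MAIN = 0x28 | 0x80000000u;  // 0x28 | LC_REQ_DYLD

// Every load command starts with its type and its total size in bytes.
struct LoadCommand {
    std::uint32_t cmd;
    std::uint32_t cmdsize;
};

enum class EntryKind { None, Main, UnixThread };

// Walk ncmds variable-sized commands; prefer LC_MAIN, fall back to LC_UNIXTHREAD.
EntryKind findEntryKind(const std::vector<std::uint8_t>& bytes, std::uint32_t ncmds) {
    std::size_t offset = 0;
    bool sawUnixThread = false;
    for (std::uint32_t i = 0; i < ncmds; ++i) {
        if (offset + sizeof(LoadCommand) > bytes.size()) break;  // truncated buffer
        LoadCommand lc;
        std::memcpy(&lc, bytes.data() + offset, sizeof(lc));
        if (lc.cmdsize < sizeof(LoadCommand)) break;             // malformed command
        if (lc.cmd == kLC_MAIN) return EntryKind::Main;
        if (lc.cmd == kLC_UNIXTHREAD) sawUnixThread = true;
        offset += lc.cmdsize;
    }
    return sawUnixThread ? EntryKind::UnixThread : EntryKind::None;
}

// Test helper: append a zero-filled command of the given type and size.
void appendCommand(std::vector<std::uint8_t>& bytes, std::uint32_t cmd, std::uint32_t cmdsize) {
    assert(cmdsize >= sizeof(LoadCommand));
    const LoadCommand lc{cmd, cmdsize};
    const std::size_t start = bytes.size();
    bytes.resize(start + cmdsize, 0);
    std::memcpy(bytes.data() + start, &lc, sizeof(lc));
}
```

For example, a buffer holding a 72-byte segment command (type 0x19) followed by a 24-byte LC_MAIN command yields EntryKind::Main. Real dyld code reads the entry offset out of the command as well; this sketch only identifies which kind is present.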
The sections below focus on Step 3 and Step 8.
3.2.1 Step 3 in detail: instantiating the main program
- sMainExecutable is the variable that represents the main program. It is assigned the result of instantiateFromLoadedImage:

// instantiate ImageLoader for main executable
// [Step 3: instantiate the main program]
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
gLinkContext.mainExecutable = sMainExecutable;
gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);

instantiateFromLoadedImage instantiates the main program
- Step into instantiateFromLoadedImage. It creates an ImageLoader instance by calling instantiateMainExecutable:

// The kernel maps in main executable before dyld gets control.  We need to
// make an ImageLoader* for the already mapped in main executable.
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
    // try mach-o loader
//  if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
        ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
        addImage(image);
        return (ImageLoaderMachO*)image;
//  }

//  throw "main executable not a known format";
}
- Step into instantiateMainExecutable. It creates the image for the main executable and returns an image object of type ImageLoader, which is the main program. The sniffLoadCommands call gathers information from the load commands of the Mach-O file and runs various validity checks on them:

// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
    //dyld::log("ImageLoader=%ld, ImageLoaderMachO=%ld, ImageLoaderMachOClassic=%ld, ImageLoaderMachOCompressed=%ld\n",
    //  sizeof(ImageLoader), sizeof(ImageLoaderMachO), sizeof(ImageLoaderMachOClassic), sizeof(ImageLoaderMachOCompressed));
    bool compressed;
    unsigned int segCount;
    unsigned int libCount;
    const linkedit_data_command* codeSigCmd;
    const encryption_info_command* encryptCmd;
    sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
    // instantiate concrete class based on content of load commands
    if ( compressed )
        return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
    else
#if SUPPORT_CLASSIC_MACHO
        return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
        throw "missing LC_DYLD_INFO load command";
#endif
}
3.2.2 Step 8 in detail: running the initializers
- Step into initializeMainExecutable. It loops over the inserted image roots and then handles the main executable, calling runInitializers on each:

void initializeMainExecutable()
{
    // record that we've reached this step
    gLinkContext.startedInitializingMainExecutable = true;

    // run initialzers for any inserted dylibs
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        for(size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }

    // run initializers for main executable and everything it brings up
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);

    // register cxa_atexit() handler to run static terminators in all loaded images when this process exits
    if ( gLibSystemHelpers != NULL )
        (*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

    // dump info if requested
    if ( sEnv.DYLD_PRINT_STATISTICS )
        ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
        ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
- Search globally for runInitializers(cons to find the following source. Its key step is the call to processInitializers:

void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
    uint64_t t1 = mach_absolute_time();
    mach_port_t thisThread = mach_thread_self();
    ImageLoader::UninitedUpwards up;
    up.count = 1;
    up.imagesAndPaths[0] = { this, this->getPath() };
    processInitializers(context, thisThread, timingInfo, up);
    context.notifyBatch(dyld_image_state_initialized, false);
    mach_port_deallocate(mach_task_self(), thisThread);
    uint64_t t2 = mach_absolute_time();
    fgTotalInitTime += (t2 - t1);
}
- Step into processInitializers. It calls recursiveInitialization on every image in the list to initialize them recursively:

// <rdar://problem/14412057> upward dylib initializers can be run too soon
// To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
// have their initialization postponed until after the recursion through downward dylibs
// has completed.
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
                                     InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
    uint32_t maxImageCount = context.imageCount()+2;
    ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
    ImageLoader::UninitedUpwards& ups = upsBuffer[0];
    ups.count = 0;
    // Calling recursive init on all images in images list, building a new list of
    // uninitialized upward dependencies.
    for (uintptr_t i=0; i < images.count; ++i) {
        images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
    }
    // If any upward dependencies remain, init them.
    if ( ups.count > 0 )
        processInitializers(context, thisThread, timingInfo, ups);
}
- Search globally for recursiveInitialization(cons. Its implementation is as follows:

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    recursive_lock lock_info(this_thread);
    recursiveSpinLock(lock_info); // take the recursive lock

    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for(unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    // don't try to initialize stuff "above" me yet
                    if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                        uninitUps.count++;
                    }
                    else if ( dependentImage->fDepth >= fDepth ) {
                        dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                    }
                }
            }

            // record termination order
            if ( this->needsTermination() )
                context.terminationRecorder(this);

            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time();
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

            // initialize this image
            bool hasInitializers = this->doInitialization(context);

            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);

            if ( hasInitializers ) {
                uint64_t t2 = mach_absolute_time();
                timingInfo.addTime(this->getShortName(), t2-t1);
            }
        }
        catch (const char* msg) {
            // this image is not initialized
            fState = oldState;
            recursiveSpinUnLock();
            throw;
        }
    }

    recursiveSpinUnLock(); // release the recursive lock
}
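The core idea of recursiveInitialization is a depth-first walk that initializes dependencies before the image that needs them, and that marks an image before descending so that cycles terminate. The sketch below models only that part; it leaves out locking, upward dependencies, and the depth check, and the Image struct and recursiveInit are hypothetical stand-ins for ImageLoader.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for an ImageLoader node in the dependency graph.
struct Image {
    std::string name;
    std::vector<Image*> dependents;  // libraries this image links against
    bool visited = false;            // plays the role of the early fState update
};

// Initialize dependencies first, then the image itself.
void recursiveInit(Image& image, std::vector<std::string>& order) {
    if (image.visited) return;  // already handled, or part of a cycle in progress
    image.visited = true;       // mark before descending so cycles cannot recurse forever
    for (Image* dependency : image.dependents) {
        if (dependency != nullptr) {
            recursiveInit(*dependency, order);
        }
    }
    order.push_back(image.name);  // stand-in for notifySingle + doInitialization
}

// Usage: the app depends on Foundation, which depends on libobjc, which depends on libSystem.
std::vector<std::string> demoInitOrder() {
    Image libSystem{"libSystem", {}};
    Image libobjc{"libobjc", {&libSystem}};
    Image foundation{"Foundation", {&libobjc, &libSystem}};
    Image app{"app", {&foundation, &libSystem}};
    std::vector<std::string> order;
    recursiveInit(app, order);
    assert(order.front() == "libSystem" && order.back() == "app");
    return order;
}
```

In this model libSystem is always initialized first because everything else depends on it, which matches the later observation that the libSystem initializer has to run before other initializers.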
The exploration splits into two parts here: the notifySingle function and the doInitialization function. notifySingle comes first.
3.2.2.1 The notifySingle function
- Search globally for notifySingle(. The key line is (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());

static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
    //dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
    std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
    if ( handlers != NULL ) {
        dyld_image_info info;
        info.imageLoadAddress = image->machHeader();
        info.imageFilePath = image->getRealPath();
        info.imageFileModDate = image->lastModified();
        for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
            const char* result = (*it)(state, 1, &info);
            if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
                //fprintf(stderr, "  image rejected by handler=%p\n", *it);
                // make copy of thrown string so that later catch clauses can free it
                const char* str = strdup(result);
                throw str;
            }
        }
    }
    if ( state == dyld_image_state_mapped ) { // has the image been mapped?
        // <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
        // <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
        if (!image->inSharedCache()
            || (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion)) {
            dyld_uuid_info info;
            if ( image->getUUID(info.imageUUID) ) {
                info.imageLoadAddress = image->machHeader();
                addNonSharedCacheImageUUID(info);
            }
        }
    }
    if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
        uint64_t t0 = mach_absolute_time();
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
        (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        uint64_t t1 = mach_absolute_time();
        uint64_t t2 = mach_absolute_time();
        uint64_t timeInObjC = t1-t0;
        uint64_t emptyTime = (t2-t1)*100;
        if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
            timingInfo->addTime(image->getShortName(), timeInObjC);
        }
    }
    // mach message csdlc about dynamically unloaded images
    if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
        notifyKernel(*image, false);
        const struct mach_header* loadAddress[] = { image->machHeader() };
        const char* loadPath[] = { image->getPath() };
        notifyMonitoringDyld(true, 1, loadAddress, loadPath);
    }
}
- Search globally for sNotifyObjCInit. It has no function body of its own; it is a function pointer that is assigned in registerObjCNotifiers:

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped   = mapped;
    sNotifyObjCInit     = init; // key line
    sNotifyObjCUnmapped = unmapped;

    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }

    // <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
    for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
        ImageLoader* image = *it;
        if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
            dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
            (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        }
    }
}
- Search for the callers of registerObjCNotifiers: it is called from _dyld_objc_notify_register. Note: the callers of _dyld_objc_notify_register itself have to be searched for in the libobjc source.

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}
- Search the objc4-818.2 source for _dyld_objc_notify_register. It is called from _objc_init, which passes in three arguments, so the value assigned to sNotifyObjCInit is load_images from objc, and load_images calls every +load method. In short, notifySingle acts as a callback dispatcher: it invokes load_images through sNotifyObjCInit.

/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/
void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
#if __OBJC2__
    cache_t::init();
#endif
    _imp_implementationWithBlock_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image); // key line

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}
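The relationship between registerObjCNotifiers, sNotifyObjCInit, and notifySingle is a plain function-pointer callback. The toy model below may make the flow easier to follow. ToyLinker, toyLoadImages, and loadLog are hypothetical names; the model keeps only two behaviours from the dyld source shown above: images initialized before registration are replayed to the callback, and later images trigger the callback as they are initialized.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Same shape as an init notifier, reduced to a single argument for this sketch.
using InitCallback = void (*)(const char* path);

class ToyLinker {
public:
    void addImage(const std::string& path) { images_.push_back({path, false}); }

    // Like registerObjCNotifiers: store the pointer, then replay already-initialized images.
    void registerInitCallback(InitCallback callback) {
        initCallback_ = callback;
        for (const Entry& image : images_) {
            if (image.initialized) callback(image.path.c_str());
        }
    }

    // Like the notifySingle call made while an image is being initialized.
    void initializeImage(std::size_t index) {
        assert(index < images_.size());
        if (initCallback_ != nullptr) initCallback_(images_[index].path.c_str());
        images_[index].initialized = true;
    }

private:
    struct Entry { std::string path; bool initialized; };
    std::vector<Entry> images_;
    InitCallback initCallback_ = nullptr;  // plays the role of sNotifyObjCInit
};

// Stand-in for load_images: records which images it was told about.
std::vector<std::string>& loadLog() {
    static std::vector<std::string> log;
    return log;
}
void toyLoadImages(const char* path) { loadLog().push_back(path); }
```

A typical run adds two images, initializes the first before any callback exists, registers toyLoadImages (which is immediately told about the first image), and then initializes the second, at which point the callback fires again.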
Loading the +load methods
Next, step into the source of load_images to confirm that it calls every +load method.
- From the _objc_init implementation in the objc source, step into load_images:

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        didInitialAttachCategories = true;
        loadAllCategories();
    }

    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}
- Step into call_load_methods. Its core is a do-while loop that calls the +load methods:

/***********************************************************************
* call_load_methods
* Call all pending class and category +load methods.
* Class +load methods are called superclass-first.
* Category +load methods are not called until after the parent class's +load.
*
* This method must be RE-ENTRANT, because a +load could trigger
* more image mapping. In addition, the superclass-first ordering
* must be preserved in the face of re-entrant calls. Therefore,
* only the OUTERMOST call of this function will do anything, and
* that call will handle all loadable classes, even those generated
* while it was running.
*
* The sequence below preserves +load ordering in the face of
* image loading during a +load, and make sure that no
* +load method is forgotten because it was added during
* a +load call.
* Sequence:
* 1. Repeatedly call class +loads until there aren't any more
* 2. Call category +loads ONCE.
* 3. Run more +loads if:
*    (a) there are more classes to load, OR
*    (b) there are some potential category +loads that have
*        still never been attempted.
* Category +loads are only run once to ensure "parent class first"
* ordering, even if a category +load triggers a new loadable class
* and a new loadable category attached to that class.
*
* Locking: loadMethodLock must be held by the caller
*   All other locks must not be held.
**********************************************************************/
void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}
- Step into call_class_loads. The load method invoked here is the class +load method discussed earlier:

/***********************************************************************
* call_class_loads
* Call all pending class +load methods.
* If new classes become loadable, +load is NOT called for them.
*
* Called only by call_load_methods().
**********************************************************************/
static void call_class_loads(void)
{
    int i;

    // Detach current loadable list.
    struct loadable_class *classes = loadable_classes;
    int used = loadable_classes_used;
    loadable_classes = nil;
    loadable_classes_allocated = 0;
    loadable_classes_used = 0;

    // Call all +loads for the detached list.
    for (i = 0; i < used; i++) {
        Class cls = classes[i].cls;
        load_method_t load_method = (load_method_t)classes[i].method;
        if (!cls) continue;

        if (PrintLoading) {
            _objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
        }
        (*load_method)(cls, @selector(load));
    }

    // Destroy the detached list.
    if (classes) free(classes);
}
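The ordering rule in call_load_methods (drain all class +loads, run category +loads once, repeat if anything new appeared) can be modelled without the Objective-C runtime. In the sketch below, LoadQueues, drainLoads, and the onLoad hook are hypothetical; the hook stands in for a +load method that causes more classes or categories to become loadable.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Pending work, in the spirit of the loadable class and category lists.
struct LoadQueues {
    std::vector<std::string> classes;
    std::vector<std::string> categories;
};

// Called for each "+load"; it may push more work onto the queues.
using LoadHook = std::function<void(const std::string& name, LoadQueues& queues)>;

std::vector<std::string> drainLoads(LoadQueues& queues, const LoadHook& onLoad) {
    std::vector<std::string> called;
    bool moreCategories = false;
    do {
        // 1. Repeatedly run class loads until none are pending.
        while (!queues.classes.empty()) {
            std::vector<std::string> batch;
            batch.swap(queues.classes);  // detach the list, as call_class_loads does
            for (const std::string& name : batch) {
                called.push_back(name);
                if (onLoad) onLoad(name, queues);
            }
        }
        // 2. Run the currently pending category loads once.
        std::vector<std::string> categoryBatch;
        categoryBatch.swap(queues.categories);
        for (const std::string& name : categoryBatch) {
            called.push_back(name);
            if (onLoad) onLoad(name, queues);
        }
        // 3. Loop again if a load added more classes or categories.
        moreCategories = !queues.categories.empty();
    } while (!queues.classes.empty() || moreCategories);
    return called;
}
```

With classes Base and Derived, a category Base(Extras), and a hook that enqueues a class Late while the category loads, the resulting order is Base, Derived, Base(Extras), Late: classes go first, and work added during a load is still picked up by the outer loop.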
So load_images calls every +load method, and the source walk above matches the printed backtrace exactly.
[Summary] The call chain for +load is: _dyld_start -> dyldbootstrap::start -> dyld::_main -> dyld::initializeMainExecutable -> ImageLoader::runInitializers -> ImageLoader::processInitializers -> ImageLoader::recursiveInitialization -> dyld::notifySingle (which dispatches to the registered callback) -> sNotifyObjCInit -> load_images (libobjc.A.dylib)
That raises the next question: when is _objc_init called? Read on.
3.2.2.2 The doInitialization function
- The trail goes cold at _objc_init in objc, so go back to the recursive function recursiveInitialization. One call in it was skipped earlier: doInitialization.

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    recursive_lock lock_info(this_thread);
    recursiveSpinLock(lock_info); // take the recursive lock

    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for(unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    // don't try to initialize stuff "above" me yet
                    if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                        uninitUps.count++;
                    }
                    else if ( dependentImage->fDepth >= fDepth ) {
                        dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                    }
                }
            }

            // record termination order
            if ( this->needsTermination() )
                context.terminationRecorder(this);

            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time();
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

            // initialize this image
            // key line
            bool hasInitializers = this->doInitialization(context);

            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);

            if ( hasInitializers ) {
                uint64_t t2 = mach_absolute_time();
                timingInfo.addTime(this->getShortName(), t2-t1);
            }
        }
        catch (const char* msg) {
            // this image is not initialized
            fState = oldState;
            recursiveSpinUnLock();
            throw;
        }
    }

    recursiveSpinUnLock(); // release the recursive lock
}
- Step into doInitialization. It also splits into two parts: doImageInit and doModInitFunctions.

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());

    // mach-o has -init and static initializers
    doImageInit(context);
    doModInitFunctions(context);

    CRSetCrashLogMessage2(NULL);

    return (fHasDashInit || fHasInitializers);
}
- Step into doImageInit. Its core is a for loop over the load commands that calls each -init routine it finds. Note that the libSystem initializer must already have run before any of them is called.

void ImageLoaderMachO::doImageInit(const LinkContext& context)
{
    if ( fHasDashInit ) {
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            switch (cmd->cmd) {
                case LC_ROUTINES_COMMAND:
                    Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide);
#if __has_feature(ptrauth_calls)
                    func = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)func, ptrauth_key_asia, 0);
#endif
                    // <rdar://problem/8543820&9228031> verify initializers are in image
                    if ( ! this->containsAddress(stripPointer((void*)func)) ) {
                        dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
                    }
                    if ( ! dyld::gProcessInfo->libSystemInitialized ) { // the libSystem initializer must run first
                        // <rdar://problem/17973316> libSystem initializer must run first
                        dyld::throwf("-init function in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                    }
                    if ( context.verboseInit )
                        dyld::log("dyld: calling -init function %p in %s\n", func, this->getPath());
                    {
                        dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                        func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                    }
                    break;
            }
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}
- Step into doModInitFunctions. This function runs the image's C++ static initializers, that is, the function pointers stored in its module-init sections. This can be verified from the backtrace of a test program by setting a breakpoint in a C++ constructor function.

void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
    if ( fHasInitializers ) {
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
                const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
                const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
                const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
                for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
                    const uint8_t type = sect->flags & SECTION_TYPE;
                    if ( type == S_MOD_INIT_FUNC_POINTERS ) {....}
                    else if ( type == S_INIT_FUNC_OFFSETS ) {....}
                }
            }
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}
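The part elided as {....} in the S_MOD_INIT_FUNC_POINTERS branch amounts to treating the section contents as an array of function pointers and calling each one. The following is a minimal sketch of that idea under simplified assumptions: the Initializer signature here drops the ProgramVars argument that dyld passes, and runInitializerArray and the two example initializers are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Simplified initializer signature (dyld's version takes an extra ProgramVars pointer).
using Initializer = void (*)(int argc, const char** argv, const char** envp, const char** apple);

// Call every non-null pointer in [begin, end) in order and return how many were called.
std::size_t runInitializerArray(const Initializer* begin, const Initializer* end,
                                int argc, const char** argv,
                                const char** envp, const char** apple) {
    assert(begin <= end);
    std::size_t called = 0;
    for (const Initializer* cursor = begin; cursor != end; ++cursor) {
        if (*cursor != nullptr) {
            (*cursor)(argc, argv, envp, apple);
            ++called;
        }
    }
    return called;
}

// Two example initializers that record that they ran and in what order.
static int gInitTrace = 0;
void firstInitializer(int, const char**, const char**, const char**) { gInitTrace = gInitTrace * 10 + 1; }
void secondInitializer(int, const char**, const char**, const char**) { gInitTrace = gInitTrace * 10 + 2; }

// Usage: a hand-built array stands in for the contents of a module-init section.
int demoRunInitializers() {
    gInitTrace = 0;
    const Initializer section[] = {&firstInitializer, &secondInitializer};
    const char* argv[] = {"app", nullptr};
    const char* empty[] = {nullptr};
    runInitializerArray(section, section + 2, 1, argv, empty, empty);
    return gInitTrace;  // digits record the call order
}
```

The real function also validates that each pointer lies inside the image and checks that libSystem has been initialized, as doImageInit does.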
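- At this point the call to _objc_init still has not turned up. Another way in is to set a symbolic breakpoint on _objc_init and inspect the backtrace that leads to it.
- Add a symbolic breakpoint on _objc_init, run the program, and look at the backtrace once it stops there.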
- In the Libsystem-1292.60.1 source, find libSystem_initializer and look at its implementation:

// libsyscall_initializer() initializes all of libSystem.dylib
// <rdar://problem/4892197>
__attribute__((constructor))
static void
libSystem_initializer(int argc,
                      const char* argv[],
                      const char* envp[],
                      const char* apple[],
                      const struct ProgramVars* vars)
{
    ....
    _libSystem_ktrace0(ARIADNE_LIFECYCLE_libsystem_init | DBG_FUNC_START);

    __libkernel_init(&libkernel_funcs, envp, apple, vars);
    _libSystem_ktrace_init_func(KERNEL);

    __libplatform_init(NULL, envp, apple, vars);
    _libSystem_ktrace_init_func(PLATFORM);

    __pthread_init(&libpthread_funcs, envp, apple, vars);
    _libSystem_ktrace_init_func(PTHREAD);

    _libc_initializer(&libc_funcs, envp, apple, vars);
    _libSystem_ktrace_init_func(LIBC);

    // TODO: Move __malloc_init before __libc_init after breaking malloc's upward link to Libc
    // Note that __malloc_init() will also initialize ASAN when it is present
    __malloc_init(apple);
    _libSystem_ktrace_init_func(MALLOC);

#if TARGET_OS_OSX
    /* <rdar://problem/9664631> */
    __keymgr_initializer();
    _libSystem_ktrace_init_func(KEYMGR);
#endif

    _dyld_initializer(); // dyld initialization
    _libSystem_ktrace_init_func(DYLD);

    libdispatch_init(); // dispatch initialization
    _libSystem_ktrace_init_func(LIBDISPATCH);

#if !TARGET_OS_DRIVERKIT
    _libxpc_initializer();
    _libSystem_ktrace_init_func(LIBXPC);

#if CURRENT_VARIANT_asan
    setenv("DT_BYPASS_LEAKS_CHECK", "1", 1);
#endif
#endif // !TARGET_OS_DRIVERKIT

    // must be initialized after dispatch
    _libtrace_init();
    _libSystem_ktrace_init_func(LIBTRACE);

#if !TARGET_OS_DRIVERKIT
#if defined(HAVE_SYSTEM_SECINIT)
    _libsecinit_initializer();
    _libSystem_ktrace_init_func(SECINIT);
#endif

#if defined(HAVE_SYSTEM_CONTAINERMANAGER)
    _container_init(apple);
    _libSystem_ktrace_init_func(CONTAINERMGR);
#endif

    __libdarwin_init();
    _libSystem_ktrace_init_func(DARWIN);
#endif // !TARGET_OS_DRIVERKIT

    __stack_logging_early_finished(&malloc_funcs);
    .....
}
- The backtrace shows that libSystem_initializer calls libdispatch_init, whose source lives in the open-source libdispatch project (libdispatch-1271.120.2). Search libdispatch for libdispatch_init:

DISPATCH_EXPORT DISPATCH_NOTHROW
void
libdispatch_init(void)
{
    dispatch_assert(sizeof(struct dispatch_apply_s) <=
            DISPATCH_CONTINUATION_SIZE);

    if (_dispatch_getenv_bool("LIBDISPATCH_STRICT", false)) {
        _dispatch_mode |= DISPATCH_MODE_STRICT;
    }

#if DISPATCH_DEBUG || DISPATCH_PROFILE
#if DISPATCH_USE_KEVENT_WORKQUEUE
    if (getenv("LIBDISPATCH_DISABLE_KEVENT_WQ")) {
        _dispatch_kevent_workqueue_enabled = false;
    }
#endif
#endif

#if HAVE_PTHREAD_WORKQUEUE_QOS
    dispatch_qos_t qos = _dispatch_qos_from_qos_class(qos_class_main());
    _dispatch_main_q.dq_priority = _dispatch_priority_make(qos, 0);
#if DISPATCH_DEBUG
    if (!getenv("LIBDISPATCH_DISABLE_SET_QOS")) {
        _dispatch_set_qos_class_enabled = 1;
    }
#endif
#endif

#if DISPATCH_USE_THREAD_LOCAL_STORAGE
    _dispatch_thread_key_create(&__dispatch_tsd_key, _libdispatch_tsd_cleanup);
#else
    _dispatch_thread_key_create(&dispatch_priority_key, NULL);
    _dispatch_thread_key_create(&dispatch_r2k_key, NULL);
    _dispatch_thread_key_create(&dispatch_queue_key, _dispatch_queue_cleanup);
    _dispatch_thread_key_create(&dispatch_frame_key, _dispatch_frame_cleanup);
    _dispatch_thread_key_create(&dispatch_cache_key, _dispatch_cache_cleanup);
    _dispatch_thread_key_create(&dispatch_context_key, _dispatch_context_cleanup);
    _dispatch_thread_key_create(&dispatch_pthread_root_queue_observer_hooks_key,
            NULL);
    _dispatch_thread_key_create(&dispatch_basepri_key, NULL);
#if DISPATCH_INTROSPECTION
    _dispatch_thread_key_create(&dispatch_introspection_key , NULL);
#elif DISPATCH_PERF_MON
    _dispatch_thread_key_create(&dispatch_bcounter_key, NULL);
#endif
    _dispatch_thread_key_create(&dispatch_wlh_key, _dispatch_wlh_cleanup);
    _dispatch_thread_key_create(&dispatch_voucher_key, _voucher_thread_cleanup);
    _dispatch_thread_key_create(&dispatch_deferred_items_key,
            _dispatch_deferred_items_cleanup);
#endif
    pthread_key_create(&_os_workgroup_key, _os_workgroup_tsd_cleanup);
#if DISPATCH_USE_RESOLVERS // rdar://problem/8541707
    _dispatch_main_q.do_targetq = _dispatch_get_default_queue(true);
#endif

    _dispatch_queue_set_current(&_dispatch_main_q);
    _dispatch_queue_set_bound_thread(&_dispatch_main_q);

#if DISPATCH_USE_PTHREAD_ATFORK
    (void)dispatch_assume_zero(pthread_atfork(dispatch_atfork_prepare,
            dispatch_atfork_parent, dispatch_atfork_child));
#endif
    _dispatch_hw_config_init();
    _dispatch_time_init();
    _dispatch_vtable_init();
    _os_object_init(); // key line
    _voucher_init();
    _dispatch_introspection_init();
}
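- Step into the source of _os_object_init: it calls _objc_init. Combined with the analysis above, this closes the loop: during initialization _objc_init registers with _dyld_objc_notify_register, whose second argument is load_images; dyld stores that argument (sNotifyObjCInit = argument 2); and notifySingle later calls sNotifyObjCInit().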

    void
    _os_object_init(void)
    {
        _objc_init(); // key point
        Block_callbacks_RR callbacks = {
            sizeof(Block_callbacks_RR),
            (void (*)(const void *))&objc_retain,
            (void (*)(const void *))&objc_release,
            (void (*)(const void *))&_os_objc_destructInstance
        };
        _Block_use_RR2(&callbacks);
    #if DISPATCH_COCOA_COMPAT
        const char *v = getenv("OBJC_DEBUG_MISSING_POOLS");
        if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
        v = getenv("DISPATCH_DEBUG_MISSING_POOLS");
        if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
        v = getenv("LIBDISPATCH_DEBUG_MISSING_POOLS");
        if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    #endif
    }

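A simple way to picture this is as a notification mechanism: the call to _dyld_objc_notify_register inside _objc_init registers the observer (addObserver), notifySingle posts the notification (post), and sNotifyObjCInit is the handler that runs in response (the selector).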
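The register-then-call-back relationship can be sketched in plain C. This is only a toy model: toy_register, toy_notify_single, and toy_load_images are hypothetical names standing in for the registration call, dyld's notify step, and load_images, and none of it is dyld's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: a handler that receives an image path. */
typedef void (*toy_init_callback)(const char *path);

/* Plays the role of the stored callback (sNotifyObjCInit in the text). */
static toy_init_callback s_toy_notify_init = NULL;

/* Bookkeeping so the effect of the handler can be observed. */
static int g_toy_calls = 0;
static const char *g_toy_last_path = NULL;

/* Registration side: remember the handler (the "addObserver" step). */
void toy_register(toy_init_callback cb)
{
    s_toy_notify_init = cb;
}

/* Notify side: call the handler if one was registered (the "post" step). */
void toy_notify_single(const char *path)
{
    if (s_toy_notify_init != NULL) {
        s_toy_notify_init(path);
    }
}

/* Handler side: stands in for load_images. */
void toy_load_images(const char *path)
{
    g_toy_calls++;
    g_toy_last_path = path;
}

int toy_call_count(void) { return g_toy_calls; }
const char *toy_last_path(void) { return g_toy_last_path; }
```

Calling toy_notify_single before toy_register does nothing, which mirrors why the registration has to happen early, during libSystem's initializer, before the remaining images are initialized.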
[Summary] The call chain that leads to _objc_init: _dyld_start --> dyldbootstrap::start --> dyld::_main --> dyld::initializeMainExecutable --> ImageLoader::runInitializers --> ImageLoader::processInitializers --> ImageLoader::recursiveInitialization --> doInitialization --> libSystem_initializer (libSystem.B.dylib) --> libdispatch_init --> _os_object_init (libdispatch.dylib) --> _objc_init (libobjc.A.dylib)
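The recursive step in this chain initializes an image's dependencies before the image itself, which is why libSystem runs its initializer first. Here is a minimal sketch of that ordering, assuming a simplified image model: struct toy_image and toy_recursive_init are hypothetical and far simpler than ImageLoader::recursiveInitialization.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPS 4
#define TOY_MAX_LOG  8

/* Hypothetical, simplified stand-in for a loaded image. */
struct toy_image {
    const char *name;
    struct toy_image *deps[TOY_MAX_DEPS];
    int dep_count;
    int initialized;
};

/* Records the order in which images finish initializing. */
static const char *g_toy_init_log[TOY_MAX_LOG];
static int g_toy_init_count = 0;

/* Depth-first: initialize every dependency, then the image itself. */
void toy_recursive_init(struct toy_image *img)
{
    if (img->initialized) {
        return; /* each image is initialized at most once */
    }
    img->initialized = 1; /* mark first so dependency cycles terminate */
    for (int i = 0; i < img->dep_count; i++) {
        toy_recursive_init(img->deps[i]);
    }
    if (g_toy_init_count < TOY_MAX_LOG) {
        g_toy_init_log[g_toy_init_count++] = img->name;
    }
}

int toy_init_count(void) { return g_toy_init_count; }
const char *toy_init_name(int i) { return g_toy_init_log[i]; }
```

With an app that depends on libobjc and libSystem, and libobjc depending on libSystem, calling toy_recursive_init on the app records libSystem first and the app last.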
3.2.3 Step 9 in detail: finding the main entry function
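- Debugging at the assembly level, execution first arrives in the +[ViewController load] method.
- Continuing, it reaches the C++ function ypyFunc.
- Click Step Over and keep going. Once the whole flow has run, execution returns to _dyld_start, which then calls main(); the assembly there sets up main's arguments and related state.
Figure: assembly debugging, back in _dyld_start
dyld assembly source (note the comment "LC_MAIN case, set up stack for call to main()"):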

    #if __arm64__ && !TARGET_OS_SIMULATOR
        .text
        .align 2
        .globl __dyld_start
    __dyld_start:
        mov     x28, sp
        and     sp, x28, #~15           // force 16-byte alignment of stack
        mov     x0, #0
        mov     x1, #0
        stp     x1, x0, [sp, #-16]!     // make aligned terminating frame
        mov     fp, sp                  // set up fp to point to terminating frame
        sub     sp, sp, #16             // make room for local variables
    #if __LP64__
        ldr     x0, [x28]               // get app's mh into x0
        ldr     x1, [x28, #8]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
        add     x2, x28, #16            // get argv into x2
    #else
        ldr     w0, [x28]               // get app's mh into x0
        ldr     w1, [x28, #4]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
        add     w2, w28, #8             // get argv into x2
    #endif
        adrp    x3,___dso_handle@page
        add     x3,x3,___dso_handle@pageoff // get dyld's mh in to x4
        mov     x4,sp                   // x5 has &startGlue

        // call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
        bl      __ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
        mov     x16,x0                  // save entry point address in x16
    #if __LP64__
        ldr     x1, [sp]
    #else
        ldr     w1, [sp]
    #endif
        cmp     x1, #0
        b.ne    Lnew

        // LC_UNIXTHREAD way, clean up stack and jump to result
    #if __LP64__
        add     sp, x28, #8             // restore unaligned stack pointer without app mh
    #else
        add     sp, x28, #4             // restore unaligned stack pointer without app mh
    #endif
    #if __arm64e__
        braaz   x16                     // jump to the program's entry point
    #else
        br      x16                     // jump to the program's entry point
    #endif

        // LC_MAIN case, set up stack for call to main()
    Lnew:
        mov     lr, x1                  // simulate return address into _start in libdyld.dylib
    #if __LP64__
        ldr     x0, [x28, #8]           // main param1 = argc
        add     x1, x28, #16            // main param2 = argv
        add     x2, x1, x0, lsl #3
        add     x2, x2, #8              // main param3 = &env[0]
        mov     x3, x2
    Lapple:
        ldr     x4, [x3]
        add     x3, x3, #8
    #else
        ldr     w0, [x28, #4]           // main param1 = argc
        add     x1, x28, #8             // main param2 = argv
        add     x2, x1, x0, lsl #2
        add     x2, x2, #4              // main param3 = &env[0]
        mov     x3, x2
    Lapple:
        ldr     w4, [x3]
        add     x3, x3, #4
    #endif
        cmp     x4, #0
        b.ne    Lapple                  // main param4 = apple
    #if __arm64e__
        braaz   x16
    #else
        br      x16
    #endif

    #endif // __arm64__ && !TARGET_OS_SIMULATOR

Figure: the part of dyld's assembly source that sets up the call to main
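The pointer arithmetic in the Lnew branch is easier to follow in C. The sketch below mirrors it under one assumption: the kernel-provided stack is modeled as an array of pointer-sized slots laid out as [app mh, argc, argv..., NULL, envp..., NULL, apple..., NULL]. struct main_args and locate_main_args are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The four values that Lnew places in x0..x3 before jumping to main. */
struct main_args {
    uintptr_t argc;          /* main param1 */
    const uintptr_t *argv;   /* main param2 */
    const uintptr_t *envp;   /* main param3 = &env[0] */
    const uintptr_t *apple;  /* main param4 */
};

/* stack points at the slot holding the app's mach_header (x28 in the assembly). */
struct main_args locate_main_args(const uintptr_t *stack)
{
    struct main_args out;

    assert(stack != NULL);
    out.argc = stack[1];                 /* ldr x0, [x28, #8]  */
    out.argv = &stack[2];                /* add x1, x28, #16   */
    out.envp = out.argv + out.argc + 1;  /* skip argv[] and its NULL terminator */

    /* Lapple loop: walk envp until its NULL terminator is consumed. */
    const uintptr_t *cursor = out.envp;
    while (*cursor != 0) {
        cursor++;
    }
    out.apple = cursor + 1;              /* first slot after envp's NULL */
    return out;
}
```

Using pointer-sized slots covers both branches of the assembly: the slot width is 8 bytes in the __LP64__ case and 4 bytes otherwise, which matches the `lsl #3` and `lsl #2` shifts.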
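Note: main is a fixed, agreed-upon name. Its location is recorded in the Mach-O file (the LC_MAIN load command) and dyld reads it from there, so renaming the main function causes an error.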
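To sum up, the final dyld loading flow is shown in the figure below. The figure also answers the question raised earlier: why the call order is load --> Cxx --> main.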
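Part of this ordering can be observed without a debugger. The sketch below covers only the "Cxx before main" half, using the `__attribute__((constructor))` extension supported by Clang and GCC; +load needs the Objective-C runtime and is not modeled here. record_event, before_main_hook, and the other names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_EVENTS 8

/* Simple in-memory log of the order in which things ran. */
static const char *g_events[MAX_EVENTS];
static int g_event_count = 0;

void record_event(const char *name)
{
    if (g_event_count < MAX_EVENTS) {
        g_events[g_event_count++] = name;
    }
}

/* Runs as an image initializer, i.e. before main is entered. */
__attribute__((constructor))
static void before_main_hook(void)
{
    record_event("constructor");
}

int event_count(void) { return g_event_count; }

/* Returns 1 if the i-th recorded event has the given name. */
int event_is(int i, const char *name)
{
    return i >= 0 && i < g_event_count && strcmp(g_events[i], name) == 0;
}
```

If main begins by calling record_event("main"), the log reads constructor first and main second, because the initializer has already run by the time dyld jumps to the entry point.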