The dyld loading flow

dyld is the dynamic linker. How does it load the main program and the dynamic libraries the program uses, and when are +load methods called?

_dyld_start: the dyld entry point

First, create a new iOS project named dyldDemo, override +(void)load in ViewController.m, and set a breakpoint inside it. When the breakpoint hits, use the bt command to print the call stack:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x000000010e58ce4c dyldDemo`+[ViewController load](self=ViewController, _cmd="load") at ViewController.m:23:1
    frame #1: 0x00007fff201804e3 libobjc.A.dylib`load_images + 1442
    frame #2: 0x000000010e5a0e54 dyld_sim`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
    frame #3: 0x000000010e5af887 dyld_sim`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 437
    frame #4: 0x000000010e5adbb0 dyld_sim`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 188
    frame #5: 0x000000010e5adc50 dyld_sim`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
    frame #6: 0x000000010e5a12a9 dyld_sim`dyld::initializeMainExecutable() + 199
    frame #7: 0x000000010e5a5d50 dyld_sim`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 4431
    frame #8: 0x000000010e5a01c7 dyld_sim`start_sim + 122
    frame #9: 0x000000010f84857a dyld`dyld::useSimulatorDyld(int, macho_header const*, char const*, int, char const**, char const**, char const**, unsigned long*, unsigned long*) + 2093
    frame #10: 0x000000010f845df3 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 1199
    frame #11: 0x000000010f84022b dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 457
    frame #12: 0x000000010f840025 dyld`_dyld_start + 37

The trace shows that dyld begins at _dyld_start, which calls dyldbootstrap::start. The relevant part of start is:

    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);
    rebaseDyld(dyldsMachHeader); // 1
	const char** apple = envp;
	while(*apple != NULL) { ++apple; }
	++apple;

	// set up random value for stack canary
	__guard_setup(apple); // 2

#if DYLD_INITIALIZER_SUPPORT
	// run all C++ initializers inside dyld
	runDyldInitializers(argc, argv, envp, apple);
#endif

	_subsystem_init(apple);

	// now that we are done bootstrapping dyld, call dyld's main
	uintptr_t appsSlide = appsMachHeader->getSlide();
	return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
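  • 1, Rebase dyld itself. At launch the process is given a random ASLR offset (the slide), and dyld is loaded at a slid address like any other image, so it first has to fix up its own internal pointers.
  • 2, Protect the stack: __guard_setup sets up the random value for the stack canary.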
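The pointer walk in start (skip envp, then step over its NULL terminator) only works because a second string array, apple, sits directly after the environment in memory at launch. Here is a minimal sketch of that walk in portable C. The names find_apple and demo_vector are hypothetical, and the strings in demo_vector are made-up stand-ins for the real launch-time layout.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the loop in dyldbootstrap::start:
 * skip every environment string, then step over the NULL terminator.
 * Only valid when a second NULL-terminated array really follows envp. */
const char **find_apple(const char **envp)
{
    const char **apple = envp;
    while (*apple != NULL) {
        ++apple;            /* walk to the end of the environment */
    }
    ++apple;                /* step past the NULL terminator */
    return apple;
}

/* Stand-in layout: envp strings, NULL, apple strings, NULL.
 * The values are invented for illustration. */
const char *demo_vector[] = {
    "HOME=/tmp", "LANG=C", NULL,
    "executable_path=/bin/demo", "stack_guard=0x1234", NULL
};
```

Calling find_apple(demo_vector) returns a pointer to the fourth element, the first entry after the environment's terminator. Applying the same walk to an ordinary envp with nothing behind it would read out of bounds, which is why the sketch uses a hand-built array.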

Configuring the main executable's environment

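dyld::_main starts by setting up state for the main executable: mainExecutableCDHash (the code-directory hash of the main executable, not a hash of the main function), sMainExecutableMachHeader (the Mach-O header), and sMainExecutableSlide (the main executable's ASLR slide). At this point dyld knows both the address of the main executable's Mach-O header and its slide.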
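To make the slide concrete: it is the difference between where an image was actually loaded and where it was linked to load, and adding it to a link-time address gives the runtime address. The sketch below shows just that arithmetic. The function names and the addresses in the usage note are invented for illustration and are not taken from dyld.

```c
#include <stdint.h>

/* slide = actual load address - preferred (link-time) load address */
uint64_t compute_slide(uint64_t actual_load_address, uint64_t preferred_load_address)
{
    return actual_load_address - preferred_load_address;
}

/* Translate an address recorded at link time into its runtime address. */
uint64_t runtime_address(uint64_t link_time_address, uint64_t slide)
{
    return link_time_address + slide;
}
```

With an image linked at 0x100000000 but loaded at 0x104a8c000, compute_slide gives 0x4a8c000, and a function linked at 0x100003f20 ends up at 0x104a8ff20.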

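After the environment variables are processed, the settings are stored in the ImageLoader::LinkContext context.
If DYLD_PRINT_OPTS or DYLD_PRINT_ENV is set (here to 1), dyld prints the launch arguments or the environment variables, respectively. They are configured as shown in the figure below:

[Figure: 环境变量.png]
The output looks like this: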

opt[0] = "/Users/bel/Library/Developer/CoreSimulator/Devices/B428260A-1DE5-4090-A8E9-52E8EE6F2F2E/data/Containers/Bundle/Application/E364F774-F594-4E77-8622-DE78B7742C0D/dyldDemo.app/dyldDemo"
IOS_SIMULATOR_SYSLOG_SOCKET=/tmp/com.apple.CoreSimulator.SimDevice.B428260A-1DE5-4090-A8E9-52E8EE6F2F2E/syslogsock
SIMULATOR_SHARED_RESOURCES_DIRECTORY=/Users/bel/Library/Developer/CoreSimulator/Devices/B428260A-1DE5-4090-A8E9-52E8EE6F2F2E/data
XPC_SIMULATOR_LAUNCHD_NAME=com.apple.CoreSimulator.SimDevice.B428260A-1DE5-4090-A8E9-52E8EE6F2F2E
DYLD_ROOT_PATH=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot
esources/capabilities.plist
SIMULATOR_FRAMEBUFFER_FRAMEWORK=/Library/Developer/PrivateFrameworks/CoreSimulator.framework/Resources/Platforms/iphoneos/Library/PrivateFrameworks/SimFramebuffer.framework/SimFramebuffer
DYLD_LIBRARY_PATH=/Users/bel/Library/Developer/Xcode/DerivedData/dyldDemo-aekwxanrcjrzwagsdjsahkghypfj/Build/Products/Debug-iphonesimulator:/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/system/introspection
....
DYLD_PRINT_OPTS=1
TESTMANAGERD_SIM_SOCK=/private/tmp/com.apple.launchd.j1MbmWtF0N/com.apple.testmanagerd.unix-domain.socket
DYLD_PRINT_ENV=1
XPC_FLAGS=1
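The pattern behind this output is simple: check whether a variable is present in the environment, and if so, dump the arguments in the opt[i] = "..." form seen in the log. The following is a rough sketch of that pattern in plain C, not dyld's own code. format_opt and print_opts_if_set are hypothetical names, and the sketch assumes that the mere presence of the variable is what enables printing.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format one argument in the style of the log above: opt[i] = "value".
 * Returns the number of characters snprintf would have written. */
int format_opt(char *buf, size_t size, int index, const char *arg)
{
    return snprintf(buf, size, "opt[%d] = \"%s\"", index, arg);
}

/* Print every argument, but only when the named variable is present in the
 * environment. Returns the number of lines printed. */
int print_opts_if_set(const char *var_name, int argc, const char *const argv[])
{
    char line[1024];
    int printed = 0;

    if (getenv(var_name) == NULL) {
        return 0;               /* variable not set: stay silent */
    }
    for (int i = 0; i < argc; ++i) {
        format_opt(line, sizeof line, i, argv[i]);
        puts(line);
        ++printed;
    }
    return printed;
}
```

A program could call print_opts_if_set("DYLD_PRINT_OPTS", argc, argv) at the top of main to imitate the first line of the log.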

Loading the shared cache

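Once the main executable's header has been read and the environment configured, dyld loads the shared cache. On iOS the shared cache is mandatory: system libraries such as UIKit and Foundation are stored in it.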

static void mapSharedCache(uintptr_t mainExecutableSlide) {
    ...
    loadDyldCache(opts, &sSharedCacheLoadInfo);
    ...
}

bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    ......
    if ( options.forcePrivate ) {
        // mmap cache into this process only
        return mapCachePrivate(options, results);
    }
    else {
        // fast path: when cache is already mapped into shared region
        bool hasError = false;
        if ( reuseExistingCache(options, results) ) { // 1
            hasError = (results->errorMessage != nullptr); 
        } else {
            // slow path: this is first process to load cache
            hasError = mapCacheSystemWide(options, results); // 2
        }
        return hasError;
    }
}
....
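  • 1, Fast path: if the cache is already mapped into the shared region, this process simply reuses it.
  • 2, Slow path: if this is the first process to load the cache, it maps the cache system-wide so later processes can reuse it.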

So the shared cache is mapped first, and the dynamic libraries the program depends on are loaded after it.
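The fast-path and slow-path split can be modelled in a few lines. This is a toy sketch, assuming a single in-memory slot stands in for the shared region. Every type and function name here is hypothetical, and nothing is actually mapped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared region: one slot visible to every caller. */
struct shared_region {
    const void *cache;      /* NULL until some caller maps the cache */
    int map_count;          /* how many times the slow path ran */
};

struct load_result {
    const void *cache;
    bool reused;            /* true when the fast path was taken */
};

static const char fake_cache_bytes[] = "dyld cache placeholder";

/* Fast path: succeed only if the cache is already mapped. */
bool reuse_existing(const struct shared_region *region, struct load_result *out)
{
    if (region->cache == NULL) return false;
    out->cache = region->cache;
    out->reused = true;
    return true;
}

/* Slow path: the first caller maps the cache for everyone. */
void map_system_wide(struct shared_region *region, struct load_result *out)
{
    region->cache = fake_cache_bytes;
    region->map_count++;
    out->cache = region->cache;
    out->reused = false;
}

/* Try the fast path first and fall back to the slow path. */
void load_cache(struct shared_region *region, struct load_result *out)
{
    if (!reuse_existing(region, out)) {
        map_system_wide(region, out);
    }
}
```

Calling load_cache twice on the same region takes the slow path once and the fast path once, and both callers end up with the same cache pointer.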

Instantiating the main executable

Reading further down:

// instantiate ImageLoader for main executable
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);

static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
	// try mach-o loader
//	if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
		ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
		addImage(image);
		return (ImageLoaderMachO*)image;
//	}
	
//	throw "main executable not a known format";
}

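This is where the main executable object is instantiated. dyld passes in the address of the main executable's header, its slide (the ASLR value), and the path of the Mach-O file, and gets back an ImageLoader image. The first image loaded this way is the main executable itself.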

After the libraries are loaded, code-signature checks take place.

Linking the main executable and the dynamic libraries

Once all libraries are loaded, dyld starts linking them:

link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);

The work is done by ImageLoader::link:

void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly, bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath)
{
	//dyld::log("ImageLoader::link(%s) refCount=%d, neverUnload=%d\n", imagePath, fDlopenReferenceCount, fNeverUnload);
	
	// clear error strings
	(*context.setErrorStrings)(0, NULL, NULL, NULL);
	// start timestamp, used to measure elapsed time
	uint64_t t0 = mach_absolute_time();
	// recursively load the libraries this image depends on, then send a notification
	this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);
	context.notifyBatch(dyld_image_state_dependents_mapped, preflightOnly);

	// we only do the loading step for preflights
	if ( preflightOnly )
		return;

	uint64_t t1 = mach_absolute_time();
	context.clearAllDepths();
	this->updateDepth(context.imageCount());

	__block uint64_t t2, t3, t4, t5;
	{
		dyld3::ScopedTimer(DBG_DYLD_TIMING_APPLY_FIXUPS, 0, 0, 0);
		t2 = mach_absolute_time();
		// rebase: adjust pointers for the ASLR slide
		this->recursiveRebaseWithAccounting(context);
		context.notifyBatch(dyld_image_state_rebased, false);

		t3 = mach_absolute_time();
		if ( !context.linkingMainExecutable )
			// bind non-lazy symbols
			this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);

		t4 = mach_absolute_time();
		if ( !context.linkingMainExecutable )
			// bind weak symbols
			this->weakBind(context);
		t5 = mach_absolute_time();
	}

	// interpose any dynamically loaded images
	if ( !context.linkingMainExecutable && (fgInterposingTuples.size() != 0) ) {
		dyld3::ScopedTimer timer(DBG_DYLD_TIMING_APPLY_INTERPOSING, 0, 0, 0);
		// recursively apply interposing
		this->recursiveApplyInterposing(context);
	}

	// now that all fixups are done, make __DATA_CONST segments read-only
	if ( !context.linkingMainExecutable )
		this->recursiveMakeDataReadOnly(context);

    if ( !context.linkingMainExecutable )
        context.notifyBatch(dyld_image_state_bound, false);
	uint64_t t6 = mach_absolute_time();

	if ( context.registerDOFs != NULL ) {
		std::vector<DOFInfo> dofs;
		this->recursiveGetDOFSections(context, dofs);
		// register the DOF (DTrace) sections
		context.registerDOFs(dofs);
	}
	// end timestamp
	uint64_t t7 = mach_absolute_time();

	// clear error strings
	(*context.setErrorStrings)(0, NULL, NULL, NULL);

	fgTotalLoadLibrariesTime += t1 - t0;
	fgTotalRebaseTime += t3 - t2;
	fgTotalBindTime += t4 - t3;
	fgTotalWeakBindTime += t5 - t4;
	fgTotalDOF += t7 - t6;
	
	// done with initial dylib loads
	fgNextPIEDylibAddress = 0;
}

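The method records a lot of timestamps. If the DYLD_PRINT_STATISTICS environment variable is set for the project, dyld prints how long each phase of linking the dynamic libraries took, and those durations are accumulated here, during the call to ImageLoader::link.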

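  • 1, Record a start timestamp.
  • 2, Rebase: adjust pointers for the ASLR slide.
  • 3, Bind symbols: non-lazy symbols first, then weak symbols. Lazy symbols are bound on first use, not at launch.
  • 4, Register the DOF sections.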
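Two of the steps above are easy to picture in code: rebasing adds the slide to every stored pointer, and lazy binding postpones a symbol lookup until the first call. The sketch below models both in portable C. All names are hypothetical, and real dyld reads its fixup locations from the Mach-O file rather than from a plain index array.

```c
#include <stddef.h>
#include <stdint.h>

/* Rebase: add the slide to every slot listed in the fixup table.
 * slots:  the image's data, viewed as 64-bit words
 * fixups: indices of the words that hold link-time addresses */
void rebase_slots(uint64_t *slots, const size_t *fixups, size_t fixup_count, uint64_t slide)
{
    for (size_t i = 0; i < fixup_count; ++i) {
        slots[fixups[i]] += slide;
    }
}

/* Lazy binding: resolve the target on the first call, then reuse it. */
typedef int (*int_fn)(int);

int add_one(int x) { return x + 1; }

struct lazy_symbol {
    int_fn resolved;        /* NULL until first use */
    int lookups;            /* how many times the lookup ran */
};

int call_lazily(struct lazy_symbol *sym, int arg)
{
    if (sym->resolved == NULL) {
        sym->resolved = add_one;    /* stand-in for a real symbol lookup */
        sym->lookups++;
    }
    return sym->resolved(arg);
}
```

Slots not named in the fixup table are left alone, which is why plain data survives a rebase unchanged. With the lazy symbol, the lookup cost is paid only by the first caller.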

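After these steps the dynamic libraries and the main executable have been loaded, linked, and registered. What remains is to run the initializers and then call main.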

Initializing the main executable

dyld load callbacks

Initialization of the main executable starts in initializeMainExecutable:

void initializeMainExecutable()
{
	gLinkContext.startedInitializingMainExecutable = true;
	ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
	initializerTimes[0].count = 0;
	const size_t rootCount = sImageRoots.size();
	if ( rootCount > 1 ) {
		for(size_t i=1; i < rootCount; ++i) {
			sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
		}
	}
	sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
	if ( gLibSystemHelpers != NULL ) 
		(*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);
	if ( sEnv.DYLD_PRINT_STATISTICS )
		ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
	if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
		ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}

At the start of the article we printed the call stack leading up to +load:

* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x000000010e58ce4c dyldDemo`+[ViewController load](self=ViewController, _cmd="load") at ViewController.m:23:1
    frame #1: 0x00007fff201804e3 libobjc.A.dylib`load_images + 1442
    frame #2: 0x000000010e5a0e54 dyld_sim`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
    frame #3: 0x000000010e5af887 dyld_sim`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 437
    frame #4: 0x000000010e5adbb0 dyld_sim`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 188
    frame #5: 0x000000010e5adc50 dyld_sim`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
    frame #6: 0x000000010e5a12a9 dyld_sim`dyld::initializeMainExecutable() + 199

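Following the trace, initializeMainExecutable() calls runInitializers, which goes through processInitializers and recursiveInitialization to reach dyld::notifySingle. Inside notifySingle the callback (*sNotifyObjCInit)(image->getRealPath(), image->machHeader()); is invoked, passing out the image's path and its Mach-O header. A global search shows that sNotifyObjCInit is assigned through _dyld_objc_notify_register, but nothing inside dyld calls _dyld_objc_notify_register. So we go back to where the exploration began: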

[Figure: load_images.png]
load_images is a function in libobjc. The objc4 source shows where it is handed to dyld:

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
#if __OBJC2__
    cache_t::init();
#endif
    _imp_implementationWithBlock_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}

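_objc_init calls _dyld_objc_notify_register and passes load_images as the init callback, the one dyld stores in sNotifyObjCInit. In other words, the Objective-C runtime registers its dyld image callbacks in _objc_init.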
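The relationship between the two libraries is a plain callback registration: one side stores a function pointer, the other side calls through it later. Below is a reduced sketch of that pattern with a single callback instead of the three that _dyld_objc_notify_register takes. Every name in the sketch is hypothetical and only the shape matches the real code.

```c
#include <stddef.h>

/* Callback type: receives the image path. */
typedef void (*image_init_callback)(const char *path);

/* Plays the role that sNotifyObjCInit plays inside dyld. */
static image_init_callback registered_init = NULL;

/* Runtime side: called once while the runtime initializes itself. */
void register_image_init(image_init_callback callback)
{
    registered_init = callback;
}

/* Loader side: notify whoever registered, if anyone did.
 * Returns 1 when a callback ran and 0 otherwise. */
int notify_image_initializing(const char *path)
{
    if (registered_init == NULL) return 0;
    registered_init(path);
    return 1;
}

/* Example callback that records what it was told. */
static const char *last_path = NULL;
static int callback_calls = 0;

void record_image(const char *path)
{
    last_path = path;
    callback_calls++;
}

int callback_call_count(void) { return callback_calls; }
const char *last_notified_path(void) { return last_path; }
```

Before register_image_init(record_image) runs, notify_image_initializing does nothing. Afterwards every notification reaches record_image. This also explains why the loader never needs to know about +load: it only knows that some callback wants to hear about each image.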

Calling the +load methods

Here is the implementation of load_images:

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        didInitialAttachCategories = true;
        loadAllCategories();
    }

    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods(); // 1
}
  • 1, Call the pending +load methods in order: classes first (a superclass before its subclasses), then categories.
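The ordering rule can be sketched without any Objective-C: keep one queue for classes and one for categories, queue a superclass before its subclass, and drain the class queue before the category queue. The C model below is a simplified illustration with hypothetical names. It assumes fixed-size queues and leaves out the locking and re-entrancy that the real runtime handles.

```c
#include <stddef.h>

#define MAX_PENDING 8

struct klass {
    const char *name;
    struct klass *superclass;   /* NULL for a root class */
    int scheduled;              /* nonzero once queued, so each class runs once */
};

struct pending_loads {
    const char *classes[MAX_PENDING];
    size_t class_count;
    const char *categories[MAX_PENDING];
    size_t category_count;
};

/* Queue a class after its superclass, and only once. */
void schedule_class(struct pending_loads *p, struct klass *cls)
{
    if (cls == NULL || cls->scheduled) return;
    schedule_class(p, cls->superclass);         /* superclass first */
    cls->scheduled = 1;
    if (p->class_count < MAX_PENDING) p->classes[p->class_count++] = cls->name;
}

/* Categories go on their own queue. */
void schedule_category(struct pending_loads *p, const char *name)
{
    if (p->category_count < MAX_PENDING) p->categories[p->category_count++] = name;
}

/* Writes the call order into `order` and returns how many entries were written:
 * all classes first, then all categories. */
size_t run_loads(const struct pending_loads *p, const char **order, size_t capacity)
{
    size_t n = 0;
    for (size_t i = 0; i < p->class_count && n < capacity; ++i) order[n++] = p->classes[i];
    for (size_t i = 0; i < p->category_count && n < capacity; ++i) order[n++] = p->categories[i];
    return n;
}
```

Even if a category is scheduled before any class, and a subclass before its superclass, run_loads still reports the superclass first, then the subclass, then the category.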

Calling the global C++ constructors

Back in the dyld project, we keep reading:

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
										  InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
	recursive_lock lock_info(this_thread);
	recursiveSpinLock(lock_info);

	if ( fState < dyld_image_state_dependents_initialized-1 ) {
		uint8_t oldState = fState;
		// break cycles
		fState = dyld_image_state_dependents_initialized-1;
		try {
			// initialize lower level libraries first
			for(unsigned int i=0; i < libraryCount(); ++i) {
				ImageLoader* dependentImage = libImage(i);
				if ( dependentImage != NULL ) {
					// don't try to initialize stuff "above" me yet
					if ( libIsUpward(i) ) {
						uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
						uninitUps.count++;
					}
					else if ( dependentImage->fDepth >= fDepth ) {
						dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
					}
                }
			}
			
			// record termination order
			if ( this->needsTermination() )
				context.terminationRecorder(this);

			// let objc know we are about to initialize this image
			uint64_t t1 = mach_absolute_time();
			fState = dyld_image_state_dependents_initialized;
			oldState = fState;
			context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
			bool hasInitializers = this->doInitialization(context);
			fState = dyld_image_state_initialized;
			oldState = fState;
			context.notifySingle(dyld_image_state_initialized, this, NULL);			
			if ( hasInitializers ) {
				uint64_t t2 = mach_absolute_time();
				timingInfo.addTime(this->getShortName(), t2-t1);
			}
		}
		catch (const char* msg) {
			// this image is not initialized
			fState = oldState;
			recursiveSpinUnLock();
			throw;
		}
	}	
	recursiveSpinUnLock();
}
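The core idea of recursiveInitialization is a depth-first walk: mark the image as in progress to break cycles, initialize its dependencies, then initialize the image itself. The following stripped-down sketch keeps only that idea. The names are hypothetical, and upward dependencies, depth checks, locking, and error handling are all left out.

```c
#include <stddef.h>

#define MAX_DEPS 4
#define MAX_ORDER 16

enum image_state { STATE_BOUND, STATE_INITIALIZING, STATE_INITIALIZED };

struct image {
    const char *name;
    enum image_state state;
    struct image *deps[MAX_DEPS];
    size_t dep_count;
};

struct init_log {
    const char *names[MAX_ORDER];   /* images in the order they were initialized */
    size_t count;
};

void recursive_init(struct image *img, struct init_log *log)
{
    if (img->state != STATE_BOUND) return;      /* already done or in progress */
    img->state = STATE_INITIALIZING;            /* break cycles */

    for (size_t i = 0; i < img->dep_count; ++i) {
        recursive_init(img->deps[i], log);      /* lower-level libraries first */
    }

    /* Stand-in for the notify + doInitialization step of the real code. */
    if (log->count < MAX_ORDER) log->names[log->count++] = img->name;
    img->state = STATE_INITIALIZED;
}
```

For an app that depends on Foundation and libSystem, with Foundation depending on libSystem, the log comes out as libSystem, Foundation, app. An artificial cycle back to the app does not cause infinite recursion, because the app is already marked as in progress.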

After notifySingle, recursiveInitialization calls doInitialization. Its implementation:

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
	CRSetCrashLogMessage2(this->getPath());

	// mach-o has -init and static initializers
	doImageInit(context);
	doModInitFunctions(context); //1
	
	CRSetCrashLogMessage2(NULL);
	
	return (fHasDashInit || fHasInitializers);
}
  • 1, Run the image's global C++ constructors (its static initializers).

In main.m we use __attribute__((constructor)) to create two global constructor functions:

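#include <stdio.h>

// Both functions run before main(), as part of the image's initializers
__attribute__((constructor)) void func1(void) {
    printf("func1 is here!\n");
}

__attribute__((constructor)) void func2(void) {
    printf("func2 is here!\n");
}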

After compiling, the Mach-O file gains a __mod_init_func section:

[Figure: __mod_init_func.png]

doModInitFunctions(context) is the method that calls these global constructor functions.

So the call order is +(void)load -> C++ constructors -> main.
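The constructor half of that ordering can be checked from plain C (the +load half needs the Objective-C runtime, so it is not shown). In the sketch below two constructor functions record themselves into an array before main ever runs. The explicit priorities 101 and 102 are an addition of this sketch, used only to make the relative order deterministic. The original example uses no priorities, and all names here are hypothetical.

```c
#include <stddef.h>

#define MAX_EVENTS 4

static const char *events[MAX_EVENTS];
static size_t event_count = 0;

static void record(const char *what)
{
    if (event_count < MAX_EVENTS) events[event_count++] = what;
}

/* Priorities 0-100 are reserved for the implementation.
 * Among the rest, a lower number runs earlier. */
__attribute__((constructor(101))) void first_constructor(void)  { record("first"); }
__attribute__((constructor(102))) void second_constructor(void) { record("second"); }

/* Accessors a program can call from main() to see what already happened. */
size_t recorded_event_count(void) { return event_count; }

const char *recorded_event(size_t index)
{
    return index < event_count ? events[index] : NULL;
}
```

By the time main starts, recorded_event_count() already returns 2, which is the behaviour the article describes: the loader has run the image's initializers before handing control to main.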

Finding the main entry point

Once initialization (initializeMainExecutable) is complete, dyld looks up the address of the main function in the Mach-O file:

result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();

The address of main is assigned to result and returned.
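The LC_MAIN load command stores main as an offset (entryoff) rather than an absolute address, so the lookup comes down to adding that offset to the address where the Mach-O header was loaded. Here is a small sketch of that computation. The struct keeps only the two payload fields of the load command, and the struct name, function name, and numbers are illustrative assumptions.

```c
#include <stdint.h>

/* The two payload fields of LC_MAIN; the load-command header is omitted. */
struct entry_point_info {
    uint64_t entryoff;   /* offset of main() from the start of the image */
    uint64_t stacksize;  /* requested initial stack size, 0 for the default */
};

/* Runtime address of main = loaded header address + entryoff. */
uint64_t entry_address(uint64_t mach_header_address, const struct entry_point_info *info)
{
    return mach_header_address + info->entryoff;
}
```

With a header loaded at 0x104a8c000 and an entryoff of 0x3f20, the entry point is 0x104a8ff20. Because the header address already includes the slide, no separate ASLR adjustment is needed.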

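Summary:

From this exploration, the dyld loading flow is:

  • 1, Execution starts at _dyld_start, which enters dyldbootstrap::start.
  • 2, dyldbootstrap::start calls dyld::_main.
  • 3, Configure the environment variables and record the main executable's header and ASLR slide (dyld itself was already rebased in dyldbootstrap::start).
  • 4, Map the shared cache, which holds the system dynamic libraries.
  • 5, Instantiate the main executable, load the dynamic libraries, link them, and bind symbols.
  • 6, Initialize the main executable. Once loading and linking are complete, libobjc calls the +load methods.
  • 7, Call doModInitFunctions, which runs the global C++ constructors listed in __mod_init_func.
  • 8, Read the address of main from the Mach-O file and return it.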