This is the third post in the series: the image post.
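1. Shortcomings of the previous demo
After the previous posts it may feel as if the container is already done and only details are missing. In fact there are still quite a few problems.
- For example, running ls shows that we are still in the parent process's directory.
- The mount points are also all inherited from the parent process.
This is not how docker behaves in everyday use. What is missing is the image.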
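2. This demo: the busybox image
This demo runs a container from the busybox image. busybox bundles many unix tools into one small package, which makes it a very handy image to start with.
3. Implementing rootfs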
3.1 init.go
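This is fairly long, so let's take it step by step:
- First, all the mount steps are gathered into one function, setupMount.
- In addition to the / and proc mounts from before, setupMount also calls pivotRoot.
- pivotRoot does the following:
  - First it bind-mounts root onto itself. Here root is the root directory of the image files.
    - Why? pivot_root requires the new root to be a mount point, and bind-mounting the directory onto itself guarantees that.
  - Then it creates a directory named .put_old. The name is arbitrary; the directory is only temporary storage.
  - Then it calls syscall.PivotRoot(root, putOld). This moves the current rootfs mount to the second argument, which is the directory created in the previous step. In other words, the current rootfs mount is moved to .put_old.
    - If you enter .put_old at this point and run ls, you will see the familiar root directory of your host operating system.
    - PivotRoot also makes its first argument the new rootfs mount, and our first argument is root (the root directory of the image files).
  - It unmounts the old rootfs mount, that is, .put_old.
  - It deletes .put_old.
  - After all this, the new rootfs mount shows no trace of the old rootfs mount.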
// already in container
// initiate the container
func InitProcess() error {
	// read command from pipe; blocks until the write side is closed
	containerCmd := readCommand()
	if len(containerCmd) == 0 {
		return fmt.Errorf("init process fails, containerCmd is empty")
	}
	// setup all mount commands
	if err := setupMount(); err != nil {
		logrus.Errorf("setup mount fails: %v", err)
		return err
	}
	// look for the path of container command
	// so we don't need to type "/bin/ls", but "ls"
	commandPath, err := exec.LookPath(containerCmd[0])
	if err != nil {
		logrus.Errorf("initProcess look path fails: %v", err)
		return err
	}
	// log commandPath info
	// if you type "ls", it will be "/bin/ls"
	logrus.Infof("Find commandPath: %v", commandPath)
	// syscall.Exec only returns if it fails
	if err := syscall.Exec(commandPath, containerCmd, os.Environ()); err != nil {
		logrus.Errorf("exec %s fails: %v", commandPath, err)
		return err
	}
	return nil
}
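func readCommand() []string {
	// fd 3 is the read end of the pipe (the first entry of cmd.ExtraFiles)
	pipe := os.NewFile(uintptr(3), "pipe")
	defer pipe.Close()
	msg, err := ioutil.ReadAll(pipe)
	if err != nil {
		logrus.Errorf("read pipe fails: %v", err)
		return nil
	}
	// strings.Fields returns an empty slice for empty input,
	// whereas strings.Split would return [""]
	return strings.Fields(string(msg))
}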
// integration of all mount commands
func setupMount() error {
	// ensure that container mount and parent mount has no shared propagation
	if err := syscall.Mount("", "/", "", syscall.MS_PRIVATE|syscall.MS_REC, ""); err != nil {
		logrus.Errorf("mount / fails: %v", err)
		return err
	}
	// get current directory
	pwd, err := os.Getwd()
	if err != nil {
		return err
	}
	logrus.Infof("current location is: %v", pwd)
	// use current directory as the root
	if err := pivotRoot(pwd); err != nil {
		logrus.Errorf("pivot root fails: %v", err)
		return err
	}
	// mount proc filesystem
	defaultMountFlags := syscall.MS_NOEXEC | syscall.MS_NOSUID | syscall.MS_NODEV
	if err := syscall.Mount("proc", "/proc", "proc", uintptr(defaultMountFlags), ""); err != nil {
		logrus.Errorf("mount /proc fails: %v", err)
		return err
	}
	// mount tmpfs on /dev
	if err := syscall.Mount("tmpfs", "/dev", "tmpfs", syscall.MS_NOSUID|syscall.MS_STRICTATIME, "mode=755"); err != nil {
		logrus.Errorf("mount /dev fails: %v", err)
		return err
	}
	return nil
}
// change the container rootfs to image rootfs
func pivotRoot(root string) error {
	// root is only a parameter here: it is not the rootfs yet,
	// it is the directory we want to turn into the rootfs
	// bind-mount root onto itself so that it is a mount point
	if err := syscall.Mount(root, root, "bind", syscall.MS_BIND|syscall.MS_REC, ""); err != nil {
		return fmt.Errorf("remount root fails: %v", err)
	}
	// create the putOld directory to hold the old root
	putOld := path.Join(root, ".put_old")
	if err := os.Mkdir(putOld, 0777); err != nil {
		return fmt.Errorf("create putOld directory fails: %v", err)
	}
	// move the old root mount to putOld
	// and make the first parameter the new root mount
	// which means '/.put_old/' is the old rootfs
	// the first parameter must be a mount point, that's why we bind-mounted root onto itself at the beginning
	if err := syscall.PivotRoot(root, putOld); err != nil {
		return fmt.Errorf("pivot_root fails: %v", err)
	}
	// chdir does the same as cd: chdir is the syscall, cd is the shell builtin that calls it
	// change to root directory
	if err := syscall.Chdir("/"); err != nil {
		return fmt.Errorf("chdir fails: %v", err)
	}
	// after the previous step, the current filesystem is the new root
	// and the old filesystem is at .put_old
	// finally, we need to unmount the old root mount before removing it
	// recompute the putOld path, because we are in the new rootfs now
	// and root has become "/"
	putOld = path.Join("/", ".put_old")
	if err := syscall.Unmount(putOld, syscall.MNT_DETACH); err != nil {
		return fmt.Errorf("unmount fails: %v", err)
	}
	// remove the old mount point
	return os.Remove(putOld)
}
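4. Implementing AUFS
4.1 First, the aufs mount
There is a lot of code here, but it mainly implements what the last section of the first post in this series described.
The file is containerProcess.go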
// containerProcess.go
func NewProcess(tty bool) (*exec.Cmd, *os.File) {
	readPipe, writePipe, err := os.Pipe()
	if err != nil {
		logrus.Errorf("New Pipe Error: %v", err)
		return nil, nil
	}
	// create a new command which runs itself
	// the first argument is `init`, which is in the "container/init.go" file
	// so the <cmd> will be interpreted as "docker init <containerCmd>"
	cmd := exec.Command("/proc/self/exe", "init")
	// new namespaces, thanks to Linux
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWIPC | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS | syscall.CLONE_NEWNET,
	}
	// this is what pseudo terminal means
	// link the container's stdio to os
	if tty {
		cmd.Stdin = os.Stdin
		cmd.Stdout = os.Stdout
		cmd.Stderr = os.Stderr
	}
	// readPipe becomes fd 3 in the child process
	cmd.ExtraFiles = []*os.File{readPipe}
	imagesRootURL := "./images/"
	mntURL := "./mnt/"
	newWorkspace(imagesRootURL, mntURL)
	// the init process starts inside the mounted image directory
	cmd.Dir = mntURL
	return cmd, writePipe
}
func newWorkspace(imagesRootURL string, mntURL string) {
	createReadOnlyLayer(imagesRootURL)
	createWriteLayer(imagesRootURL)
	createMountPoint(imagesRootURL, mntURL)
}

func createReadOnlyLayer(imagesRootURL string) {
	readOnlyLayerURL := imagesRootURL + "busybox/"
	imageTarURL := imagesRootURL + "busybox.tar"
	isExist, err := pathExist(readOnlyLayerURL)
	if err != nil {
		logrus.Infof("fail to judge whether path exist: %v", err)
	}
	if !isExist {
		if err := os.Mkdir(readOnlyLayerURL, 0777); err != nil {
			logrus.Errorf("fail to create dir %s: %v", readOnlyLayerURL, err)
		}
		if _, err := exec.Command("tar", "-xvf", imageTarURL, "-C", readOnlyLayerURL).CombinedOutput(); err != nil {
			logrus.Errorf("fail to untar %s: %v", imageTarURL, err)
		}
	}
}

func createWriteLayer(imagesRootURL string) {
	writeLayerURL := imagesRootURL + "writeLayer/"
	if err := os.Mkdir(writeLayerURL, 0777); err != nil {
		logrus.Errorf("fail to create dir %s: %v", writeLayerURL, err)
	}
}

func createMountPoint(imagesRootURL string, mntURL string) {
	if err := os.Mkdir(mntURL, 0777); err != nil {
		logrus.Errorf("fail to create dir %s: %v", mntURL, err)
	}
	// mount the readOnly layer and writeLayer on the mntURL
	dirs := "dirs=" + imagesRootURL + "writeLayer:" + imagesRootURL + "busybox"
	cmd := exec.Command("mount", "-t", "aufs", "-o", dirs, "none", mntURL)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		logrus.Error(err)
	}
}

func pathExist(path string) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return true, nil
	}
	if os.IsNotExist(err) {
		return false, nil
	}
	return false, err
}
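One practical caveat: aufs is not part of the mainline Linux kernel, so the mount above may not be available on your distribution. overlayfs is a union filesystem that is in mainline and can play the same role. The sketch below is not the code used in this series; it only compares how the two option strings are built. aufsOptions, overlayOptions and the images/work directory are hypothetical names introduced here. Note that overlay needs an extra workdir, which must be an empty directory on the same filesystem as upperdir.

```go
package main

import (
	"fmt"
	"strings"
)

// aufsOptions builds the option string for `mount -t aufs -o <opts> none <mnt>`.
// Branches are listed from the top (writable) layer to the bottom layer.
func aufsOptions(writeLayer, readOnlyLayer string) string {
	return "dirs=" + writeLayer + ":" + readOnlyLayer
}

// overlayOptions builds the option string for
// `mount -t overlay overlay -o <opts> <mnt>`.
// lowerDir is the read-only layer, upperDir the writable layer, and
// workDir an empty scratch directory that overlayfs needs internally.
func overlayOptions(lowerDir, upperDir, workDir string) string {
	return strings.Join([]string{
		"lowerdir=" + lowerDir,
		"upperdir=" + upperDir,
		"workdir=" + workDir,
	}, ",")
}

func main() {
	// Only print the commands; running them requires root.
	fmt.Println("mount -t aufs -o",
		aufsOptions("images/writeLayer", "images/busybox"), "none mnt")
	fmt.Println("mount -t overlay overlay -o",
		overlayOptions("images/busybox", "images/writeLayer", "images/work"), "mnt")
}
```

With overlay, the cleanup step would also have to delete the work directory along with the write layer.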
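4.2 Removing AUFS
In docker, removing a container (with docker rm, or on exit when it was started with --rm) also deletes its write layer. Our demo does this as soon as the container exits, in three steps:
- unmount the mnt directory (note that the Linux command is spelled umount, without the n)
- delete the mnt directory
- delete the write layer
The code is as follows: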
4.2.1 containerProcess.go
// delete AUFS (delete write layer)
func DeleteWorkspace(imagesRootURL string, mntURL string) {
	deleteMount(mntURL)
	deleteWriteLayer(imagesRootURL)
}

func deleteMount(mntURL string) {
	cmd := exec.Command("umount", mntURL)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		logrus.Errorf("umount fails: %v", err)
	}
	if err := os.RemoveAll(mntURL); err != nil {
		logrus.Errorf("remove dir %s error: %v", mntURL, err)
	}
}

func deleteWriteLayer(imagesRootURL string) {
	writeLayerURL := imagesRootURL + "writeLayer/"
	if err := os.RemoveAll(writeLayerURL); err != nil {
		logrus.Errorf("remove dir %s error: %v", writeLayerURL, err)
	}
}
4.2.2 run.go
- Once the container process exits, call the function defined above.
// This is the function that `docker run` will call
func Run(tty bool, containerCmd []string, res *subsystems.ResourceConfig) {
	imagesRootURL := "./images/"
	mntURL := "./mnt/"
	// this is "docker init <containerCmd>"
	initProcess, writePipe := container.NewProcess(tty)
	if initProcess == nil {
		logrus.Error("create init process fails")
		return
	}
	// start the init process
	if err := initProcess.Start(); err != nil {
		logrus.Error(err)
		// initProcess.Process is nil here, so clean up and stop
		container.DeleteWorkspace(imagesRootURL, mntURL)
		return
	}
	// create container manager to control resource config on all hierarchies
	// this is the cgroupPath
	cm := cgroups.NewCgroupManager("oyishyi-docker-first-cgroup")
	if err := cm.Set(res); err != nil {
		logrus.Error(err)
	}
	if err := cm.AddProcess(initProcess.Process.Pid); err != nil {
		logrus.Error(err)
	}
	// send command to write side
	// this unblocks the init process, which is waiting on the read side
	sendInitCommand(containerCmd, writePipe)
	if err := initProcess.Wait(); err != nil {
		logrus.Error(err)
	}
	container.DeleteWorkspace(imagesRootURL, mntURL)
	// os.Exit skips deferred calls, so remove the cgroup explicitly
	cm.Remove()
	os.Exit(-1)
}
5. Implementing volumes for data persistence
5.1 Adding a volume
Supporting a volume only requires adding a call to a createVolume function in the newWorkspace function defined earlier.
func createVolume(imagesRootURL string, mntURL string, volume string) {
	if volume == "" {
		return
	}
	// extract url from volume input
	volumeURls := strings.Split(volume, ":")
	if len(volumeURls) == 2 && volumeURls[0] != "" && volumeURls[1] != "" {
		mountVolume(imagesRootURL, mntURL, volumeURls)
		logrus.Infof("volume created: %q", volumeURls)
	} else {
		logrus.Warn("volume path not valid")
	}
}

func mountVolume(imagesRootURL string, mntURL string, volumeURls []string) {
	hostVolumeURL := volumeURls[0]
	containerVolumeURL := mntURL + volumeURls[1]
	// create host dir, which stores the real data
	// MkdirAll does not fail when the directory is left over from an earlier run
	if err := os.MkdirAll(hostVolumeURL, 0777); err != nil {
		logrus.Errorf("create host volume directory fails: %v", err)
	}
	// create container dir, which is a mount point
	// currently not in container (container root is not "/"), so mntURL prefix is needed
	if err := os.MkdirAll(containerVolumeURL, 0777); err != nil {
		logrus.Errorf("create container %s volume fails: %v", containerVolumeURL, err)
	}
	dirs := "dirs=" + hostVolumeURL
	// the dirs option has to be passed through -o
	cmd := exec.Command("mount", "-t", "aufs", "-o", dirs, "none", containerVolumeURL)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		logrus.Errorf("mount container volume fails: %v", err)
	}
}
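The volume string is parsed twice in this post, once when mounting and once when unmounting, so it can help to pull the parsing out into one place. Below is a small sketch of that idea, assuming the "hostDir:containerDir" format used above. volumeSpec, parseVolume and mountTarget are hypothetical names and are not part of the demo code.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// volumeSpec holds the two halves of a "hostDir:containerDir" string.
type volumeSpec struct {
	HostDir      string
	ContainerDir string
}

// parseVolume splits "hostDir:containerDir" and rejects anything else,
// applying the same rule as the len == 2 check in createVolume.
func parseVolume(volume string) (volumeSpec, error) {
	parts := strings.Split(volume, ":")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return volumeSpec{}, errors.New("volume must look like hostDir:containerDir")
	}
	return volumeSpec{HostDir: parts[0], ContainerDir: parts[1]}, nil
}

// mountTarget returns the volume's mount point as seen from the host:
// the container path placed under the container's mount directory.
func mountTarget(mntURL string, spec volumeSpec) string {
	return filepath.Join(mntURL, spec.ContainerDir)
}

func main() {
	spec, err := parseVolume("/root/volume:/containerVolume")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("host dir:    ", spec.HostDir)
	fmt.Println("mount target:", mountTarget("./mnt/", spec))
}
```

filepath.Join also cleans up the doubled slash that plain string concatenation of "./mnt/" and "/containerVolume" would produce.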
5.2 Removing a volume
Removing the volume can be merged into the earlier deleteMount function. The combined function is named deleteMountWithVolume:
- umount the volume mount
- umount the whole container mount
- delete the whole container mount directory (and with it the volume mount point)
func deleteMountWithVolume(mntURL string, volume string) {
	// umount volume mount
	volumeURls := strings.Split(volume, ":")
	if len(volumeURls) == 2 && volumeURls[0] != "" && volumeURls[1] != "" {
		volumeMountURL := mntURL + volumeURls[1]
		cmd := exec.Command("umount", volumeMountURL)
		cmd.Stdout = os.Stdout
		cmd.Stderr = os.Stderr
		if err := cmd.Run(); err != nil {
			logrus.Errorf("umount volume mount %s fails: %v", volumeMountURL, err)
		}
	}
	// umount container mount
	cmd := exec.Command("umount", mntURL)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		logrus.Errorf("umount container mount %s fails: %v", mntURL, err)
	}
	// delete whole container mount
	if err := os.RemoveAll(mntURL); err != nil {
		logrus.Errorf("remove dir %s error: %v", mntURL, err)
	}
}
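5.3 Implementing docker commit
As the name suggests, this section implements docker commit, which turns out to be simple.
It just packs the mnt directory into a tarball, and that's it.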
5.3.1 commands.go
var commitCommand = cli.Command{
	Name:  "commit",
	Usage: "commit the container into image",
	Action: func(context *cli.Context) error {
		args := context.Args()
		if args.Len() == 0 {
			return errors.New("Commit what?")
		}
		imageName := args.Get(0)
		dockerCommands.CommitContainer(imageName)
		return nil
	},
}
5.3.2 dockerCommands/commit.go
package dockerCommands

import (
	"os/exec"

	"github.com/sirupsen/logrus"
)

func CommitContainer(imageName string) {
	mntURL := "./mnt"
	storeURL := "./images/" + imageName + ".tar"
	logrus.Infof("stored path: %v", storeURL)
	// -z gzips the archive even though the file name ends in .tar;
	// GNU tar detects the compression automatically when extracting
	cmd := exec.Command("tar", "-czf", storeURL, "-C", mntURL, ".")
	if err := cmd.Run(); err != nil {
		logrus.Errorf("Tar folder %s fails %v", mntURL, err)
	}
}