ForkJoin框架简介

ForkJoin是在JDK1.7后提供多线并发处理框架。ForkJoin的框架的基本思想是化整为零, 是一个把大任务分割成若干个小任务,最终汇总每个小任务结果后得到大任务结果的框架。类似Hadoop的MapReduce思想。

我们再通过Fork和Join这两个单词来理解下 Fork/Join 框架，Fork 就是把一个大任务切分为若干子任务并行的执行，Join 就是合并这些子任务的执行结果，最后得到这个大任务的结果。比如计算 1+2+…＋10000，可以分割成 10 个子任务，每个子任务分别对 1000 个数进行求和，最终汇总这 10 个子任务的结果。Fork/Join 的运行流程图如下：

工作窃取算法

工作窃取（work-stealing）算法是指某个线程从其他队列里窃取任务来执行。
Fork-Join框架使用ForkJoinPool这个特殊的线程池来处理任务之间有依赖的情况，其实现了“work-stealing”算法（工作量窃取算法）并执行ForkJoinTask对象。ForkJoinPool保持多个线程，其线程数量按照CPU核心数来设置。每个线程都有一个特殊类型的deques队列（双端队列），放置该线程的所有任务，而不是所有线程共享一个公共队列。

工作窃取的运行流程图如下：

那么为什么需要使用工作窃取算法呢？假如我们需要做一个比较大的任务，我们可以把这个任务分割为若干互不依赖的子任务，为了减少线程间的竞争，于是把这些子任务分别放到不同的队列里，并为每个队列创建一个单独的线程来执行队列里的任务，线程和队列一一对应，比如 A 线程负责处理 A 队列里的任务。但是有的线程会先把自己队列里的任务干完，而其他线程对应的队列里还有任务等待处理。干完活的线程与其等着，不如去帮其他线程干活，于是它就去其他线程的队列里窃取一个任务来执行。而在这时它们会访问同一个队列，所以为了减少窃取任务线程和被窃取任务线程之间的竞争，通常会使用双端队列，被窃取任务线程永远从双端队列的头部拿任务执行（LIFO 后进先出），而窃取任务的线程永远从双端队列的尾部拿任务执行（FIFO 先进先出）。

工作窃取算法的优点是充分利用线程进行并行计算，并减少了线程间的竞争，其缺点是在某些情况下还是存在竞争，比如双端队列里只有一个任务时。并且消耗了更多的系统资源，比如创建多个线程和多个双端队列。

如果没看懂，再来举个例子。

如下图所示：

上图表示，每一个线程都拥有自己的任务队列，线程可以执行该任务队列中的任务，允许线程将被阻塞的任务暂时放到一边去。也就是说，如果当前任务无法继续（可能需要依赖子任务），那么就会将该阻塞任务放到该队列中挂起，直到所有依赖都就绪。

新任务通过Fork操作被添加到线程的队列中，每个线程总是处理“最后添加到队列中的任务（LIFO 后进先出）”

如上图所示，线程1的任务队列中，“任务1”先于“任务2”进入任务队列，因此任务2优先被线程1执行，然后再执行任务1。如果可行，任何空闲线程都可以从其他线程队列中获取（窃取）任务。一个线程总是从其他线程中窃取“最老”的任务（FIFO 先进先出）。

如上图所示，线程2从线程1的队列中窃取了最老的任务1。线程总是试图从相邻的线程中窃取，以减少窃取任务时可能产生的争用。

任务执行和窃取的顺序十分重要，通常情况下，任务窃取这个动作不会频繁发生，因为这窃取动作时比较耗成本的，当一个任务从一个线程移到另一个线程时，与该任务相关的上下文需要从一个线程的堆栈移到另一个线程的堆栈。如果两个线程又恰好不在一个CPU中，还会导致上下文切换在CPU之间进行，成本更高。因此Fork-Join框架尽量减小这种情况的发生。

ForkJoin 框架的介绍

ForkJoin框架的基本原理就是对大任务的分割和对分割后的小任务的合并。也就是fork和join。

1.分割任务 – fork。需要有个类将大任务分割成子任务，直到将子任务分割到足够小为止。

2.执行任务并合并结果。分割的子任务分别放到deques中，然后启动线程分别执行deques中的任务。子任务执行完的结果都统一放在一个队列里，启动一个线程从队列里拿数据，然后合并这些数据。

ForkJoin 使用两个类来完成以上两件事情：

ForkJoinPool ：ForkJoinPool是ForkJoin的核心，它实现了AbstractExecutorService，并且管理着ForkJoinWorkerThread。ForkJoinPool并不会为每一个ForkJoinTask分配一个线程，而是为每一个工作线程分配一个deque。ForkJoinTask 需要通过 ForkJoinPool 来执行。任务分割出的子任务会添加到当前工作线程所维护的双端队列中，进入队列的头部。当一个工作线程的队列里暂时没有任务时，它会随机从其他工作线程的队列的尾部获取一个任务。

ForkJoinTask：我们要使用 ForkJoin 框架，必须首先创建一个 ForkJoin 任务。它提供在任务中执行 fork() 和 join() 操作的机制，通常情况下我们不需要直接继承 ForkJoinTask 类，而只需要继承它的子类，Fork/Join 框架提供了以下两个子类：
- RecursiveAction：用于没有返回结果的任务。
- RecursiveTask ：用于有返回结果的任务。

ForkJoinPool 构造函数

ForkJoinPool 构造函数：

    public ForkJoinPool() {
        this(Math.min(MAX_CAP, Runtime.getRuntime().availableProcessors()),
             defaultForkJoinWorkerThreadFactory, null, false);
    }
复制代码

ForkJoinPool() 以Runtime.availableProcessors()方法的返回值作为parallelism参数来创建ForkJoinPool。

    public ForkJoinPool(int parallelism) {
        this(parallelism, defaultForkJoinWorkerThreadFactory, null, false);
    }
复制代码

ForkJoinPool(int parallelism) 创建一个包含parallelism个并行线程的ForkJoinPool。

其中参数最全的public构造函数：

public ForkJoinPool(int parallelism,
                        ForkJoinWorkerThreadFactory factory,
                        UncaughtExceptionHandler handler,
                        boolean asyncMode) {
        this(checkParallelism(parallelism),
             checkFactory(factory),
             handler,
             asyncMode ? FIFO_QUEUE : LIFO_QUEUE,
             "ForkJoinPool-" + nextPoolId() + "-worker-");
        checkPermission();
    }
复制代码

parallelism：可并行级别，Fork/Join框架将依据这个并行级别的设定，决定框架内并行执行的线程数量。并行的每一个任务都会有一个线程进行处理，但是千万不要将这个属性理解成Fork/Join框架中最多存在的线程数量，也不要将这个属性和ThreadPoolExecutor线程池中的corePoolSize、maximumPoolSize属性进行比较，因为ForkJoinPool的组织结构和工作方式与后者完全不一样。而后续的讨论中，有读者还可以发现Fork/Join框架中可存在的线程数量和这个参数值的关系并不是绝对的关联（有依据但并不全由它决定）。
factory：当Fork/Join框架创建一个新的线程时，同样会用到线程创建工厂。只不过这个线程工厂不再需要实现ThreadFactory接口，而是需要实现ForkJoinWorkerThreadFactory接口。后者是一个函数式接口，只需要实现一个名叫newThread的方法。在Fork/Join框架中有一个默认的ForkJoinWorkerThreadFactory接口实现：DefaultForkJoinWorkerThreadFactory
handler：异常捕获处理器。当执行的任务中出现异常，并从任务中被抛出时，就会被handler捕获。
asyncMode：这个参数也非常重要，从字面意思来看是指的异步模式，它并不是说Fork/Join框架是采用同步模式还是采用异步模式工作。Fork/Join框架中为每一个独立工作的线程准备了对应的待执行任务队列，这个任务队列是使用数组进行组合的双向队列。即是说存在于队列中的待执行任务，即可以使用先进先出（FIFO）的工作模式，也可以使用后进先出（LIFO）的工作模式。

私有化构造函数
/**
     * Creates a {@code ForkJoinPool} with the given parameters, without
     * any security checks or parameter validation.  Invoked directly by
     * makeCommonPool.
     */
    private ForkJoinPool(int parallelism,
                         ForkJoinWorkerThreadFactory factory,
                         UncaughtExceptionHandler handler,
                         int mode,
                         String workerNamePrefix) {
        this.workerNamePrefix = workerNamePrefix;
        this.factory = factory;
        this.ueh = handler;
        this.config = (parallelism & SMASK) | mode;
        long np = (long)(-parallelism); // offset ctl counts
        this.ctl = ((np << AC_SHIFT) & AC_MASK) | ((np << TC_SHIFT) & TC_MASK);
    }

复制代码

可以看出public构造函数追踪根源都会走到上面private构造函数的调用。

还有一个private构造函数：makeCommonPool()

    /**
     * Creates and returns the common pool, respecting user settings
     * specified via system properties.
     */
    private static ForkJoinPool makeCommonPool() {
        int parallelism = -1;
        ForkJoinWorkerThreadFactory factory = null;
        UncaughtExceptionHandler handler = null;
        try {  // ignore exceptions in accessing/parsing properties
            String pp = System.getProperty
                ("java.util.concurrent.ForkJoinPool.common.parallelism");
            String fp = System.getProperty
                ("java.util.concurrent.ForkJoinPool.common.threadFactory");
            String hp = System.getProperty
                ("java.util.concurrent.ForkJoinPool.common.exceptionHandler");
            if (pp != null)
                parallelism = Integer.parseInt(pp);
            if (fp != null)
                factory = ((ForkJoinWorkerThreadFactory)ClassLoader.
                           getSystemClassLoader().loadClass(fp).newInstance());
            if (hp != null)
                handler = ((UncaughtExceptionHandler)ClassLoader.
                           getSystemClassLoader().loadClass(hp).newInstance());
        } catch (Exception ignore) {
        }
        if (factory == null) {
            if (System.getSecurityManager() == null)
                factory = defaultForkJoinWorkerThreadFactory;
            else // use security-managed default
                factory = new InnocuousForkJoinWorkerThreadFactory();
        }
        if (parallelism < 0 && // default 1 less than #cores
            (parallelism = Runtime.getRuntime().availableProcessors() - 1) <= 0)
            parallelism = 1;
        if (parallelism > MAX_CAP)
            parallelism = MAX_CAP;
        return new ForkJoinPool(parallelism, factory, handler, LIFO_QUEUE,
                                "ForkJoinPool.commonPool-worker-");
    }
复制代码

通过方法makeCommonPool()的描述，可以看出调用这个方法，可以创建一个依赖于用户通过系统配置指定的settings的ForkJoinPool（LIFO_QUEUE队列）。

用户手动配置：

parallelism
threadFactory 
exceptionHandler
复制代码

ForkJoinTask启动方式

异步执行        　 execute(ForkJoinTask)    　　　　 ForkJoinTask.fork
等待获取结果    　　invoke(ForkJoinTask)    　　　　  ForkJoinTask.invoke
执行,获取Future    submit(ForkJoinTask)  　　　　　　ForkJoinTask.fork(ForkJoinTask are Futures)       
复制代码

ForkJoinTask异常处理

ForkJoinTask在执行的时候可能会抛出异常，但是没办法在主线程里直接捕获异常，所以ForkJoinTask提供了isCompletedAbnormally()方法来检查任务是否已经抛出异常或已经被取消了，并且可以通过ForkJoinTask的getException方法获取异常.

    /**
     * Returns {@code true} if this task threw an exception or was cancelled.
     *
     * @return {@code true} if this task threw an exception or was cancelled
     */
    public final boolean isCompletedAbnormally() {
        return status < NORMAL;
    }
复制代码

如果task抛出异常或被取消了，则返回true。

    /**
     * Returns {@code true} if this task completed without throwing an
     * exception and was not cancelled.
     *
     * @return {@code true} if this task completed without throwing an
     * exception and was not cancelled
     */
    public final boolean isCompletedNormally() {
        return (status & DONE_MASK) == NORMAL;
    }
复制代码

如果task没有被取消且执行完成不抛出异常，则返回true。

    /**
     * Returns the exception thrown by the base computation, or a
     * {@code CancellationException} if cancelled, or {@code null} if
     * none or if the method has not yet completed.
     *
     * @return the exception, or {@code null} if none
     */
    public final Throwable getException() {
        int s = status & DONE_MASK;
        return ((s >= NORMAL)    ? null :
                (s == CANCELLED) ? new CancellationException() :
                getThrowableException());
    }
复制代码

获取抛出的异常：

任务抛出的异常
任务被取消则抛出CancellationException
任务未执行完或无异常抛出返回null

ForkJoin实现原理

ForkJoinPool 由 ForkJoinTask 数组和 ForkJoinWorkerThread 数组组成，ForkJoinTask 数组负责存放程序提交给 ForkJoinPool 的任务，而 ForkJoinWorkerThread 数组负责执行这些任务。

fork()

ForkJoinTask的fork方法实现原理。当我们调用ForkJoinTask的fork方法时，程序会调用ForkJoinWorkerThread的工作队列，将当前任务push到workQueue中，并返回结果。

/**
     * Arranges to asynchronously execute this task in the pool the
     * current task is running in, if applicable, or using the {@link
     * ForkJoinPool#commonPool()} if not {@link #inForkJoinPool}.  While
     * it is not necessarily enforced, it is a usage error to fork a
     * task more than once unless it has completed and been
     * reinitialized.  Subsequent modifications to the state of this
     * task or any data it operates on are not necessarily
     * consistently observable by any thread other than the one
     * executing it unless preceded by a call to {@link #join} or
     * related methods, or a call to {@link #isDone} returning {@code
     * true}.
     *
     * @return {@code this}, to simplify usage
     */
    public final ForkJoinTask<V> fork() {
        Thread t;
        if ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread)
            ((ForkJoinWorkerThread)t).workQueue.push(this);
        else
            ForkJoinPool.common.externalPush(this);
        return this;
    }
复制代码

ForkJoinWorkerThread.workQueue.push 方法把当前任务存放在 ForkJoinTask 数组 queue 里。然后再调用 ForkJoinPool 的 signalWork(WorkQueue[] ws, WorkQueue q) 方法唤醒或创建一个工作线程来执行任务。

    /**
         * Pushes a task. Call only by owner in unshared queues.  (The
         * shared-queue version is embedded in method externalPush.)
         *
         * @param task the task. Caller must ensure non-null.
         * @throws RejectedExecutionException if array cannot be resized
         */
        final void push(ForkJoinTask<?> task) {
            ForkJoinTask<?>[] a; ForkJoinPool p;
            int b = base, s = top, n;
            if ((a = array) != null) {    // ignore if queue removed
                int m = a.length - 1;     // fenced write for task visibility
                U.putOrderedObject(a, ((m & s) << ASHIFT) + ABASE, task);
                U.putOrderedInt(this, QTOP, s + 1);
                if ((n = s - b) <= 1) {
                    if ((p = pool) != null)
                        p.signalWork(p.workQueues, this);
                }
                else if (n >= m)
                    growArray();
            }
        }
复制代码

join()

ForkJoinTask 的 join 方法实现原理。Join 方法的主要作用是阻塞当前线程并等待获取结果。

    /**
     * Returns the result of the computation when it {@link #isDone is
     * done}.  This method differs from {@link #get()} in that
     * abnormal completion results in {@code RuntimeException} or
     * {@code Error}, not {@code ExecutionException}, and that
     * interrupts of the calling thread do <em>not</em> cause the
     * method to abruptly return by throwing {@code
     * InterruptedException}.
     *
     * @return the computed result
     */
    public final V join() {
        int s;
            //调用doJoin方法阻塞等待的结果不是NORMAL,说明有异常或取消.报告异常
        if ((s = doJoin() & DONE_MASK) != NORMAL)
            reportException(s);
            //等于NORMAL,正常执行完毕,返回原始结果
        return getRawResult();
    }
复制代码

可以看到方法内先调用了doJoin()方法，通过doJoin()方法得到当前任务的状态来判断返回什么结果。任务状态有四种：

已完成（NORMAL）
被取消（CANCELLED）
信号（SIGNAL）
出现异常（EXCEPTIONAL）
复制代码

    /**
     * Throws exception, if any, associated with the given status.
     */
    private void reportException(int s) {
        //任务被取消，返回CancellationException
        if (s == CANCELLED)
            throw new CancellationException();
        //任务异常，则抛出异常    
        if (s == EXCEPTIONAL)
            rethrow(getThrowableException());
    }
复制代码

doJoin()

doJoin()方法里，首先通过查看任务的状态，看任务是否已经执行完成，如果执行完成，则直接返回任务状态；如果没有执行完，则从任务数组里取出任务并执行。

如果任务顺利执行完成，则设置任务状态为NORMAL，如果出现异常，则记录异常，并将任务状态设置为EXCEPTIONAL。

    /**
     * Implementation for join, get, quietlyJoin. Directly handles
     * only cases of already-completed, external wait, and
     * unfork+exec.  Others are relayed to ForkJoinPool.awaitJoin.
     *
     * @return status upon completion
     */
    private int doJoin() {
        int s; Thread t; ForkJoinWorkerThread wt; ForkJoinPool.WorkQueue w;
            //已完成,返回status,未完成再尝试后续
        return (s = status) < 0 ? s :
            //未完成,当前线程是ForkJoinWorkerThread,从该线程中取出workQueue,并尝试将当前task出队然后执行,
            //执行的结果是完成则返回状态,否则使用当前线程池所在的ForkJoinPool的awaitJoin方法等待
            ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread) ?
            (w = (wt = (ForkJoinWorkerThread)t).workQueue).
            tryUnpush(this) && (s = doExec()) < 0 ? s :
            wt.pool.awaitJoin(w, this, 0L) :
            //当前线程不是ForkJoinWorkerThread,调用externalAwaitDone方法.
            externalAwaitDone();
    }
    
    /**
     * Primary execution method for stolen tasks. Unless done, calls
     * exec and records status if completed, but doesn't wait for
     * completion otherwise.
     *
     * @return status on exit from this method
     */
    final int doExec() {
        int s; boolean completed;
        if ((s = status) >= 0) {
            try {
                completed = exec();
            } catch (Throwable rex) {
                return setExceptionalCompletion(rex);
            }
            if (completed)
                s = setCompletion(NORMAL);
        }
        return s;
    }
复制代码

invoke()

invoke()里先调用doInvoke()方法，然后判断返回状态，如果状态不是NORMAL，则报告异常；如果状态正常则返回结果。

    /**
     * Commences performing this task, awaits its completion if
     * necessary, and returns its result, or throws an (unchecked)
     * {@code RuntimeException} or {@code Error} if the underlying
     * computation did so.
     *
     * @return the computed result
     */
    public final V invoke() {
        int s;
        if ((s = doInvoke() & DONE_MASK) != NORMAL)
        //doInvoke方法的结果status只保留完成态位表示非NORMAL,则报告异常
            reportException(s);
        //正常完成,返回原始结果
        return getRawResult();
    }
    
    /**
     * Implementation for invoke, quietlyInvoke.
     *
     * @return status upon completion
     */
    private int doInvoke() {
        int s; Thread t; ForkJoinWorkerThread wt;
        return (s = doExec()) < 0 ? s :
            ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread) ?
            (wt = (ForkJoinWorkerThread)t).pool.
//ForkJoinPool::awaitJoin,在该方法中使用循环的方式进行internalWait,
//满足了每次按截止时间或周期进行等待,同时也顺便解决了虚假唤醒
            awaitJoin(wt.workQueue, this, 0L) :
            externalAwaitDone();
    }
复制代码