What determines the number of threads a Java ForkJoinPool creates?

Question

As far as I had understood ForkJoinPool, the pool creates a fixed number of threads (default: the number of cores) and never creates more unless the application signals a need for them by using managedBlock.
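For context, this is roughly what signalling a blocking section through ForkJoinPool.managedBlock looks like. It is only a minimal sketch: ManagedBlockSketch and LatchBlocker are hypothetical names and are not part of the program discussed here, and whether the pool actually starts a spare worker during the wait depends on its state and on the JDK version.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.TimeUnit;

class ManagedBlockSketch {

    /** Hypothetical blocker that waits until a latch reaches zero. */
    static final class LatchBlocker implements ForkJoinPool.ManagedBlocker {
        private final CountDownLatch latch;

        LatchBlocker(CountDownLatch latch) {
            this.latch = latch;
        }

        @Override
        public boolean block() throws InterruptedException {
            latch.await();
            return true; // true means no further blocking is necessary
        }

        @Override
        public boolean isReleasable() {
            return latch.getCount() == 0;
        }
    }

    /** Runs one task that waits inside managedBlock and returns its result. */
    static String runDemo() {
        ForkJoinPool pool = new ForkJoinPool(1);
        CountDownLatch latch = new CountDownLatch(1);
        try {
            Callable<String> body = () -> {
                // Tells the pool that this worker may block here, so the pool
                // may activate a spare thread to preserve parallelism.
                ForkJoinPool.managedBlock(new LatchBlocker(latch));
                return "released";
            };
            ForkJoinTask<String> waiter = pool.submit(body);
            latch.countDown(); // released from the submitting thread
            return waiter.get(10, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```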

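However, using ForkJoinPool.getPoolSize() I discovered that in a program that creates 30,000 tasks (RecursiveAction), the ForkJoinPool executing those tasks uses 700 threads on average (threads counted each time a task is created). The tasks do no I/O, only pure computation. The only inter-task synchronization is calling ForkJoinTask.join() and accessing AtomicBooleans, i.e. there are no thread-blocking operations.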

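Since join() does not block the calling thread, as far as I understand it, there is no reason why any thread in the pool should ever block, and so (I had assumed) there should be no reason to create any further threads (which is obviously happening nevertheless).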

So, why does ForkJoinPool create so many threads? What factors determine the number of threads created?

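I had hoped that this question could be answered without posting code, but here it comes upon request. This code is an excerpt from a program four times the size, reduced to the essential parts; it does not compile as it is. If desired, I can of course post the full program, too.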

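The program searches a maze for a path from a given start point to a given end point using depth-first search. A solution is guaranteed to exist. The main logic is in the compute() method of SolverTask: a RecursiveAction that starts at some given point and continues with all neighbor points reachable from the current point. Rather than creating a new SolverTask at each branching point (which would create far too many tasks), it pushes all neighbors except one onto a backtracking stack to be processed later, and continues with only the one neighbor not pushed onto the stack. Once it reaches a dead end that way, the point most recently pushed onto the backtracking stack is popped, and the search continues from there (cutting back the path built from the task's starting point accordingly). Once a task finds its backtracking stack larger than a certain threshold, it starts creating new tasks: from then on, while it keeps popping from its backtracking stack until that is exhausted, it no longer pushes further points onto the stack when it reaches a branching point, but creates a new task for each such point. Thus, the size of the tasks can be adjusted using the stack limit threshold.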

The numbers I quoted above ("30,000 tasks, 700 threads on average") are from searching a maze of 5000x5000 cells. So, here is the essential code:

    class SolverTask extends RecursiveTask<ArrayDeque<Point>> {
        // Once the backtrack stack has reached this size, the current task
        // will never add another cell to it, but create a new task for each
        // newly discovered branch:
        private static final int MAX_BACKTRACK_CELLS = 100*1000;

        /**
         * @return Tries to compute a path through the maze from local start to end
         * and returns that (or null if no such path found)
         */
        @Override
        public ArrayDeque<Point> compute() {
            // Is this task still accepting new branches for processing on its own,
            // or will it create new tasks to handle those?
            boolean stillAcceptingNewBranches = true;
            Point current = localStart;
            // Path from localStart to (including) current
            ArrayDeque<Point> pathFromLocalStart = new ArrayDeque<Point>();
            // Used as a stack: Branches not yet taken; solver will backtrack
            // to these branching points later
            ArrayDeque<PointAndDirection> backtrackStack = new ArrayDeque<PointAndDirection>();

            Direction[] allDirections = Direction.values();

            while (!current.equals(end)) {
                pathFromLocalStart.addLast(current);
                // Collect current's unvisited neighbors in random order:
                ArrayDeque<PointAndDirection> neighborsToVisit =
                        new ArrayDeque<PointAndDirection>(allDirections.length);
                for (Direction directionToNeighbor: allDirections) {
                    Point neighbor = current.getNeighbor(directionToNeighbor);

                    // contains() and hasPassage() are read-only methods and thus need no synchronization
                    if (maze.contains(neighbor) && maze.hasPassage(current, neighbor) && maze.visit(neighbor))
                        neighborsToVisit.add(new PointAndDirection(neighbor, directionToNeighbor.opposite));
                }
                // Process unvisited neighbors
                if (neighborsToVisit.size() == 1) {
                    // Current node is no branch: Continue with that neighbor
                    current = neighborsToVisit.getFirst().getPoint();
                    continue;
                }
                if (neighborsToVisit.size() >= 2) {
                    // Current node is a branch
                    if (stillAcceptingNewBranches) {
                        current = neighborsToVisit.removeLast().getPoint();
                        // Push all neighbors except one on the backtrack stack for later processing
                        for (PointAndDirection neighborAndDirection: neighborsToVisit)
                            backtrackStack.push(neighborAndDirection);
                        if (backtrackStack.size() > MAX_BACKTRACK_CELLS)
                            stillAcceptingNewBranches = false;
                        // Continue with the one neighbor that was not pushed onto the backtrack stack
                        continue;
                    } else {
                        // Current node is a branch point, but this task does not accept new branches any more:
                        // Create new task for each neighbor to visit and wait for the end of those tasks
                        SolverTask[] subTasks = new SolverTask[neighborsToVisit.size()];
                        int t = 0;
                        for (PointAndDirection neighborAndDirection: neighborsToVisit) {
                            SolverTask task = new SolverTask(neighborAndDirection.getPoint(), end, maze);
                            task.fork();
                            subTasks[t++] = task;
                        }
                        for (SolverTask task: subTasks) {
                            ArrayDeque<Point> subTaskResult = null;
                            try {
                                subTaskResult = task.join();
                            } catch (CancellationException e) {
                                // Nothing to do here: Another task has found the solution
                                // and cancelled all other tasks
                            } catch (Exception e) {
                                e.printStackTrace();
                            }
                            if (subTaskResult != null) { // subtask found solution
                                pathFromLocalStart.addAll(subTaskResult);
                                // No need to wait for the other subtasks once a solution has been found
                                return pathFromLocalStart;
                            }
                        } // for subTasks
                    } // else (not accepting any more branches)
                } // if (current node is a branch)

                // Current node is dead end or all its neighbors lead to dead ends:
                // Continue with a node from the backtracking stack, if any is left:
                if (backtrackStack.isEmpty()) {
                    return null; // No more backtracking available: No solution exists => end of this task
                }
                // Backtrack: Continue with cell saved at latest branching point:
                PointAndDirection pd = backtrackStack.pop();
                current = pd.getPoint();
                Point branchingPoint = current.getNeighbor(pd.getDirectionToBranchingPoint());
                // DEBUG System.out.println("Backtracking to " + branchingPoint);
                // Remove the dead end from the top of pathSoFar, i.e. all cells after branchingPoint:
                while (!pathFromLocalStart.peekLast().equals(branchingPoint)) {
                    // DEBUG System.out.println("    Going back before " + pathSoFar.peekLast());
                    pathFromLocalStart.removeLast();
                }
                // continue while loop with newly popped current
            } // while (current ...

            if (!current.equals(end)) {
                // this task was interrupted by another one that already found the solution
                // and should end now therefore:
                return null;
            } else {
                // Found the solution path:
                pathFromLocalStart.addLast(current);
                return pathFromLocalStart;
            }
        } // compute()
    } // class SolverTask

    @SuppressWarnings("serial")
    public class ParallelMaze {

        // for each cell in the maze: Has the solver visited it yet?
        private final AtomicBoolean[][] visited;

        /**
         * Atomically marks this point as visited unless visited before
         * @return whether the point was visited for the first time, i.e. whether it could be marked
         */
        boolean visit(Point p) {
            return visited[p.getX()][p.getY()].compareAndSet(false, true);
        }

        public static void main(String[] args) {
            ForkJoinPool pool = new ForkJoinPool();
            ParallelMaze maze = new ParallelMaze(width, height, new Point(width-1, 0), new Point(0, height-1));
            // Start initial task
            long startTime = System.currentTimeMillis();
            // since SolverTask.compute() expects its starting point already visited,
            // must do that explicitly for the global starting point:
            maze.visit(maze.start);
            maze.solution = pool.invoke(new SolverTask(maze.start, maze.end, maze));
            // One solution is enough: Stop all tasks that are still running
            pool.shutdownNow();
            pool.awaitTermination(Integer.MAX_VALUE, TimeUnit.DAYS);
            long endTime = System.currentTimeMillis();
            System.out.println("Computed solution of length " + maze.solution.size() + " to maze of size " +
                    width + "x" + height + " in " + ((float)(endTime - startTime))/1000 + "s.");
        }
    }

elusive-code · Answer

There are related questions on Stack Overflow:

ForkJoinPool stalls during invokeAll/join

ForkJoinPool seems to waste a thread

I made a runnable, stripped-down version of what is happening (JVM arguments I used: -Xms256m -Xmx1024m -Xss8m):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveAction;

    public class Test1 {

        private static ForkJoinPool pool = new ForkJoinPool(2);

        private static class SomeAction extends RecursiveAction {

            private int counter;            // recursive counter
            private int childrenCount = 80; // amount of children to spawn
            private int idx;                // just for displaying

            private SomeAction(int counter, int idx) {
                this.counter = counter;
                this.idx = idx;
            }

            @Override
            protected void compute() {
                System.out.println(
                    "counter=" + counter + "." + idx +
                    " activeThreads=" + pool.getActiveThreadCount() +
                    " runningThreads=" + pool.getRunningThreadCount() +
                    " poolSize=" + pool.getPoolSize() +
                    " queuedTasks=" + pool.getQueuedTaskCount() +
                    " queuedSubmissions=" + pool.getQueuedSubmissionCount() +
                    " parallelism=" + pool.getParallelism() +
                    " stealCount=" + pool.getStealCount());
                if (counter <= 0) return;

                // Fork all children first ...
                List<SomeAction> list = new ArrayList<>(childrenCount);
                for (int i = 0; i < childrenCount; i++) {
                    SomeAction next = new SomeAction(counter - 1, i);
                    list.add(next);
                    next.fork();
                }

                // ... then join them in the same order they were forked
                for (SomeAction action : list) {
                    action.join();
                }
            }
        }

        public static void main(String[] args) throws Exception {
            pool.invoke(new SomeAction(2, 0));
        }
    }

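Apparently, when you perform a join, the current thread sees that the required task is not yet completed and takes another task to run itself.

This happens in java.util.concurrent.ForkJoinWorkerThread#joinTask.

However, this new task spawns more of the same tasks, and those cannot find threads in the pool because the threads are locked in join. Since the pool has no way of knowing how long it will take for them to be released (a thread could be in an infinite loop or deadlocked forever), new threads are spawned (compensating for the joined threads, as Louis Wasserman mentioned): java.util.concurrent.ForkJoinPool#signalWork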

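So to prevent such a scenario, you need to avoid recursive spawning of tasks.

For example, if you set the initial parameter in the code above to 1 (the first argument of new SomeAction(2, 0)), the number of active threads will be 2, even if you increase childrenCount tenfold.

Also note that, while the number of active threads increases, the number of running threads stays less than or equal to parallelism.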

Louis Wasserman · Answer

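From the source comments:

Compensating: Unless there are already enough live threads, method tryPreBlock() may create or re-activate a spare thread to compensate for blocked joiners until they unblock.

I think what is happening is that you are not finishing any of the tasks very quickly, and since there are no available worker threads when you submit a new task, a new thread gets created.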

edharned · Answer

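strict, full-strict, and terminally-strict have to do with processing a directed acyclic graph (DAG). You can google those terms to get a full understanding of them. That is the type of processing the framework was designed for. Look at the code in the API for Recursive...: the framework relies on your compute() code to run other compute() links and then do a join(). Each task does a single join(), just like processing a DAG.

You are not doing DAG processing. You are forking many new tasks and waiting (join()) on each. Have a read of the source code. It is horrendously complex, but you may be able to figure it out. The framework does not do proper task management. Where is it going to put the waiting task when it does a join()? There is no suspended queue. That would require a monitor thread to constantly watch the queue to see what has finished. This is why the framework uses "continuation threads". When one task does a join(), the framework assumes it is waiting for a single lower task to finish. When many join() calls are present, the thread cannot continue, so a helper or continuation thread needs to exist.

As noted above, you need a scatter-gather type of fork-join process. There you can fork any number of tasks.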
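One possible reading of "scatter-gather" here is to hand all children to ForkJoinTask.invokeAll and collect the results only after that single call returns. The sketch below illustrates that reading and is not code from this answer: FanOutTask and countLeaves are hypothetical names, and how many threads the pool ends up creating still depends on the JDK version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class FanOutTask extends RecursiveTask<Integer> {
    private final int depth;  // remaining levels of recursion
    private final int fanOut; // children spawned per non-leaf task

    FanOutTask(int depth, int fanOut) {
        this.depth = depth;
        this.fanOut = fanOut;
    }

    @Override
    protected Integer compute() {
        if (depth <= 0) {
            return 1; // a leaf counts itself
        }
        List<FanOutTask> children = new ArrayList<>(fanOut);
        for (int i = 0; i < fanOut; i++) {
            children.add(new FanOutTask(depth - 1, fanOut));
        }
        // Scatter: one call forks the children and waits for all of them.
        invokeAll(children);
        // Gather: every child is already complete, so join() only reads the result.
        int leaves = 0;
        for (FanOutTask child : children) {
            leaves += child.join();
        }
        return leaves;
    }

    /** Counts the leaf tasks of a task tree with the given depth and fan-out. */
    static int countLeaves(int depth, int fanOut) {
        ForkJoinPool pool = new ForkJoinPool(2);
        try {
            return pool.invoke(new FanOutTask(depth, fanOut));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countLeaves(2, 80)); // 80 * 80 leaf tasks
    }
}
```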

Artёm Basov · Answer

Both code snippets posted by Holger Peine and elusive-code do not actually follow the recommended practice described in the javadoc for version 1.8:

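In the most typical usages, a fork-join pair act like a call (fork) and return (join) from a parallel recursive function. As is the case with other forms of recursive calls, returns (joins) should be performed innermost-first. For example, a.fork(); b.fork(); b.join(); a.join(); is likely to be substantially more efficient than joining code a before code b.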

In both cases the ForkJoinPool was instantiated via the default constructor. This constructs the pool with asyncMode=false, which is the default:

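@param asyncMode if true,
establishes local first-in-first-out scheduling mode for forked tasks that are never joined. This mode may be more appropriate than default locally stack-based mode in applications in which worker threads only process event-style asynchronous tasks. For default value, use false.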
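For reference, the mode is chosen through the four-argument constructor and can be read back with getAsyncMode(). A minimal sketch; AsyncModeSketch and modes are hypothetical names used only for illustration:

```java
import java.util.concurrent.ForkJoinPool;

class AsyncModeSketch {

    /** Returns { asyncMode of a default-style pool, asyncMode of an explicitly async pool }. */
    static boolean[] modes() {
        // Same mode as the no-argument constructor: asyncMode = false (stack-based).
        ForkJoinPool stackBased = new ForkJoinPool(2);
        // Explicit asyncMode = true: local FIFO scheduling for tasks that are never joined.
        ForkJoinPool fifo = new ForkJoinPool(
                2,
                ForkJoinPool.defaultForkJoinWorkerThreadFactory,
                null,  // no custom uncaught exception handler
                true); // asyncMode
        try {
            return new boolean[] { stackBased.getAsyncMode(), fifo.getAsyncMode() };
        } finally {
            stackBased.shutdown();
            fifo.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] m = modes();
        System.out.println("default asyncMode=" + m[0] + ", explicit asyncMode=" + m[1]);
    }
}
```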

So with the default mode, the work queue is effectively LIFO:
head -> | t4 | t3 | t2 | t1 | ... | <- tail

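So in the snippets, they fork() all tasks, pushing them onto the stack, and then join() them in the same order, that is, from the deepest task (t1) to the topmost (t4). This effectively blocks until some other thread steals (t1), then (t2), and so on. Since there are enough tasks to block all pool threads (task_count >> pool.getParallelism()), compensation kicks in as Louis Wasserman described.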
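Applied to the shape of elusive-code's test, the javadoc recommendation amounts to joining in the reverse of the fork order, so that the most recently forked task, which sits on top of the local stack, is joined first. The following is a sketch under that assumption; ReverseJoinAction and run are hypothetical names, and the resulting pool size still varies with the JDK version.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicInteger;

class ReverseJoinAction extends RecursiveAction {
    private final int depth;            // remaining levels of recursion
    private final int children;         // children forked per non-leaf task
    private final AtomicInteger leaves; // shared counter of executed leaf tasks

    ReverseJoinAction(int depth, int children, AtomicInteger leaves) {
        this.depth = depth;
        this.children = children;
        this.leaves = leaves;
    }

    @Override
    protected void compute() {
        if (depth <= 0) {
            leaves.incrementAndGet();
            return;
        }
        ReverseJoinAction[] forked = new ReverseJoinAction[children];
        for (int i = 0; i < children; i++) {
            forked[i] = new ReverseJoinAction(depth - 1, children, leaves);
            forked[i].fork(); // pushed onto this worker's local stack
        }
        // Join innermost-first: the last task forked is joined first.
        for (int i = children - 1; i >= 0; i--) {
            forked[i].join();
        }
    }

    /** Runs the task tree and returns the number of leaf tasks executed. */
    static int run(int depth, int children) {
        ForkJoinPool pool = new ForkJoinPool(2);
        AtomicInteger leaves = new AtomicInteger();
        try {
            pool.invoke(new ReverseJoinAction(depth, children, leaves));
            return leaves.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(2, 80)); // 80 * 80 leaf tasks
    }
}
```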

Ivan Beziazychnyi · Answer

It is worth noting that the output of the code posted by elusive-code depends on the Java version. Running the code on Java 8, I see this output:

    ...
    counter=0.73 activeThreads=45 runningThreads=5 poolSize=49 queuedTasks=105 queuedSubmissions=0 parallelism=2 stealCount=3056
    counter=0.75 activeThreads=46 runningThreads=1 poolSize=51 queuedTasks=0 queuedSubmissions=0 parallelism=2 stealCount=3158
    counter=0.77 activeThreads=47 runningThreads=3 poolSize=51 queuedTasks=0 queuedSubmissions=0 parallelism=2 stealCount=3157
    counter=0.74 activeThreads=45 runningThreads=3 poolSize=51 queuedTasks=5 queuedSubmissions=0 parallelism=2 stealCount=3153

But running the same code on Java 11, the output is different:

    ...
    counter=0.75 activeThreads=1 runningThreads=1 poolSize=2 queuedTasks=4 queuedSubmissions=0 parallelism=2 stealCount=0
    counter=0.76 activeThreads=1 runningThreads=1 poolSize=2 queuedTasks=3 queuedSubmissions=0 parallelism=2 stealCount=0
    counter=0.77 activeThreads=1 runningThreads=1 poolSize=2 queuedTasks=2 queuedSubmissions=0 parallelism=2 stealCount=0
    counter=0.78 activeThreads=1 runningThreads=1 poolSize=2 queuedTasks=1 queuedSubmissions=0 parallelism=2 stealCount=0
    counter=0.79 activeThreads=1 runningThreads=1 poolSize=2 queuedTasks=0 queuedSubmissions=0 parallelism=2 stealCount=0