Why is String.strip() 5 times faster than String.trim() for a blank string in Java 11?

Question

I've run into an interesting scenario. For some reason, strip() on a blank string (one that contains only whitespace) is significantly faster than trim() in Java 11.

Benchmark

    import org.openjdk.jmh.annotations.*;
    import static java.util.concurrent.TimeUnit.MILLISECONDS;

    public class Test {

        public static final String TEST_STRING = "   "; // 3 whitespaces

        @Benchmark
        @Warmup(iterations = 10, time = 200, timeUnit = MILLISECONDS)
        @Measurement(iterations = 20, time = 500, timeUnit = MILLISECONDS)
        @BenchmarkMode(Mode.Throughput)
        public void testTrim() {
            TEST_STRING.trim();
        }

        @Benchmark
        @Warmup(iterations = 10, time = 200, timeUnit = MILLISECONDS)
        @Measurement(iterations = 20, time = 500, timeUnit = MILLISECONDS)
        @BenchmarkMode(Mode.Throughput)
        public void testStrip() {
            TEST_STRING.strip();
        }

        public static void main(String[] args) throws Exception {
            org.openjdk.jmh.Main.main(args);
        }
    }

Results

    # Run complete. Total time: 00:04:16

    Benchmark        Mode  Cnt           Score          Error  Units
    Test.testStrip  thrpt  200  2067457963.295 ± 12353310.918  ops/s
    Test.testTrim   thrpt  200   402307182.894 ±  4559641.554  ops/s

Apparently strip() outperforms trim() by a factor of about 5.

For a non-blank string, however, the results are almost identical:

    public class Test {

        public static final String TEST_STRING = " Test String ";

        @Benchmark
        @Warmup(iterations = 10, time = 200, timeUnit = MILLISECONDS)
        @Measurement(iterations = 20, time = 500, timeUnit = MILLISECONDS)
        @BenchmarkMode(Mode.Throughput)
        public void testTrim() {
            TEST_STRING.trim();
        }

        @Benchmark
        @Warmup(iterations = 10, time = 200, timeUnit = MILLISECONDS)
        @Measurement(iterations = 20, time = 500, timeUnit = MILLISECONDS)
        @BenchmarkMode(Mode.Throughput)
        public void testStrip() {
            TEST_STRING.strip();
        }

        public static void main(String[] args) throws Exception {
            org.openjdk.jmh.Main.main(args);
        }
    }

    # Run complete. Total time: 00:04:16

    Benchmark        Mode  Cnt          Score         Error  Units
    Test.testStrip  thrpt  200  126939018.461 ± 1462665.695  ops/s
    Test.testTrim   thrpt  200  141868439.680 ± 1243136.707  ops/s

How come? Is this a bug, or am I doing something wrong?

Test environment

CPU - Intel Xeon E3-1585L v5 @ 3.00 GHz
OS - Windows 7 SP1 64-bit
JVM - Oracle JDK 11.0.1
Benchmark - JMH v1.19

UPDATE

Added more performance tests for different strings (empty, blank, etc.).

Benchmark

    import org.openjdk.jmh.annotations.*;
    import static java.util.concurrent.TimeUnit.SECONDS;

    @Warmup(iterations = 5, time = 1, timeUnit = SECONDS)
    @Measurement(iterations = 5, time = 1, timeUnit = SECONDS)
    @Fork(value = 3)
    @BenchmarkMode(Mode.Throughput)
    public class Test {

        private static final String BLANK = "";             // Blank
        private static final String EMPTY = "   ";          // 3 spaces
        private static final String ASCII = " abc ";        // ASCII characters only
        private static final String UNICODE = " абв ";      // Russian Characters
        private static final String BIG = EMPTY.concat("Test".repeat(100)).concat(EMPTY);

        @Benchmark
        public void blankTrim() {
            BLANK.trim();
        }

        @Benchmark
        public void blankStrip() {
            BLANK.strip();
        }

        @Benchmark
        public void emptyTrim() {
            EMPTY.trim();
        }

        @Benchmark
        public void emptyStrip() {
            EMPTY.strip();
        }

        @Benchmark
        public void asciiTrim() {
            ASCII.trim();
        }

        @Benchmark
        public void asciiStrip() {
            ASCII.strip();
        }

        @Benchmark
        public void unicodeTrim() {
            UNICODE.trim();
        }

        @Benchmark
        public void unicodeStrip() {
            UNICODE.strip();
        }

        @Benchmark
        public void bigTrim() {
            BIG.trim();
        }

        @Benchmark
        public void bigStrip() {
            BIG.strip();
        }

        public static void main(String[] args) throws Exception {
            org.openjdk.jmh.Main.main(args);
        }
    }

Results

    # Run complete. Total time: 00:05:23

    Benchmark           Mode  Cnt           Score          Error  Units
    Test.asciiStrip    thrpt   15   356846913.133 ±  4096617.178  ops/s
    Test.asciiTrim     thrpt   15   371319467.629 ±  4396583.099  ops/s
    Test.bigStrip      thrpt   15    29058105.304 ±  1909323.104  ops/s
    Test.bigTrim       thrpt   15    28529199.298 ±  1794655.012  ops/s
    Test.blankStrip    thrpt   15  1556405453.206 ± 67230630.036  ops/s
    Test.blankTrim     thrpt   15  1587932109.069 ± 19457780.528  ops/s
    Test.emptyStrip    thrpt   15  2126290275.733 ± 23402906.719  ops/s
    Test.emptyTrim     thrpt   15   406354680.805 ± 14359067.902  ops/s
    Test.unicodeStrip  thrpt   15    37320438.099 ±   399421.799  ops/s
    Test.unicodeTrim   thrpt   15    88226653.577 ±  1628179.578  ops/s

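The test environment is the same.

Only one other interesting finding: a string that contains Unicode characters gets trim()'ed faster than it gets strip()'ed (about 88M ops/s versus 37M ops/s above).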

Karol Dowbecki · Accepted Answer

In OpenJDK 11.0.1, String.strip() (actually StringLatin1.strip()) optimizes stripping to an empty String by returning an interned String constant:

    public static String strip(byte[] value) {
        int left = indexOfNonWhitespace(value);
        if (left == value.length) {
            return "";
        }

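String.trim() (actually StringLatin1.trim()), on the other hand, always allocates a new String object. In your example _st = 3_ and _len = 3_, so the condition below is true and newString is called with a length of 0:

    return ((st > 0) || (len < value.length)) ?
        newString(value, st, len - st) : null;

which under the hood copies the array and creates a new String object:

    return new String(Arrays.copyOfRange(val, index, index + len), LATIN1);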

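Given the above, we can update the benchmark to also compare against a non-empty String, which shouldn't be affected by the String.strip() optimization described above: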

    @Warmup(iterations = 10, time = 200, timeUnit = MILLISECONDS)
    @Measurement(iterations = 20, time = 500, timeUnit = MILLISECONDS)
    @BenchmarkMode(Mode.Throughput)
    public class MyBenchmark {

        public static final String EMPTY_STRING = "   ";     // 3 whitespaces
        public static final String NOT_EMPTY_STRING = " a "; // 3 whitespaces with a in the middle

        @Benchmark
        public void testEmptyTrim() {
            EMPTY_STRING.trim();
        }

        @Benchmark
        public void testEmptyStrip() {
            EMPTY_STRING.strip();
        }

        @Benchmark
        public void testNotEmptyTrim() {
            NOT_EMPTY_STRING.trim();
        }

        @Benchmark
        public void testNotEmptyStrip() {
            NOT_EMPTY_STRING.strip();
        }
    }

Running it shows no significant difference between strip() and trim() for a non-empty String. Oddly enough, trimming to an empty String is still the slowest case:

    Benchmark                       Mode  Cnt           Score           Error  Units
    MyBenchmark.testEmptyStrip     thrpt  100  1887848947.416 ± 257906287.634  ops/s
    MyBenchmark.testEmptyTrim      thrpt  100   206638996.217 ±  57952310.906  ops/s
    MyBenchmark.testNotEmptyStrip  thrpt  100   399701777.916 ±   2429785.818  ops/s
    MyBenchmark.testNotEmptyTrim   thrpt  100   385144724.856 ±   3928016.232  ops/s
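To make the two code paths quoted in this answer easier to compare, here is a minimal sketch that models them outside the JDK. It is not the JDK source: the class StripVsTrimSketch and the methods stripLike and trimLike are hypothetical names, the whitespace checks are simplified, and a null return from trimLike stands for "the caller returns the original String". It only illustrates why an all-whitespace input costs nothing extra on the strip path but an allocation on the trim path.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, simplified model of the two Latin-1 code paths quoted above.
class StripVsTrimSketch {

    // Models the strip() path: an all-whitespace input returns the shared "" literal.
    static String stripLike(byte[] value) {
        int left = 0;
        while (left < value.length && Character.isWhitespace((char) (value[left] & 0xff))) {
            left++;
        }
        if (left == value.length) {
            return ""; // no allocation: the interned constant is reused
        }
        int right = value.length;
        while (right > left && Character.isWhitespace((char) (value[right - 1] & 0xff))) {
            right--;
        }
        return new String(Arrays.copyOfRange(value, left, right), StandardCharsets.ISO_8859_1);
    }

    // Models the trim() path: anything that needs trimming goes through new String(...),
    // even when the result is empty. null means "nothing to trim".
    static String trimLike(byte[] value) {
        int len = value.length;
        int st = 0;
        while (st < len && (value[st] & 0xff) <= ' ') {
            st++;
        }
        while (st < len && (value[len - 1] & 0xff) <= ' ') {
            len--;
        }
        return (st > 0 || len < value.length)
                ? new String(Arrays.copyOfRange(value, st, len), StandardCharsets.ISO_8859_1)
                : null;
    }

    public static void main(String[] args) {
        byte[] blank = "   ".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(stripLike(blank) == "");   // true: same object as the literal
        System.out.println(trimLike(blank) == "");    // false: freshly allocated String
        System.out.println(trimLike(blank).isEmpty()); // true: equal content, different object
    }
}
```

In this model the blank input makes trimLike copy a zero-length array and build a new String on every call, which is consistent with testEmptyTrim being the slowest row in the table above.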

Sami Hult · Answer

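After looking into the OpenJDK source code, and assuming the Oracle implementation is similar, I would imagine the difference is explained by the following facts:

strip tries to find the first non-whitespace character, and if none is found, it simply returns _""_
trim always returns new String(...the substring...)

One could argue that strip is just a tiny bit more optimized than trim, at least in OpenJDK, because it avoids creating a new object unless necessary.

(Note: I did not take the trouble to check the Unicode versions of these methods.)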
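On the Unicode side, one thing that can be said without benchmarking is that the two methods do not share a definition of whitespace: trim() removes leading and trailing characters whose code point is at most U+0020, while strip() removes characters for which Character.isWhitespace returns true. The sketch below only shows that semantic difference; whether it accounts for the unicodeTrim/unicodeStrip gap in the question is a guess, not something measured here. The class name WhitespaceDefinitions is hypothetical, and the code needs Java 11 or later for strip().

```java
// Hypothetical demo class: compares what trim() and strip() treat as whitespace.
class WhitespaceDefinitions {

    // U+2003 EM SPACE is Unicode whitespace but lies above U+0020.
    static final String EM_SPACED = "\u2003abc\u2003";

    // U+0000 is at most U+0020 but is not Unicode whitespace.
    static final String NUL_PADDED = "\u0000abc\u0000";

    public static void main(String[] args) {
        System.out.println(EM_SPACED.trim().length());   // 5: trim() leaves the em spaces
        System.out.println(EM_SPACED.strip().length());  // 3: strip() removes them
        System.out.println(NUL_PADDED.trim().length());  // 3: trim() removes the NULs
        System.out.println(NUL_PADDED.strip().length()); // 5: strip() leaves them
    }
}
```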

chiperortiz · Answer

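Yep. In Java 11, .trim() seems to always create a new String() (as it did in earlier versions), whereas strip(), which was added in Java 11, returns a cached String. You can run this simple code and prove it to yourself: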

    public class JavaClass {
        public static void main(String[] args) {
            // prints false
            System.out.println(" ".trim() == ""); // CREATING A NEW STRING()
        }
    }

vs

    public class JavaClass {
        public static void main(String[] args) {
            // prints true
            System.out.println(" ".strip() == ""); // RETURNING CACHE ""
        }
    }