ページネーションとフィルタリングを使用した遅いpostgresqlクエリ

Question

問題のDBについては、 Pagination Done the PostgreSQL Way のアドバイスに従いました。

つまり、結果は、明確なソート順の複合インデックスでページ付けされています。

さて、私の問題は、これを結合テーブルのwhere句フィルターと組み合わせると、クエリが非常に遅くなることです。

フィルタリングの場合、どうすればクエリを高速化できますか？

注：pagination_indexは、sort_key_1, sort_key_2, sort_key_3の一意のインデックスです。

フィルタなしでクエリ

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM big_table join small_table on small_table_id = small_table.id --WHERE small_table.filter_column = 9173 ORDER BY sort_key_1, sort_key_2, sort_key_3 LIMIT 20 Limit (cost=0.85..124.26 rows=20 width=293) (actual time=49.947..50.254 rows=20 loops=1) Buffers: shared hit=64 read=1 -> Nested Loop (cost=0.85..318749153.46 rows=51655908 width=293) (actual time=49.942..50.189 rows=20 loops=1) Buffers: shared hit=64 read=1 -> Index Scan using pagination_index on big_table (cost=0.56..106181666.21 rows=51655908 width=113) (actual time=0.019..0.054 rows=20 loops=1) Buffers: shared hit=5 -> Index Scan using small_table_pkey on small_table (cost=0.28..4.11 rows=1 width=180) (actual time=2.498..2.500 rows=1 loops=20) Index Cond: (id = big_table.small_table_id) Buffers: shared hit=59 read=1 Total runtime: 92.107 ms

フィルタを使用したクエリ

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM big_table join small_table on small_table_id = small_table.id WHERE small_table.filter_column = 9173 ORDER BY sort_key_1, sort_key_2, sort_key_3 LIMIT 20 Limit (cost=0.85..79443.10 rows=20 width=293) (actual time=1331003.261..2376282.943 rows=16 loops=1) Buffers: shared hit=45840126 read=1148265 -> Nested Loop (cost=0.85..108506196.66 rows=27317 width=293) (actual time=1331003.257..2376282.873 rows=16 loops=1) Join Filter: (big_table.small_table_id = small_table.id) Rows Removed by Join Filter: 154967708 Buffers: shared hit=45840126 read=1148265 -> Index Scan using pagination_index on big_table (cost=0.56..106181666.21 rows=51655908 width=113) (actual time=0.010..1544834.357 rows=51655908 loops=1) Buffers: shared hit=45840121 read=1148265 -> Materialize (cost=0.28..14.60 rows=3 width=180) (actual time=0.002..0.007 rows=3 loops=51655908) Buffers: shared hit=5 -> Index Scan using small_table_id_index on small_table (cost=0.28..14.59 rows=3 width=180) (actual time=0.008..0.017 rows=3 loops=1) Index Cond: (filter_column = 9173) Buffers: shared hit=5 Total runtime: 2376283.041 ms

フィルタを使用してクエリを実行しますが、順序はありません

これも比較的高速です。

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM big_table join small_table on small_table_id = small_table.id WHERE small_table.filter_column = 9173 --ORDER BY sort_key_1, sort_key_2, sort_key_3 LIMIT 20 Limit (cost=0.56..78.57 rows=20 width=293) (actual time=0.035..100.824 rows=16 loops=1) Buffers: shared hit=18 read=132 -> Nested Loop (cost=0.56..106540.40 rows=27317 width=293) (actual time=0.031..100.759 rows=16 loops=1) Buffers: shared hit=18 read=132 -> Seq Scan on small_table (cost=0.00..200.91 rows=3 width=180) (actual time=0.014..10.124 rows=3 loops=1) Filter: (filter_column = 9173) Rows Removed by Filter: 5670 Buffers: shared hit=2 read=128 -> Index Scan using big_table_small_table_index on big_table (cost=0.56..35179.54 rows=26696 width=113) (actual time=30.168..30.183 rows=5 loops=3) Index Cond: (small_table_id = small_table.id) Buffers: shared hit=16 read=4 Total runtime: 100.901 ms

JonoB · Accepted Answer

私は今、私の質問が LIMIT付きの遅いORDER BY の複製であると信じています。

クエリ（結合テーブルの複雑さを取り除く）を重要なものに単純化しました。

高速クエリ

explain (analyze, buffers) SELECT * FROM big_table where small_table_id = 822573 ORDER BY sort_key_1, sort_key_2, sort_key_3 Sort (cost=24642.79..24686.64 rows=17540 width=113) (actual time=0.983..1.471 rows=227 loops=1) Sort Key: sort_key_1, sort_key_2, transmitter_id Sort Method: quicksort Memory: 56kB Buffers: shared hit=1 read=8 -> Index Scan using big_table_small_table_index on big_table (cost=0.56..23406.36 rows=17540 width=113) (actual time=0.055..0.514 rows=227 loops=1) Index Cond: (small_table_id = 822573) Buffers: shared hit=1 read=8 Total runtime: 1.941 ms

遅いクエリ

注：変更はLIMIT句の追加のみです。

explain (analyze, buffers) SELECT * FROM big_table where small_table_id = 822573 ORDER BY sort_key_1, sort_key_2, sort_key_3 LIMIT 1 Limit (cost=0.56..6061.27 rows=1 width=113) (actual time=206737.547..206737.550 rows=1 loops=1) Buffers: shared hit=13259226 read=303563 -> Index Scan using pagination_index on big_table (cost=0.56..106304766.93 rows=17540 width=113) (actual time=206737.541..206737.541 rows=1 loops=1) Filter: (small_table_id = 822573) Rows Removed by Filter: 15520057 Buffers: shared hit=13259226 read=303563 Total runtime: 206737.643 ms

制限付きの高速クエリ

リンクされた質問で示唆されているように、インデックススキャンを無効にすると、制限されたクエリが高速になります。

SET enable_indexscan = OFF; explain (analyze, buffers) SELECT * FROM big_table where small_table_id = 822573 ORDER BY sort_key_1, sort_key_2, sort_key_3 LIMIT 1 Limit (cost=62302.77..62302.77 rows=1 width=113) (actual time=0.897..0.899 rows=1 loops=1) Buffers: shared hit=9 -> Sort (cost=62302.77..62346.62 rows=17540 width=113) (actual time=0.894..0.894 rows=1 loops=1) Sort Key: sort_key_1, sort_key_2, transmitter_id Sort Method: top-N heapsort Memory: 25kB Buffers: shared hit=9 -> Bitmap Heap Scan on big_table (cost=388.50..62215.07 rows=17540 width=113) (actual time=0.040..0.477 rows=227 loops=1) Recheck Cond: (small_table_id = 822573) Buffers: shared hit=9 -> Bitmap Index Scan on big_table_small_table_index (cost=0.00..384.11 rows=17540 width=0) (actual time=0.030..0.030 rows=227 loops=1) Index Cond: (small_table_id = 822573) Buffers: shared hit=4 Total runtime: 0.932 ms

共通テーブル式を使用した高速クエリ

次のクエリも高速です。

SET enable_indexscan = ON; explain (analyze, buffers) with t as ( SELECT * FROM big_table where small_table_id = 822573 ORDER BY sort_key_1, sort_key_2, sort_key_3 ) select * from t LIMIT 1 Limit (cost=24686.64..24686.66 rows=1 width=3172) (actual time=1.011..1.013 rows=1 loops=1) Buffers: shared hit=9 CTE t -> Sort (cost=24642.79..24686.64 rows=17540 width=113) (actual time=1.001..1.001 rows=1 loops=1) Sort Key: big_table.sort_key_1, big_table.sort_key_2, sort_key_3 Sort Method: quicksort Memory: 56kB Buffers: shared hit=9 -> Index Scan using big_table_small_table_index on big_table (cost=0.56..23406.36 rows=17540 width=113) (actual time=0.025..0.504 rows=227 loops=1) Index Cond: (small_table_id = 822573) Buffers: shared hit=9 -> CTE Scan on t (cost=0.00..350.80 rows=17540 width=3172) (actual time=1.007..1.007 rows=1 loops=1) Buffers: shared hit=9 Total runtime: 1.070 ms