web-dev-qa-db-ja.com

ORDER BY NULLS LASTが主キーのクエリプランに影響するのはなぜですか?

PostgreSQL 11を使用して、約4億5000万行の次のテーブルを作成しました。

postgres=> \d+ sales
                                                             Table "public.sales"
              Column               |            Type             |             Modifiers              | Storage  | Stats target | Description
-----------------------------------+-----------------------------+------------------------------------+----------+--------------+-------------
 created_terminal_id               | integer                     | not null                           | plain    |              |
 company_id                        | integer                     | not null                           | plain    |              |
 customer_id                       | integer                     |                                    | plain    |              |
 sale_no                           | character varying(20)       | not null                           | extended |              |
 sale_type                         | smallint                    | not null                           | plain    |              |
 source_type                       | smallint                    | not null                           | plain    |              |
 sale_date                         | timestamp without time zone | not null                           | plain    |              |
 paid_amount                       | numeric(18,4)               | not null default 0.0000            | main     |              |
 change_amount                     | numeric(18,4)               | not null default 0.0000            | main     |              |
 cashup_id                         | integer                     |                                    | plain    |              |
 staff_id                          | integer                     |                                    | plain    |              |
 payment_terminal_id               | integer                     | not null                           | plain    |              |
 site_id                           | integer                     | not null                           | plain    |              |
 sale_id                           | integer                     | not null                           | plain    |              |
 deleted                           | smallint                    | default 0                          | plain    |              |
 is_tax_on                         | smallint                    | not null default 1                 | plain    |              |
 props                             | text                        | not null                           | extended |              |
 modified_time                     | timestamp without time zone | not null default CURRENT_TIMESTAMP | plain    |              |
 sum_line_variation_ex_tax_price   | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_line_variation_inc_tax_price  | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_line_quantified_ex_tax_price  | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_line_quantified_inc_tax_price | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_line_subtotal_ex_tax_price    | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_line_subtotal_inc_tax_price   | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_line_total_ex_tax_price       | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_line_total_inc_tax_price      | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_line_cost_inc_tax_price       | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_line_cost_ex_tax_price        | numeric(18,4)               | not null default 0.0000            | main     |              |
 sum_payment_tip_price             | numeric(18,4)               | not null default 0.0000            | main     |              |
 order_variation_ex_tax_price      | numeric(18,4)               | not null default 0.0000            | main     |              |
 order_variation_inc_tax_price     | numeric(18,4)               | not null default 0.0000            | main     |              |
 order_total_ex_tax_price          | numeric(18,4)               | not null default 0.0000            | main     |              |
 order_total_inc_tax_price         | numeric(18,4)               | not null default 0.0000            | main     |              |
 order_tip_price                   | numeric(18,4)               | not null default 0.0000            | main     |              |
 order_variation_is_percent        | smallint                    | not null default 0                 | plain    |              |
 order_variation_percent           | numeric(18,4)               | not null default 1.0000            | main     |              |
 order_type                        | smallint                    |                                    | plain    |              |
 order_tip_is_percent              | smallint                    | not null default 0                 | plain    |              |
 order_tip_percent                 | numeric(18,4)               | not null default 1.0000            | main     |              |
 sale_date_id                      | integer                     | not null default 0                 | plain    |              |
 voided_sale_id                    | integer                     |                                    | plain    |              |
 voided_sale_date                  | timestamp without time zone |                                    | plain    |              |
 sale_date_utc                     | timestamp without time zone |                                    | plain    |              |
 foo                               | numeric(18,4)               |                                    | main     |              |
 bar                               | numeric(18,4)               | not null default 0                 | main     |              |
Indexes:
    "sales_pkey" PRIMARY KEY, btree (sale_id)
    "idx_unique_sale" UNIQUE CONSTRAINT, btree (created_terminal_id, sale_date, sale_no)
    "idx_sale_cashup_id" btree (cashup_id)
    "idx_sale_customer_id" btree (customer_id)
    "idx_sale_modified_time" btree (modified_time)
    "idx_sale_payment_terminal_id" btree (payment_terminal_id)
    "idx_sale_site_date" btree (sale_date)
    "idx_sale_site_id" btree (site_id)
    "idx_sale_staff_id" btree (staff_id)
    "sales_company_id" btree (company_id)
    "sales_sale_date_id" btree (sale_date_id)
Has OIDs: no

次のクエリの実行には約35分かかります。

postgres=> EXPLAIN
postgres-> SELECT sales.sale_id as numeric_id, sales.site_id, sales.created_terminal_id  as terminal_id, sales.props as props, sales.order_total_inc_tax_price as SaleAmount, sales.staff_id as staff_id, sales.paid_amount as PaidAmount, (sales.order_total_inc_tax_price - sales.order_total_ex_tax_price) as taxAmount, sales.sale_date as SaleDate, sales.sale_no as SaleNo, sales.voided_sale_id as LinkedSaleNo, sales.voided_sale_date as LinkedSaleDate, sales.sale_type as sale_type, sales.deleted
postgres-> FROM sales
postgres-> WHERE sales.deleted = 0
postgres->   AND sales.site_id = 72620
postgres->   AND order_total_inc_tax_price < 40
postgres->   AND sale_date > '2019-03-08'
postgres-> ORDER BY sale_id DESC
postgres-> LIMIT 50;
                                                                                    QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------
 Limit  (cost=1000.60..76779.68 rows=50 width=189)
   ->  Gather Merge  (cost=1000.60..29806430.01 rows=19666 width=189)
         Workers Planned: 2
         ->  Parallel Index Scan Backward using sales_pkey on sales  (cost=0.57..29803160.04 rows=8194 width=189)
               Filter: ((order_total_inc_tax_price < '40'::numeric) AND (sale_date > '2019-03-08 00:00:00'::timestamp without time zone) AND (del
eted = 0) AND (site_id = 72620))
(5 rows)

ただし、ORDER BY sale_id DESC NULLS LAST根本的に異なる計画と大幅な速度向上が得られます(クエリは数秒で完了します):

postgres=> EXPLAIN
postgres-> SELECT sales.sale_id as numeric_id, sales.site_id, sales.created_terminal_id  as terminal_id, sales.props as props, sales.order_total_inc_tax_price as SaleAmount, sales.staff_id as staff_id, sales.paid_amount as PaidAmount, (sales.order_total_inc_tax_price - sales.order_total_ex_tax_price) as taxAmount, sales.sale_date as SaleDate, sales.sale_no as SaleNo, sales.voided_sale_id as LinkedSaleNo, sales.voided_sale_date as LinkedSaleDate, sales.sale_type as sale_type, sales.deleted
postgres-> FROM sales
postgres-> WHERE sales.deleted = 0
postgres->   AND sales.site_id = 72620
postgres->   AND order_total_inc_tax_price < 40
postgres->   AND sale_date > '2019-03-08'
postgres-> ORDER BY sale_id DESC NULLS LAST
postgres-> LIMIT 50;
                                                                         QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------
-----------
 Limit  (cost=216622.14..216622.26 rows=50 width=189)
   ->  Sort  (cost=216622.14..216671.30 rows=19666 width=189)
         Sort Key: sale_id DESC NULLS LAST
         ->  Index Scan using idx_sale_site_id on sales  (cost=0.57..215968.85 rows=19666 width=189)
               Index Cond: (site_id = 72620)
               Filter: ((order_total_inc_tax_price < '40'::numeric) AND (sale_date > '2019-03-08 00:00:00'::timestamp without time zone) AND (del
eted = 0))
(6 rows)

NULLを含むことができないPRIMARY KEYでソートしているのに、クエリプランナーがこの選択をするのはなぜですか?

このサーバーはベンチマーク用であり、データベースに他の負荷はないことは注目に値します。また、データのロード後にANALYZEが実行されました。

4
Elliot Chance

インデックスがORDER BYで指定されたのと同じ順序である場合、インデックスを使用して並べ替えずにORDER BY句を処理することができます。

これで、インデックスは(デフォルトで)ASC NULLS LASTでソートされ、インデックスは両方向でスキャンできるため、ORDER BY sale_id ASC NULLS LASTORDER BY sale_id DESC NULLS FIRSTの両方をサポートできます。ただし、順序が異なるため、ORDER BY sale_id DESC NULLS LASTをサポートできません。

プランナーは、列定義のNOT NULLを考慮しません。これは、build_index_pathkeyssrc/backend/optimizer/path/pathkeys.cで決定されます。

    if (ScanDirectionIsBackward(scandir))
    {
        reverse_sort = !index->reverse_sort[i];
        nulls_first = !index->nulls_first[i];
    }
    else
    {
        reverse_sort = index->reverse_sort[i];
        nulls_first = index->nulls_first[i];
    }

    /*
     * OK, try to make a canonical pathkey for this sort key.  Note we're
     * underneath any outer joins, so nullable_relids should be NULL.
     */
    cpathkey = make_pathkey_from_sortinfo(root,
                                          indexkey,
                                          NULL,
                                          index->sortopfamily[i],
                                          index->opcintype[i],
                                          index->indexcollations[i],
                                          reverse_sort,
                                          nulls_first,
                                          0,
                                          index->rel->relids,
                                          false);

ここでnull可能性を考慮するのがどれほど簡単かはわかりませんが、現時点では行われていません。

おそらく、pgsql-hackersメーリングリストにそれを提案できます。

2
Laurenz Albe