Consider the following `product` table (heavily simplified):
`id` int AUTO_INCREMENT
`category_id` int
`subcategory_id` int
`vendor_id` int
`price` decimal(6,2)
`inserted_at` timestamp
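For a given category ID, I am trying to get a list that contains, for each subcategory, the vendor whose latest price is the lowest. "Latest" means that a vendor may have set several prices for a given category ID / subcategory ID combination, so only the last-inserted price for that category ID / subcategory ID / vendor ID should be used. If two or more vendors tie on price, the lowest ID should be used as the tie-breaker.
For example, given the following data: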
id | category_id | subcategory_id | vendor_id | price | inserted_at
---------------------------------------------------------------------------
1 | 1 | 2 | 3 | 16.00 | 2015-07-23 04:00:00
2 | 1 | 1 | 2 | 9.00 | 2015-07-26 08:00:00
3 | 1 | 2 | 4 | 16.00 | 2015-08-02 10:00:00
4 | 1 | 1 | 1 | 7.00 | 2015-08-04 11:00:00
5 | 1 | 1 | 1 | 11.00 | 2015-08-09 16:00:00
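So, the first step is to find the latest price for every subcategory/vendor combination (the row with price=7.00 is discarded because it is not the latest one for that vendor in that subcategory). Then the lowest price for subcategory 1 is 9 (vendor_id = 2), and the lowest price for subcategory 2 is 16, where two vendors (ids 3 and 4) tie, so the lower one, vendor_id = 3, is chosen.
The result for category_id = 1 should then be: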
subcategory_id | vendor_id | price
----------------------------------
1 | 2 | 9.00
2 | 3 | 16.00
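To make the example reproducible, here is a minimal setup sketch for the table and the five sample rows above. The column types come from the table description; the PRIMARY KEY on `id` and the absence of other constraints are assumptions, since the real table is larger.

```sql
-- Sketch of the simplified table; the PRIMARY KEY on id is an assumption.
CREATE TABLE product (
    id             INT AUTO_INCREMENT PRIMARY KEY,
    category_id    INT,
    subcategory_id INT,
    vendor_id      INT,
    price          DECIMAL(6,2),
    inserted_at    TIMESTAMP
);

-- The five sample rows from the example data.
INSERT INTO product
    (id, category_id, subcategory_id, vendor_id, price, inserted_at)
VALUES
    (1, 1, 2, 3, 16.00, '2015-07-23 04:00:00'),
    (2, 1, 1, 2,  9.00, '2015-07-26 08:00:00'),
    (3, 1, 2, 4, 16.00, '2015-08-02 10:00:00'),
    (4, 1, 1, 1,  7.00, '2015-08-04 11:00:00'),
    (5, 1, 1, 1, 11.00, '2015-08-09 16:00:00');
```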
Here is what I have so far. I already feel it is getting out of hand, and it does not even account for ties between two or more vendors' prices.
SELECT c.subcategory_id, c.vendor_id, c.price
FROM product AS c
JOIN
(
    SELECT MIN(a.price) AS min_price,
           a.subcategory_id
    FROM product AS a
    JOIN
    (
        SELECT MAX(`inserted_at`) AS latest_price_time,
               vendor_id,
               subcategory_id
        FROM product
        WHERE category_id = 1
        GROUP BY vendor_id, subcategory_id
    ) AS b
    ON a.inserted_at = b.latest_price_time AND a.vendor_id = b.vendor_id AND a.subcategory_id = b.subcategory_id
    WHERE a.category_id = 1
    GROUP BY a.subcategory_id
) AS d
ON c.price = d.min_price AND c.subcategory_id = d.subcategory_id
WHERE c.category_id = 1
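Before going any further, I wanted to check whether there is a simpler way. When grouping/aggregating the result of another grouping/aggregation, is there an approach that gives the best performance (most important) and/or better readability (less important)?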
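This is a "greatest-n-per-group" query, and these are quite complicated to write in MySQL: first because MySQL (before version 8.0) has no window functions, and second because there are two greatest-n-per-group specifications here, the first for the latest row per subcategory and vendor, and the second for the lowest price per subcategory.
Here is one, rather complicated, way to write it: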
SELECT
    ps.subcategory_id, ps.vendor_id, ps.price   -- , ps.inserted_at
FROM
    ( SELECT DISTINCT subcategory_id
      FROM product
      WHERE category_id = 1
    ) AS s
  JOIN
    product AS ps
      ON  ps.category_id = 1
      AND ps.subcategory_id = s.subcategory_id
      AND ps.id =
          ( SELECT psv.id
            FROM
                ( SELECT DISTINCT subcategory_id, vendor_id
                  FROM product
                  WHERE category_id = 1
                ) AS sv
              JOIN
                product AS psv
                  ON  psv.category_id = 1
                  AND psv.subcategory_id = sv.subcategory_id
                  AND psv.vendor_id = sv.vendor_id
                  AND psv.inserted_at =
                      ( SELECT pi.inserted_at
                        FROM product AS pi
                        WHERE pi.category_id = 1
                          AND pi.subcategory_id = sv.subcategory_id
                          AND pi.vendor_id = sv.vendor_id
                        ORDER BY pi.inserted_at DESC
                        LIMIT 1
                      )
            WHERE sv.subcategory_id = s.subcategory_id
            ORDER BY psv.price,
                     psv.vendor_id
            LIMIT 1
          ) ;
Tested at SQLfiddle-2. With a proper index on (category_id, subcategory_id, vendor_id, inserted_at), the plan is not bad either.
It may not be the most efficient query, and I would definitely experiment with indexes (see the fiddle, which has one more index; it may not help much, but do test with a bigger table).
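As a starting point for that experimentation, the composite index on the four columns named above could be created roughly like this. The index name `q_ix` is borrowed from the EXPLAIN output further down and is otherwise arbitrary; the literal values in the probe query are sample values only.

```sql
-- Sketch: composite index on the columns used by the correlated subqueries.
-- The name q_ix is an assumption taken from the EXPLAIN output; any name works.
ALTER TABLE product
    ADD INDEX q_ix (category_id, subcategory_id, vendor_id, inserted_at);

-- Check how the innermost "latest timestamp" lookup is executed.
-- The values 1, 1, 1 are sample values, not part of the original query.
EXPLAIN
SELECT inserted_at
FROM product
WHERE category_id = 1
  AND subcategory_id = 1
  AND vendor_id = 1
ORDER BY inserted_at DESC
LIMIT 1;
```

Comparing the EXPLAIN output before and after adding the index, on a table of realistic size, shows whether the lookup can be answered from the index alone.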
(First version of the query at SQLfiddle-1)
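For comparison, MySQL 8.0 and later do provide window functions and common table expressions, so on those versions the two greatest-n-per-group steps can be sketched more directly. This is not the query tested in the fiddles; the CTE names `latest` and `cheapest`, the column aliases, and the extra `id DESC` ordering (to pick one row if a vendor has two rows with the same timestamp) are assumptions added for illustration.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions and CTEs).
WITH latest AS (
    -- Step 1: number each vendor's prices within a subcategory, newest first.
    -- "id DESC" is an assumed tie-breaker for identical timestamps.
    SELECT subcategory_id, vendor_id, price,
           ROW_NUMBER() OVER (PARTITION BY subcategory_id, vendor_id
                              ORDER BY inserted_at DESC, id DESC) AS rn_latest
    FROM product
    WHERE category_id = 1
),
cheapest AS (
    -- Step 2: among the latest prices only, number vendors by price,
    -- breaking price ties with the lowest vendor_id.
    SELECT subcategory_id, vendor_id, price,
           ROW_NUMBER() OVER (PARTITION BY subcategory_id
                              ORDER BY price, vendor_id) AS rn_cheapest
    FROM latest
    WHERE rn_latest = 1
)
SELECT subcategory_id, vendor_id, price
FROM cheapest
WHERE rn_cheapest = 1
ORDER BY subcategory_id;
```

On the five sample rows from the question this should return vendor 2 at 9.00 for subcategory 1 and vendor 3 at 16.00 for subcategory 2. Whether it outperforms the correlated-subquery version depends on the data and indexes, so it is worth comparing both plans.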
This should work:
SELECT
    d.subcategory_id,
    d.vendor_id,
    MIN(d.price) AS price,
    d.inserted_at
FROM product AS d
JOIN (SELECT
          b.category_id,
          b.subcategory_id,
          b.vendor_id,
          a.last_iat
      FROM product AS b
      JOIN (SELECT
                a.category_id,
                a.subcategory_id,
                a.vendor_id,
                a.price,
                MAX(a.inserted_at) AS last_iat
            FROM product AS a
            GROUP BY a.category_id,a.subcategory_id,a.vendor_id
           ) AS a
        ON (a.category_id=b.category_id AND a.subcategory_id=b.subcategory_id AND a.vendor_id=b.vendor_id)
      GROUP BY b.category_id,b.subcategory_id,b.vendor_id) AS c
  ON (c.category_id=d.category_id AND c.subcategory_id=d.subcategory_id AND c.last_iat=d.inserted_at)
WHERE d.category_id=1
GROUP BY d.category_id,d.subcategory_id;
Test:
mysql> SELECT
-> d.subcategory_id,
-> d.vendor_id,
-> MIN(d.price) AS price,
-> d.inserted_at
-> FROM product AS d
-> JOIN (SELECT
-> b.category_id,
-> b.subcategory_id,
-> b.vendor_id,
-> a.last_iat
-> FROM product AS b
-> JOIN (SELECT
-> a.category_id,
-> a.subcategory_id,
-> a.vendor_id,
-> a.price,
-> MAX(a.inserted_at) AS last_iat
-> FROM product AS a
-> GROUP BY a.category_id,a.subcategory_id,a.vendor_id
-> ) AS a
-> ON (a.category_id=b.category_id AND a.subcategory_id=b.subcategory_id AND a.vendor_id=b.vendor_id)
-> GROUP BY b.category_id,b.subcategory_id,b.vendor_id) AS c
-> ON (c.category_id=d.category_id AND c.subcategory_id=d.subcategory_id AND c.last_iat=d.inserted_at)
-> WHERE d.category_id=1
-> GROUP BY d.category_id,d.subcategory_id;
+----------------+-----------+-------+---------------------+
| subcategory_id | vendor_id | price | inserted_at |
+----------------+-----------+-------+---------------------+
| 1 | 2 | 9.00 | 2015-07-26 08:00:00 |
| 2 | 3 | 16.00 | 2015-07-23 04:00:00 |
+----------------+-----------+-------+---------------------+
2 rows in set (0.00 sec)
mysql>
Explain:
I used the index recommended by @ypercube.
mysql> EXPLAIN SELECT d.subcategory_id, d.vendor_id, MIN(d.price) AS price, d.inserted_at FROM product AS d JOIN (SELECT b.category_id, b.subcategory_id, b.vendor_id, a.last_iat FROM product AS b JOIN (SELECT a.category_id, a.subcategory_id, a.vendor_id, a.price, MAX(a.inserted_at) AS last_iat FROM product AS a GROUP BY a.category_id,a.subcategory_id,a.vendor_id ) AS a ON (a.category_id=b.category_id AND a.subcategory_id=b.subcategory_id AND a.vendor_id=b.vendor_id) GROUP BY b.category_id,b.subcategory_id,b.vendor_id) AS c ON (c.category_id=d.category_id AND c.subcategory_id=d.subcategory_id AND c.last_iat=d.inserted_at) WHERE d.category_id=1 GROUP BY d.category_id,d.subcategory_id;
+----+-------------+------------+-------+---------------+------+---------+--------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+------+---------+--------------------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | d | ALL | q_ix | NULL | NULL | NULL | 5 | Using where; Using join buffer |
| 2 | DERIVED | <derived3> | ALL | NULL | NULL | NULL | NULL | 4 | Using temporary; Using filesort |
| 2 | DERIVED | b | ref | q_ix | q_ix | 15 | a.category_id,a.subcategory_id,a.vendor_id | 1 | Using where; Using index |
| 3 | DERIVED | a | index | NULL | q_ix | 19 | NULL | 5 | |
+----+-------------+------------+-------+---------------+------+---------+--------------------------------------------+------+----------------------------------------------+
5 rows in set (0.00 sec)
mysql>