PostgreSQL MAX and GROUP BY

Question

I have a table with id, year, and count.

I want to get the MAX(count) for each id and keep the year in which it occurs, so I wrote this query:

SELECT id, year, MAX(count) FROM table GROUP BY id;

Unfortunately, it gives me an error:

ERROR: column "table.year" must appear in the GROUP BY clause or be used in an aggregate function

So I tried:

SELECT id, year, MAX(count) FROM table GROUP BY id, year;

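But then MAX(count) has no visible effect, and the table is shown as it is. I suppose that is because, when grouping by year and id, it gets the maximum for the id within that specific year.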
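To make that behavior concrete, here is a small sketch on made-up data. The table name counts_by_year, the column name cnt, and the rows are assumptions for illustration only; they are not the real table.

```sql
-- Hypothetical sample data (names and values are assumptions).
CREATE TEMP TABLE counts_by_year (id int, year int, cnt int);
INSERT INTO counts_by_year VALUES
  (1, 2010, 5),
  (1, 2011, 9),
  (2, 2010, 7),
  (2, 2012, 3);

-- One group per (id, year) pair. Every pair is unique here, so each
-- group holds a single row and MAX(cnt) is simply that row's cnt:
-- the output has the same four rows as the table.
SELECT id, year, MAX(cnt) FROM counts_by_year GROUP BY id, year;

-- One group per id. The maximum is computed (9 for id 1, 7 for id 2),
-- but the year it belongs to is no longer available.
SELECT id, MAX(cnt) FROM counts_by_year GROUP BY id;
```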

So how can I write that query? I want to get each id's MAX(count) and the year in which it occurs.

a_horse_with_no_name · Accepted Answer

select *
from (
  select id,
         year,
         thing,
         max(thing) over (partition by id) as max_thing
  from the_table
) t
where thing = max_thing;

or:

select t1.id,
       t1.year,
       t1.thing
from the_table t1
where t1.thing = (select max(t2.thing)
                  from the_table t2
                  where t2.id = t1.id);

or:

select t1.id,
       t1.year,
       t1.thing
from the_table t1
  join (
    select id, max(t2.thing) as max_thing
    from the_table t2
    group by id
  ) t on t.id = t1.id and t.max_thing = t1.thing;

or (the same as the previous one, in a different notation):

with max_stuff as (
  select id, max(t2.thing) as max_thing
  from the_table t2
  group by id
)
select t1.id,
       t1.year,
       t1.thing
from the_table t1
  join max_stuff t2
    on t1.id = t2.id
   and t1.thing = t2.max_thing;
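A rough illustration on made-up data (the rows below are assumptions, not taken from the question): every variant above compares thing against the per-id maximum, so an id whose maximum occurs in more than one year comes back once per tied year rather than exactly once.

```sql
-- Hypothetical sample data; id 1 reaches its maximum (9) in two years.
CREATE TEMP TABLE the_table (id int, year int, thing int);
INSERT INTO the_table VALUES
  (1, 2010, 5),
  (1, 2011, 9),
  (1, 2012, 9),
  (2, 2010, 7),
  (2, 2012, 3);

-- Window-function variant; ORDER BY added only to make the output stable.
select id, year, thing
from (
  select id,
         year,
         thing,
         max(thing) over (partition by id) as max_thing
  from the_table
) t
where thing = max_thing
order by id, year;
-- Returns (1, 2011, 9), (1, 2012, 9), (2, 2010, 7):
-- both tied rows for id 1 are kept.
```

If exactly one row per id is required, some tie-breaking rule has to be added on top of these queries.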

Erwin Brandstetter · Answer

The shortest (and possibly fastest) query uses DISTINCT ON, a PostgreSQL extension of the SQL-standard DISTINCT clause:

SELECT DISTINCT ON (1) id, count, year FROM tbl ORDER BY 1, 2 DESC, 3;

The numbers refer to ordinal positions in the SELECT list. You can spell out the column names for clarity:

SELECT DISTINCT ON (id) id, count, year FROM tbl ORDER BY id, count DESC, year;

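The result is ordered by id, which may or may not be welcome. In any case, it is better than "undefined".

It also breaks ties (when multiple years share the same maximum count) in a well-defined way: it picks the earliest year. If you don't care, drop year from the ORDER BY. Or pick the latest year with year DESC.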
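A minimal sketch of the tie-breaking behavior, assuming a small made-up tbl in which id 1 reaches its maximum count in both 2011 and 2012 (the rows are illustrative only):

```sql
-- Hypothetical sample data (values are assumptions).
CREATE TEMP TABLE tbl (id int, year int, count int);
INSERT INTO tbl VALUES
  (1, 2010, 5),
  (1, 2011, 9),
  (1, 2012, 9),
  (2, 2010, 7),
  (2, 2012, 3);

-- Earliest year wins the tie: returns (1, 9, 2011) and (2, 7, 2010).
SELECT DISTINCT ON (id) id, count, year
FROM tbl
ORDER BY id, count DESC, year;

-- Latest year wins the tie: returns (1, 9, 2012) and (2, 7, 2010).
SELECT DISTINCT ON (id) id, count, year
FROM tbl
ORDER BY id, count DESC, year DESC;
```

DISTINCT ON keeps the first row of each id group in ORDER BY order, which is why the sort direction on year decides which tied row survives.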

More explanation, links, a benchmark, and possibly faster solutions in this closely related answer:

Select first row in each GROUP BY group?

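Aside: In a real-life query, you wouldn't use some of these column names. id is a non-descriptive anti-pattern for a column name, and count is a reserved word in standard SQL and an aggregate function in Postgres.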
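As a sketch of what that might look like, the same query with more descriptive identifiers. The names item_stats, item_id, and item_count are purely hypothetical; pick whatever fits the actual data.

```sql
-- Hypothetical table and column names, chosen only for illustration.
CREATE TEMP TABLE item_stats (item_id int, year int, item_count int);

SELECT DISTINCT ON (item_id) item_id, item_count, year
FROM item_stats
ORDER BY item_id, item_count DESC, year;
```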