How to do a PostgreSQL subquery in the select clause with a join in the from clause, like SQL Server?

Question

I am trying to write the following query in PostgreSQL:

select name, author_id, count(1),
       (select count(1)
        from names as n2
        where n2.id = n1.id
          and n2.author_id = n1.author_id)
from names as n1
group by name, author_id

This certainly works in Microsoft SQL Server, but it does not work at all in PostgreSQL. I read its documentation a bit, and it seems I could rewrite it as:

select name, author_id, count(1), total
from names as n1,
     (select count(1) as total
      from names as n2
      where n2.id = n1.id
        and n2.author_id = n1.author_id) as total
group by name, author_id

But that returns the following error in PostgreSQL: "subquery in FROM cannot refer to other relations of same query level". So I'm stuck. Does anyone know how I can achieve this?

Thanks

Bob Jarvis · Accepted Answer

I'm not sure I understand your intent perfectly, but perhaps the following is close to what you want:

select n1.name, n1.author_id, count_1, total_count
from (select id, name, author_id, count(1) as count_1
      from names
      group by id, name, author_id) n1
inner join (select id, author_id, count(1) as total_count
            from names
            group by id, author_id) n2
  on (n2.id = n1.id and n2.author_id = n1.author_id)

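Unfortunately, this adds the requirement of grouping the first subquery by id as well as by name and author_id. I don't see a way around that, though, because id has to be available to join to the second subquery. Perhaps someone else will come up with a better solution.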

Share and enjoy.

Ricardo · Answer

I am answering here with a formatted version of the final SQL I needed, based on Bob Jarvis's answer, as posted in my comment above:

select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
from (select id, name, author_id, count(1) as count_1
      from names
      group by id, name, author_id) n1
inner join (select author_id, count(1) as total_count
            from names
            group by author_id) n2
  on (n2.author_id = n1.author_id)
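
For readers who want to try this query, here is a minimal sketch on made-up data. The temporary table, its column types, and the sample rows are assumptions for illustration (the question never shows the real names table), and the share alias is added only for readability.

```sql
BEGIN;

-- Hypothetical stand-in for the "names" table; dropped automatically at COMMIT.
CREATE TEMP TABLE names (
    id        integer,
    name      text,
    author_id integer
) ON COMMIT DROP;

INSERT INTO names (id, name, author_id) VALUES
    (1, 'alpha', 10),
    (1, 'alpha', 10),
    (2, 'beta',  10),
    (3, 'gamma', 20);

-- Row count of each (id, name, author_id) group divided by
-- the total row count of the same author.
SELECT n1.name,
       n1.author_id,
       CAST(n1.count_1 AS numeric) / n2.total_count AS share
FROM (SELECT id, name, author_id, count(*) AS count_1
      FROM names
      GROUP BY id, name, author_id) AS n1
INNER JOIN (SELECT author_id, count(*) AS total_count
            FROM names
            GROUP BY author_id) AS n2
        ON n2.author_id = n1.author_id
ORDER BY n1.author_id, n1.name;

COMMIT;
```

With these rows, author 10 has three rows in total and two of them fall in the alpha group, so that group's share is 2/3.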

deFreitas · Answer

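Complementing the answers from @Bob Jarvis and @dmikam: Postgres does not produce a good plan when you don't use LATERAL, as the simulation below shows. In both cases the query returns the same data, but the costs are very different.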

Table structure

CREATE TABLE ITEMS (
    N INTEGER NOT NULL,
    S TEXT NOT NULL
);

INSERT INTO ITEMS
SELECT (random()*1000000)::integer AS n, md5(random()::text) AS s
FROM generate_series(1,1000000);

CREATE INDEX N_INDEX ON ITEMS(N);

Performing a JOIN with GROUP BY in the subquery, without LATERAL

EXPLAIN
SELECT I.*
FROM ITEMS I
INNER JOIN (
    SELECT COUNT(1), n
    FROM ITEMS
    GROUP BY N
) I2 ON I2.N = I.N
WHERE I.N IN (243477, 997947);

Results

Merge Join  (cost=0.87..637500.40 rows=23 width=37)
  Merge Cond: (i.n = items.n)
  ->  Index Scan using n_index on items i  (cost=0.43..101.28 rows=23 width=37)
        Index Cond: (n = ANY ('{243477,997947}'::integer[]))
  ->  GroupAggregate  (cost=0.43..626631.11 rows=861418 width=12)
        Group Key: items.n
        ->  Index Only Scan using n_index on items  (cost=0.43..593016.93 rows=10000000 width=4)

Using LATERAL

EXPLAIN
SELECT I.*
FROM ITEMS I
INNER JOIN LATERAL (
    SELECT COUNT(1), n
    FROM ITEMS
    WHERE N = I.N
    GROUP BY N
) I2 ON 1=1 --I2.N = I.N
WHERE I.N IN (243477, 997947);

Results

Nested Loop  (cost=9.49..1319.97 rows=276 width=37)
  ->  Bitmap Heap Scan on items i  (cost=9.06..100.20 rows=23 width=37)
        Recheck Cond: (n = ANY ('{243477,997947}'::integer[]))
        ->  Bitmap Index Scan on n_index  (cost=0.00..9.05 rows=23 width=0)
              Index Cond: (n = ANY ('{243477,997947}'::integer[]))
  ->  GroupAggregate  (cost=0.43..52.79 rows=12 width=12)
        Group Key: items.n
        ->  Index Only Scan using n_index on items  (cost=0.43..52.64 rows=12 width=4)
              Index Cond: (n = i.n)

My Postgres version is PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1).
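
The plans above show planner estimates only. To compare measured run times, EXPLAIN (ANALYZE, BUFFERS) can be used, which actually executes the statement. A rough sketch, assuming a smaller temporary copy of the table (the row count, the index name items_n_idx, and the two looked-up values are made up for illustration):

```sql
BEGIN;

-- Smaller, hypothetical copy of the ITEMS table so the test runs quickly.
CREATE TEMP TABLE items (
    n integer NOT NULL,
    s text    NOT NULL
) ON COMMIT DROP;

INSERT INTO items
SELECT (random() * 100000)::integer AS n, md5(random()::text) AS s
FROM generate_series(1, 100000);

CREATE INDEX items_n_idx ON items (n);

-- Refresh planner statistics for the freshly loaded table.
ANALYZE items;

-- ANALYZE runs the query and reports actual row counts and timings;
-- BUFFERS adds buffer usage per plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT i.*
FROM items AS i
INNER JOIN LATERAL (
    SELECT count(*) AS cnt, n
    FROM items
    WHERE n = i.n
    GROUP BY n
) AS i2 ON true
WHERE i.n IN (243, 99794);

COMMIT;
```

Running the same EXPLAIN (ANALYZE, BUFFERS) on the non-LATERAL variant gives the numbers to compare against.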

dmikam · Answer

I know this is old, but since PostgreSQL 9.3 there is an option to use the keyword "LATERAL" to use correlated subqueries inside JOINs, so the query from the question would look like this:

SELECT name, author_id, count(*), t.total
FROM names as n1
INNER JOIN LATERAL (
    SELECT count(*) as total
    FROM names as n2
    WHERE n2.id = n1.id
      AND n2.author_id = n1.author_id
) as t ON 1=1
GROUP BY n1.name, n1.author_id
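
A LATERAL subquery may reference columns of FROM items that appear before it, which is what the plain FROM subquery in the question was not allowed to do. A minimal sketch follows, assuming the goal is the per-author total from Ricardo's final SQL. It aggregates first and then correlates on author_id only, so that every selected column is either grouped or aggregated. The table definition and rows are hypothetical.

```sql
BEGIN;

-- Hypothetical stand-in for the "names" table; dropped automatically at COMMIT.
CREATE TEMP TABLE names (
    id        integer,
    name      text,
    author_id integer
) ON COMMIT DROP;

INSERT INTO names (id, name, author_id) VALUES
    (1, 'alpha', 10),
    (1, 'alpha', 10),
    (2, 'beta',  10),
    (3, 'gamma', 20);

-- g: one row per (name, author_id) with its row count.
-- t: evaluated once per row of g; it can see g.author_id because of LATERAL.
SELECT g.name, g.author_id, g.count_1, t.total
FROM (SELECT name, author_id, count(*) AS count_1
      FROM names
      GROUP BY name, author_id) AS g
INNER JOIN LATERAL (
    SELECT count(*) AS total
    FROM names AS n2
    WHERE n2.author_id = g.author_id
) AS t ON true
ORDER BY g.author_id, g.name;

COMMIT;
```

Removing the LATERAL keyword from this sketch makes the reference to g.author_id invalid again, which is the situation the question ran into.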

Zahid Gani · Answer

select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
from (select id, name, author_id, count(1) as count_1
      from names
      group by id, name, author_id) n1
inner join (select distinct(author_id), count(1) as total_count
            from names) n2
  on (n2.author_id = n1.author_id)
where true

Use distinct when there are many inner joins, because joining on grouped subqueries performs slowly.