How to delete duplicates from a SQL table based on multiple fields

Question

I have a table of games, which is described as follows:

+---------------+-------------+------+-----+---------+----------------+
| Field         | Type        | Null | Key | Default | Extra          |
+---------------+-------------+------+-----+---------+----------------+
| id            | int(11)     | NO   | PRI | NULL    | auto_increment |
| date          | date        | NO   |     | NULL    |                |
| time          | time        | NO   |     | NULL    |                |
| hometeam_id   | int(11)     | NO   | MUL | NULL    |                |
| awayteam_id   | int(11)     | NO   | MUL | NULL    |                |
| locationcity  | varchar(30) | NO   |     | NULL    |                |
| locationstate | varchar(20) | NO   |     | NULL    |                |
+---------------+-------------+------+-----+---------+----------------+

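However, each game was listed in the schedules of both teams, so the table contains duplicate entries. Is there a SQL statement I can use to find and delete all duplicates based on identical date, time, hometeam_id, awayteam_id, locationcity, and locationstate fields?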

N West · Accepted Answer

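You should be able to do a correlated subquery to delete the data. Find all rows that are duplicates and delete all but the one with the smallest id. For MySQL, an inner join (functionally equivalent to EXISTS) needs to be used, like so: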

    delete games
    from games
    inner join (
        select min(id) minid, date, time, hometeam_id, awayteam_id,
               locationcity, locationstate
        from games
        group by date, time, hometeam_id, awayteam_id,
                 locationcity, locationstate
        having count(1) > 1
    ) as duplicates
    on (duplicates.date = games.date
        and duplicates.time = games.time
        and duplicates.hometeam_id = games.hometeam_id
        and duplicates.awayteam_id = games.awayteam_id
        and duplicates.locationcity = games.locationcity
        and duplicates.locationstate = games.locationstate
        and duplicates.minid <> games.id)

To test it, replace "delete games from games" with "select * from games". Don't just run a delete on your DB :-)
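The "preview first, delete second" advice can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows and the function names (make_games_db, preview_duplicates, delete_duplicates) are hypothetical, and because SQLite has no "DELETE ... JOIN" form the same rule (keep the smallest id per group) is written with NOT IN instead.

```python
import sqlite3

# The six columns that define "the same game" in the question.
KEY_COLS = "date, time, hometeam_id, awayteam_id, locationcity, locationstate"


def make_games_db():
    """Build an in-memory games table with one duplicated game (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE games ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, date TEXT, time TEXT, "
        "hometeam_id INTEGER, awayteam_id INTEGER, "
        "locationcity TEXT, locationstate TEXT)"
    )
    rows = [
        ("2012-01-07", "19:00:00", 1, 2, "Boston", "MA"),
        ("2012-01-07", "19:00:00", 1, 2, "Boston", "MA"),  # duplicate of id 1
        ("2012-01-08", "18:30:00", 3, 1, "Denver", "CO"),
    ]
    conn.executemany(
        f"INSERT INTO games ({KEY_COLS}) VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return conn


def preview_duplicates(conn):
    """Return the ids a delete would remove, without changing anything."""
    sql = (
        "SELECT id FROM games WHERE id NOT IN "
        f"(SELECT MIN(id) FROM games GROUP BY {KEY_COLS}) ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql)]


def delete_duplicates(conn, commit=False):
    """Delete all but the smallest id per group; roll back unless commit=True."""
    cur = conn.execute(
        "DELETE FROM games WHERE id NOT IN "
        f"(SELECT MIN(id) FROM games GROUP BY {KEY_COLS})"
    )
    removed = cur.rowcount
    if commit:
        conn.commit()
    else:
        conn.rollback()  # dry run: undo the delete
    return removed
```

Calling preview_duplicates first and delete_duplicates with commit=False second gives two chances to check the affected rows before anything is made permanent.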

Grigor Gevorgyan · Answer

You can try a query like this:

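    DELETE FROM table_name AS t1
    WHERE EXISTS (
        SELECT 1 FROM table_name AS t2
        WHERE t2.date = t1.date
          AND t2.time = t1.time
          AND t2.hometeam_id = t1.hometeam_id
          AND t2.awayteam_id = t1.awayteam_id
          AND t2.locationcity = t1.locationcity
          AND t2.locationstate = t1.locationstate
          AND t2.id < t1.id
    )

This leaves only one row per game in the database: the one with the smallest id. A row is deleted whenever an identical row with a smaller id exists (with t2.id > t1.id the row with the largest id would survive instead). Note that MySQL may reject this form, because it does not allow a DELETE to select from its own target table in a subquery; the join form in the accepted answer avoids that restriction.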
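To see how the direction of the id comparison decides which copy survives, here is a small sketch. It is an assumption-laden illustration rather than the answerer's code: it runs against SQLite through Python's sqlite3 module (which, unlike MySQL, accepts a subquery on the table being deleted from), and sample_db, dedupe_with_exists, and the sample rows are hypothetical.

```python
import sqlite3

KEY_COLS = ["date", "time", "hometeam_id", "awayteam_id",
            "locationcity", "locationstate"]


def sample_db():
    """In-memory games table: ids 1-3 are the same game, id 4 is different."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE games (id INTEGER PRIMARY KEY, date TEXT, time TEXT, "
        "hometeam_id INTEGER, awayteam_id INTEGER, "
        "locationcity TEXT, locationstate TEXT)"
    )
    game_a = ("2012-01-07", "19:00:00", 1, 2, "Boston", "MA")
    game_b = ("2012-01-08", "18:30:00", 3, 1, "Denver", "CO")
    conn.executemany(
        f"INSERT INTO games ({', '.join(KEY_COLS)}) VALUES (?, ?, ?, ?, ?, ?)",
        [game_a, game_a, game_a, game_b],
    )
    conn.commit()
    return conn


def dedupe_with_exists(conn, keep="min"):
    """Delete duplicates with a correlated EXISTS; keep the min or max id."""
    # keep="min": delete a row if an identical row with a SMALLER id exists.
    # keep="max": delete a row if an identical row with a LARGER id exists.
    op = "<" if keep == "min" else ">"
    same_game = " AND ".join(f"t2.{c} = games.{c}" for c in KEY_COLS)
    conn.execute(
        "DELETE FROM games WHERE EXISTS ("
        f"SELECT 1 FROM games AS t2 WHERE {same_game} AND t2.id {op} games.id)"
    )
    conn.commit()
    return [row[0] for row in conn.execute("SELECT id FROM games ORDER BY id")]
```

With the sample data, keep="min" leaves ids 1 and 4, while keep="max" leaves ids 3 and 4.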

Ali Hashemi · Answer

What worked best for me was to recreate the table:

CREATE TABLE newtable SELECT * FROM oldtable GROUP BY field1,field2;

You can then rename it.
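A rough sketch of the recreate-and-rename route, applied to the games table from the question. Several things here are assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the names games_dedup, sample_db, rebuild_without_duplicates, and has_primary_key are hypothetical, and MIN(id) is selected explicitly instead of SELECT *, since newer MySQL versions with ONLY_FULL_GROUP_BY enabled may reject non-aggregated columns that are not in the GROUP BY. One caveat the sketch makes visible: a table created from a SELECT does not carry over the primary key or indexes, so those would need to be added back.

```python
import sqlite3

KEY_COLS = "date, time, hometeam_id, awayteam_id, locationcity, locationstate"


def sample_db():
    """In-memory games table: ids 1 and 2 are the same game, id 3 is different."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE games (id INTEGER PRIMARY KEY, date TEXT, time TEXT, "
        "hometeam_id INTEGER, awayteam_id INTEGER, "
        "locationcity TEXT, locationstate TEXT)"
    )
    game_a = ("2012-01-07", "19:00:00", 1, 2, "Boston", "MA")
    game_b = ("2012-01-08", "18:30:00", 3, 1, "Denver", "CO")
    conn.executemany(
        f"INSERT INTO games ({KEY_COLS}) VALUES (?, ?, ?, ?, ?, ?)",
        [game_a, game_a, game_b],
    )
    conn.commit()
    return conn


def rebuild_without_duplicates(conn):
    """Copy one row per game into a new table, then swap it in for the old one."""
    conn.execute(
        f"CREATE TABLE games_dedup AS "
        f"SELECT MIN(id) AS id, {KEY_COLS} FROM games GROUP BY {KEY_COLS}"
    )
    conn.execute("DROP TABLE games")
    conn.execute("ALTER TABLE games_dedup RENAME TO games")
    conn.commit()
    return [row[0] for row in conn.execute("SELECT id FROM games ORDER BY id")]


def has_primary_key(conn):
    """True if any column of games is flagged as part of a primary key."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return any(row[5] for row in conn.execute("PRAGMA table_info(games)"))
```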

Rem · Answer

To get a list of the duplicate entries that match on two fields:

    select t.ID, t.field1, t.field2
    from (
        select field1, field2
        from table_name
        group by field1, field2
        having count(*) > 1
    ) x, table_name t
    where x.field1 = t.field1 and x.field2 = t.field2
    order by t.field1, t.field2

And to delete only the duplicates (keeping the row with the smallest id in each group):

    DELETE x
    FROM table_name x
    JOIN table_name y
      ON y.field1 = x.field1
     AND y.field2 = x.field2
     AND y.id < x.id;

Neville Kuyt · Answer

    select orig.id, dupl.id
    from games orig, games dupl
    where orig.date = dupl.date
      and orig.time = dupl.time
      and orig.hometeam_id = dupl.hometeam_id
      and orig.awayteam_id = dupl.awayteam_id
      and orig.locationcity = dupl.locationcity
      and orig.locationstate = dupl.locationstate
      and orig.id < dupl.id

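This should give you the duplicates as pairs of ids; you can use it as a subquery to specify the ids to delete (the dupl.id column holds the later copies).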
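One way the "use it as a subquery" step might look is sketched below. It is a hedged illustration only: it runs on SQLite through Python's sqlite3 module rather than MySQL, and the helper names (sample_db, duplicate_pairs, delete_later_copies) and the sample rows are hypothetical. The self-join is written with an explicit JOIN ... ON, which is equivalent to the comma join above.

```python
import sqlite3

KEY_COLS = ["date", "time", "hometeam_id", "awayteam_id",
            "locationcity", "locationstate"]

# Self-join that pairs each row (orig) with every later identical row (dupl).
PAIR_JOIN = (
    "FROM games AS orig JOIN games AS dupl ON "
    + " AND ".join(f"orig.{c} = dupl.{c}" for c in KEY_COLS)
    + " AND orig.id < dupl.id"
)


def sample_db():
    """In-memory games table: ids 1-3 are the same game, id 4 is different."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE games (id INTEGER PRIMARY KEY, date TEXT, time TEXT, "
        "hometeam_id INTEGER, awayteam_id INTEGER, "
        "locationcity TEXT, locationstate TEXT)"
    )
    game_a = ("2012-01-07", "19:00:00", 1, 2, "Boston", "MA")
    game_b = ("2012-01-08", "18:30:00", 3, 1, "Denver", "CO")
    conn.executemany(
        f"INSERT INTO games ({', '.join(KEY_COLS)}) VALUES (?, ?, ?, ?, ?, ?)",
        [game_a, game_a, game_a, game_b],
    )
    conn.commit()
    return conn


def duplicate_pairs(conn):
    """List (earlier id, later id) for every pair of identical games."""
    sql = f"SELECT orig.id, dupl.id {PAIR_JOIN} ORDER BY orig.id, dupl.id"
    return conn.execute(sql).fetchall()


def delete_later_copies(conn):
    """Delete every row that shows up as the later half of a duplicate pair."""
    conn.execute(f"DELETE FROM games WHERE id IN (SELECT dupl.id {PAIR_JOIN})")
    conn.commit()
    return [row[0] for row in conn.execute("SELECT id FROM games ORDER BY id")]
```

Note that a game stored three times yields three pairs, but only the two later ids ever appear in the dupl.id column, so the earliest copy is kept.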

piotrpo · Answer

    delete from games
    where id not in (
        select max(id)
        from games
        group by date, time, hometeam_id, awayteam_id,
                 locationcity, locationstate
    );

Workaround

    create temporary table temp_table as
        select max(id) as id
        from games
        group by date, time, hometeam_id, awayteam_id,
                 locationcity, locationstate;

    delete from games
    where id not in (select id from temp_table);
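The temporary-table workaround can be tried end to end with a sketch like the one below. The assumptions: SQLite through Python's sqlite3 module stands in for MySQL, and keep_ids, sample_db, and dedupe_via_temp_table are hypothetical names. As in the statement above, the row with the largest id in each group is the one that is kept.

```python
import sqlite3

KEY_COLS = "date, time, hometeam_id, awayteam_id, locationcity, locationstate"


def sample_db():
    """In-memory games table: ids 1 and 2 are the same game, id 3 is different."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE games (id INTEGER PRIMARY KEY, date TEXT, time TEXT, "
        "hometeam_id INTEGER, awayteam_id INTEGER, "
        "locationcity TEXT, locationstate TEXT)"
    )
    game_a = ("2012-01-07", "19:00:00", 1, 2, "Boston", "MA")
    game_b = ("2012-01-08", "18:30:00", 3, 1, "Denver", "CO")
    conn.executemany(
        f"INSERT INTO games ({KEY_COLS}) VALUES (?, ?, ?, ?, ?, ?)",
        [game_a, game_a, game_b],
    )
    conn.commit()
    return conn


def dedupe_via_temp_table(conn):
    """Stage the ids to keep in a temp table, then delete everything else."""
    # Step 1: one id (the largest) per distinct game goes into the temp table.
    conn.execute(
        f"CREATE TEMP TABLE keep_ids AS "
        f"SELECT MAX(id) AS id FROM games GROUP BY {KEY_COLS}"
    )
    # Step 2: the delete reads only from keep_ids, not from games itself.
    conn.execute("DELETE FROM games WHERE id NOT IN (SELECT id FROM keep_ids)")
    conn.commit()
    # Step 3: clean up the staging table.
    conn.execute("DROP TABLE keep_ids")
    return [row[0] for row in conn.execute("SELECT id FROM games ORDER BY id")]
```

Because the DELETE no longer selects from games in its subquery, this shape sidesteps the restriction that motivates the workaround.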

Wicked Coder · Answer

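As long as you don't select the table's id (primary key) in your select query, and the rest of the data is exactly the same, you can use SELECT DISTINCT to avoid getting duplicate results. This hides the duplicates when reading; it does not remove them from the table.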
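A minimal sketch of the SELECT DISTINCT approach, assuming SQLite through Python's sqlite3 module in place of MySQL; sample_db, distinct_games, and the sample rows are hypothetical. The id column is deliberately left out of the select list, since including it would make every row distinct.

```python
import sqlite3

KEY_COLS = "date, time, hometeam_id, awayteam_id, locationcity, locationstate"


def sample_db():
    """In-memory games table: ids 1 and 2 are the same game, id 3 is different."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE games (id INTEGER PRIMARY KEY, date TEXT, time TEXT, "
        "hometeam_id INTEGER, awayteam_id INTEGER, "
        "locationcity TEXT, locationstate TEXT)"
    )
    game_a = ("2012-01-07", "19:00:00", 1, 2, "Boston", "MA")
    game_b = ("2012-01-08", "18:30:00", 3, 1, "Denver", "CO")
    conn.executemany(
        f"INSERT INTO games ({KEY_COLS}) VALUES (?, ?, ?, ?, ?, ?)",
        [game_a, game_a, game_b],
    )
    conn.commit()
    return conn


def distinct_games(conn):
    """Read each game once, ignoring id; the table itself is not modified."""
    sql = f"SELECT DISTINCT {KEY_COLS} FROM games ORDER BY date, time"
    return conn.execute(sql).fetchall()
```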

limscoder · Answer

    DELETE FROM table
    WHERE id IN (
        SELECT t.id
        FROM table AS t
        JOIN table AS tj
          ON (t.date = tj.date
              AND t.hometeam_id = tj.hometeam_id
              AND t.awayteam_id = tj.awayteam_id ...)
    )