I am trying to debug the following query:
# Projects with at least 50 members with at least 240 yearly commits each
select project_id, count(*) as active_members from
(
select project_members.repo_id as project_id, project_members.user_id, count(*)
from project_members
inner join yearly_project_commits on project_members.user_id = yearly_project_commits.committer_id and project_members.repo_id = yearly_project_commits.project_id
group by project_members.repo_id, project_members.user_id
having count(*) > 240
) as active_member_projects
group by project_id
having count(*) > 50;
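It originally ran for days without producing a result. When I interrupted it, MySQL was not consuming significant CPU time, nor was it issuing system calls (as seen by running strace on the mysqld process). At that point, show processlist gave the following output: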
+----+-----------+-----------+-----------+---------+--------+--------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------+-----------+-----------+---------+--------+--------------+------------------------------------------------------------------------------------------------------+
| 55 | ghtorrent | localhost | ghtorrent | Query | 217907 | Sending data | select project_id, count(*) as active_members from
(
select project_members.repo_id as project_id, |
| 69 | ghtorrent | localhost | ghtorrent | Query | 0 | NULL | show processlist |
+----+-----------+-----------+-----------+---------+--------+--------------+------------------------------------------------------------------------------------------------------+
I also tried running EXPLAIN on the query, but that got stuck as well, with show processlist showing:
+----+-----------+-----------+-----------+---------+------+------------------------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------+-----------+-----------+---------+------+------------------------------+------------------------------------------------------------------------------------------------------+
| 10 | ghtorrent | localhost | ghtorrent | Query | 564 | Copying to tmp table on disk | explain select populous_projects.name, members, count(populous_projects.project_id) as yearly_projec |
| 40 | ghtorrent | localhost | ghtorrent | Query | 1 | NULL | show processlist |
+----+-----------+-----------+-----------+---------+------+------------------------------+------------------------------------------------------------------------------------------------------+
Running EXPLAIN on the inner query gives:
+----+-------------+------------------------+-------+-------------------------+------------+---------+-----------------------------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------------+-------+-------------------------+------------+---------+-----------------------------------+-------+-------------+
| 1 | SIMPLE | project_members | index | PRIMARY,user_id | PRIMARY | 8 | NULL | 53530 | Using index |
| 1 | SIMPLE | yearly_project_commits | ref | committer_id,project_id | project_id | 4 | ghtorrent.project_members.repo_id | 113 | Using where |
+----+-------------+------------------------+-------+-------------------------+------------+---------+-----------------------------------+-------+-------------+
As you can see, the fields used have indexes defined on them:
mysql> show indexes from yearly_project_commits;
+------------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| yearly_project_commits | 1 | committer_id | 1 | committer_id | A | 2535097 | NULL | NULL | YES | BTREE | | |
| yearly_project_commits | 1 | project_id | 1 | project_id | A | 7134168 | NULL | NULL | | BTREE | | |
+------------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
mysql> show indexes from project_members;
+-----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| project_members | 0 | PRIMARY | 1 | repo_id | A | 5559879 | NULL | NULL | | BTREE | | |
| project_members | 0 | PRIMARY | 2 | user_id | A | 5559879 | NULL | NULL | | BTREE | | |
| project_members | 1 | user_id | 1 | user_id | A | 5559879 | NULL | NULL | | BTREE | | |
+-----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
The corresponding create table commands are as follows:
+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| project_members | CREATE TABLE `project_members` (
`repo_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`ext_ref_id` varchar(24) NOT NULL DEFAULT '0',
PRIMARY KEY (`repo_id`,`user_id`),
KEY `user_id` (`user_id`),
CONSTRAINT `project_members_ibfk_1` FOREIGN KEY (`repo_id`) REFERENCES `projects` (`id`),
CONSTRAINT `project_members_ibfk_2` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
| yearly_project_commits | CREATE TABLE `yearly_project_commits` (
`id` int(11) NOT NULL DEFAULT '0',
`author_id` int(11) DEFAULT NULL,
`committer_id` int(11) DEFAULT NULL,
`project_id` int(11) NOT NULL DEFAULT '0',
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
KEY `committer_id` (`committer_id`),
KEY `project_id` (`project_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
The output of show full processlist while EXPLAIN was running is:
+----+-----------+-----------+-----------+---------+-------+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------+-----------+-----------+---------+-------+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 43 | ghtorrent | localhost | ghtorrent | Query | 53434 | Sending data | explain
select project_id, count(*) as active_members from
(
select project_members.repo_id as project_id, project_members.user_id, count(*)
from project_members
inner join yearly_project_commits on project_members.user_id = yearly_project_commits.committer_id and project_members.repo_id = yearly_project_commits.project_id
group by project_members.repo_id, project_members.user_id
having count(*) > 240
) as active_member_projects
group by project_id
having count(*) > 50 |
| 47 | ghtorrent | localhost | ghtorrent | Query | 0 | NULL | SHOW FULL PROCESSLIST |
+----+-----------+-----------+-----------+---------+-------+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
All of this happens on an otherwise idle machine running MySQL Server version 5.5.43-0+deb8u1 (Debian) on Debian GNU/Linux 8.0 (jessie). The machine has 24GB of RAM and is configured with innodb_buffer_pool_size=1GB.
All tables use InnoDB, except yearly_project_commits (a read-only table created to simplify a more complex query), which uses MyISAM.
For some queries, EXPLAIN tries to execute some of the subqueries in order to obtain statistics. This is a known bug that has been fixed in 5.6 and in MariaDB (at least in 10; I am not sure about 5.5).
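The indexes are not sufficient: yearly_project_commits has two single-column indexes, but the query can use only one of them at a time.
You need to create a single multi-column index on both columns (project_id, committer_id). That turns the subquery into an index-only scan that the join can access directly, which is much faster.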
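A minimal sketch of the suggestion above, assuming the table definitions shown in the question. The index name project_committer is a hypothetical choice for illustration, not something from the original schema. Because yearly_project_commits is a large MyISAM table, adding the index will probably take a while.

```sql
# Composite index covering both join columns.
# The name project_committer is an assumption; any unused name works.
alter table yearly_project_commits
  add index project_committer (project_id, committer_id);

# Re-check the plan of the inner query on its own.
explain
select project_members.repo_id as project_id, project_members.user_id, count(*)
from project_members
inner join yearly_project_commits
  on project_members.user_id = yearly_project_commits.committer_id
  and project_members.repo_id = yearly_project_commits.project_id
group by project_members.repo_id, project_members.user_id
having count(*) > 240;
```

If the optimizer picks up the new index, the yearly_project_commits row of the EXPLAIN output should list project_committer under key, and Extra should include "Using index". Whether it does depends on the optimizer and the table statistics.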