SQL Server Indexes and Statistics

Question

What is the difference between CREATE INDEX and CREATE STATISTICS, and when should I use each?

Thomas Stringer · Accepted Answer

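An index stores actual data (data pages or index pages, depending on the type of index), whereas statistics store the data distribution. So CREATE INDEX is the DDL that creates an index (clustered, nonclustered, and so on), and CREATE STATISTICS is the DDL that creates statistics on columns within a table.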

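I recommend reading up on these aspects of relational data. Below are a few introductory articles for beginners. These are very broad topics, so the material on them can go both very wide and very deep. Read the references below to get the general idea, and ask more specific questions as they come up.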

BOL reference on Table and Index Organization
BOL reference on Clustered Index Structures
BOL reference on Nonclustered Index Structures
SQL Server Central on an Introduction to Indexes
BOL reference on Statistics

Here is a working example that shows these two pieces in action (commented to explain each step):

    use testdb;
    go

    create table MyTable1
    (
        id int identity(1, 1) not null,
        my_int_col int not null
    );
    go

    insert into MyTable1(my_int_col)
    values(1);
    go 100

    -- this statement will create a clustered index
    -- on MyTable1. The index key is the id field
    -- but due to the nature of a clustered index
    -- it will contain all of the table data
    create clustered index MyTable1_CI
    on MyTable1(id);
    go

    -- by default, SQL Server will create a statistics object
    -- on this index. Here is proof. We see a stat created
    -- with the name of the index, and its stat
    -- column is the index key column
    select
        s.name as stats_name,
        c.name as column_name
    from sys.stats s
    inner join sys.stats_columns sc
    on s.object_id = sc.object_id
    and s.stats_id = sc.stats_id
    inner join sys.columns c
    on sc.object_id = c.object_id
    and sc.column_id = c.column_id
    where s.object_id = object_id('MyTable1');

    -- here is a standalone statistics object on a single column
    create statistics MyTable1_MyIntCol
    on MyTable1(my_int_col);
    go

    -- now look at the statistics that exist on the table.
    -- we have the additional statistics object that does not
    -- necessarily correspond to an index
    select
        s.name as stats_name,
        c.name as column_name
    from sys.stats s
    inner join sys.stats_columns sc
    on s.object_id = sc.object_id
    and s.stats_id = sc.stats_id
    inner join sys.columns c
    on sc.object_id = c.object_id
    and sc.column_id = c.column_id
    where s.object_id = object_id('MyTable1');

    -- what does a stat look like? run DBCC SHOW_STATISTICS
    -- to get a better idea of what is stored
    dbcc show_statistics('MyTable1', 'MyTable1_CI');
    go

Here is what a test sample of statistics can look like:

[Image: sample DBCC SHOW_STATISTICS output]

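Notice that statistics hold the data distribution. They help SQL Server determine an optimal plan. As an analogy, imagine you are about to lift a heavy object. If the object had its weight marked on it, you would know how heavy it is and could decide the best way to lift it and which muscles to use. That is roughly what SQL Server does with statistics.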
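To make the "weight marking" idea a bit more concrete, the following is a minimal sketch of how one might inspect the distribution information itself. It assumes the MyTable1 table and the MyTable1_MyIntCol statistics from the example above already exist, and that the instance is recent enough to expose sys.dm_db_stats_properties (to my knowledge, SQL Server 2008 R2 SP2 / 2012 SP1 or later). Treat it as an illustration rather than part of the original example.

```sql
use testdb;
go

-- show only the histogram portion of the standalone statistics
-- created earlier (assumes MyTable1_MyIntCol exists)
dbcc show_statistics('MyTable1', 'MyTable1_MyIntCol') with histogram;
go

-- header-level properties for every statistics object on the table:
-- when it was last updated, how many rows were sampled, and how
-- many modifications have happened since
select
    s.name as stats_name,
    sp.last_updated,
    sp.rows,
    sp.rows_sampled,
    sp.steps,
    sp.modification_counter
from sys.stats s
cross apply sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
where s.object_id = object_id('MyTable1');
go
```

Because every row in the example was inserted with my_int_col = 1, expect a very small histogram here. On a table with more varied data, the histogram steps are what the optimizer draws on when estimating row counts.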

    -- create a nonclustered index
    -- with the key column as my_int_col
    create index IX_MyTable1_MyIntCol
    on MyTable1(my_int_col);
    go

    -- let's look at this index
    select
        object_name(object_id) as object_name,
        name as index_name,
        index_id,
        type_desc,
        is_unique,
        fill_factor
    from sys.indexes
    where name = 'IX_MyTable1_MyIntCol';

    -- now let's see some physical aspects
    -- of this particular index
    -- (I retrieved index_id from the above query)
    select *
    from sys.dm_db_index_physical_stats
    (
        db_id('TestDB'),
        object_id('MyTable1'),
        4,
        null,
        'detailed'
    );

From the example above we can see that an index actually contains data (what the leaf pages hold depends on the type of index).
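The index_id passed to sys.dm_db_index_physical_stats above was copied by hand from the previous query, so it may differ on another system. A hedged variation, assuming the same MyTable1 table and IX_MyTable1_MyIntCol index, looks the id up by name and returns only a few columns that show the pages and records at each level of the index:

```sql
use testdb;
go

-- look up the index_id by name instead of hard-coding it
declare @index_id int =
(
    select index_id
    from sys.indexes
    where object_id = object_id('MyTable1')
    and name = 'IX_MyTable1_MyIntCol'
);

-- one row per index level; index_level 0 is the leaf level
select
    index_id,
    index_type_desc,
    index_depth,
    index_level,
    page_count,
    record_count
from sys.dm_db_index_physical_stats
(
    db_id(),
    object_id('MyTable1'),
    @index_id,
    null,
    'detailed'
);
go
```

Running the same query with the clustered index (MyTable1_CI) is one way to compare leaf levels: the clustered index's leaf level is the table data itself, whereas the nonclustered index's leaf level holds the key column plus a locator back to the row.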

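This post gives only a very, very brief overview of these two large aspects of SQL Server. Each of them could fill chapters, or whole books. Read some of the references above and you will come away with a better grasp.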