web-dev-qa-db-ja.com

CTE階層の最適化

以下で更新

アカウントの階層を表すための典型的なアカウント/親アカウントアーキテクチャを持つアカウントのテーブルがあります(SQL Server 2012)。階層をハッシュ化するためにCTEを使用してVIEWを作成しましたが、全体的に見た目は美しく、意図したとおりに機能します。階層を任意のレベルで照会でき、ブランチを簡単に確認できます。

階層の関数として返す必要があるビジネスロジックフィールドが1つあります。各アカウントレコードのフィールドは、ビジネスのサイズを示します(これをCustomerCountと呼びます)。レポートする必要があるロジックでは、ブランチ全体からCustomerCountをロールアップする必要があります。つまり、アカウントが与えられた場合、そのアカウントのcustomercount値と、階層に沿ったアカウントの下のすべてのブランチのすべての子を合計する必要があります。

CTE内に構築された、acct4.acct3.acct2.acct1のような階層フィールドを使用して、フィールドを正常に計算しました。私が遭遇している問題は、単に高速で実行することです。この1つの計算フィールドがない場合、クエリは約3秒で実行されます。計算フィールドを追加すると、4分のクエリになります。

これが、正しい結果を返す、私が思いついた最高のバージョンです。パフォーマンスをそれほど犠牲にすることなく、このAS A VIEWを再構築する方法についてのアイデアを探しています。

これが遅くなる理由はわかりますが(where句で述語を計算する必要があります)、それを構造化して同じ結果を得る別の方法は考えられません。

これは、テーブルを作成し、CTEを私の環境で機能するのとまったく同じように実行するためのサンプルコードです。

Use Tempdb
go
CREATE TABLE dbo.Account
(
   Acctid varchar(1) NOT NULL
    , Name varchar(30) NULL
    , ParentId varchar(1) NULL
    , CustomerCount int NULL
);

INSERT Account
SELECT 'A','Best Bet',NULL,21  UNION ALL
SELECT 'B','eStore','A',30 UNION ALL
SELECT 'C','Big Bens','B',75 UNION ALL
SELECT 'D','Mr. Jimbo','B',50 UNION ALL
SELECT 'E','Dr. John','C',100 UNION ALL
SELECT 'F','Brick','A',222 UNION ALL
SELECT 'G','Mortar','C',153 ;


With AccountHierarchy AS

(                                                                           --Root values have no parent
    SELECT
        Root.AcctId                                         AccountId
        , Root.Name                                         AccountName
        , Root.ParentId                                     ParentId
        , 1                                                 HierarchyLevel  
        , cast(Root.Acctid as varchar(4000))                IdHierarchy     --highest parent reads right to left as in id3.Acctid2.Acctid1
        , cast(replace(Root.Name,'.','') as varchar(4000))  NameHierarchy   --highest parent reads right to left as in name3.name2.name1 (replace '.' so name parse is easy in last step)
        , cast(Root.Acctid as varchar(4000))                HierarchySort   --reverse of above, read left to right name1.name2.name3 for sorting on reporting only
        , cast(Root.Name as varchar(4000))                  HierarchyLabel  --use for labels on reporting only, indents names under sorted hierarchy
        , Root.CustomerCount                                CustomerCount   

    FROM 
        tempdb.dbo.account Root

    WHERE
        Root.ParentID is null

    UNION ALL

    SELECT
        Recurse.Acctid                                      AccountId
        , Recurse.Name                                      AccountName
        , Recurse.ParentId                                  ParentId
        , Root.HierarchyLevel + 1                           HierarchyLevel  --next level in hierarchy
        , cast(cast(recurse.Acctid as varchar(40)) + '.' + Root.IdHierarchy as varchar(4000))   IdHierarchy --cast because in real system this is a uniqueidentifier type needs converting
        , cast(replace(recurse.Name,'.','') + '.' + Root.NameHierarchy as varchar(4000)) NameHierarchy  --replace '.' for parsing in last step, cast to make room for lots of sub levels down the hierarchy
        , cast(Root.AccountName + '.' + Recurse.Name as varchar(4000)) HierarchySort    
        , cast(space(root.HierarchyLevel * 4) + Recurse.Name as varchar(4000)) HierarchyLabel
        , Recurse.CustomerCount                             CustomerCount

    FROM
        tempdb.dbo.account Recurse INNER JOIN
        AccountHierarchy Root on Root.AccountId = Recurse.ParentId
)


SELECT
    hier.AccountId
    , Hier.AccountName
    , hier.ParentId
    , hier.HierarchyLevel
    , hier.IdHierarchy
    , hier.NameHierarchy
    , hier.HierarchyLabel
    , parsename(hier.IdHierarchy,1) Acct1Id
    , parsename(hier.NameHierarchy,1) Acct1Name     --This is why we stripped out '.' during recursion
    , parsename(hier.IdHierarchy,2) Acct2Id
    , parsename(hier.NameHierarchy,2) Acct2Name
    , parsename(hier.IdHierarchy,3) Acct3Id
    , parsename(hier.NameHierarchy,3) Acct3Name
    , parsename(hier.IdHierarchy,4) Acct4Id
    , parsename(hier.NameHierarchy,4) Acct4Name
    , hier.CustomerCount

    /* fantastic up to this point. Next block of code is what causes problem. 
        Logic of code is "sum of CustomerCount for this location and all branches below in this branch of hierarchy"
        In live environment, goes from taking 3 seconds to 4 minutes by adding this one calc */

    , (
        SELECT  
            sum(children.CustomerCount)
        FROM
            AccountHierarchy Children
        WHERE
            hier.IdHierarchy = right(children.IdHierarchy, (1 /*length of id field*/ * hier.HierarchyLevel) + hier.HierarchyLevel - 1 /*for periods inbetween ids*/)
            --"where this location's idhierarchy is within child idhierarchy"
            --previously tried a charindex(hier.IdHierarchy,children.IdHierarchy)>0, but that performed even worse
        ) TotalCustomerCount
FROM
    AccountHierarchy hier

ORDER BY
    hier.HierarchySort


drop table tempdb.dbo.Account

2013年11月20日更新

提案された解決策のいくつかは私のジュースを流しました、そして私は近づく新しいアプローチを試みましたが、新しい/異なる障害をもたらします。正直なところ、これが別の投稿に値するかどうかはわかりませんが、これはこの問題の解決策に関連しています。

Sum(customercount)を難しくしているのは、最上部から始まり、下がっていく階層のコンテキストで子供を識別することだと私が決定したのは、したがって、「他のアカウントの親ではないアカウント」で定義されたルートを使用して、下から上に構築する階層を作成し、後方への再帰的結合を実行することから始めました(root.parentacctid = recurse.acctid)

このようにして、再帰が発生したときに子の顧客数を親に追加できます。レポートとレベルの必要性から、上から下に加えて下から上にCTEを実行し、アカウントIDを介してそれらを結合しています。このアプローチは、元の外部クエリの顧客数よりもはるかに高速であることが判明しましたが、いくつかの障害に遭遇しました。

1つ目は、複数の子の親であるアカウントの重複顧客数をうっかりキャプチャしていたことです。私はいくつかのアカウントの顧客数を2倍または3倍にしました。私の解決策は、アカウントが持つノードの数をカウントするさらに別のcteを作成し、再帰中にacct.customercountを分割することです。そのため、ブランチ全体を合計しても、アカウントは二重にカウントされません。

したがって、現時点では、この新しいバージョンの結果は正しくありませんが、理由はわかります。ボトムアップcteは重複を作成しています。再帰が成功すると、アカウントテーブルのアカウントの子であるルート(最下位レベルの子)内のすべてが検索されます。 3番目の再帰では、2番目の再帰で行ったのと同じアカウントを取得し、それらを再び配置します。

ボトムアップCTEを実行する方法に関するアイデア、または他のアイデアが流れるのでしょうか?

Use Tempdb
go


CREATE TABLE dbo.Account
(
    Acctid varchar(1) NOT NULL
    , Name varchar(30) NULL
    , ParentId varchar(1) NULL
    , CustomerCount int NULL
);

INSERT Account
SELECT 'A','Best Bet',NULL,1  UNION ALL
SELECT 'B','eStore','A',2 UNION ALL
SELECT 'C','Big Bens','B',3 UNION ALL
SELECT 'D','Mr. Jimbo','B',4 UNION ALL
SELECT 'E','Dr. John','C',5 UNION ALL
SELECT 'F','Brick','A',6 UNION ALL
SELECT 'G','Mortar','C',7 ;



With AccountHierarchy AS

(                                                                           --Root values have no parent
    SELECT
        Root.AcctId                                         AccountId
        , Root.Name                                         AccountName
        , Root.ParentId                                     ParentId
        , 1                                                 HierarchyLevel  
        , cast(Root.Acctid as varchar(4000))                IdHierarchy     --highest parent reads right to left as in id3.Acctid2.Acctid1
        , cast(replace(Root.Name,'.','') as varchar(4000))  NameHierarchy   --highest parent reads right to left as in name3.name2.name1 (replace '.' so name parse is easy in last step)
        , cast(Root.Acctid as varchar(4000))                HierarchySort   --reverse of above, read left to right name1.name2.name3 for sorting on reporting only
        , cast(Root.Acctid as varchar(4000))                HierarchyMatch 
        , cast(Root.Name as varchar(4000))                  HierarchyLabel  --use for labels on reporting only, indents names under sorted hierarchy
        , Root.CustomerCount                                CustomerCount   

    FROM 
        tempdb.dbo.account Root

    WHERE
        Root.ParentID is null

    UNION ALL

    SELECT
        Recurse.Acctid                                      AccountId
        , Recurse.Name                                      AccountName
        , Recurse.ParentId                                  ParentId
        , Root.HierarchyLevel + 1                           HierarchyLevel  --next level in hierarchy
        , cast(cast(recurse.Acctid as varchar(40)) + '.' + Root.IdHierarchy as varchar(4000))   IdHierarchy --cast because in real system this is a uniqueidentifier type needs converting
        , cast(replace(recurse.Name,'.','') + '.' + Root.NameHierarchy as varchar(4000)) NameHierarchy  --replace '.' for parsing in last step, cast to make room for lots of sub levels down the hierarchy
        , cast(Root.AccountName + '.' + Recurse.Name as varchar(4000)) HierarchySort    
        , CAST(CAST(Root.HierarchyMatch as varchar(40)) + '.' 
            + cast(recurse.Acctid as varchar(40))   as varchar(4000))   HierarchyMatch
        , cast(space(root.HierarchyLevel * 4) + Recurse.Name as varchar(4000)) HierarchyLabel
        , Recurse.CustomerCount                             CustomerCount

    FROM
        tempdb.dbo.account Recurse INNER JOIN
        AccountHierarchy Root on Root.AccountId = Recurse.ParentId
)

, Nodes as
(   --counts how many branches are below for any account that is parent to another
    select
        node.ParentId Acctid
        , cast(count(1) as float) Nodes
    from AccountHierarchy  node
    group by ParentId
)

, BottomUp as
(   --creates the hierarchy starting at accounts that are not parent to any other
    select
        Root.Acctid
        , root.ParentId
        , cast(isnull(root.customercount,0) as float) CustomerCount
    from
        tempdb.dbo.Account Root
    where
        not exists ( select 1 from tempdb.dbo.Account OtherAccts where root.Acctid = OtherAccts.ParentId)

    union all

    select
        Recurse.Acctid
        , Recurse.ParentId
        , root.CustomerCount + cast ((isnull(recurse.customercount,0) / nodes.nodes) as float) CustomerCount
        -- divide the recurse customercount by number of nodes to prevent duplicate customer count on accts that are parent to multiple children, see customercount cte next
    from
        tempdb.dbo.Account Recurse inner join 
        BottomUp Root on root.ParentId = recurse.acctid inner join
        Nodes on nodes.Acctid = recurse.Acctid
)

, CustomerCount as
(
    select
        sum(CustomerCount) TotalCustomerCount
        , hier.acctid
    from
        BottomUp hier
    group by 
        hier.Acctid
)


SELECT
    hier.AccountId
    , Hier.AccountName
    , hier.ParentId
    , hier.HierarchyLevel
    , hier.IdHierarchy
    , hier.NameHierarchy
    , hier.HierarchyLabel
    , hier.hierarchymatch
    , parsename(hier.IdHierarchy,1) Acct1Id
    , parsename(hier.NameHierarchy,1) Acct1Name     --This is why we stripped out '.' during recursion
    , parsename(hier.IdHierarchy,2) Acct2Id
    , parsename(hier.NameHierarchy,2) Acct2Name
    , parsename(hier.IdHierarchy,3) Acct3Id
    , parsename(hier.NameHierarchy,3) Acct3Name
    , parsename(hier.IdHierarchy,4) Acct4Id
    , parsename(hier.NameHierarchy,4) Acct4Name
    , hier.CustomerCount

    , customercount.TotalCustomerCount

FROM
    AccountHierarchy hier inner join
    CustomerCount on customercount.acctid = hier.accountid

ORDER BY
    hier.HierarchySort 



drop table tempdb.dbo.Account
15
liver.larson

編集:これは2回目の試行です

@Max Vernonの回答に基づいて、インラインサブクエリ内でのCTEの使用を回避する方法を次に示します。これは、CTEに自己結合するようなものであり、効率が悪い理由だと思います。 SQL-Serverの2012バージョンでのみ利用可能な分析関数を使用します。 SQL-Fiddleでテスト済み

この部分は読むことをスキップできます、それはマックスの答えからのコピーペーストです:

_;With AccountHierarchy AS
(                                                                           
    SELECT
        Root.AcctId                                         AccountId
        , Root.Name                                         AccountName
        , Root.ParentId                                     ParentId
        , 1                                                 HierarchyLevel  
        , cast(Root.Acctid as varchar(4000))                IdHierarchyMatch        
        , cast(Root.Acctid as varchar(4000))                IdHierarchy
        , cast(replace(Root.Name,'.','') as varchar(4000))  NameHierarchy   
        , cast(Root.Acctid as varchar(4000))                HierarchySort
        , cast(Root.Name as varchar(4000))                  HierarchyLabel          ,
        Root.CustomerCount                                  CustomerCount   

    FROM 
        account Root

    WHERE
        Root.ParentID is null

    UNION ALL

    SELECT
        Recurse.Acctid                                      AccountId
        , Recurse.Name                                      AccountName
        , Recurse.ParentId                                  ParentId
        , Root.HierarchyLevel + 1                           HierarchyLevel
        , CAST(CAST(Root.IdHierarchyMatch as varchar(40)) + '.' 
            + cast(recurse.Acctid as varchar(40))   as varchar(4000))   IdHierarchyMatch
        , cast(cast(recurse.Acctid as varchar(40)) + '.' 
            + Root.IdHierarchy  as varchar(4000))           IdHierarchy
        , cast(replace(recurse.Name,'.','') + '.' 
            + Root.NameHierarchy as varchar(4000))          NameHierarchy
        , cast(Root.AccountName + '.' 
            + Recurse.Name as varchar(4000))                HierarchySort   
        , cast(space(root.HierarchyLevel * 4) 
            + Recurse.Name as varchar(4000))                HierarchyLabel
        , Recurse.CustomerCount                             CustomerCount
    FROM
        account Recurse INNER JOIN
        AccountHierarchy Root on Root.AccountId = Recurse.ParentId
)
_

ここでは、IdHierarchyMatchを使用してCTEの行を並べ替え、行番号と現在の合計(次の行から最後まで)を計算します。

_, cte1 AS 
(
SELECT
    h.AccountId
    , h.AccountName
    , h.ParentId
    , h.HierarchyLevel
    , h.IdHierarchy
    , h.NameHierarchy
    , h.HierarchyLabel
    , parsename(h.IdHierarchy,1) Acct1Id
    , parsename(h.NameHierarchy,1) Acct1Name
    , parsename(h.IdHierarchy,2) Acct2Id
    , parsename(h.NameHierarchy,2) Acct2Name
    , parsename(h.IdHierarchy,3) Acct3Id
    , parsename(h.NameHierarchy,3) Acct3Name
    , parsename(h.IdHierarchy,4) Acct4Id
    , parsename(h.NameHierarchy,4) Acct4Name
    , h.CustomerCount
    , h.HierarchySort
    , h.IdHierarchyMatch
        , Rn = ROW_NUMBER() OVER 
                  (ORDER BY h.IdHierarchyMatch)
        , RunningCustomerCount = COALESCE(
            SUM(h.CustomerCount)
            OVER
              (ORDER BY h.IdHierarchyMatch
               ROWS BETWEEN 1 FOLLOWING
                        AND UNBOUNDED FOLLOWING)
          , 0) 
FROM
    AccountHierarchy AS h  
)
_

次に、以前の累計と行番号を使用する中間CTEがもう1つあります。基本的には、ツリー構造のブランチのエンドポイントの場所を見つけるためです。

_, cte2 AS
(
SELECT
    cte1.*
    , rn3  = LAST_VALUE(Rn) OVER 
               (PARTITION BY Acct1Id, Acct2Id, Acct3Id 
                ORDER BY Acct4Id
                ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)       
    , rn2  = LAST_VALUE(Rn) OVER 
               (PARTITION BY Acct1Id, Acct2Id 
                ORDER BY Acct3Id, Acct4Id
                ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) 
    , rn1  = LAST_VALUE(Rn) OVER 
               (PARTITION BY Acct1Id 
                ORDER BY Acct2Id, Acct3Id, Acct4Id
                ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) 
    , rcc3 = LAST_VALUE(RunningCustomerCount) OVER 
               (PARTITION BY Acct1Id, Acct2Id, Acct3Id 
                ORDER BY Acct4Id
                ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)       
    , rcc2 = LAST_VALUE(RunningCustomerCount) OVER 
               (PARTITION BY Acct1Id, Acct2Id 
                ORDER BY Acct3Id, Acct4Id
                ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) 
    , rcc1 = LAST_VALUE(RunningCustomerCount) OVER 
               (PARTITION BY Acct1Id 
                ORDER BY Acct2Id, Acct3Id, Acct4Id
                ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) 
FROM
    cte1 
) 
_

最後に、最後の部分を作成します。

_SELECT
    hier.AccountId
    , hier.AccountName
    ---                        -- columns skipped 
    , hier.CustomerCount

    , TotalCustomerCount = hier.CustomerCount
        + hier.RunningCustomerCount 
        - ca.LastRunningCustomerCount

    , hier.HierarchySort
    , hier.IdHierarchyMatch
FROM
    cte2 hier
  OUTER APPLY
    ( SELECT  LastRunningCustomerCount, Rn
      FROM
      ( SELECT LastRunningCustomerCount
              = RunningCustomerCount, Rn
        FROM (SELECT NULL a) x  WHERE 4 <= HierarchyLevel 
      UNION ALL
        SELECT rcc3, Rn3
        FROM (SELECT NULL a) x  WHERE 3 <= HierarchyLevel 
      UNION ALL
        SELECT rcc2, Rn2 
        FROM (SELECT NULL a) x  WHERE 2 <= HierarchyLevel 
      UNION ALL
        SELECT rcc1, Rn1
        FROM (SELECT NULL a) x  WHERE 1 <= HierarchyLevel 
      ) x
      ORDER BY Rn 
      OFFSET 0 ROWS
      FETCH NEXT 1 ROWS ONLY
      ) ca
ORDER BY
    hier.HierarchySort ; 
_

そして、上記のコードと同じ_cte1_を使用して単純化します。 SQL-Fiddle-2でテストします。どちらのソリューションも、ツリーに最大4つのレベルがあることを前提として機能することに注意してください。

_SELECT
    hier.AccountId
    ---                      -- skipping rows
    , hier.CustomerCount

    , TotalCustomerCount = CustomerCount
        + RunningCustomerCount 
        - CASE HierarchyLevel
            WHEN 4 THEN RunningCustomerCount
            WHEN 3 THEN LAST_VALUE(RunningCustomerCount) OVER 
                   (PARTITION BY Acct1Id, Acct2Id, Acct3Id 
                    ORDER BY Acct4Id
                    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)       
            WHEN 2 THEN LAST_VALUE(RunningCustomerCount) OVER 
                   (PARTITION BY Acct1Id, Acct2Id 
                    ORDER BY Acct3Id, Acct4Id
                    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) 
            WHEN 1 THEN LAST_VALUE(RunningCustomerCount) OVER 
                   (PARTITION BY Acct1Id 
                    ORDER BY Acct2Id, Acct3Id, Acct4Id
                    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) 
          END

    , hier.HierarchySort
    , hier.IdHierarchyMatch
FROM cte1 AS hier
ORDER BY
    hier.HierarchySort ; 
_

3番目のアプローチは、CTEが1つだけで、再帰部分がウィンドウ集約関数(SUM() OVER (...))のみであるため、2005以降のすべてのバージョンで機能します。 SQL-Fiddle-でテストこのソリューションは、前のソリューションと同様に、階層に最大4つのレベルがあることを前提としています木:

_;WITH AccountHierarchy AS
(                                                                           
    SELECT
          AccountId      = Root.AcctId                                         
        , AccountName    = Root.Name                                         
        , ParentId       = Root.ParentId                                     
        , HierarchyLevel = 1                                                   
        , HierarchySort  = CAST(Root.Acctid AS VARCHAR(4000))                
        , HierarchyLabel = CAST(Root.Name AS VARCHAR(4000))                   
        , Acct1Id        = CAST(Root.Acctid AS VARCHAR(4000))                
        , Acct2Id        = CAST(NULL AS VARCHAR(4000))                       
        , Acct3Id        = CAST(NULL AS VARCHAR(4000))                       
        , Acct4Id        = CAST(NULL AS VARCHAR(4000))                       
        , Acct1Name      = CAST(Root.Name AS VARCHAR(4000))                  
        , Acct2Name      = CAST(NULL AS VARCHAR(4000))                       
        , Acct3Name      = CAST(NULL AS VARCHAR(4000))                       
        , Acct4Name      = CAST(NULL AS VARCHAR(4000))                       
        , CustomerCount  = Root.CustomerCount                                   

    FROM 
        account AS Root

    WHERE
        Root.ParentID IS NULL

    UNION ALL

    SELECT
          Recurse.Acctid 
        , Recurse.Name 
        , Recurse.ParentId 
        , Root.HierarchyLevel + 1 
        , CAST(Root.AccountName + '.' 
            + Recurse.Name AS VARCHAR(4000)) 
        , CAST(SPACE(Root.HierarchyLevel * 4) 
            + Recurse.Name AS VARCHAR(4000)) 
        , Root.Acct1Id 
        , CASE WHEN Root.HierarchyLevel = 1 
              THEN cast(Recurse.Acctid AS VARCHAR(4000)) 
              ELSE Root.Acct2Id 
          END 
        , CASE WHEN Root.HierarchyLevel = 2 
              THEN CAST(Recurse.Acctid AS VARCHAR(4000)) 
              ELSE Root.Acct3Id 
          END 
        , CASE WHEN Root.HierarchyLevel = 3 
              THEN CAST(Recurse.Acctid AS VARCHAR(4000)) 
              ELSE Root.Acct4Id 
          END 

        , cast(Root.AccountName as varchar(4000))          
        , CASE WHEN Root.HierarchyLevel = 1 
              THEN CAST(Recurse.Name AS VARCHAR(4000)) 
              ELSE Root.Acct2Name 
          END 
        , CASE WHEN Root.HierarchyLevel = 2 
              THEN CAST(Recurse.Name AS VARCHAR(4000)) 
              ELSE Root.Acct3Name 
          END 
        , CASE WHEN Root.HierarchyLevel = 3 
              THEN CAST(Recurse.Name AS VARCHAR(4000)) 
              ELSE Root.Acct4Name 
          END 
        , Recurse.CustomerCount 
    FROM 
        account AS Recurse INNER JOIN 
        AccountHierarchy AS Root ON Root.AccountId = Recurse.ParentId
)

SELECT
      h.AccountId
    , h.AccountName
    , h.ParentId
    , h.HierarchyLevel
    , IdHierarchy = 
          CAST(COALESCE(h.Acct4Id+'.','') 
               + COALESCE(h.Acct3Id+'.','') 
               + COALESCE(h.Acct2Id+'.','') 
               + h.Acct1Id AS VARCHAR(4000))
    , NameHierarchy = 
          CAST(COALESCE(h.Acct4Name+'.','') 
               + COALESCE(h.Acct3Name+'.','') 
               + COALESCE(h.Acct2Name+'.','') 
               + h.Acct1Name AS VARCHAR(4000))   
    , h.HierarchyLabel
    , h.Acct1Id
    , h.Acct1Name
    , h.Acct2Id
    , h.Acct2Name
    , h.Acct3Id
    , h.Acct3Name
    , h.Acct4Id
    , h.Acct4Name
    , h.CustomerCount
    , TotalCustomerCount =  
          CASE h.HierarchyLevel
            WHEN 4 THEN h.CustomerCount
            WHEN 3 THEN SUM(h.CustomerCount) OVER 
                   (PARTITION BY h.Acct1Id, h.Acct2Id, h.Acct3Id)       
            WHEN 2 THEN SUM(h.CustomerCount) OVER 
                   (PARTITION BY Acct1Id, h.Acct2Id) 
            WHEN 1 THEN SUM(h.CustomerCount) OVER 
                   (PARTITION BY h.Acct1Id) 
          END
    , h.HierarchySort
    , IdHierarchyMatch = 
          CAST(h.Acct1Id 
               + COALESCE('.'+h.Acct2Id,'') 
               + COALESCE('.'+h.Acct3Id,'') 
               + COALESCE('.'+h.Acct4Id,'') AS VARCHAR(4000))   
FROM
    AccountHierarchy AS h  
ORDER BY
    h.HierarchySort ; 
_

中間CTEとして、階層のクロージャテーブルを計算する4番目のアプローチ。 SQL-Fiddle-4でテストします。利点は、合計の計算の場合、レベル数に制限がないことです。

_;WITH AccountHierarchy AS
( 
    -- skipping several line, identical to the 3rd approach above
)

, ClosureTable AS
( 
    SELECT
          AccountId      = Root.AcctId  
        , AncestorId     = Root.AcctId  
        , CustomerCount  = Root.CustomerCount 
    FROM 
        account AS Root

    UNION ALL

    SELECT
          Recurse.Acctid 
        , Root.AncestorId 
        , Recurse.CustomerCount
    FROM 
        account AS Recurse INNER JOIN 
        ClosureTable AS Root ON Root.AccountId = Recurse.ParentId
)

, ClosureGroup AS
(                                                                           
    SELECT
          AccountId           = AncestorId  
        , TotalCustomerCount  = SUM(CustomerCount)                             
    FROM 
        ClosureTable AS a
    GROUP BY
        AncestorId
)

SELECT
      h.AccountId
    , h.AccountName
    , h.ParentId
    , h.HierarchyLevel 
    , h.HierarchyLabel
    , h.CustomerCount
    , cg.TotalCustomerCount 

    , h.HierarchySort
FROM
    AccountHierarchy AS h  
  JOIN
    ClosureGroup AS cg
      ON cg.AccountId = h.AccountId
ORDER BY
    h.HierarchySort ;  
_
6
ypercubeᵀᴹ

私はこれがより速くなると信じています:

;With AccountHierarchy AS
(                                                                           
    SELECT
        Root.AcctId                                         AccountId
        , Root.Name                                         AccountName
        , Root.ParentId                                     ParentId
        , 1                                                 HierarchyLevel  
        , cast(Root.Acctid as varchar(4000))                IdHierarchyMatch        
        , cast(Root.Acctid as varchar(4000))                IdHierarchy
        , cast(replace(Root.Name,'.','') as varchar(4000))  NameHierarchy   
        , cast(Root.Acctid as varchar(4000))                HierarchySort
        , cast(Root.Name as varchar(4000))                  HierarchyLabel          ,
        Root.CustomerCount                                  CustomerCount   

    FROM 
        tempdb.dbo.account Root

    WHERE
        Root.ParentID is null

    UNION ALL

    SELECT
        Recurse.Acctid                                      AccountId
        , Recurse.Name                                      AccountName
        , Recurse.ParentId                                  ParentId
        , Root.HierarchyLevel + 1                           HierarchyLevel
        , CAST(CAST(Root.IdHierarchyMatch as varchar(40)) + '.' 
            + cast(recurse.Acctid as varchar(40))   as varchar(4000))   IdHierarchyMatch
        , cast(cast(recurse.Acctid as varchar(40)) + '.' 
            + Root.IdHierarchy  as varchar(4000))           IdHierarchy
        , cast(replace(recurse.Name,'.','') + '.' 
            + Root.NameHierarchy as varchar(4000))          NameHierarchy
        , cast(Root.AccountName + '.' 
            + Recurse.Name as varchar(4000))                HierarchySort   
        , cast(space(root.HierarchyLevel * 4) 
            + Recurse.Name as varchar(4000))                HierarchyLabel
        , Recurse.CustomerCount                             CustomerCount
    FROM
        tempdb.dbo.account Recurse INNER JOIN
        AccountHierarchy Root on Root.AccountId = Recurse.ParentId
)


SELECT
    hier.AccountId
    , Hier.AccountName
    , hier.ParentId
    , hier.HierarchyLevel
    , hier.IdHierarchy
    , hier.NameHierarchy
    , hier.HierarchyLabel
    , parsename(hier.IdHierarchy,1) Acct1Id
    , parsename(hier.NameHierarchy,1) Acct1Name
    , parsename(hier.IdHierarchy,2) Acct2Id
    , parsename(hier.NameHierarchy,2) Acct2Name
    , parsename(hier.IdHierarchy,3) Acct3Id
    , parsename(hier.NameHierarchy,3) Acct3Name
    , parsename(hier.IdHierarchy,4) Acct4Id
    , parsename(hier.NameHierarchy,4) Acct4Name
    , hier.CustomerCount
    , (
        SELECT  
            sum(children.CustomerCount)
        FROM
            AccountHierarchy Children
        WHERE
            Children.IdHierarchyMatch LIKE hier.IdHierarchyMatch + '%'
        ) TotalCustomerCount
        , HierarchySort
        , IdHierarchyMatch
FROM
    AccountHierarchy hier
ORDER BY
    hier.HierarchySort

IdHierarchyMatchのフォワードバージョンであるIdHierarchyという名前のCTEに列を追加して、TotalCustomerCountサブクエリWHERE句を検索可能にします。

実行プランの推定サブツリーコストを比較すると、この方法は約5倍高速になります。

5
Max Vernon

私も試してみました。あまりきれいではありませんが、パフォーマンスは向上しているようです。

USE Tempdb
go

SET STATISTICS IO ON;
SET STATISTICS TIME OFF;
SET NOCOUNT ON;

--------
-- assuming the original table looks something like this 
-- and you cannot control it's indexes 
-- (only widened the data types a bit for the extra sample rows)
--------
CREATE TABLE dbo.Account
    (
      Acctid VARCHAR(10) NOT NULL ,
      Name VARCHAR(100) NULL ,
      ParentId VARCHAR(10) NULL ,
      CustomerCount INT NULL
    );

--------
-- inserting the same records as in your sample
--------
INSERT  Account
        SELECT  'A' ,
                'Best Bet' ,
                NULL ,
                21
        UNION ALL
        SELECT  'B' ,
                'eStore' ,
                'A' ,
                30
        UNION ALL
        SELECT  'C' ,
                'Big Bens' ,
                'B' ,
                75
        UNION ALL
        SELECT  'D' ,
                'Mr. Jimbo' ,
                'B' ,
                50
        UNION ALL
        SELECT  'E' ,
                'Dr. John' ,
                'C' ,
                100
        UNION ALL
        SELECT  'F' ,
                'Brick' ,
                'A' ,
                222
        UNION ALL
        SELECT  'G' ,
                'Mortar' ,
                'C' ,
                153;

--------
-- now lets up the ante a bit and add some extra rows with random parents 
-- to these 7 items, it is hard to measure differences with so few rows
--------
DECLARE @numberOfRows INT = 25000
DECLARE @from INT = 1
DECLARE @to INT = 7
DECLARE @T1 TABLE ( n INT ); 

WITH    cte ( n )
          AS ( SELECT   ROW_NUMBER() OVER ( ORDER BY CURRENT_TIMESTAMP )
               FROM     sys.messages
             )
    INSERT  INTO @T1
            SELECT  n
            FROM    cte
            WHERE   n <= @numberOfRows;

INSERT  INTO dbo.Account
        ( acctId ,
          name ,
          parentId ,
          Customercount
        )
        SELECT  CHAR(64 + RandomNumber) + CAST(n AS VARCHAR(10)) AS Id ,
                CAST('item ' + CHAR(64 + RandomNumber) + CAST(n AS VARCHAR(10)) AS VARCHAR(100)) ,
                CHAR(64 + RandomNumber) AS parentId ,
                ABS(CHECKSUM(NEWID()) % 100) + 1 AS RandomCustCount
        FROM    ( SELECT    n ,
                            ABS(CHECKSUM(NEWID()) % @to) + @from AS RandomNumber
                  FROM      @T1
                ) A;

--------
-- Assuming you cannot control it's indexes, in my tests we're better off taking the IO hit of copying the data
-- to some structure that is better optimized for this query. Not quite what I initially expected,  but we seem 
-- to be better off that way.
--------
CREATE TABLE tempdb.dbo.T1
    (
      AccountId VARCHAR(10) NOT NULL
                            PRIMARY KEY NONCLUSTERED ,
      AccountName VARCHAR(100) NOT NULL ,
      ParentId VARCHAR(10) NULL ,
      HierarchyLevel INT NULL ,
      HPath VARCHAR(1000) NULL ,
      IdHierarchy VARCHAR(1000) NULL ,
      NameHierarchy VARCHAR(1000) NULL ,
      HierarchyLabel VARCHAR(1000) NULL ,
      HierarchySort VARCHAR(1000) NULL ,
      CustomerCount INT NOT NULL
    );

CREATE CLUSTERED INDEX IX_Q1
ON tempdb.dbo.T1  ([ParentId]);

-- for summing customer counts over parents
CREATE NONCLUSTERED INDEX IX_Q2 
ON tempdb.dbo.T1  (HPath) INCLUDE(CustomerCount);

INSERT  INTO tempdb.dbo.T1
        ( AccountId ,
          AccountName ,
          ParentId ,
          HierarchyLevel ,
          HPath ,
          IdHierarchy ,
          NameHierarchy ,
          HierarchyLabel ,
          HierarchySort ,
          CustomerCount 
        )
        SELECT  Acctid AS AccountId ,
                Name AS AccountName ,
                ParentId AS ParentId ,
                NULL AS HierarchyLevel ,
                NULL AS HPath ,
                NULL AS IdHierarchy ,
                NULL AS NameHierarchy ,
                NULL AS HierarchyLabel ,
                NULL AS HierarchySort ,
                CustomerCount AS CustomerCount
        FROM    tempdb.dbo.account;



--------
-- I cannot seem to force an efficient way to do the sum while selecting over the recursive cte, 
-- so I took it aside. I am sure there is a more elegant way but I can't seem to make it happen. 
-- At least it performs better this way. But it remains a very expensive query.
--------
;
WITH    AccountHierarchy
          AS ( SELECT   Root.AccountId AS AcId ,
                        Root.ParentId ,
                        1 AS HLvl ,
                        CAST(Root.AccountId AS VARCHAR(1000)) AS [HPa] ,
                        CAST(Root.accountId AS VARCHAR(1000)) AS hid ,
                        CAST(REPLACE(Root.AccountName, '.', '') AS VARCHAR(1000)) AS hn ,
                        CAST(Root.accountid AS VARCHAR(1000)) AS hs ,
                        CAST(Root.accountname AS VARCHAR(1000)) AS hl
               FROM     tempdb.dbo.T1 Root
               WHERE    Root.ParentID IS NULL
               UNION ALL
               SELECT   Recurse.AccountId AS acid ,
                        Recurse.ParentId ParentId ,
                        Root.Hlvl + 1 AS hlvl ,
                        CAST(Root.HPa + '.' + Recurse.AccountId AS VARCHAR(1000)) AS hpa ,
                        CAST(recurse.AccountId + '.' + Root.hid AS VARCHAR(1000)) AS hid ,
                        CAST(REPLACE(recurse.AccountName, '.', '') + '.' + Root.hn AS VARCHAR(1000)) AS hn ,
                        CAST(Root.hs + '.' + Recurse.AccountName AS VARCHAR(1000)) AS hs ,
                        CAST(SPACE(root.hlvl * 4) + Recurse.AccountName AS VARCHAR(1000)) AS hl
               FROM     tempdb.dbo.T1 Recurse
                        INNER JOIN AccountHierarchy Root ON Root.AcId = Recurse.ParentId
             )
    UPDATE  tempdb.dbo.T1
    SET     HierarchyLevel = HLvl ,
            HPath = Hpa ,
            IdHierarchy = hid ,
            NameHierarchy = hn ,
            HierarchyLabel = hl ,
            HierarchySort = hs
    FROM    AccountHierarchy
    WHERE   AccountId = AcId;

SELECT  --HPath ,
        AccountId ,
        AccountName ,
        ParentId ,
        HierarchyLevel ,
        IdHierarchy ,
        NameHierarchy ,
        HierarchyLabel ,
        PARSENAME(IdHierarchy, 1) Acct1Id ,
        PARSENAME(NameHierarchy, 1) Acct1Name ,
        PARSENAME(IdHierarchy, 2) Acct2Id ,
        PARSENAME(NameHierarchy, 2) Acct2Name ,
        PARSENAME(IdHierarchy, 3) Acct3Id ,
        PARSENAME(NameHierarchy, 3) Acct3Name ,
        PARSENAME(IdHierarchy, 4) Acct4Id ,
        PARSENAME(NameHierarchy, 4) Acct4Name ,
        CustomerCount ,
        Cnt.TotalCustomerCount
FROM    tempdb.dbo.t1 Hier
        CROSS APPLY ( SELECT    SUM(CustomerCount) AS TotalCustomerCount
                      FROM      tempdb.dbo.t1
                      WHERE     HPath LIKE hier.HPath + '%'
                    ) Cnt
ORDER BY HierarchySort;

DROP TABLE tempdb.dbo.t1;
DROP TABLE tempdb.dbo.Account;
3
souplex