Hans Klüser

 

There are different advantages and requirements to flatten a parent-child-hierarchy as mentioned in my previous blog posts.

In the following example I will show you a way to flatten most parent-child-hierarchies in a standardized pattern.

Our goal is a dimension which consists of two parts:

  • The left part resembles the original parent-child-structure and
  • the right part is a transposed version of the essential attributes.

Parent-child-hierarchies

Let’s use the DimEmployee from AdventureWorksDW2012.

First we write two common-table-expressions to define the left part and the essential attributes which need to be transposed. These are the main parts which you need to edit when using the code for different hierarchies.

WITH left_part AS
(
SELECT
NodeKey = EmployeeKey
,ParentKey = ParentEmployeeKey
/* additional attributes */
,Name = LastName + ‚, ‚ + FirstName
,StartDate
,EndDate
FROM DimEmployee

)
, right_part AS
(
SELECT
NodeKey = EmployeeKey
,ParentKey = ParentEmployeeKey
/* Numerics */
,N01 = EmployeeKey
,N02 = NULL
,N03 = NULL
,N04 = NULL
,N05 = NULL
/* Nvarchars */
,T01 = LastName
,T02 = MiddleName
,T03 = FirstName
,T04 = NULL
,T05 = NULL
FROM DimEmployee
)

The following steps are similar for all hierarchies and you can use them for all your hierarchies.

In the next step we transform the “right_part” in a way that we have one row for each recursive child of an element together with its original element and the distance between the original element and each child:

, rec AS
(
SELECT
NodeKey
,ParentKey

        ,N01
,N02
,N03
,N04
,N05

        ,T01
,T02
,T03
,T04
,T05

        ,Distance = 0
FROM right_part

    UNION ALL

    SELECT
rec.NodeKey
,ri.ParentKey

        ,ri.N01
,ri.N02
,ri.N03
,ri.N04
,ri.N05

        ,ri.T01
,ri.T02
,ri.T03
,ri.T04
,ri.T05

        ,Distance = Distance + 1
FROM Rec
JOIN right_part AS ri
ON
ri.NodeKey = Rec.ParentKey
WHERE
Distance < 10
)

Next step inverts the distance to hierarchy level simply by calculating maximum distance of the recursive part above and substracting the distance of the current element.

, invLevel AS
(
SELECT
Level = 1+MAX(Distance) OVER (Partition BY NodeKey)
,invLevel = 1+MAX(Distance) OVER (Partition BY NodeKey) – Distance
,*
FROM rec
)

Level contains the value for the NodeKey and invLevel is the Level of the child.

Definition of a diagonal matrix:

,tr
AS
(
SELECT * FROM
(
VALUES
( 1,’01‘,“,NULL,NULL,NULL,NULL    ,0,NULL,NULL,NULL,NULL),
( 2,’02‘,NULL,“,NULL,NULL,NULL    ,NULL,0,NULL,NULL,NULL),
( 3,’03‘,NULL,NULL,“,NULL,NULL    ,NULL,NULL,0,NULL,NULL),
( 4,’04‘,NULL,NULL,NULL,“,NULL    ,NULL,NULL,NULL,0,NULL),
( 5,’05‘,NULL,NULL,NULL,NULL,“    ,NULL,NULL,NULL,NULL,0)

    ) AS T (invLevel, Lev, T01, T02, T03, T04, T05, N01, N02, N03, N04, N05)
)

 

Last step before transposing the table is to fill up the elements for upper hierarchy-nodes in order to have children for all levels.

 

, complete AS
(
SELECT
ri.NodeKey,
Level = MAX(inv.Level) OVER (PARTITION BY ri.NodeKey),
Tr.invLevel

        ,N01 = ISNULL(inv.N01,ri.N01)
,N02 = ISNULL(inv.N02,ri.N02)
,N03 = ISNULL(inv.N03,ri.N03)
,N04 = ISNULL(inv.N04,ri.N04)
,N05 = ISNULL(inv.N05,ri.N05)

        ,T01 = ISNULL(inv.T01,ri.T01)
,T02 = ISNULL(inv.T02,ri.T02)
,T03 = ISNULL(inv.T03,ri.T03)
,T04 = ISNULL(inv.T04,ri.T04)
,T05 = ISNULL(inv.T05,ri.T05)

    FROM right_part ri
CROSS JOIN tr
Left Join invLevel AS inv
ON
inv.NodeKey = ri.NodeKey AND
inv.invLevel = Tr.invLevel
)

Now we use a diagonal matrix and a simple aggregation to transpose the table.

, final AS
(
SELECT
NodeKey
,Level

        ,L01N01 = MAX(tr.n01 + c.n01),L01N02 = MAX(tr.n01 + c.n02)
        ,L01N03 = MAX(tr.n01 + c.n03),L01N04 = MAX(tr.n01 + c.n04)
        ,L01N05 = MAX(tr.n01 + c.n05)
,L01T01 = MAX(tr.t01 + c.t01),L01T02 = MAX(tr.t01 + c.t02)
        ,L01T03 = MAX(tr.t01 + c.t03),L01T04 = MAX(tr.t01 + c.t04)
        ,L01T05 = MAX(tr.t01 + c.t05)

        ,L02N01 = MAX(tr.n02 + c.n01),L02N02 = MAX(tr.n02 + c.n02)
        ,L02N03 = MAX(tr.n02 + c.n03),L02N04 = MAX(tr.n02 + c.n04)
        ,L02N05 = MAX(tr.n02 + c.n05)
,L02T01 = MAX(tr.t02 + c.t01),L02T02 = MAX(tr.t02 + c.t02)
        ,L02T03 = MAX(tr.t02 + c.t03),L02T04 = MAX(tr.t02 + c.t04)
        ,L02T05 = MAX(tr.t02 + c.t05)

        ,L03N01 = MAX(tr.n03 + c.n01),L03N02 = MAX(tr.n03 + c.n02)
        ,L03N03 = MAX(tr.n03 + c.n03),L03N04 = MAX(tr.n03 + c.n04)
        ,L03N05 = MAX(tr.n03 + c.n05)
,L03T01 = MAX(tr.t03 + c.t01),L03T02 = MAX(tr.t03 + c.t02)
        ,L03T03 = MAX(tr.t03 + c.t03),L03T04 = MAX(tr.t03 + c.t04)
        ,L03T05 = MAX(tr.t03 + c.t05)

        ,L04N01 = MAX(tr.n04 + c.n01),L04N02 = MAX(tr.n04 + c.n02)
        ,L04N03 = MAX(tr.n04 + c.n03),L04N04 = MAX(tr.n04 + c.n04)
        ,L04N05 = MAX(tr.n04 + c.n05)
,L04T01 = MAX(tr.t04 + c.t01),L04T02 = MAX(tr.t04 + c.t02)
        ,L04T03 = MAX(tr.t04 + c.t03),L04T04 = MAX(tr.t04 + c.t04)
        ,L04T05 = MAX(tr.t04 + c.t05)

        ,L05N01 = MAX(tr.n05 + c.n01),L05N02 = MAX(tr.n05 + c.n02)
       
,L05N03 = MAX(tr.n05 + c.n03),L05N04 = MAX(tr.n05 + c.n04)
        ,L05N05 = MAX(tr.n05 + c.n05)
,L05T01 = MAX(tr.t05 + c.t01),L05T02 = MAX(tr.t05 + c.t02)
        ,L05T03 = MAX(tr.t05 + c.t03),L05T04 = MAX(tr.t05 + c.t04)
        ,L05T05 = MAX(tr.t05 + c.t05)

    FROM complete c
JOIN tr
ON
c.invLevel = tr.invLevel
GROUP BY
NodeKey,
Level
)
 

Finally the SELECT query:

SELECT
le.*
, ri.Level

    , L01N01, L01T01, L01T02, L01T03
, L02N01, L02T01, L02T02, L02T03
, L03N01, L03T01, L03T02, L03T03
, L04N01, L04T01, L04T02, L04T03
, L05N01, L05T01, L05T02, L05T03

FROM left_part AS le
JOIN final AS ri
ON
le.NodeKey = ri.NodeKey
ORDER BY
Level, NodeKey

Here it is:

parent-child-hierarchies

Conclusion

With this script you can flatten nearly every parent child hierarchy except you have explicit gaps like direct jumps from Lev2Nodes to Lev4Nodes. You only have to modify the first and second cte – so it’s really efficient to implement new hierarchies.

In our scenario we use a version for up to 15 hierarchy-levels and use the script in a view definition.