In the previous post, we looked at degenerate dimensions. Rather than create a new SQL script that you can use to practice we can revisit the transaction data star that we used in the February 25th post. The script is here and the ERD is below.
Notice that the TRANSACTION_NUMBER and TRANSACTION_LINE columns are in the FACT_TRANSACTION table. If we created a new dimension table, called DIM_TRANSACTION, with this structure (in Oracle)...
CREATE TABLE DIM_TRANSACTION
(
KEY_TRANSACTION NUMBER,
TRANSACTION_NUMBER NUMBER,
TRANSACTION_LINE NUMBER
)
...it would contain as many rows as the FACT_TRANSACTION table. This is not wrong, but the effort involved in creating the surrogate keys which will only be used by one fact row may not be necessary. If you wish, run the create table statement that I provided above, add some data, and create some surrogate keys so that it can be joined to the fact table. You will see that the reporting capabilities from one design to the other are the same, with a bit more effort being required to maintain the DIM_TRANSACTION table. This is why I opted to place these columns directly in the fact table as degenerate dimensions. When faced with this design question, consider which is best for your organization and move forward with turning data into valuable data.
No comments:
Post a Comment