Sunday, September 2, 2012

ETL - Practice Loading A Fact

We looked recently at the process used to load a dimension table.  Consider this post as a way to practice loading a fact table.  An ETL job that is designed to load a fact table will accomplish these things…
1.)    Select all necessary data from the source
a.       Measures
b.      Natural Keys of the dimension objects
c.       Degenerate Dimension objects
2.)    Transform the data as needed
a.       Aggregate measures and group by the appropriate objects
b.      Add foreign keys to dimension tables
3.)    Delete any existing rows from the fact table that will be replaced by the new data
4.)    Load the fact table

As with the dimension example, no two developers will do this the same way, which is perfectly fine.  Consider this example which is to be used in conjunction with the two tables used in the ETL – Practice Loading a Dimension post (click on the image to enlarge it)…



…and this fact table needs to be populated.

Run the following scripts in Oracle to create the source data

1.)    Product data – This is the same script from the ETL – Practice Loading a Dimension post.  If you have already run this one, you are fine.
2.)    Transaction data - This script will create and populate the TRANSACTION_HEADER and TRANSACTION_DETAIL tables above.

The TRANSACTION_DETAIL.PRODUCT_ID is a foreign key to the PRODUCT.ID field from the ETL – Practice Loading a Dimension post.  If you wish, try to write an ETL job that will populate the FACT_TRANSACTION table in PL/SQL or another language of your choosing.  We will examine a possible solution in a future post.

For more practice data and additional information on dimensional modeling visit www.brianciampa.com.  Also, if you need a fresh approach to marketing your data warehousing skillset consider The Data Warehouse Portfolio.

No comments:

Post a Comment