Saturday, May 12, 2012
As discussed previously, a typical design is to place the facts of a business process that are measured at the same grain into one table (fact table) and to group the contextual pieces into other tables (dimension tables). Occasionally, a piece of context will exist at the same grain as the fact table. For example, using the same scenario as we used above, suppose that the sales revenue needs to be stored at the sales transaction level. If a transaction number exists for each individual transaction, then a dimension table that contains this value will contain as many rows as the fact table. While there is nothing wrong with this approach, it may seem like extra work to create an additional table with a surrogate key that will only be referenced by one row of the associated fact table.
As a result, some designers may opt to make this value a degenerate dimension (mentioned briefly at the end of a previous post). A degenerate dimension is a dimension value that exists directly in the fact table, as opposed to the fact table containing a foreign key that points to that record in a dimension table. The database will not need to allocate any additional space for the surrogate keys in this case, since they add little value anyway. Designing a dimension object like this to be a degenerate dimension is not a must in this situation. Using the standard approach of creating a separate table is fine too.
As with everything that we do as Business Intelligence experts, when faced with this design question it is best to consider the value that each option brings to the table. Asking some of the following questions may be a good start...
1.) Do I have too many dimension tables included in this star already?
2.) How will the reporting tools react to this design?
3.) Will the report writers become confused if I design it this way?
If needed, spend some time absorbing the concept of degenerate dimensions. We will look at some examples later.
Image: photostock / FreeDigitalPhotos.net