Friday, March 28, 2014

Accumulating Snapshot Fact Tables

We've been looking at the various fact table design options in data warehousing.  Two weeks ago we examined the transaction fact table and last week we examined the periodic snapshot fact table.  The third and final option is called an accumulating snapshot.  This type of fact table is different from the other two in one big way.  Each row is often revisited.  Consider our banking example from the other posts.  When a deposit is added to a transaction fact table, that row is added and then left alone.  All of the data needed to add and complete that row is known.  The same is true of periodic snapshots.  An accumulating snapshot fact table begins each row and then accumulates data until that row is complete.  Let's consider an example in a different context.  Suppose that a star is built for the purposes of analyzing a help desk ticketing business process.  The following descriptors (in addition to any measures) are a part of the business process:

Date Ticket Opened
Date Ticket Assigned
Date Solution Provided To Customer
Date Customer Accepted Solution
Date Ticket Closed
Ticket Number

Suppose that ticket number 10012 is opened on 3/1/14.  At that point, this row will exist in the fact table:


Date Ticket Opened

Date Ticket Assigned

Date Solution Provided To Customer

Date Customer Accepted Solution

Date Ticket Closed

Ticket Number
3/1/2014 10012

Now, suppose that the ticket is assigned to a technician on 3/3/14.  That same row will be updated to look like this:


Date Ticket Opened

Date Ticket Assigned

Date Solution Provided To Customer

Date Customer Accepted Solution

Date Ticket Closed

Ticket Number
3/1/2014 3/3/2014 10012

If the solution is provided to the customer one day later, the row will be updated to look like this:


Date Ticket Opened

Date Ticket Assigned

Date Solution Provided To Customer

Date Customer Accepted Solution

Date Ticket Closed

Ticket Number
3/1/2014 3/3/2014 3/4/2014 10012

Assuming that the customer accepts the solution on 3/5 and the ticket is closed 3/6, the row will be updated to look like this:


Date Ticket Opened

Date Ticket Assigned

Date Solution Provided To Customer

Date Customer Accepted Solution

Date Ticket Closed

Ticket Number
3/1/2014 3/3/2014 3/4/2014 3/5/2014 10012

and then this:


Date Ticket Opened

Date Ticket Assigned

Date Solution Provided To Customer

Date Customer Accepted Solution

Date Ticket Closed

Ticket Number
3/1/2014 3/3/2014 3/4/2014 3/5/2014 3/6/2014 10012

At this point, the row is left alone.  If you so desire, an accumulating snapshot allows you to store some lags between dates to help with analysis.  This can help ease the burden of using the database to calculate the various lags.

In the future we will look at writing some ETL to populate an accumulating snapshot fact table.

Image courtesy of Vichaya Kiatying-Angsulee / FreeDigitalPhotos.net

1 comment:

  1. Hi did you ever get around to writing the code for this? I would fid this very helpful. Thanks

    ReplyDelete