Friday, February 28, 2014

The Bus Matrix

One of the invaluable tools that Ralph Kimball describes in his book The Data Warehouse Toolkit is the bus matrix.  The bus matrix is basically a grid that will ultimately allow you to see the relationships between fact tables and their conformed dimensions.

Recall from prior posts that a fact table contains measures (or events for factless fact tables) that pertain to a business process.  A dimension table contains the descriptors of those measures.  Dimension tables should be reused by multiple fact tables if more than one business process uses that dimension.  For example, the salary fact table may describe the payee using the dim_employee dimension table.  The help desk ticketing fact table may describe the person to whom a ticket is/was assigned using that same dim_employee dimension table.  That concept is explained in this post.

A bus matrix is a two-dimensional grid that lists the business processes (which will become fact tables) along the left and the descriptors (which will become dimension tables) across the top.  In the middle, an X or a check mark is placed at the intersection of a fact and dimension that belong in the same star schema.   Consider this example:

Descriptors
Time
Employee
Vendor
Department 
Business
Processes 
Payroll
x
x

x
Shipping
x

x
x
Accounts Receivable
x


x
Sales
x
x

x

When an analyst is gathering requirements in an effort to understand what needs to be warehoused, he can easily list the business processes that come from the conversation along the left of a white board.  He can also list the descriptors (i.e., day, person, department, product, etc.) along the top.  Later, these items can be translated into table names, resulting in a bus matrix.

Creating a bus matrix is a great idea (thanks to Mr. Kimball for that) for the following reasons:

1.) You can easily see the facts and dimensions that reside in your data warehouse.  Entity Relationship Diagrams provide some great information, although they can get pretty large for a large data warehouse.  If seeing the relationships at a high level is necessary, a bus matrix will allow that to be done very easily.

2.) As you add to your data warehouse you can revisit this document and add to it.  Revisiting the bus matrix will help to ensure that you use the conformed dimensions as opposed to inadvertently re-creating one.

3.) We have been treating the bus matrix as a document that can be used to communicate some of the technical relationships of the data warehouse.  That is not a bad use, but consider a version of the bus matrix that simply lists the business processes and descriptors (not their respective tables).  Such a document will essentially describe the organization.  The business processes and the entities that somehow touch those processes are all displayed visually, giving the executives a high level view of the makeup of their organization.

All of the cool BI that provides flashy new toys begins with working through these fundamentals first.  The bus matrix helps the leaders of an organization think through the beginnings of their data management strategy.

Business Intelligence is a great industry with a very bright future.  Have fun!  Are you interested in entering this industry or do you know somebody who is?  Consider this.

Image courtesy of ddpavumba / FreeDigitalPhotos.net