As with any skill, only so much can be learned from a book (or a blog post). In this post, I’m hoping to offer some hands-on training that will allow you to play with some data a bit. If you have an Oracle environment available to you (check the previous post if you do not), you can run the script found here. If you are unfamiliar with running a script like this, some suggestions are in the previous post as well. The following tables with data will be created…
This is a very basic star that holds data related to sales transactions. Here is a quick breakdown of the tables…
Dim_Product (22 rows): This is a dimension table that contains a product’s overall group (larger grouping) and then the individual product numbers and names within that group (smaller grouping).
Dim_Employee (19 rows): This is a dimension table that contains information regarding the employees that sold the products.
Dim_Day (365 rows): This is a date dimension table that contains one row for each day as well as higher-level date groupings. In this example, this table only holds data in calendar year 2011 (1/1/2011 – 12/31/2011).
Fact_Transaction (300 rows): This is a fact table that contains one row for each transaction line item. So, if an individual purchased three items on the same transaction (i.e., brought three items to the cash register) that transaction will produce three rows in this fact table.
Data Warehousing is ideal for top-down analysis. Once the script has been run in Oracle, and the tables have been created, consider the following example…
select b.calendar_year, sum(a.amount)
from fact_transaction a,
dim_day b,
dim_employee c,
dim_product d
where a.key_day = b.key_day
and a.key_sales_person = c.key_employee
and a.key_product = d.key_product
group by b.calendar_year
This select statement shows the dollars sold per fiscal year, although there is only one year in this case. If you want to throw in the product_group to see the dollars sold per year, per product_group, that will give you an additional level of detail…
select b.calendar_year, d.product_group, sum(a.amount)
from fact_transaction a,
dim_day b,
dim_employee c,
dim_product d
where a.key_day = b.key_day
and a.key_sales_person = c.key_employee
and a.key_product = d.key_product
group by b.calendar_year, d.product_group
Do you see what is happening? You can continue drilling down until you get to a specific transaction_line if you wish.
An analyst may examine the data at a high-level and notice that one quarter had an enormous number of sales dollars. She can keep adding descriptors to the select clause (and group by them) to eventually find where those dollars lie. Did one sales rep have a stellar quarter? Did one product group do exceptionally well? Did sales go through the roof on one particular day out of the quarter? Did one product alone do exceptionally well? Once you have the information you can turn that into action. Does somebody deserve a raise for their extraordinary work? Did one marketing strategy work remarkably well, meaning that we need to consider implementing it across the entire organization?
This type of structure can turn data into valuable data. Valuable data can enable great decisions. Have fun playing with the data. See how many ways you can slice and dice it. We will do some more with this star in future posts.