Junk and Outrigger and Mini Dimensions OH MY!

How is a Junk dimension different from an Outrigger dimension different from a Mini dimension?  This is one that confused me for a long time, I hope the following explanation helps shed some light on the differences between these extremely useful dimension types.

A Junk dimension is a dimension that we create when we are building our Fact table and we are left with some dimension attributes (fields) that are useful but don’t belong in any of the other dimension tables we’ve defined.  We could create separate dimension tables for these fields, but that would clutter up our design for no good reason.  Usually, these fields are flag fields (TRUE/FALSE, 0,-1) or low cardinality (few distinct field values).  Examples might be IsActive (T/F), IsCovered (T/F), IsEligible (T/F), Status (Open/Closed/Suspended).  The idea is we would create a table that contains all the unique combinations of these fields.  We could either populate that table once with every possible combination of these fields or we might fill it as we go, adding in new combinations as they exist in the data.  The Junk dimension is then treated like any other dimension and used in the Fact table to give additional context to our measures.

An Outrigger table is a little different.  In this case, we are creating a dimension that will contain a subset of another dimension.  These fields all describe something in common as opposed to a Junk dimension where the only thing in common between the fields is that they are all flag fields or low cardinality fields.  For example, we might start with a Customer dimension that looks like this …

For the most part, this is a great looking dimension.  The problem is that every customer will have a change to their age and Income (hopefully a positive change!) every year.  This could result in a LOT of new records in our dimension table (assuming age and income are things we decide to track history on),  If we have a lot of customers this dimension could become HUGE as a result of these changes.  It might be better to break out the demographic fields to a new dimension.  Just as with the Junk dimension we can either load every possible combination of values (not realistic considering Income is one of our fields) or we can load all the unique combinations when we start and then add new combinations as we find them in the data.

The new dimensions would look more like this

Another detail that sets the Outrigger dimension apart from the Junk dimension is HOW we use it.  A Junk dimension would be used as a new foreign key in the FACT table.  An Outrigger dimension is used as a foreign key in a DIMENSION table.  So now our dimension table might look like this …

Now, some of you are probably looking at this and thinking … but that’s a SNOWFLAKE dimension?  Nope … If we were to add snowflakes to our design the tables linked to our CustomerDimension would be broken down closer to third normal form, in this case, the dimension we are linking to is still more denormalized in design.

As if that wasn’t enough, here’s one last bit to blow your mind.  The dimDemographic table can now be used as EITHER an Outrigger dimension or mini-dimension.  It’s the same dimension, but if we are getting to it via the demographicID in the DimCustomer table (known as a browse query because we are browsing dimensional details without looking at a fact table) we think of it as an Outrigger.  If we are accessing the Demographic detail through a DemographicID stored in the FACT table then we think of it as a mini-dimension.  Here’s what our Fact table would look like if we were looking at the DimDemographic table as a mini dimension …

What do you think?  Did I miss anything?

One Comment

  1. Pingback: Demographics: Some thoughts and a link | Timcost

Leave a Reply

Your email address will not be published. Required fields are marked *