Demographics: Some thoughts and a link

Yesterday I published a post in which I explored the differences between Junk, Outrigger and Mini dimensions.  I chose a demographic table as an example because that’s something we see ALL THE TIME in real data.  Customer demographics are infinitely useful.  Let’s take a deeper look at a couple things you might want to consider when you build a demographics dimension.

The first thing that comes to mind is something I noticed about the example data I shared yesterday … it looked something like this.

This is a VERY brief example, but the field that catches my eye for possible improvement is Income. The question I would ask myself here is: how much do I (or does my customer) care about the ACTUAL income of our customer? I’m not arguing that income is not a great data point, just that we don’t really care whether Jon Smith makes $81,000 a year or $81,402 a year. What we really want to do is create some ranges here. Ranges are WONDERFUL.

  • Ranges reduce changes to the data.  You only have to implement a change when a customer moves from one range to another.
  • Ranges are great for visual analytics.  They give us a natural high-level overview.
  • I felt like this section needed three bullet points, so this one is number three!

Note: Just because we are adding a range doesn’t mean we can’t also keep the original data point the range is based on. I would argue that some data points have less analytic value than others, and in some cases (income, age, last grade of school finished) the range IS more valuable than the original data point. In these cases, the ranges insulate us from change, and it’s unlikely we would do in-depth analytics on how many people earned exactly $81,402 last year.
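To make the range idea concrete, here is a minimal sketch in Python. The band boundaries, the labels, and the names `income_band` and `demographics_row` are all assumptions I made up for illustration; pick whatever cut points make sense for your own customers. It keeps the raw income next to the band, and shows that a move from $81,000 to $81,402 leaves the band untouched.

```python
from bisect import bisect_right

# Hypothetical band lower bounds in dollars (an assumption, not real data).
BAND_LOWER_BOUNDS = [0, 25_000, 50_000, 75_000, 100_000, 150_000]
BAND_LABELS = [
    "Under 25K",
    "25K to under 50K",
    "50K to under 75K",
    "75K to under 100K",
    "100K to under 150K",
    "150K and over",
]


def income_band(income):
    """Return the label of the band that contains the given annual income."""
    if income < 0:
        raise ValueError("income must be non-negative")
    # bisect_right counts how many lower bounds are <= income;
    # subtracting 1 gives the index of the matching band.
    return BAND_LABELS[bisect_right(BAND_LOWER_BOUNDS, income) - 1]


def demographics_row(customer, income):
    """Build a row that keeps the raw income alongside its band."""
    return {
        "customer": customer,
        "income": income,
        "income_band": income_band(income),
    }


before = demographics_row("Jon Smith", 81_000)
after = demographics_row("Jon Smith", 81_402)

# The raw value changed, but the band (the attribute we analyze) did not.
band_changed = before["income_band"] != after["income_band"]
print(before["income_band"], after["income_band"], band_changed)
```

Only a jump across a boundary (say, from $81,402 to $100,000) would change `income_band`, and that is the only kind of change the dimension would need to record.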

In closing, I promised you a link. Full disclosure: I’m not terribly creative when it comes to thinking of useful demographic details that I want to look for or capture in my data. I often find myself going back to this post on Demographic Surveys from SurveyMonkey.

I look forward to your comments on how YOU handle the demographic dimension in your data!

