Data Warehousing for Dummies PDF: How BIG do I build?<!-- --> | <!-- -->Assume Wisely
Data Warehousing for Dummies PDF: How BIG do I build?

Data Warehousing for Dummies PDF: How BIG do I build?

Posted: April 25, 2018

This is the third blog in my Data Warehousing for Dummies blog series. If you would like, I have a FREE Data Warehousing for Dummies PDF that takes you through the whole process. Today, I want to talk about an important question, “Should you build one large database of information (a data warehouse) and then parcel off smaller portions to different organizations, or should you build a bunch of smaller scale databases (data marts) and then integrate them together later?”

If you say to yourself:

“Go big or go home. We are building a Costco size warehouse.” OR “ We want a data mart, not a warehouse.”

Then you are starting off on the wrong foot. It is easy to think in terms of big or small. It is better to think in terms of the answers you are looking for.

Answers to Specific Business Questions

Often the answer to a handful of business questions drives an organization’s operations. Jim Rohn used to say, “There are always about a half dozen things that makes 80% of the difference in any area of your life.” Look for the few things that make the most difference, and spend the most of your time on those few things.

Business questions that carry this kind of weight are often analysed in Excel.

“There are always about a half dozen things that makes 80% of the difference in any area of your life.” ~ Jim Rohn

The most democratic data analysis tool in use today is Excel. Virtually everyone has access to Excel. You may fire up Excel to consolidate and check the information you dig up in order to summarize and organize it. This type of “spread-mart” lacks the repeatability and data quality for anything more than a single serve analysis of one moment in time. Sounds like a job for a data warehouse!

For your first project, consider a small scale data mart designed specifically to answer these high impact, high value, “How are we doing?” type of questions.

When A Data Warehouse is the better choice.

Until you understand the following three issues, you have no basis on deciding what investment to make:

The business problem you are trying to solve; the questions you are trying to answer. The business value you expect to gain when your system is successfully built. The volumes and characteristics of the data you need.

When considering the volume and characteristics of the data you need it can be helpful to think in terms of data sources. A small business can have anywhere from 8 to 10 sources, some examples: Quickbooks, Google Analytics, WordPress, Activecampaign, Amazon Affiliate Marketplace, Facebook, Twitter, Salesforce CRM, Survey Monkey, etc. Once you have asked your handful of questions, are the answers easier to get from one integrated database or a pair of independent databases?

A quick note on sources vs subject areas. In the last paragraph we listed out some data sources. An example of a subject area is “Marketing.” Marketing makes use of two data sources:Facebook and Google Analytics. A data mart can service one at most two subject areas. Most businesses have at least 5 subject areas: Marketing, Sales, Fulfilment, Customer Service, & Finance. To address three or more subject areas requires a data warehouse.

Business Intelligence Tools

Data Mart users usually ask questions with a “Tell me what happened” perspective. They require reports and don’t do much heavy analytical processing. On the other hand, there are a variety of different ways to look at the contents of a warehouses because it collects data on a broad range of subject areas:

Simple Reporting. Just like a data mart, the goal is “Tell me what happened?” Business Analysis. Not just, “Tell me what happened?” but also, “Why?” Dashboards & ScoreCards. The warehouse gathers a variety of information and consolidates it: “Tell me a lot of things, but don’t make me work for it.” Data Mining. In addition to, statistical analysis, econometrics, machine learning, AI, data mining is generally about taking large volumes of data and sifting through it to separate knowledge from the noise, often without even having to ask a specific question: “Tell me something interesting.”

Tool vendors increasingly try to provide suites of products to handle as many of these different functions as possible, you will need to deal with different products. Don’t assume that you can select a single vendor whose products satisfy all the business intelligence capabilities your users need. Carefully check out the vendors’ products - all of them.

Benefits of a Data Mart

I started off asking about what kind on investment you should make in your data, to start. The answer depends on what key pieces of information you need to run your business. Once you know the answers you are looking for, then you can look at the data you have to go about getting those answers. Data marts are preferable to warehouses for three reasons:

Speed. These are typically completed in 90 to 120 days. Full scale warehouses take considerably longer. Cost. Simple is fast. Fast is cheaper. Risk. When you work with less data, fewer sources, over a shorter period of time you have a less complex environment - fewer associated risks.

In reality most “real” data investments involve a warehouse that covers all five subject areas I listed before: Marketing, Sales, Fulfillment, Customer Service, & Finance. To lean more about how to build your first data warehouse check out this free resource: Data Warehousing For Dummies PDF.

Git Sum (un)common sense,


Don't miss my next thought:

© 2018 · Rho Lall