What is Dimensional Modeling?
Different types of data modeling techniques are optimized for different applications. Normalized Entity-Relationship models (ER models) are designed to eliminate data redundancy and quickly perform the Insert, Update, and Delete operations, and get the data inside a database.
In contrast to that, dimensional models – data models based on the technique introduced by Ralph Kimball – are denormalized structures designed to retrieve data out of a data warehouse. They are optimized to perform the Select operation and are used in the basic design framework to build highly optimized and functional data warehouses.
In this article, we will cover the basics of dimensional modeling and the concepts related to it. We will also discuss different tools and implementation methods for successfully designing dimensional models.
Benefits of Dimensional Modeling
Dimensional modeling is still the most commonly used data modeling technique for designing enterprise data warehouses because of the benefits it yields. These include:
Faster Retrieval of Data
Dimensional modeling merges the tables in the model itself, which enables users to retrieve data faster by running join queries compared to the other approaches. The denormalized schema of a dimensional model is optimized to run ad hoc queries. As a result, it greatly complements the business intelligence (BI) goals of an organization.
Better Understanding of Business Processes
The information in a dimensional model is stored in fact and dimension tables. We will cover what facts and dimensions are in the subsequent sections. This categorization of data into facts and dimensions, and the entity-relationship structure of a dimensional model, present complex business processes in an easy-to-understand manner to analysts.
Flexible to Change
Dimensional modeling framework makes the data warehouse design extensible. The design can be easily modified to incorporate any new business requirements or make any adjustments. New entities can be added in the model or layout of the existing ones can be changed to reflect modified business processes.
Elements of a Dimensional Model
Fact Tables or Business Measures
Fact tables store the numeric information about business measures and foreign keys to the dimensional tables. Business facts can be additive, semi-additive, or non-additive. Table 1 explains the three types of fact tables.
|Type of Facts ||Description |
|Additive Facts ||Business measures that can be aggregated across all dimensions |
|Semi-additive Facts ||Business measures that can be aggregated across some dimensions and not across others (usually date and time dimensions) |
|Non-additive Facts ||Business measures that cannot be aggregated across any dimension |
Table 1: Types of facts in a fact table
Fact Types Explained with an Example
An apparel store maintains the following data for a sales transaction:
|Date ||Store Location ||Product Type ||Quantity ||Unit Price ||Sales Amount ||Inventory ||Sales Tax |
|6/3/2018 ||CA ||Nylon ||5 ||100 ||500 ||30 ||7.75% |
|6/3/2018 ||CA ||Polyester ||7 ||250 ||1750 ||50 ||7.75% |
|6/3/2018 ||PA ||Nylon ||6 ||100 ||600 ||65 ||6.00% |
|6/3/2018 ||PA ||Polyester ||3 ||250 ||750 ||25 ||6.00% |
|6/4/2018 ||CA ||Nylon ||7 ||100 ||700 ||36 ||7.75% |
|6/4/2018 ||CA ||Polyester ||6 ||250 ||1500 ||17 ||7.75% |
|/4/2018 ||PA ||Nylon ||9 ||100 ||900 ||14 ||6.00% |
|6/4/2018 ||PA ||Polyester ||10 ||250 ||2500 ||20 ||6.00% |
Table 2: Transactional table maintained by an apparel store
The columns containing numeric information about the business process are our business facts. In this example, Quantity, Unit Price, Sales Amount, Inventory, and Sales Tax are facts. And the rest of the entities (Date, Store, and Product Type) are dimensions.
Sales Amount can be added across all dimensions therefore, it is an additive fact. Moreover, adding Inventory information across Store dimension provides useful business information but since this is just the snapshot of the amount of goods at a certain point, adding it across the Date dimension does not give any useful business insights. Since Inventory is additive across some dimensions and non-additive across others, it is a semi-additive fact. Now consider Sales Tax. Adding Sales Tax across any dimension will give us inaccurate insights. Sales Tax is, therefore, a non-additive fact.
Dimension tables store descriptive information about the business facts to help understand and analyze the data better. In the example presented in Table 2, Date, Store Location, and Product Type are dimension entities, and they give more information about the business facts. Total amount of Sales is an important measure to record, but without the dimensions, a business cannot assess which store location or product type is yielding more sales.
Figure 1 shows how fact tables and dimension tables are arranged in a star schema dimensional model.
Figure 1: Star schema with fact and dimension tables
Primary key is a column in dimension tables that identifies unique records. For the slowly changing dimensions, the surrogate key will be the primary key.
Foreign keys are used to join two tables (usually fact and dimension tables). The primary key in a dimension table is a foreign key in the related fact table and is used to reference that particular dimension.
Designing a Dimensional Model
To understand the process of designing dimensional models, let’s consider the example of an apparel line that sells two kinds of windbreakers – Nylon and Polyester in its two stores across California and Pennsylvania. The sample data for the example is shown in Table 2.
Step 1: Identify the Business Processes
Before modeling the data, you should ask what you are trying to explain using the data model. The dimensional modeling process (or any type of data modeling) begins with the identification of the business process that you want to track. In this case, we want to track sales for the two types of windbreakers.
Step 2: Identify Facts and Dimensions
The information in a dimensional model is categorized into two tables types – Facts and Dimensions. The next step is to identify the business facts that you want to measure and their associated dimensions. In our example, windbreaker sale is the fact that we want to measure. Date, store location (California and Pennsylvania), and product type (Nylon windbreakers and Polyester windbreakers) are the dimensions that give us further insights into the sales process.
Step 3: Identify the Attributes for Dimensions
After you’ve identified the facts and dimensions for your business process, the next step is to identify attributes and create a separate dimensional table for each of the dimensions. Each record in the dimension table should have a unique key. This key will be used to uniquely identify the records in the dimension table and will be used as the foreign key in the fact table to reference the particular dimension and join it with the fact table. Tables 3-5 show the dimension table for each of the dimensions in our apparel line example.
|Date Dimension |
|Date Key ||Date ||Day |
|10201 ||6/3/2018 ||Sunday |
|10202 ||6/4/2018 ||Monday |
Table 3: Dimension table for Date
|Store Dimension |
|Store Key ||Store Name ||City ||State |
|151 ||Angie’s Apparel ||Los Angeles ||California |
|152 ||Angie’s Apparel ||Pittsburgh ||Pennsylvania |
Table 4: Dimension table for Store
|Product Dimension |
|Product Code ||Collection ||Material ||Color |
|131620 ||Windbreaker – Fall Collection ||Nylon ||Orange |
|131571 ||Windbreaker – Fall Collection ||Polyester ||Black |
Table 5: Dimension table for Product
Step 4: Define the Granularity for Business Facts
Granularity refers to the level of information that is stored in any table. For instance, in our example, the sales amount is recorded on a daily basis, therefore the granularity in this case is daily. The fact tables in a dimensional model should be consistent with the pre-defined granularity.
Figure 1 shows how the final dimensional will look.
Step 5: Storing Historical Information (Slowly Changing Dimensions)
An important feature of dimensional models is that the dimensional attributes can be easily modified without changing the complete transaction information. For example, the apparel line decides to continue the Nylon windbreaker from Fall Collection into the Spring Collection and updates the name in the Collection attribute. Making the update is an easy process in the dimensional table, but with the update we will lose our previous data. If the goal of your data modeling and data warehouse is maintaining and storing history, this could be a problem. Dimensions that change slowly over time are called Slowly Changing Dimensions. You can maintain and store history by tracking slowly changing dimensions.
Read more about different use cases of slowly changing dimensions.
Automation – A Game Changer for Designing Dimensional Models
Designing dimensional models is an essential step in building the framework of an enterprise data warehouse. The process can be streamlined with the help of a robust data warehouse automation tool such as Astera Data Warehouse Builder.
With Astera DW Builder, you can quickly build dimensional models in a visual code-free integrated development environment. Entities can be denormalized with simple drag-and-drop and merge s. Entity roles (facts and dimensions) can be assigned in bulk, which can save you valuable time when working with hundreds of entities. In addition, the product enables you to manage slowly changing dimensions with built-in support for SCD type 1, 2, 3, and 6.
Astera DW Builder is an end-to-end data warehouse automation platform that comes with built-in data modeling capabilities, support for a wide range of databases and CRM applications, automated data mapping and data loading features, and native integration with business intelligence platforms such as Tableau and Power BI.
See Astera DW Builder’s demo or schedule a discovery call with us today to experience the power of data warehouse automation first-hand.