When it comes to data warehouse designing, two of the most widely discussed approaches are the Inmon method and Kimball method. For years, people have debated over which one is better and more effective for businesses. However, there’s still no definite answer as both methods have their benefits and drawbacks.
In this article, we’ll discuss in detail what are the Kimball and Inmon approaches. We’ll also look at the factors that differentiate between these two data warehouse concepts.
The Kimball Approach
Initiated by Ralph Kimball, this approach follows a bottom-up method of data warehouse architecture design in which data marts are formed first based on the business requirements.
The primary data sources are then evaluated, and an Extract, Transform and Load (ETL) tool is used to fetch different types of data formats from several sources and load it into a staging area. The next phase includes loading data into a dimensional model that’s denormalized by nature. This model partitions data into fact table, which is numeric transactional data or dimension table, which is the reference information that supports facts.
The star schema is the fundamental element of dimensional modeling in which a fact table is bounded by several dimensions. Several star schemas can be constructed within a dimensional model to fulfill various reporting needs.
To integrate data, Kimball suggested the idea of conformed dimensions. It exists as a basic dimension table that is shared across different fact tables (such as customer and product) within a data warehouse or as the same dimension tables in various data marts. This guarantees that a single data item is used in a similar manner across all the facts.
An important designing tool in this approach is the enterprise bus matrix which vertically records the facts and horizontally records the conformed dimensions. This matrix displays how star schemas are constructed. It is used by business management teams as an input to prioritize which row of the matrix should be implemented first.
Some of the main benefits of the Kimball approach include:
- The dimensional model is fast to construct as no normalization is involved, which means swift execution of the initial phase of the data warehousing design process.
- Most of the data operators can easily comprehend star schema and because of its denormalized structure, it simplifies querying and analysis.
- The data warehouse system footprint is trivial because it focuses on individual business areas and processes rather than the whole enterprise. So, it takes less space in the database, simplifying system management.
- It enables fast data retrieval from the data warehouse; as data is segregated into facts and dimensions.
- A smaller team of designers and planners is sufficient for data warehouse management because data source systems are quite stable and the data warehouse is process-oriented. Also, query optimization is straightforward, predictable, and controllable.
- Using conformed dimensions, a business intelligence tool can go across several star schemas for report generation.
Some of the drawbacks of the Kimball approach include:
- As data isn’t entirely integrated before reporting, the idea of a ‘single source of truth’ is lost.
- Irregularities can occur when data is updated. This is because in denormalization, redundant data is added to database tables.
- Performance issues may occur due to the addition of columns in the fact table; as these tables are quite in-depth. The addition of new columns can expand the dimensions of the fact table, affecting its performance. Also, the dimensional model becomes difficult to alter with any change in business needs.
- As the model is business process-oriented, instead of focusing on the enterprise as a whole, this approach cannot handle all the BI reporting requirements.
- The process of incorporating large amount of legacy data into the data warehouse is complex.
The Inmon Method
Bill Inmon’s method to develop a data warehouse starts with designing the corporate data model, which identifies the main subject areas and entities the enterprise works with, such as customer, product, vendor, and so on.
The model then creates a thorough logical model for every primary entity. For instance, a logical model is constructed for product with all the attributes associated with that entity. This logical model could include ten diverse entities under product including all the details, such as business drivers, aspects, relationships, dependencies, and affiliations.
The Inmon approach uses the normalized form for building entity structure, avoiding data redundancy as much as possible. This results in clearly identifying business requirements and preventing any data update irregularities.
Next, the physical model is constructed, which follows the normalized structure. This model creates a single source of truth for the whole business. Data loading becomes less complex due to the normalized structure of the model. However, using this arrangement for querying is challenging as it includes numerous tables and links.
This approach proposes constructing data marts separately for each division, such as finance, marketing sales, etc. All the data entering the data warehouse is integrated. To ensure integrity and consistency across the enterprise, the data warehouse acts as a single data source for various data marts.
The Inmon approach offers the following benefits :
- The data warehouse acts as a unified source of truth for the entire business, where all the data is integrated.
- This approach has very low data redundancy. So, there’s less possibility of data update irregularities, which makes the ETL process more straightforward and less susceptible to failure.
- It simplifies business processes, as the logical model represents detailed business objects.
- This approach offers greater flexibility; as it’s easier to update the data warehouse in case there’s any change in the business requirements or source data.
- It can handle diverse enterprise-wide reporting requirements.
The possible drawbacks of this approach are as following:
- Complexity increases as multiple tables are added to the data model with time.
- Resources skilled in data modeling are required, and that can be expensive and challenging to find.
- The preliminary setup and delivery are time-consuming.
- Additional ETL operation is required since data marts are created after the creation of the data warehouse.
- This approach requires experts to effectively manage a data warehouse.
Kimball vs. Inmon Approach: Which Data Warehouse Concepts Should You Opt For?
Now that we’ve seen the advantages and drawbacks of both these methods, the question arises: Which one of these data warehouse concepts would best serve your business?
Both these approaches consider the data warehouse as a central repository that supports business reporting. Also, both methods use ETL for data loading. However, the main difference lies in modeling data and loading it in the data warehouse.
The approach used for data warehouse construction influences the preliminary delivery time of the warehousing project and the capacity to put up with prospective variations in the ETL design.
Still not sure which one of these data warehouse concepts would serve your company best?
We’ve narrowed down a few aspects that can help you decide between the two approaches.
- Reporting Needs: If you need organization-wide and integrated reporting, then the Inmon approach is more suitable. But if you require reporting focused on the business process or team, then opt for the Kimball method.
- Project Deadline: Designing a normalized data model is comparatively more complex than designing a denormalized model. This makes the Inmon approach a time-intensive process. Therefore, if you have less time for delivery, then opt for the Kimball method.
- Prospective Recruitment Plan: The higher level of complexity of data model creation in the Inmon approach requires a larger team of professionals for data warehouse management. Therefore, choose accordingly.
- Frequent Changes: If your reporting needs are likely to change more quickly and you are dealing with volatile source systems, then opt for the Inmon method as it offers more flexibility. However, if reporting needs and source systems are comparatively stable, then it’s better to use the Kimball method.
- Organizational Principles: If your organization’s stakeholders and corporate directors recognize the need for data warehousing and are ready to bear the expenses, then the Inmon method would be a safer bet. On the other hand, if the decision-makers aren’t concerned about the nitty-gritty of the approach, and are only looking for a solution to improve reporting, then it’s sufficient to opt for the Kimball approach.
Both the Inmon and the Kimball methods can be used to successfully design data warehouses. In fact, several enterprises use a blend of both these approaches (called the hybrid model).
In the hybrid model, the Inmon method is used to form an integrated data warehouse. Whereas, the Kimball approach is followed to develop data marts using the star schema.
It’s not possible to claim which approach is better as both methods have their benefits and drawbacks, and they both work well in different situations. The data warehouse designer has to choose a method depending on the various factors discussed in this article.
Lastly, for any method to be effective, it has to be well-thought-out, explored in-depth, and developed to gratify your company’s business intelligence reporting requirements.