The Rumsfeld Matrix as an effective tool in the decision-making process
During a briefing on the Iraq War, Donald Rumsfeld divided information into 4 categories: known known, known unknown, unknown known, unknown unknown. ...
It’s already difficult to imagine a situation where business makes a progress without data. It plays the main role in companies’ operation on the basis of which business team makes different decisions, develops strategies and forecasts etc. For every improving company it is too necessary to make appropriate relations with their data. And the most important step is data warehouse architecture.
What is data warehouse?
Data warehouse is a storage system for data collected from different sources within a company and used to run decision making process.
Currently there are 2 prominent architecture styles to build data warehouse: the Inmon architecture and the Kimball architecture. Ralph Kimball and Bill Inmon propose different concepts. The main difference is a technique of data structure modeling, loading and storage. This difference has influence on the initial delivery time of the data warehouse and the ability to accommodate ETL design changes. However, methods have general characteristics: both of them position the data warehouse as the central data repository for a company; cover all corporate reporting needs; use ETL to load data warehouse.
Let’s consider each method.
The Inmon approach
This method begins with the corporate data modelling. With the help of this main subject areas and entities (customers, product/service, vendors etc.) are identified. Consequently, on the basis of this a detailed logical model for each entity is created. Entity structure has a normalized form, data redundancy is avoided as much as possible. It is a key characteristic of this technique that allows to determinate business concept and avoid data update anomalies.
The data warehouse is the one source of the truthful information for enterprise. Such structure simplifies data loading. But it is difficult to use structure for querying because of many tables and joins.
So, B. Inmon propose to build data marts for every specific department (Finance, Sales, Business Development etc.). All data is integrated, and data warehouse is the single source of data from different marts. Such concept guarantees data completeness and consistency across an organization.
Advantages:
Disadvantages:
The Kimball approach
This approach begins with main business processes and questions identifying. The operating system is a key data source. For data delivery from different sources and loading it into a staging area is used ETL software. From here data is loaded into a dimensional model. The key approach difference is not normalized dimensional model. The star schema is the main concept of dimensional modeling where is centralized store. The fact table contains all data relevant to the subject area. The dimensional table describes stored data. User can make detailing without additional connections as dimensional tables are totally not normalized. R. Kimball proposes the conformed dimensions concept to achieve an integration in the dimensional model. Key characteristics (customer, product, service) are built once and used by all facts. This guarantees identical characteristic usage by all facts.
Advantages:
Disadvantages:
Both techniques have their advantages and disadvantages and depends on a situation each of them can be more efficient. The main task is to make reasonable and amenable to business needs decision for the best result achievement.