Part 2 Analytical Databases
Chapter 8 Data Warehouse and Data Mart Modeling
Introduction
Earlier in this book, we gave a detailed overview of the modeling process of operational databases. In Chapter 2, we described the details of ER modeling, a predominant technique for visualizing database requirements used extensively for conceptual modeling of operational databases. In Chapter 3, we gave a detailed overview of relational modeling as the standard method for logical modeling of operational databases. Both these techniques can also be used during the development of data warehouses and data marts. In addition to these two methods, a modeling technique known as dimensional modeling, tailored specifically for analytical database design purposes, is regularly used in practice for modeling data warehouses and data marts.
In the first part of this chapter, we will give a detailed overview of dimensional modeling. In the second part, we will give an overview of the most commonly used data warehouse modeling strategies with respect to how they utilize ER modeling, relational modeling, and dimensional modeling.
Dimensional Modeling: Basic Concepts
Dimensional modeling is a data design methodology used for designing subject-oriented analytical databases (i.e., data, warehouses or data marts). Commonly, dimensional modeling is employed as a relational data modeling technique. As a relational modeling technique, dimensional modeling designs relational tables that have primary keys and are connected to each other via foreign keys, while conforming to the standard relational integrity constraints. In addition to using the regular relational concepts (primary keys, foreign keys, integrity constraints, etc.), dimensional modeling distinguishes two types of tables—dimensions and facts:
bull; Dimension tables (dimensions) contain descriptions of the business, organization, or enterprise to which the subject of analysis belongs. Columns in dimension tables contain descriptive information that is often textual (e.g., product brand, product color, customer gender, customer education level), but can also be numeric (e.g., product weight, customer income level). This information provides a basis for analysis of the subject. For example, if the subject of the business analysis is sales, it can be analyzed by product brand, customer gender, customer income level, and so on.
bull; Fact tables contain measures related to the subject of analysis. In addition, fact tables contain foreign keys associating them with dimension tables. The measures in the fact tables are typically numeric and are intended for mathematical computation and quantitative analysis. For example, if the subject of the business analysis is sales, one of the measures in the fact table sales could be the salersquo;s dollar amount. The sale amounts can be calculated and recalculated using different mathematical functions across various dimension columns. For example, the total and average sale can be calculated per product brand, customer gender, customer income level, and so on.
Figure 8.1A dimensional model (star schema).
The result of relational dimensional modeling is a dimensional relational schema containing facts and dimensions, shown in Figure 8.1. The dimensional schema is often referred to as the star schema.
A dimension table contains a primary key and attributes that are used for analysis of the measures in the fact tables. A fact table contains fact-measure attributes and foreign keys that connect the fact table to the dimension tables. In this book, we mark fact tables with a bolded thicker frame to distinguish them from the dimension tables in the figures. The primary key of a fact table is a composite key that combines foreign key columns and/or other columns in the fact table (we will elaborate on this later in this chapter).
Even though any order of columns in a fact table (or in any table, for that matter) is acceptable, for readability reasons, in examples of star schema in this book we will always show measures in the fact table as the last columns in the table.
Initial Example: Dimensional Model
Based on A Single Source
The following example illustrates the basic concepts of dimensional modeling. This example uses the same fictional scenario involving the database for the ZAGI Retail Company Sales Department used in Chapters 2 and 3. Figures illustrating the ZAGI Retail Company Sales Department Database and its records are repeated here for convenience.
Figure 8.2 (same as Figure 3.32) shows the ER diagram and the resulting relational schema for the ZAGI Retail Company Sales Department Database.
Figure 8.3 (same as Figure 3.33) shows the records in the relational database for the ZAGI Retail Company Sales Department Database.
Based on its operational database (the ZAGI Retail Company Sales Department Database) shown in Figures 8.2 and 8.3, the ZAGI Retail Company decides to use the dimensional modeling technique to design an analytical database whose subject of analysis is sales. The result of this process is the star schema shown in
Figure 8.4.
In the star schema, the chosen subject of analysis is represented by a fact table. In this example, the chosen subject of analysis (sales) is represented by the SALES fact table.
Figure 8.2 The ZAGI Retail Company Sales Department Database—an ER diagram and the resulting relational schema.
Designing the star schema involves considering which dimensions to use with the fact table representing the chosen subject. For every dimension under consideration, two questions must be answered:
Question 1: Can the dimension be useful for the analysis of the chosen subject?
Quest
剩余内容已隐藏,支付完成后下载完整资料
说明:
封面之后放英文原文;英文原文之后放译文;译文后空两行,注明英文出处(请按照参考文献的书写要求书写,英文原文不能来自网站)。
例:
原文来源:
Y. Desmedt. Some recent research aspects of threshold cryptography. In : E. Okamoto , G. Davida , M. Mambo , eds. Proc. the 1st Intrsquo;l Information Security Workshop, Lecture Notes in Computer Science 1396. New York : Spring-Verlag , 1997. 158~173
原文:
Part 2 Analytical Databases
Chapter 8 Data Warehouse and Data Mart Modeling
Introduction
Earlier in this book, we gave a detailed overview of the modeling process of operational databases. In Chapter 2, we described the details of ER modeling, a predominant technique for visualizing database requirements used extensively for conceptual modeling of operational databases. In Chapter 3, we gave a detailed overview of relational modeling as the standard method for logical modeling of operational databases. Both these techniques can also be used during the development of data warehouses and data marts. In addition to these two methods, a modeling technique known as dimensional modeling, tailored specifically for analytical database design purposes, is regularly used in practice for modeling data warehouses and data marts.
In the first part of this chapter, we will give a detailed overview of dimensional modeling. In the second part, we will give an overview of the most commonly used data warehouse modeling strategies with respect to how they utilize ER modeling, relational modeling, and dimensional modeling.
Dimensional Modeling: Basic Concepts
Dimensional modeling is a data design methodology used for designing subject-oriented analytical databases (i.e., data, warehouses or data marts). Commonly, dimensional modeling is employed as a relational data modeling technique. As a relational modeling technique, dimensional modeling designs relational tables that have primary keys and are connected to each other via foreign keys, while conforming to the standard relational integrity constraints. In addition to using the regular relational concepts (primary keys, foreign keys, integrity constraints, etc.), dimensional modeling distinguishes two types of tables—dimensions and facts:
bull; Dimension tables (dimensions) contain descriptions of the business, organization, or enterprise to which the subject of analysis belongs. Columns in dimension tables contain descriptive information that is often textual (e.g., product brand, product color, customer gender, customer education level), but can also be numeric (e.g., product weight, customer income level). This information provides a basis for analysis of the subject. For example, if the subject of the business analysis is sales, it can be analyzed by product brand, customer gender, customer income level, and so on.
bull; Fact tables contain measures related to the subject of analysis. In addition, fact tables contain foreign keys associating them with dimension tables. The measures in the fact tables are typically numeric and are intended for mathematical computation and quantitative analysis. For example, if the subject of the business analysis is sales, one of the measures in the fact table sales could be the salersquo;s dollar amount. The sale amounts can be calculated and recalculated using different mathematical functions across various dimension columns. For example, the total and average sale can be calculated per product brand, customer gender, customer income level, and so on.
Figure 8.1A dimensional model (star schema).
The result of relational dimensional modeling is a dimensional relational schema containing facts and dimensions, shown in Figure 8.1. The dimensional schema is often referred to as the star schema.
A dimension table contains a primary key and attributes that are used for analysis of the measures in the fact tables. A fact table contains fact-measure attributes and foreign keys that connect the fact table to the dimension tables. In this book, we mark fact tables with a bolded thicker frame to distinguish them from the dimension tables in the figures. The primary key of a fact table is a composite key that combines foreign key columns and/or other columns in the fact table (we will elaborate on this later in this chapter).
Even though any order of columns in a fact table (or in any table, for that matter) is acceptable, for readability reasons, in examples of star schema in this book we will always show measures in the fact table as the last columns in the table.
Initial Example: Dimensional Model
Based on A Single Source
The following example illustrates the basic concepts of dimensional modeling. This example uses the same fictional scenario involving the database for the ZAGI Retail Company Sales Department used in Chapters 2 and 3. Figures illustrating the ZAGI Retail Company Sales Department Database and its records are repeated here for convenience.
Figure 8.2 (same as Figure 3.32) shows the ER diagram and the resulting relational schema for the ZAGI Retail Company Sales Department Database.
Figure 8.3 (same as Figure 3.33) shows the records in the relational database for the ZAGI Retail Company Sales Department Database.
Based on its operational database (the ZAGI Retail Company Sales Department Database) shown in Figures 8.2 and 8.3, the ZAGI Retail Company decides to use the dimensional modeling technique to design an analytical database whose subject of analysis is
剩余内容已隐藏,支付完成后下载完整资料
资料编号:[253957],资料为PDF文档或Word文档,PDF文档可免费转换为Word
以上是毕业论文外文翻译,课题毕业论文、任务书、文献综述、开题报告、程序设计、图纸设计等资料可联系客服协助查找。