The process of defining your data warehousing system (DWH) has started. You’ve outlined the relevant dimension tables, which tie to the business requirements. These tables definewhatwe weigh, observe and scale. Now we need to definehowwe measure.Fact tables are where we store these measurements. They hold business data that can be aggregated across dimension combinations. But the fact is that fact tables are not so easily described – they have flavors of their own. In this article, we’ll answer some basic questions about fact tables, and examine the pros and cons of each type.
In the previous two articles, we considered the two most common data warehouse models: the star schema and the snowflake schema. Today, we’ll examine the differences between these two schemas and we’ll explain when it’s better to use one or the other.The star schema and the snowflake schema are ways to organize data marts or entire data warehouses using relational databases. Both of them usedimension tablesto describe data aggregated in a
In a previous article we discussed the star schema model. The snowflake schema is next to the star schema in terms of its importance in data warehouse modeling. It was developed out of the star schema, and it offers some advantages over its predecessor. But these advantages come at a cost. In this article, we’ll discuss when and how to use the snowflake schema.The Snowflake SchemaThe snowflake schema’s name comes from the fact that dimension tables branch out and look something like a snowflake. When we look at the model above, we’ll notice it’s a fact table surrounded by a few dimension tables, some of which do the aforementioned branching. Unlike the star schema, dimension tables in the snowflake schema can have their own categories.
Today, reports and analytics are almost as important as core business. Reports can be built out of your live data; often this approach will do the trick for small- and medium-sized companies without lots of data. But when things get bigger – or the amount of data starts increasing dramatically – it’s time to think about separating your operational and reporting systems.Before we tackle basic data modeling, we need some background on the systems involved. We can roughly divide systems in two categories: operational and reporting systems. Operational systems are often called Online Transaction Processing (OLTP). Reporting and analytical systems are referred to as Online Analytical Processing (OLAP). OLTP systems support business processes. They work with “live” operational data, are highly normalized, and react very quickly to user actions. On the other hand, the primary purpose of the OLAP systems is analytics. These systems use summarized data, which is usually placed in a denormalized data warehousing structure like the star schema. (What is denormalization? Simply put, it’s having redundant data records for the sake of better performance.