Sunday, February 17, 2019

OAC―Knowing the Dimensional Modelling Basics (1/2)

Video 1.  Dimensional Modeling – Declaring Dimensions

Operational Processing vs Data Warehousing


One of the most important assets of any organization is its information. This asset is almost always used for two purposes:[1]



Operational Processing Analytical Decision Making
Flavor
Transactional Analytical
Main Data Flow
The operational systems are where you put data in The Data Warehousing and Business Intelligence (DW/BI) systems are where you get the data out
Optimization
Optimized to process transactions quickly Optimized for high-performance queries
# of Transactions in Processing
Almost always deal with one transaction record at a time Often require that many transactions be searched and compressed into an answer set
History Preservation
Typically do not maintain history, but rather update data to reflect the most current state Typically demand that historical context be preserved to accurately evaluate the organization's performance over time


Dimensional Modeling (DW/BI)


Dimensional modeling is one of the methods of data modeling, that is the preferred technique for presenting analytic data.  It helps us store the data in such a way that it is relatively easy to retrieve the data from the data once the data is stored in database.

This is the reason why dimensional modeling is used mostly in data warehouses built for reporting. On the other side, dimensional model is not a good solution if your primary purpose of your data modeling is to reduce storage space requirement, reduce redundancy, speed-up loading time etc.[1]

Figure 1.  Star Schema (left) vs OLAP Cube (right)

3NF Model vs Dimensional Model


Although dimensional models are often instantiated in relational database management systems (RDMS), they are quite different from third normal form (3NF) models which seek to remove data redundancies:
  • 3NF Model (or Normalized Model)
    • Divides data into many discrete entities, each of which becomes a relational table
    • Sometimes are referred as entity-relationship (ER) models
    • Designed to reduce the duplication of data and ensure referential integrity
    • Designed to improve database processing while minimizing storage costs
    • Useful in operational processing because an update or insert transaction touches the database in only one place
      • However, are too complicated for BI queries. 
  • Dimensional Model (Star Schemas and OLAP Cubes)
    • Both stars and cubes have a common logical design with recognizable dimensions; however, the physical implementation differs (see Figure 1):
      • Star Schemas
        • Referred to as star schemas in RDBS because of their resemblance to a star-like structure in RDMS implementation
      • OLAP Cubes
        • Referred to as online analytical processing (OLAP) cubes in multidimensional database platform
        • Cubes can deliver superior query performance because of the precalculations, indexing strategies, and other optimizations
        • The downside is that you pay a load performance price for these capabilities, especially with large data sets
    • Contains the same information as a normalized model, but packages the data in a format that delivers user understandability, query performance, and resilience to change
Both 3NF and dimensional models can be represented in ERDs because both consist of joined relational tables; the key difference between 3NF and dimensional models is the degree of normalization.

Although the capabilities of OLAP technology are continuously improving, we generally recommend that detailed, atomic information be loaded into a star schema; optional OLAP cubes are then populated from the star schema.[1]

Dimensional Modeling Case Study


Consider the business scenario for a fast food chain:[2]
The business objective is to create a data model that can store and report number of burgers and fries sold from a specific McDonalds outlet per day. 

Below are the steps used for dimensional modeling: 
  • Identify the dimensions
    • Dimensions 
      • Describe the “who, what, where, when, how, and why” associated with the business process measurement event (e.g. a sales transaction).
    • In the above scenario, we have 3 dimensions - "food" (e.g. burgers and fries), "store" and "day"
      • Separate dimension tables are created for separate dimensions
        • The dimension tables contain the textual context (normally with set of descriptive nouns that characterize the business process) associated with a measurement event. 
  • Identify the measurement events (or facts)
    • Measurement Events
      • A measurement event in the physical world has a one-to-one relationship to a single row in the corresponding fact table
    • In the above scenario, we have 1 measurement event - "quantity"
      • fact table is created for storing measures and foreign keys to the dimension tables
        • The fact table stores the "number" of food sold in "Quantity" column against a given store, food and day columns. 
        • These store/food/day columns are basically foreign key columns of the primary keys in respective dimension tables. 
  • Identify the attributes or properties of dimensions
    • Attributes (or Properties)
      • Each dimension might have number of different properties, but for a given context, not all of them are relevant for business
    • Knowing the properties let us decide what columns are required to be created in each dimension table.
    • In the above scenario, we could have
      • Food: name (burgers or fries)
      • Store: name, location, etc
      • Day: date
  • Identify the granularity of the measures
    • All the measurement rows in a fact table must be at the same grain (i.e., day or month). 
    • Having the discipline to create fact tables with a single level of detail ensures that measurements aren't inappropriately double-counted.
  • History Preservation (Optional)
    • Identify which dimensions are slowly changing (or fast changing or unchanging) is the last and final step of modeling (see video 1)
    • There are 8 different dimension types, but only 3 are commonly used:[3]
      • Type 0 - Fixed, non changing attribute
      • Type 1 - Changing attribute, no history kept
      • Type 2 - Most complex, keeps historical changes

Figure 2.  Sample rows from a dimension table with denormalized hierarchies

Summary

  • Dimensional Model (cf. Normalized Model )
    • Dimensional schema is simpler and symmetric
      • Business users benefit from the simplicity because the data is easier to understand and navigate
      • Database optimizers process these simple schemas with fewer joins more efficiently
      • Every dimension is equivalent; all dimensions are symmetrically-equal entry points into the fact table.
    • Dimensional models are gracefully extensible to accommodate change
      • With dimensional models, you can add completely new dimensions to the schema as long as a single value of that dimension is defined for each existing fact row.
  • Fact Tables
    • Fact tables tend to be deep in terms of the number of rows, but narrow in terms of the number of columns
    • The most useful facts are numeric and additive, such as dollar sales amount. 
      • Additivity is crucial because BI applications rarely retrieve a single fact table row.
        • However, you will see that facts are sometimes semi-additive (e.g., account balances) or even non-additive (e.g., unit prices). 
    • Facts are often described as continuously valued 
    • Fact tables usually make up 90 percent or more of the total space consumed by a dimensional model. 
    • All fact tables have two or more foreign keys that connect to the dimension tables' primary keys.
    • Fact tables (or bridge table) express many-to-many relationships
  • Dimension Tables
    • Dimension tables tend to be shallow in terms of the number of rows, but wide in terms of the number of columns
    • Each dimension is defined by a single primary key (surrogate key or natural key) , which serves as the basis for referential integrity with any given fact table to which it is joined.
    • Robust dimension attributes deliver robust analytic slicing-and-dicing capabilities.
      • In many ways, the data warehouse is only as good as the dimension attributes; the analytic power of the DW/BI environment is directly proportional to the quality and depth of the dimension attributes.
      • Dimension attributes serve as the primary source of query constraints, groupings, and report labels.
        • You should strive to minimize the use of codes or cryptic abbreviations in dimension tables by replacing them with more verbose textual attributes.
    • Dimension tables often represent hierarchical relationships (See Figure 2)
      • For example, products roll up into brands and then into categories
      • For each row in the product dimension, you should store the associated brand and category description. 
      • The hierarchical descriptive information is stored redundantly in the spirit of ease of use and query performance.
      • You should resist the habitual urge to normalize data (i.e., snowflaking)
        • You should almost always trade off dimension table space for simplicity and accessibility.
        • Because dimension tables typically are geometrically smaller than fact tables, improving storage efficiency by normalizing or snowflaking has virtually no impact on the overall database size. 
  • Fact or Dimension Attribute
    • When triaging operational source data, it is sometimes unclear whether a numeric data element is a fact or dimension attribute.  It is
      • A fact if
        • The column is a measurement that takes on lots of values and participates in calculations
      • A dimension attribute if
        • The column is a discretely valued description that is more or less constant and participates in constraints and row labels
      • Note:
        • Continuously valued numeric observations are almost always facts; discrete numeric observations drawn from a small list are almost always dimension attributes.

References

  1. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling
  2. What is dimensional modelling?
  3. Dimensional Modeling – Declaring Dimensions (Youtube)
  4. Learn Modern Data Visualization with Oracle Analytics

25 comments:

Unknown said...

Dalam permainan taruhan online, pertama-tama kita harus tahu berapa batas minimum deposit yang harus disiapkan untuk mulai bermain
asikqq
http://dewaqqq.club/
http://sumoqq.today/
interqq
pionpoker
bandar ceme
freebet tanpa deposit
paito warna terlengkap
syair sgp

Robert Green said...

Nice blog. I would like to share it with my friends. I hope you will continue your works like this. Keep up the excellent work. You have a magical talent for holding readers' minds. It is something special which cant be given to everyone.
Quickbooks support helps you with everything regarding QuickBooks software. If you are stuck in some general issues related to QuickBooks like installation error or payroll issues then we are here to help you anytime you want.

American Airlines Phone Number said...

QuickBooks Customer Service is available round the clock to resolve accounting problems. Reach experts at QuickBooks support phone Number and get instant help.

norhan said...


اعالى الخليج تقدم افضل خدمات نقل العفش الدولى المتميزه باسعار متميزة ومنها :

شركة شحن عفش من الرياض الى الامارات
نقل عفش من الرياض الى الاردن شركة شحن عفش من الرياض الى الاردن

Airlines booking said...

Get instant support and guidance before, during, or after the journey via Air China Customer Service, and mark an extraordinary journey with the optimum comfort.

Airlines booking said...

There has to be a reason for traveling. Book your tickets with Air China Phone Number and we can promise you that you won’t feel uncomfortable. All our flights are designed as per international standards, which is an important reason behind you to travel in Air China flights.

Airlines booking said...

Ensure a perfect journey with us with your airfares marked down with the biggest discounts. Call us at American Phone Number and relish your journey.

Airlines booking said...

Dial Southwest Phone Number and get best offers for every booking and fly with the cheap rates only with southwest airlines. you can enjoy at your amazing destinations. Our agents are there to help with best solutions over the Southwest Phone Number.

Airlines booking said...

You want to book an air ticket, then visit American Airlines official site to make your vacations and meetings comfortable. Check out our online portal for American Airlines Reservations and get the best discount and offers on your bookings. If you want more concessions, then call on our support number.

Airlines booking said...

Like all other airlines United Airlines flights also offer entertainment while you are in the flight. Book your air tickets with United Airlines official site and get the benefits of our exclusive and special entertainment systems installed in the flights for your entertainment throughout your journey.

molly watson said...

Hi, I am Molly. I have 12 years of rich experience in rendering technical issues for Epson Printer .Is Your Epson Printer Offline ? Connect with me to get the best possible fix .

Online Casino said...

Online Casino Spielautomaten | Bestes Online Casino: Entdecken Sie Neue Online Casinos.

molly watson said...

Call our experts to recieve best SBCGlobal Customer Service. We provide you the perfect solutions for Email at SBCGlobal Customer Service.

David William said...

Your blog is valuable us with significant information flight booking at,your new travel Spirit Phone Number booking associate. Contact our pros. It was the key bearers to show Premium Economy Each and each tip of your post are astonishing. Much refreshing for sharing. Keep blogging.if you any issue you can interface authorities groups master bunch best deals of Spirit Airline Phone Number.

Sahil said...

Excellent information providing by your Article, thank you for taking the time to share with us such a nice article. Amazing insight you have on this, it's nice to find a website that details so much information about different artists. Also visit the our website, we providing academic assignment writing at best prices ever.

molly watson said...

Our experts at Trend Micro Geek Squad provides the best technical help for installation, activation or setup or trend micro antivirus

customer services said...


Book a Geek Squad Appointment Scheduling for device installation, mending or troubleshooting. Reach us for assistance with Geek Squad Appointment Scheduling.

Unknown said...

Entdecken Sie die besten Online-Glücksspielseiten in unseren deutschland online casino. Überprüfen Sie unsere Kriterien für die Bewertung von Slots, Roulette, Poker und Blackjack.

petersidle said...

We help the households, offices and other venues to get Lizard Control services with our trusted service providers. Call us now!!

herryjhon said...

Awesome Blog, Thanks for providing important informations there.When you plan a great trip or vacation with your friends or family, the first and foremost thing that you do is searching for a cost-effective mode of traveling and the airline is flying to several destinations, so you can choose your destination to fly according to cheapest dates by Qatar Airways Customer Service .

herryjhon said...

If you are thinking of going to somewhere, but confused about when to go, then you can even toggle between the months to finally decide when the least expensive and best time to book airfare is ,So just visit at Turkish Airlines Fare Calendar and pick your date with our special offers, and be ready to feel the joy of traveling at budget-friendly prices!

herryjhon said...

The reality is traveling is not the cheap and expensive fare of transportation, especially air-transportation, often spoil plans just visit at Allegiant Airlines Fare Calendar where you don’t need to search over dates now and you will find the lowest airfare options for both domestic as well as international flights.. Get comfortable services to reach relaxed to your destination.

herryjhon said...

Thanks for your excellent blog, nice work keep it up thanks. We understand the need of guidance and information for the new and experienced travelers. It is time-consuming and baffling to deal with numerous issues So visit at Southwest Airlines Fare Calendar and you might easily check the cheapest fares among the listed dates around the tentative dates for your trips.

Amania said...

NOw watch and enjoy all the latest seriale turcesti subtitrat in hd. We provide provide daily all the seriale turecesti subtitrat in romana which you will be watch on despre seriale turcesti

chris demon said...

Jetblue Reservations
Allegiant Airlines Low Fare Calendar
jetblue customer service number
jetblue customer service
jetblue customer service phone number
frontier airlines customer service
frontier airlines customer service number
Delta Airlines Customer Service Number
delta airlines customer service phone number
Air Canada Customer Service