Sunday, February 17, 2019

OAC―Knowing the Dimensional Modelling Basics (1/2)

Video 1.  Dimensional Modeling – Declaring Dimensions (YouTube link)

Operational Processing vs Data Warehousing


One of the most important assets of any organization is its information. This asset is almost always used for two purposes:[1]



Operational Processing Analytical Decision Making
Flavor
Transactional Analytical
Main Data Flow
The operational systems are where you put data in The Data Warehousing and Business Intelligence (DW/BI) systems are where you get the data out
Optimization
Optimized to process transactions quickly Optimized for high-performance queries
# of Transactions in Processing
Almost always deal with one transaction record at a time Often require that many transactions be searched and compressed into an answer set
History Preservation
Typically do not maintain history, but rather update data to reflect the most current state Typically demand that historical context be preserved to accurately evaluate the organization's performance over time


Dimensional Modeling (DW/BI)


Dimensional modeling is one of the methods of data modeling, that is the preferred technique for presenting analytic data.  It helps us store the data in such a way that it is relatively easy to retrieve the data from the data once the data is stored in database.

This is the reason why dimensional modeling is used mostly in data warehouses built for reporting. On the other side, dimensional model is not a good solution if your primary purpose of your data modeling is to reduce storage space requirement, reduce redundancy, speed-up loading time etc.[1]

Figure 1.  Star Schema (left) vs OLAP Cube (right)

3NF Model vs Dimensional Model


Although dimensional models are often instantiated in relational database management systems (RDMS), they are quite different from third normal form (3NF) models which seek to remove data redundancies:
  • 3NF Model (or Normalized Model)
    • Divides data into many discrete entities, each of which becomes a relational table
    • Sometimes are referred as entity-relationship (ER) models
    • Designed to reduce the duplication of data and ensure referential integrity
    • Designed to improve database processing while minimizing storage costs
    • Useful in operational processing because an update or insert transaction touches the database in only one place
      • However, are too complicated for BI queries. 
  • Dimensional Model (Star Schemas and OLAP Cubes)
    • Both stars and cubes have a common logical design with recognizable dimensions; however, the physical implementation differs (see Figure 1):
      • Star Schemas
        • Referred to as star schemas in RDBS because of their resemblance to a star-like structure in RDMS implementation
      • OLAP Cubes
        • Referred to as online analytical processing (OLAP) cubes in multidimensional database platform
        • Cubes can deliver superior query performance because of the precalculations, indexing strategies, and other optimizations
        • The downside is that you pay a load performance price for these capabilities, especially with large data sets
    • Contains the same information as a normalized model, but packages the data in a format that delivers user understandability, query performance, and resilience to change
Both 3NF and dimensional models can be represented in ERDs because both consist of joined relational tables; the key difference between 3NF and dimensional models is the degree of normalization.

Although the capabilities of OLAP technology are continuously improving, we generally recommend that detailed, atomic information be loaded into a star schema; optional OLAP cubes are then populated from the star schema.[1]

Dimensional Modeling Case Study


Consider the business scenario for a fast food chain:[2]
The business objective is to create a data model that can store and report number of burgers and fries sold from a specific McDonalds outlet per day. 

Below are the steps used for dimensional modeling: 
  • Identify the dimensions
    • Dimensions 
      • Describe the “who, what, where, when, how, and why” associated with the business process measurement event (e.g. a sales transaction).
    • In the above scenario, we have 3 dimensions - "food" (e.g. burgers and fries), "store" and "day"
      • Separate dimension tables are created for separate dimensions
        • The dimension tables contain the textual context (normally with set of descriptive nouns that characterize the business process) associated with a measurement event. 
  • Identify the measurement events (or facts)
    • Measurement Events
      • A measurement event in the physical world has a one-to-one relationship to a single row in the corresponding fact table
    • In the above scenario, we have 1 measurement event - "quantity"
      • fact table is created for storing measures and foreign keys to the dimension tables
        • The fact table stores the "number" of food sold in "Quantity" column against a given store, food and day columns. 
        • These store/food/day columns are basically foreign key columns of the primary keys in respective dimension tables. 
  • Identify the attributes or properties of dimensions
    • Attributes (or Properties)
      • Each dimension might have number of different properties, but for a given context, not all of them are relevant for business
    • Knowing the properties let us decide what columns are required to be created in each dimension table.
    • In the above scenario, we could have
      • Food: name (burgers or fries)
      • Store: name, location, etc
      • Day: date
  • Identify the granularity of the measures
    • All the measurement rows in a fact table must be at the same grain (i.e., day or month). 
    • Having the discipline to create fact tables with a single level of detail ensures that measurements aren't inappropriately double-counted.
  • History Preservation (Optional)
    • Identify which dimensions are slowly changing (or fast changing or unchanging) is the last and final step of modeling (see video 1)
    • There are 8 different dimension types, but only 3 are commonly used:[3]
      • Type 0 - Fixed, non changing attribute
      • Type 1 - Changing attribute, no history kept
      • Type 2 - Most complex, keeps historical changes

Figure 2.  Sample rows from a dimension table with denormalized hierarchies

Summary

  • Dimensional Model (cf. Normalized Model )
    • Dimensional schema is simpler and symmetric
      • Business users benefit from the simplicity because the data is easier to understand and navigate
      • Database optimizers process these simple schemas with fewer joins more efficiently
      • Every dimension is equivalent; all dimensions are symmetrically-equal entry points into the fact table.
    • Dimensional models are gracefully extensible to accommodate change
      • With dimensional models, you can add completely new dimensions to the schema as long as a single value of that dimension is defined for each existing fact row.
  • Fact Tables
    • Fact tables tend to be deep in terms of the number of rows, but narrow in terms of the number of columns
    • The most useful facts are numeric and additive, such as dollar sales amount. 
      • Additivity is crucial because BI applications rarely retrieve a single fact table row.
        • However, you will see that facts are sometimes semi-additive (e.g., account balances) or even non-additive (e.g., unit prices). 
    • Facts are often described as continuously valued 
    • Fact tables usually make up 90 percent or more of the total space consumed by a dimensional model. 
    • All fact tables have two or more foreign keys that connect to the dimension tables' primary keys.
    • Fact tables (or bridge table) express many-to-many relationships
  • Dimension Tables
    • Dimension tables tend to be shallow in terms of the number of rows, but wide in terms of the number of columns
    • Each dimension is defined by a single primary key (surrogate key or natural key) , which serves as the basis for referential integrity with any given fact table to which it is joined.
    • Robust dimension attributes deliver robust analytic slicing-and-dicing capabilities.
      • In many ways, the data warehouse is only as good as the dimension attributes; the analytic power of the DW/BI environment is directly proportional to the quality and depth of the dimension attributes.
      • Dimension attributes serve as the primary source of query constraints, groupings, and report labels.
        • You should strive to minimize the use of codes or cryptic abbreviations in dimension tables by replacing them with more verbose textual attributes.
    • Dimension tables often represent hierarchical relationships (See Figure 2)
      • For example, products roll up into brands and then into categories
      • For each row in the product dimension, you should store the associated brand and category description. 
      • The hierarchical descriptive information is stored redundantly in the spirit of ease of use and query performance.
      • You should resist the habitual urge to normalize data (i.e., snowflaking)
        • You should almost always trade off dimension table space for simplicity and accessibility.
        • Because dimension tables typically are geometrically smaller than fact tables, improving storage efficiency by normalizing or snowflaking has virtually no impact on the overall database size. 
  • Fact or Dimension Attribute
    • When triaging operational source data, it is sometimes unclear whether a numeric data element is a fact or dimension attribute.  It is
      • A fact if
        • The column is a measurement that takes on lots of values and participates in calculations
      • A dimension attribute if
        • The column is a discretely valued description that is more or less constant and participates in constraints and row labels
      • Note:
        • Continuously valued numeric observations are almost always facts; discrete numeric observations drawn from a small list are almost always dimension attributes.

References

  1. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling
  2. What is dimensional modelling?
  3. Dimensional Modeling – Declaring Dimensions (Youtube)
  4. Learn Modern Data Visualization with Oracle Analytics
  5. Click here for more A-Team Oracle Analytics (OAC) Blogs.

35 comments:

  1. Dalam permainan taruhan online, pertama-tama kita harus tahu berapa batas minimum deposit yang harus disiapkan untuk mulai bermain
    asikqq
    http://dewaqqq.club/
    http://sumoqq.today/
    interqq
    pionpoker
    bandar ceme
    freebet tanpa deposit
    paito warna terlengkap
    syair sgp

    ReplyDelete
  2. Nice blog. I would like to share it with my friends. I hope you will continue your works like this. Keep up the excellent work. You have a magical talent for holding readers' minds. It is something special which cant be given to everyone.
    Quickbooks support helps you with everything regarding QuickBooks software. If you are stuck in some general issues related to QuickBooks like installation error or payroll issues then we are here to help you anytime you want.

    ReplyDelete
  3. There has to be a reason for traveling. Book your tickets with Air China Phone Number and we can promise you that you won’t feel uncomfortable. All our flights are designed as per international standards, which is an important reason behind you to travel in Air China flights.

    ReplyDelete
  4. Hi, I am Molly. I have 12 years of rich experience in rendering technical issues for Epson Printer .Is Your Epson Printer Offline ? Connect with me to get the best possible fix .

    ReplyDelete
  5. Your blog is valuable us with significant information flight booking at,your new travel Spirit Phone Number booking associate. Contact our pros. It was the key bearers to show Premium Economy Each and each tip of your post are astonishing. Much refreshing for sharing. Keep blogging.if you any issue you can interface authorities groups master bunch best deals of Spirit Airline Phone Number.

    ReplyDelete
  6. Excellent information providing by your Article, thank you for taking the time to share with us such a nice article. Amazing insight you have on this, it's nice to find a website that details so much information about different artists. Also visit the our website, we providing academic assignment writing at best prices ever.

    ReplyDelete

  7. Book a Geek Squad Appointment Scheduling for device installation, mending or troubleshooting. Reach us for assistance with Geek Squad Appointment Scheduling.

    ReplyDelete
  8. Entdecken Sie die besten Online-Glücksspielseiten in unseren deutschland online casino. Überprüfen Sie unsere Kriterien für die Bewertung von Slots, Roulette, Poker und Blackjack.

    ReplyDelete
  9. Thanks for your excellent blog, nice work keep it up thanks. We understand the need of guidance and information for the new and experienced travelers. It is time-consuming and baffling to deal with numerous issues So visit at Southwest Airlines Fare Calendar and you might easily check the cheapest fares among the listed dates around the tentative dates for your trips.

    ReplyDelete
  10. NOw watch and enjoy all the latest seriale turcesti subtitrat in hd. We provide provide daily all the seriale turecesti subtitrat in romana which you will be watch on despre seriale turcesti

    ReplyDelete
  11. Download aplikasi idn poker android resmi, hanya di situs pokerwan. agen pokerwan ini menjalankan permainan judi poker online uang asli deposit pulsa sejak tahun 2015.
    http://139.59.64.229/jackpot.php
    Dengan sistem server terbaru yang kami punya, anda bisa mendapatkan jackpot ratusan juta rupiah dengan cepat di situs poker online pokerwan. Jangan takut menerima kekalahan di situs kami, karena kami mempunyai bonus turnover khusus yang kami persembahkan kepada player poker online sekalian. Dengan cara ini anda bisa mendapatkan bonus judi online terbesar meskipun saat kalah. Ayo gabung bersama kami sekarang juga.

    ReplyDelete
  12. Dapatkan jackpot ratusan juta rupiah dengan cepat dan mudah, ketika anda bermain judi poker online di situs idn poker Pokerwan. Kami adalah agen judi poker online terpercaya yang sudah mendapatkan lisensi resmi dari pihak PAGCOR dan BMMTestLab.
    http://capsawan.com/
    Sehingga anda bisa menikmati permainan poker online indonesia tanpa ada BOT dan SCAM sama sekali. Daftar poker online di situs ini amatlah sangat mudah, hanya dengan biaya 10.000 rupiah saja. Anda sudah bisa menikmati game kartu online di situs Pokerwan. Mari bergabung dan rasakan sensasi bermain judi poker online Indonesia di situs Pokerwan.

    ReplyDelete
  13. To be honest I found very helpful information your blog thanks for providing us such blog DC vs SRH | Cricket Betting Tips

    ReplyDelete

  14. To be honest I found very helpful information your blog thanks for providing us such blog CSK vs MI Dream11 Team Prediction

    ReplyDelete

  15. To be honest I found very helpful information your blog thanks for providing us such blog ST-W vs AS-W Dream11 Team Prediction

    ReplyDelete
  16. To be honest I found very helpful information your blog thanks for providing us such blog AUS vs IND Dream11 Team Prediction

    ReplyDelete
  17. To be honest I found very helpful information your blog thanks for providing us such blog SA vs ENG Dream11 Team Prediction

    ReplyDelete
  18. A well written Content provided by you. Your writing skills are perfect Spirit Airlines Last Minute Flights. i hope so you will write more content like this but if you are searching for flights then contact us or visit our website.

    ReplyDelete
  19. Independent Ratings and Reviews of Online Casinos and Betting Sites. Best Online Casino Bonuses and Promotions & Top rated Slots - Stakers

    ReplyDelete
  20. Thanks for sharing such a good article. This is really helpful. Best Oracle Cloud Automated Testing Services

    ReplyDelete
  21. There are manydata warehouse architecture models but a common approach is the 3-tier model, which consists of four basic layers: the data store, which typically is a relational database; the data access layer, which interacts with the data store by using Structured Query Language (SQL); the business logic layer, which processes the data; and the application layer, which provides an interface for accessing the warehouse.

    ReplyDelete
  22. Via online casino 2go vind je een overzicht van alle online casinos in Nederland.

    ReplyDelete
  23. An Entity Relationship Diagram (ERD) is a visual representation of different entities within a system and how they relate to each other

    ReplyDelete
  24. Teknobuilt serves the world with professional construction approaches from planning to execution of the project. Our solutions are based on your needs and this is why we keep our focus on ensuring productivity, quality, and excellence in our services. We also provide Digital Construction and occupational health safety and environment services for better development. For more information, contact @ +91-11-41072395, or you can mail us at info@teknobuilt.com.

    ReplyDelete
  25. This comment has been removed by the author.

    ReplyDelete
  26. www.amazon.com/mytv. This business entity is independent and has no relationship with Amazon or any of its subsidiaries. You can also find the information on Amazon's official site. amazon code
    amazon.com/code
    paypal login
    play.google.com/redeem
    amazon.com/code

    ReplyDelete
  27. Amazon video activation code is six-digit code, which includes numbers and letters. Ex. TKMHYU. To prevent misuse, it is important to keep the Amazon activation code secure. You can either get the Amazon.com activation code with your www.Amazon.com/mytv product or by email.
    amazon mytv
    amazon.com/code

    ReplyDelete
  28. To activate your disneyplus visit disneyplus.com/begin and follow the activation process.
    disneyplus.com/begin

    To activate your mintmobile visit mintmobile.com/activate and follow the activation process.
    mintmobile.com/activate

    To activate your T-Mobile sim visit t-mobile.com/sim and follow the activation process.
    t-mobile.com/sim

    ReplyDelete
  29. I read your blog really useful keep sharing more.
    Men's Suits for Sale

    ReplyDelete
  30. Very Good Content !! have Any about Quickbooks Software ? So you can call at
    Quickbooks Customer Support +1 267-773-4333For any help

    ReplyDelete
  31. At DJI Bangladesh Shop, we take pride in being your one-stop destination for all things DJI. Whether you are an aspiring filmmaker, a content creator, or simply someone passionate about capturing life's moments in the best possible way, our range of top-notch products will surely leave you awe-inspired.

    ReplyDelete