Thursday, September 26, 2019

Data Warehousing - Slowly Changing Dimension (SCD)

Types of Dimensions:
  1. Slowly Changing Dimension
  2. Conformed Dimension
  3. Degenearte Dimension
  4. Junk Dimension
  5. Role Playing Dimension
Slowly Changing Dimension (SCD):

Dimension attributes that change slowly over a period of time rather than changing regularly is group as SCD's.

Let's consider an example of a person changing his/her city from city 1 to city 2 and this change does not happen at regular intervals.

Examples: Name, Job, Address, Phone Number

There are 3 ways in which these types of attributes are handled.

SCD Type - 1: Overwrite with Old value
SCD Type - 2: Add a new Row/Record
SCD Type - 3: Add a new Column

SCD Type - 1: Overwrite with Old value
*******************************************************************************
Table 1: Before Change


Suppose Raj has changed his city from Tirupati to Bangalore in 2014, the values of the Table 1 is overwritten as shown below.

Table 2: After Change


The Advantage of above approach is that is very easy to follow and results in huge space savings and hence cost savings.

The Disadvantage is that no history is maintained.

SCD Type - 2: Add a new Row/Record
********************************************************************************
Table 1: Before Change



Suppose Raj has changed his city from Tirupati to Bangalore in 2014, new row is added as shown in Table 2.

Table 2: After Change


The Advantage of above approach is that complete history is maintained.

The disadvantage is that it results in huge space allocation because the entire history from the start has to be maintained.

It is relatively difficult to determine which record is the most recent.
A sub query has to be written to fetch the most recent record for each ID
To overcome these, there are 2 approaches to indicate most c current record.

Approach 1 - Use the Start Date and End Date

Suppose Raj started living in Tirupati from 1st June 2011, hence for the very first time the record will look as below:

If  Raj moves to Bangalore on 1st April 2014, then the record as per 1st approach will look as below:


Hence the most record can be now identified by using the condition End date > Current Date

Approach 2 - Use an Indicator

First time when a record arrives with Tirupati as the city, the indicator will be set as Y




When the person changes the city then the indicator of the previous record is changed to  N and the indicator of the current record is changed to Y


Hence to get the most recent record, just use the condition WHERE current_flag='Y'

SCD Type - 3: Add a new Column
********************************************************************************
Sometimes it is not required to maintain the entire history of a slowly changing dimension.

We might want to retain the current record and the immediate previous record. In such cases instead of adding new rows, it is beneficial to add columns as illustrated below.

For the first time when Raj moves to Tirupati the old city column will be blank.


When he moves to Bangalore, the city column becomes Bangalore and the old city column will be Tirupati


The Disadvantage of the above approach is clearly seen in the table above.
For each attribute, there must be 2 columns, one indicating the current value and the other indicating the previous value.

1 comment:

Delete Row Button to all records of Interactive Report in Oracle Apex

 1. add 'Delete' Del column to Report Query 2. Set the Following Properties for the DEL Column Type: Link Heading: Delete Targ...