Data Mining: OLAP Operations
OLAP Operations
o Roll-up
o Drill-down
o Slice
o Dice
o Pivot
[Figure: OLAP operations on a data cube]
1) Roll-up: -
o The roll-up operation performs aggregation on a data cube by climbing up a concept hierarchy for a dimension.
o Here Tumkur and Mysore are both rolled up to the country “India”, and Pune and Bombay are (for this example) assumed to roll up to “U.S.A”. Hence the location dimension moves from the city level to the less detailed country level.
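The city-to-country roll-up described above can be sketched in a few lines of Python. This is only an illustration: the unit counts, the quarter label, and the names `sales_by_city`, `city_to_country` and `roll_up` are assumptions made up for the example, and the city-to-country mapping simply follows the example in the text.

```python
from collections import defaultdict

# Hypothetical facts at city level: (city, quarter) -> units sold.
sales_by_city = {
    ("Tumkur", "Q1"): 10,
    ("Mysore", "Q1"): 20,
    ("Pune", "Q1"): 5,
    ("Bombay", "Q1"): 15,
}

# Concept hierarchy city -> country, as assumed in the example in the text.
city_to_country = {
    "Tumkur": "India",
    "Mysore": "India",
    "Pune": "U.S.A",
    "Bombay": "U.S.A",
}

def roll_up(facts, hierarchy):
    """Aggregate the measure after replacing each city by its parent level."""
    totals = defaultdict(int)
    for (city, quarter), units in facts.items():
        totals[(hierarchy[city], quarter)] += units
    return dict(totals)

sales_by_country = roll_up(sales_by_city, city_to_country)
print(sales_by_country)
```

Four city-level cells collapse into two country-level cells, which is the loss of detail that roll-up trades for a more compact summary.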
2) Drill-down: -
o It is the reverse operation of roll-up.
o Drill-down is performed by stepping down a concept hierarchy or by adding new dimensions to the cube.
o It moves from less detailed data to more detailed data. For example, consider adding a more detailed level to the time dimension.
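A minimal sketch of drill-down on the time dimension follows, assuming the facts are stored at month level so that a quarter can be expanded into its months. The figures and the names `monthly_sales`, `by_quarter` and `drill_down` are hypothetical.

```python
# Hypothetical facts stored at the finest level: (quarter, month) -> units sold.
monthly_sales = {
    ("Q1", "Jan"): 4, ("Q1", "Feb"): 6, ("Q1", "Mar"): 5,
    ("Q2", "Apr"): 7, ("Q2", "May"): 3, ("Q2", "Jun"): 8,
}

def by_quarter(facts):
    """Summary view: total units per quarter."""
    totals = {}
    for (quarter, _month), units in facts.items():
        totals[quarter] = totals.get(quarter, 0) + units
    return totals

def drill_down(facts, quarter):
    """Detailed view: step down from one quarter to its months."""
    return {month: units for (q, month), units in facts.items() if q == quarter}

quarter_view = by_quarter(monthly_sales)
q1_months = drill_down(monthly_sales, "Q1")
print(quarter_view)
print(q1_months)
```

The detailed view can only be produced because the month-level data was kept; drill-down reveals detail that is already stored rather than creating it.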
3) Slice: -
It performs a selection on one dimension of the given cube and provides a new sub-cube.
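As an illustration of slicing, the sketch below fixes the time dimension of a small three-dimensional cube to a single value. The cube contents and the helper name `slice_cube` are assumptions for the example.

```python
# Hypothetical cube: (city, quarter, item) -> units sold.
cube = {
    ("Mysore", "Q1", "book"): 3,
    ("Mysore", "Q2", "book"): 4,
    ("Tumkur", "Q1", "pen"): 7,
    ("Tumkur", "Q2", "pen"): 2,
}

def slice_cube(cube, axis, value):
    """Fix the dimension at position `axis` to `value` and drop it from the key."""
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }

# Select quarter == "Q1"; the result has only the city and item dimensions.
q1_slice = slice_cube(cube, axis=1, value="Q1")
print(q1_slice)
```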
4) Dice: -
This operation performs a selection on two or more dimensions of a given cube and provides a new sub-cube.
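A dice can be sketched as a filter on two dimensions at once. The cube contents and the function name `dice` below are hypothetical; the result keeps all three dimensions but only the cells that satisfy both conditions.

```python
# Hypothetical cube: (city, quarter, item) -> units sold.
cube = {
    ("Mysore", "Q1", "book"): 3,
    ("Mysore", "Q2", "book"): 4,
    ("Tumkur", "Q1", "pen"): 7,
    ("Pune", "Q3", "pen"): 2,
}

def dice(cube, cities, quarters):
    """Keep cells whose city AND quarter both fall in the requested sets."""
    return {
        key: measure
        for key, measure in cube.items()
        if key[0] in cities and key[1] in quarters
    }

sub_cube = dice(cube, cities={"Mysore", "Tumkur"}, quarters={"Q1", "Q2"})
print(sub_cube)
```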
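5) Pivot: -
This operation, also called rotation, rotates the data axes of the cube to give an alternative presentation of the same data.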
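As a small illustration of pivoting, the sketch below swaps the row and column axes of a two-dimensional view. The table contents and the function name `pivot` are assumptions for the example; the values themselves are unchanged, only their arrangement differs.

```python
# Hypothetical 2-D view: rows are cities, columns are quarters.
view = {
    "Mysore": {"Q1": 3, "Q2": 4},
    "Tumkur": {"Q1": 7, "Q2": 2},
}

def pivot(table):
    """Swap the row and column axes of a nested-dict table."""
    rotated = {}
    for row, cols in table.items():
        for col, value in cols.items():
            rotated.setdefault(col, {})[row] = value
    return rotated

rotated = pivot(view)  # rows are now quarters, columns are cities
print(rotated)
```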
Data Warehouse Schemas
A schema is the overall structure or design of database objects such as tables, views and indexes.
A data warehouse is designed using one of these three schemas:
o Star Schema
o Snowflake Schema
o Fact Constellation Schema
1. Star Schema:
o It has only one fact table, which stores foreign keys that refer to a number of dimension tables.
o In this schema the dimension tables are not normalized. This means fewer tables are used, and therefore fewer joins are needed.
Ex:
Consider four dimension tables (book, college, employee and student) and one fact table, university.
o In the above star schema, “University” is the fact table, which refers to all the other (dimension) tables.
o This schema is well suited to query processing because queries stay simple: each dimension table is reached from the fact table with a single join.
2. Snowflake schema:
o It is similar to the star schema and also has only one fact table.
o The difference is that in this schema the dimension tables are normalized, so a dimension may be split into further tables.
o Because the tables are normalized, more tables are used and more joins are needed in order to answer a query.
o The advantage is less redundancy because of normalization, so the dimension tables are easier to maintain.
Ex:
Consider the same four dimension tables, but this time we take the book table to the next level of normalization.
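A minimal sketch of snowflaking the book dimension, using Python's built-in `sqlite3` module. The `publisher` table, all column names and the sample rows are assumptions for illustration; the source does not say which attribute the book table is normalized on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical second-level table split out of the book dimension.
CREATE TABLE publisher (publisher_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book (
    book_id      INTEGER PRIMARY KEY,
    title        TEXT,
    publisher_id INTEGER REFERENCES publisher(publisher_id)
);
-- Fact table reduced to one dimension key and one measure for brevity.
CREATE TABLE university (
    book_id INTEGER REFERENCES book(book_id),
    fee     REAL
);
INSERT INTO publisher VALUES (1, 'Publisher P');
INSERT INTO book VALUES (1, 'Data Mining', 1), (2, 'Databases', 1);
INSERT INTO university VALUES (1, 100.0), (2, 50.0);
""")

# Reaching the publisher now takes two joins instead of one.
rows = conn.execute("""
    SELECT p.name, SUM(u.fee)
    FROM university AS u
    JOIN book AS b ON b.book_id = u.book_id
    JOIN publisher AS p ON p.publisher_id = b.publisher_id
    GROUP BY p.name
""").fetchall()
print(rows)
```

The publisher name is stored once rather than repeated on every book row, at the cost of the extra join in the query.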
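3. Fact constellation schema:
o In this schema we can use multiple fact tables that share common dimension tables.
o The dimension tables can also be very large in this schema. For example, a sales fact table and a second fact table can both refer to the same dimension tables.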
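The idea of two fact tables sharing one dimension table can be sketched as follows. The second fact table is left generic (`second_fact`) because the source does not name it, and the item dimension, the figures and the helper `total_by_item` are all hypothetical.

```python
# Shared dimension table: item_id -> item name.
item_dim = {1: "book", 2: "pen"}

# Two fact tables that both reference item_dim through item_id.
sales_fact = [(1, 10), (2, 4), (1, 6)]   # (item_id, units sold)
second_fact = [(1, 2), (2, 1)]           # (item_id, units) for a hypothetical second process

def total_by_item(fact_rows, dim):
    """Join a fact table to the shared dimension and total the measure per item."""
    totals = {}
    for item_id, units in fact_rows:
        name = dim[item_id]
        totals[name] = totals.get(name, 0) + units
    return totals

sales_totals = total_by_item(sales_fact, item_dim)
second_totals = total_by_item(second_fact, item_dim)
print(sales_totals, second_totals)
```

Because both results are keyed by the same dimension values, they can be compared item by item without any extra mapping.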
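ETL Process
o The process of moving data from traditional databases to a data warehouse is called the ETL (Extract, Transform, Load) process.
o Transactional databases are not suited to answering complex analytical questions, so ETL is used to move their data into a data warehouse where such questions can be answered.
o ETL provides a method of moving data from various sources to the data warehouse.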
1. Data extraction:
Data is extracted from different sources (Source 1, Source 2, Source 3 and so on) into a staging area.
2. Data transformation:
After extraction the data is raw and not directly useful, so it must be transformed before it can be used.
Here, the staging area gives an opportunity to validate the extracted data before it is moved to the data warehouse.
3. Data loading:
In this step, the transformed data is finally loaded into the data warehouse on a daily, weekly, monthly or yearly schedule.
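The three ETL steps can be sketched end to end with the Python standard library. Everything specific here is an assumption for illustration: the two in-memory CSV sources, the particular cleaning rules (trimming, capitalising city names, rejecting rows with no city), and the names `extract`, `transform` and `load`. An in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Two hypothetical sources holding the same kind of record in different formats.
SOURCE_1 = "city,units\nMysore, 3\nTumkur,7\n"
SOURCE_2 = "CITY;UNITS\npune;2\n;5\n"

def extract(text, delimiter):
    """Read raw rows from one source into the staging area (a plain list)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    next(reader)  # skip the header row
    return list(reader)

def transform(staged_rows):
    """Validate and standardise the staged rows."""
    clean = []
    for city, units in staged_rows:
        city = city.strip().title()
        if not city:
            continue  # reject rows with a missing city
        clean.append((city, int(units)))
    return clean

def load(rows, conn):
    """Insert the transformed rows into the warehouse table."""
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (city TEXT, units INTEGER)")

staging = extract(SOURCE_1, ",") + extract(SOURCE_2, ";")
load(transform(staging), warehouse)

loaded = warehouse.execute("SELECT city, units FROM sales ORDER BY city").fetchall()
print(loaded)
```

In a scheduled setting the same three calls would simply be run once per load period against that period's source data.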