
Group Assignment 1

Advanced Database Systems


Group Member
1. 1701497840 Alexander Gunawan
2. 1701498383 Alpha Epsilon
3. 1701497903 Armandha Aria
4. 1701497885 Ghema Nusa Persada
5. 1701497872 Rico Malibu

Question / Problem
A large company has 6 departments: Marketing, Inventory, Purchasing, Finance, IT
and HR. The company wants to set up a data warehouse environment for its HR
Department, because 4 of its departments (Marketing, Inventory, Finance and
Purchasing) have already been set up with a data warehouse environment.

The design is based on transactions such as daily attendance and leave, and on
other HR Department data: the list of dependents (such as spouse or children),
employee position (manager, supervisor or staff), employee degree (bachelor,
master or doctoral) and the list of all departments. Shown above is the design of
the class diagram of the OLTP (On-line Transactional Processing) / TPS
(Transactional Processing System) of the HR Department.
1. What kind of data warehouse implementation approach is suitable for this
scheme? Give your reason.
2. What kind of data warehouse architecture is suitable for this HR Department?
Please illustrate that data warehouse architecture.
3. Design a star schema for this HR Department based on 1 example report created
by you, and explain what is inside the schema: how many tables there are, what
the tables are named, how many primary, foreign and composite keys there are,
and what they are.
4. Design a snowflake schema for this HR Department based on 1 example report
created by you, and explain what is inside the schema: how many tables there
are, what the tables are named, how many primary, foreign and composite keys
there are, and what they are.

Answer / Solution
Revised Diagram
An additional field is added to resolve an undefined relation regarding the Manager
of a Department and to simplify the logical relation among Employee, Position and
Department.

Revised Column List


To reflect the revision, the column list below shows the relations between tables.

Table       Column         Key  Join key              Values
Employee    Emp_ID         PK
            Emp_firstName
            Emp_lastName
            Emp_DOB
            Emp_phone
            Emp_email
            Emp_hiredate
            Emp_status                                active, inactive
            Emp_Mgr_ID     SK   Employee.Emp_ID
Emp_Dep     Emp_ID         FK   Employee.Emp_ID
            Dep_ID         FK   Department.Dep_ID
            Dep_Start      PK
            Dep_finish
Department  Dep_ID         PK                         01=HR, 02=Marketing, 03=Inventory, 04=Finance, 05=Purchasing
            Dep_name                                  HR, Marketing, Inventory, Finance, Purchasing
            Dep_Mgr_ID     FK   Employee.Emp_ID
Emp_Pos     Emp_ID         CK   Employee.Emp_ID
            Pos_Code       CK
            Pos_start
            Pos_Finish
            Pos_salary
Position    Pos_code       PK                         01=Manager, 02=Supervisor, 03=Staff
            Pos_name                                  Manager, Supervisor, Staff
Attendance  Emp_ID         CK   Employee.Emp_ID
            Att_Date       CK
            Att_timeIN
            Att_timeout
Emp_Leave   Emp_ID         CK   Employee.Emp_ID
            Leave_code     CK
            Leave_start
            Leave_Finish
Leave       Leave_code     PK                         01=Yearly, 02=Maternity, 03=Religion, 04=Unpaid, 05=Other
            leave_type                                Yearly, Maternity, Religion, Unpaid, Other
Emp_Depen   Emp_ID         CK   Employee.Emp_ID
            Depen_Code     CK   Dependent.Depen_Code
            Depen_Name
            Depen_DOB
Dependent   Depen_code     PK                         01=Spouse, 02=Child, 03=Parent, 04=Other
            Depen_type                                Spouse, Child, Parent, Other
Emp_Degree  Emp_ID         CK   Employee.Emp_ID
            Deg_code       CK   Degree.Deg_Code
            Name_univ
            Deg_start
            Deg_finish
            Deg_GPA
Degree      Deg_code       PK                         01=Bachelor, 02=Master, 03=Doctoral, 04=Other
            Deg_type                                  Bachelor, Master, Doctoral, Other

Suitable Data Warehouse Implementation


The dimensional approach refers to Ralph Kimball's approach, which states that the
data warehouse should be modeled using a dimensional model/star schema. The
normalized approach, also called the 3NF (Third Normal Form) model, refers to Bill
Inmon's approach, which states that the data warehouse should be modeled using
an E-R model/normalized model.
In a dimensional approach, transaction data are partitioned into "facts", which are
generally numeric transaction data, and "dimensions", which are the reference
information that gives context to the facts. For example, a sales transaction can be
broken up into facts such as the number of products ordered and the price paid for
the products, and into dimensions such as order date, customer name, product
number, order ship-to and bill-to locations, and salesperson responsible for
receiving the order.
Key advantages of the dimensional approach:
- Quick data retrieval from the data warehouse
- Easy for business users to understand, because the structure is divided into
  measurements/facts and context/dimensions
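
The split between facts and dimensions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the column names (order_date, customer, product_no, quantity, price_paid) and the helper functions are hypothetical and are not part of the assignment data.

```python
from collections import defaultdict

# Hypothetical sales transactions (illustrative data only).
transactions = [
    {"order_date": "2015-03-01", "customer": "Acme", "product_no": "P-1",
     "quantity": 2, "price_paid": 20.0},
    {"order_date": "2015-03-01", "customer": "Acme", "product_no": "P-2",
     "quantity": 1, "price_paid": 30.0},
    {"order_date": "2015-03-02", "customer": "Globex", "product_no": "P-1",
     "quantity": 1, "price_paid": 15.0},
]

# Dimensions give context; facts are the numeric measurements.
DIMENSIONS = ("order_date", "customer", "product_no")
FACTS = ("quantity", "price_paid")


def split_row(row):
    """Separate one transaction into its dimension part and its fact part."""
    dims = {name: row[name] for name in DIMENSIONS}
    facts = {name: row[name] for name in FACTS}
    return dims, facts


def total_by(dimension, rows, fact="price_paid"):
    """Sum one fact, grouped by the values of one dimension."""
    totals = defaultdict(float)
    for row in rows:
        dims, facts = split_row(row)
        totals[dims[dimension]] += facts[fact]
    return dict(totals)


print(total_by("customer", transactions))
print(total_by("product_no", transactions))
```

The same numeric fact can be summed along any dimension without changing the data, which is what makes this structure easy for business users to query.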

Suitable Data Warehouse Architecture


Available Architectures
1. Basic / Single-Layer Architecture
A single-layer architecture is not frequently used in practice. Its goal is to
minimize the amount of data stored; to reach this goal, it removes data
redundancies. Only one layer is physically available: the source layer. In this
case, data warehouses are virtual. This means that a data warehouse is
implemented as a multidimensional view of operational data created by
specific middleware, or an intermediate processing layer (Devlin, 1997).

2. With a Staging Area / Two-Layer Architecture


The requirement for separation plays a fundamental role in defining the
typical architecture for a data warehouse system, as shown below. Although it
is typically called a two-layer architecture to highlight a separation between
physically available sources and data warehouses, it actually consists of four
subsequent data flow stages (Lechtenbörger, 2001).

3. With a Staging Area and Data Marts


Ralph Kimball, on the other hand, advocates what he calls a bus architecture
data warehouse. His methodology specifies conformed dimensions, where
multiple fact tables share common dimensional tables. Each of these fact
tables represents a data mart. The row of dimensional tables that all the fact
tables plug into is the bus, and because, for example, the finance and the
sales data marts both use the same product dimension table there is
integration between departments.
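
The idea of a conformed dimension shared by several fact tables can be sketched with Python's built-in sqlite3 module. This is a minimal sketch, assuming hypothetical table names (dim_product, fact_sales, fact_finance) and made-up rows; it follows the finance/sales example in the paragraph above rather than the HR schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One conformed dimension, shared by both data marts (hypothetical names).
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
-- Each fact table represents one data mart.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    units_sold INTEGER);
CREATE TABLE fact_finance (
    product_id INTEGER REFERENCES dim_product(product_id),
    cost REAL);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales VALUES (1, 10), (1, 5), (2, 3);
INSERT INTO fact_finance VALUES (1, 40.0), (2, 25.0);
""")


def drill_across(connection):
    """Combine both data marts through the shared product dimension."""
    return connection.execute("""
        SELECT p.product_name, s.units, f.cost
        FROM dim_product AS p
        JOIN (SELECT product_id, SUM(units_sold) AS units
              FROM fact_sales GROUP BY product_id) AS s
          ON s.product_id = p.product_id
        JOIN (SELECT product_id, SUM(cost) AS cost
              FROM fact_finance GROUP BY product_id) AS f
          ON f.product_id = p.product_id
        ORDER BY p.product_id
    """).fetchall()


rows = drill_across(conn)
print(rows)
```

Because both fact tables use the same product keys, results from the two data marts line up row by row; this is the integration between departments that the bus architecture aims for.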

The most suitable architecture for the HR Department would be the one with a
staging area and data marts.
Citation:
https://docs.oracle.com/cd/B10500_01/server.920/a96520/concept.htm

Star Schema for current HR Department


Using Star Schema
Sample Report: Departmental Monthly Attendance (DMAReport)
Assumptions
- Monthly Attendance = (monthly sum of hours of attendance) - (monthly unpaid
  leaves * work hours)
- work hours = 8

Below is the ETL process from the transactional tables.

Tables/Columns Involved (Extract)
Employee:    Emp_ID, Emp_firstName, Emp_lastName, Emp_email, Emp_Mgr_ID
Emp_Dep:     Emp_ID, Dep_ID, Dep_Start
Department:  Dep_ID, Dep_name, Dep_Mgr_ID
Emp_Pos:     Emp_ID, Pos_Code, Pos_start, Pos_Finish
Position:    Pos_code, Pos_name
Attendance:  Emp_ID, Att_Date, Att_timeIN, Att_timeout
Emp_Leave:   Emp_ID, Leave_code, Leave_start, Leave_Finish
Leave:       Leave_code, leave_type

New Tables (Transform) as Dimensions
EmplyInfo:   EmployeeID (PK), EmployeeName, EmployeeEmail
DeptInfo:    EmployeeID (PK), Dep_Start, DepartmentName, ManagerName
PosInfo:     EmployeeID (PK), Pos_start, Pos_Finish, PositionName
AttendInfo:  EmployeeID (PK), Att_Date, Att_timeIN, Att_timeout, Leave_start,
             Leave_Finish, leave_type=04 (unpaid)

Generate Report (Load) / Fact Table
DMAReport:   EmployeeID, EmployeeName, EmployeeEmail, DepartmentName,
             ManagerName, PositionName, AttendHour, LeaveHour, TotalHourInMonth

AttendHour = sum(Att_timeout - Att_timeIN) in month(Att_Date)
LeaveHour = (Leave_Finish - Leave_start) in month(Att_Date)
TotalHourInMonth = AttendHour - LeaveHour

With the star schema, the report involves 4 dimension tables and 1 fact table
extracted from 8 original tables in the Transactional Processing System. Each
dimension has 1 primary key as a result of denormalizing the original tables.
The star schema diagram is as follows:
DeptInfo (dimension):   EmployeeID (PK), Dep_Start, DepartmentName, ManagerName
PosInfo (dimension):    EmployeeID (PK), Pos_start, Pos_Finish, PositionName
EmplyInfo (dimension):  EmployeeID (PK), EmployeeName, EmployeeEmail
AttendInfo (dimension): EmployeeID (PK), Att_Date, Att_timeIN, Att_timeout,
                        Leave_start, Leave_Finish, leave_type=04
DMAReport (fact):       EmployeeID, EmployeeName, EmployeeEmail, ReportMonth,
                        DepartmentName, ManagerName, PositionName, AttendHour,
                        LeaveHour, TotalHourInMonth
Snowflake Schema for current HR Department


Using Snowflake Schema
Sample Report: Dependent Count of Company's Employee in each Department

Assumptions
- No detailed info for each employee (name, DOB, etc.)
- Only a small part of each original table is extracted (for faster performance)

Purposes
The report can be used to count the total dependents of employees, as well as to
count employees' dependents grouped by
- Dependent type (Spouse, Child, etc.)
- Department
- Employee position

Below is the ETL process from the transactional tables.

Tables/Columns Involved (Extract)
Employee:    Emp_ID (PK), Emp_status
Emp_Dep:     Emp_ID (CK), Dep_ID (CK), Dep_Start, Dep_finish
Department:  Dep_ID (PK), Dep_name
Emp_Pos:     Emp_ID (CK), Pos_code (CK), Pos_start, Pos_Finish
Position:    Pos_code (PK), Pos_name
Emp_Depen:   Emp_ID (CK), Depen_Code (CK), Depen_Name, Depen_DOB
Dependent:   Depen_code (PK), Depen_type

New Tables (Transform) as Dimensions
Employee:    Emp_ID (PK), Emp_status=Active
Emp_Dep:     Emp_ID (CK), Dep_ID (CK), Dep_Start, Dep_finish
Department:  Dep_ID (PK), Dep_name
Emp_Pos:     Emp_ID (CK), Pos_code (CK), Pos_start, Pos_Finish
Position:    Pos_code (PK), Pos_name
Emp_Depen:   Emp_ID (CK), Depen_Code (CK), Depen_Name, Depen_DOB
Dependent:   Depen_code (PK), Depen_type

Generate Report (Load) / Fact Table
DependentCountDept: EmployeeID, Dep_ID, Pos_Code, Depen_Code, Depen_Count

With the snowflake schema, this report involves 7 dimension tables and 1 fact
table extracted from 7 original tables in the Transactional Processing System,
with a simplified column list to keep performance high. Each dimension has
primary keys or composite keys, as a result of keeping denormalization of the
original tables to a minimum in order to minimize redundancy.
The snowflake schema diagram is as follows:
Employee (dimension):      Emp_ID (PK), Emp_status=Active
Position (dimension):      Pos_code (PK), Pos_name
Emp_Pos (dimension):       Emp_ID (CK), Pos_code (CK), Pos_start, Pos_Finish
Emp_Dep (dimension):       Emp_ID (CK), Dep_ID (CK), Dep_Start, Dep_finish
Department (dimension):    Dep_ID (PK), Dep_name
Emp_Depen (dimension):     Emp_ID (CK), Depen_Code (CK), Depen_Name, Depen_DOB
Dependent (dimension):     Depen_code (PK), Depen_type
DependentCountDept (fact): EmployeeID, Dep_ID, Pos_Code, Depen_Code, Depen_Count
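
A dependent count grouped by department and dependent type can be sketched with Python's built-in sqlite3 module, using the table and column names of the snowflake schema. The sample rows are invented, the Position tables are left out for brevity, and treating a NULL Dep_finish as the current department assignment is an assumption that the report does not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Emp_ID INTEGER PRIMARY KEY, Emp_status TEXT);
CREATE TABLE Department (Dep_ID TEXT PRIMARY KEY, Dep_name TEXT);
CREATE TABLE Emp_Dep (Emp_ID INTEGER, Dep_ID TEXT,
                      Dep_Start TEXT, Dep_finish TEXT);
CREATE TABLE Dependent (Depen_code TEXT PRIMARY KEY, Depen_type TEXT);
CREATE TABLE Emp_Depen (Emp_ID INTEGER, Depen_Code TEXT,
                        Depen_Name TEXT, Depen_DOB TEXT);

-- Invented sample rows.
INSERT INTO Employee VALUES (1, 'active'), (2, 'active'), (3, 'inactive');
INSERT INTO Department VALUES ('01', 'HR'), ('02', 'Marketing');
INSERT INTO Emp_Dep VALUES
    (1, '01', '2014-01-01', NULL),
    (2, '02', '2014-01-01', NULL),
    (3, '01', '2013-01-01', NULL);
INSERT INTO Dependent VALUES ('01', 'Spouse'), ('02', 'Child');
INSERT INTO Emp_Depen VALUES
    (1, '01', 'A', '1985-01-01'),
    (1, '02', 'B', '2010-01-01'),
    (1, '02', 'C', '2012-01-01'),
    (2, '02', 'D', '2011-01-01'),
    (3, '01', 'E', '1980-01-01');
""")


def dependent_count(connection):
    """Count dependents of active employees per department and type."""
    return connection.execute("""
        SELECT d.Dep_name, t.Depen_type, COUNT(*) AS Depen_Count
        FROM Emp_Depen AS ed
        JOIN Employee AS e ON e.Emp_ID = ed.Emp_ID
        JOIN Emp_Dep AS x ON x.Emp_ID = e.Emp_ID
                         AND x.Dep_finish IS NULL  -- assumed: current dept
        JOIN Department AS d ON d.Dep_ID = x.Dep_ID
        JOIN Dependent AS t ON t.Depen_code = ed.Depen_Code
        WHERE e.Emp_status = 'active'
        GROUP BY d.Dep_name, t.Depen_type
        ORDER BY d.Dep_name, t.Depen_type
    """).fetchall()


counts = dependent_count(conn)
print(counts)
```

The query has to walk through the linking tables (Emp_Dep, Emp_Depen) to reach the descriptive tables, which is the extra join cost a snowflake schema accepts in exchange for less redundancy.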
