Welcome

Data Warehouse
An Overview

What is a Data Warehouse?
"A data warehouse is a collection of business information derived directly from operational systems and some external data sources. Its specific purpose is to support business decisions, not business operations."

Characteristics of a Data Warehouse
• Subject-oriented data
  - collects all data for a subject from different sources
• Read-only requests
  - loaded during off-hours, read-only during day hours
• Interactive features (ad hoc query)
  - flexible design to handle spontaneous user queries
• Pre-aggregated data
  - to improve runtime performance
• Highly denormalized data structures
  - fat tables with redundant columns

Various Development Stages of a Data Warehouse
• Business Case Assessment
• Enterprise Infrastructure Evaluation
• Project Planning
• Finalize Project Requirements
• Data Analysis
• Prototyping
• Database Design
• ETL Framework Design
• ETL Package Development
• BI Application Development
• Data Validation
• Implementation
• Release Evaluation

Data Modeling
WHAT IS A DATA MODEL?
  - A data model is an abstraction of some aspect of the real world (system)
WHY A DATA MODEL?
  - Helps to visualise the business
  - A model is a means of communication
  - Models help elicit and document requirements
  - Models reduce the cost of change
  - The model is the essence of the DW architecture, based on which the DW will be implemented

The model depends on what kind of data analysis we want to do
• Different data analysis techniques
  - Query and reporting: display query results
  - Multidimensional analysis: analyze data content by looking at it from different perspectives
  - Data mining: discover patterns and clustering attributes in data
What do we want to do with the data?

Levels of Modeling
• Conceptual modeling
  - Describe data requirements from a business point of view, without technical details
• Logical modeling
  - Refine conceptual models
  - Data-structure oriented, platform independent
• Physical modeling
  - Detailed specification of what is physically implemented using a specific technology

Conceptual Model
• A conceptual model shows data through business eyes
• All entities which have business meaning
• Important relationships
• Few significant attributes in the entities
• Few identifiers or candidate keys

Logical Model
• Replaces many-to-many relationships with associative entities
• Defines a full population of entity attributes
• May use non-physical entities for domains and subtypes
• Establishes entity identifiers
• Has no specifics for any RDBMS or configuration

Physical Model
• A physical data model may include
  - Referential integrity
  - Indexes
  - Views
  - Alternate keys and other constraints
  - Tablespaces and physical storage objects

What needs to be modeled during a data warehouse project
• STAGING AREA
  - YES! (maybe multiple data models are required)
• ODS
  - YES!
• DATA WAREHOUSE / DATA MART
  - YES!

Data Modeling Techniques
• Modeling techniques
  - ER Modeling
  - Dimensional Modeling

Implementation and modeling styles
• Modeling versus implementation
  - Modeling: describes what should be built, to non-technical folks
  - Implementation: describes what is actually built, to technical folks
• Relational modeling
  - Used for implementation
  - Difficult for non-technical folks to understand
• Dimensional modeling
  - Used for modeling during the analysis and design phases
  - Can be implemented using other modeling styles, e.g. object-oriented, relational

Limitations of ER Modeling
• Poor performance
• Tends to be very complex and difficult to navigate

Dimensional Modeling
• Dimensional modeling uses three basic concepts: measures, facts, dimensions
• Is powerful in representing the requirements of the business user in the context of database tables
• Focuses on numeric data, such as values, counts, weights, balances and occurrences

Dimensional modeling
• Must identify
  - Business process to be supported
  - Grain (level of detail)
  - Dimensions
  - Facts

Conventions used in Dimensional modeling
• Facts
• Measures (Variables)
• Dimensions
  - Dimension members
  - Dimension hierarchies

Facts
• A fact is a collection of related data items, consisting of measures and context data
• Each fact typically represents a business item, a business transaction, or an event that can be used in analyzing the business or business process
• Facts are measured, "continuously valued", rapidly changing information. They can be calculated and/or derived

Fact Table
• A table that is used to store business information (measures) that can be used in mathematical equations
  - Quantities
  - Percentages
  - Prices
  - Calculated values

Dimensions
• A dimension is a collection of members or units of the same type of views
• Dimensions determine the contextual background for the facts
• Dimensions represent the way business people talk about the data resulting from a business process, e.g. who, what, when, where, why, how

Dimension Table
• Table used to store qualitative data about fact records
  - Who
  - What
  - When
  - Where
  - Why

Dimension data should be
• verbose, descriptive
• complete
• free of misspellings and impossible values
• indexed
• equally available
• documented (metadata to explain the origin and interpretation of each attribute)

Dimensional model
• Visualize a dimensional model as a CUBE (hypercube, because dimensions can be more than 3 in number)
• Operations for OLAP
  - Drill Down: higher level of detail
  - Roll Up: summarized level of data
    (The navigation path is determined by hierarchies within dimensions)
  - Slice: cuts through the cube; users can focus on specific perspectives
  - Dice: rotates the cube to another perspective (change the dimension)
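
The roll-up, drill-down and slice operations listed above can be sketched on a tiny in-memory cube. This is only an illustration: the cell values, the dimension names and the `roll_up` and `slice_cube` helpers are invented for the example and do not come from any OLAP product.

```python
from collections import defaultdict

# Hypothetical cube: each cell is keyed by (quarter, month, region).
DIMS = ("quarter", "month", "region")
cells = {
    ("Q1", "Jan", "East"): 100,
    ("Q1", "Feb", "East"): 120,
    ("Q1", "Jan", "West"): 80,
    ("Q2", "Apr", "East"): 90,
    ("Q2", "Apr", "West"): 110,
}


def roll_up(cube, keep):
    """Summarize the cube, keeping only the dimensions named in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, value in cube.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)


def slice_cube(cube, dim, member):
    """Cut through the cube: keep only cells where `dim` equals `member`."""
    i = DIMS.index(dim)
    return {k: v for k, v in cube.items() if k[i] == member}


by_quarter = roll_up(cells, ["quarter"])         # summarized level
by_month = roll_up(cells, ["quarter", "month"])  # drill down one level of the time hierarchy
east_only = slice_cube(cells, "region", "East")  # slice on one member of region
```

Drilling down here is simply rolling up to a finer level of the same hierarchy (quarter, then month), which is why the navigation path depends on the hierarchies defined within each dimension.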

[Figure: Drill down, Roll up]

[Figure: Slice and Dice]

Dimensions
• Collection of members or units of the same type of views
• Determine the contextual background for the facts
• The parameters over which we want to perform OLAP (e.g. Time, Location/region, Customers)
• Member: a distinct name that determines a data item's position (e.g. Time: month, quarter)
• Hierarchy: arranges members into hierarchies or levels

[Figure: Hierarchies]

Aggregates
• Aggregate tables are pre-stored, summarized tables created at a higher level of granularity across any or all of the dimensions
• If the existing granularity is day-wise sales, then creating a separate month-wise sales table is an example of an aggregate table
• The use of such aggregates is the single most effective tool the data warehouse designer has to improve query performance
• Usage of aggregates can increase the performance of queries by several times
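
The day-wise to month-wise example can be sketched with Python's built-in `sqlite3` module. The table names `daily_sales` and `monthly_sales`, the columns and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (sale_date TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?)",
    [
        ("2024-01-05", "widget", 10.0),
        ("2024-01-20", "widget", 15.0),
        ("2024-02-03", "widget", 7.5),
        ("2024-02-03", "gadget", 20.0),
    ],
)

# Pre-store the month-level summary once, instead of re-summing the
# daily rows for every query.
conn.execute(
    """
    CREATE TABLE monthly_sales AS
    SELECT substr(sale_date, 1, 7) AS sale_month,
           product,
           SUM(amount) AS amount
    FROM daily_sales
    GROUP BY substr(sale_date, 1, 7), product
    """
)

monthly = conn.execute(
    "SELECT sale_month, product, amount FROM monthly_sales "
    "ORDER BY sale_month, product"
).fetchall()
```

Month-level queries can then read `monthly_sales` directly. The aggregate has to be refreshed whenever the daily table is loaded, which is part of the cost of this technique.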

Measures
• A measure is a numeric attribute of a fact, representing the performance or behaviour of the business relative to dimensions
• The actual numbers are called variables, e.g. sales in money, sales volume, quantity supplied, supply cost, transaction amount
• A measure is determined by combinations of the members of the dimensions and is located on facts

Types of Facts
• Additive
  - Able to add the facts along all the dimensions
  - Discrete numerical measures, e.g. retail sales in $
• Semi-additive
  - Snapshot taken at a point in time
  - Measures of intensity
  - Not additive along the time dimension, e.g. account balance, inventory balance
  - Added and divided by the number of time periods to get a time-average
• Non-additive
  - Numeric measures that cannot be added across any dimension
  - Intensity measure averaged across all dimensions, e.g. room temperature
  - Textual facts: AVOID THEM
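
The semi-additive case is easiest to see with numbers. The sketch below uses made-up month-end account balances; the account names and amounts are purely illustrative.

```python
# Hypothetical month-end balances per account (a snapshot fact).
balances = {
    "Jan": {"acct_A": 100.0, "acct_B": 50.0},
    "Feb": {"acct_A": 120.0, "acct_B": 30.0},
    "Mar": {"acct_A": 140.0, "acct_B": 40.0},
}

# Adding across the account dimension is meaningful:
# the total held across all accounts at the end of January.
total_jan = sum(balances["Jan"].values())

# Adding across time is not: 100 + 120 + 140 is not a balance anyone held.
# Instead, add and divide by the number of periods to get a time-average.
periods = len(balances)
avg_acct_a = sum(month["acct_A"] for month in balances.values()) / periods
```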

Common structures for Data Marts: Denormalize!
• Star
  - Single fact table surrounded by denormalized dimension tables
  - The fact table primary key is the composite of the foreign keys (primary keys of dimension tables)
  - Fact table contains transaction-type information
  - Many star schemas in a data mart
  - Easily understood by end users; more disk storage required
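
A minimal star schema can be sketched with Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_product`, `dim_store`) and the sample rows are assumptions for illustration; a real design would normally also carry a time dimension key in the fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_store   (store_key   INTEGER PRIMARY KEY, city TEXT, region TEXT);
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        store_key   INTEGER REFERENCES dim_store(store_key),
        quantity    INTEGER,
        amount      REAL,
        PRIMARY KEY (product_key, store_key)  -- composite of the foreign keys
    );
    INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Home');
    INSERT INTO dim_store   VALUES (1, 'Pune', 'West'), (2, 'Delhi', 'North');
    INSERT INTO fact_sales  VALUES (1, 1, 10, 50.0), (1, 2, 4, 20.0), (2, 1, 1, 30.0);
    """
)

# Every dimension is exactly one join away from the fact table.
rows = conn.execute(
    """
    SELECT p.category, s.region, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_store   s ON s.store_key   = f.store_key
    GROUP BY p.category, s.region
    ORDER BY p.category, s.region
    """
).fetchall()
```

Note that `category` and `region` are stored redundantly on every dimension row. That redundancy is the denormalization the slide refers to, and the reason for the extra disk storage.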

[Figure: Example of Star Schema]

Common structures for Data Marts: Denormalize!
• Snowflake
  - Single fact table surrounded by normalized dimension tables
  - Normalizes dimension tables to save data storage space
  - Used when dimensions become very, very large
  - Less intuitive; slower performance due to joins
• May want to use both approaches, especially if supporting multiple end-user tools
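
The extra join that a snowflake introduces can be shown with a small `sqlite3` sketch. The names `dim_category`, `dim_product` and `fact_sales`, and the rows in them, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Category attributes move out of the product dimension into their own table.
    CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        name         TEXT,
        category_key INTEGER REFERENCES dim_category(category_key)
    );
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        amount      REAL
    );
    INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Home');
    INSERT INTO dim_product  VALUES (1, 'Pen', 1), (2, 'Pencil', 1), (3, 'Lamp', 2);
    INSERT INTO fact_sales   VALUES (1, 50.0), (2, 10.0), (3, 30.0);
    """
)

# Reaching the category name now takes two joins instead of one.
by_category = conn.execute(
    """
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product  p ON p.product_key  = f.product_key
    JOIN dim_category c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
    """
).fetchall()
```

Each category name is stored once rather than on every product row, which is the storage saving; the price is the additional join on every query that needs it.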

[Figure: Example of Snowflake Schema]

Keys
• Primary keys
  - Uniquely identify a record
• Foreign keys
  - Primary key of another table, referred to here
• Surrogate keys
  - System-generated key for dimensions
  - The key on its own has no meaning
  - Integer key, less space
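
Surrogate key assignment can be sketched as a lookup that hands out the next integer the first time a source key is seen. The `SurrogateKeyMap` class and the customer codes are hypothetical; in practice the database or ETL tool usually supplies this through a sequence or identity column.

```python
import itertools


class SurrogateKeyMap:
    """Hands out integer surrogate keys for natural (source-system) keys."""

    def __init__(self):
        self._next = itertools.count(1)
        self._keys = {}

    def key_for(self, natural_key):
        # Reuse the existing surrogate key, or generate the next integer.
        if natural_key not in self._keys:
            self._keys[natural_key] = next(self._next)
        return self._keys[natural_key]


customers = SurrogateKeyMap()
k1 = customers.key_for("CUST-00017")
k2 = customers.key_for("CUST-00942")
k3 = customers.key_for("CUST-00017")  # same natural key, same surrogate key
```

The integers carry no business meaning, so the source system can rename or reuse its own codes without forcing a change to the keys stored in the fact table.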

More Keys
• Smart keys
  - Primary key built out of various attributes of the dimension
  - AVOID THEM!
  - The join to the fact table should be on a single surrogate key
• Production keys
  - DO NOT USE production-defined attributes
  - The business may reuse or change them; the DW cannot!

Basic Dimensional Modeling Techniques
• Slowly changing dimensions
• Rapidly changing small dimensions
• Large dimensions
• Rapidly changing large dimensions
• Degenerate dimensions
• Junk dimensions

Slowly Changing Dimensions
• A dimension is considered a Slowly Changing Dimension when its attributes remain almost constant over time, requiring relatively minor alterations to represent the evolved state

Types of SCD
• Type I: retains only the most recently updated value
• Type II: the entire history of the dimensional data is maintained
  - Version
  - Flag
  - Date
• Type III: only the current and one previous value are maintained
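
A Type II change can be sketched as "expire the current row, append a new version". The function below is a minimal in-memory illustration that uses all three tracking devices from the list (version, flag, date); the function name, the column names and the sample customer are assumptions, not a prescribed layout.

```python
from datetime import date


def apply_type2_change(rows, natural_key, new_attrs, change_date):
    """Expire the current row for `natural_key` and append a new version."""
    current = next(
        (r for r in rows if r["natural_key"] == natural_key and r["is_current"]),
        None,
    )
    version = 1
    if current is not None:
        if all(current[k] == v for k, v in new_attrs.items()):
            return rows  # nothing changed, keep the history as it is
        current["is_current"] = False      # flag
        current["end_date"] = change_date  # date
        version = current["version"] + 1   # version
    rows.append(
        {
            "surrogate_key": len(rows) + 1,
            "natural_key": natural_key,
            **new_attrs,
            "version": version,
            "is_current": True,
            "start_date": change_date,
            "end_date": None,
        }
    )
    return rows


dim_customer = []
apply_type2_change(dim_customer, "C1", {"city": "Pune"}, date(2023, 1, 1))
apply_type2_change(dim_customer, "C1", {"city": "Delhi"}, date(2024, 6, 1))
```

After the second call the dimension holds two rows for customer C1: the expired Pune row and the current Delhi row, each with its own surrogate key. A Type I change would instead overwrite `city` in place, and a Type III change would keep the old value in an extra "previous city" column.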

The Time Dimension
• time_key
• day_of_week
• day_number_in_month
• day_number_overall
• week_number_in_year
• month
• quarter
• fiscal_period
• holiday_flag
• weekday_flag
• last_day_in_month_flag
• season
• event

Time Dimension
• An exclusive Time dimension is required because the SQL date semantics and functions cannot generate several important attributes required for analytical purposes
• Attributes like weekdays, weekends, fiscal period, holidays and season cannot be generated by SQL statements
• Moreover, SQL date stamps occupy more space, largely increasing the size of the fact table
• Joins on such SQL-generated datestamps are costly, decreasing the query speed significantly
• The holiday flag and season attributes are useful for holiday vs. non-holiday analysis and seasonal business analysis
• The event attribute is needed to record special days, like strike days, etc.
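
Rows for a time dimension are typically generated once by a small program and then loaded. The sketch below builds one row per calendar date; the `HOLIDAYS` set and `CALENDAR_START` date are hypothetical, and `fiscal_period`, `season` and `event` are left out because they depend entirely on business rules.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
HOLIDAYS = {date(2024, 1, 1), date(2024, 12, 25)}  # hypothetical holiday calendar
CALENDAR_START = date(2024, 1, 1)                  # hypothetical first day loaded


def time_dimension_row(d):
    """Build one time-dimension row for calendar date `d`."""
    day_number = (d - CALENDAR_START).days + 1
    return {
        "time_key": day_number,  # plain sequential surrogate key
        "day_of_week": DAY_NAMES[d.weekday()],
        "day_number_in_month": d.day,
        "day_number_overall": day_number,
        "week_number_in_year": d.isocalendar()[1],  # ISO week number
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,
        "holiday_flag": d in HOLIDAYS,
        "weekday_flag": d.weekday() < 5,
        "last_day_in_month_flag": (d + timedelta(days=1)).month != d.month,
    }


row = time_dimension_row(date(2024, 2, 29))
```

Because the fact table stores only the small integer `time_key`, queries can filter on `holiday_flag` or `quarter` with a single join instead of recomputing date logic at query time.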

Data Modeling for Data Warehouse
Steps
1. Study ER
2. Evaluate and Analyse
3. Review Dimension
4. Add Time Dimension
5. Identify Facts
6. Granularity
7. Merge Facts
8. Review Facts
9. Name Facts
10. Size the model
11. Record Metadata
12. Validate model

ETL Overview
(Extract, Transform, Load)

Extraction
• Source systems (multiple source systems)
  - Flat files, Excel, legacy systems, RDBMS, etc.
• Frequency of extraction
• Staging area (if any? how many?)
• Most transformations from source to staging
• Cleansing and data quality
  - Data integrity, deduplication, completeness, correctness
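
Two of the data quality checks named above, completeness and deduplication, can be sketched as a single pass over extracted records. The `cleanse` function, the field names and the sample rows are assumptions for illustration; real pipelines usually apply such rules inside the ETL tool or the staging area.

```python
def cleanse(records, required=("customer_id", "email")):
    """Drop incomplete rows and duplicates; return (clean, rejected)."""
    seen = set()
    clean, rejected = [], []
    for rec in records:
        # Completeness: every required field must be present and non-empty.
        if any(not str(rec.get(f) or "").strip() for f in required):
            rejected.append(rec)
            continue
        # Deduplication: normalize the key fields before comparing.
        key = tuple(str(rec[f]).strip().lower() for f in required)
        if key in seen:
            rejected.append(rec)
            continue
        seen.add(key)
        clean.append(rec)
    return clean, rejected


source_rows = [
    {"customer_id": "17", "email": "a@example.com"},
    {"customer_id": "17", "email": "A@Example.com "},  # duplicate after normalizing
    {"customer_id": "18", "email": ""},                # incomplete
    {"customer_id": "19", "email": "c@example.com"},
]
clean_rows, rejected_rows = cleanse(source_rows)
```

Keeping the rejected rows, rather than silently dropping them, gives the error-handling step something to report back to the source system owners.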

Transformation
• Usage of tools
  - Reusability of transformations
  - Reusability of mappings
• Different tools
  - Informatica
  - Oracle Warehouse Builder
  - ETI
  - Sagent
  - PL/SQL scripts
  - Open source tools

Loading
• Loading frequency
• Optimized loading
  - Indexing
  - Partitioning
• Aggregation
  - Sum
  - Average
  - Max
• Update strategy
• Error handling

STAGING AREA: Some Clarity
• Staging Area
  - Optional
  - To cleanse the source data
  - Accepts data from different sources
  - A data model is required at the staging area
  - Multiple data models may be required: for parking data from different sources, and for transformed data to be pushed out to the warehouse

ODS: Some Clarity
• Operational Data Store
  - Optional
  - Granular, detailed-level data
  - May feed the warehouse (e.g. when the warehouse is aggregated)
  - Usually a relational model
  - May keep data for a smaller time period than the warehouse

[Figure: Data Warehouse Architecture]

DW Architecture
• Architecture choices depend on
  - Current infrastructure
  - Business environment
  - Desired management and control structure
  - Resources
  - Commitment
• Data warehouse / data mart

Types of Data Warehouse
• Enterprise Data Warehouse
• Data Mart
[Figure: an Enterprise Data Warehouse and three Datamarts]

Enterprise data warehouse
• Contains data drawn from multiple operational systems
• Supports time series and trend analysis across different business areas
• Can be used as a transient storage area to clean all data and ensure consistency
• Can be used to populate data marts
• Can be used for everyday and strategic decision making

Data Mart
• Logical subset of the enterprise data warehouse
• Organized around a single business process
• Based on granular data
• May or may not contain aggregates
• Object of analytical processing by the end user
• Less expensive and much smaller than a full-blown corporate data warehouse

DW Implementation Approaches
• Top down
• Bottom-up
• Combination of both
• Choices depend on
  - Current infrastructure
  - Resources
  - Architecture
  - ROI
  - Implementation speed

[Figure: Top Down Implementation]

[Figure: Bottom Up Implementation]

OLAP

What is OLAP?
OLAP: On-Line Analytical Processing
• OLAP enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information
• OLAP transforms raw data so that it reflects the real dimensionality of the enterprise as understood by the user

Data Warehousing vs OLAP
• OLAP focuses on
  - Data transformed into information that meets the end user's analytical requirements
  - Data modeling and computation processes that are consistent
• OLTP and the DW provide the source data, whereas OLAP turns that data into information

OLAP Functionality
• OLAP functionality is characterized by dynamic multidimensional analysis of consolidated enterprise data, supporting end-user analytical and navigational activities:
  - Calculations and modeling applied across dimensions, through hierarchies and/or across members
  - Trend analysis over sequential time periods
  - Slicing subsets for on-screen viewing
  - Drill-down to deeper levels of consolidation
  - Reach-through to underlying detail data
  - Rotation to new dimensional comparisons in the viewing area

OLAP Functionality
• OLAP is implemented in a multi-user client/server mode and offers consistently rapid response to queries, regardless of database size and complexity
• OLAP helps the user synthesize enterprise information through comparative, personalized viewing, as well as through analysis of historical and projected data in various "what-if" data model scenarios

OLAP Functional Requirements
• Fast access and calculations
  - Speed is critical to maintain an analyst's train of thought
  - An analyst needs to navigate throughout the data, which requires aggregations or roll-ups
• Powerful analytical capabilities
  - There are more complicated calculations in OLAP than simple aggregations or roll-ups
• Flexibility
  - Viewing: graphs, charts, rows or columns
  - Definitions: format of numbers, name changes
  - Analysis: Sales analyzes data differently than Marketing
  - Interfaces: section-wise report looks

OLAP Architecture
[Figure: diagram labeled Accounting, Time and Actuals, showing the Accounting Dir. View, Budget Dir. View, Dept. Mgr. View and Ad Hoc View]

Thank You