Analytics System
An OLAP approach
ENGR 300
Prof. Tarek Sobh
Hector Valentin
Obasa Koswatta
Chatura Liyanage
Bhaskar Bhattarai
Project Scope
Our Approach
Client Applications
Conclusion
References
Project Scope
Corporate data has grown consistently and rapidly in the last decade.
Contemporary enterprises have to manipulate data in the range of terabytes.
Today’s markets are much more competitive and dynamic than those in the past.
Business enterprises will prosper or fail according to the sophistication and speed
of their information systems, and their ability to analyze and synthesize
information using those systems.
These needs cannot be met by a relational database approach alone. While
relational databases are well suited to managing transactions, are extensible,
and can store large amounts of data, they are typically unable to handle ad hoc,
speed-of-thought analytical querying and sophisticated business calculations for
large user communities. The best-practice strategy, therefore, is to couple an
application-neutral relational data warehouse staging area (the hub) with
application-specific OLAP data marts (the spokes). A Gartner Group survey of 104
DW/BI decision-makers published in February 2000 found an increase of
approximately 60% in the use of the ‘hub-and-spoke’ architecture, where multiple
data marts are fed from a central data warehouse (Gartner Group, 2000). This
two-tier architecture enables IT organizations to build, maintain and deploy
data warehouses efficiently, and also to meet business users’ requirements for
analysis and rapid response time.
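The hub-and-spoke idea can be illustrated with a minimal sketch. The sample rows, the field names and the `build_mart` helper below are hypothetical and are not taken from the project; the sketch only shows one application-neutral store (the hub) feeding several application-specific, pre-aggregated views (the spokes).

```python
from collections import defaultdict

# Hub: application-neutral warehouse rows (hypothetical sample data).
WAREHOUSE = [
    {"region": "East", "product": "Widget", "units": 10, "revenue": 100.0},
    {"region": "East", "product": "Gadget", "units": 4, "revenue": 80.0},
    {"region": "West", "product": "Widget", "units": 7, "revenue": 70.0},
]


def build_mart(rows, dimension, measure):
    """Spoke: pre-aggregate one measure by one dimension for one application."""
    mart = defaultdict(float)
    for row in rows:
        mart[row[dimension]] += row[measure]
    return dict(mart)


# Two spokes fed from the same hub.
sales_by_region = build_mart(WAREHOUSE, "region", "revenue")
units_by_product = build_mart(WAREHOUSE, "product", "units")
```

Each spoke answers one kind of question quickly, while the hub remains the single place where the data is loaded and maintained.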
Fig 1 shows a complete two-tier analytics system. Our project includes a
complete conceptual study of this ‘hub-and-spoke’ architecture; the
implementation, however, concentrates on the OLAP data mart tier of the system.
[Fig 1: Two-tier analytics system. Labels recovered from the figure: Tier 1;
Tier 2 (OLAP server; data mart and data cube design with Hyperion Essbase, SQL
Server Analysis Services, etc.)]
There are two market leaders for OLAP products: Hyperion Solutions and
Microsoft. As the following trends indicate, competition among vendors has
been keen in the recent past. For our project we used Microsoft SQL Server and
Microsoft Analysis Services to realize the OLAP architecture.
Our Approach
About 100 megabytes of simulated data was generated and loaded into
the star schema database.
A data mart with several cubes was implemented on an OLAP server. The OLAP
server software considered was Hyperion Essbase and SQL Server Analysis
Services; SQL Server Analysis Services was used.
The data mart is business-rule specific, with several cubes designed with 2 to 5
dimensions and drill-down capabilities to several levels of
members. The cubes satisfy analytical and complex metric-calculation
requirements that are otherwise tedious to express and slow to run in a
traditional SQL-based relational database, thus showing the power of an OLAP
approach.
analysis, partner analysis, platform analysis, performance analysis and
loss analysis.
The system architecture and data flow of our system are shown in Fig 3.
It should be noted that we have not implemented the first two stages shown
in the figure, namely “Source data” and “Data Extraction”. We assumed that the
data warehouse had already been populated with relevant data, and simulated this
by generating dummy data.
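The report does not describe how the dummy data was generated, so the following is only a sketch of one way to simulate fact rows. The dimension members, the price list and the `generate_fact_rows` helper are all hypothetical. A fixed seed keeps the simulated data reproducible between runs.

```python
import random

# Hypothetical dimension members; the project's real members are not given.
REGIONS = ["East", "West", "North", "South"]
PRODUCTS = ["Widget", "Gadget", "Gizmo"]
MONTHS = [f"2001-{m:02d}" for m in range(1, 13)]


def generate_fact_rows(n, seed=0):
    """Return n simulated sales rows; the same seed yields the same rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        quantity = rng.randint(1, 20)
        unit_price = rng.choice([5.0, 10.0, 25.0])  # hypothetical price list
        rows.append({
            "region": rng.choice(REGIONS),
            "product": rng.choice(PRODUCTS),
            "month": rng.choice(MONTHS),
            "quantity": quantity,
            "revenue": quantity * unit_price,
        })
    return rows


rows = generate_fact_rows(1000)
```

Rows of this shape could then be bulk-loaded into the fact table of the data warehouse.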
Data-Warehouse: The Star Schema
We used Microsoft SQL Server to implement the Star Schema. The main
table within this architecture is called the fact table. The other tables, the
dimension tables, are connected to the fact table through foreign keys. The
following figure shows a schematic representation of the Star Schema.
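A star schema of this kind can be sketched in a few lines. The project used Microsoft SQL Server; the sketch below instead uses SQLite through Python's standard `sqlite3` module so that it is self-contained, and every table and column name in it is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# One fact table in the middle, with a foreign key to each dimension table.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
    region_id  INTEGER NOT NULL REFERENCES dim_region(region_id),
    quantity   INTEGER NOT NULL,
    revenue    REAL    NOT NULL
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
conn.executemany("INSERT INTO dim_region VALUES (?, ?)",
                 [(1, "East"), (2, "West")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 10, 100.0), (2, 1, 4, 80.0), (1, 2, 7, 70.0)])

# A typical analytical query: join the fact table to a dimension and aggregate.
revenue_by_product = conn.execute("""
    SELECT p.name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
```

Every analytical question against this layout needs a join and a `GROUP BY` of this form, which is the work an OLAP cube precomputes.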
Data-Marts: The Cubes
A Data Cube has two main elements: measures and dimensions. Measures are the
values being analyzed, while dimensions represent the different data items
along which the measures can be analyzed at the same time in order to determine
data correlations. For example, a Data Cube with a sales measure and a product
dimension can be used to determine the revenue generated by each product. It
should be noted that a Data Cube could have a maximum of 64 dimensions. There
are no constraints on the number of measures each cube could have. The following
diagram shows an instance of a Data Cube employed in our OLAP architecture. It
is used to analyze and drill down into the Sales data.
Figure 5: The Sales Cube
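The relationship between measures and dimensions can be made concrete with a small sketch. The rows, the dimension names and the `roll_up` helper are hypothetical and do not reproduce the actual Sales cube; an OLAP server would precompute and store such aggregates rather than scan the rows on each request.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "quarter")

# Each row: one member per dimension, followed by the revenue measure.
# Hypothetical sample data.
FACTS = [
    ("Widget", "East", "Q1", 100.0),
    ("Widget", "West", "Q1", 70.0),
    ("Gadget", "East", "Q1", 80.0),
    ("Widget", "East", "Q2", 50.0),
]


def roll_up(facts, keep):
    """Sum the revenue measure over every dimension not named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)


by_product = roll_up(FACTS, ["product"])                   # coarse view
by_product_region = roll_up(FACTS, ["product", "region"])  # drill down by region
grand_total = roll_up(FACTS, [])                           # all dimensions collapsed
```

Adding a dimension to `keep` corresponds to drilling down, and removing one corresponds to rolling up.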
Client Applications
The data analyzed through Cubes can be viewed using a range of client
applications. Hyperion Analyzer, Microsoft Excel and Crystal Reports are among
the tools that can assist management by presenting the information
generated by Data Cubes. We used Excel pivot table services to present data for
management use. The following procedure was used to connect Excel pivot
tables to the Cubes and present the data.
1. Select the PivotTable and PivotChart Report… sub-menu under the Data
menu as shown in figure 6. This will launch the PivotTable and PivotChart
wizard.
Figure 6
2. In the first step of the PivotTable and PivotChart wizard dialog box, shown
in figure 7, select External data source and either the PivotTable or
PivotChart (with PivotTable) option. For this walkthrough, suppose that
PivotChart (with PivotTable) was selected. Then press the Next > button.
Figure 7
3. The next step of the wizard shows the dialog box in figure 8.
Simply press the Get Data… button.
Figure 8
4. The next dialog box asks for the source of the data as shown in figure 9.
Choose the OLAP Cubes tab. In this tab, the OLAP cube can be selected
as a data source. In figure 9, four data sources have already been
created.
Figure 9
Now, select <New Data Source> and double-click or press the OK button.
The next dialog box that appears is shown in figure 10.
5. This dialog box lets the user declare a name for the data source and connect
to the OLAP server (Hyperion Essbase, Microsoft SQL Server with Analysis
Manager, etc.). Enter a name for the data source (here, for instance,
MY_DATA_SOURCE), select the Microsoft OLE DB Provider for OLAP Services as the
OLAP connectivity layer required for the client-server connection, and
press the Connect… button.
Figure 10
6. After the Connect… button has been pressed, the following dialog box
appears.
Figure 11
Choose the OLAP Server radio button and provide the IP address of the OLAP
server.
7. After pressing the Next > button, when the connection to the OLAP server
has been made, a dialog box as shown in figure 12 appears prompting the
user to select the appropriate data mart that may house any desired
number of data cubes. In this case, as shown in figure 12, the data mart
named OLAPMart has been selected. Now in the next step, cubes that
have been constructed within this data mart can be chosen for analysis.
Figure 12
8. Selecting the appropriate data mart and pressing the Finish button
displays the dialog box shown in figure 10, but this time with the cube
selection drop-down list activated as shown in figure 13. The
cubes that have been created in this data mart appear in the
drop-down list, and the desired cube can be chosen for analysis.
Figure 13
9. Suppose that the ‘sales’ cube was selected from the drop-down list. Then
proceeding further, the following dialog box appears.
Figure 14
After this point, the user can either continue with the wizard to organize
the members of the various dimensions for tabulation and
plotting, or press the Finish button, which will
create the chart area in a new sheet, as shown in figure 15. Suppose the
Finish button was pressed.
Figure 15
10. This is a standard Excel chart layout. Note the PivotTable toolbar.
Different dimensions can be picked from the toolbar and dropped as
an X-axis value (Category Fields), a Y-axis value (Series Fields) or a
filter over a given range (Page Fields) to create the desired chart. The data
used in plotting the chart is also tabulated in another sheet of the same Excel
workbook. Figure 17 illustrates one completed graph showing the quantity of
products under the different categories (as Category fields) sold by region
(as Series fields) for the given time (Page field). Note that each of the
fields can be further drilled down (as indicated by the drop-down list) into
the desired value at run-time. The data changes almost
instantaneously, at the speed of thought, and the change is reflected on the
graph. Similarly, figure 16 shows the tabularized data for the graph.
Figure 16
Figure 17
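The roles of the Category, Series and Page fields in step 10 can be sketched outside Excel. The rows and the `pivot` helper below are hypothetical; Excel obtains the equivalent result from the OLAP server rather than computing it from raw rows.

```python
from collections import defaultdict

# (category, region, year, quantity) -- hypothetical sample rows.
SALES = [
    ("Hardware", "East", 2001, 10),
    ("Hardware", "West", 2001, 7),
    ("Software", "East", 2001, 4),
    ("Hardware", "East", 2000, 3),
]


def pivot(rows, page_year):
    """Category field -> rows, Series field (region) -> columns, Page field -> filter."""
    table = defaultdict(lambda: defaultdict(int))
    for category, region, year, quantity in rows:
        if year == page_year:  # the Page field selects one slice of the cube
            table[category][region] += quantity
    return {category: dict(columns) for category, columns in table.items()}


chart_data = pivot(SALES, page_year=2001)
```

Changing `page_year` corresponds to picking a different value in the Page field drop-down list; the table and the chart are then rebuilt from the new slice.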
This completes the process of creating a table or a chart out of
OLAP data. This method is considerably faster and more efficient than writing
SQL queries to fetch the data and then tabulating and plotting the results
separately.
Conclusion
References
Hyperion Solutions Corporation. “The role of the OLAP server in a data
warehousing solution.” White paper, February 2001. www.hyperion.com