
Business Intelligence

Analytics System
An OLAP approach

ENGR 300
Prof. Tarek Sobh

Hector Valentin
Obasa Koswatta
Chatura Liyanage
Bhaskar Bhattarai

April 30, 2002


Table of Contents

Project Scope

The best practice data warehousing/OLAP architecture

The Essence for an OLAP (Online Analytic Processing) Tier

Popularity of OLAP (the market analysis)

Our Approach

Data-Warehouse: The Star Schema

Data-Marts: The Cubes

Client Applications

Conclusion

References

Project Scope

This project uses a best-practice data warehousing/OLAP architecture to design a
scalable, high-performance analytics system that any company can use as a
business intelligence solution. Management, sales teams and other users will use
this system to analyze data and make key decisions in a timely manner. System
functionality includes, but is not limited to, performance reporting, sales
forecasting, product line and customer profitability, sales analysis, market
analysis, what-if analysis and manufacturing mix analysis.

The best practice data warehousing: OLAP architecture

Corporate data has grown consistently and rapidly in the last decade.
Contemporary enterprises have to manipulate data in the range of terabytes.
Today’s markets are much more competitive and dynamic than those in the past.
Business enterprises will prosper or fail according to the sophistication and speed
of their information systems, and their ability to analyze and synthesize
information using those systems.

These needs cannot be met by simply using a relational database model. While
well suited to managing transactions, and strong in extensibility and in storing
large amounts of data, relational databases are typically unable to handle
ad hoc, speed-of-thought analytical querying and sophisticated business
calculations for large user communities. The best practice strategy, therefore,
is to couple an application-neutral relational data warehouse staging area (the
hub) with application-specific OLAP data marts (the spokes). A Gartner Group
survey of 104 DW/BI decision-makers published in February 2000 found an increase
of approximately 60% in the use of the ‘hub-and-spoke’ architecture, where
multiple data marts are fed from a central data warehouse (Gartner Group, 2000).
This two-tier architecture enables IT organizations to build, maintain and
deploy data warehouses efficiently, and also to meet business users’
requirements for analysis and rapid response time.

The Essence for an OLAP (Online Analytic Processing) Tier

The OLAP tier is expected to provide:

- The ability to scale to large volumes of data and large numbers of
  concurrent users
- Consistent, fast query response times that allow for iterative,
  speed-of-thought analysis
- Integrated metadata to ensure data and metadata stay synchronized
- A calculation engine that includes robust mathematical functions for
  computation to support matrix, cross-dimensional, procedural and
  OLAP-aware calculations
- Seamless integration of historical, projected and derived data
- A multi-user read-write environment to support what-if analysis, modeling
  and planning
- The ability to be deployed quickly and maintained cost effectively with
  minimal user training
- Robust data-access security and user management
- Availability of a wide variety of viewing and analysis tools

Fig 1 shows a complete two-tier analytics system. Our project includes a
complete conceptual study of this ‘hub and spoke’ architecture; in terms of
implementation, however, we concentrate on the OLAP data mart tier of the
system.

Tier 1 (database server, OLTP)

Source data: source systems, flat files, Hyperion Pillar, etc.
Data extraction, transformation and load: Informatica PowerMart, Microsoft DTS services
Data warehouse (star schema design, staging): SQL Server, Oracle, Informix, etc.

Tier 2 (OLAP server)

Data mart (data cube design): Hyperion Essbase, SQL Server Analysis Services, etc.
Front-end reporting: Hyperion Analyzer, Hyperion Report, Crystal Reports, Excel (Essbase add-in), etc.

Figure 1- A Complete Two-tier analytics system architecture

Popularity of OLAP (the market analysis)

We chose this project in particular as a learning experience. Market research
that we conducted strongly indicated the growing popularity of the OLAP
architecture. Its flexibility and robustness in handling large amounts of data
(in terabytes) have made it an attractive solution for today’s business needs.
Research conducted in 2000 by the Gartner Group indicated a 60% growth in
“data warehouse-data mart” (or hub-and-spoke) architecture.

There are two market leaders for OLAP products: Hyperion Solutions and
Microsoft. As the following trends indicate, competition among vendors has
been keen in the recent past. For our project we used Microsoft SQL Server
and Microsoft Analysis Services to realize the OLAP architecture.

Figure 2: The market analysis

Our Approach

A Star Schema database design is implemented on a relational database
to build the data warehouse. The star schema includes a fact table and
about five dimension tables, with several members in each dimension
table. Extension to a snowflake or constellation schema is outside the
scope of this project. Relational databases such as SQL Server 2000,
Informix and Oracle were considered for the data warehouse. SQL Server
2000 was used due to financial constraints.

About 100 megabytes of simulated data was generated and loaded into
the star schema database.

A data mart with several cubes was implemented on an OLAP server. The
OLAP server software considered was Hyperion Essbase and SQL Server
Analysis Services. SQL Server Analysis Services was used.

The data mart is business-rule specific, with several cubes designed from
2-5 dimensions and with drill-down capabilities to several levels of
members. The cubes satisfy analytical and complex metrics calculation
requirements that are otherwise tedious and low performing in a
traditional SQL-based relational database, thus showing the power of an
OLAP approach.

Front-end reporting was implemented to show both summarized and, if
needed, drill-down analysis. The reports could have been implemented
using Hyperion Analyzer views presented via a web browser, Microsoft
Excel or Crystal Reports. Microsoft Excel was used with pivot tables.
Reports include product analysis, customer analysis, pipeline analysis,
partner analysis, platform analysis, performance analysis and loss
analysis.

The system architecture and data flow of our system are shown in fig 3 below.

Figure 3- System architecture of our analytics system

It should be noted that we have not implemented the first two stages shown
above: namely “Source data” and “Data Extraction”. It was assumed that the
Data-warehouse had already been populated with relevant data. This was done
by generating dummy data for simulation purposes.
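
Simulated data of this kind can be produced with a small generator. The following Python sketch is an illustration only: the report does not describe the generator that was actually used, and the field names, product list, unit prices and value ranges here are all hypothetical. A fixed seed makes the generated rows reproducible.

```python
import random

# Hypothetical products and unit prices, chosen only for illustration.
UNIT_PRICES = {"Widget": 10.0, "Gadget": 40.0}
REGIONS = ["East", "West"]
QUARTERS = ["2001-Q4", "2002-Q1"]


def simulate_sales(n_rows, seed=0):
    """Return n_rows simulated fact rows as dictionaries."""
    rng = random.Random(seed)  # private generator, so results are repeatable
    rows = []
    for _ in range(n_rows):
        product = rng.choice(sorted(UNIT_PRICES))
        quantity = rng.randint(1, 20)  # inclusive on both ends
        rows.append({
            "product": product,
            "region": rng.choice(REGIONS),
            "quarter": rng.choice(QUARTERS),
            "quantity": quantity,
            "revenue": quantity * UNIT_PRICES[product],
        })
    return rows


if __name__ == "__main__":
    for row in simulate_sales(3):
        print(row)
```

Rows in this shape could then be bulk-loaded into the fact table of the warehouse.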

Data-Warehouse: The Star Schema

The data warehouse is implemented with a Star Schema. A Star Schema is a
special form of relational database design that is specifically geared towards
the OLAP architecture. It should be noted that the hub-and-spoke architecture
is a key feature of the two-tiered analytics system.

We used Microsoft SQL Server to implement the Star Schema. The main table
within this design is called the fact table. The dimension tables are
connected to the fact table through foreign keys. The following figure shows
a schematic representation of the Star Schema.

Figure 4: The Star Schema
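
The fact-table and foreign-key layout described above can be sketched in a few lines. The example below is a minimal illustration that uses Python's built-in sqlite3 module in place of Microsoft SQL Server. The table names, column names and sample rows are assumptions made for the example, not the schema used in the project.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table that references four dimension tables."""
    conn.executescript("""
        CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_address  (address_id  INTEGER PRIMARY KEY, city TEXT, region TEXT);
        CREATE TABLE dim_salesrep (salesrep_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_time     (time_id     INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
        CREATE TABLE fact_sales (
            product_id  INTEGER REFERENCES dim_product(product_id),
            address_id  INTEGER REFERENCES dim_address(address_id),
            salesrep_id INTEGER REFERENCES dim_salesrep(salesrep_id),
            time_id     INTEGER REFERENCES dim_time(time_id),
            quantity    INTEGER,
            revenue     REAL
        );
    """)


def load_sample_rows(conn):
    """Insert a handful of made-up rows so the query below has data."""
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
    conn.executemany("INSERT INTO dim_address VALUES (?, ?, ?)",
                     [(1, "City A", "East"), (2, "City B", "West")])
    conn.executemany("INSERT INTO dim_salesrep VALUES (?, ?)", [(1, "Rep A")])
    conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                     [(1, 2001, 4), (2, 2002, 1)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?)",
                     [(1, 1, 1, 1, 10, 100.0),
                      (1, 2, 1, 2, 5, 50.0),
                      (2, 1, 1, 2, 2, 80.0)])


def totals_by_product(conn):
    """Join the fact table to one dimension and sum both measures."""
    return conn.execute("""
        SELECT p.name, SUM(f.quantity), SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.name
        ORDER BY p.name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load_sample_rows(conn)
print(totals_by_product(conn))
```

Every analytical question against this layout takes the same form: join the fact table to the dimensions of interest and aggregate the measures.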

Data-Marts: The Cubes

As we have explained, the second important element of the OLAP architecture is
the Data Cube (the “spokes”). Data Cubes allow us to process the data stored
in the Star Schema.

A Data Cube has two main elements: measures and dimensions. Measures are the
values used for the analysis of the data, while dimensions represent the
different data items that can be analyzed at the same time in order to
determine data correlations. For example, a Data Cube would have a sales
dimension and a product dimension in order to determine the revenues generated
by each product. It should be noted that a Data Cube can have a maximum of 64
dimensions. There are no constraints on the number of measures each cube can
have. The following diagram shows an instance of a Data Cube employed in our
OLAP architecture. It is used to analyze and drill down into the Sales data.

Figure 5: The Sales Cube

The above is an example of a Data Cube. Data Cubes enable speed-of-thought
analysis of very large quantities of data. The above Sales Cube has four
dimensions: addresses, products, salesrep and time. It also has two measures
that enable the data analysis: namely quantity and revenue.
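
To make the relationship between dimensions and measures concrete, the following Python sketch aggregates the two measures over any chosen subset of the four dimensions. It is a conceptual illustration only and does not reflect how an OLAP server stores or precomputes cubes. The sample fact rows and the `roll_up` helper are hypothetical.

```python
from collections import defaultdict

DIMENSIONS = ("address", "product", "salesrep", "time")

# Made-up fact rows: one value per dimension plus the two measures.
FACTS = [
    {"address": "East", "product": "Widget", "salesrep": "Rep A",
     "time": "2001-Q4", "quantity": 10, "revenue": 100.0},
    {"address": "West", "product": "Widget", "salesrep": "Rep A",
     "time": "2002-Q1", "quantity": 5, "revenue": 50.0},
    {"address": "East", "product": "Gadget", "salesrep": "Rep A",
     "time": "2002-Q1", "quantity": 2, "revenue": 80.0},
]


def roll_up(facts, keep):
    """Sum quantity and revenue, grouping only by the dimensions in `keep`.

    Dimensions left out of `keep` are aggregated away. Passing more
    dimensions gives a finer-grained (drilled-down) result.
    """
    unknown = [d for d in keep if d not in DIMENSIONS]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    totals = defaultdict(lambda: [0, 0.0])
    for fact in facts:
        key = tuple(fact[d] for d in keep)
        totals[key][0] += fact["quantity"]
        totals[key][1] += fact["revenue"]
    return {key: tuple(values) for key, values in totals.items()}


if __name__ == "__main__":
    print(roll_up(FACTS, ("product",)))          # totals per product
    print(roll_up(FACTS, ("product", "time")))   # drill down by time
    print(roll_up(FACTS, ()))                    # grand total
```

Calling `roll_up` with a longer `keep` tuple corresponds to drilling down, and with a shorter one to rolling up.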

Client Applications

The data analyzed through Cubes could be viewed using a range of client
applications. Hyperion Analyzer, Microsoft Excel and Crystal Reports are several
such tools that could assist the management by providing valuable information
generated by Data Cubes. We used Excel pivot table services to present data for
management usage. The following procedure was used to employ Excel pivot
tables to connect to the Cubes and present the data.

1. Select the PivotTable and PivotChart Report… sub-menu under the Data
menu as shown in figure 6. This will launch PivotTable and PivotChart
wizard.

Figure 6

2. On the first step of PivotTable and PivotChart wizard dialog box as shown
in figure 7, select External data source and either the PivotTable or
PivotChart (with PivotTable) option. For this context, suppose that
PivotChart (with PivotTable) was selected. Then press the Next > button.

Figure 7

3. The next step of the wizard shows the dialog box as shown in figure 8.
Simply press the Get Data…button.

Figure 8

4. The next dialog box asks for the source of the data as shown in figure 9.
Choose the OLAP Cubes tab. In this tab, the OLAP cube can be selected
as a data source. In figure 9, four data sources have already been
created.

Figure 9

Now, select <New Data Source> and double-click or press the OK button.
The next dialog box that appears is shown in figure 10.

5. This dialog box lets the user name the data source and connect to the
OLAP server (Hyperion Essbase, Microsoft SQL Server with Analysis Manager,
etc.). Enter a name for the data source (here, for instance,
MY_DATA_SOURCE). Then select the Microsoft OLE DB Provider for OLAP
Services, the OLAP connectivity layer needed for the client-server
connection, and press the Connect… button.

Figure 10

6. After the Connect… button has been pressed, the following dialog box
appears.

Figure 11

Choose the OLAP Server radio button and provide the IP address of the OLAP
server.
7. After pressing the Next > button, when the connection to the OLAP server
has been made, a dialog box as shown in figure 12 appears prompting the
user to select the appropriate data mart that may house any desired
number of data cubes. In this case, as shown in figure 12, the data mart
named OLAPMart has been selected. Now in the next step, cubes that
have been constructed within this data mart can be chosen for analysis.

Figure 12

8. Selecting the appropriate data mart and pressing the Finish button
displays the dialog box shown in figure 10 again, but this time with the
cube selection drop-down list activated, as shown in figure 13. The cubes
that have been created in this data mart appear in the drop-down list,
and the desired cube can be chosen for analysis.

Figure 13

9. Suppose that the ‘sales’ cube was selected from the drop-down list. Then
proceeding further, the following dialog box appears.

Figure 14

After this point, the user can either continue with the wizard to organize
the members of the various dimensions for tabulation and plotting, or
press the Finish button, which creates the chart area in a new sheet, as
shown in figure 15. Suppose the Finish button was pressed.

Figure 15

10. This is a standard Excel chart layout. Note the PivotTable toolbar.
Dimensions can be picked from the toolbar and dropped as an X-axis value
(Category Fields), a Y-axis value (Series Fields) or a filter for a given
range (Page Fields) to create the desired chart. The data used in
plotting the chart is also tabulated in another sheet in the same Excel
workbook. Figure 17 illustrates a completed graph showing the quantity of
products under the different categories (as Category fields) sold by region
(as Series fields) for the given time (Page field). Note that each of the
fields can be drilled down further (as indicated by the drop-down list) to
the desired value at run time. The data change takes place at the speed of
thought and is reflected on the graph. Similarly, figure 16 shows the
tabulated data for the graph.

Figure 16

Figure 17

This completes the process of creating a table or a chart out of
OLAP data. This method is much faster and more efficient than writing SQL
queries to fetch data then having to tabularize them and plot them again.
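
The roles of the Category, Series and Page fields can be illustrated with a short cross-tabulation routine. The Python sketch below is only a conceptual analogue of what the PivotTable does. It is not Excel's implementation, and the `pivot` function, its parameters and the sample rows are hypothetical.

```python
from collections import defaultdict

# Made-up rows standing in for data retrieved from the sales cube.
SALES = [
    {"category": "Hardware", "region": "East", "year": 2002, "quantity": 10},
    {"category": "Hardware", "region": "West", "year": 2002, "quantity": 5},
    {"category": "Software", "region": "East", "year": 2002, "quantity": 2},
    {"category": "Hardware", "region": "East", "year": 2001, "quantity": 7},
]


def pivot(rows, category, series, measure, page=None):
    """Cross-tabulate `measure` by a category field and a series field.

    `page` is an optional (field, value) pair that filters the rows
    first, playing the part of a Page field.
    """
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        if page is not None and row[page[0]] != page[1]:
            continue  # row is outside the selected page
        table[row[category]][row[series]] += row[measure]
    return {cat: dict(columns) for cat, columns in table.items()}


if __name__ == "__main__":
    # Quantity by category (rows) and region (columns) for one year.
    print(pivot(SALES, "category", "region", "quantity", page=("year", 2002)))
```

Changing the `page` argument corresponds to picking a different value from the Page field drop-down list: the same rows are re-aggregated without writing a new query.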

Different front-end software utilities can be applied to process and analyze
the cube data. Standard APIs can be used from languages such as Visual Basic
and Visual C++ to create application-specific software for the desired
analysis of a cube. Alternatively, existing applications such as Microsoft
Excel or Analysis Manager within the Microsoft SQL Server package can be
used to analyze the data. The PivotTable and PivotChart Reports within
Microsoft Excel provide an excellent example of how data within a cube can
be analyzed with speed-of-thought queries. They are simple to implement,
versatile, and can be completed in less time than the equivalent analysis
through SQL queries.

Conclusion

Data warehousing is a best-in-class approach to leveraging corporate
information. However, meaningful business analysis requires a multidimensional
view of data, which has proven inefficient and cumbersome to express using
relational databases. Our analytics solution combines application-neutral data
warehouses with application-specific OLAP data marts to meet the needs of both
IT professionals and the various kinds of end users throughout the enterprise.
With our solution, they will be able to extend their decision-support systems
by moving beyond a historical focus to proactively chart the future direction
of the business.

References

Vieira, R. “Professional SQL Server 7 Programming”, Canada, Wrox, 1999

Hyperion Solutions Corporation white paper “Large-scale data warehousing
using Hyperion Essbase OLAP technology”, January 2000- www.hyperion.com

Hyperion Solutions Corporation white paper “The role of the OLAP server in
a data warehousing solution”, February 2001- www.hyperion.com

Hyperion Solutions Corporation white paper “Hyperion Essbase data mart
design approaches”, November 2000- www.hyperion.com

Hyperion Solutions Corporation white paper “Analytical Processing: A
comparison of multidimensional and SQL-based approaches”, April 2001-
www.hyperion.com

Codd E.F, Codd S.B, Salley C.T “Providing OLAP to user-Analysts: An IT
mandate”, E.F. Codd Associates

Dataquest/Gartner Group “DW/BI decision makers survey results”, February
2000- http://www4.gartner.com/Init
