
Data Warehouse Technology: Support of MIS/DISS System

Building a Data Warehouse at the Housing and Development Board - Singapore
(A Study Report)

By

AJI RAJ
(Student No. 0116904)
MBA Full-Time

Assignment II
On
MANAGEMENT INFORMATION SYSTEM

Submitted on 22-01-2002

University of East London


EAST LONDON BUSINESS SCHOOL

Building a Data Warehouse at the Housing and Development Board - Singapore
(A Study Report)

Abstract:

Data warehousing is the technological trend for the corporate decision support
process. This study, based on the development of a data warehouse at the
Housing and Development Board in Singapore, investigates the current business
environment of the data warehouse, including OLAP, data mining, data
visualisation and other technologies, and its future developments.

Keywords:

Operational System, Database Management, Data Processing, Decision-Support
System, Administration

1 Introduction
Conventional database applications have been designed to handle high transaction
throughput. Such applications are frequently called on-line transaction processing
(OLTP) applications. The data available in such applications is important for
running the day-to-day operations of an organisation. The data is also likely to
be managed by a relational or post-relational DBMS.

Contemporary organisations also need access to historical, summary data and to
data from sources other than those available through a DBMS. For this
purpose, the concept of a data warehouse has been created. The data warehouse
requires extensions to conventional database technology and also a range of
application tools for on-line analytical processing (OLAP) and data mining.

Decision-makers need concise, reliable information about current operations,
trends, and changes. What has been immediately available at most firms is
current data only (historical data were available through IS reports that took a
long time to produce). Data are often fragmented in separate operational
systems such as sales or payroll, so that different managers make decisions from
incomplete knowledge bases. Users and information system specialists may have
to spend inordinate amounts of time locating and gathering data (Watson & Haley
1998). Data warehousing addresses this problem by integrating key operational
data from around the company in a form that is consistent, reliable, and easily
available for reporting.

2 What is a Data Warehouse?


A data warehouse is a database, with reporting and query tools, that stores
current and historical data extracted from various operational systems and
consolidated for management reporting and analysis (Laudon & Laudon, Sixth
Edition).

A data warehouse is a type of contemporary database system designed to fulfil
decision support needs. However, a data warehouse differs from a conventional
decision-support database in a number of ways:


• Volume of data: A data warehouse is likely to hold far more data
than a decision-support database. Volumes of the order of over 400
GB of data are commonplace.

• Diverse data sources: The data stored in a warehouse is likely to have
been extracted from a diverse range of application systems, only some of
which may be database systems. These systems are described as data
sources.

• Dimensional access: A warehouse is designed to support a number of
distinct ways (dimensions) in which users may wish to retrieve data.
This is sometimes referred to as the need to facilitate ad hoc query.

Inmon (1993) defines a data warehouse as being a subject-oriented, integrated,
time-variant, non-volatile collection of historical data used in support of
management decision making.

• Subject-oriented: A data warehouse is structured in terms of the major subject
areas of the organisation, such as, in the case of a university, students,
lecturers and modules, rather than in terms of application areas such as
environment, payroll and timetabling.

• Integrated: A data warehouse provides a data repository which integrates
data from different systems. Source data are frequently in different formats
and are stored in the warehouse in a globally accepted manner with regard to
naming conventions, encoding structures, and physical attributes of the data.
The objective is to provide a unified view of data for users.

• Time-variant: A data warehouse explicitly associates time with data. Data in a
warehouse is only valid for some point or period in time. The operational data
are updated as conditions change, creating a series of snapshots of information
that relate to particular points in time.

• Non-volatile: The data in a data warehouse is not updated in real time.
Instead, it is refreshed from data in operational systems on a regular basis. A
consequence of this is that the management of data integrity is not a critical
issue for a data warehouse.
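
The time-variant and non-volatile properties can be illustrated with a minimal
sketch, assuming a periodic refresh that appends dated snapshots and never
updates earlier ones. The table name flat_status_snapshot, its columns and the
sample rows are hypothetical and are not taken from any real warehouse.

```python
import sqlite3
from datetime import date


def load_snapshot(conn, snapshot_date, rows):
    """Append one dated snapshot; rows from earlier snapshots are never changed."""
    conn.executemany(
        "INSERT INTO flat_status_snapshot (snapshot_date, flat_id, status) "
        "VALUES (?, ?, ?)",
        [(snapshot_date.isoformat(), flat_id, status) for flat_id, status in rows],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
# Hypothetical warehouse table: every row carries the date it was valid for.
conn.execute(
    "CREATE TABLE flat_status_snapshot ("
    "snapshot_date TEXT NOT NULL, flat_id INTEGER NOT NULL, "
    "status TEXT NOT NULL, PRIMARY KEY (snapshot_date, flat_id))"
)

# Two periodic refreshes from an (imagined) operational system.
load_snapshot(conn, date(1995, 7, 31), [(1, "rented"), (2, "vacant")])
load_snapshot(conn, date(1995, 8, 31), [(1, "rented"), (2, "sold")])

# The history of flat 2 is preserved, so an analysis can be repeated later.
history = conn.execute(
    "SELECT snapshot_date, status FROM flat_status_snapshot "
    "WHERE flat_id = 2 ORDER BY snapshot_date"
).fetchall()
print(history)
```

Because each refresh only inserts rows keyed by their snapshot date, a query
against an earlier date returns the same answer however the operational data
have changed since.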

Data warehouse data differ from operational data in a number of ways. The
table below shows the differences:

Operational Data | Data Warehouse Data
Isolated data stored in and used by isolated legacy systems | Enterprise-wide integrated data collected from legacy systems
Contains current operational data | Contains recent data as well as historical data
Data are stored on multiple platforms | Data are stored on a single platform
Individual fields (such as customer number) may be inconsistent across the enterprise | A single, agreed-upon definition exists for every field stored in the system
Data are organised from an operational or functional view, such as sales, production, purchasing, payroll, order processing | Data are organised around major business informational subjects, such as customer or product
Data are volatile to support operations within a company | Data are stabilised for decision making

3 Organisational Context
This case study examines the development of a data warehouse at the Housing
and Development Board (HDB), a statutory board in Singapore established in
1960. This data warehouse is perhaps the first to be developed by a statutory
board. Data were gathered through interviews with the relevant persons in HDB
as well as through in-house materials about the data-warehousing project.
Comparisons between operational systems and the data warehouse are
presented, and insights gained from the development experience are discussed.

4 Background of the Information Services Department at HDB
The Housing and Development Board (HDB) was established to provide
affordable, high quality public housing and to build cohesive communities among
the multiracial population in Singapore. Over the past 35 years, HDB has built
about 800,000 flats (apartments) housing about 86 percent of the population.
HDB expanded its business activities from the sales of new flats to the resale of
old flats, rental of flats, commercial properties, renewal, upgrading of old estates,
car park management, and land management.

A computer service department (CSD) was established by the HDB in 1979 to
support other departments, as part of the government's civil service
computerisation policy. In 1991, CSD was renamed the Information Service
Department (ISD), and Mr. Alex Siow took charge as the new Chief
Information Officer (CIO). He took various steps to co-ordinate and
provide effective support to HDB's business activities.

In 1993, HDB produced management strategies and established guidelines for
HDB's information technology infrastructure to facilitate easy access to
information for HDB employees as well as to improve services to customers.
ISD now has 300 staff members who provide computer services and support
to over 800 staff members at HDB.

4.1 The need for data warehouse at HDB

With the existing computer set-up, the staff of the organisation found it
difficult to retrieve relevant, potentially valuable data from the large volume
of data accumulated over the years. Increasing ad hoc requests for data
from different departments created many problems, and at the same time the
existing operational systems failed to support management decision making.
These problems are:
1. Lack of integration: Users often have to access various operational
systems in the HDB in order to get the information they need. The
reason for this is that these operational systems have evolved since
the early 1980s and data captured within such systems were often
not integrated.
2. Lack of history: Users often need to obtain historical data in order to
analyse certain trends pertaining to public housing. Such historical
information is often either unavailable or difficult to obtain from
operational systems.
3. Lack of credibility: Since operational data change whenever
conditions change, repeat analysis was often difficult or not possible
at all.
4. Performance considerations: Users performing on-line queries on
operational systems often cause an uneven and unpredictable load
on the systems since user queries are usually ad hoc. This adversely
affects the performance of the operational system, which in turn
affects services to customers.


To solve these problems, a data warehouse was conceptualised in late 1993;
development was started in June 1994 and the data warehouse (called the
Information Centre Database or ICDB) was completed in August 1995.
4.2 Steps in building a Data warehouse

The key steps involved in a data warehousing project are outlined below (Inmon,
Welch et al. 1997):

1. Users specify information needs.
2. Analysts and users create a logical and physical design.
3. Sources of data are identified in operational systems, external
sources etc.
4. Source data is scrubbed, extracted and transformed.
5. Data is transferred and loaded into the warehouse periodically.
6. Users are given access to the warehouse data.
7. The warehouse is maintained in terms of changing requirements.

The first two stages are the same as in conventional database development.
Identifying and managing data sources may also be a key activity of the data
administration function.
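
Steps 3 to 5 (identify sources, scrub and transform, load) can be sketched as
follows. This is only an illustration under stated assumptions: the two source
extracts, their field names, the cleaning rules and the payments table are all
hypothetical and do not describe the ICDB.

```python
import sqlite3

# Hypothetical extracts from two operational systems that name and format
# the same customer number differently.
sales_rows = [{"cust_no": "C001", "amount": "120000"}]
rental_rows = [
    {"customer": " c001 ", "rent": "450.50"},
    {"customer": "c002", "rent": "n/a"},  # unusable amount, will be rejected
]


def transform(rows, id_key, amount_key, source):
    """Scrub each source row and convert it to the agreed warehouse format."""
    clean = []
    for row in rows:
        customer_id = row[id_key].strip().upper()  # one agreed key format
        try:
            amount = float(row[amount_key])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        if customer_id:
            clean.append((customer_id, source, amount))
    return clean


def load(conn, records):
    """Append the transformed records to the warehouse table."""
    conn.executemany(
        "INSERT INTO payments (customer_id, source_system, amount) "
        "VALUES (?, ?, ?)",
        records,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (customer_id TEXT, source_system TEXT, amount REAL)"
)
load(conn, transform(sales_rows, "cust_no", "amount", "sales"))
load(conn, transform(rental_rows, "customer", "rent", "rental"))

loaded = conn.execute(
    "SELECT customer_id, source_system, amount FROM payments "
    "ORDER BY source_system"
).fetchall()
print(loaded)
```

In practice the load would run periodically, and rejected rows would normally
be logged for the data owners rather than silently dropped.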

5 Technological analysis

The success of data warehousing depends on its use of on-line information
retrieval, artificial intelligence and graphical user interface tools. While data
warehousing focuses on the gathering, cleaning and storing of large volumes of
information, the on-line analytical processing (OLAP) tools provide the means
needed to manipulate and analyse the information. Approaches to artificial
intelligence develop and refine new insights into the collected data.
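
As a rough illustration of the kind of manipulation an OLAP tool performs, the
sketch below rolls a small set of facts up along chosen dimensions. The
function roll_up, the dimension names and the figures are invented for the
example; real OLAP tools provide far richer operations.

```python
from collections import defaultdict

# Hypothetical facts: units sold by year, town and flat type.
facts = [
    {"year": 1994, "town": "Town A", "flat_type": "3-room", "units": 100},
    {"year": 1994, "town": "Town A", "flat_type": "4-room", "units": 150},
    {"year": 1994, "town": "Town B", "flat_type": "3-room", "units": 80},
    {"year": 1995, "town": "Town A", "flat_type": "3-room", "units": 120},
    {"year": 1995, "town": "Town B", "flat_type": "4-room", "units": 90},
]


def roll_up(rows, dimensions, measure):
    """Sum the measure over every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# The same facts viewed at two levels of detail.
by_year = roll_up(facts, ["year"], "units")
by_year_town = roll_up(facts, ["year", "town"], "units")
print(by_year)
print(by_year_town)
```

Choosing fewer dimensions gives a more summarised view (roll-up); adding
dimensions gives a more detailed one (drill-down).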

5.1 Technology behind a Data Warehouse

A Data Warehouse extracts current and historical data from operational systems
inside the organisation. These data are combined with data from external sources
and reorganised into a central database designed for management reporting and
analysis. The information directory provides users with information about the
data available in the warehouse (see Exhibit-I).

5.2 Components of a Data Warehouse


Exhibit-II illustrates the major components of a data warehouse:

1. Production data: Data for the warehouse may be sourced in a number of ways,
e.g. from mainframe-based hierarchical or network databases, from relational
databases and from data in proprietary file systems.
2. Extraction, transformation and loading functions: These operations are
concerned with extracting data from source systems, transforming it into a
suitable form and loading the transformed data into the data warehouse.
3. Warehouse management: A series of functions must be provided to manage
the warehouse: consistency analysis, indexing, de-normalisation, aggregation,
back-up and archiving.
4. Query management: The warehouse must perform a series of operations
concerned with the management of queries for use by a variety of actors:
reporting and query (data visualisation) tools, OLAP tools and tools for data
mining.

5.3 Designing the Data warehouse Schema


Designing a schema for a data warehousing application is a specialist case of
database design. Two issues, the large volume of data and the achievement of
satisfactory levels of retrieval performance, assume particular importance in a
data warehouse. Providing satisfactory levels of performance frequently means
designing specialised physical schemas for warehousing applications. Three
generic designs of warehousing schemas are proposed by Anahory and Murray
(1997):

• STAR SCHEMAS
• SNOWFLAKE SCHEMAS
• STARFLAKE SCHEMAS
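
A star schema places one central fact table at the hub, with foreign keys to
a set of dimension tables. The sketch below shows a minimal example; the table
names (fact_sales, dim_time, dim_estate), columns and figures are hypothetical
and are not drawn from Anahory and Murray or from the HDB warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one row per member of each dimension.
    CREATE TABLE dim_time (
        time_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER
    );
    CREATE TABLE dim_estate (
        estate_id INTEGER PRIMARY KEY, estate_name TEXT
    );
    -- Fact table: measures plus a foreign key to every dimension.
    CREATE TABLE fact_sales (
        time_id INTEGER REFERENCES dim_time(time_id),
        estate_id INTEGER REFERENCES dim_estate(estate_id),
        units_sold INTEGER
    );
    INSERT INTO dim_time VALUES (1, 1995, 1), (2, 1995, 2);
    INSERT INTO dim_estate VALUES (1, 'Estate A'), (2, 'Estate B');
    INSERT INTO fact_sales VALUES (1, 1, 40), (1, 2, 25), (2, 1, 35);
    """
)

# A typical dimensional query: join the fact table to its dimensions
# and aggregate the measure.
units_by_estate = conn.execute(
    """
    SELECT e.estate_name, t.year, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_id = f.time_id
    JOIN dim_estate AS e ON e.estate_id = f.estate_id
    GROUP BY e.estate_name, t.year
    ORDER BY e.estate_name
    """
).fetchall()
print(units_by_estate)
```

In a snowflake schema the dimension tables would themselves be normalised into
further tables (for example, estates linked to a separate region table), and a
starflake schema mixes the two approaches.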

6 The Benefits of Data Warehousing


A data warehouse is seen to deliver three major benefits for organisations:

• A data warehouse provides a single manageable structure for
decision-support data.
• A data warehouse enables organisational users to run complex
queries on data that traverses a number of business areas.
• A data warehouse enables a number of business intelligence
applications such as on-line analytical processing and data mining.

The overall objective for a data warehouse is to increase the productivity and
effectiveness of decision making in organisations. This, in turn, is expected to
deliver competitive advantage to organisations.

6.1 Impact of Data warehousing at HDB

Discover Discrepancies among Operational Data:


The data warehouse project brought together users from various departments
who worked together to resolve discrepancies in their data.

More Users Are Able to Extract Data Themselves:


With the use of user-friendly data retrieval tools for the data warehouse, more
users are able to extract data themselves. In fact, 90 percent of users use pre-
canned queries and reports while the remaining 10 percent code their own
queries.

Less Time in Gathering Information:


By facilitating information gathering and reducing the need for data reconciliation,
the data warehouse reduces the time spent for such activities, thereby enabling
more time to be spent on analysis and decision making. Alternatively,
management reports can be prepared in a shorter time, say, one day instead of
three days.

Better What-If Analysis:


With the existence of the data warehouse, interactive analysis is possible since
the data are located centrally rather than in different operational systems. This
facilitates better what-if analysis compared to that based on hard copies.

Stabilise Requests for Ad Hoc Reports:


With the development of the data warehouse, the increase in the number of
requests for ad hoc reports has stabilised since users can now generate the
reports themselves.


Planners' Needs Are Better Met:


With the data warehouse, planners now know what data are available for them
to use.

More Detailed Analysis:


Before the data warehouse was developed, planners needed to access different
operational systems to carry out data analysis. This was both cumbersome and
time-consuming, thereby often restricting the type of analysis that could be
made. With the data warehouse, planners can now easily perform different types
of analysis.

More Efficient Use of Operational Systems:


By creating the data warehouse as a middle layer, disruptions to operational
systems are minimised and the loads on such operational systems become more
predictable. This is because access to operational systems for data analysis
usually creates an uneven load on the system and therefore interferes with its
operational efficiency.

7 Challenges of Data Warehousing


Data warehousing projects are large-scale development projects. A
data-warehousing project may typically take on the order of three years. Some of the
challenges experienced in such projects are:

• Knowing in advance the data users require and determining the
ownership and responsibilities in terms of data sources.
• Selecting, installing and integrating the different hardware and
software required to set up the warehouse. The large volume of data
held in a data warehouse requires large amounts of disk
space. This means that estimation of storage volume is a significant
activity.
• Identifying, reconciling and cleaning existing production data and
loading it into the warehouse. The diverse sources of data feeding a
data warehouse introduce design problems in terms of creating a
homogeneous data store. Problems are also introduced in terms of
the effort required to extract, clean and load data into the warehouse.

8 Future Research and developments

Data warehousing has been around for ten years. In spite of this,
architectures, methodologies and tools for data warehouse implementation are
still being developed. The following are some of the directions for the near- to
medium-term future of the data warehouse: single information source, distributed
information availability, automated information delivery, and information quality
and ownership.
8.1 Single information source
The key characteristic of the future data warehouse is its universality in the
enterprise. The business data warehouse will become an ultimate source for all
information because of the following potential developments:

• The scope of data in the warehouse will be expanded over the next few
years.
• The inter-enterprise data warehouse will span many companies.

• The data modelling requirements and tools used by the data
warehouse are growing.

8.2 Distributed information availability

The end user needs the business information warehouse for independent access
to distributed information. The business information warehouse provides users
with independent data stores designed and structured according to their needs.
However, all data derive from a single, unequivocal business data warehouse. The
business information warehouse also promotes the widespread, distributed
availability of information. Its future directions are:
• Potential impacts of the Internet on data warehousing.
• The implications of mobile computing for distributed data management.

8.3 Automated information delivery

Data replication tools have made major advances over the past few years,
partially due to the expanded functionality and usability of these tools. Vendors
will concentrate on additional source/target combinations, transformation
functions and providing updates from previously unsupported sources. However,
further fundamental research in this and related areas is needed in order to solve
some of the underlying problems of the complexity of reconciliation in the
business data warehouse population (Widom, 1995).
8.4 Information quality and ownership
A data warehouse should allow end users to feel confident in the quality of the
information they use and to take ownership of rightful business information.
However, data quality is mixed and businesses rarely take information ownership
seriously. Therefore, every effort should be made to contribute to these goals.

9 Conclusion
In building the data warehouse, the HDB has adopted an incremental approach
and is currently in the process of including more data from other aspects of its
business into the data warehouse. The HDB will continue to refine the data
warehouse to provide better management and decision support and is currently
looking into the feasibility of employing data mining technologies to make the
data warehouse even more useful to decision makers.

Information is pivotal in today's business environment. Data warehousing transforms data into
information in a consistent and intelligent manner across the organisation. The data
warehouse has emerged as the leading way to implement high-performance decision support
systems for large-scale environments. It is valued as a significant shared asset of the
enterprise. From the business perspective, data warehousing can be the basis of reinventing
the business to achieve competitive advantages. A bright future for it is foreseeable.

References:

1. Hurwitz, J. (1995). "A Pragmatic Approach to Data Warehousing," DBMS,
October, pp. 12, 27.
2. Inmon, W.H. (1997). "Does Your Datamart Vendor Care about Your
Architecture?," Datamation, March, pp. 105-107.
3. McElreath, J. (1996). "An Architectural Perspective of Data Warehouses,"
Information Strategy: The Executive's Journal, Vol. 12, No. 4, pp. 30-41.
4. McKnight, W. (1999). "Managing the People Issues of the Data Warehouse:
The 10 Most Challenging Data Warehouse People Issues," Journal of Data
Warehousing, Vol. 4, No. 1, Spring, pp. 14-18.
5. Menon, S., and Sharda, R. (1999). "Digging Deeper," OR/MS Today, Vol. 26,
No. 3, June, pp. 26-29.
6. Morrill, J.M. (1996). "It's the Data Stupid!," Journal of Systems
Management, Vol. 47, No. 4, p. 41.
7. Poe, V. (1994). "Clear, Careful, and Realistic: Guidelines for Warehouse
Development," Database Programming and Design, September, pp. 60-64.
8. Radding, A. (1995). "Support Decision Makers with a Data Warehouse,"
Datamation, Vol. 41, No. 5, March 15, pp. 53-56.
9. Rautenstrauch, C. (1998). "Modeling and Implementation of Data
Warehouse Systems," in M. Khosrowpour (Ed), Effective Utilization and
Management of Emerging Information Technologies, 1998 Information
Resources Management Association International Conference, Boston, MA,
May 17-20, Hershey, PA: Idea Group Publishing, pp. 325-333.
10. Scott, "Management Information Systems", Third Edition, Eastern
Economy Edition.
11. K C Laudon & J P Laudon, "Management Information Systems", Sixth
Edition, Prentice Hall International Edition, UK.
12. K C Laudon & J P Laudon, "Management Information Systems", Second
Edition, Prentice Hall International Edition, UK.
13. C J Date, "Data Base Management Systems", Eastern Economy Edition
(1991).


EXHIBIT - I

The Technical Components of a Data Warehouse

[Figure: Internal data sources (operational data, historical data) and external
data sources (external data) feed an Extract and Transform step that loads the
Data Warehouse. An Information Directory describes the warehouse contents.
Data Access and Analysis draws on the warehouse through queries and reports,
OLAP, and data mining.]
