
SCIENCE OF DATA,

ANALYTICS AND
REPORTING

This document provides a comprehensive overview of data analytics and
reporting as part of Business Intelligence (BI).
iSDP Data
Analytics and
Reporting



Gutsy Innovation
#201, BVS Towers, Bilekahalli, Bannerghatta Road, Bangalore 5600076, India, Phone: +91-90192 58582,
Email: info@gutsyinnovation.com, Website: www.gutsyinnovation.com
Science of Data
Table of Contents
What is Data Literacy?
What is Data?
Where do you find data?
Forms of data
   1. Structured Machine Readable
   2. Unstructured - Readable by Human
Types of Structured Data
   1. Quantitative Data
   2. Qualitative Data
Types of Quantitative Data
   1. Discrete Data
   2. Continuous Data
What is Database?
What is a Database Management System (DBMS)?
Types of DBMS
Well-known DBMS
SQL
What is Data Analysis?
Eight Levels of Analytics
Business Intelligence
BI Applications
Types of Business Intelligence Tools
Well-known BI Tools
Role of Analyst in BI
Required Technical Knowledge/Skills




What is Data Literacy?
Data literacy is the ability to:
- Formulate and answer questions using data as part of evidence-based thinking;
- Use appropriate data, tools, and representations to support this thinking;
- Interpret information from data, and develop and evaluate data-based inferences and explanations;
- Use data to solve real problems and communicate the solutions.

What is Data?
Data can be defined as a representation of facts, concepts, or instructions in a formalized manner
suitable for communication, interpretation, or processing by humans or electronic machines.
Data is represented with characters such as letters (A-Z, a-z), digits (0-9), and special
characters (+, -, /, *, <, >, =, etc.).

Where do you find data?
Data is found everywhere. For a machine to read it, data needs to be collected and stored
digitally in an organized, structured format.
Today, business data is collected by the various enterprise software applications that
industries use. The data in these applications is growing rapidly and holds useful
information that organizations need for decision making.





Data can be collected and stored in structured and unstructured formats such as:
1. Flat files (TXT, CSV)
2. Spreadsheets and documents
3. Databases
4. Image, audio, and video files
Databases are the most organized and secure way of storing data. Most organizations store
their structured data in databases.
Forms of data
Data can be categorized into two forms:
1. Structured Machine Readable
Data stored and organized in tabular form (in rows and columns) is known as structured data.
This form of data is machine readable. Structured data typically uses far less storage space
than unstructured data.

Examples of Structured Data
- Databases
- XML data
- Data warehouses
- Enterprise systems (CRM, ERP, etc.)
2. Unstructured - Readable by Human
Data that is not in tabular form, such as text, movies, images, and audio, is termed
unstructured data. Machines cannot readily interpret this type of data without additional processing.
Examples of Unstructured Data
- Excel spreadsheets (non-tabular data)
- Word documents
- Email messages
- RSS feeds
- Audio files
- Video files



Types of Structured Data

The two major categories are qualitative and quantitative data

1. Quantitative Data
Quantitative data is data expressed as a number, e.g. the number of golf balls, the size, the
price, or a score on a test.

It usually refers to the collection and analysis of numerical data that can be put into
categories, placed in rank order, or measured in units of measurement.

2. Qualitative Data
Qualitative data is everything that refers to the quality of something, such as a description
of the color, texture, and feel of an object.

E.g. descriptions of experiences and interviews are qualitative data.

It refers to forms of data collection and analysis that rely on understanding, with an emphasis
on meaning rather than numerical form. It is typically descriptive.

Types of Quantitative Data

Quantitative data can be further classified into:

1. Discrete Data
Discrete data can only take certain values (like whole numbers)

2. Continuous Data
Continuous data can take any value (within a range)

Put simply: discrete data is counted; continuous data is measured.



Example: What do we know about the dog in the picture?


Qualitative:
- He is brown and black
- He has long hair
- He has lots of energy
Quantitative:
- Discrete:
  - He has 4 legs
  - He has 2 brothers
- Continuous:
  - He weighs 25.5 kg
  - He is 565 mm tall
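
The same classification can be sketched in Python. This is only an illustration: the `classify` helper and the rule it applies (text is qualitative, integers are discrete, floats are continuous) are assumptions made for this example, not a general definition.

```python
def classify(value):
    """Label a value using a simplified, illustrative rule."""
    if isinstance(value, str):
        return "qualitative"
    if isinstance(value, int):
        return "quantitative (discrete)"    # counted
    if isinstance(value, float):
        return "quantitative (continuous)"  # measured
    raise TypeError(f"unsupported value: {value!r}")


# Facts about the dog from the example above.
dog = {
    "color": "brown and black",
    "legs": 4,
    "brothers": 2,
    "weight_kg": 25.5,
    "height_mm": 565.0,
}

labels = {name: classify(value) for name, value in dog.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```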
What is Database?
A database is a base for storing and retrieving data.
A database is a systematic collection of data. Databases support the storage and manipulation
of data, and they make data management easy.
Examples:
- An online telephone directory would use a database to store data about people, phone
  numbers, and other contact details.
- Your electricity service provider uses a database to manage billing, client-related
  issues, fault data, and so on.
- Facebook is another example. It needs to store, manipulate, and present data related
  to members, their friends, member activities, messages, advertisements, and a lot more.
What is a Database Management System (DBMS)?
A Database Management System (DBMS) is a collection of programs that enables its users to
access a database, manipulate data, and report on or present data. It also helps control
access to the database.
Database management systems are not a new concept; they were first implemented in the
1960s. Charles Bachman's Integrated Data Store (IDS) is said to be the first DBMS in history.
Over time, database technologies have evolved considerably, while the usage and expected
functionality of databases have increased immensely.



A Database Management System (DBMS) is a combination of computer software, hardware, and
information designed to electronically manipulate data via computer processing.


Types of DBMS
There are four major types of DBMS. Let's look at them in detail.
- Hierarchical: this type of DBMS stores data using parent-child relationships. It is
  rarely used nowadays. Its structure is like a tree, with nodes representing records and
  branches representing fields. The Windows Registry used in Windows XP is an example of a
  hierarchical database: configuration settings are stored as tree structures with nodes.
- Network DBMS: this type of DBMS supports many-to-many relations. This usually
  results in complex database structures. RDM Server is an example of a database
  management system that implements the network model.
- Relational DBMS: this type of DBMS defines database relationships in the form of
  tables, also known as relations. Unlike a network DBMS, an RDBMS does not represent
  many-to-many relationships directly; they are modeled with an intermediate (junction)
  table. Relational DBMSs usually have predefined data types that they can support. This
  is the most popular DBMS type in the market. Examples of relational database management
  systems include MySQL, Oracle, and Microsoft SQL Server.
- Object-oriented relational DBMS: this type supports the storage of new data types. The
  data to be stored is in the form of objects. The objects stored in the database have
  attributes (e.g. gender, age) and methods that define what to do with the data. PostgreSQL
  is an example of an object-oriented relational DBMS.
The most widely used type of DBMS is the relational model (RDBMS), which stores data in tables
with relations built across tables.
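
A minimal sketch of tables with a relation built across them, using Python's built-in `sqlite3` module with an in-memory database. The `customer` and `purchase` tables, their columns, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two tables; purchase.customer_id relates each purchase to one customer.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE purchase ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id),"
    " amount REAL NOT NULL)"
)

conn.executemany(
    "INSERT INTO customer (id, name) VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi")],
)
conn.executemany(
    "INSERT INTO purchase (customer_id, amount) VALUES (?, ?)",
    [(1, 20.0), (1, 5.0), (2, 12.5)],
)

# Follow the relation with a JOIN to total the purchases per customer.
rows = conn.execute(
    "SELECT c.name, SUM(p.amount) FROM customer c"
    " JOIN purchase p ON p.customer_id = c.id"
    " GROUP BY c.id, c.name ORDER BY c.id"
).fetchall()
print(rows)
conn.close()
```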



Well-known DBMS

MySQL, PostgreSQL, SQLite, Microsoft SQL Server, Oracle, SAP, dBase, FoxPro, IBM DB2,
LibreOffice Base and FileMaker Pro
SQL
Structured Query Language (SQL), pronounced S-Q-L or sometimes See-Quel, is
the standard language for working with relational databases. SQL can be
used to insert, search, update, and delete database records.
Relational databases such as MySQL, Oracle, MS SQL Server, and Sybase all use SQL. The SQL
syntax used in these databases is largely the same, although some use slightly
different or even proprietary syntax.
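
The four basic operations can be sketched with Python's built-in `sqlite3` module, which runs SQL against an in-memory SQLite database. The `contact` table and its sample rows are hypothetical; other database systems may need small syntax changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")

# Insert records.
conn.execute("INSERT INTO contact (name, phone) VALUES (?, ?)", ("Asha", "555-0100"))
conn.execute("INSERT INTO contact (name, phone) VALUES (?, ?)", ("Ravi", "555-0101"))

# Search for a record.
found = conn.execute(
    "SELECT name, phone FROM contact WHERE name = ?", ("Asha",)
).fetchall()

# Update a record, then read the new value back.
conn.execute("UPDATE contact SET phone = ? WHERE name = ?", ("555-0199", "Asha"))
updated = conn.execute(
    "SELECT phone FROM contact WHERE name = ?", ("Asha",)
).fetchone()[0]

# Delete a record and count what remains.
conn.execute("DELETE FROM contact WHERE name = ?", ("Ravi",))
remaining = conn.execute("SELECT COUNT(*) FROM contact").fetchone()[0]

print(found, updated, remaining)
conn.close()
```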
There are extensions to Standard SQL which add procedural programming language
functionality, such as control-of-flow constructs. These include:
Source                Common name   Full name
ANSI/ISO Standard     SQL/PSM       SQL/Persistent Stored Modules
Interbase / Firebird  PSQL          Procedural SQL
IBM DB2               SQL PL        SQL Procedural Language (implements SQL/PSM)
IBM Informix          SPL           Stored Procedural Language
Microsoft / Sybase    T-SQL         Transact-SQL
Mimer SQL             SQL/PSM       SQL/Persistent Stored Module (implements SQL/PSM)
MySQL                 SQL/PSM       SQL/Persistent Stored Module (implements SQL/PSM)
Oracle                PL/SQL        Procedural Language/SQL (based on Ada)
PostgreSQL            PL/pgSQL      Procedural Language/PostgreSQL (based on Oracle PL/SQL)
PostgreSQL            PL/PSM        Procedural Language/Persistent Stored Modules (implements SQL/PSM)
Sybase                Watcom-SQL    SQL Anywhere Watcom-SQL Dialect
Teradata              SPL           Stored Procedural Language







What is Data Analysis?
- From Data to Information to Knowledge.
Data, when collected and structured, suddenly becomes a lot more useful. The table below
shows an example.
Color             White
Category          Sport - Golf
Condition         Used
Diameter          43 mm
Price (per ball)  $2.00 (USD)
But each of the data values is still rather meaningless by itself. To create information out of data,
we need to interpret that data.
Data analysis is the process of transforming data into informational summaries in order to
monitor how different areas of a business are performing.
- It provides answers to the following questions:
  - What happened? When did it happen?
  - How many? How often? Where?
  - Where exactly is the problem? How do I find the answers?
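
As a small sketch of turning data into an informational summary, the Python below groups golf ball records by condition and reports a count and an average price. The `summarize` helper and all records other than the white, used, $2.00 ball are made up for illustration.

```python
from statistics import mean

# Raw data: one record per golf ball (sample values for illustration).
balls = [
    {"color": "White", "condition": "Used", "price_usd": 2.00},
    {"color": "White", "condition": "New", "price_usd": 3.50},
    {"color": "Yellow", "condition": "Used", "price_usd": 1.50},
]


def summarize(records):
    """Group prices by condition and return count and average per group."""
    prices_by_condition = {}
    for record in records:
        prices_by_condition.setdefault(record["condition"], []).append(record["price_usd"])
    return {
        condition: {"count": len(prices), "average_price_usd": mean(prices)}
        for condition, prices in prices_by_condition.items()
    }


summary = summarize(balls)
for condition, stats in summary.items():
    print(condition, stats)
```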




Eight Levels of Analytics
Not all analytics are created equal. Like most software solutions, analytics offers a range of
capabilities, from the simplest to the most advanced. Across the eight levels below,
your competitive advantage increases with the degree of intelligence.


1. STANDARD REPORTS
Answer the questions: What happened? When did it happen?
Example: Monthly or quarterly financial reports.

We all know about these. They're generated on a regular basis and describe just
"what happened" in a particular area. They're useful to some extent, but not for
making long-term decisions.


2. AD HOC REPORTS
Answer the questions: How many? How often? Where?
Example: Custom reports that describe the number of hospital patients for every
diagnosis code for each day of the week.

At their best, ad hoc reports let you ask the questions and request a couple of
custom reports to find the answers.
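
The hospital example can be sketched as a simple count per diagnosis code and day of the week. The visit records below are invented sample data, and the codes are placeholders rather than real patient information.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical visit log: (diagnosis code, visit date).
visits = [
    ("J45", date(2024, 1, 1)),
    ("J45", date(2024, 1, 8)),
    ("E11", date(2024, 1, 2)),
    ("J45", date(2024, 1, 3)),
]

# Answer "how many, how often": count visits per (code, weekday) pair.
report = Counter((code, WEEKDAYS[day.weekday()]) for code, day in visits)

for (code, weekday), count in sorted(report.items()):
    print(f"{code} {weekday}: {count}")
```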





3. QUERY DRILLDOWN (OR OLAP)
Answers the questions: Where exactly is the problem? How do I find the answers?
Example: Sort and explore data about different types of cell phone users and their
calling behaviors.

Query drilldown allows for a little bit of discovery. OLAP lets you manipulate the
data yourself to find out how many, what color and where.
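
Drilling down can be pictured as summarizing the same records at a coarse level and then adding a dimension to look closer. The sketch below is plain Python, not an OLAP engine; the `roll_up` helper, the dimensions, and the call records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical call records for different types of cell phone users.
calls = [
    {"plan": "prepaid", "region": "North", "minutes": 30},
    {"plan": "prepaid", "region": "South", "minutes": 45},
    {"plan": "postpaid", "region": "North", "minutes": 120},
    {"plan": "postpaid", "region": "North", "minutes": 60},
]


def roll_up(records, *dimensions):
    """Total the minutes for each combination of the given dimensions."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record["minutes"]
    return dict(totals)


by_plan = roll_up(calls, "plan")                    # coarse view
by_plan_region = roll_up(calls, "plan", "region")   # drill down by region

print(by_plan)
print(by_plan_region)
```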


4. ALERTS
Answer the questions: When should I react? What actions are needed now?
Example: Sales executives receive alerts when sales targets are falling behind.

With alerts, you can learn when you have a problem and be notified when
something similar happens again in the future. Alerts can appear via e-mail, RSS
feeds or as red dials on a scorecard or dashboard.
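
A minimal sketch of the sales alert example, assuming a hypothetical `sales_alerts` helper that flags any region whose sales fall more than a chosen tolerance below target. The regions, figures, and the 10% tolerance are illustrative assumptions.

```python
def sales_alerts(actuals, targets, tolerance=0.10):
    """Return one alert message per region that is behind its target."""
    alerts = []
    for region, target in targets.items():
        actual = actuals.get(region, 0.0)
        # Alert only when sales fall below the tolerated share of the target.
        if actual < target * (1 - tolerance):
            alerts.append(f"ALERT {region}: sales {actual:.0f} below target {target:.0f}")
    return alerts


targets = {"North": 1000.0, "South": 800.0}
actuals = {"North": 950.0, "South": 600.0}

alerts = sales_alerts(actuals, targets)
for message in alerts:
    print(message)
```

In practice the messages would be delivered by e-mail or shown on a dashboard; here they are only printed.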


5. STATISTICAL ANALYSIS
Answers the questions: Why is this happening? What opportunities am I missing?
Example: Banks can discover why an increasing number of customers are
refinancing their homes.

Here we can begin to run some complex analytics, like frequency models and
regression analysis. We can begin to look at why things are happening using the
stored data and then begin to answer questions based on the data.
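
As an illustration of regression analysis, the sketch below fits a straight line by ordinary least squares in plain Python. The `fit_line` helper and the data (drop in interest rate versus number of refinancings) are invented for this example and do not come from a real bank.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: rate drop (percentage points) and refinancings.
rate_drop = [0.0, 0.5, 1.0, 1.5, 2.0]
refinancings = [100, 150, 200, 250, 300]

slope, intercept = fit_line(rate_drop, refinancings)
print(f"refinancings = {slope:.1f} * rate_drop + {intercept:.1f}")
```

A positive slope would suggest that refinancing rises as rates fall, although a fitted line alone does not prove the cause.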





6. FORECASTING
Answers the questions: What if these trends continue? How much is needed?
When will it be needed?
Example: Retailers can predict how demand for individual products will vary from
store to store.

Forecasting is one of the hottest markets and hottest analytical applications
right now. It applies everywhere. In particular, forecasting demand helps you supply just
enough inventory, so you don't run out or have too much.
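
One very simple forecasting technique is a moving average, sketched below. The `moving_average_forecast` helper, the weekly sales figures, and the stock on hand are assumptions for illustration; real demand forecasting usually uses richer models.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    return sum(history[-window:]) / window


# Hypothetical weekly unit sales for one product in one store.
weekly_units = [120, 130, 125, 140, 135]
on_hand = 50

forecast = moving_average_forecast(weekly_units)
# Order just enough to cover the forecast, never a negative quantity.
order_qty = max(0, round(forecast) - on_hand)

print(f"forecast: {forecast:.1f} units, order: {order_qty} units")
```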


7. PREDICTIVE MODELING
Answers the questions: What will happen next? How will it affect my business?
Example: Hotels and casinos can predict which VIP customers will be more
interested in particular vacation packages.

If you have 10 million customers and want to do a marketing campaign, who's most
likely to respond? How do you segment that group? And how do you determine
who's most likely to leave your organization? Predictive modeling provides the
answers.


8. OPTIMIZATION
Answers the questions: How do we do things better? What is the best decision for
a complex problem?
Example: Given business priorities, resource constraints and available technology,
determine the best way to optimize your IT platform to satisfy the needs of every
user.

Optimization supports innovation. It takes your resources and needs into
consideration and helps you find the best possible way to accomplish your goals.






The best analytics for your business problem
The majority of analytic offerings available today fall into one of the first four areas, which
report historical data on what happened in the past but offer no insight into the future. For simple
business problems, these analytic solutions will be all you need. But if you're asking more
complex questions or looking for predictive insight, you need to look at the second half of the
spectrum. Even better, if you can learn to use these technologies together and identify which type
of analytics to use for each situation, you will increase your chances of achieving true
business intelligence.




Business Intelligence
Business intelligence (BI) is a set of theories, methodologies, processes, architectures, and
technologies that transform raw data into meaningful and useful information for business
purposes.
BI can handle large amounts of information to help identify and develop new opportunities.
Making use of new opportunities and implementing an effective strategy can provide a
competitive market advantage and long-term stability.
Common functions of business intelligence technologies are reporting, online analytical
processing, analytics, data mining, process mining, complex event processing, business
performance management, benchmarking, text mining, predictive analytics and prescriptive
analytics.








BI Applications
Business intelligence can be applied to the following business purposes, in order to drive
business value.
1. Measurement: a program that creates a hierarchy of performance metrics (see also Metrics
Reference Model) and benchmarking that informs business leaders about progress
towards business goals (business process management).

2. Analytics: a program that builds quantitative processes for a business to arrive at optimal
decisions and to perform business knowledge discovery.
Frequently involves data mining, process mining, statistical analysis, predictive
analytics, predictive modeling, business process modeling, complex event processing, and
prescriptive analytics.

3. Reporting/enterprise reporting: a program that builds infrastructure for strategic reporting
to serve the strategic management of a business, not operational reporting.
Frequently involves data visualization, executive information systems, and OLAP.

4. Collaboration/collaboration platform: a program that gets different areas (both inside and
outside the business) to work together through data sharing and electronic data
interchange.

5. Knowledge management: a program to make the company data driven through strategies
and practices to identify, create, represent, distribute, and enable adoption of insights and
experiences that are true business knowledge. Knowledge management leads to learning
management and regulatory compliance.
In addition to the above, business intelligence can also support a proactive approach, such as
an alarm function that immediately alerts the end user. There are many types of alerts; for example,
if some business value exceeds a threshold, the color of that amount in the report turns
red and the business analyst is alerted. Sometimes an alert email is sent to the user as well.
This end-to-end process requires data governance, which should be handled by an expert.









Types of Business Intelligence Tools
The key general categories of business intelligence tools are:
- Spreadsheets
- Reporting and querying software: tools that extract, sort, summarize, and present selected
  data
- OLAP: Online analytical processing
- Digital dashboards
- Data mining
- Data warehousing
- Decision engineering
- Process mining
- Business performance management
- Local information systems
Except for spreadsheets, these tools are sold as standalone tools, suites of tools, components of
ERP systems, or components of software targeted to a specific industry. The tools are
sometimes packaged into data warehouse appliances.
Well-known BI Tools
Proprietary products:
IBM Cognos, Informatica, SAS, Microsoft BI (SSIS, SSAS, SSRS), MicroStrategy, QlikView,
Hyperion, SAP BusinessObjects, OutlookSoft, Tableau Software
Open source free tools:
Eclipse BIRT Project, RapidMiner, SpagoBI, R, KNIME, TACTIC
Open source commercial:
- Jaspersoft: Reporting, dashboards, data analysis, and data integration
- Palo (OLAP database): OLAP Server, Worksheet Server, and ETL Server
- Pentaho: Reporting, analysis, dashboard, data mining, and workflow capabilities
- TACTIC: Reporting, analysis, dashboard, data mining and integration, and workflow capabilities




Role of Analyst in BI
Data Analyst & Report Developer

General Purpose:
Provide expertise to acquire, manage, manipulate, and analyze data and report results.

Responsibilities:

Daily Operations
- Identify problematic areas and conduct research to determine the best course of action to
  correct the data
- Analyze and solve problems with current and planned systems as they relate to the
  integration and management of patient data (for example, review for accuracy in record
  merge and unmerge processes)
- Analyze reports of data duplicates or other errors to provide ongoing, appropriate
  interdepartmental communication and monthly or daily data reports
- Identify, analyze, and interpret trends or patterns in complex data sets
- Monitor data dictionary statistics

Data Capture
- In collaboration with others, develop and maintain databases and data systems necessary
  for projects and department functions
- Acquire and abstract primary or secondary data from existing internal or external data
  sources
- In collaboration with others, develop and implement data collection systems and other
  strategies that optimize statistical efficiency and data quality
- Perform data entry, either manually or using scanning technology, when needed or
  required

Data Reporting
- In collaboration with others, interpret data and develop recommendations based on
  findings
- Develop graphs, reports, and presentations of project results
- Perform basic statistical analyses for projects and reports
- Create and present quality dashboards
- Generate routine and ad hoc reports









Required Technical Knowledge/Skills
- Technical expertise regarding data models and database design and development
- Proficiency in MS Word, Excel, Access, and PowerPoint; understanding of XML and SQL
- Experience using SAS, SPSS, or other statistical packages is desirable for analyzing large
  datasets
- Programming skills preferred; adept at queries and report writing
- Knowledge of statistics, at least to the degree necessary to communicate easily
  with statisticians
- Experience in data mining techniques and procedures, and knowing when their use is
  appropriate
- Ability to present complex information in an understandable and compelling manner
