Source: TDWI Best Practices Report on Big Data and Data Science, 2016
DW Still On Top, But Others Rising
Analytics: Driving New User Requirements
• Looking beyond reports: What is the
desired business outcome? How close
are we? Data interaction
• Optimization: Using insights to reduce
waste and improve efficiency in
business processes, such as marketing,
product/service development
• Analyzing real vs. forecast
performance; identifying outliers and
anomalies; improving agility
• Customer insight: Aspiring to
predictive and proactive engagement
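The "real vs. forecast" bullet above can be illustrated with a small sketch. It flags periods whose forecast error sits unusually far from the average error, using a simple z-score rule. The function name `flag_outliers`, the threshold of 2.0, and the sample figures are all illustrative assumptions, not something taken from the TDWI report.

```python
from statistics import mean, stdev

def flag_outliers(actual, forecast, threshold=2.0):
    """Return indices of periods whose forecast error is unusually large.

    Hypothetical helper: a period is flagged when its error lies more than
    `threshold` sample standard deviations from the mean error.
    """
    errors = [a - f for a, f in zip(actual, forecast)]
    mu = mean(errors)
    sigma = stdev(errors)
    if sigma == 0:
        # All errors are identical, so nothing stands out.
        return []
    return [i for i, e in enumerate(errors) if abs(e - mu) / sigma > threshold]

# Made-up sample data: period 4 is far above forecast.
actual = [100, 102, 98, 101, 150, 99, 100, 103]
forecast = [100, 100, 100, 100, 100, 100, 100, 100]
outlier_periods = flag_outliers(actual, forecast)
```

A production setup would more likely use a robust statistic or a trained model, but the shape of the question (compare, measure deviation, flag) stays the same.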
Supporting Democratization of BI & Analytics
• Databases must fuel users with data
to “compete on analytics” at all levels
– Expectation that all decisions, strategic,
tactical, and operational, be data-driven
• “One size fits all” BI becoming
outmoded
– Executives, managers, and front-line
workers have distinct needs and require
self-service access, analysis, and
visualization
Demand for Data to Fill in Gaps in
Understanding Customer Lifecycles
• Customer loyalty: When is loyalty most important, i.e., most profitable
and aligned with strategic objectives?
• Showing ROI of analytics: identifying cost to business of inadequate
info in each phase of customer lifecycle
• See where missed opportunities & unnecessary costs lie: Engaging
customer segments at the right time
• Find answers: Without proper data analysis, marketing managers
struggle to know:
– When are customers most likely to churn?
– What products or services would prevent churn? When should we
offer new/additional products?
– When is it too costly to try to keep certain customers?
From BI/OLAP to Analytics/Data Science
Data Needs: BI/OLAP
• Traditionally limited to query and reporting on narrow selection of
structured data
• Reporting: Have to know ahead what you want to know; limits on
interaction
• Precise data for precise answers: But are the answers relevant to
decisions?
• Metadata, schema limits of BI/OLAP: No one best way to define,
categorize all information; silos abound
Data Needs: Analytics/Data Science
• Solving unknowns: Asking right questions, not just getting answers
• Investigative, iterative “what-if” data inquiry; questions leading to
more questions, with different variables
• Multivariate analysis using multiple types of data
• Data lakes for all the data; “schema-on-read” for analysis across
sources – structured, semi-structured, unstructured
Variety of Big Data Analytics and Data
Science Use Cases Driving Diversity and
Volume
• Analyzing customer behavior
• Marketing personalization, segmentation
• Social media analysis
• Call detail records
• Systems of engagement (ecommerce, real-time bidding,
gaming)
• Fraud detection
• Risk analysis and loan approvals
• Facility/inventory monitoring using IoT
• Preventive maintenance
• IT operations analytics
Data-Intensive Applications: New
Demands on Database and Platforms
• Applications run on analytics: need data availability
• Data management must serve up diverse internal and
external data to feed analytics
– Could include interaction and observational data, machine data (IoT)
• Data-intensive applications: devoting most processing time to
data interaction and manipulation
– Parallel processing critical to performance and scalability
• Need to minimize data movement; executing algorithms
where the data resides; machine learning and automation
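The points about parallel processing and minimizing data movement can be sketched as follows. Each partition is reduced locally to a small (sum, count) pair, and only those pairs are combined centrally. This is a toy illustration under stated assumptions: the function names, the use of a thread pool to stand in for separate nodes, and the sample partitions are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def local_aggregate(partition):
    """Runs 'where the data resides': reduce one partition to (sum, count)."""
    return sum(partition), len(partition)

def parallel_mean(partitions):
    """Combine per-partition partial results instead of moving raw rows."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(local_aggregate, partitions))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    if count == 0:
        raise ValueError("no data in any partition")
    return total / count

# Made-up data: three partitions standing in for three nodes.
partitions = [[10, 20, 30], [40, 50], [60]]
overall_mean = parallel_mean(partitions)
```

Only two numbers per partition cross the "network" here, regardless of how many rows each partition holds, which is the idea behind pushing computation to the data.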
MPP and In-Memory for DW/Big Data
• Massively parallel processing
(MPP) databases and appliances
– Teradata, Pivotal Greenplum, IBM
PureData (formerly Netezza), Splice
Machine
• In-memory databases
– SAP HANA, MemSQL, Exasol,
Redis
The Data Lake: Open and Unstructured
• A “lake” of raw, natural-state data that is not
uniformly modeled and structured
– Set up for data science; BI use evolving
• Structure as needed: late binding, schema-
on-read
• Taking advantage of cheaper storage
(HDFS, cloud storage: Amazon S3, etc.)
• Data security: a work in progress
• Future: logical data warehouse, information
fabrics, hybrid architecture
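A minimal sketch of what "schema-on-read" means in practice: records are stored raw, and each query supplies its own schema (fields, types, defaults) at read time. The record contents, the `read_with_schema` helper, and the schema format are illustrative assumptions, not the API of any particular data lake product.

```python
import json

# Raw, natural-state records: fields and types vary from line to line.
RAW_EVENTS = [
    '{"user": "a1", "amount": "19.99", "channel": "web"}',
    '{"user": "b2", "amount": 5, "device": {"os": "ios"}}',
    '{"user": "c3"}',
]

def read_with_schema(raw_lines, schema):
    """Apply a schema at read time (late binding).

    `schema` maps field name -> (type caster, default value). Fields that
    are not in the schema are ignored for this particular read.
    """
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, (caster, default) in schema.items():
            value = record.get(field)
            row[field] = caster(value) if value is not None else default
        yield row

# One possible schema for a purchase analysis; another query could bind
# a different schema to the same raw data.
purchase_schema = {"user": (str, ""), "amount": (float, 0.0)}
purchases = list(read_with_schema(RAW_EVENTS, purchase_schema))
```

The raw lines are never rewritten, so a later analysis that cares about `device` or `channel` can define its own schema without a reload.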
NoSQL: Freeing Up Database Tech
• “Not only SQL” and often based on open source, enabling
tremendous diversity in how databases are set up
• Key focus: using massively parallel, distributed processing
platforms and cheaper storage to enable better scalability
• Hadoop, MapReduce and related technologies part of the
NoSQL environment
• Column-oriented (“columnar”): going deep rather than wide
– HPE Vertica, Amazon Redshift, SAP Sybase IQ, Apache HBase,
Cassandra; storing data in columns rather than rows for faster
retrieval with less disk I/O; many work with SQL (or variations)
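The "columns rather than rows" idea can be shown with a toy in-memory comparison. Summing one field in a row layout touches every value of every record, while a column layout reads only the one column it needs. The table contents and the `to_columns` helper are made up for illustration, and real columnar engines add compression and on-disk layouts that this sketch ignores.

```python
# Row-oriented layout: one dict per record (made-up sample data).
rows = [
    {"id": 1, "region": "east", "revenue": 120.0},
    {"id": 2, "region": "west", "revenue": 80.0},
    {"id": 3, "region": "east", "revenue": 200.0},
]

def to_columns(rows):
    """Pivot row-oriented records into one list per column.

    Assumes every row has the same set of fields.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

columns = to_columns(rows)

# Row layout: all fields of all records are read to total one field.
row_values_touched = sum(len(row) for row in rows)
row_total = sum(row["revenue"] for row in rows)

# Column layout: only the "revenue" column is read.
col_values_touched = len(columns["revenue"])
col_total = sum(columns["revenue"])
```

Both layouts give the same total; the difference is how much data has to be read to get it, which grows with the number of columns in the table.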
NoSQL: Expanding to New Data Types
• Real-time data, event processing, and streaming
– Aerospike, Vitria
– From call detail records to capturing whole customer
data lifecycle
• Graph databases
– Neo4j, others
• Documents, media, key value, etc.
– MarkLogic, MongoDB, Datastax, Couchbase, Oracle
NoSQL
Conclusion: The Future is Hybrid
• Hybrid BI and analytics
– Enabling broader types of BI/analytics
– Using the right platforms for analytics workloads and meeting
business requirements
• Hybrid data platforms
– Mixing on-premises and cloud platforms
• Hybrid data and analytics architecture aims:
– Flexibility and openness
– Matching workloads with platforms
– Knowledge about the data; access to trusted data
– Fit with skill sets
Thank You!
David Stodder
Director of Research for Business Intelligence
TDWI (www.tdwi.org)
dstodder@tdwi.org
(415) 859-9933
When to Switch?
Haarthi Sadasivam, Technical Product Marketing Manager, Looker
The space and the needs have evolved
External
Companies with embedded analytics
provide insights to customers
“We feel very confident that
whatever we run into, we will be
able to scale the Snowflake
solution to meet the
performance requirements of
our pharmacy customers.”
— John Foss, Director of Business Intelligence and
Manufacturer Reporting at PDX
Volume of Data / Low Latency Data
• Centralization of Disparate Data Sources for Analytics
“There was a point where really, if
we hadn’t made this move [to
Redshift], we would have plateaued
and peaked, and the data team
would have been an anchor slowing
down the rest of the company.”
CONTACT INFORMATION
If you have further questions or comments:
Learn More in Chicago!
TDWI Conference
“Modernizing Your Data Ecosystem”
Keynotes, Educational Classes, Networking, and More
Chicago, IL | May 7-12, 2017
http://www.tdwi.org/chicago
TDWI Leadership Summit
“Architecting a Modern Data Ecosystem”
Chicago, IL | May 8-9, 2017
http://www.tdwi.org/chicagosummit