
Tourism Social Science:

Big Data and Tourism

Dr Manuel Alector Ribeiro

Saturday, 21 March 2020 1


Agenda
After this lecture you will be able to…

• Explain the three sources of Big Data
• Summarize the characteristics of Big Data and explain the associated issues
• Understand the basic requirements of hardware and software tools used for Big Data analysis
• Explain how Big Data can be a strategic asset for organizations
• Understand the 5-step process of getting value from Big Data
• Identify common sources of Big Data used in Tourism Research
• Explain the methods for preparing Big Data for analysis
• Understand the five fundamental techniques for Big Data analysis
• Summarize the concepts of data visualization and visual analytics
• Explain the relationship between Big Data and decision making
• Understand Big Data detriments and responsible practice



Big Data Facts – How Big is Big Data





Data Never Sleeps



Five Key Aspects of Big Data – Economic factors



So what exactly is Big Data?
Definitions

Big data is a term for data sets that are so large or complex that
traditional data processing application software is inadequate to
deal with them.
But…
The term "big data" often refers simply to the use of predictive
analytics, user behaviour analytics, or certain other advanced data
analytics methods that extract value from data, and seldom to a
particular size of data set.
And…
What counts as "big data" varies depending on the capabilities of
the users and their tools, and expanding capabilities make big data a
moving target.
https://en.wikipedia.org/wiki/Big_data
Objective of Big Data



Three Sources of Big Data



3 Sources of Big Data
Machine Generated Data is Leading the Charge

[Figure: The Data Multiplier Effect. Three tiers of data growth: OLTP (1X), human-generated (10X), and machine-generated (100X), with volume, variety, and velocity increasing across the tiers. Labels in the figure: business process, database data, documents, email, web logs, social, enterprise content, external sources, sensors, sensor data, complex data, video recording, M2M log files, satellite imaging, bio-informatics.]

More Data with More Complex Relationships… in Real Time and At Scale
To manage, govern and analyze
Big Data Sources: Machine Generated Data

How many sensors are on your person right now?

Networks and Sensors → “Smart” devices


The Internet of Things (IoT)

Advantages:
• Remote monitoring
• Real time monitoring
See: http://greywale.blogspot.co.uk/2014/06/iot-success-batteries-backhaul.html
Big Data Sources: People

Examples
• Social Media
• Blogging
• Commenting
• Email
• Images

Text heavy
Unstructured → doesn’t conform to a predefined data model
(doesn’t “fit” with traditional databases)
Big Data Sources: Organizations

Examples
• Transactions
• Customer Relationship Management
• Website click stream
• Medical records
• Survey Data

Unique to organizations and business models


Highly structured (“traditional” data)



The Complete Travel Cycle



Big Data Characteristics

Beyond the Buzz



Characteristics of Big Data
The 5 Vs of Big Data


Big Data Characteristics - Volume



Big Data Characteristics - Velocity



Big Data Characteristics - Variety



Big Data Characteristics - Veracity



Big Data Characteristics - Value



Big Data Tools – Hardware and Software
Beyond the capabilities of a single PC with a single hard drive

Distributed Computing
• Split data, support large data volumes
• Computing nodes connected by network
• Fast access
• Fault tolerance
• Optimized for a variety of data types
• Shared environment
• Increased performance/reduced costs
• Many applications (large support community)
• Support for Scaling Out – adding more computing nodes
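
The "split data, process on many nodes, combine results" idea can be sketched on a single machine. This is only an illustration: worker threads stand in for computing nodes, and the function names and sample reviews are hypothetical, not part of any particular Big Data platform.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(records, n_parts):
    """Split records into n_parts roughly equal chunks (one per worker)."""
    return [records[i::n_parts] for i in range(n_parts)]


def count_words(chunk):
    """Work done independently on each 'node': count words in its chunk."""
    counts = Counter()
    for text in chunk:
        counts.update(text.lower().split())
    return counts


def distributed_word_count(records, n_workers=4):
    """Send each chunk to a worker, then merge the partial counts."""
    chunks = split(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical review snippets, used only to exercise the sketch.
reviews = ["great hotel", "great view", "noisy hotel", "friendly staff"]
print(distributed_word_count(reviews, n_workers=2).most_common(2))
```

Scaling out corresponds to raising `n_workers`: the merged result stays the same while each worker handles a smaller share of the data.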



Extracting Value from Big Data



Strategy and Technology: Leveraging Big Data

• Technology can be easy to copy
• Technology alone rarely offers sustainable advantage
• Leverage technology (and time) for strategic positioning with competitive assets
  • Valuable
  • Rare
  • Difficult to Imitate
  • Not substitutable

Photo credit: Matt Stroshane/Disney


Value from Big Data → Business Intelligence Systems



7 out of 10 People are Sick of Surveys



Key Term: Data Mining

Data mining is the process through which previously unknown patterns
in data are discovered.

“a process that uses statistical, mathematical, and artificial learning
techniques to extract and identify useful information and subsequent
knowledge from large sets of data.”

This includes most types of automated data analysis.

HUMANS AS SENSORS
Understanding Tourist Behaviours
Moments-based paradigm
• Humans process experiences in discrete chunks/episodes (Ariely & Zauberman, 2003;
Zacks & Tversky, 2001)
• Consistent with customer journey, tourism value chain, and destination touchpoints

Volunteered Geographic Information (VGI)
• Photo metadata to describe time and location
• More reliable, lower cost for measuring behaviour than other methods? (Gonzalez,
Lopez, & de la Rosa, 2003; Onder, Koerbitz, & Hubmann-Haidvogel, 2014)

Visitor evaluation of experiences
• Unstructured text used to describe photos (titles, keyword tags, descriptions)
• Extract attitudes and opinions of places and activities based on the linguistic
characteristics of the words (García, Gaines, & Linaza, 2012; Gkritzali, Gritzalis, &
Stavrou; Ma, Kirilenko, & Stepchenkova, 2020; Magnini, Crotts, & Zehrer, 2011;
Stepchenkova, Kirilenko, & Morrison, 2009)

The Big Data Value Chain

Value comes from the integration of multiple sources and types of data

Data Acquisition → Data Preparation → Data Analysis → Data Report → Data Usage/Decision


10 minute break



Big Data

Data Acquisition



The Big Data Value Chain

Value comes from the integration of multiple sources and types of data

Data Acquisition → Data Preparation → Data Analysis → Data Report → Data Usage/Decision


Organizations’ Data Sources
Data for every aspect of the value chain

Transaction Processing Systems
Customer Relationship Management Systems
Supply Chain Management Systems
Website Click Stream Data
  Google Analytics
  Matomo (open source)

Organisational data is an extremely valuable strategic asset



Social Media Data – APIs are the “key”
Many social media platforms provide free access to their data

• Data is downloaded through application programming interfaces (APIs)
• Registration required to receive a unique “key” for data access
• Usually limitations on the volume and speed for calling data
• Data use may be restricted – check End User License Agreement
• Requires the creation of a small app

Popular APIs
Twitter
Instagram
Facebook
LinkedIn
Pinterest
Foursquare
Flickr
Google Plus
OpenStreetMap
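
The pattern common to these APIs (present a key, request data page by page, stay under the call-rate limit) can be sketched as follows. This is a minimal illustration, not the interface of any real platform: `fetch_page`, the cursor scheme, the key `"DEMO-KEY"`, and the fake client are all assumptions made so the example runs offline.

```python
import time


def fetch_all(fetch_page, api_key, max_pages=5, min_interval=1.0, sleep=time.sleep):
    """Collect items page by page while respecting a simple rate limit.

    fetch_page(api_key, cursor) is a hypothetical callable returning
    (items, next_cursor); next_cursor is None on the last page.
    """
    items, cursor = [], None
    for page in range(max_pages):
        if page > 0:
            sleep(min_interval)  # pause between calls to stay under the limit
        batch, cursor = fetch_page(api_key, cursor)
        items.extend(batch)
        if cursor is None:
            break
    return items


def fake_fetch_page(api_key, cursor):
    """Stand-in for a real API client so the sketch needs no network access."""
    if api_key != "DEMO-KEY":
        raise PermissionError("unknown API key")
    pages = {None: (["post 1", "post 2"], "p2"), "p2": (["post 3"], None)}
    return pages[cursor]


posts = fetch_all(fake_fetch_page, "DEMO-KEY", min_interval=0.0)
print(posts)
```

With a real platform, `fetch_page` would wrap that platform's documented HTTP endpoint, and `min_interval` and `max_pages` would be set from its published limits and terms of use.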



Data Scraping
When an API isn’t available

• Small programs that “read” or “crawl” webpages and extract structured data
• Many websites (e.g. TripAdvisor) prohibit this practice
• Commonly known as “spiders”
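
A scraper of this kind can be sketched with Python's standard library. The example below works on an inline HTML string rather than a live site, and the markup (`<p class="review">`), the robots.txt rules, and the example.org URLs are invented for illustration. Because many sites prohibit scraping, the sketch also shows a robots.txt check; a site's terms of use still need to be read separately.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class ReviewParser(HTMLParser):
    """Collect the text of every <p class="review"> element."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._in_review = False

    def handle_starttag(self, tag, attrs):
        if tag == "p" and ("class", "review") in attrs:
            self._in_review = True
            self.reviews.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_review = False

    def handle_data(self, data):
        if self._in_review:
            self.reviews[-1] += data


def allowed(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text permits fetching url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(agent, url)


# Hypothetical page content and robots.txt, standing in for a downloaded page.
page = '<div><p class="review">Lovely stay</p><p>Advert</p><p class="review">Too noisy</p></div>'
robots = "User-agent: *\nDisallow: /private/"

if allowed(robots, "https://example.org/reviews"):
    parser = ReviewParser()
    parser.feed(page)
    print(parser.reviews)
```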



Big Data

Data Preparation and Curation



Value from Big Data

Value comes from the integration of multiple sources and types of data

Data Acquisition → Data Preparation → Data Analysis → Data Report → Data Usage/Decision


Garbage in = Garbage Out



Two Step Process for Preparing Data

1. Clean
• Invalid data
• Missing Values
• Duplicate Records
• Outliers

2. Transform → get it into the right format/structure
• Transformation (reduce noise by aggregation)
• Scaling (e.g. converting from inches to cm)
• Dimension Reduction (find smaller subset that captures most of the variation)
  • e.g. Principal Component Analysis
• Feature Selection (remove, combine, or add features)
• Data Manipulation (Computing new variables)
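
The two steps can be sketched on a tiny, made-up data set. The records, the field name `value_in`, and the outlier rule (more than five median absolute deviations from the median) are assumptions chosen for illustration; real projects would pick rules suited to their data.

```python
from statistics import median

# Hypothetical raw records with the four problems listed under "Clean".
raw = [
    {"id": 1, "value_in": 65.0},
    {"id": 2, "value_in": None},    # missing value
    {"id": 3, "value_in": -4.0},    # invalid (negative measurement)
    {"id": 1, "value_in": 65.0},    # duplicate of record 1
    {"id": 4, "value_in": 70.0},
    {"id": 5, "value_in": 68.0},
    {"id": 6, "value_in": 660.0},   # outlier (probably a typing error)
]


def clean(records, k=5.0):
    """Step 1: drop missing, invalid, duplicate and outlying records."""
    seen, kept = set(), []
    for rec in records:
        value = rec["value_in"]
        if value is None or value <= 0 or rec["id"] in seen:
            continue
        seen.add(rec["id"])
        kept.append(rec)
    values = [rec["value_in"] for rec in kept]
    centre = median(values)
    mad = median(abs(v - centre) for v in values)  # median absolute deviation
    if mad == 0:
        return kept
    return [rec for rec in kept if abs(rec["value_in"] - centre) <= k * mad]


def transform(records):
    """Step 2: scaling (inches to cm) and computing a new variable."""
    return [
        {"id": rec["id"], "value_cm": round(rec["value_in"] * 2.54, 1)}
        for rec in records
    ]


prepared = transform(clean(raw))
print(prepared)
```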



Text Mining, Sentiment Analysis and Content Analysis
Unstructured data → Structured data

Sentiment Analysis Tools
• SentiStrength (lexicon-based sentiment analysis) → ideal for Twitter
• Deeply Moving (artificial neural networks) → “off the shelf”
• RapidMiner (machine learning environment for sentiment detection) → requires “training” the program

Image Analysis of photos/videos
• Clarifai
• Google Vision
• Imagga
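
To show how unstructured text becomes a structured number, here is a minimal lexicon-based scorer. It is a toy illustration of the general lexicon idea only; it does not reproduce how SentiStrength or any other listed tool works, and the six-word lexicon is invented.

```python
import re

# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words.
LEXICON = {
    "great": 1, "lovely": 1, "friendly": 1,
    "dirty": -1, "rude": -1, "noisy": -1,
}


def sentiment_score(text):
    """Return (positive hits - negative hits) / number of words, or 0.0 if empty."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(LEXICON.get(token, 0) for token in tokens) / len(tokens)


for caption in ["Lovely view, friendly staff", "Noisy room and rude staff"]:
    print(caption, "->", sentiment_score(caption))
```

A scorer this simple ignores negation ("not great") and sarcasm, which is one reason trained or neural approaches are also used.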



Big Data

Data Analysis / Analytics



Value from Big Data

Data Acquisition → Data Preparation → Data Analysis → Data Report → Data Usage/Decision


Analytics Overview
Five Categories of Analysis Techniques
Choosing and specifying a quantitative model

1. Classification
Predict category (will the visitor come back?)
2. Clustering
Organize similar items into groups (market segmenting)
3. Regression
Predict numeric value (forecasting visitor arrivals)
4. Association Analysis
Find rules to capture associations between items (market basket analysis →
recommender systems)
5. Graph Analytics
Find connections between entities (Visitor Flow Analysis)
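
As one worked example from the list above, association analysis rests on two simple measures: support (how often items appear together) and confidence (how often the rule holds when its left-hand side is present). The booking baskets below are hypothetical.

```python
# Hypothetical bookings: each set is the items bought in one transaction.
baskets = [
    {"flight", "hotel"},
    {"flight", "hotel", "car"},
    {"flight", "car"},
    {"hotel", "spa"},
]


def support(itemset, transactions):
    """Share of transactions that contain every item in itemset."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


print("support(flight, hotel):", support({"flight", "hotel"}, baskets))
print("confidence(flight -> hotel):", confidence({"flight"}, {"hotel"}, baskets))
```

A recommender built on such rules would suggest the consequent (here, a hotel) to customers whose basket contains the antecedent, keeping only rules whose support and confidence exceed chosen thresholds.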



Big Data Analysis Process

Select Technique → Build Model → Validate Model → (repeat)



Big Data

Reporting



Value from Big Data

Value comes from the integration of multiple sources and types of data

Data Acquisition → Data Preparation → Data Analysis → Data Report → Data Usage/Decision


Visual Analytics

A recently coined term
– Information visualization + predictive analytics
Information visualization
– Descriptive, backward focused
– “what happened” “what is happening”
Predictive analytics
– Predictive, future focused
– “what will happen” “why will it happen”
There is a strong move toward visual analytics
Tableau


Big Data

Decision Making and Strategic Advantage



Value from Big Data

Value comes from the integration of multiple sources and types of data

Data Acquisition → Data Preparation → Data Analysis → Data Report → Data Usage/Decision


Decision-Making Scenarios

Structured Decisions
– established situation, programmable decision, situation fully understood, routine
Unstructured Decisions
– emergent situation, creative decision, situation unclear, one-shot, general processes
Semi-structured Decisions
– have some structured elements and some unstructured elements

From SHARDA, RAMESH; DELEN, DURSUN; TURBAN, EFRAIM, BUSINESS INTELLIGENCE AND
ANALYTICS: SYSTEMS FOR DECISION SUPPORT, 10th Edition, © 2015. Used by permission of
Pearson Education, Inc., New York, NY. All Rights Reserved.
Business Intelligence Applications

Business intelligence (BI) used for decision making can be
broken into three main types of applications:
1. Strategic
2. Tactical
3. Operational

See White, C. Critical Agility: Operational BI Generates Faster and Smarter
Decisions. Teradata Magazine, Volume 9, No. 1, March 2009.
How decisions are supported by Big Data

1. Intelligence Phase
Enabling continuous scanning of external and internal information sources
to identify problems and/or opportunities
2. Design Phase (generating alternatives)
Structured/simple problems → standard and/or special models
Unstructured/complex problems → human experts, brainstorming, OLAP, data/text mining
3. Choice Phase
Use sensitivity analyses, what-if analyses, goal seeking
Simulation and other descriptive models
4. Implementation Phase
Decision communication, explanation and justification to reduce resistance
to change
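
The what-if analysis and goal seeking named in the Choice Phase can be sketched with a toy pricing model. Everything here is assumed for illustration: the linear demand curve, the unit cost of 50, and the profit target of 30,000. Goal seeking is done by bisection, which relies on profit increasing over the price range searched.

```python
def demand(price):
    """Hypothetical demand curve: bookings fall as price rises."""
    return max(0.0, 1000 - 4 * price)


def profit(price, unit_cost=50.0):
    """Profit for a holiday package sold at the given price."""
    return (price - unit_cost) * demand(price)


def goal_seek(f, target, lo, hi, tol=1e-6):
    """Find x in [lo, hi] with f(x) = target, assuming f increases on [lo, hi]."""
    if not f(lo) <= target <= f(hi):
        raise ValueError("target is not reachable inside the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# What-if analysis: how does profit change under a few candidate prices?
for price in (80, 100, 120):
    print(f"price {price}: profit {profit(price):.0f}")

# Goal seeking: which price (between cost and 150) yields a profit of 30,000?
print(f"price needed: {goal_seek(profit, 30_000, 50, 150):.2f}")
```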
Big Data

London Visitor Flows & Sentiment Case Study

Stienmetz, J. L. (2018). Deconstructing Visitor Experiences: Structure and Sentiment. In Information and
Communication Technologies in Tourism 2018 (pp. 489-500). Springer, Cham.



VISITOR FLOWS

1. Photos Uploaded
2. VGI data
3. Individual Activated Paths
4. Path Aggregation
VISITOR FLOWS – NETWORK METRICS

Metric                            Resident Network   Visitor Network
Density                           0.145              0.119
Average Degree                    131.683            108.141
Average Clustering Coefficient    0.302              0.306
Modularity                        0.454              0.402
Subcommunities Detected           10                 10

QAP correlation for structural equivalence: r = .367, p < .001
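
Two of the metrics in the table have simple definitions that can be computed by hand. The sketch below assumes an undirected network without self-loops, which may differ from how the study's networks were defined, and the small edge list is invented.

```python
def network_metrics(edges):
    """Density and average degree of an undirected graph given as node pairs."""
    # Treat (a, b) and (b, a) as the same edge and ignore self-loops.
    edge_set = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    nodes = {node for edge in edge_set for node in edge}
    n, m = len(nodes), len(edge_set)
    return {
        "nodes": n,
        "edges": m,
        # Share of all possible node pairs that are actually connected.
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        # Each edge contributes to the degree of two nodes.
        "average_degree": 2 * m / n if n else 0.0,
    }


# Hypothetical flows between four attractions A-D.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("B", "A")]
print(network_metrics(edges))
```

Under these definitions the two metrics are linked: average degree equals density multiplied by (number of nodes - 1).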
NET SENTIMENT MAPS

Residents = 19.9% Negative   Visitors = 13.2% Negative

Strongest positive: St. Ethelburga Centre for Reconciliation and Peace
Strongest negative: Imperial War Museum


SENTIMENT BY TRIP CHARACTERISTICS

[Figure: mean net sentiment (y-axis, approximately 0.03 to 0.06) by trip characteristic, including trip length from 1 day to 22 to 31 days.]


Big Data

Responsible Practice



The Dark Side of Big Data



Future of Big Data
Huge potential of big data must be balanced with responsible practices

1. User Centric Approaches
– Personal Data Stores, secure control of personal data, etc.
2. Ethical principles needed
– Ethical codes of conduct, Chief Ethics Officer?
3. Algorithmic Transparency
– Explainable algorithms
4. Discrimination-aware decision making
– Define fairness constraints and adjust data/algorithm/decision
5. Living Labs and Sandboxes
– Provide volunteers with beneficial opportunities based on their data
6. Diverse teams are a must
– Research, design, and development are currently dominated by
highly educated males, but affect diverse populations

Oliver, Nuria. “The tyranny of the data? The bright and dark sides of data driven
algorithmic decision making” Keynote address, AOM Specialized Conference on Big Data
and Managing the Digital Economy, Surrey, UK; 20 April 2018.

Practice Question for Exam

Imagine that you are the chief information officer for Thomson Travel and that
you have been asked to use Big Data to make recommendations for the
creation of a new holiday package for the millennial market. Describe using
examples and illustrations the five-step process you would take using Big
Data. Discuss both the challenges and benefits of using Big Data for this
purpose.

Describe Data Acquisition with examples/illustrations     15
Describe Data Preparation with examples/illustrations     15
Describe Data Analysis with examples/illustrations        15
Describe Data Reporting with examples/illustrations       15
Describe Decision Making with examples/illustrations      15
Discuss challenges and benefits of Big Data               15
Structure and presentation                                10
Total                                                    100


Thank you for your attention



Further Reading

Fuchs, M., Hopken, W., & Lexhagen, M. (2014). Big data analytics for
knowledge generation in tourism destinations. Journal of Destination Marketing
& Management, 3, 198–209.

Data-Pop Alliance. (2015). “Reflections on Big Data and the Sustainable
Development Goals.”
http://datapopalliance.org/wp-content/uploads/2016/03/NoteBigDataSDGsGlobalSustDevReportELetouze2015.pdf

Davenport, T.H. (2007). “Competing on Analytics.” Harvard Business Review.

IBM (2013). Descriptive, Predictive, Prescriptive: Transforming asset and
facilities management with analytics.
