Rattapoom Tuchinda
*Some of the slides are from
Jaideep Srivastava @ http://www.cs.umn.edu/faculty/srivasta.html
Mike Kassoff @ http://logic.stanford.edu/classes/cs246/lectures2001/mkassoff_lecture.ppt
So far…
DATA
Data overloaded
• Gene data
• Customer/Sales data
• Astrophysics data
• Pricing
• ….
• Link analysis
• Fraud detection
• New medicines
• Revenue Management/Discriminatory pricing
• Marketing
• Stocks
• ….
Outline
• Introduction
• Data cleaning
• Data mining techniques
– Classification
– Clustering
– Association Rules
– Sequential Patterns
– Regression
– Deviation detection
– Meta-learning
• Case study: Biddingfortravel
Traditional Data Mining Process
Data is often of low quality
• Why?
– You didn’t collect it yourself!
What we want:
• Redundancy!
Problems not due to lack of structure
(it’s in a database)
Classification: Definition
Case study: Bidding for travel
[Figure: Priceline example; labels: "Winning: A", "$68", "180 C"]
A: 120 → 60
B: 200 → 65
C: 180 → 68
120 < 200 < 180
Biddingfortravel cleaning
[Figure: Hotel 1, Hotel 2, Hotel 3, ..., Hotel N → join with Biddingfortravel post data → (area, stars, hotels)]
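
One possible reading of the diagram above is that each Biddingfortravel post is joined with the hotels that share its area and star level. The sketch below illustrates only that kind of join. The record layout, the field names (`area`, `stars`, `hotel`, `bid`), the function name and all values are assumptions made up for the example; none of them come from the slides.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration only.
hotels = [
    {"hotel": "Hotel 1", "area": "Downtown", "stars": 3},
    {"hotel": "Hotel 2", "area": "Downtown", "stars": 3},
    {"hotel": "Hotel 3", "area": "Airport", "stars": 2},
]
posts = [
    {"area": "Downtown", "stars": 3, "bid": 60},
    {"area": "Airport", "stars": 2, "bid": 40},
]


def join_posts_with_hotels(posts, hotels):
    """Attach to each post the hotels that share its (area, stars) key."""
    # Index hotel names by the join key so each post is a single lookup.
    by_key = defaultdict(list)
    for h in hotels:
        by_key[(h["area"], h["stars"])].append(h["hotel"])
    # Copy each post and add the matching hotel names (empty list if none).
    return [
        {**p, "hotels": by_key.get((p["area"], p["stars"]), [])}
        for p in posts
    ]


joined = join_posts_with_hotels(posts, hotels)
for row in joined:
    print(row["area"], row["stars"], row["hotels"])
```

Indexing the hotels first keeps the join linear in the number of records, which matters little here but is the usual choice once the post data grows.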