Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
predictors
n # samples X
n k
k # independen t variable s
m # dependent variables response
Y
n m
Goal: Fit a curve to the data
Linear Regression
Y XB E
unexplained
by regression
Offsets (a.k.a. “residuals”)
n
V2 [y i f (x i , a0, a1 )]2
i1
(V ) 2
the minimum:
• Finding 0
ak
Linear Least-Squares Fitting
f (a0,a1 ) a0 a1 x
Solving for a0 and a1
n n n
n
y a a x 0
V 2 (a0 , a1 ) [y (a0 a1 x i )]2
i 0 1 i
i1 n n n
(V 2 ) n
y a a x
2 [y i (a0 a1 x i )] 0
i 0 1 i
i 1 i 1 i 1
a0 i1 n n
(V 2 ) n
2 [y i (a0 a1 x i )]x i 0
y
i 1
i na0 a1 xi
i 1
a1 i1
n n
n
y x a x ax
i i
n
0 i
n
2
1 i 0 n x i a0 y i
i1 i1 i1 i1 i1
n n
2 a1 n
x i x i y i x i
n n n
y x i i a0 x i a1 x i2
i1 i1 i1
i1 i1 i1
Linear Least-Squares Fitting
n n
n x i a0 y i
i1 i1
n n
2 a1
n
x i x i y x
i i
i1 i1 i1
n 1 n
x i y i a b 1 1 d b
a0
n
n i1 ad bc c
i1
c d a
a1
n n
x i x i2 y i x i
i1 i1 i1
n n n n n n n
x i2 x i y i x i y i x i y i x i
2
a0 1 1 i1 i1
n i1 i1 i1 i1 i1
a1 n
2 n
n n n 2 n n n
n x i2 x i x i n y i x i n x i2 x i x i y i n y x
i i
i1 i1 i1 i1 i1 i1 i1 i1 i1
Linear Least-Squares Fitting
n 2 n n n
a0 1
i i i i i
x y x y x
n i1 i1 i1 i1
n n n
a1 n 2
n x i x i
2 x i y i n y x
i i
i1 i1 i1 i1 i1
n n n n n n
x y xyx
2
i i i i i y x x yi xi
2
i
a0 i1 i1 i1 i1
i1 i1
n n 2 n
n x i x i
2
x 2
i nx
2
n
y i x i nxy
n n n
n y i x i x i y i
i1
a1 i1 i1 i1
n n 2 n
n x i x i
2
i1
x 2
i nx 2
i1 i1
Alternate Perspective
n
n
V (a0 , a1 ) [ yi (a0 a1 xi )]2
i i nx y
2
y x i 1
a1 i 1n cov( x, y ) (V 2 )
n
2 [ yi (a0 a1 xi )] 0
a0
i
var( x )
x 2
nx 2 i 1
i 1
n n n
y a a x
i 0 1 i 0
i1 i1 i1
a0 y a1 x
n n n
y a a x
i 1
i
i 1
0
i 1
1 i
n n
y i
na
a1 xi
i 1
0 i 1
n n n
y a0 a1 x
How Well Does Our Model Fit?
• Proportion of variability in Y that is explained
– Coefficient of Determination
n
i 0 1i
y ( a a x ) 2
cov( x, y )2
r2 1 i 1
n
y y
2 var( x ) var( y )
i
i 1
– Correlation Coefficient
cov( x, y )2 cov( x, y )
r r2
var( x ) var( y ) stdev ( x ) stdev ( y )
Problem of Overfitting
• Fitting true patterns or noise?
Underfitting Overfitting
# Predictors/Features
Overfitting
# Samples
Real-World Data
• Constrained sample size
– Data collection can be challenging, expensive
• e.g. Manual human annotation
– Outliers, inconsistencies, annotator disagreements
• Unreliable for modeling, thrown out
• Multicollinearity
– Some predictors are highly correlated with others
– Creates noise in the data
Solution?
• Reduce dimensionality
• Eliminate multicollinearity
– Produce a set orthogonal predictor variables
Factor Analysis
Factor Analysis
Given predictors X, response Y
Goal: Derive a model B such that Y=XB+E, with factor
analysis to reduce dimensionality
1. Compute new factor (i.e. “component”) space W
2. Project X onto this new space to get factor scores:
T = XW
3. Represent the model as:
Y = TQ+E = XWQ+E
…where B = WQ
Principal Components Regression
(PCR)
• Applies PCA to predictors prior to regression
• Pros:
– Eliminates collinearity orthogonal space
– Produces model that generalizes better than MLR
• Cons:
– May be removing signal & retaining noise
– How to retain the predictors X that are most
meaningful in representing response Y?
Partial Least Squares Regression
(PLS)
• Takes into account Y in addition to X
empty
W Weight matrix for X, such that T = XW
m c Unexplained
empty variance in X
Outputs to be computed
for h = 1,..., c
qh dominant eigenvecto r of AhT1 Ah 1
wh Ah 1qh
ch whT M h 1wh
wh
wh store into column h of W Finally…
ch
ph M h 1wh store into column h of P P, Q, W already assembled, iteratively
qh A wh store into column h of Q
T
h 1 Compute T XW
vh Ch ph
Compute B WQT
vh
vh (normalize )
vh
Ch Ch 1 vh vhT
M h M h 1 ph phT
Ah Ch Ah 1
end loop
Application Example
Automatic Vocal Recognition of a Child’s Perceived
Emotional State within the Speechome Corpus
Sophia Yuditskaya
• Ultra-dense, longitudinal
– Comprehensive record
– Insights at variety of time scales
• (e.g.) Hourly, Daily, Weekly, Monthly, Yearly
• Up to millisecond precision
Challenges
Previous & Ongoing Efforts
Speechome Metadata
Transcribed Speech
Speaker ID
Movement Trajectories
Head Orientation
Child Presence
Thesis Goal
Child’s
Child’s
Perceived
Emotional
Emotional
State
State
Speechome Metadata
Transcribed Speech
Speaker ID
Movement Trajectories
Head Orientation
Child Presence
Thesis Goal
Child’s
Perceived
Emotional
State
Speechome Metadata
Transcribed Speech
Speaker ID
Movement Trajectories
Head Orientation
Child Presence
Questions for Analysis
• Developing a Methodology
– How to go about creating a model for automatic vocal recognition
of a child’s perceived emotional state?
• Evaluation
– How well can our model simulate human perception?
• Exploration
– Socio-behavioral
• Do correlations change in social situations or when the child is crying, babbling, or
laughing?
– Dyadic
• How much does proximal adult speech reflect the child’s emotional state?
• Do certain caretakers correlate more than others?
– Developmental
• Are there any developmental trends in correlation?
Methodology
Representing
Emotional State
5 Speaker Categories:
• Child
• Mother
• Father
• Nanny
• Other – e.g. Toy sounds, sirens, speakerphone
Filtering by Speaker ID
5 Speaker Categories:
• Child Keep all
• Mother
• Father
• Nanny Apply Filtering
• Other – e.g. Toy sounds, sirens outside, speakerphone
• P(Error)
– Probability of two annotators agreeing by chance
– Version B
n # samples
p # input variables
• Adjusted R-squared Bj Regression Coefficient for feature j
– Normalizing for sample size Vector of multidimensional output values
Yi
and # Components for sample i
xij Input value for feature j in sample I
_
Y Sample mean of output data
q # Components
2
Interpreting R adj
• Adult speech
– Matters…
• Medium effect size (≥ 0.13) in many cases