UNIT-1
1. INTRODUCTION:
4. EXAMPLES:
Association rule: Basket analysis (finding associations between
products bought by customers, e.g., customers who bought both X and Y:
bread & jam, pencil & sharpener); a small scoring sketch follows.
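A minimal sketch of how such a rule could be scored on transaction data, using the standard support and confidence measures; the transactions and item names below are made up for illustration, not taken from the notes.

# Basket-analysis sketch: score the rule {bread} -> {jam} on made-up transactions.
transactions = [
    {"bread", "jam", "milk"},
    {"bread", "jam"},
    {"bread", "butter"},
    {"pencil", "sharpener"},
    {"bread", "jam", "eggs"},
]

def support(itemset, transactions):
    # Fraction of transactions containing every item in itemset.
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    # Estimate of P(consequent | antecedent) = support(X and Y) / support(X).
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

print(support({"bread", "jam"}, transactions))        # 0.6
print(confidence({"bread"}, {"jam"}, transactions))   # 0.75

On this toy data the rule "bread -> jam" has support 0.6 and confidence 0.75.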
UNIT-2
1. Introduction:
UNIT-3
5. MULTIVARIATE METHODS
1. Introduction:
2. Multivariate Data:
3. Parameter Estimation:
Mean vector: The mean vector μ is defined such that each of its
elements is the mean of one column of X.
Covariance matrix: Represented as a d x d matrix. The diagonal terms
are the variances, the off-diagonal terms are the covariances, and the
matrix is symmetric (equal to its transpose).
Correlation: A normalized measure of the relationship between two
variables, lying between -1 and +1. If two variables are independent,
their correlation is 0.
Sample mean: The maximum likelihood estimator for the mean.
Estimation of the sample covariance and sample correlation (see the sketch below).
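A minimal sketch of these estimators, assuming a data matrix X with one sample per row and one variable per column; the numbers are made up and numpy is used only for illustration.

import numpy as np

# Made-up data matrix X: N = 5 samples, d = 3 variables (one column per variable).
X = np.array([
    [1.0, 2.0, 0.5],
    [2.0, 3.5, 1.0],
    [3.0, 3.0, 1.5],
    [4.0, 5.0, 2.0],
    [5.0, 6.5, 2.5],
])
N, d = X.shape

# Sample mean vector m: the mean of each column of X (maximum likelihood estimate).
m = X.mean(axis=0)

# Sample covariance matrix S: d x d and symmetric; diagonal = variances,
# off-diagonal = covariances (the ML estimate divides by N).
centered = X - m
S = centered.T @ centered / N

# Sample correlation matrix R: covariances divided by the standard deviations,
# so every entry lies between -1 and +1 and the diagonal is 1.
std = np.sqrt(np.diag(S))
R = S / np.outer(std, std)

print(m)
print(S)
print(R)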
6. Multivariate Classification:
Main advantages (of assuming a multivariate normal class density):
Analytical simplicity
Useful approximation for real data.
Robust and mathematically tractable (easily handled).
Requirements: The sample of a class should form a single group. If
there are multiple groups, one should use a mixture model.
Procedure:
Define the discriminant function.
Estimates for the mean and covariances are found using
maximum likelihood separately for each class.
These are plugged into the discriminant function to get the
estimates for the discriminants.
This defines a quadratic discriminant, since each class has its own
covariance matrix.
Note the number of parameters to be estimated for the means and
covariance matrices.
To reduce this number, a common (shared) covariance matrix can be
estimated for all classes.
Unequal priors shift the decision boundary toward the less likely class.
With a shared covariance matrix the decision boundaries are linear,
which leads to a linear discriminant (a plug-in sketch of this
procedure follows the list).
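A minimal plug-in sketch of the procedure above: per-class priors, means, and covariance matrices are estimated by maximum likelihood and substituted into a Gaussian log-discriminant, which is quadratic because each class keeps its own covariance matrix (replacing the per-class covariances with a single shared estimate would make it linear). The data, labels, and function names are hypothetical, not from the notes.

import numpy as np

def fit_gaussian_classes(X, y):
    # ML estimates of prior, mean vector and covariance matrix for each class.
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        prior = len(Xc) / len(X)
        mean = Xc.mean(axis=0)
        cov = (Xc - mean).T @ (Xc - mean) / len(Xc)   # per-class covariance -> quadratic discriminant
        params[c] = (prior, mean, cov)
    return params

def discriminant(x, prior, mean, cov):
    # Plug-in log-discriminant g_c(x); the constant term common to all classes is dropped.
    diff = x - mean
    mahalanobis = diff @ np.linalg.inv(cov) @ diff
    return -0.5 * np.log(np.linalg.det(cov)) - 0.5 * mahalanobis + np.log(prior)

def classify(x, params):
    # Assign x to the class whose discriminant value is largest.
    return max(params, key=lambda c: discriminant(x, *params[c]))

# Hypothetical 2-D data for two classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 1.0, (50, 2)),
               rng.normal([3.0, 3.0], 1.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

params = fit_gaussian_classes(X, y)
print(classify(np.array([2.5, 2.5]), params))   # expected to print 1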
Methods:
Naive Bayes’ classifier: A further simplification that assumes all
off-diagonal entries of the covariance matrix are 0, i.e., that the
variables are independent.
Euclidean distance: Simplifying further, if all variances are also
assumed equal, the Mahalanobis distance reduces to the Euclidean
distance.
Nearest Mean Classifier: Assigns the input to the class of the
nearest mean (a minimal sketch follows this list).
Template matching procedure: Each class mean is treated as the
ideal prototype or template for that class.
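A minimal sketch of the nearest-mean simplification, assuming equal priors and a shared, equal-variance covariance so that the Mahalanobis distance reduces to the Euclidean distance; each class mean acts as the template the input is matched against. The means and labels below are hypothetical.

import numpy as np

def nearest_mean_classify(x, means):
    # Assign x to the class whose mean (template) is closest in Euclidean distance.
    distances = {c: np.linalg.norm(x - m) for c, m in means.items()}
    return min(distances, key=distances.get)

# Hypothetical class means (templates), assumed already estimated from training data.
means = {
    "class_0": np.array([0.0, 0.0]),
    "class_1": np.array([3.0, 3.0]),
}

print(nearest_mean_classify(np.array([2.5, 2.8]), means))   # prints class_1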
7. Tuning Complexity:
8. Discrete features:
9. Multivariate Regression: