
Course in Data Science

Contact: +917095167689

About the Course:


This course introduces the main tools and ideas required of a Data Scientist, Business Analyst, or
Data Analyst. It gives an overview of the data, questions, and tools that data analysts and data
scientists work with. The course has two components. The first is a conceptual introduction to the
ideas behind turning data into actionable knowledge. The second is a practical introduction to the
tools used in the program, such as R programming, SAS, Minitab, and Excel.
Course features:
140+ hours of teaching
Exam every weekend
Dedicated doubt-clarification session every weekend
Real-time, case-study-driven approach
Live project
Placement assistance
Qualification:
Any graduate. No prior programming or statistics knowledge is required.
Duration of the course:
3 months
Mode of course delivery:
Online Training
Faculty Details:
A team of faculty with an average of 20+ years of experience in data analysis and training across
various industries.
Module 1 - Descriptive & Inferential Statistics (30 hrs)
1. Turning Data into Information
Data Visualization
Measures of Central Tendency
Measures of Variability
Measures of Shape
Covariance, Correlation
Using Software: Real-Time Problems
2. Probability Distributions
Probability Distributions: Discrete Random Variables
Mean, Expected Value
Binomial Random Variable
Poisson Random Variable
Continuous Random Variable
Normal Distribution
Using Software: Real-Time Problems
3. Sampling Distributions
Central Limit Theorem
Sampling Distribution of the Sample Proportion, p-hat
Sampling Distribution of the Sample Mean, x-bar
Using Software: Real-Time Problems
4. Confidence Intervals
Statistical Inference
Constructing Confidence Intervals to Estimate a Population Mean, Variance, Proportion
Using Software: Real-Time Problems
5. Hypothesis Testing
Hypothesis Testing
Type I and Type II Errors
Decision Making in Hypothesis Testing
Hypothesis Testing for a Mean, Variance, Proportion
Power in Hypothesis Testing
Using Software: Real-Time Problems
6. Comparing Two Groups
Comparing Two Groups
Comparing Two Independent Means, Proportions
Pairwise Testing for Means
Two Variances Test (F-Test)
Using Software: Real-Time Problems
7. Analysis of Variance (ANOVA)
One-Way and Two-Way ANOVA
ANOVA Assumptions
Multiple Comparisons (Tukey, Dunnett)
Using Software: Real-Time Problems
8. Association Between Categorical Variables
Relationship Between Two Categorical Variables
Statistical Significance of Observed Relationship / Chi-Square Test
Calculating the Chi-Square Test Statistic
Contingency Table
Using Software: Real-Time Problems
Module 2 - Prediction Analytics (25 hrs)
1. Simple Linear Regression
Simple Linear Regression Model
Least-Squares Estimation of the Parameters
Hypothesis Testing on the Slope and Intercept
Coefficient of Determination
Estimation by Maximum Likelihood
Using Software: Real-Time Problems
2. Multiple Regression
Multiple Regression Models
Estimation of Model Parameters
Hypothesis Testing in Multiple Linear Regression
Multicollinearity
Using Software: Real-Time Problems
3. Model Adequacy Checking
Residual Analysis
The PRESS Statistic
Detection and Treatment of Outliers
Lack of Fit of the Regression Model
Using Software: Real-Time Problems
4. Transformations
Variance-Stabilizing Transformations
Transformations to Linearize the Model
Analytical Methods for Selecting a Transformation
Generalized and Weighted Least Squares
Using Software: Real-Time Problems
5. Diagnostics for Leverage and Influence
Leverage / Cook's D / DFFITS / DFBETAS
Treatment of Influential Observations
Using Software: Real-Time Problems
6. Polynomial Regression
Polynomial Model in One / Two / More Variables
Using Software: Real-Time Problems
7. Dummy Variables
The General Concept of Indicator Variables
Using Software: Real-Time Problems
8. Variable Selection and Model Building
Forward Selection / Backward Elimination
Stepwise Regression
Using Software: Real-Time Problems
9. Generalized Linear Models
Concept of GLM
Logistic Regression
Poisson Regression
Negative Binomial Regression
Exponential Regression
10. Autocorrelation
Regression Models with Autocorrelation Errors
Module 3 - Applied Multivariate Analysis (25 hrs)
1. Measures of Central Tendency, Dispersion and Association
Measures of Central Tendency / Measures of Dispersion
Using Software: Real-Time Problems
2. Multivariate Normal Distribution
Exponent of Multivariate Normal Distribution
Multivariate Normality and Outliers
Eigenvalues and Eigenvectors
Spectral Decomposition
Singular Value Decomposition
Using Software: Real-Time Problems
3. Principal Components Analysis (PCA)
Principal Component Analysis (PCA) Procedure
Using Software: Real-Time Problems
4. Factor Analysis
Principal Component Method
Communalities
Factor Rotations
Using Software: Real-Time Problems
5. Discriminant Analysis
Discriminant Analysis (Linear/Quadratic)
Estimating Misclassification Probabilities
Using Software: Real-Time Problems
Module 4 - Machine Learning (30 hrs)
1. Introduction
Application Examples
Supervised Learning
Unsupervised Learning
2. Regression Shrinkage Methods
Ridge Regression
Lasso Regression
Using Software: Real-Time Problems
3. Classification
Logistic Regression
Bayes Rule and Classification Problem
Discriminant Analysis (LDA/QDA)
Nearest-Neighbor Methods (K-NN Classifier)
Using Software: Real-Time Problems
4. Tree-Based Methods
The Basics of Decision Trees
Regression Trees
Classification Trees
Ensemble Methods
Bagging, Bootstrap, Random Forests
Using Software: Real-Time Problems
5. Neural Networks
Introduction
Single-Layer Perceptron
Multi-Layer Perceptron
Feedforward and Backpropagation
Using Software: Real-Time Problems
6. Support Vector Machine
Maximal Margin Classifier
Support Vector Classifier
Support Vector Machine
SVMs with More than Two Classes
Using Software: Real-Time Problems
7. Cluster Analysis
Agglomerative Hierarchical Clustering
K-Means Procedure
Medoid Cluster Analysis
Using Software: Real-Time Problems
8. Dimensionality Reduction
Principal Component Analysis
Using Software: Real-Time Problems
9. Association Rules
Market Basket Analysis
Using Software: Real-Time Problems
Module 5 - R Programming (30 hrs)
1. R Programming
R Basics
Numbers, Attributes
Creating Vectors
Mixing Objects
Explicit Coercion
Formatting Data Values
Matrices, Lists, Factors, Data Frames, Missing Values, Names
Reading and Writing Data
Interface to the Outside World
Subsetting R Objects
Vectorized Operations
Dates and Times
Managing Data Frames with the dplyr Package
Control Structures
Functions
Lexical/Dynamic Scoping
Loop Functions
Debugging
2. Data Analytics Using R
Modules 1-4 demonstrated using R programming
