
Unit – I
Understanding Data: Data Wrangling, Exploratory Analysis, Data Transformation, Cleaning, Feature Extraction, Data Visualization.
Introduction to contemporary tools and programming languages for data analysis like R & Python.
Lectures: 1 to 7
Hours per lecture, in listed order: 01, 01, 01, 01, 01, 01, 02
Unit total: 08

Unit – II
Statistical & Probabilistic Analysis of Data: Multiple hypothesis testing, Parameter Estimation methods, Confidence intervals, Bayesian statistics, Data Distributions.
Lectures: 1 to 5
Hours, in listed order: 01, 01, 01, 01, 02, 02
Unit total: 08

Unit – III
Introduction to Machine Learning: Supervised & unsupervised learning; classification algorithms; clustering algorithms; Dimensionality reduction: PCA & SVD; Correlation & Regression analysis; Training & testing data: Overfitting & Underfitting.
Lectures: 1 to 7
Hours per lecture, in listed order: 01, 01, 02, 01, 01, 01, 01
Unit total: 08

Unit – IV
Introduction to Information Retrieval: Boolean Model, Vector Model, Probabilistic Model.
Text-based search: Tokenization, TF-IDF, stop words and n-grams, synonyms and part-of-speech tagging.
Lectures: 1 to 6
Hours, in listed order: 01, 02, 01, 01, 01, 01, 01
Unit total: 08

Unit – V
Introduction to Web Search & Big Data: Crawling and Indexes, Search Engine architectures, Link Analysis and ranking algorithms such as HITS and PageRank, Hadoop File System & MapReduce Paradigm.
Lectures: 1 to 6
Hours, in listed order: 01, 01, 02, 01, 01
Unit total: 08

Total Class Required 40

References:

1. Peter Bruce, “Practical Statistics for Data Scientists: 50 Essential Concepts”, Shroff/O'Reilly, First edition, 2017
2. Pang-Ning Tan, “Introduction to Data Mining”, Pearson Education
3. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, “Modern Information Retrieval”, Pearson Education

Text Book:

1. Field Cady, “The Data Science Handbook”, 1/e, 2018, Wiley
2. Sinan Ozdemir, “Principles of Data Science”, 1/e, 2016, Packt Publishing Limited
