Revision (—)
Topics for Today (22nd July 2017) :-
⁃ Introductory Presentation
⁃ Your introduction (In Process)
⁃ Name
⁃ Location
⁃ Years of Exp
⁃ A note on my blog
⁃ A note on technical and related queries
⁃ Google Drive link
⁃ TOC File daily updates and uploads
⁃ Java Modules
⁃ Books reference
⁃ Interview Book
⁃ Big Data Case Studies and setting the context
⁃ Quick Case Studies
⁃ Customer Churn Analysis - Slide 11
⁃ Point of sale transaction - Slide 12
⁃ What is big data? - Slide 15-16
⁃ 3Vs - Slide 14
⁃ 5Vs - Slide 20
⁃ Evolution of Big Data - SR Slide 7
⁃ Types of Data - SR Slide 7
⁃ Big Data Challenges
⁃ Introduction to Hadoop
⁃ Software know-how - done
⁃ Local Set up
⁃ Mac
⁃ Unix
⁃ Hadoop Philosophies
a) Storage
b) Processing
c) Manual Distributed Computing
Apache Hadoop is an open-source framework that provides an
automated distributed computing environment for storing and
processing big data sets. It stores the data across a cluster of
commodity machines (HDFS) and then analyses that stored data using
a simple programming model (MapReduce).
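A rough sketch of what that simple programming model looks like, using the classic word-count example. This is plain Python running on one machine, not the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample lines are made up for illustration. In a real cluster the framework would run the map step on many machines and do the grouping itself.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key (the framework does this between map and reduce)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    # Hypothetical single-machine driver: on a cluster, each block of
    # lines would be mapped on a different node in parallel.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data needs big storage", "hadoop stores big data"]
    print(word_count(sample))
```

The point of the sketch: the programmer only writes the map and reduce steps. Splitting the input, moving data between machines and retrying failed tasks is the "automated" part that Hadoop takes over from manual distributed computing.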
Hadoop 2.x
A1
A2
F1 - A1 - 100 xxxxx
F2 - A1 - 100
blog
http://syed-rizvi.blogspot.in/
https://drive.google.com/folderview?id=0BwfmpHQetSFES3UzTDhITkR2Q3c&usp=sharing#list
Case Studies
http://www.informationweek.com/it-leadership/why-sears-is-going-all-in-on-hadoop/d/d-id/1107038?
http://www.computerweekly.com/news/2240219736/Case-Study-How-big-data-powers-the-eBay-customer-journey
- Basics of statistics
- R, Tableau