
E-Learning Certificate Programs in Big Data

Certificate Program in Accelerated Excellence (E-learning Mode)

Engineering Big Data with R and Hadoop Ecosystem
Essentials of Applied Predictive Analytics

CSE 7304co Engineering Big Data with R and Hadoop Ecosystem


Companies collect and store large amounts of data during daily transactions. This data is a combination of structured, semi-structured and unstructured data. The volume of data collected daily in many organizations has grown from MB (10^6 bytes) to TB (10^12 bytes) in the past few years and continues to grow at an exponential pace. The very large size, the lack of structure and the pace of growth characterize the "Big Data" revolution. To analyze long-term trends and patterns in the data and provide actionable intelligence to managers, this data needs to be consolidated and processed using specialized techniques; those techniques form the core of this module.

The use cases for the program are "analyzing a customer in near real-time" as applied in the Retail, Banking, Airlines, Telecom or Gaming industries. At the end of the program, participants will be able to set up a Hadoop cluster and write a Map Reduce program that uses pre-built libraries to solve typical CRM data mining tasks such as building recommendation engines.

This course thoroughly trains candidates on the following techniques:
1. SQL querying (with a focus on statistical analysis)
2. Hadoop and Map Reduce methods of programming
3. Designing columnar databases

From a tools perspective, this course introduces you to Hadoop. You will learn one of the most powerful combinations in Big Data, viz., "R and Hadoop". In addition, the course covers all the essential content required to build powerful Big Data processing applications and to acquire respected industry certifications such as Cloudera's Apache Hadoop Developer certification. The emphasis is not on abstract theory or on mindless coding. It is, instead, placed on learning concepts and real-world programming techniques.

Schedule
A 40-hour (20 sessions), 7-week program. Each session lasts 2 hours, and we meet every alternate day (3 sessions/week).

INTERNATIONAL SCHOOL OF ENGINEERING

http://www.insofe.edu.in

Lecture Sessions
Session 1: Introduction to Big Data & Applications
Session 2: The Hadoop Eco-system
Session 3: Parallel architectures and concurrent algorithms
Session 4: Distributed File Systems, GFS & HDFS
Session 5: HDFS (continued), CDH4 HDFS
Session 6: Map Reduce
Session 7: Map Reduce (continued)
Session 8: Map Reduce (continued), YARN, Hadoop Streaming
Session 9: Sqoop, Hive
Session 10: R-Hadoop
Session 11: NoSQL databases including HBase
Session 12: PIG, Oozie
Session 13: Machine Learning on Hadoop, Mahout
Session 14: Text Search Application on Hadoop
Session 15: Other ecosystem components
Session 16: Text Classification, text clustering
Session 17: Graph processing & Applications including SSSP
Session 18: PageRank, BSP, Hama
Session 19: Pregel, Giraph, Social Network Mining
Session 20: Certification & Wrap up

Lab Sessions (15-30 min), in schedule order
Live demo of an Internet-based big data application (10-15 min)
Different Hadoop installations (20 min)
Linux shell, Java basics demo (5+20 min)
Using HDFS from shell & from programs, HDFS configuration & log files (30 min)
MR configuration and log files (15 min)
Word Count with MR (20 min)
Hadoop Streaming (in some language popular with this batch); CDH4 features demo? (20 + 5-10 min)
Sqoop, Hive demo (5+20 min)
Demonstration of Word Count in RHadoop, contrast with MR version (30 min)
More examples on Hive and R-Hadoop. Small demo of HBase (20-25 min)
PIG, Oozie demo (20+5 min)
Demonstrate Mahout. Run on movie recommendation data (30 min)
MR demo of text index building. Assign Text Search homework (homework can be done in any one of R-Hadoop / PIG / Hive / Java MR / Hadoop Streaming, as per individual preference) (25+15 min)
Mahout for text classification. Text search student submissions discussion (15+20 min)
MR demo of SSSP on a non-trivial graph (20 min). Assign graph processing homework
PageRank demo on MR and Hama (10+10 min)
Graph homework student submissions discussion (20 min)
Interaction session with certified professionals (20 min)

Shakeup Quiz (5-7 min): marked "Yes" for 19 of the 20 sessions


CSE 7301co Essentials of Applied Predictive Analytics


If you believe that an ability to analyze, forecast and predict using data will help you grow in your current job, then this 40-hour instructor-led online course is the easiest way to achieve that. Professionals from a diverse set of verticals and horizontals such as Marketing, HR, Engineering, Banking, Pharmaceutical, Healthcare, Retail, Telecom, Manufacturing and Data Warehousing are finding that decisions can no longer be taken intuitively. Data is becoming the biggest source of knowledge, differentiation and progress. This course teaches robust and systematic methods that enable you to gain insights from data just as a specialist does.

At the end of the program, participants are able to answer business questions such as: who among the existing customers is likely to buy a new product; which customers are most likely to default on a loan or an insurance payment; and which of a given set of transactions are most likely to be fraudulent.

This course thoroughly trains candidates on the following techniques:
Pre-processing Techniques: Graphical Visualization, Handling Missing Values, Data Standardization
Predictive Models: Decision Trees, Linear Regression, Logistic Regression
Model Selection Techniques: Concepts of Overfitting, Bias and Variance; Cross Validation; Error metrics like Precision, Accuracy and Recall
Introduction to solving analytics problems using R

Schedule
A 40-hour, 8-week program. Each session lasts 2 hours.

Day 1: Introduction to Big Data; Course Motivation; Logistics; Analysis through Data Visualization
Day 2: Understanding the business case and defining a solution framework
Day 3: An introduction to the R programming language and environment
Day 4: Techniques of pre-processing data (binning, normalizing, filling missing values, removing noise)
Day 5: Data pre-processing (continued)
Day 6: Traps and Errors: Confusion matrix; analyzing false positives and false negatives from a problem perspective; different error measures used in forecasting
Day 7: Model Selection: K-fold validation
Day 8: Introduction to Decision Trees and their structure
Day 9: Construction of Decision Trees through simplified examples; Choosing the best attribute at each non-leaf node; Entropy; Information Gain


Day 10: Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with numerical variables; other measures of randomness
Day 11: Inductive learning from a 500-ft view; Issues in inductive learning like the curse of dimensionality; Overfitting; Bias-Variance tradeoff
Day 12: Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as rules
Day 13: A mathematical model for association analysis
Day 14: Large itemsets and Association Rules; Apriori: constructs large itemsets that meet minimum support (minsup) iteratively
Day 15: Interestingness of discovered association rules; Application examples; Association analysis vs. Classification
Day 16: Using Association Rules to compare stores; Dissociation Rules; Sequential Analysis Using Association Rules
Day 17: Data visualization and Story-telling: Anatomy of a graph
Day 18: Animated graphs, BI dashboards and the latest trends in data visualization
Days 19 and 20: An end-to-end case study in R involving understanding the data, filling the missing values, applying and assessing models and reporting the results


Mentor Profiles
Dr. SREERAMA K MURTHY
Co-founder and CEO, Teqnium Consultancy Services
PhD in Data Mining, Johns Hopkins University

Classes Taught
Engineering Big Data with R and Hadoop Ecosystem

Brief Profile
Ph.D. - Johns Hopkins University
M.Tech. - IIT, Chennai (Madras)
B.E. - NIT, Allahabad
17 years of work experience after Ph.D. (USA: 5 years, India: 11 years)
21 US patent applications (8 issued), 2 Indian patent applications
Many invention disclosures, numerous journal and conference papers.
Designed, managed, built and deployed large software systems.
Technocrat, combining love for technology with entrepreneurship and business management.
Helped conceptualize business plans of three ventures. Obtained millions of dollars in funding.

Chairman & CEO - Teqnium Consultancy Services
Director, Technology - Globarena ITeknowledge Pvt Ltd
Managing Director - Globarena Web Technologies
Senior Manager and Head, E-Commerce Research group - IBM India Research Lab
Researcher - Siemens Corporate Research

Areas of Expertise: Technology Enabled Education and Training, e-Skilling, Outsourced R&D, Data Mining, Digital Security, Healthcare Informatics
Specialties: Education Strategy, Role of Technology in Skills Development, Instructional Design, Research, Intellectual Property, Novel Product Design


Dr. DAKSHINAMURTHY V KOLLURU


President, International School of Engineering
PhD in Materials Science and Engineering, CMU

Classes Taught
Essentials of Applied Predictive Analytics

Brief Profile
Ph.D. - Carnegie Mellon University (CMU)
M.S. - Carnegie Mellon University (CMU)
B.E. - NIT, Tiruchirapalli
15 years of work experience after Ph.D. in diverse organizations ranging from Defense Research to Web startup and mid-size IT services companies.

President - International School of Engineering
Chief Research Officer - Prithvi Information Solutions Ltd., Hyderabad
Founder and Managing Director - Axaya Cybertech Pvt Ltd
Co-founder and Managing Director - Globarena ITeknowledge Pvt. Ltd
Scientist - Defence Metallurgical Research Laboratory, Hyderabad

During his years of experience as a scientist and entrepreneur, Dr. Murthy has applied his strengths in logical thinking, math and science to solving industrial and societal problems, designing solutions from fundamentals, identifying, training and motivating high quality individuals, and articulating the findings in a lucid manner to all the stakeholders.

Over the past few years, Dr. Murthy has been actively teaching Data Analytics to working professionals with a wide range of experience and from diverse industries. He has also been consulting on Data Science projects for companies ranging from Fortune 25 firms to IT services companies and startups. He built the Business Analytics and Optimization division of a mid-tier IT services company from scratch and filed for 5 patents in Retail and Telecom Analytics, during which time he also acquired Fortune 500 clients and turned the division into a profitable delivery center.


Fee Structure
Program Fee for Each Individual Module:
For International Students: $9 for Application Fees and $640 for Program Fees
For Indian Students: Rs. 500 for Application Fees and Rs. 35,000 for Program Fees

Program Fee for Two Modules:
For International Students: $9 for Application Fees and $960 for Program Fees
For Indian Students: Rs. 500 for Application Fees and Rs. 54,000 for Program Fees

For more details, please visit: http://insofe.edu.in/init/default/elearning_engineering_big_data

For any queries, contact +91 9502334561 or email us at elearning@insofe.edu.in


International School of Engineering


Address: 1st Floor, Plot No 63/A, Road No 13, Film Nagar, Jubilee Hills, Hyderabad 500033
Contact Number: +91 9618 483 483
Website: www.insofe.edu.in
Facebook: www.facebook.com/insofe
Linkedin: http://goo.gl/VzC9s
Twitter: @INSOFEedu
Slideshare: http://www.slideshare.net/INSOFE

