
* The Motivation For Hadoop
  o Problems with traditional large-scale systems
  o Requirements for a new approach
* Hadoop: Basic Concepts
  o What is Hadoop?
  o The Hadoop Distributed File System
  o How MapReduce Works
  o Anatomy of a Hadoop Cluster
* Writing a MapReduce Program
  o Examining a Sample MapReduce Program
  o Basic API Concepts
  o The Driver Code
  o The Mapper
  o The Reducer
  o Hadoop's Streaming API
* The Hadoop Ecosystem
  o Hive and Pig
  o HBase
  o Flume
  o Other Ecosystem Projects
* Integrating Hadoop Into The Workflow
  o Relational Database Management Systems
  o Storage Systems
  o Importing Data from RDBMSs With Sqoop
  o Importing Real-Time Data with Flume
* Delving Deeper Into The Hadoop API
  o Using Combiners
  o The configure and close Methods
  o SequenceFiles
  o Partitioners
  o Counters
  o Directly Accessing HDFS
  o ToolRunner
  o Using The Distributed Cache
* Common MapReduce Algorithms
  o Sorting and Searching
  o Indexing
  o Classification/Machine Learning
  o Term Frequency - Inverse Document Frequency
  o Word Co-Occurrence
* Using Hive and Pig
  o Hive Basics
  o Pig Basics
* Debugging MapReduce Programs
  o Testing with MRUnit
  o Logging
  o Other Debugging Strategies
* Advanced MapReduce Programming
  o A Recap of the MapReduce Flow
  o Custom Writables and WritableComparables
  o The Secondary Sort
  o Creating InputFormats and OutputFormats
  o Pipelining Jobs With Oozie
* Joining Data Sets in MapReduce Jobs
  o Map-Side Joins
  o Reduce-Side Joins
* Graph Manipulation in Hadoop
  o Introduction to graph techniques
  o Representing Graphs in Hadoop
  o Implementing a sample algorithm: Single Source Shortest Path
* The New Hadoop API
* Cloudera Certified Hadoop Developer Exam
