
Big Data Proof of Concept

Scope:
Banking Domain

Problem definition:
How effectively can ETL and the data warehouse (DW) be replaced with Hadoop?

Stages:
1. Data Source
2. Data Model
3. ETL Conversion
4. DW
5. OLAP Data Model for Analysis
6. Reporting
7. Visualization
8. Value

Solution
Phase I
1. Replace ETL & DW
   a. Collect structured data.
   b. Data model for Hive/Pig.
   c. Load data into HDFS via scripts.
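
A minimal sketch of what a load script for HDFS might look like, assuming the standard `hdfs dfs` command-line client is available on the machine running it. The file names and the target directory `/poc/banking/raw` are hypothetical placeholders, not part of this plan. The sketch defaults to a dry run, so it only prints the commands it would execute.

```python
import subprocess


def build_hdfs_load_commands(local_files, hdfs_dir):
    """Return the `hdfs dfs` commands that would copy local files into hdfs_dir."""
    # Create the target directory first; -p avoids an error if it already exists.
    commands = [["hdfs", "dfs", "-mkdir", "-p", hdfs_dir]]
    for local_file in local_files:
        # -f overwrites an existing file of the same name in the target directory.
        commands.append(["hdfs", "dfs", "-put", "-f", str(local_file), hdfs_dir])
    return commands


def load_into_hdfs(local_files, hdfs_dir, dry_run=True):
    """Print the load commands (dry run) or execute them against a real cluster."""
    commands = build_hdfs_load_commands(local_files, hdfs_dir)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the load at the first failing command.
            subprocess.run(command, check=True)
    return commands
```

For example, `load_into_hdfs(["accounts.csv", "transactions.csv"], "/poc/banking/raw")` prints one `-mkdir` command followed by one `-put` command per file. Passing `dry_run=False` would run them, which requires a configured Hadoop client.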

Phase II
1. Data model for OLAP using Hive.
2. Script for loading data to OLAP.

RASIC

Task                      Resource    Duration
1. Data Source                        2 days
2. Cluster Setup                      2 days
3. Data Model                         2 days
4. Compare DB vs HDFS                 2 days

POC 29-07-2013
1. Difference between HDFS, OLTP, HBase, Pig, Hive.
2. Pros and cons.
3. Log file analysis.
   a. Fetch source log (existing log/real-time log).
   b. Load to Hadoop (using Flume for real-time logs).
   c. Cleansing of log data in Hadoop (Ts phase of EsTsL).
   d. Join data using Pig.
   e. Store joined data in Hadoop. Final stage.
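
The cleansing and join steps of the log file analysis can be prototyped locally before being moved to the cluster. The sketch below is a plain-Python stand-in for logic the plan runs in Hadoop and Pig, not a Pig script. The pipe-delimited log layout (`timestamp|customer_id|action`) and the customer lookup table are assumptions made for illustration only.

```python
def cleanse(lines):
    """Drop blank or malformed lines and normalise the remaining fields."""
    records = []
    for line in lines:
        parts = [part.strip() for part in line.split("|")]
        # Keep only lines with exactly three non-empty fields (assumed layout).
        if len(parts) != 3 or not all(parts):
            continue
        timestamp, customer_id, action = parts
        records.append(
            {
                "timestamp": timestamp,
                "customer_id": customer_id,
                "action": action.lower(),
            }
        )
    return records


def join_on_customer(log_records, customers):
    """Inner join of cleansed log records with a customer_id -> name mapping."""
    joined = []
    for record in log_records:
        name = customers.get(record["customer_id"])
        # Inner-join semantics: records without a matching customer are dropped.
        if name is not None:
            joined.append({**record, "customer_name": name})
    return joined
```

In a cluster run, the same inner join on `customer_id` would be expressed with Pig's `JOIN` operator over data already stored in HDFS. The Python version is only meant to pin down the expected behaviour on a small sample.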
