
CURRICULUM VITAE

Certified Python Developer with 11 years of experience in Big Data, Application Development, Data Warehousing, Oracle Business Intelligence and Transactional applications. Extensive experience in Gathering Business Requirements, Data Modeling, System Design, Analysis and Production Support.

EXECUTIVE SUMMARY

 Experience in all phases of the Software Development Life Cycle (SDLC) including analysis, specification, software and database administration, development, maintenance, testing and documentation
 Hands-on experience with implementation of the Big Data ecosystem, including Hadoop MapReduce, NoSQL, Spark, Python, Hive, Impala, Sqoop, Flume, Kafka, Azure and Oozie
 Proficient in writing shell scripts in Linux, with strong hands-on experience
 Experience in analysing the data using Hive User Defined Functions
 Expertise in all layers of Hadoop Framework - Storage (HDFS), Analysis (Spark, Hive
and Impala), Engineering (Oozie Jobs and Workflows)
 Extensive usage of Sqoop, Flume, Oozie for data ingestion into HDFS & Hive
warehouse
 Hands-on experience with performance improvement techniques for data processing in Hive, Impala and Spark, using methods like dynamic partitioning, bucketing and file compression (see the sketch at the end of this summary)
 Expertise in MapR, Cloudera & Hortonworks distributions.
 Experience with different data formats like JSON, Avro, Parquet, RC and ORC, and compression codecs like Snappy and bzip2
 Proficient with data tools like Apache NiFi
 Worked on Spark to reduce the execution time of existing Hadoop processing, utilizing PySpark components like SparkContext, Spark SQL, DataFrames and RDDs
 Hands on experience with Real time streaming using Flume, Kafka & Spark into HDFS
 Hands on experience with Spark Application development using Python
 Worked on On-Prem servers to Azure Cloud migration POC
 Excellent ETL proficiency which includes Data Extraction, Transformation and
Loading, Database Modeling and Data Warehouse tools and technologies
 Worked extensively with Dimensional Modeling (Data Marts, Facts and Dimensions), Data Migration, loading high volumes of data, Data Cleansing and ETL Processes
 Strong technical expertise in creating and maintaining database objects - Tables,
Views, Materialized Views, Indexes, Sequences, Synonyms, Database Links and
Database Directories
 Working knowledge on different data sources ranging from Flat files, Excel, Oracle,
SQL server and DB2 databases
 Expertise in Data Migration, Data Loading and Exporting using Import/Export, SQL*Loader and UTL_FILE utilities
 Strong team player, able to work both independently and in a team, quick to adapt to a rapidly changing environment and committed to learning; excellent communication, project management, documentation and interpersonal skills
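
A minimal PySpark sketch of the dynamic partitioning, bucketing and compression techniques mentioned above; the table and column names are illustrative only and not taken from any actual project:

    from pyspark.sql import SparkSession

    # Hive support is required so the table is registered in the Hive warehouse
    spark = (SparkSession.builder
             .appName("partitioning-bucketing-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Hive setting commonly needed when writing partitions dynamically from the data
    spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")

    # Illustrative source data; in practice this would be a staging table or raw files
    txns = spark.read.parquet("/data/staging/transactions")

    # Write a Hive table partitioned by business date and bucketed by account id,
    # with Parquet + Snappy compression to reduce storage and speed up scans
    (txns.write
         .mode("overwrite")
         .format("parquet")
         .option("compression", "snappy")
         .partitionBy("txn_date")
         .bucketBy(32, "account_id")
         .sortBy("account_id")
         .saveAsTable("analytics.transactions_partitioned"))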

TECHNICAL SKILLS

Big Data : Hadoop, HDFS, MapReduce, Hive, Impala, Tez, Spark, Sqoop, Pig, HBase, Flume, Kafka, Oozie, NiFi

Hadoop Distributions : MapR, Cloudera (CDH), Hortonworks

Database : Oracle, SQL Server, Netezza, DB2, HBase, MongoDB

Languages : SQL, HTML, Java, Python, UNIX Shell Scripting

ETL/BI Tools : SSIS, Informatica 8.x, OBIEE 10.1.3.x / 11.1.1.x, SSRS

Query Tools : SQL Developer, SQL Navigator, SQL*Plus, T-SQL, SQL Explorer, TOAD, SQL*Loader

Operating Systems : Windows Server 98/2000/2003, Solaris 7.0/8.0, UNIX, Linux

Packages : MS Office Suite, MS Visio, MS Project Professional 2003

Other Tools : PyCharm, Autosys, Anthill, Maven, GitHub, Tortoise SVN, TIBCO DataSynapse, IBM Symphony


RELEVANT INDUSTRY EXPERIENCE

Wells Fargo & Co., Charlotte, NC 12/2016 – Till Date


Sr. Hadoop Developer

Responsibilities:
 Highly knowledgeable in the end-to-end functioning of the Capital Markets Counterparty Credit Risk Management process.
 Worked on Potential Future Exposure of Collateral Securities and Asset Management
project; using multiple source systems like ENDUR, CALYPSO, FENICS, GMI OTC,
BROADRIDGE, etc.
 Worked on designing, development and delivery of software solutions in the FX
Derivatives group for use by business users.
 Extensively worked on Hadoop with Cloudera Distribution
 Coordinated with Business Team and Source Teams to get the requirements
 Used SQOOP to import data from RDBMS source system, Spark for data cleansing
and loaded data into Hive staging and base tables. Developed permanent connectors
to automate this process
 Handled different file types such as JSON, XML, flat files and CSV, using appropriate SerDes or parsing logic to load them into Hive tables
 Implemented Hive UDFs and performed performance tuning using partitioning and bucketing for better results
 Analysed the data by using Impala queries and Spark to view transaction information
and validate the data
 Implemented optimized map joins to get data from different sources to perform
cleaning operations before applying the algorithms
 Coordinated with Cloudera team and Admin team to fix Cloudera update issues
 Implemented a Spark job to extract data from RDBMS systems, which reduced the job processing time (see the PySpark sketch after this list)
 Developed workflows in Oozie to manage and schedule jobs on the Hadoop cluster for extracting data from sources on a daily, weekly, monthly, quarterly and annual basis
 Used Autosys scheduler to schedule the workflows
 Analysed the production jobs in case of abends and fixed the issues within 30 mins
 Reduced the daily batch cycle time for some systems from 2 hours to less than 15 minutes using the fork and join concept in Oozie
 Worked on POC to migrate existing big data platform (on premises Cloudera) to
Azure
 Exported data from Hive table to EDW using SQOOP
 Created staging tables and developed workflows to extract data from different source systems in Hadoop and load it into these tables; the data from these staging tables is then exported via SFTP to a third-party system to execute data models
 Used JIRA and a Kanban board to track tasks, code check-ins, code deployments and documentation
 Worked in Agile development environment in monthly sprint cycles by dividing and
organizing tasks. Participated in daily scrum and other design related meetings
 Participated in Cloudera updates and tested the regions once the updates were complete
 Used SQL explorer, TOAD to view source data in DB2, Oracle, Netezza
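
A simplified PySpark sketch of the RDBMS-to-Hive ingestion described above; the JDBC URL, credentials, table and column names are placeholders for illustration, not actual project values:

    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("rdbms-to-hive-ingest")
             .enableHiveSupport()
             .getOrCreate())

    # Pull the source table over JDBC in parallel; connection details are placeholders
    trades = (spark.read.format("jdbc")
              .option("url", "jdbc:oracle:thin:@//dbhost:1521/SRCDB")
              .option("dbtable", "src_schema.trades")
              .option("user", "etl_user")
              .option("password", "********")
              .option("partitionColumn", "trade_id")
              .option("lowerBound", 1)
              .option("upperBound", 10000000)
              .option("numPartitions", 8)
              .load())

    # Basic cleansing before the Hive load: drop duplicates, trim keys, drop null ids
    cleansed = (trades.dropDuplicates()
                      .withColumn("counterparty_id", F.trim(F.col("counterparty_id")))
                      .filter(F.col("trade_id").isNotNull()))

    # Land the data in a staging table; a scheduled job can then merge it into base tables
    cleansed.write.mode("overwrite").saveAsTable("staging.trades_stg")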

Environment: Hadoop, Cloudera, Hive, Pig, SQOOP, Kafka, Spark, OOZIE, Python,
PySpark, UNIX, Shell scripting, RDBMS, Azure, Autosys

Premier Inc, Charlotte, NC 05/2016 – 12/2016


Sr. Hadoop Developer

Responsibilities:
 Implemented Hive Ad-hoc queries to handle Member hospital data from different data
sources such as Epic and Centricity
 Implemented Hive UDF's and did performance tuning for better results
 Analysed the data by performing Hive queries and running Pig Scripts to study patient
and practitioner behaviour
 Implemented optimized map joins to get data from different sources to perform cleaning
operations before applying the algorithms
 Experience in using Sqoop to import and export data between Netezza/Oracle DB and HDFS/Hive
 Implemented a POC to introduce Spark transformations (see the sketch after this list)
 Worked on transforming data from MapReduce into HBase using bulk load operations
 Developed workflow in Oozie to manage and schedule jobs on Hadoop cluster for
generating reports on nightly, weekly and monthly basis
 Implemented test scripts to support test driven development and continuous integration
 Used JIRA and Confluence to update tasks and maintain documentation
 Worked in Agile development environment in sprint cycles of two weeks by dividing
and organizing tasks. Participated in daily scrum and other design related meetings
 Used Sqoop to export the analysed data to a relational database for analysis by the data analytics team
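
A short PySpark sketch of the kind of Spark transformation introduced in the POC, replacing a Hive/Pig aggregation with DataFrame operations; the database, table and column names are illustrative only:

    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("spark-transformations-poc")
             .enableHiveSupport()
             .getOrCreate())

    # Encounter data previously analysed with Hive queries and Pig scripts
    encounters = spark.table("member_db.encounters")

    # Visits per practitioner per month, keeping only active facilities
    visit_summary = (encounters
                     .filter(F.col("facility_status") == "ACTIVE")
                     .withColumn("visit_month", F.date_format("visit_date", "yyyy-MM"))
                     .groupBy("practitioner_id", "visit_month")
                     .agg(F.count("*").alias("visit_count"),
                          F.countDistinct("patient_id").alias("unique_patients")))

    visit_summary.write.mode("overwrite").saveAsTable("member_db.practitioner_visit_summary")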

Environment: Hadoop, Cloudera, Hive, Sqoop, Flume, HBase, Spark, Oozie, Linux, Python, UNIX

Wells Fargo & Co., Charlotte, NC 11/2014 – 04/2016
Sr. SQL Developer

Responsibilities:
 Worked in Capital Markets and gained expertise on Counterparty Credit Risk
Management
 Performed Midyear and Annual CCAR exercises as per the Stress Values provided by the Federal Government
 Involved in gathering and analysing the requirements; and preparing business rules
 Coordinated with the front-end design team to provide them with the necessary
stored procedures, packages and necessary insight into the data
 Wrote UNIX shell scripts to process files on a daily basis, such as renaming the file, extracting the date from the file, unzipping the file and removing unwanted characters before loading them into base tables (an illustrative Python equivalent follows this list)
 Involved in the continuous enhancements and fixing of production problems
 Partitioned the fact tables and materialized views to enhance the performance
 Handled errors using exception handling and added check points extensively for the
ease of debugging and displaying the error messages in the application
 Used Tortoise SVN to check in all code changes to the repository and generated the Build Life using Anthill Pro; later deployed the code onto Grid and Coherence machines
 Extracted data from multiple data sources such as .bin, .dat and XML files; loaded the data into stage tables using Autosys jobs and UNIX Perl/shell scripts
 Created new simulation and valuation models for calculating exposure values used
in credit risk process
 Received Shared Success Award for my contributions to the team
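
The daily file handling above was implemented as shell scripts; the following is a rough Python equivalent of those steps, with the directory paths and file-naming convention assumed purely for illustration:

    import gzip
    import re
    import shutil
    from pathlib import Path

    INBOUND = Path("/data/inbound")    # hypothetical landing directory
    STAGING = Path("/data/staging")    # hypothetical staging directory

    for src in INBOUND.glob("*.gz"):
        # Assume the business date is embedded in the name, e.g. FEED_20160115.dat.gz
        match = re.search(r"(\d{8})", src.name)
        business_date = match.group(1) if match else "unknown"

        # Unzip into the staging area
        target = STAGING / src.name.replace(".gz", "")
        with gzip.open(src, "rb") as fin, open(target, "wb") as fout:
            shutil.copyfileobj(fin, fout)

        # Remove characters that would break the load (e.g. carriage returns)
        text = target.read_text(errors="ignore").replace("\r", "")
        target.write_text(text)

        # Rename to the convention the load job expects
        target.rename(STAGING / f"{business_date}_{target.name}")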

Environment: Oracle 9i/10g, SQL Server 2012, UNIX, HP ALM, Autosys, Tibco
DataSynapse, Anthill, Tortoise SVN

Taya Technologies, Hyderabad, India 12/2011 – 09/2014


Team Lead - Oracle Development

Responsibilities
 Interacted with users to gather business requirements; analysed, designed and developed the data feed and reporting systems
 Designed and developed packages, stored procedures and functions for efficiently loading, validating and cleansing the data; also created users and roles as needed for the applications
 Created Cursors, Collections and database triggers for maintaining complex integrity
constraints and implementing the complex business rules
 Worked on Performance tuning using the Partitioning and indexing concepts (Local
and Global indexes on partition tables)
 Performed Data Extraction, Transformation and Loading (ETL process) from Source
to target systems
 Worked on Windows Batch scripting, scheduling jobs and monitoring logs
 Created UNIX shell scripts for Informatica feeds and to execute Oracle procedures (see the sketch after this list)
 Designed processes to extract, transform, and load data to the Data Mart
 Performed ETL Process by using Informatica Power Center
 Optimized SQL used in Reports and Files to improve performance drastically
 Used various transformations such as Aggregator, Update Strategy, Lookup, Expression, Stored Procedure, Filter, Source Qualifier, Sequence Generator and Router
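
A minimal sketch of calling an Oracle stored procedure from Python with cx_Oracle, in place of the shell-script wrapper mentioned above; the connection details, package and procedure names are hypothetical:

    import cx_Oracle

    # Placeholder credentials and DSN for illustration only
    conn = cx_Oracle.connect("etl_user", "********", "dbhost:1521/ORCL")
    cur = conn.cursor()

    # OUT parameter to capture the procedure's result code
    status = cur.var(cx_Oracle.STRING)

    # Invoke the hypothetical load/validate procedure with a feed name and business date
    cur.callproc("feed_pkg.load_daily_feed", ["CUSTOMER_FEED", "2013-06-30", status])
    print("Load status:", status.getvalue())

    conn.commit()
    cur.close()
    conn.close()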

Environment: Oracle 11g, Informatica, SQL, PL/SQL, SQL Loader, SQL*Plus, Autosys,
Tortoise SVN, TOAD and UNIX

Syntel Ltd, Chennai, India 11/2010 – 12/2011


Sr. Oracle Developer

Responsibilities
 Worked closely with architects, designers and developers to translate data
requirements into the physical schema definitions for Oracle
 Trapped errors such as INVALID_CURSOR, VALUE_ERROR using exception
handler
 Tuned the database and SQL scripts for optimal performance; redesigned and rebuilt schemas to meet performance targets
 Extensively Involved in Performance tuning and Optimization of the SQL queries
 Wrote Stored Procedures, Functions, Packages and Triggers using SQL to implement business rules and processes; also performed code debugging using TOAD
 Set up batch and production jobs through Autosys
 Created Shell scripts to access and setup runtime environment, and to run Oracle
Stored Procedures, Packages
 Executed batch files for loading database tables from flat files using SQL*Loader
 Created UNIX Shell Scripts for automating the execution process

Environment: Oracle 9i, SQL, PL/SQL, TOAD, Unix, SQL Server 2003, XML

Infosys Limited, Hyderabad, India 07/2007 – 09/2009
Senior Systems Engineer - Java Development

Responsibilities
 Key responsibilities included requirements gathering, designing and developing the
Java application.
 Identified and fixed transactional issues due to incorrect exception handling and concurrency issues due to unsynchronized blocks of code.
 Created Java application module for providing authentication to the users for using
this application and to synchronize handset with the Exchange server.
 Performed unit testing, system testing and user acceptance test.
 Gathered specifications for the Library site from different departments and users of
the services.
 Wrote SQL scripts to create and maintain the database, roles, users, tables, views,
procedures and triggers.
 Designed and implemented the UI using HTML and Java.
 Worked on database interaction layer for insertions, updating and retrieval
operations on data.

Environment: Java, JDBC, HTML, SQL, Oracle, IBM Rational, Eclipse IDE

EDUCATIONAL QUALIFICATION

Master of Business Administration : Mumbai Business School (’09 – ’10)


Bachelor of Engineering : Anna University, Chennai (’03 – ’07)

CERTIFICATIONS

 Python 3 Certification (Certificate # - 1073 – 11297679)


 Microsoft Certification (Certificate # - SR5319070)
