
Deepak Sanagapalli

Hadoop Developer
Email: deepaksanagapalli93@gmail.com
Phone no: (614)-726-2750

PROFESSIONAL SUMMARY:

8+ years of overall experience in the IT industry, including Java, Big Data technologies, and multi-tiered web applications built with Java, Hadoop, Hive, HBase, Pig, Sqoop, J2EE (Spring, JSP, Servlets), JDBC, HTML, and JavaScript (AngularJS).
Working knowledge of various other Cloudera Hadoop technologies (Impala, Sqoop, HDFS, Spark, Scala, etc.).
4 years of comprehensive experience in Big Data Analytics.
Extensive experience in Hadoop Architecture and various components such as HDFS, Job
Tracker, Task Tracker, Name Node, Data Node, and MapReduce concepts.
Expertise in Apache Spark Development (Spark SQL, Spark Streaming, MLlib, GraphX,
Zeppelin, HDFS, YARN, and NoSQL).
Well versed in installing, configuring, supporting, and managing Hadoop clusters and the underlying Big Data infrastructure, including CDH3 and CDH4 clusters.
Designed and implemented a Cassandra-based database and related web service for storing unstructured data.
Experience on NoSQL databases including HBase, Cassandra.
Designed and implemented a Cassandra NoSQL database and an associated RESTful web service that persists high-volume user profile data for vertical teams.
Experience in building large-scale, highly available web applications; working knowledge of web services and other integration patterns.
Experience in managing and reviewing Hadoop log files.
Experience in using Pig, Hive, Sqoop and Cloudera Manager.
Experience in importing and exporting data using Sqoop from HDFS to Relational Database
Systems and vice-versa.
Hands-on experience with RDBMS and Linux shell scripting.
Extending Hive and Pig core functionality by writing custom UDFs (see the Hive UDF sketch after this summary).
Experience in analyzing data using HiveQL, Pig Latin and Map Reduce.
Developed MapReduce jobs to automate transfer of data from HBase.
Knowledge in job work-flow scheduling and monitoring tools like Oozie and Zookeeper.
Knowledge of data warehousing and ETL tools like Informatica and Pentaho.
Experienced in Oracle Database Design and ETL with Informatica.
Mentored, coached, and cross-trained junior developers by providing domain knowledge and design advice.
Proven ability in defining goals, coordinating teams and achieving results.
Experience with PL/SQL procedures, functions, packages, views, materialized views, function-based indexes, triggers, dynamic SQL, and ad-hoc reporting using SQL.
Experience with Business Intelligence and data warehouse (DW) applications.
Hands-on experience with ETL processes.
Knowledge of job workflow scheduling and monitoring tools like Oozie and Zookeeper; of NoSQL databases such as HBase and Cassandra; and of administrative tasks such as installing Hadoop, commissioning and decommissioning nodes, and managing ecosystem components such as Flume, Oozie, Hive and Pig.
Extensive experience in using MVC architecture, Struts, Hibernate for developing web
applications using Java, JSPs, JavaScript, HTML, jQuery, AJAX, XML and JSON.
Excellent Java development skills using J2EE, Spring, J2SE, Servlets, JUnit, MRUnit, JSP and JDBC.
Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, providing estimates, designing custom solutions, development, leading developers, producing documentation, and production support.
Excellent interpersonal and communication skills, creative, research-minded, technically competent
and result-oriented with problem solving and leadership skills.
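
The following is a minimal, illustrative sketch of the kind of custom Hive UDF referenced above; the function name and normalization rule are hypothetical, not taken from any specific project.

    // Minimal Hive UDF sketch (classic UDF API): trims and upper-cases a status string.
    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    @Description(name = "normalize_status",
                 value = "_FUNC_(str) - trims and upper-cases a status string")
    public class NormalizeStatusUDF extends UDF {
        // Hive calls evaluate() once per row; returning null propagates NULLs.
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Once packaged into a JAR, such a function would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.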

EDUCATION:

Bachelor of Technology from Jawaharlal Technological University, Hyderabad, India

TOOLS AND TECHNOLOGIES:

Programming Languages C, C++, Java, Python, Scala, Shell Scripting, SQL, PL/SQL
J2EE Technologies Core Java, Spring, Servlets, SOAP/REST services, JSP, JDBC, XML, Hibernate.
BigData Ecosystem HDFS, HBase, MapReduce, Hive, Pig, Sqoop, Impala, Cassandra, Oozie,
Zookeeper, Flume, Ambari, Storm, Spark and Kafka.
Databases NoSQL, Oracle 10g/11g/12c, SQL Server 2008/2008 R2/2012/2014/2016/2017, MySQL.
Database Tools Oracle SQL Developer, MongoDB, TOAD and PLSQL Developer
Web Technologies HTML5, JavaScript, XML, JSON, jQuery, Ajax
Web Services WebLogic, WebSphere, Apache Cassandra, Tomcat
IDEs Eclipse, NetBeans, WinSCP.
Operating systems Windows, UNIX, Linux (Ubuntu, CentOS), Solaris, Windows Server 2003/2008/2012/2016.
Version and Source Control CVS, SVN, Clear Case
Servers IBM WebSphere 4.0/5.x/8.5/9.0, Apache Tomcat 4.x/5.x/6.x/7.0/8.x/9.0, JBoss
3.2/4.0/5.1/7.1/8.0/9.0/10.1
Frameworks MVC, Struts, Log4J, JUnit, Maven, Ant, Web Services.

PROFESSIONAL EXPERIENCE:

Client: Nationwide Insurance, Columbus, OH Jan 2016 - Till Date


Role: Hadoop Developer

Description: Nationwide Insurance is a leading company that provides many types of insurance, including commercial auto insurance and health insurance. We process large volumes of customer data into structured formats using various tuning techniques and develop a web application that helps Nationwide, a leading commercial auto insurer, validate insurance claims. The application assigns a claim number with the customer's details bound to it, which is sent to a claims administrator and then assigned to officers who process the claims.

Responsibilities:

Installed and configured a multi-node, fully distributed Hadoop cluster.


Analyzed large and critical datasets using Cloudera, HDFS, HBase, MapReduce, Hive, Hive UDF,
Pig, Sqoop, Zookeeper and Spark.
Imported data into HDFS from various SQL databases and files using Sqoop and from streaming
systems using Storm into Big Data Lake.
Worked with NoSQL databases like HBase to create tables and store the data.
Collected and aggregated large amounts of log data using Apache Flume and staged data in HDFS for further analysis.
Developed custom aggregate functions using Spark SQL and performed interactive querying.
Wrote Pig scripts to store the data into HBase.
Created Hive tables, dynamic partitions, and buckets for sampling, and worked on them using HiveQL.
Stored the data in tabular formats using Hive tables and Hive SerDe.
Exported the analyzed data to Teradata using Sqoop for visualization and to generate reports for the BI team.
Experienced in loading and transforming large sets of structured, semi-structured and unstructured data.
Spark Streaming collects this data from Kafka in near real time and performs the necessary transformations and aggregations on the fly to build the common learner data model, persisting the data in a NoSQL store, HBase (see the Spark Streaming sketch after this list).
Involved in Installing Hadoop Ecosystem components.
Responsible to manage data coming from different sources.
Set up and administered the Hadoop cluster environment, including adding and removing cluster nodes, cluster capacity planning and performance tuning.
Wrote complex MapReduce programs.
Involved in HDFS maintenance and administering it through the Hadoop Java API.
Configured Fair Scheduler to provide service level agreements for multiple users of a cluster
Loaded data into the cluster from dynamically generated files using Flume and from RDBMS using Sqoop.
Involved in writing Java APIs for interacting with HBase
Involved in writing Flume and Hive scripts to extract, transform and load data into the database.
Used HBase as the data store.
Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Experienced in installing, configuring and using Hadoop Ecosystem components.
Experienced in Importing and exporting data into HDFS and Hive using Sqoop.
Knowledge in performance troubleshooting and tuning Hadoop clusters.
Participated in development/implementation of Cloudera Hadoop environment.
Implemented Partitioning, Dynamic Partitions and Buckets in HIVE for efficient data access.
Experienced in running query using Impala and used BI tools to run ad-hoc queries directly on Hadoop.
Installed and configured Hive, wrote Hive UDFs, and used JUnit for unit testing of MapReduce code.
Experienced in working with various kinds of data sources such as Teradata and Oracle. Successfully loaded files to HDFS from Teradata and loaded data from HDFS into Hive and Impala.
Load and transform large sets of structured, semi structured and unstructured data using Hadoop/Big
Data concepts.
Developed and delivered quality services on-time and on-budget. Solutions developed by the team use
Java, XML, HTTP, SOAP, Hadoop, Pig and other web technologies.
Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
Developed MapReduce programs to parse the raw data, populate staging tables and store the refined
data in partitioned tables in the EDW.
Monitored and managed the Hadoop cluster using Apache Ambari.
Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and
troubleshooting, manage and review data backups, manage and review Hadoop log files.
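
The bullet on Spark Streaming above can be illustrated with a hedged Java sketch of that flow: consume events from Kafka, apply a transformation, and persist to HBase. The topic, consumer group, ZooKeeper quorum, table and column names are hypothetical placeholders, and the receiver-based spark-streaming-kafka API is assumed.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.UUID;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public class ClaimEventStream {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("ClaimEventStream");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

            // Receiver-based Kafka stream (spark-streaming-kafka 0.8-style API).
            Map<String, Integer> topics = new HashMap<>();
            topics.put("claim-events", 1);                        // hypothetical topic
            JavaPairDStream<String, String> stream =
                    KafkaUtils.createStream(jssc, "zk1:2181", "claim-group", topics);

            stream.foreachRDD(rdd -> rdd.foreachPartition(records -> {
                // Open one HBase connection per partition rather than per record.
                Configuration hbaseConf = HBaseConfiguration.create();
                try (Connection conn = ConnectionFactory.createConnection(hbaseConf);
                     Table table = conn.getTable(TableName.valueOf("claims"))) {
                    while (records.hasNext()) {
                        String payload = records.next()._2();     // raw event payload
                        Put put = new Put(Bytes.toBytes(UUID.randomUUID().toString()));
                        put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("raw"), Bytes.toBytes(payload));
                        table.put(put);
                    }
                }
            }));

            jssc.start();
            jssc.awaitTermination();
        }
    }

In practice the row key and column layout would follow the learner data model rather than a random UUID; the sketch only shows the plumbing.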

Environment: Java, Hadoop, Hive, Pig, Sqoop, Flume, HBase, Oracle 10g/11g/12c, Teradata, Cassandra, HDFS, Data Lake, Spark, MapReduce, Ambari, Cloudera, Tableau, Snappy, Zookeeper, NoSQL, Shell Scripting, Ubuntu, Solr.

Client: AT&T, Middletown, NJ Jan 2015 - Dec 2015


Role: Hadoop Developer

Description: The project is to build a dashboard that displays visual data, largely in the form of charts, and is used by VPs and senior VPs inside AT&T. After logging in to the secure global page, the dashboard presents information about the projects being monitored under them.

Responsibilities:

Responsible for building scalable distributed data pipelines using Hadoop.


Used Apache Kafka for tracking data ingestion to Hadoop cluster.
Wrote Pig scripts to debug Kafka hourly data and perform daily roll-ups.
Migrated data from existing Teradata systems to HDFS and built datasets on top of it.
Built a framework of shell scripts to automate Hive registration, which handles dynamic table creation and automatically adds new partitions to tables.
Designed Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning and buckets.
Setup and benchmarked Hadoop/HBase clusters for internal use. Developed Simple to complex
MapReduce programs.
Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms (see the compression driver sketch after this list).
Developed Oozie workflows that chain Hive/MapReduce modules for ingesting periodic/hourly input
data.
Wrote Pig & Hive scripts to analyze the data and detect user patterns.
Implemented Device based business logic using Hive UDFs to perform ad-hoc queries on structured
data.
Stored and loaded data from HDFS to Amazon S3 and backed up the namespace data onto NFS filers.
Prepared Avro schema files for generating Hive tables and shell scripts for executing Hadoop
commands for single execution.
Continuously monitored and managed the Hadoop cluster by using Cloudera Manager.
Worked with administration team to install operating system, Hadoop updates, patches, version
upgrades as required.
Developed ETL pipelines to source data to Business intelligence teams to build visualizations.
Involved in unit testing, interface testing, system testing and user acceptance testing of the workflow tool.
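
As a hedged illustration of the compression tuning mentioned above, the driver below enables Snappy for intermediate map output and writes block-compressed SequenceFile output; the job name, identity mapper/reducer and paths are illustrative assumptions, not the actual AT&T jobs.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

    public class HourlyRollupDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Compress intermediate map output with Snappy to cut shuffle I/O.
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.setClass("mapreduce.map.output.compress.codec",
                          SnappyCodec.class, CompressionCodec.class);

            Job job = Job.getInstance(conf, "hourly-rollup");
            job.setJarByClass(HourlyRollupDriver.class);
            job.setMapperClass(Mapper.class);      // identity mapper, for the sketch only
            job.setReducerClass(Reducer.class);    // identity reducer, for the sketch only
            job.setOutputKeyClass(LongWritable.class);
            job.setOutputValueClass(Text.class);

            // Write block-compressed SequenceFiles to HDFS.
            job.setOutputFormatClass(SequenceFileOutputFormat.class);
            SequenceFileOutputFormat.setCompressOutput(job, true);
            SequenceFileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
            SequenceFileOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.BLOCK);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }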

Environment: Cloudera Manager, Map Reduce, HDFS, Pig, Hive, Sqoop, Apache Kafka, Oozie, Teradata, Avro,
Java (JDK 1.6), Eclipse.

Client: VISA Inc, Wellesley, MA May 2013 - Dec 2014


Role: Hadoop Developer

Description: Visa is a global payments technology company that connects consumers, businesses, banks and
governments in more than 200 countries and territories, enabling them to use electronic payments instead of cash
and checks.

Responsibilities:

Created dashboards according to user specifications and prepared stories to provide an understandable vision.
Resolved user support requests.
Administered and supported Hadoop clusters.
Loaded data from RDBMS to Hadoop using Sqoop.
Provided solutions to ETL/data warehousing teams as to where to store the intermediate and final output files in the various layers in Hadoop.
Worked collaboratively to manage build outs of large data clusters.
Helped design big data clusters and administered them.
Worked both independently and as an integral part of the development team.
Communicated all issues and participated in weekly strategy meetings.
Administered back end services and databases in the virtual environment.
Implemented system wide monitoring and alerts.
Implemented big data systems in cloud environments.
Created security and encryption systems for big data.
Performed administration, troubleshooting and maintenance of ETL and ELT processes.
Collaborated with multiple teams for design and implementation of big data clusters in cloud
environments
Developed Pig Latin scripts for the analysis of semi-structured data.
Developed industry-specific UDFs (user-defined functions); see the Pig UDF sketch after this list.
Used Hive and created Hive tables and involved in data loading and writing Hive UDFs.
Used Sqoop to import data into HDFS and Hive from other data systems.
Continuous monitoring and managing the Hadoop cluster through Cloudera Manager.
Migrated ETL processes from RDBMS to Hive to test easier data manipulation.
Developed Hive queries to process the data for visualizing.
Installed and configured Apache Hadoop to test the maintenance of log files in Hadoop cluster.
Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
Developed a custom file system plugin for Hadoop to access files on data platform.
The custom file system plugin allows Hadoop Map Reduce programs, HBase, Pig, and Hive to access
files directly.
Experience in defining, designing and developing Java applications, especially using Hadoop [Map/Reduce] by leveraging frameworks such as Cascading and Hive.
Vast knowledge of and experience with Teradata.
Extracted feeds from social media sites.
Imported data using Sqoop to load data from Oracle to HDFS on a regular basis.
Developing scripts and batch jobs to schedule various Hadoop Programs.
Wrote Hive queries for data analysis to meet the business requirements.

Created Hive tables and worked on them using HiveQL.
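
A hedged sketch of the kind of industry-specific Pig UDF mentioned above follows; the masking rule and field are hypothetical examples, not the actual VISA logic.

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class MaskCardNumber extends EvalFunc<String> {
        // Pig calls exec() once per input tuple; keep only the last four digits.
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            String pan = input.get(0).toString().replaceAll("\\D", "");
            if (pan.length() < 4) {
                return null;
            }
            return "****-****-****-" + pan.substring(pan.length() - 4);
        }
    }

After REGISTERing the JAR in a Pig script, such a function can be applied inside a FOREACH ... GENERATE statement.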

Environment: HDFS, Hive, ETL, Pig, UNIX, Linux, CDH 4 distribution, Tableau, Impala, Teradata, Sqoop, Flume, Oozie

Client: GNS Healthcare - Cambridge, MA Aug 2012 - Apr 2013


Role: Hadoop Developer

Description: The health record team of the GNS Health initiative gathers patient/person information across all the data sources and creates a Person record that is used by downstream systems for running analytics against that data.

Responsibilities:

Oversaw the design phase to develop technical solutions from analysis documents.
Exported data from DB2 to HDFS using Sqoop.
Developed MapReduce jobs using the Java API (see the MapReduce sketch after this list).
Installed and configured Pig and also wrote Pig Latin scripts.
Wrote MapReduce jobs using Pig Latin.
Developed workflow using Oozie for running MapReduce jobs and Hive Queries.
Worked on Cluster coordination services through Zookeeper.
Worked on loading log data directly into HDFS using Flume.
Involved in loading data from LINUX file system to HDFS.
Responsible for managing data from multiple sources.
Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
Responsible for managing data coming from different sources.
Assisted in exporting analyzed data to relational databases using Sqoop.
Implemented JMS for asynchronous auditing purposes.
Created and maintained Technical documentation for launching Cloudera Hadoop Clusters and for
executing Hive queries and Pig Scripts
Experience with CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters
Experience in defining, designing and developing Java applications, especially using Hadoop [Map/Reduce] by leveraging frameworks such as Cascading and Hive.
Experience in developing monitoring and performance metrics for Hadoop clusters.
Experience in documenting designs and procedures for building and managing Hadoop clusters.
Strong experience in troubleshooting the operating system, maintaining the cluster, and resolving Java-related bugs.
Experienced in importing/exporting data into HDFS/Hive from relational databases and Teradata using Sqoop.
Involved in Creating, Upgrading, and Decommissioning of Cassandra clusters.
Involved in working on the Cassandra database to analyze how the data gets stored.
Successfully loaded files to Hive and HDFS from MongoDB and Solr.
Experience in automating deployment, management and self-serve troubleshooting of applications.
Defined and evolved the existing architecture to scale with growing data volume, users and usage.
Designed and developed a Java API (Commerce API) that provides functionality to connect to Cassandra through Java services.
Installed and configured Hive and also wrote Hive UDFs.
Experience in managing CVS and migrating to Subversion.
Experience in managing development time, bug tracking, project releases, development speed, release forecasting, scheduling and more.
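
The following is a hedged sketch of a MapReduce job written against the Java API, in the spirit of the bullets above: it counts person records per source system. The pipe-delimited layout and field position are illustrative assumptions.

    import java.io.IOException;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class RecordsPerSource {

        public static class SourceMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text source = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\|");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    source.set(fields[0]);          // assume source-system id is field 0
                    ctx.write(source, ONE);
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance();
            job.setJarByClass(RecordsPerSource.class);
            job.setMapperClass(SourceMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }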

Environment: Hadoop, HDFS, Hive, Flume, Sqoop, HBase, PIG, Eclipse, MySQL and Ubuntu, Zookeeper, Java
(JDK 1.6).

Client: Order fulfillment system, Bengaluru, India Jan 2011 - Jul 2012
Role: Java Developer

Description: Cadence enables global electronic-design innovation and plays an essential role in the creation of
today's integrated circuits and electronics. Customers use Cadence software and hardware, methodologies, and
services to design and verify advanced semiconductors, printed-circuit boards and systems used in consumer
electronics, networking and telecommunications equipment, and computer systems.

Responsibilities:

Gathered user requirements followed by analysis and design. Evaluated various technologies for the
client.
Developed HTML and JSP to present the client-side GUI.
Involved in development of JavaScript code for client-side validations.
Designed and developed the HTML-based web pages for displaying the reports.
Developed Java classes and JSP files.
Extensively used JSF framework.
Created Cascading Style Sheets that are consistent across all browsers and platforms
Extensively used XML documents with XSLT and CSS to translate the content into HTML to present to
GUI.
Developed dynamic content of presentation layer using JSP.
Developed user-defined tags using XML.
Developed Cascading Style Sheets (CSS) for creating effects in Visualforce pages.
Used JavaMail for automatic emailing and JNDI to interact with the knowledge server (see the JavaMail sketch after this list).
Used Struts Framework to implement J2EE design patterns (MVC).
Developed, Tested and Debugged the Java, JSP and EJB components using Eclipse.
Developed Enterprise JavaBeans such as entity beans, session beans (both stateless and stateful) and message-driven beans.
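
A minimal, hedged sketch of the automatic-emailing piece mentioned above, using the JavaMail API; the SMTP host and addresses are placeholders.

    import java.util.Properties;

    import javax.mail.Message;
    import javax.mail.Session;
    import javax.mail.Transport;
    import javax.mail.internet.InternetAddress;
    import javax.mail.internet.MimeMessage;

    public class ReportMailer {
        public static void send(String to, String subject, String body) throws Exception {
            Properties props = new Properties();
            props.put("mail.smtp.host", "smtp.example.com");   // placeholder SMTP host

            Session session = Session.getInstance(props);
            Message msg = new MimeMessage(session);
            msg.setFrom(new InternetAddress("reports@example.com"));   // placeholder sender
            msg.setRecipient(Message.RecipientType.TO, new InternetAddress(to));
            msg.setSubject(subject);
            msg.setText(body);
            Transport.send(msg);
        }
    }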

Environment: Java, J2EE 6, EJB 2.1, JSP 2.0, Servlets 2.4, JNDI 1.2, JavaMail 1.2, JDBC 3.0, Struts,
HTML, XML, CORBA, XSLT, JavaScript, Eclipse 3.2, Oracle 10g, WebLogic 8.1, Windows 2003.

Client: Maruthi Insurance, Hyderabad, India Mar 2009 - Dec 2010


Role: Java Developer

Description: The project basically involved the automation of the existing system for generating quotations for automobile insurance. The Insured Auto project has two modules; the user module, RetailAutoQuote, is used to register new customers of a group insurance policy, generate quotations for automobile insurance, edit details and place requests.

Responsibilities:

Created the database, user, environment, activity and class diagrams for the project (UML).
Implemented the database using the Oracle database engine.
Designed and developed a fully functional, generic n-tiered J2EE application platform; the environment was Oracle technology driven. The entire infrastructure application was developed using Oracle JDeveloper in conjunction with Oracle ADF-BC and Oracle ADF RichFaces.
Created an entity object (business rules and policy, validation logic, default value logic, security)
Created View objects, View Links, Association Objects, Application modules with data validation rules
(Exposing Linked Views in an Application Module), LOV, dropdown, value defaulting, transaction
management features.
Web application development using J2EE: JSP, Servlets, JDBC, Java Beans, Struts, Ajax, JSF, JSTL, custom tags, EJB, JNDI, Hibernate, ANT, JUnit, Apache Log4J, Web Services, and Message Queue (MQ); see the JDBC sketch after this list.
Designed the GUI prototype using ADF 11g GUI components before finalizing it for development.
Used Cascading Style Sheet (CSS) to attain uniformity through all the pages
Created reusable components (ADF Library and ADF Task Flow).
Experience using Version controls such as CVS, PVCS, and Rational Clear Case.
Created modules using bounded and unbounded task flows.
Generated WSDL (web services) and created workflows using BPEL.
Handled AJAX functions (partialTrigger, partialSubmit, autoSubmit).
Created the Skin for the layout.
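
A small, hedged JDBC illustration of the kind of data access used in the quotation modules; the connection URL, credentials, table and column names (AUTO_QUOTE, QUOTE_NO, PREMIUM) are hypothetical.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class QuoteDao {
        private final String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";  // placeholder URL

        public Double findPremium(String quoteNo) throws SQLException {
            String sql = "SELECT premium FROM auto_quote WHERE quote_no = ?";
            // Credentials are placeholders; a container-managed DataSource would be typical.
            try (Connection con = DriverManager.getConnection(url, "app_user", "changeit");
                 PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setString(1, quoteNo);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getDouble("premium") : null;
                }
            }
        }
    }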

Environment: Core Java, Servlets, JSF, ADF Rich Client UI Framework, ADF-BC (BC4J) 11g, web services using Oracle SOA (BPEL), Oracle WebLogic.
