Professional Summary:
• Over 6 years of IT work experience, including 5+ years of relevant experience with Big Data, Hadoop, Hive, Pig, Java, Spark, and Sqoop technologies, 1+ year of experience with Oracle technologies, and 8+ months of experience in Java development.
• Seeking a challenging software development position that demands innovation, creativity, and dedication, where I can continue working in a fast-paced environment while leveraging my current knowledge and pursuing new learning opportunities.
• Strong problem-solving capability, with ER modeling for OLTP and OLAP as a primary design focus; implemented the solutions best suited to business needs. Developed core modules in large cross-platform applications using Java/J2EE on the JVM.
• Repurposed Python scripts for Java and AWS Lambda environments.
• Monitored Cassandra clusters using Splunk.
• Hands-On Experience in Hadoop/HDFS, MapReduce, Hive, HBase, Pig, Sqoop,
Amazon Elastic Map Reduce (EMR), Spark, Cloudera (CDH 3, 4, 5) sandbox
environments
• Rich experience in big data, data warehousing, database, and business intelligence work using Oracle Database (10g, 11g). Involved in project activities spanning requirements gathering, systems analysis and design, code generation, testing, implementation, support, and maintenance.
• Experience developing and automating applications using UNIX shell scripting, and in Big Data using MapReduce programming for batch processing of jobs on an HDFS cluster with Hive and Pig.
• Developed real-time Big Data solutions using NoSQL databases (HBase, Cassandra, MongoDB, CouchDB) capable of handling petabytes of data.
• Worked in Spark and Spark SQL environments using Scala, including Spark Streaming, Spark SQL programming, and performance tuning.
• Created RDDs and performed operations on DataFrames and Datasets for various use cases.
• Worked with the HBase shell, CQL, and the HBase API, developing ingestion and clustering frameworks on top of Kafka, ZooKeeper, YARN, Spark, and Mesos.
• Capturing data from existing databases that provide SQL interfaces using Sqoop and
processing stream data using Kafka, Spark Streaming and Flume.
• Hands-on experience setting up ZooKeeper to provide high availability to clusters, hands-on programming with Oozie, and good knowledge of processing log data with Apache Flume.
• Developed a Python-based API for converting files to key-value pairs so they could be sourced to the Splunk forwarder.
• Developed a fully automated continuous integration system using Git, Jenkins, Splunk,
Hunk, Oracle and custom tools developed in Python and Bash.
• Strong experience in RDBMS using Oracle 10g, SQL Server, PL-SQL programming,
schema development, Oracle performance tuning.
• Actively participated in troubleshooting Tomcat, web server, and Oracle problems (killing instances, debugging server and application logs).
• Wrote SQL queries, stored procedures, and modifications to the existing database structure as required for new features.
• Designed and developed Enterprise Eligibility business objects and domain objects with
Object Relational Mapping framework such as Hibernate.
• Experienced in design and development of various web and enterprise applications
using J2EE technologies like JSP, Servlets, JSF, EJB, JDBC, Hibernate, Spring MVC,
XML, JSON, AJAX, ANT and Web Services (SOAP, REST, WSDL).
• Experienced in web and GUI development using HTML, DHTML, XHTML, CSS, JavaScript, JSP, AngularJS, jQuery, and AJAX.
• Working knowledge of Spring and Hibernate framework.
• Experience utilizing best practices for getting data into Splunk and the Common
Information Model.
• Led team to plan, design, and implement applications and software.
• Collaborated with business analysts, developers, and technical support teams to define project requirements and specifications. Designed, developed, and managed MapReduce-based applications, integrating with databases, establishing network connectivity, and developing programs.
• Experience in provisioning Amazon Web Services (AWS) like EC2, ELB, S3, EBS, VPC,
RDS, DynamoDB, IAM, SNS, SQS, SWF, Route 53, Auto Scaling, Lambda, CloudFront,
CloudWatch, CloudFormation, Security Groups, ACL, NACL.
• Good knowledge of SOAP/WSDL and RESTful interfaces in Java. Created and executed both load and functional tests for web services.
• Assisted project manager in defining project scope, time & effort estimates and
deliverable management.
• Developed a proof of concept for using Spark and Kafka to store and process data.
• Imported and exported data between HDFS and Hive using Sqoop.
• Knowledge of machine learning, data science, and artificial intelligence platforms and algorithms in the Big Data Python environment using TensorFlow.
• Familiar with supervised (classification, regression), unsupervised (dimensionality reduction, clustering), and reinforcement (real-time decision making) machine learning algorithms.
• Conducted research in cloud computing, data science, machine learning, artificial intelligence, robotics, automation, autonomous vehicles, and robotic programming.
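The Python key-value conversion for the Splunk forwarder mentioned above might be sketched as follows (the delimiter and field names here are hypothetical assumptions, not the original API):

```python
# Convert delimited log records to key=value pairs so a Splunk
# forwarder can index them with field extraction.
# FIELDS and the "|" delimiter are illustrative placeholders.
FIELDS = ["timestamp", "host", "status", "latency_ms"]

def to_kv(line, delimiter="|"):
    """Map one delimited record to a Splunk-friendly key=value line."""
    values = line.strip().split(delimiter)
    return " ".join(f"{k}={v}" for k, v in zip(FIELDS, values))

def convert_file(in_path, out_path):
    """Rewrite a whole file in key=value form for the forwarder."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(to_kv(line) + "\n")
```

Emitting one `key=value` pair per field lets Splunk apply its automatic field extraction without custom props/transforms.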
Technical Skills:
Big Data / Hadoop: Cloudera CDH 5.1.3, Hortonworks HDP 2.0, Hadoop, HDFS, MapReduce (MRv1, MRv2/YARN), HBase, Pig, Hive, Sqoop, Flume, ZooKeeper, Oozie, Lucene, Cassandra, CouchDB, MongoDB, Kafka, Scala, R, Python, shell scripting
Languages: Java, C, HTML, SQL, PL/SQL, Scala
OS: Windows 8, Windows 7, Windows XP/98, UNIX/Linux, macOS
Databases: Oracle (SQL/PL-SQL), MySQL, NoSQL, Teradata
Web Technologies: HTML, DHTML, XML, WSDL, SOAP, Joomla, Apache Tomcat
Professional Experience:
Anthem, Inc. is an American health insurance company previously known as WellPoint Inc. It is
the largest for-profit managed health care company in the Blue Cross Blue Shield Association. It
was formed when Anthem Insurance Company acquired WellPoint Health Networks Inc., with the
combined company adopting the name WellPoint Inc.; trading on the NYSE for the combined
company began under the WLP symbol. In 2014, the company changed its corporate name to
Anthem Inc., and its NYSE ticker changed from WLP to ANTM.
Job Responsibilities:
• Worked with Oracle, Teradata, and Cloudera (CDH) for storage, processing, and migration of data.
• Involved in developing processing for files stored in HDFS for analytical purposes.
• Performed optimizations of SQL, PL/SQL, and HiveQL scripts, Python scripts, shell scripts, and cron schedules.
• Performed unit, load, functional, automated, and performance testing for the HiveQL, Spark SQL, and PySpark scripts developed, covering performance, scalability, reliability, availability, and maintainability.
• Created RDDs and performed operations on DataFrames and Datasets for the use case.
• Proficient in Bash shell scripting.
• Worked on parallel processing of data using built-in functions within shell scripts in the UNIX environment.
• Explored applications of data science algorithms on the big data platform within the Spark environment.
Environment:
Apache Hadoop, Apache Hive, Cloudera (CDH 5), Ubuntu, HDFS, MapReduce, Amazon Web
Services (AWS), Python, Splunk, Spark, Teradata, Oracle.
Servers: Redhat Enterprise Linux
Job Responsibilities:
Environment: Apache Hadoop, Apache Hive, Apache Pig, Cloudera (CDH 5), MapR, Ubuntu,
HDFS, MapReduce, Amazon Web Services(AWS), Python, Splunk, Elastic-Search, Logstash,
Kibana (ELK), Tableau, Spark, Cassandra, Teradata, Oracle.
Servers: Redhat Enterprise Linux, Mainframe
Job Responsibilities:
• Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
• Worked on installing and configuring Hortonworks HDP 2.x and Cloudera (CDH 5.5.1) clusters in development and production environments.
• Performed volumetric analysis for 43 feeds (current approximate data size: 70 TB), on the basis of which the production cluster size was decided.
• Involved in loading data from UNIX file system to HDFS.
• Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
• Responsible for implementation and ongoing administration of Hadoop infrastructure
• Imported and exported data between relational databases such as MySQL and HDFS/HBase using Sqoop.
• Responsible for cluster maintenance: monitoring, commissioning and decommissioning DataNodes, troubleshooting, and managing and reviewing data backups and log files.
• In-depth knowledge of LDAP and Identity & Access management products.
• Designed LDAP schemas and DITs to implement an enterprise-wide centralized repository.
• Hands on Experience in configuring LDAP, SSL, SSO and Digital Signatures.
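The data-cleaning MapReduce jobs above were written in Java; the mapper-side cleaning logic can be sketched in Python, Hadoop-Streaming style (the expected column count and cleaning rules are illustrative assumptions):

```python
# Hadoop-Streaming-style sketch: a mapper that drops malformed rows
# and normalizes fields, as the cleaning jobs above would.
# EXPECTED_COLS and the rules below are illustrative assumptions.
EXPECTED_COLS = 3

def clean_row(line, sep=","):
    """Return a normalized row, or None if the row is malformed."""
    parts = [p.strip() for p in line.rstrip("\n").split(sep)]
    if len(parts) != EXPECTED_COLS or any(not p for p in parts):
        return None              # drop incomplete records
    parts[0] = parts[0].lower()  # normalize a key field
    return sep.join(parts)

def run_mapper(lines):
    """Emit only the cleaned rows (stdin/stdout in a real streaming job)."""
    return [out for out in map(clean_row, lines) if out is not None]
```

In a real streaming job the mapper reads `sys.stdin` and prints cleaned rows; the list-based form here keeps the sketch testable.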
Environment: Hadoop, MapReduce, HDFS, HBase, Hortonworks HDP, Sqoop, Data Processing Layer, HUE, Azure, UNIX, MySQL, RDBMS, Ambari, Solr Cloud, Cloudera, Lily HBase, cron.
Job Responsibilities:
• Participate in business and system requirements sessions
• Provided inputs on solution architecture based on solution alternatives, frameworks,
products
• Enhanced search performance by tuning Splunk search queries
• Performed optimizations of Python scripts, shell scripts, and cron schedules
• Involved in Resolving technical issues during development, deployment, and support
• Performed unit, load, functional, automated, and performance testing for the Python scripts developed
• Requirements elicitation and translation to technical specifications
• Actively involved in mounting file systems, installing software, and establishing connectivity from the WAS, JBoss, and IaaS servers to integration systems such as databases (Oracle, mainframe)
• Actively involved in monitoring server health using the Splunk monitoring and alerting tool and the Tivoli alerting tool
• Anchor proof of concept (POC) development to validate proposed solution and
reduce technical risk.
• Perform performance optimizations on Java/JVM frameworks and UNIX Shell Scripts
• Engaged multiple teams for sourcing the data files from the databases (Oracle,
Mainframe) to the servers involved in the platform
• Involved in configuring Load Balancer Configuration on the servers
• Involved in setting up Kafka and Zookeeper Producer-Consumer components for the
Big Data Environments
• Certified as a Splunk Certified Power User
• Used the Java Collections Framework to develop MapReduce applications and APIs for NoSQL databases
• Wrote Python and shell scripts for key-value pair creation and masking of PII data fields
• Developed Spark scripts by using Scala shell commands as per the requirement.
• Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
• Optimized existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs.
• Worked on migrating MapReduce programs to Spark transformations using Spark and Scala
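The key-value creation and PII masking described above might look roughly like this in Python (which fields count as PII, and the masking rule, are assumptions for illustration):

```python
# Sketch of masking PII fields in dict-shaped records before they are
# forwarded downstream. PII_FIELDS and the masking rule are assumptions.
PII_FIELDS = {"ssn", "email", "phone"}

def mask_value(value):
    """Keep the last two characters visible, mask the rest."""
    return "*" * max(len(value) - 2, 0) + value[-2:]

def mask_record(record):
    """Mask PII values in a record, leaving other fields intact."""
    return {k: mask_value(v) if k in PII_FIELDS else v
            for k, v in record.items()}
```

Masking before forwarding keeps raw PII out of downstream indexes while the trailing characters remain usable for spot checks.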
Environment: Apache Hadoop, Apache Hive, Apache Pig, Cloudera (CDH 5), MapR, Ubuntu,
HDFS, MapReduce, Amazon Web Services(AWS), Python, Splunk, Supervisor, Monit,
Hazelcast, HAProxy, Kafka, Zookeeper, Elastic-Search, Logstash, Kibana (ELK), Servers:
JBOSS, WAS, IAAS, E-PAAS, Redhat Enterprise Linux, Talend, Microsoft Azure
Job Responsibilities:
• Worked as a DevOps Engineer.
• Involved in release planning, deployments, development, change request creation, and environment readiness
• Performed activities on the development and production clusters
• Documented Design Documents for Big Data Analytics & Reporting
• Participated in daily stand-ups and scrum planning
• Worked with Azure Data Lake Store to analyze data stored on YARN and HDFS, including multiple access methods for Spark, Hive, and HBase
• Analyzed different kinds of structured and unstructured data, including processing of files stored in the Data Lake.
• Worked on App Engine and Amazon AWS back-ends as well as front-ends
• Migrated Hadoop metadata to Docker container
• Involved in Amazon EMR and S3 activities, including setting up connectivity using a VPC connection
• Performed map-reduce operations using Amazon EMR
• Experienced in Data Modelling in SQL and NoSQL Databases
• Hands on experience in NOSQL databases like HBase, Cassandra, MongoDB.
• Worked using the tools of JIRA and Jenkins within the project.
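Conceptually, the map-reduce operations run on Amazon EMR above follow the map/shuffle/reduce flow, sketched here in plain Python with a word-count task (the task itself is illustrative, not a job from this project):

```python
# Conceptual sketch of the map/shuffle/reduce flow behind an EMR job,
# in plain Python. Word count stands in for the real workload.
from collections import defaultdict

def map_phase(lines):
    """Map: emit (word, 1) pairs for every word in the input."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(lines):
    return reduce_phase(shuffle(map_phase(lines)))
```

On EMR the shuffle is handled by the framework between the map and reduce tasks; it is made explicit here only to show all three phases.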
Environment: Apache Hadoop, Apache Hive, Ubuntu, HDFS, MapReduce, Shell Scripting,
Python, HBase, Mongo DB, Couch DB, JIRA, Jenkins
Nike, Inc. is an American multinational corporation that is engaged in the design, development,
manufacturing and worldwide marketing and sales of footwear, apparel, equipment, accessories
and services. The company is headquartered near Beaverton, Oregon, in the Portland
metropolitan area. It is one of the world's largest suppliers of athletic shoes and apparel.
Job Responsibilities:
• Worked with the QA team to test components in the Big Data environment, leveraging existing scripts on the servers and automating script execution.
• Worked with Data Analytics team for meeting the testing requirements involved with the
Hive & Pig scripts for different Use-Cases in Hadoop.
• Documented Design Documents for Big Data Analytics & Reporting
• Performed unit testing for the Python scripts
• Performed automation testing for the Java scripts developed.
• Involved in the operations of Cloudera, Hortonworks, MapR environments
• Performed End-to-End testing for the scripts execution in the big-data clusters
• Wrote test results and verified actual results against expected results for SQL and HiveQL queries
• Analyzed large data sets by running Hive queries and Pig scripts.
• Worked with the data science team to gather requirements for various data mining projects.
• Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
• Involved in loading data from LINUX file system to HDFS and then to Amazon S3.
• Responsible for creating and managing HBase Data Store.
• Hands on experience in NOSQL databases like HBase, Cassandra, MongoDB.
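Verifying actual query output against expected results, as in the SQL and HiveQL testing above, can be sketched as an order-insensitive row comparison (a simplification; the real runs compared result files from the cluster):

```python
# Sketch: compare actual query rows with expected rows while ignoring
# row order, as when verifying HiveQL output against expected results.
from collections import Counter

def rows_match(actual, expected):
    """True when both result sets hold the same rows with the same counts."""
    return Counter(map(tuple, actual)) == Counter(map(tuple, expected))
```

Using a multiset (`Counter`) rather than a set keeps duplicate rows significant, which matters for queries without DISTINCT.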
Environment: Apache Hadoop, Apache Hive, Apache Pig, Cloudera (CDH 5), Ubuntu, Auto-
CAD, Sqoop, HDFS, MapReduce, NoSQL, HBase, CouchBase, Oozie, Amazon Web
Services(AWS), Spark, Storm, Flume, Python, Shell Scripting
Environment:
Cloudera (CDH 3, 4, 5), Apache Hadoop, Linux, HDFS, Hive, Pig, Sqoop, Flume, ZooKeeper, HBase, Oozie, Hortonworks, MongoDB, Java, MapReduce, Amazon EC2 infrastructure, Amazon Elastic MapReduce (EMR), MySQL, shell scripts.