ASSESSMENT
April 15, 2020
Overview
Sprocket Central Pty Ltd needs help with its customer and transactions data. The
organisation has a large dataset relating to its customers, but its team is unsure how to
analyse it effectively to help optimise its marketing strategy.
Goals
1. Explore the given dataset and assess its quality, identifying the flaws and
issues in the data.
Specifications
The given dataset, “KPMG_VI_New_raw_data_update_final.xlsx”, contains 5
sheets (datasets): CustomerDemographic, NewCustomerList, CustomerAddress,
Transactions, and a Title sheet.
Milestones
1. CustomerDemographic dataset:
The first problem we encounter when analysing the dataset is that the columns are
not properly named: all of them are unnamed. On further analysis we realise that the
first row of the dataset holds the column names, so we promote that row to the header.
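The header fix described above can be sketched with pandas as follows. This is a minimal illustration under assumptions, not the code used for the assessment: the helper name `promote_first_row_to_header` and the toy column values are hypothetical.

```python
import pandas as pd


def promote_first_row_to_header(df: pd.DataFrame) -> pd.DataFrame:
    """Use the first data row as column names and remove it from the data."""
    fixed = df.iloc[1:].reset_index(drop=True)
    fixed.columns = df.iloc[0].tolist()
    return fixed


# Toy frame standing in for a sheet that was loaded without a proper header
# (hypothetical values; the real sheet has many more columns).
raw = pd.DataFrame(
    [
        ["customer_id", "first_name", "DOB"],
        [1, "Ann", "1980-05-01"],
        [2, "Bob", "1975-11-23"],
    ]
)
demo = promote_first_row_to_header(raw)
print(demo.columns.tolist())
```

Alternatively, the `header` argument of `pandas.read_excel` can point at the row holding the names when the sheet is loaded; which row index is correct depends on the layout of the file.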
Thereafter we observe that many fields are empty.
We fixed this as well.
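Before deciding how to handle the empty fields, it may help to count them per column. The sketch below only covers that counting step, since the report does not state how the gaps were fixed. The helper name `missing_summary` and the column names `last_name` and `job_title` are hypothetical.

```python
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.Series:
    """Return the number of missing values per column, largest first,
    keeping only the columns that have at least one gap."""
    counts = df.isna().sum()
    return counts[counts > 0].sort_values(ascending=False)


# Toy data with hypothetical column names.
demo = pd.DataFrame(
    {
        "customer_id": [1, 2, 3],
        "last_name": ["Lee", None, None],
        "job_title": ["Nurse", None, "Clerk"],
    }
)
summary = missing_summary(demo)
print(summary)
```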
Next, we have a date of birth column, but date of birth is not directly useful for our
analysis, so we derive a column with the age of each customer.
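One possible way to derive the age column is sketched below. The column name `DOB`, the helper name `add_age`, and the reference date (taken here to be the report date) are assumptions made for illustration.

```python
import pandas as pd


def add_age(df: pd.DataFrame, dob_col: str = "DOB",
            as_of: str = "2020-04-15") -> pd.DataFrame:
    """Add an integer `age` column: completed years between the date of
    birth and the reference date. Unparseable dates become missing ages."""
    out = df.copy()
    dob = pd.to_datetime(out[dob_col], errors="coerce")
    ref = pd.Timestamp(as_of)
    # True where the birthday has not yet occurred in the reference year.
    before_birthday = (dob.dt.month > ref.month) | (
        (dob.dt.month == ref.month) & (dob.dt.day > ref.day)
    )
    age = ref.year - dob.dt.year - before_birthday.astype(int)
    out["age"] = age.astype("Int64")  # nullable integer keeps missing values
    return out


# Toy data (hypothetical values).
demo = add_age(pd.DataFrame({"DOB": ["1980-05-01", "1975-01-10", None]}))
print(demo)
```

Using a fixed reference date rather than today's date keeps the derived ages reproducible when the analysis is rerun later.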
Next, we see that there is a column named default which contains random characters that
make no sense, so to make further analysis easier we drop this column.
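Dropping the column can be sketched as follows. The helper name `drop_columns` and the toy values are hypothetical. Passing `errors="ignore"` is a design choice so that the step does not fail if the column has already been removed.

```python
import pandas as pd


def drop_columns(df: pd.DataFrame, cols=("default",)) -> pd.DataFrame:
    """Return a copy of the frame without the listed columns."""
    return df.drop(columns=list(cols), errors="ignore")


# Toy data: the `default` values stand in for the unreadable characters.
demo = pd.DataFrame(
    {
        "customer_id": [1, 2],
        "default": ["\"'", "<script>"],
    }
)
cleaned = drop_columns(demo)
print(cleaned.columns.tolist())
```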
2. NewCustomerList dataset:
As in the demographic dataset, the columns here are not properly named.
We fixed this.
Secondly, many fields in the number of purchases column are missing, and that
particular field plays a really important role in our analysis, so we drop all
rows in which it is missing.
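Removing the rows that lack a purchase count can be sketched with `DataFrame.dropna` and its `subset` argument. The column name `number_of_purchases` and the helper name `drop_rows_missing` are hypothetical stand-ins; the actual sheet may use a different name.

```python
import pandas as pd


def drop_rows_missing(df: pd.DataFrame, required) -> pd.DataFrame:
    """Drop rows where any of the required columns is missing."""
    return df.dropna(subset=list(required)).reset_index(drop=True)


# Toy data with a hypothetical column name.
demo = pd.DataFrame(
    {
        "first_name": ["Ann", "Bob", "Cy"],
        "number_of_purchases": [12, None, 7],
    }
)
kept = drop_rows_missing(demo, ["number_of_purchases"])
print(len(demo) - len(kept), "row(s) dropped")
```

Restricting `dropna` to the one important column avoids discarding rows that are only missing less relevant fields.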
GitHub: https://github.com/sparky1911/k_p_m_g