
Data Science with R

Project Submission
Swedish Insurance Project

Submitted by : xxxx

Submitted on dd-mmm-yyyy
18-Aug-2017 1
Problem Statement
The data give the details of third-party motor insurance claims in
Sweden for the year 1977.
In Sweden, all motor insurance companies apply identical risk
arguments to classify customers, and thus their portfolios and their
claims statistics can be combined.
The data were compiled by a Swedish Committee on the Analysis of
Risk Premium in Motor Insurance.
The Committee was asked to look into the problem of analyzing the
real influence on the claims of the risk arguments and to compare this
structure with the actual tariff.
Problem Statement - continued
1. The committee wants to understand each field of the collected data through descriptive
analysis, to gain basic insights into the dataset and to prepare for further analysis.
2. The total value of payment by an insurance company is an important factor to monitor, so
the committee has decided to find out whether this payment is related to the number of claims
and the number of insured policy years. They also want to visualize the results for better
understanding.
3. The committee wants to figure out why insurance payments increase and decrease.
They have therefore decided to find out whether distance, location, bonus, make, and insured
amount or claims affect the payment, and whether all or only some of them do.
4. The insurance company is planning to establish a new branch office, so it wants to
find out at which location, kilometer, and bonus level its insured amount, claims, and payment
increase.
5. The committee wants to understand what affects claim rates so that it can decide the right
premiums for a given set of situations. Hence, it needs to find out whether the insured amount,
zone, kilometer, bonus, or make affects the claim rates, and to what extent.
Question-1
The committee wants to understand each field of the collected data through descriptive analysis, to gain basic insights into
the dataset and to prepare for further analysis.

Step-1: Univariate analysis. Find the number of variables in the given data and the data type of each. As a
second-level analysis, find the unique values of each variable.

Step-2: Summarize the given data. Kilometres, Zone, Bonus and Make are categorical variables, so they were
converted into factors. The mean of Insured is 1092.2, the mean of Claims is 51.87 and the mean of Payment is 257008.
Finding-1: Claims and Payment both have a minimum value of 0, i.e., there is no payment when there is no claim.
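
The two steps above can be sketched in R roughly as follows. This is only a minimal illustration: the data frame name `insurance` and the six rows in it are made-up placeholders (not the real 1977 data), and only the column names are taken from the slides.

```r
# Toy rows for illustration only; these are NOT the real Swedish data.
insurance <- data.frame(
  Kilometres = c(1, 1, 2, 3, 5, 5),
  Zone       = c(1, 2, 2, 4, 7, 7),
  Bonus      = c(1, 3, 7, 7, 2, 5),
  Make       = c(1, 9, 4, 4, 6, 9),
  Insured    = c(455.13, 69.17, 1292.39, 12.6, 0.5, 210.0),
  Claims     = c(108, 19, 52, 0, 0, 7),
  Payment    = c(392491, 46221, 214011, 0, 0, 31442)
)

# Step 1: number of variables, their types, and distinct values per column.
str(insurance)
n_unique <- sapply(insurance, function(col) length(unique(col)))
print(n_unique)

# Step 2: treat the coded columns as factors, then summarise every column.
cat_cols <- c("Kilometres", "Zone", "Bonus", "Make")
insurance[cat_cols] <- lapply(insurance[cat_cols], factor)
print(summary(insurance))

# Check Finding-1: rows with zero claims should also have zero payment.
zero_claim_rows <- insurance[insurance$Claims == 0, ]
print(all(zero_claim_rows$Payment == 0))
```

With the real file loaded into `insurance` (for example via `read.csv`), the same calls would apply unchanged, provided the column names match.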

Question-2
The total value of payment by an insurance company is an important factor to monitor, so the committee has decided
to find out whether this payment is related to the number of claims and the number of insured policy years. They also want to
visualize the results for better understanding.

Finding: Claims is very strongly correlated with Payment (correlation coefficient 0.9954003).
Finding: Insured is strongly correlated with Payment (correlation coefficient 0.933217).


[Figure: two plots with Payment on the y-axis, one against Claims and one against Insured]
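
A correlation check and a pair of plots like the ones described here could be produced along these lines. The numbers below are invented toy values, so the coefficients they yield will not match the ones reported on the slide; the red trend line is an addition for readability, not something the slides confirm.

```r
# Toy values for illustration only; these are NOT the real Swedish data.
insurance <- data.frame(
  Claims  = c(0, 7, 19, 52, 108, 300),
  Insured = c(0.5, 210, 69.17, 1292.39, 455.13, 4000),
  Payment = c(0, 31442, 46221, 214011, 392491, 1500000)
)

# Pearson correlation of Payment with each candidate driver.
cor_claims  <- cor(insurance$Claims, insurance$Payment)
cor_insured <- cor(insurance$Insured, insurance$Payment)
print(c(Claims = cor_claims, Insured = cor_insured))

# Side-by-side scatter plots with a fitted straight line for orientation.
op <- par(mfrow = c(1, 2))
plot(insurance$Claims, insurance$Payment, xlab = "Claims", ylab = "Payment")
abline(lm(Payment ~ Claims, data = insurance), col = "red")
plot(insurance$Insured, insurance$Payment, xlab = "Insured", ylab = "Payment")
abline(lm(Payment ~ Insured, data = insurance), col = "red")
par(op)
```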

Question-3
The committee wants to figure out why insurance payments increase and decrease. They have therefore decided to find out
whether distance, location, bonus, make, and insured amount or claims affect the payment, and whether all or only some of them do.

Fit a linear regression model to measure the impact of each variable on the payment.
Dependent variable: Payment
Independent variables: Insured, Claims, Make, Bonus, Zone, and Kilometres
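
A regression of this shape can be sketched with `lm()`. The data below are simulated purely so the example runs; the simulation settings (seed, sample size, the assumed link between Claims and Payment) are arbitrary assumptions and say nothing about the real data.

```r
# Simulated stand-in data; NOT the real Swedish data.
set.seed(1)
n <- 300
insurance <- data.frame(
  Kilometres = factor(sample(1:5, n, replace = TRUE)),
  Zone       = factor(sample(1:7, n, replace = TRUE)),
  Bonus      = factor(sample(1:7, n, replace = TRUE)),
  Make       = factor(sample(1:9, n, replace = TRUE)),
  Insured    = runif(n, 0, 5000)
)
insurance$Claims  <- rpois(n, lambda = 0.05 * insurance$Insured)
insurance$Payment <- 5000 * insurance$Claims + rnorm(n, sd = 20000)

# Payment as the response, all six fields as predictors.
payment_fit <- lm(Payment ~ Insured + Claims + Make + Bonus + Zone + Kilometres,
                  data = insurance)
print(summary(payment_fit))

# Keep only the terms whose p-value is below the conventional 0.05 cut-off.
coefs <- summary(payment_fit)$coefficients
significant <- coefs[coefs[, "Pr(>|t|)"] < 0.05, , drop = FALSE]
print(significant)
```

Reading the `significant` table is one way to judge whether all or only some of the fields affect the payment; the 0.05 threshold is a convention, not a value taken from the slides.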

Question-4
The insurance company is planning to establish a new branch office, so it wants to find out at which location, kilometer,
and bonus level its insured amount, claims, and payment increase.
Insured, Claims, and Payment were grouped by (a) Zone, (b) Kilometres, and (c) Bonus.
The findings are marked in the screenshot below.
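
The grouping described here might look like the following with `aggregate()`. The slides do not say which summary was used per group, so summing is an assumption, and the six rows are made-up placeholders.

```r
# Toy rows for illustration only; these are NOT the real Swedish data.
insurance <- data.frame(
  Zone       = factor(c(1, 1, 2, 2, 4, 4)),
  Kilometres = factor(c(1, 2, 1, 2, 1, 2)),
  Bonus      = factor(c(7, 1, 7, 1, 7, 1)),
  Insured    = c(100, 50, 80, 40, 200, 20),
  Claims     = c(10, 6, 8, 5, 15, 3),
  Payment    = c(50000, 30000, 40000, 20000, 90000, 10000)
)

# Total Insured, Claims and Payment per level of each grouping variable.
# Summing is an assumption; FUN = mean would give per-record averages instead.
by_zone  <- aggregate(cbind(Insured, Claims, Payment) ~ Zone,
                      data = insurance, FUN = sum)
by_km    <- aggregate(cbind(Insured, Claims, Payment) ~ Kilometres,
                      data = insurance, FUN = sum)
by_bonus <- aggregate(cbind(Insured, Claims, Payment) ~ Bonus,
                      data = insurance, FUN = sum)
print(by_zone)
print(by_km)
print(by_bonus)

# Level with the highest total payment in each grouping.
top_zone <- by_zone$Zone[which.max(by_zone$Payment)]
top_km   <- by_km$Kilometres[which.max(by_km$Payment)]
print(top_zone)
print(top_km)
```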

Question-5
The committee wants to understand what affects claim rates so that it can decide the right premiums for a given set of
situations. Hence, it needs to find out whether the insured amount, zone, kilometer, bonus, or make affects the claim rates, and
to what extent.
Fit a linear regression model to measure the impact of each variable on the claims.
Dependent variable: Claims
Independent variables: Insured, Make, Bonus, Zone, and Kilometres
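
A model for Claims could be sketched as below. Claims is left out of the predictors because it is the response here. The data are simulated only to make the example runnable (seed, size and the assumed dependence on Insured are arbitrary), and the `anova()` table is just one possible way to gauge "to what extent" each term matters, not necessarily the approach used in the project.

```r
# Simulated stand-in data; NOT the real Swedish data.
set.seed(2)
n <- 300
insurance <- data.frame(
  Kilometres = factor(sample(1:5, n, replace = TRUE)),
  Zone       = factor(sample(1:7, n, replace = TRUE)),
  Bonus      = factor(sample(1:7, n, replace = TRUE)),
  Make       = factor(sample(1:9, n, replace = TRUE)),
  Insured    = runif(n, 0, 5000)
)
insurance$Claims <- rpois(n, lambda = 0.05 * insurance$Insured)

# Claims as the response; the remaining five fields as predictors.
claims_fit <- lm(Claims ~ Insured + Zone + Kilometres + Bonus + Make,
                 data = insurance)
print(summary(claims_fit))

# Sequential sums of squares: how much variation each term accounts for
# when added in the order given in the formula.
claims_anova <- anova(claims_fit)
print(claims_anova)
```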

Thank You

