Question 1
actual values are known. A confusion matrix has four possible outcomes: true positives, true
negatives, false positives and false negatives. True positives are cases that are predicted to have
the outcome and actually have it. True negatives are cases that are predicted not to have the
outcome and actually do not have it. False positives are cases that are predicted to have the
outcome but do not have it, and false negatives are cases that are predicted not to have the
outcome but actually have it.
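As a minimal sketch of the four outcomes described above, the following counts them for a binary prediction. The function name and the sample labels are hypothetical and only serve as an illustration.

```python
def confusion_counts(actual, predicted):
    """Count the four confusion-matrix outcomes for binary labels.

    True means the case has (or is predicted to have) the outcome.
    """
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1      # predicted to have it, and has it
        elif not p and not a:
            counts["TN"] += 1      # predicted not to have it, and does not
        elif p and not a:
            counts["FP"] += 1      # predicted to have it, but does not
        else:
            counts["FN"] += 1      # predicted not to have it, but does


    return counts


# Hypothetical labels, for illustration only.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, round(accuracy, 2))
```

Accuracy is the share of cases on the diagonal of the matrix, that is, true positives plus true negatives divided by all cases.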
Question 2
In business, a confusion matrix has various applications. For instance, the matrix can be used
to assess how accurately a model predicts whether a customer will buy a particular brand. Using
the true and false positives and negatives, the shop can determine which brand has the highest
preference. A second application is predicting whether a customer shops online or in a physical
store, which is key to deciding where the business should increase its advertising.
Question 3
Two examples where the oversampling method needs to be applied before building the model are
the customer's brand selection and the customer's purchasing time. It should be noted that a
business will have many brands and that customers have different preferences for those brands.
The oversampling method is used to determine the most popular brand that can be used in the
confusion matrix. On the other hand, customers have different purchasing times; by the use of the
oversampling method the
Question 4
For the first explanatory variable, named X1, the levels will be assigned numerical values
starting from one. The low level will be assigned 1, the average level will be assigned 2 and the
high level will be assigned 3. With this coding it will be possible to carry out a quantitative
analysis of the variable. On the other hand, the two-level categorical variable named X2 will be
assigned 0 and 1; Sydney will be 0 and Melbourne will be 1. For that variable the term in the
model will be 0 when the city is Sydney and equal to the value of the coefficient when the city is
Melbourne.
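The coding described above can be sketched as two lookup tables. The dictionary and function names are hypothetical, and the coefficient passed in the usage lines is an arbitrary illustrative number, not a fitted value.

```python
# Ordinal coding for X1 and 0/1 dummy coding for X2, as described in the text.
LEVEL_CODES = {"low": 1, "average": 2, "high": 3}
CITY_CODES = {"Sydney": 0, "Melbourne": 1}


def encode(level, city):
    """Return the numeric values (X1, X2) for one observation."""
    return LEVEL_CODES[level], CITY_CODES[city]


def city_contribution(city, coefficient):
    """Contribution of the X2 term: 0 for Sydney, the coefficient for Melbourne."""
    return coefficient * CITY_CODES[city]


print(encode("high", "Melbourne"))
print(city_contribution("Sydney", 2.5), city_contribution("Melbourne", 2.5))
```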
Section B
Question 5a
The following are the steps to follow to develop a model that will determine the customer that will
Step 1: Define the predictive model to be built. The current case wants to build a model that
will determine the customers who will spend more than $800 in future.
Step 2: Determine the data that is needed and whether it is available. In the current case the data
needed includes family size, purchase amount, discount type and membership.
Step 3: A hypothesis is then developed and a model is built. The model is built by regression
analysis; the explanatory variables are family size, discount type, product and member while the
Step 4: The model is then incorporated into the business process to help achieve the outcome.
Question 5b
The model was developed using regression analysis; the value of adjusted R-square was 0.01.
Y = 1212.48 + 0.93*Age - 10.45*Gender - 5.75*Family size + 2.71*Membership - 35.70*Discount1
For the new male customer aged 30, living in a family of size 2, who is not a member
Y = 1212.48 + 0.93*30 - 10.45*1 - 5.75*2 + 2.71*0 - 35.70*2
Y = 1147.03
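The substitution above can be checked with a short script. The coefficients and the customer's values are those shown in the equations; the variable names, and the reading of Gender = 1 as male and Discount1 = 2, are taken from the substituted numbers and are otherwise assumptions.

```python
# Intercept and coefficients as given in the fitted equation.
INTERCEPT = 1212.48
COEFFICIENTS = {
    "age": 0.93,
    "gender": -10.45,
    "family_size": -5.75,
    "membership": 2.71,
    "discount1": -35.70,
}


def predict_spend(customer):
    """Evaluate the linear model for one customer (dict of variable values)."""
    return INTERCEPT + sum(COEFFICIENTS[name] * customer[name] for name in COEFFICIENTS)


# Values substituted in the worked example (gender = 1 and discount1 = 2 as shown there).
new_customer = {"age": 30, "gender": 1, "family_size": 2, "membership": 0, "discount1": 2}

prediction = round(predict_spend(new_customer), 2)
print(prediction)
```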
Question 6a
For a company to make the highest profit it needs to provide the highest number of services at
the least cost. For a company providing maintenance services for washing machines, the time
taken to provide the service is crucial. The data indicate that the mean time taken to provide the
service is 15.50 minutes, which can be estimated as about 4 machines per hour and twenty-four
machines per day. The average time since the previous repair was found to be 5.6 months. It was
found that 83 of the repairs were done in the morning while 87 were done in the afternoon. Those
with electrical problems were 80 and those with mechanical problems were 91. Regarding the
person who made the repairs, Bob made 91 repairs, James made 11 repairs and John made 51 repairs.
It was observed that most of the repairs were done in the afternoon, so the manager needs to
allocate more workers to the afternoon shift. Bob had the highest number of repairs while
James had the least. Mechanical problems were the most common (91 against 80 electrical), so the
manager needs to have more experts to handle them. The manager needs to put measures in place
to motivate him, while James should
Question 6b
The model for a future service booking that is to be done by John and involves an electrical
problem is given by
The model indicates that the time of repair will increase by 1.08 if it is an electrical problem and
reduce by 4.23 if it is done by John. The repair should be done in the morning shift because the
Question 6c
Other factors that should be considered in the model are the years of experience of the person
offering the service and their level of training. The model of the machine should also be factored
into the equation.
Question 7a
The best method to deal with the missing values is to fill the missing spaces with the mode.
Step 1: The mode of the data will be obtained with the use of the Excel command
Step 2: Once the mode is obtained, the data will be filtered for the blank spaces.
Step 3: The blank spaces will then be filled with the mode of the blood type.
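The three steps above are carried out in Excel in the answer; as a rough equivalent, the same mode imputation can be sketched in Python. The function name and the sample blood types are hypothetical, and blank cells are assumed to be represented by empty strings.

```python
from collections import Counter


def fill_missing_with_mode(values, missing=""):
    """Replace missing entries with the most frequent observed value."""
    observed = [v for v in values if v != missing]      # ignore the blanks
    mode = Counter(observed).most_common(1)[0][0]       # Step 1: find the mode
    # Steps 2 and 3: locate the blanks and fill them with the mode.
    return [mode if v == missing else v for v in values]


# Hypothetical column of blood types with two blanks.
blood_types = ["A+", "O+", "", "O+", "B-", ""]
filled = fill_missing_with_mode(blood_types)
print(filled)
```

If two values tie for the highest count, `most_common` returns the one encountered first, so the choice in that case depends on row order.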
Question 7b
Step one: The data will be filtered for each blood type.
Step two: The filtered data will then be copied and pasted into a new worksheet.
Step three: The average protein for each blood type will then be obtained.
The table below shows a summary of the mean protein for each blood type.
Blood type  Mean protein
A- 9.48
A+ 9.33
AB- 9.33
AB+ 9.54
B- 9.29
B+ 9.49
O- 292.53
O+ 8.85
Question 7c
Step one: In the new worksheet, the minimum value will be obtained for each blood type.
Step two: The maximum value will be obtained for each blood type.
Step three: The minimum and maximum values for each blood type will give the range of the data.
Blood type  Minimum  Maximum
A- 7.33 11.35
A+ 7.33 11.425
B- 7.23 11.28
B+ 7.55 11.58
O- 7.475 11.5
O+ 6.6 11.5
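The filter-and-summarise steps used in Questions 7b and 7c can be sketched in one pass: group the protein values by blood type, then take the mean, minimum and maximum of each group. The function name and the sample records are hypothetical and are not the assignment data.

```python
from collections import defaultdict


def summarise_by_group(records):
    """Return mean, minimum and maximum protein for each blood type.

    records is an iterable of (blood_type, protein) pairs.
    """
    groups = defaultdict(list)
    for blood_type, protein in records:
        groups[blood_type].append(protein)       # "filter" each blood type
    return {
        blood_type: {
            "mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }
        for blood_type, values in groups.items()
    }


# Hypothetical records, for illustration only.
records = [("A+", 9.0), ("A+", 10.0), ("O-", 8.0), ("O-", 9.0), ("O-", 10.0)]
summary = summarise_by_group(records)
for blood_type, stats in summary.items():
    print(blood_type, stats)
```

A group mean that lies far outside the group's own minimum and maximum would suggest that the two summaries were computed on different data, so comparing them is a quick consistency check.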
Question 7d
The correlation between age and total protein is -0.16, indicating a weak negative relationship:
total protein tends to decline slightly as age increases.
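For reference, a minimal sketch of how a Pearson correlation coefficient such as the one above is computed. The function name and the sample ages and protein values are hypothetical; they are chosen only to give a negative coefficient and do not reproduce the -0.16 reported for the assignment data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical values, for illustration only.
ages = [25, 35, 45, 55, 65]
protein = [9.6, 9.1, 9.4, 9.0, 9.2]
r = pearson_r(ages, protein)
print(round(r, 2))
```

The sign of the coefficient gives the direction of the relationship and its size, between 0 and 1 in absolute value, gives the strength.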
Question 7e
[Chart: Total protein level (g/dL) and Age, plotted for observations 1 to 676]
Question 8a
The model suggested to relate risk of diabetes to age, weight and gender is a multiple
regression with risk of diabetes as the dependent variable and age, weight and gender as the
explanatory variables. The model will be a straight-line equation that will show the
Question 8b
0.0326*life style
The constant value of 6.6 indicates that, when all explanatory variables are zero, the expected
remaining life of a person is 6.6 years. Females have an expected remaining life that is lower by
0.541. Each one-unit increase in age reduces expected remaining life by 0.0151, each one-unit
increase in weight increases it by 0.0069, and those in the country have a higher expected
remaining life than those in a big city.
Question 8c
The expected remaining life for a 59-year-old male in a small town, weighing 72 kg and with a risk of 25%
Question 9a
Question 9b
Question 10 a
Min cost = 4X1 + 4X2 + 4X3 + 4X4 + 4X5 + 4X6
         + 8X7 + 8X8 + 8X9 + 8X10 + 8X11 + 8X12 + 8X13 + 8X14 + 8X15 + 8X16 + 8X17 + 8X18
         + 10X19 + 10X20 + 10X21 + 10X22 + 10X23 + 10X24
Subject to X1 >= 5
X2 >= 4
X3 >= 4
X4 >= 4
X5 >= 6
X6 >= 6
X7 >= 6
X8 >= 8
X9 >= 8
X10 >= 8
X11 >= 8
X12 >= 5
X13 >= 5
X14 >= 5
X15 >= 5
X16 >= 8
X17 >= 8
X18 >= 8
X19 >= 9
X20 >= 10
X21 >= 10
X22 >= 10
X23 >= 10
X24 >= 8
The objective function indicates that the first shift will start with 4 staff, the second and third
shifts will have 8 staff and the last shift will have 10 staff. The linear program seeks to minimise
the cost, subject to the number of workers required in every hour. Cost is obtained by
multiplying the pay per hour by the number of workers. The number of workers should not be less
Solution
Value
Number of staff X1 4 5
Number of staff X2 4 4
Number of staff X3 4 4
Number of staff X4 4 4
Number of staff X5 4 4
Number of staff X6 4 6
Number of staff X7 8 6
Number of staff X8 8 8
Number of staff X9 8 8
Number of staff X10 8 8
Number of staff X11 8 8
Number of staff X12 8 5
Number of staff X13 8 5
Number of staff X14 8 5
Number of staff X15 8 5
Number of staff X16 8 8
Number of staff X17 8 8
Number of staff X18 8 8
Number of staff X19 12 9
Number of staff X20 12 10
Number of staff X21 12 10
Number of staff X22 12 10
Number of staff X23 12 10
Number of staff X24 12 8
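A minimal sketch of why the solution values sit at the staffing limits, assuming the constraints are read as minimum staff per hour. When every cost coefficient is non-negative and each variable has only its own lower bound, the cost is minimised by setting each variable to that bound. The cost coefficients and bounds below are those listed in the formulation; the bound for X14 is not listed there and is assumed to be 5, matching the solution table. The function name is hypothetical.

```python
# Cost per staff member for X1-X6, X7-X18 and X19-X24, from the objective function.
COSTS = [4] * 6 + [8] * 12 + [10] * 6

# Minimum staff per hour, X1 to X24. The value for X14 is an assumption
# (taken from the solution table) because its constraint is not listed.
MIN_STAFF = [5, 4, 4, 4, 6, 6,
             6, 8, 8, 8, 8, 5, 5, 5, 5, 8, 8, 8,
             9, 10, 10, 10, 10, 8]


def solve_lower_bound_lp(costs, lower_bounds):
    """Minimise sum(c_i * x_i) subject to x_i >= b_i with all c_i >= 0.

    With non-negative costs and independent lower bounds, any extra staff
    can only add cost, so the optimum is x_i = b_i for every i.
    """
    if len(costs) != len(lower_bounds):
        raise ValueError("costs and lower_bounds must have the same length")
    if any(c < 0 for c in costs):
        raise ValueError("this shortcut only holds for non-negative costs")
    staff = list(lower_bounds)
    total_cost = sum(c * x for c, x in zip(costs, staff))
    return staff, total_cost


staff, total_cost = solve_lower_bound_lp(COSTS, MIN_STAFF)
print(staff)
print(total_cost)
```

A general-purpose solver would be needed once the constraints link several hours together, for example when a staff member's shift covers consecutive hours.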
Question 10 c
If it is possible for a staff member to stay for an extra two hours, the problem will be as follows:
Min cost = 4X1 + 4X2 + 4X3 + 4X4 + 4X5 + 4X6 + 4X7 + 4X8
         + 8X9 + 8X10 + 8X11 + 8X12 + 8X13 + 8X14 + 8X15 + 8X16
         + 8X17 + 8X18 + 8X19 + 8X20 + 8X21 + 8X22 + 8X23 + 8X24
Subject to X1 >= 5
X2 >= 4
X3 >= 4
X4 >= 4
X5 >= 6
X6 >= 6
X7 >= 6
X8 >= 8
X9 >= 8
X10 >= 8
X11 >= 8
X12 >= 5
X13 >= 5
X14 >= 5
X15 >= 5
X16 >= 8
X17 >= 8
X18 >= 8
X19 >= 9
X20 >= 10
X21 >= 10
X22 >= 10
X23 >= 10
X24 >= 8
The objective function indicates that the staff will have three shifts. The first shift will start with
four staff members while the second and third will have eight. The objective equation is subject
Solution
Value
Number of staff X1 4 5
Number of staff X2 4 4
Number of staff X3 4 4
Number of staff X4 4 4
Number of staff X5 4 4
Number of staff X6 4 6
Number of staff X7 4 6
Number of staff X8 4 8
Number of staff X9 8 8
The solution indicates the optimal number of staff the company can have in every hour.