Section A

Question 1

A confusion matrix is used to measure the performance of a classification model on data whose actual values are known. The matrix has four possible outcomes: true positives, true negatives, false positives and false negatives. True positives are cases that are predicted to have the outcome and actually have it. True negatives are cases that are predicted not to have the outcome and actually do not have it. False positives are cases that are predicted to have the outcome but do not have it, and false negatives are cases that are predicted not to have the outcome but actually have it.
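The four outcomes can be counted directly from paired actual and predicted labels. The sketch below is only an illustration: the label lists are made up, and the function name `confusion_counts` is hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count the four confusion-matrix outcomes for 0/1 labels."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1  # predicted positive, actually positive
        elif p == 0 and a == 0:
            tn += 1  # predicted negative, actually negative
        elif p == 1 and a == 0:
            fp += 1  # predicted positive, actually negative
        else:
            fn += 1  # predicted negative, actually positive
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}

# Made-up labels: 1 = has the outcome, 0 = does not.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, accuracy)
```

Accuracy here is the share of cases on the diagonal of the matrix (true positives plus true negatives) out of all cases.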

Question 2

In business, a confusion matrix has various applications. For instance, it can be used to assess how accurately a model predicts whether a customer will buy a particular brand. Using the counts of true and false positives and negatives, the shop can judge how reliable the predictions are and identify the brands with the highest preference. A second application is predicting whether a customer shops online or in a physical store, which is key when the business decides where to increase advertising.

Question 3

Two examples in which the oversampling method should be applied before building the model are customer brand selection and customer purchasing time. A business carries many brands and customers have different preferences among them, so some brands appear in only a small share of the records. Oversampling the rarely chosen brands balances the classes before the model is built and assessed with the confusion matrix. Similarly, customers purchase at different times, and oversampling the rarely observed purchasing times allows the model to learn them as well.
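Random oversampling, one common variant of the method, duplicates records from the less frequent classes until every class is as large as the biggest one. The sketch below assumes made-up brand records, and the helper name `oversample` is hypothetical.

```python
import random
from collections import Counter

def oversample(rows, label_of, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw the missing rows with replacement from the same class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

# Made-up data: (customer id, brand bought); BrandB is the rare class.
rows = [(i, "BrandA") for i in range(8)] + [(i, "BrandB") for i in range(8, 10)]
balanced = oversample(rows, label_of=lambda row: row[1])
class_sizes = Counter(brand for _, brand in balanced)
print(class_sizes)
```

The balanced records would then be used to build the model; the held-out data used for the confusion matrix is normally left unsampled.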


Question 4

For the first explanatory variable, X1, the levels are assigned numerical values starting from one: the low level is assigned 1, the average level 2 and the high level 3. With this coding it is possible to analyse the variable quantitatively. The categorical variable X2 is coded as 0 and 1: Sydney is 0 and Melbourne is 1. For that variable, the X2 term contributes 0 to the prediction when the city is Sydney and the value of its coefficient when the city is Melbourne.
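The coding scheme described above can be written as two lookup tables. This is a minimal sketch: the mappings follow the text, while the sample records and the function name `encode` are made up for illustration.

```python
# Coding scheme from the text: ordinal codes for X1, a 0/1 dummy for X2.
X1_LEVELS = {"low": 1, "average": 2, "high": 3}
X2_CITIES = {"Sydney": 0, "Melbourne": 1}

def encode(level, city):
    """Return the numeric pair (X1, X2) for one record."""
    return X1_LEVELS[level], X2_CITIES[city]

# Made-up records for illustration.
records = [("low", "Sydney"), ("high", "Melbourne"), ("average", "Sydney")]
encoded = [encode(level, city) for level, city in records]
print(encoded)
```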

Section B

Question 5a

The following steps are used to develop a model that identifies the customers who will spend over $800 in future.

Step 1: Define the predictive model to be built. In this case, the model should identify the customers who will spend more than $800 in future.

Step 2: Determine the data that is needed and whether it is available. The data needed here includes family size, purchase amount, discount type and membership.

Step 3: Develop a hypothesis and build the model. The model is built using regression analysis; the explanatory variables are family size, discount type, product and membership, while the dependent variable is the purchase amount.

Step 4: Incorporate the model into the business process to help achieve the outcome.

Question 5b

The model was developed using regression analysis. The adjusted R-square was 0.01, so the model explains very little of the variation in purchase amount. Equation 1 shows the fitted regression model.

Y=1212.48+0.93*Age-10.45*Gender-5.75*Family size+2.71*membership-35.70*discount1

For a new male customer aged 30 who lives in a family of 2, is not a member and holds discount card type 2, the predicted amount is given by

Y=1212.48+0.93*30-10.45*1-5.75*2+2.71*0-35.70*2

Y=1147.03

The new customer is expected to spend about $1,147.03.
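The substitution can be checked with a few lines of code. The coefficients below are copied from Equation 1; the function name `predict_spend` is hypothetical, and the inputs follow the coding used in the worked substitution (male = 1, non-member = 0, discount card type = 2).

```python
def predict_spend(age, gender, family_size, membership, discount1):
    """Evaluate Equation 1 for one customer."""
    return (1212.48
            + 0.93 * age
            - 10.45 * gender
            - 5.75 * family_size
            + 2.71 * membership
            - 35.70 * discount1)

predicted = predict_spend(age=30, gender=1, family_size=2, membership=0, discount1=2)
print(round(predicted, 2))  # 1147.03
```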

Question 6a

For a company to make the highest profit, it needs to provide the highest number of services at the least cost. For a company providing maintenance services for washing machines, the time taken to provide the service is crucial. The data indicate that the mean time taken to provide the service is 15.50 minutes, which corresponds to roughly 4 machines per hour and twenty-four machines per day. The average time since the previous repair was found to be 5.6 months. Of the repairs, 83 were done in the morning and 87 in the afternoon. There were 80 repairs for electrical problems and 91 for mechanical problems. Regarding the person who made the repairs, Bob made 91 repairs, James made 11 and John made 51.

Most of the repairs were done in the afternoon, so the manager needs to allocate more workers to the afternoon shift. Bob made the highest number of repairs while James made the fewest. The manager needs to put measures in place to keep Bob motivated, while James should be encouraged to increase his output. Mechanical problems were the most common, so the manager needs more experts to handle them.

Question 6b

The model for a future service booking that is to be done by John and involves an electrical problem is given by

Time of repair=16.48+0.43*shift of service+1.08*repair type-4.23*repairperson

The model indicates that the time of repair increases by 1.08 if the problem is electrical and decreases by 4.23 if the repair is done by John. The repair should be booked in the morning shift, because the repair time is then 0.43 lower than in the afternoon.
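To compare the two shifts for this booking, the equation can be evaluated directly. The 0/1 coding below is an assumption, since the text does not state it: shift is 0 for morning and 1 for afternoon, repair type is 1 for electrical, and repairperson is 1 for John. The function name `repair_time` is hypothetical.

```python
def repair_time(shift, repair_type, repairperson):
    """Evaluate the repair-time equation under the assumed 0/1 coding."""
    return 16.48 + 0.43 * shift + 1.08 * repair_type - 4.23 * repairperson

# Assumed coding: morning = 0, afternoon = 1; electrical = 1; John = 1.
morning = repair_time(shift=0, repair_type=1, repairperson=1)
afternoon = repair_time(shift=1, repair_type=1, repairperson=1)
print(round(morning, 2), round(afternoon, 2))
```

Under this coding, the morning booking comes out 0.43 lower than the afternoon one, which matches the recommendation in the text.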

Question 6c

Other factors that should be considered in the model are the years of experience and the level of training of the person offering the service. The model of the machine should also be included in the equation.

Question 7a

The best method to deal with the missing values is to fill the missing cells with the mode. The following steps will be followed:

Step 1: The mode of the data will be obtained using an Excel command.

Step 2: Once the mode is obtained, the data will be filtered for blank cells.

Step 3: The blank cells will then be filled with the mode of the blood type.
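The same three steps can be sketched outside Excel. The blood-type column below is made up, with `None` standing in for a blank cell.

```python
from collections import Counter

# Made-up blood-type column; None marks a missing value.
blood_types = ["A+", "O+", None, "O+", "B-", None, "O+", "A+"]

# Step 1: find the most frequent observed value (the mode).
observed = [b for b in blood_types if b is not None]
mode_value = Counter(observed).most_common(1)[0][0]

# Steps 2 and 3: locate the blanks and fill them with the mode.
filled = [b if b is not None else mode_value for b in blood_types]
print(mode_value, filled)
```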

Question 7b

Step one: Each blood type will be filtered in turn.

Step two: The filtered data will then be copied and pasted into a new worksheet.

Step three: The average protein level for each blood type will then be obtained.

The table below shows the mean protein for each blood type.

Blood type    Mean (Protein)
A-            9.48
A+            9.33
AB-           9.33
AB+           9.54
B-            9.29
B+            9.49
O-            292.53
O+            8.85
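The filter-and-average procedure amounts to grouping the records by blood type and taking the mean of each group. A minimal sketch with made-up records (not the assignment data):

```python
from collections import defaultdict

# Made-up (blood type, total protein in g/dL) records.
records = [("A+", 9.0), ("O+", 8.5), ("A+", 10.0), ("O+", 9.5), ("B-", 9.2)]

# Collect the protein values for each blood type.
groups = defaultdict(list)
for blood_type, protein in records:
    groups[blood_type].append(protein)

# Average each group.
means = {bt: sum(values) / len(values) for bt, values in groups.items()}
print(means)
```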

Question 7c

Step one: In the new worksheet, the minimum value will be obtained for each blood type.

Step two: The maximum value will also be obtained.

Step three: The minimum and maximum values for each blood type give the range of the data.

Blood type    Minimum    Maximum
A-            7.33       11.35
A+            7.33       11.425
AB-           7.33       11.43
AB+           7.43       11.48
B-            7.23       11.28
B+            7.55       11.58
O-            7.475      11.5
O+            6.6        11.5
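The minimum and maximum per blood type can also be tracked in a single pass over the data. The records below are made up for illustration.

```python
# Made-up (blood type, total protein in g/dL) records.
records = [("A+", 9.0), ("O+", 8.5), ("A+", 10.0), ("O+", 9.5), ("B-", 9.2)]

# Keep a running (minimum, maximum) pair for each blood type.
ranges = {}
for blood_type, protein in records:
    low, high = ranges.get(blood_type, (protein, protein))
    ranges[blood_type] = (min(low, protein), max(high, protein))

for blood_type, (low, high) in sorted(ranges.items()):
    print(blood_type, low, high)
```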

Question 7d

The correlation between age and total protein is -0.16, a weak negative relationship indicating that total protein tends to decline slightly as age increases.
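For reference, a Pearson correlation coefficient of this kind can be computed as below. The ages and protein values are made up and will not reproduce the -0.16 reported for the assignment data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up sample: protein drifts down slightly with age.
ages = [25, 35, 45, 55, 65]
protein = [9.8, 9.1, 9.6, 8.9, 9.2]
r = pearson(ages, protein)
print(round(r, 2))
```

A value near 0 indicates a weak linear relationship, and the sign gives its direction.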

Question 7e

[Charts: Age and Total Protein level (g/dL) plotted for observations 1 to 676; vertical axis from 0 to 60.]

Question 8a

The suggested model relating the risk of diabetes to age, weight and gender is a multiple regression with the risk of diabetes as the dependent variable and age, weight and gender as the explanatory variables. The model is a linear equation that shows the impact of each variable on the risk of diabetes.

Question 8b

Regression equation including the risk of diabetes:

Expected remaining life=6.6-0.0473*risk-0.5481*gender-0.0151*age+0.0069*weight-0.0326*life style

The constant of 6.6 indicates that, when all explanatory variables are zero, the expected remaining life of a person is 6.6 years. Females have a lower expected remaining life, by 0.5481. A one-unit increase in age reduces expected remaining life by 0.0151, a one-unit increase in weight increases it by 0.0069, and those living in the country have a higher expected remaining life than those in a big city.
Question 8c

The expected remaining life for a 59-year-old male living in a small town, weighing 72 kg and with a 25% risk of diabetes is shown below.

Expected remaining life=6.6-0.0473*0.25-0.5481*0-0.0151*59+0.0069*72-0.0326*2

Expected remaining life=6.128875
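The arithmetic can be verified with a short script. The coefficients come from the regression equation in Question 8b; the coding (male = 0, small town = 2, risk entered as 0.25) is taken from the substitution above, and the function name `expected_remaining_life` is hypothetical.

```python
def expected_remaining_life(risk, gender, age, weight, life_style):
    """Evaluate the Question 8b regression equation for one person."""
    return (6.6
            - 0.0473 * risk
            - 0.5481 * gender
            - 0.0151 * age
            + 0.0069 * weight
            - 0.0326 * life_style)

# Coding as used in the worked substitution: male = 0, small town = 2.
life = expected_remaining_life(risk=0.25, gender=0, age=59, weight=72, life_style=2)
print(round(life, 6))
```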

Question 9a

See the Excel file.

Question 9b

To have a return of 1,000,000, the investor needs a cumulative investment of 1,666,000. The investor needs to invest 48.9% of their salary.

Question 10 a

Four staff should start working to minimize cost.

Min cost=4X1+4X2+4X3+4X4+4X5+4X6+8X7+8X8+8X9+8X10+8X11+8X12+8X13+8X14+8X15+8X16+8X17+8X18+10X19+10X20+10X21+10X22+10X23+10X24

Subject to

X1>=5
X2>=4
X3>=4
X4>=4
X5>=6
X6>=6
X7>=6
X8>=8
X9>=8
X10>=8
X11>=8
X12>=5
X13>=5
X14>=5
X15>=5
X16>=8
X17>=8
X18>=8
X19>=9
X20>=10
X21>=10
X22>=10
X23>=10
X24>=8

The objective function indicates that the first shift starts with 4 staff, the second and third shifts have 8 staff and the last shift has 10 staff. The linear program seeks to minimize the cost, subject to the number of workers required in each hour. Cost is obtained by multiplying the pay per hour by the number of workers. The number of workers should not be less than the required number of workers.
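Because each constraint involves a single variable and says only that staffing must not fall below the required number, a problem of this shape needs no solver: with positive cost coefficients, the minimum is reached by setting every variable to its required level. The sketch below illustrates this for the first four hours only (coefficient 4, required staff 5, 4, 4, 4); the function name `min_cost_staffing` is hypothetical.

```python
def min_cost_staffing(pay_per_hour, required):
    """Minimise sum(pay[i] * x[i]) subject to x[i] >= required[i].

    With strictly positive pay rates and one lower bound per variable,
    the cheapest feasible plan sits exactly on the lower bounds.
    """
    if any(pay <= 0 for pay in pay_per_hour):
        raise ValueError("this shortcut assumes positive cost coefficients")
    staff = list(required)
    cost = sum(pay * x for pay, x in zip(pay_per_hour, staff))
    return staff, cost

# First four hours of the formulation: X1..X4.
pay = [4, 4, 4, 4]
required = [5, 4, 4, 4]
staff, cost = min_cost_staffing(pay, required)
print(staff, cost)
```

If constraints linked several hours together (for example, staff who must work consecutive hours), this shortcut would no longer apply and a linear programming solver would be needed.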

Solution

Name                   Original Value    Final Value
Number of staff X1     4                 5
Number of staff X2     4                 4
Number of staff X3     4                 4
Number of staff X4     4                 4
Number of staff X5     4                 4
Number of staff X6     4                 6
Number of staff X7     8                 6
Number of staff X8     8                 8
Number of staff X9     8                 8
Number of staff X10    8                 8
Number of staff X11    8                 8
Number of staff X12    8                 5
Number of staff X13    8                 5
Number of staff X14    8                 5
Number of staff X15    8                 5
Number of staff X16    8                 8
Number of staff X17    8                 8
Number of staff X18    8                 8
Number of staff X19    12                9
Number of staff X20    12                10
Number of staff X21    12                10
Number of staff X22    12                10
Number of staff X23    12                10
Number of staff X24    12                8

Question 10 c

If it is possible for a staff member to stay for an extra two hours, the problem becomes:

Min cost=4X1+4X2+4X3+4X4+4X5+4X6+4X7+4X8+8X9+8X10+8X11+8X12+8X13+8X14+8X15+8X16+8X17+8X18+8X19+8X20+8X21+8X22+8X23+8X24

Subject to

X1>=5
X2>=4
X3>=4
X4>=4
X5>=6
X6>=6
X7>=6
X8>=8
X9>=8
X10>=8
X11>=8
X12>=5
X13>=5
X14>=5
X15>=5
X16>=8
X17>=8
X18>=8
X19>=9
X20>=10
X21>=10
X22>=10
X23>=10
X24>=8

The objective function indicates that the staff will work in three shifts. The first shift will start with four staff members, while the second and third will have eight. The objective function is subject to the required number of staff per shift.

Solution

Name                   Original Value    Final Value
Number of staff X1     4                 5
Number of staff X2     4                 4
Number of staff X3     4                 4
Number of staff X4     4                 4
Number of staff X5     4                 4
Number of staff X6     4                 6
Number of staff X7     4                 6
Number of staff X8     4                 8
Number of staff X9     8                 8
Number of staff X10    8                 8
Number of staff X11    8                 8
Number of staff X12    8                 5
Number of staff X13    8                 5
Number of staff X14    8                 5
Number of staff X15    8                 5
Number of staff X16    8                 8
Number of staff X17    8                 8
Number of staff X18    8                 8
Number of staff X19    8                 9
Number of staff X20    8                 10
Number of staff X21    8                 10
Number of staff X22    8                 10
Number of staff X23    8                 10
Number of staff X24    8                 8

The solution indicates the optimal number of staff the company can have in every hour.
