Correlation
Regression
If there is a strong correlation between two variables, regression is used to estimate
the value of the dependent variable (Y) from the value of the independent variable (X).
Types
Simple Linear Regression
Determines the value of a Dependent Variable based on a single independent
variable
Simplest form of Regression Analysis
Multiple Linear Regression
Used when the Dependent Variable is a continuous variable and independent
variables are continuous or categorical.
Logistic Regression
Logistic Regression is used when the outcome variable is categorical
The independent variables could be either categorical or continuous
Logistic Regression determines the Odds Ratio for various independent variables
for the dichotomous dependent variable
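As a minimal sketch of simple linear regression, the first type listed above, the least-squares line Y = a + bX can be fitted with the usual formulas b = Sxy / Sxx and a = mean(Y) - b * mean(X). The function name `fit_line` and the sample data are illustrative assumptions and do not come from these notes.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x (simple linear regression)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on Y = 1 + 2X
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = intercept + slope * 5  # estimate of Y for X = 5
```

With real data the points will not lie exactly on a line, and the fitted line is only as useful as the correlation between X and Y is strong.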
Correlation Analysis is a group of statistical techniques to measure the association between two
variables.
The Dependent Variable is the variable being predicted or estimated.
The Independent Variable provides the basis for estimation. It is the predictor variable.
A Scatter Diagram is a chart that portrays the relationship between two variables.
The Coefficient of Correlation (r) is a measure of the strength of the linear relationship between two
variables. It is also called Pearson's r or Pearson's product-moment correlation coefficient.
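The coefficient of correlation can be computed directly from its definition, r = Sxy / sqrt(Sxx * Syy). The sketch below assumes paired numeric observations; the function name `pearson_r` and the data are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient for paired data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

The value of r always lies between -1 and +1; values near either end indicate a strong linear relationship, values near 0 a weak one.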
The χ² (chi-square) test has the following characteristics:
1. It is a non-parametric test. Assumptions about the form of the distribution or its
parameters are not required.
2. It is a distribution-free test, which can be used with any type of population distribution.
3. The chi-square statistic is easy to calculate.
4. It analyses the differences between a set of observed frequencies and a set of
corresponding expected frequencies.
5. It is a multinomial distribution.
6. The variable varies from 0 to ∞.
7. It is a one-tailed test.
χ² Distribution
Type - I
1. Explain the χ² test. How will you use this to test a hypothesis? What are the precautions
which are to be taken for χ² testing? Mention the uses of χ².
2. The following data shows the distribution of frequency between educational level and
awareness of AIDS. Find the relationship between education and awareness.
                            Awareness Level
Education            Low    Moderate    Higher
Illiterate           320       50         30
Primary               80       15          5
Middle School        110       70         20
High School          200       60         40
Higher Secondary     310      130         60
3. Due to recession, an IT company is planning to lay off some of its personnel. The results
of an opinion survey conducted by the company are given below. Formulate the hypothesis
and test it using the χ² test at the 0.05 level of significance.
4. The following table gives the data regarding the field of study in the University and their
field of Specialization in High School
Test whether there is any association between High School Specialization and field of
study in the University
5. Based on the following data, test the hypothesis that there is no difference in quality among
the four brands of tyres. (α = 0.05)
                                      Tyre Brand
                                  A      B      C      D
Failed to last 4000 kms           26     23     15     32
Lasted 4000-6000 kms             118     93    116    121
Lasted for more than 6000 kms     56     84     67     49
6. Following table provides the number of executives according to the time devoted to
public activities by rank
                              Rank
Time Devoted    Manager    Sr. Manager    Gr. Manager
A Good Deal        25          13              9
Some Time          62          53             49
Never              12          34             43
Test the hypothesis that the time devoted to public activities is independent of the rank.
7. A survey was conducted in Bangalore city as well as in the rest of Karnataka state
regarding people's first choice among four types of magazines. The results are tabulated
below.
                                 City
Type of Magazine     Bangalore    Rest of Karnataka
News Magazine            70              310
Movie Magazine           60              280
Ladies Magazine          40              170
Sports Magazine          30               40
Test whether there is any significant difference between the Bangalore
population and the rest of Karnataka in the choice of magazines.
8. A sample of 115 professionals, 110 businessmen, and 125 farmers were chosen and asked
to express their feelings regarding a national policy. The result of the survey is given
below:
                       Education
Sex       Middle School    High School    College
Male            10              15           25
Female          25              10           15
Can you conclude that education depends on the sex of the individual?
11. The following table gives a sample of married women, their level of education and
marriage adjustment scores.
12. Formulate an appropriate hypothesis and use the χ² test for the following data.
                               Inter-Caste Marriage
Socioeconomic Status    Favorable    Indifferent    Unfavorable
Low                         40           25             10
Moderate                    35           30             15
High                        25           45              5
13. An oil company has explored three different areas for possible oil reserves. The results of
the tests are given below.
                   Area
               A     B     C
Strikes        7    10     8
Dry Holes     10    18     9
Do the data suggest that the three areas have the same potential at the 10% level of
significance?
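As an illustration of how a Type - I problem can be set up, the sketch below applies the χ² test of independence to the oil-exploration table in Problem 13. Expected frequencies are computed as (row total × column total) / grand total. The function name `chi_square_independence` is hypothetical, and the critical value 4.605 (2 degrees of freedom, 10% level) is taken from a standard χ² table rather than from these notes.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency under the hypothesis of independence
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Observed frequencies from Problem 13: rows = Strikes, Dry Holes; columns = Areas A, B, C
observed = [[7, 10, 8],
            [10, 18, 9]]
statistic, dof = chi_square_independence(observed)

CRITICAL_VALUE = 4.605  # assumed: tabulated chi-square value for 2 d.f. at the 10% level
reject_null = statistic > CRITICAL_VALUE
```

The null hypothesis (same potential in all three areas) is rejected only if the computed statistic exceeds the tabulated value. The same function can be reused for the other contingency tables in this section by changing `observed` and the critical value.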
Type - II
14. On the basis of the information given below, find whether there is any association between
inoculation and absence of attack of typhoid at the 5% level of significance.
15. Find the relationship between Educational Status and Safety Awareness of workers. Use the
χ² test at the 0.05 level of significance. Propose the Null Hypothesis and the Alternate
Hypothesis.
                      Level of Awareness
Educational Status      Low      High
Upto 10th Std.           30       70
Professional             60       40
Type - III
16. A theory predicts that the proportion of beans in the four groups A, B, C and D should be
9:3:3:1. In an experiment among 1600 beans, the numbers in the four groups were 882,
313, 287 and 118. Does the experimental result support the theory? Apply the χ² test.
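A Type - III problem compares observed frequencies with frequencies expected from a stated ratio. The sketch below works through the bean data of Problem 16. The function name `chi_square_goodness_of_fit` is hypothetical, and since the problem does not state a significance level, the 5% level (tabulated value 7.815 for 3 degrees of freedom) is an assumption.

```python
def chi_square_goodness_of_fit(observed, ratio):
    """Return (statistic, expected frequencies, degrees of freedom)."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    # Split the total in the theoretical ratio to get expected frequencies
    expected = [total * part / ratio_sum for part in ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = len(observed) - 1
    return statistic, expected, dof


# Problem 16: groups A, B, C, D with theoretical ratio 9:3:3:1
statistic, expected, dof = chi_square_goodness_of_fit([882, 313, 287, 118], [9, 3, 3, 1])

CRITICAL_VALUE = 7.815  # assumed: tabulated chi-square value for 3 d.f. at the 5% level
theory_supported = statistic <= CRITICAL_VALUE
```

Here the expected frequencies are 900, 300, 300 and 100. The theory is regarded as consistent with the experiment when the statistic does not exceed the tabulated value.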
17. A college is running post graduate classes in five subjects with equal number of students.
The total number of absentees in these five classes is 75. Test the hypothesis that these
classes are alike in absentees if the actual absentees in each are as follows: History = 9;
Philosophy = 18; Economics = 15; Commerce = 12; Chemistry = 11.
18. How well do the airline companies serve their customers? A study showed the following
customer ratings: 3% Excellent, 28% Good, 45% Fair and 24% Poor. In a follow-up study
of telephone companies, a sample of 400 adults gave the following customer ratings: 24
Excellent, 124 Good, 172 Fair and 80 Poor. Does the distribution of customer ratings for
telephone companies differ from the distribution of the customer ratings for the airline
companies?
19. A multinomial population with four categories A, B, C and D is claimed to have the same
proportion of items in all categories. A sample of size 300 yielded the following
results: A = 85, B = 95, C = 50 and D = 70. Use α = 0.05 to determine whether the claim
of the proportions being the same in every category is true.
20. During the first 13 weeks of the television season, the Saturday evening 8.00 pm to 9.00 pm
audience proportions were recorded as ABC = 29%, CBS = 28%, NBC = 25% and
Independent = 18%. A sample of 300 homes two weeks after a Saturday night schedule
revision yielded the following viewing audience data: ABC = 95 homes, CBS = 70 homes,
NBC = 89 homes and Independent = 46 homes. Test with α = 0.05 to determine whether
the viewing audience proportions have changed.
Type - IV
No. Turned Up     1     2     3     4     5     6
Frequency        16    20    25    14    29    28
Is the die biased?
22. Eight coins were tossed 256 times and the following results were obtained
No. of Heads     0     1     2     3     4     5     6     7     8
Frequency        2     6    30    52    67    56    32    10     1
Success          0     1     2     3     4     5     6     7     8     9    10    11    12
Frequency        -     7    60   198   430   731   948   847   536   257    71    11     -
Test for goodness of fit.
24. A typist kept a record of mistakes made per day during 300 working days in a year.
Mistakes/Day       0     1     2     3     4     5     6
No. of Days      143    90    42    12     9     3     1
Fit a Poisson distribution to the data. Test for goodness of fit.
(Given e^(-0.89) = 0.410652)
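Fitting a Poisson distribution, as Problem 24 asks, means estimating the mean from the data and then computing the expected frequencies N * e^(-m) * m^k / k!. The sketch below does this for the typist's data; the variable names are illustrative assumptions, and `math.exp` is used in place of the tabulated value of e^(-0.89), so the last decimal places may differ slightly from hand calculation.

```python
import math

# Data from Problem 24: mistakes per day and the number of days on which they occurred
mistakes = [0, 1, 2, 3, 4, 5, 6]
days = [143, 90, 42, 12, 9, 3, 1]

total_days = sum(days)
# The mean of the observed distribution is the estimate of the Poisson parameter m
mean = sum(k * f for k, f in zip(mistakes, days)) / total_days

# Expected number of days with k mistakes: N * e^(-m) * m^k / k!
expected = [
    total_days * math.exp(-mean) * mean ** k / math.factorial(k)
    for k in mistakes
]
```

The estimated mean is 0.89 mistakes per day, which is why e^(-0.89) is supplied with the problem. The goodness-of-fit test then compares `days` with `expected` using the χ² statistic.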
25. The following are the numbers of arrivals of flights per hour at an airport. Can we
conclude that the following 400 arrivals follow a Poisson distribution with λ = 3 at the 5%
level of significance?
No. of Arrivals     0     1     2     3     4     5
No. of Hours       20    57    98    85    78    62