
Master of Business Administration – MBA Semester I

MBA103/MB0040 – Statistics for Management – 4 Credits

(Book ID B1731)

Section A

Qn 1 Multiple Choice Questions (MCQs)

I. A random variable takes the values -3, -2, 1, 0, 4, 6 with probabilities 1/12, 2/12, 3/12, 4/12, 1/12,
1/12 respectively. The mean (expected value) and variance are _______________.

a. 1/2 and 23/4

b. 2/3 and 1/2

c. 23/4 and 1/2

d. 5/2 and 24/5

Answer - a. 1/2 and 23/4
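
The answer can be checked with a short calculation. The sketch below is an illustration only (it is not part of the question bank); it uses exact fractions and the identities E(X) = Σ x P(x) and Var(X) = E(X^2) - [E(X)]^2.

```python
from fractions import Fraction

# Values and probabilities copied from the question.
values = [-3, -2, 1, 0, 4, 6]
probs = [Fraction(k, 12) for k in (1, 2, 3, 4, 1, 1)]

# A valid probability distribution must sum to 1.
assert sum(probs) == 1

mean = sum(x * p for x, p in zip(values, probs))               # E(X)
second_moment = sum(x * x * p for x, p in zip(values, probs))  # E(X^2)
variance = second_moment - mean ** 2                           # Var(X)

print(mean, variance)  # 1/2 23/4
```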

II. i. Size of the class interval is equal to _______________. ii. Tally marks are used to construct
_______________.

a. i- Range/(1+3.322 log N), ii - frequency tables

b. i - Range/(1+2.22 log N), ii - frequency tables

c. i - Range/number of classes, ii - frequency distribution

d. i - Range/number of classes, ii - class interval

Answer – a. i- Range/(1+3.322 log N), ii - frequency tables

III. Steps in construction of cost of living index numbers involve the following in the order of:
a. Conduct family budget inquiry, select the class of people, obtain price quotations, define the scope of
the index, prepare a frame or list of persons.

b. Define scope of the index, Select the class of people, prepare a frame or list of persons, conduct
family budget inquiry, obtain price quotations.

c. Select the class of people, define scope of the index, conduct family budget inquiry, obtain price
quotations, prepare a frame or list of persons

d. Prepare a frame or list of persons, obtain price quotations, conduct family budget inquiry, define
scope of the index, select the class of people

Answer- c. Select the class of people, define scope of the index, conduct family budget inquiry, obtain
price quotations, prepare a frame or list of persons

IV. The Arithmetic Mean for following data is:

a. 25

b. 29

c. 32

d. 12

Answer - a. 25

V. State whether the following statements are true or false

i. The quantitative characteristic that varies from unit to unit is called a variable.

ii. A variable that assumes all the values in the range is known as discrete variable.

a. i- True, ii- False

b. i - False, ii- True


c. i- True, ii- True

d. i- False, ii- False

Answer – a. i- True, ii- False

VI. i. The totality of all units in a survey is called _______________. ii. A ______________ is a part or a
subset of the population.

a. i - Unit, ii - Statistic

b. i - Variable, ii - Unit

c. i - Population, ii - Sample

d. i - Statistic, ii – Population

Answer – c. i - Population, ii - Sample

VII. In a bivariate data on ‘x’ and ‘y’, variance of ‘x’ = 49, variance of ‘y’ = 9 and covariance Cov(x, y) = -17.5.
Coefficient of correlation between ‘x’ and ‘y’ is

a. 0.833

b. -0.833

c. 0.933

d. -0.933

Answer - b. -0.833
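
As a quick check, the coefficient of correlation is the covariance divided by the product of the two standard deviations. The following is a minimal sketch of that arithmetic, using only the figures given in the question.

```python
import math

# Figures from the question.
var_x, var_y, cov_xy = 49.0, 9.0, -17.5

# r = Cov(x, y) / (sd_x * sd_y)
r = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

print(round(r, 3))  # -0.833
```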

VIII. i. Questions that are answered only if the respondent gives a particular response to a previous
question is___________

ii. Questions where the respondents’ answers are limited to a fixed set of responses
are________________

a. i - Closed ended questions, ii - Contingency questions

b. i - Matrix questions, ii - Open ended questions


c. i - Contingency questions, ii - Closed ended questions

d. i - Closed ended questions, ii – Open ended questions

Answer – c. i - Contingency questions, ii - Closed ended questions

IX. i. The computed values of chi-square are__________

ii. The number of degrees of freedom in a 4X4 contingency table is__________

a. i - always negative , ii- 16

b. i - always positive, ii- 9

c. i - either positive or negative, ii- 8

d. i- always zero, ii – 15

Answer – b. i - always positive, ii- 9

X. From a random sample of 36 New Delhi civil service personnel, the mean age and the sample
standard deviation were found to be 40 years and 4.5 years respectively. 95% confidence interval for the
mean age of civil personnel in New Delhi is:

a. 40 ± 1.47

b. 42 ± 2.47

c. 52 ± 3.37

d. 55 ± 5.57

Answer – a. 40 ± 1.47
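
The interval can be verified as mean ± z × s/√n. The sketch below assumes the large-sample normal critical value z = 1.96 for 95% confidence, which is the value the answer key appears to use; a t-based interval with 35 degrees of freedom would be slightly wider.

```python
import math

# Figures from the question.
n, mean_age, s = 36, 40.0, 4.5

z = 1.96  # assumed 95% critical value of the standard normal distribution
margin = z * s / math.sqrt(n)

print(f"{mean_age:.0f} ± {margin:.2f}")  # 40 ± 1.47
```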

1. Statistics that is used to make valid inferences from the data for effective decision making among
managers or professionals is

a. Descriptive Statistics

b. Inferential Statistics

c. None of the above


d. All of the above

Answer - b. Inferential Statistics

2. “Statistics is the science of collection, presentation, analysis and interpretation of

numerical data from logical analysis” is the definition given by

a. Webster

b. Boddington

c. A. L Bowley

d. Croxton and Cowden

Answer - d. Croxton and Cowden

3. A professor asked the students in a class their ages. On the basis of this information, the

professor states that the average height of all the students in the university is 21 years. This is

an example of

a. a census

b. descriptive statistics

c. an experiment

d. Inferential Statistics

Answer - d. Inferential Statistics

4. The measure describing the characteristics of the population is known as

a. Parameter

b. Sample

c. Statistics

d. Census

Answer - a. Parameter

5. Stubs stand for


a. Numerical information

b. The headings and subheadings of columns

c. The headings and subheadings of rows

d. The Table heading

Answer - c. The headings and subheadings of rows

16. Algebraic sum of deviations of a set of values taken from their mean is

a. 1

b. 2

c. 3

d. 0

Answer- d. 0

17. The standard deviation is:

a. The square root of the variance

b. A measure of variability

c. An approximate indicator of how numbers vary from the mean

d. All of the above

Answer - d. All of the above

6. To compare the homogeneity or stability or consistency of two or more data sets we use

a. Arithmetic Mean

b. Standard Deviation

c. Coefficient of Variation

d. Mean Deviation

Answer - c. Coefficient of Variation

7. Which of the following represents the fiftieth percentile, or the middle point in a set of
numbers arranged in order of magnitude?

a. Mode

b. Median

c. Mean

d. Variance

Answer - b. Median

8. The mode for the data 8,7,6,5,6,6,7,6 is

a. 8

b. 7

c. 6

d. 5

ANSWER - c. 6

9. The probability of an event must lie within the interval from

a. 0 to 1

b. -1 to 1

c. 1 to 2

d. -1 to 0

ANSWER - a. 0 to 1

10. The mathematical expectation of a random variable is given by

a. E(X) = Σ Xi P(Xi)

b. E(X) = Σ Xi P(Xi^2)

c. E(X) = Σ Xi^2 P(Xi)

d. E(X) = Σ Xi^2 P(Xi^2)

ANSWER - a. E(X) = Σ Xi P(Xi)

11. ___________ is obtained when the binomial experiment is conducted a large number of times.

a. Probability distribution
b. Normal distribution

c. Poisson process

d. Binomial process

ANSWER – c. Poisson process

12. If X is a Poisson variate, such that P(X = 1) = P(X = 2), find P(X = 0).

a. 0.04979

b. 0.13534

c. 0.2382

d. 0.14937

ANSWER - b. 0.13534
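
The reasoning behind the answer: for a Poisson variate, P(X = 1) = P(X = 2) means λe^(-λ) = λ^2 e^(-λ)/2, so λ = 2 and P(X = 0) = e^(-2). A minimal sketch that checks this numerically (the helper name poisson_pmf is only illustrative):

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variate with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2.0  # solution of lam = lam^2 / 2 with lam > 0

# The condition stated in the question holds for this value of lam.
assert math.isclose(poisson_pmf(1, lam), poisson_pmf(2, lam))

p0 = poisson_pmf(0, lam)
print(round(p0, 5))  # 0.13534
```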

13. Number of heads obtained in 4 tosses of a coin is an example of

a. Binomial distribution

b. Bernoulli distribution

c. Poisson distribution

d. Normal distribution

ANSWER - a. Binomial distribution

14. Which sampling theory states that, “other things being equal, as the sample size increases,

the results tend to be more reliable and accurate”?

a. Law of statistical regularity

b. Principle of inertia of large numbers

c. Principle of persistence of small numbers

d. Principle of validity

Answer - b. Principle of inertia of large numbers


15. If the sample size is less than 30 and the population standard deviation is not known, we use the
_____________ for estimation.

a. Normal distributions

b. Standard deviation

c. Student’s ‘t’ distribution

d. Binomial distribution

Answer – c. Student’s ‘t’ distribution

16. Rejecting a null hypothesis when it is true constitutes ___________ .

a. Type I error

b. Type II error

c. Producer's risk

d. Right decision

Answer – a. Type I error

17. Which business forecasting method is used when business indices are constructed to study and
analyse the business activities on the basis of which future conditions are predetermined?

a. Business barometers

b. Time series analysis

c. Extrapolation

d. Regression analysis

Answer - a. Business barometers

18. The results of Chi-square test cannot be accurate if the cell frequencies in a contingency

table are less than ________.

a. 50

b. 5
c. 20

d. 10

Answer - b. 5

19. Which business forecasting method is used when business indices are constructed to

study and analyse the business activities on the basis of which future conditions are

predetermined?

a. Business barometers

b. Time series analysis

c. Extrapolation

d. Regression analysis

Answer - a. Business barometers

20. The long- term oscillations that represent consistent rise and decline in the values of the

variable are called

a. Long term trend

b. Seasonal variations

c. Cyclic variations

d. Random variables

Answer - c. Cyclic variations

21. Price and demand of the commodity is an example of

a. Positive correlation

b. Negative correlation

c. Zero correlation

d. multiple correlation

Answer - b. Negative correlation

22. If the Standard deviation and Mean of the distribution are 2.64 and 53 respectively, the
Co-efficient of Variation is

a. 4.98%

b. 5.54%

c. 6.64%

d. 8.14%

Answer - a. 4.98%
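
The coefficient of variation is the standard deviation expressed as a percentage of the mean. A one-line check of the answer, written as a small illustrative helper:

```python
def coefficient_of_variation(sd, mean):
    """Coefficient of variation as a percentage: (sd / mean) * 100."""
    return sd / mean * 100

cv = coefficient_of_variation(2.64, 53)
print(f"{cv:.2f}%")  # 4.98%
```

When two data sets are compared, the one with the smaller coefficient of variation is the more consistent.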

23. Steps in construction of cost of living index numbers involve the following in the order

of:

a. Conduct family budget inquiry, select the class of people, obtain price quotations, define the scope of
the index, prepare a frame or list of persons.

b. Define scope of the index, Select the class of people, prepare a frame or list of persons, conduct
family budget inquiry, obtain price quotations.

c. Select the class of people, define scope of the index, conduct family budget inquiry, obtain price
quotations, prepare a frame or list of persons

d. Prepare a frame or list of persons, obtain price quotations, conduct family budget inquiry, define
scope of the index, select the class of people

Answer - c. Select the class of people, define scope of the index, conduct family budget inquiry, obtain
price quotations, prepare a frame or list of persons

24. 2% of the fuses manufactured by a firm are expected to be defective. The probability that a box
containing 200 fuses contains at least one defective fuse is

a. 0.9817

b. 0.5124

c. 0.4523

d. 0.2222

Answer - a. 0.9817
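
The answer 0.9817 corresponds to the Poisson approximation with mean np = 200 × 0.02 = 4, where P(at least one defective) = 1 - e^(-4). The sketch below compares it with the exact binomial value; the comparison is an illustration and is not part of the original answer.

```python
import math

n, p = 200, 0.02
lam = n * p  # mean number of defective fuses per box

# Poisson approximation: P(X >= 1) = 1 - P(X = 0) = 1 - e^(-lam)
poisson_answer = 1 - math.exp(-lam)

# Exact binomial value: P(X >= 1) = 1 - (1 - p)^n
binomial_answer = 1 - (1 - p) ** n

print(round(poisson_answer, 4))  # 0.9817
```

The exact binomial value differs only in the third decimal place, which is why the Poisson approximation is acceptable for large n and small p.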

25. The mean and variance of the binomial distribution are

a) np and npq

b) n and p
c) nq and npq

d) npq and np

Answer - a) np and npq

26. i. Questions that are answered only if the respondent gives a particular response to a

previous question is___________

ii. Questions where the respondents’ answers are limited to a fixed set of responses

are________________

a. i - Closed ended questions, ii - Contingency questions

b. i - Matrix questions, ii - Open ended questions

c. i - Contingency questions, ii - Closed ended questions

d. i - Closed ended questions, ii – Open ended questions

Answer - c. i - Contingency questions, ii - Closed ended questions

27. The median value of the following set of values 22, 16, 18, 13, 15, 19, 17, 20, 23 is

a. 19

b. 18

c. 15

d. 13

Answer - b. 18

28. i. The theory of Business forecasting based on the assumption that most of the business

data have the lag and lead relationship, that is, changes in business are successive but not

simultaneous is_________________

ii. The theory of business forecasting based on the assumption that history repeats itself

and hence assumes that all economic and business events behave in a rhythmic order

is____________

a. i - Specific historical analogy, ii - Action and reaction theory


b. i - Action and reaction theory, ii - Specific historical analogy

c. i - Economic rhythm theory, ii - Sequence or time-lag theory

d. i - Sequence or time-lag theory, ii – Economic rhythm theory

Answer - d. i - Sequence or time-lag theory, ii – Economic rhythm theory

29. If average height of 30 men is 158 cm and average height of another group of 40 men is 162 cm, the
average height of the combined group is

a. 150.29

b. 160.29

c. 170.29

d. 180.29

Answer - b. 160.29
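
The combined mean is the weighted average of the group means, with the group sizes as weights. A minimal sketch of that calculation (the function name is illustrative):

```python
def combined_mean(sizes, means):
    """Weighted mean of group means, weighted by group size."""
    total = sum(n * m for n, m in zip(sizes, means))
    return total / sum(sizes)

height = combined_mean([30, 40], [158, 162])
print(round(height, 2))  # 160.29
```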

30. i. If the statistical data are classified according to the time of its occurrence, the type of

classification is_________________

ii. Classification based on some attributes is______________

a. i - Chronological Classification, ii - Qualitative Classification

b. i - Qualitative Classification, ii - Chronological Classification

c. i - Quantitative Classification, ii - Geographical Classification

Answer - a. i - Chronological Classification, ii - Qualitative Classification

31. Match the following with respect to Parts of a Table:

Part A
1. Source note
2. Head note
3. Captions
4. Title

Part B
A. The headings and subheadings describing the data present in the columns.
B. Indicates the scope and the nature of contents in a concise form.
C. Indicates the source from which the data is taken and is placed at the bottom on the left hand corner.
D. It is given below the title of the table to indicate the units of measurement of the data and is enclosed in brackets.

a. 1D, 2C, 3B, 4A

b. 1C, 2B, 3D, 4A

c. 1C, 2D, 3A, 4B

d. 1A, 2B, 3C, 4D

Answer - c. 1C, 2D, 3A, 4B

32. From a random sample of 36 New Delhi civil service personnel, the mean age and the sample
standard deviation were found to be 40 years and 4.5 years respectively. 95% confidence interval for
the mean age of civil personnel in New Delhi is:

a. 40 ± 1.47

b. 42 ± 2.47

c. 52 ± 3.37

d. 55 ± 5.57

Answer - a. 40 ± 1.47

33. In a competition, two judges assigned the ranks for seven candidates. The Spearman’s

rank correlation coefficient is

a. 0.25

b. 0.55

c. 0.35

d. 0.75

Answer - d. 0.75
34. Heights of students are normally distributed with mean 165 cm and standard deviation 5

cm. The probability that the height of a student is greater than 177 cm is:

a. 0.0082

b. 1

c. 1.2

d. 0.5

Answer - a. 0.0082
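
Here z = (177 - 165)/5 = 2.4, and the required probability is the area of the standard normal curve to the right of 2.4. The sketch below obtains that area from the complementary error function in Python's standard library instead of a printed normal table.

```python
import math

def normal_upper_tail(x, mu, sigma):
    """P(X > x) for a normal variate with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

p = normal_upper_tail(177, 165, 5)
print(round(p, 4))  # 0.0082
```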

35. i. The computed values of chi-square are__________

ii. The number of degrees of freedom in a 4X4 contingency table is__________

a. i - always negative , ii- 16

b. i - always positive, ii- 9

c. i - either positive or negative, ii- 8

d. i- always zero, ii - 15

Answer - b. i - always positive, ii- 9

36. Given

i) Laspeyre’s price index is given by

a. 151.92

b. 161.92

c. 171.92

d. 181.92

Answer - a. 151.92

ii) Laspeyre’s quantity index number is given by


a. 101.54

b. 120.12

c. 90.35

d. 82.45

37. The time series given below shows the figures of production (in m. tonnes) of a sugar factory. The
best fit for the following data is the straight line trend represented by the equation Y= a+bX

i) The value of a is

a. 50

b. 80

c. 45

d. 90

Answer - d. 90

ii) The value of b is

a. 0

b. 10

c. 1

d. 2

Answer - d. 2

38. Match the following:

Part A
1. Mode
2. Deciles
3. Median
4. Percentile

Part B
A. It divides the distribution into 100 parts of equal frequency.
B. It can be determined graphically (Ogives) and is not affected by extreme values.
C. It divides the arrayed set of variates into ten portions of equal frequency and they are sometimes used to characterise the data for some specific purpose.
D. It represents fashion and often it is used in business. Thus, it corresponds to the values of variable, which occurs most frequently.

a. 1D, 2C, 3B, 4A

b. 1B, 2C, 3D, 4A

c. 1C, 2B, 3A, 4D

d. 1A, 2B, 3C, 4D

Answer – a. 1D, 2C, 3B, 4A

74. Karl Pearson’s correlation coefficient for the following data is

a. r = 0.50

b. r = 0.596

c. r = 0.699

d. r = -1.1

Answer - c. r = 0.699

39. In a Binomial distribution p = 0.5 and n = 4, then

i. P(X = 0) is

a. 0.0625

b. 0.078

c. 0.78

d. 0.008

Ans - a. 0.0625
ii. P(X ≥ 2) is

a. 0.5825

b. 0.6875

c. 0.2875

d. 0.010

Answer - b. 0.6875
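
Both parts follow from the binomial probability function P(X = x) = nCx p^x q^(n-x). A short sketch that evaluates it for n = 4 and p = 0.5 (the helper name binomial_pmf is illustrative):

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial variate with parameters n and p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 4, 0.5

p_zero = binomial_pmf(0, n, p)
p_at_least_two = sum(binomial_pmf(x, n, p) for x in range(2, n + 1))

print(p_zero)          # 0.0625
print(p_at_least_two)  # 0.6875
```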

Section B

Q.2. Write a note on any 5 terminologies used in probability theory.

Answer:-

5 terminologies used in probability theory:-

a) Experiment: - An operation that results in a definite outcome is called an experiment. Tossing a coin is
an experiment if it shows head (H) or tail (T) on falling. If only H or T is anticipated as the outcome,
tossing a coin that is likely to stand on its edge on a typical surface (see figure) is not an
experiment.

Fig.: A Coin Standing on its Edge


b) Random experiment: - When the outcome of an experiment cannot be predicted with certainty, then
it is called random experiment or stochastic experiment.

There are two types of experiments. They are –

(i) Deterministic experiment and

(ii) Random experiment.

c) Sample space: - The set of all possible outcomes of a random experiment is the sample space. The
sample space is denoted by S. The outcomes of the random experiment (elements of the sample space)
are called sample points or outcomes or cases.

d) Event: - Event is a subset of the sample space. Events are denoted by A, B, C, etc. An event which
does not contain any outcome is a null event (impossible event).

e) Equally likely events (equiprobable events):- Two or more events are equally likely if they have equal
chance of occurrence. That is, equally likely events are such that none of them have greater chance of
occurrence than the others.

Q.3.The incidence of occupational disease in an industry is such that the workers have a 20% chance
of suffering from it. What is the probability that out of six workers, 4 or more will contract the
disease?

Answer:-

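Let X be the number of workers contracting the disease among 6 workers.

Then X is a binomial variate with parameters n = 6 and

p = P[a worker contracts the disease] = 20/100 = 0.20, so that q = 1 - p = 0.80.

Therefore, by the binomial distribution,

P(X = x) = 6Cx (0.20)^x (0.80)^(6-x), x = 0, 1, 2, ..., 6

The probability that 4 or more workers contract the disease is

P(X ≥ 4) = P(X = 4) + P(X = 5) + P(X = 6)
= 15(0.20)^4(0.80)^2 + 6(0.20)^5(0.80) + (0.20)^6
= 0.01536 + 0.001536 + 0.000064
= 0.01696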

Q.5. List at least 5 conditions to apply Chi-Square test.

Answer:-

Conditions to apply Chi-Square test:-

i. The frequencies used in Chi-Square test must be absolute and not in relative terms.

ii. The total number of observations collected for this test must be large.

iii. Each of the observations which make up the sample of this test must be independent of each other.

iv. As the χ² test is based wholly on sample data, no assumption is made concerning the population
distribution. In other words, it is a non-parametric test.

v. The χ² test is wholly dependent on degrees of freedom. As the degrees of freedom increase, the
Chi-Square distribution curve becomes more symmetrical.

Q.6. Write the differences between Correlation and Regression Coefficient.

Answer:-

Differences between Correlation and Regression Coefficient:-

Correlation Coefficient | Regression Coefficient
The correlation coefficients, rxy = ryx | The regression coefficients, byx ≠ bxy
‘r’ lies between -1 and 1. | ‘byx’ can be greater than one, in which case ‘bxy’ must be less than one such that byx.bxy ≤ 1
It has no units attached to it. | It has units attached to it.
There exists nonsense correlation. | There is no such nonsense regression.
It is not based on cause and effect relationship. | It is based on cause and effect relationship.
It indirectly helps in estimation. | It is meant for estimation.

Q.7. Write short note on forecasting methods using time series.

Answer:-

Forecasting methods using time series:-

(i) Mean forecast: - It is the simplest method of forecasting in which, for the time period t, we forecast
the value of the series to be equal to the mean of the series, that is, Yt = (Y1 + Y2 + ... + Yn)/n.

In this method the trend effect and cyclic effects are not taken into account.

(ii) Naive forecast: - In this method we forecast the value for the time period t to be equal to the actual
value observed in the previous period, that is, time period (t-1). This is given as: Yt = Y(t-1).

(iii) Linear trend forecast: - It is given by Yt = a + bX, where X is found from the value of t, and a and b
are constants. This method is based on the least squares method, in which a linear relationship is
obtained between time and the response value.

(iv) Non-linear trend forecast: - In this method a non-linear relationship between the time and the
response value is found by the method of least squares. The forecast value ‘Yt’ for the time
period ‘t’ is given as:

Yt = a + bX + cX^2, where X is calculated from the value of ‘t’, and a, b and c are constants.

(v) Forecasting with exponential smoothing: - Exponential smoothing is the forecasting method in
which the observation values are constantly updated and used to revise a forecast. As the observations
get older, they get exponentially decreasing weights. Exponential smoothing is of many types, such as
single, double, triple exponential smoothing.
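
The methods above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed implementation: the production figures are hypothetical, time is coded as X = 0, 1, 2, ... for the linear trend, the smoothing constant alpha = 0.5 is an arbitrary choice, and the first observation is used as the starting value for single exponential smoothing.

```python
def mean_forecast(series):
    """Forecast equal to the mean of the observed series."""
    return sum(series) / len(series)

def naive_forecast(series):
    """Forecast equal to the most recent observed value."""
    return series[-1]

def linear_trend_forecast(series, steps_ahead=1):
    """Least-squares line Y = a + bX with X = 0, 1, ..., n-1, extrapolated ahead."""
    n = len(series)
    xs = range(n)
    x_bar = sum(xs) / n
    y_bar = sum(series) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, series))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a + b * (n - 1 + steps_ahead)

def exponential_smoothing_forecast(series, alpha):
    """Single exponential smoothing; older observations get geometrically smaller weights."""
    smoothed = series[0]  # assumed starting value
    for y in series[1:]:
        smoothed = alpha * y + (1 - alpha) * smoothed
    return smoothed

production = [10, 12, 14, 16]  # hypothetical series

print(mean_forecast(production))                        # 13.0
print(naive_forecast(production))                       # 16
print(linear_trend_forecast(production))                # 18.0
print(exponential_smoothing_forecast(production, 0.5))  # 14.25
```

On a steadily rising series the mean and naive forecasts lag behind, the linear trend continues the rise, and exponential smoothing falls in between.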

Section C

Q.8. Explain the various methods of sampling.

Answer:-

Probability Sampling Methods:-

1. Simple random sampling:-

Sample units are drawn in such a way that each and every unit in the population has an equal and
independent chance of being included in the sample.

Simple random sampling can be done by the following ways:

i. Lottery method – we identify each and every unit with distinct numbers by allotting an identical
card.

ii. The use of table of random numbers – There are several random number tables. They are
Tippett’s random number table, Fisher and Yates’ tables, Kendall and Babington Smith’s random
tables, Rand Corporation random numbers etc.

2. Stratified random sampling:-

This sampling design is most appropriate if the population is heterogeneous with respect to
characteristic under study or the population distribution is highly skewed.

We subdivide the population into several groups or strata such that:

i. Units within each stratum are more homogeneous


ii. Units between strata are heterogeneous
iii. Strata do not overlap, in other words, every unit of the population belongs to one and only one
stratum

3. Systematic sampling:-

This design is used when the sampling units are arranged in some systematic order such as geographical, chronological or alphabetical order.

4. Cluster sampling:-
The total population is divided into recognisable sub-divisions, known as clusters.

5. Multi-stage sampling:-

The total population is divided into several stages. The sampling process is carried out through several
stages.
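
Three of the probability sampling designs described above can be sketched with Python's standard random module. Everything specific here is an assumption for illustration: the population of 100 numbered units, the sample size of 10, the two strata, the proportional (10%) allocation and the fixed seed.

```python
import random

rng = random.Random(42)           # fixed seed only so that the sketch is reproducible
population = list(range(1, 101))  # hypothetical units numbered 1..100
sample_size = 10

# Simple random sampling: every unit has an equal chance of selection.
srs = rng.sample(population, sample_size)

# Systematic sampling: random start, then every k-th unit of the ordered list.
k = len(population) // sample_size
start = rng.randrange(k)
systematic = population[start::k]

# Stratified random sampling: draw independently within non-overlapping strata,
# here with a sample proportional to the size of each stratum.
strata = {"stratum_1": population[:60], "stratum_2": population[60:]}
stratified = {
    name: rng.sample(units, len(units) // 10)
    for name, units in strata.items()
}

print(sorted(srs))
print(systematic)
print({name: sorted(units) for name, units in stratified.items()})
```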

Non-Probability Sampling Methods:-

1. Judgment sampling:-

The choice of sample items depends exclusively on the judgment of the investigator.

Merits | Demerits
1. Most useful for small population. | 1. It is not a scientific method.
2. Most useful to study some unknown traits of a population some of whose characteristics are known. | 2. It has a risk of investigator’s bias being introduced.
3. Helpful in solving day-to-day problems. |

2. Convenience sampling:-

The sample units are selected according to the convenience of the investigator. It is also called “chunk”
which refers to the fraction of the population being investigated.

3. Quota sampling:-

Quotas are set up according to some specified characteristic such as age groups or income groups. From
each group a specified number of units are sampled according to the quota allotted to the group.

Q.9. Discuss the various steps involved in the analysis of variance in two way classification.

Answer:-

Procedure for carrying out the One-way ANOVA:-

1. Compute the sum of all values ‘T’.

2. Find the correction factor:

Correction factor = T^2/n, where ‘n’ is the total number of observations.

3. Find Total sum of squares:

SST = Sum of squares of all observations - T^2/n

4. Sum of the squares between the columns (samples):

SSC = Σ (Cj^2/nj) - T^2/n, where Cj is the total of the j-th column and nj is the number of observations in it.

5. Sum of the squares of the error within columns (samples):

SSE = SST - SSC

6. Variance between samples:

MSC = SSC/(k-1), where ‘k’ is the number of samples (columns).

7. Variance within the samples:

MSE = SSE/(n-k)

8. Test statistic F = MSC/MSE

9. Decision: If the computed value of F > table (critical) value of F for degrees of freedom (k-1, n-k) at
α% (5% or 1%), then we reject H0 and conclude that the population means are not all equal. Otherwise
we do not reject H0 and conclude that the population means do not differ significantly.

Procedure for carrying out the Two-way ANOVA:-

1. a) Assume the means of all columns are equal. That is, the effects of all factors in the first kind of
treatment are equal.

b) Assume the means of all rows are equal. That is, the effects of all factors in the second kind of
treatment are equal.

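2. Compute the sum of all values ‘T’.

3. Find Total sum of squares:

SST = Sum of squares of all observations - T^2/n, where n = rc is the total number of observations.

4. For columns, SSC is calculated as:

SSC = Σ (Cj^2)/r - T^2/n, where Cj is the total of the j-th column.

5. For rows, SSR is calculated as:

SSR = Σ (Ri^2)/c - T^2/n, where Ri is the total of the i-th row.

6. SS residual or error: SSE = SST - SSC - SSR

Mean squares: MSC = SSC/(c-1), MSR = SSR/(r-1), MSE = SSE/[(c-1)(r-1)]

where, ‘c’ - number of columns and ‘r’ - number of rows.

Test statistics: Fc = MSC/MSE and Fr = MSR/MSE

Degrees of freedom for Fc = {c-1, (c-1) (r-1)}

Degrees of freedom for Fr = {r-1, (c-1) (r-1)}

If MSE > MSC then we take Fc = MSE/MSC

If MSE > MSR then we take Fr = MSE/MSR

Fc is for column wise comparison

Fr is for row wise comparison

If Fc < table value of F then the hypothesis that the column means are equal is accepted.

If Fr < table value of F then the hypothesis that the row means are equal is accepted.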

The table below depicts the ANOVA table for two-way ANOVA.

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F-Ratio
Between Columns | SSC | (c-1) | MSC | Fc
Between Rows | SSR | (r-1) | MSR | Fr
Residual | SSE | (c-1)(r-1) | MSE |
Total | SST | (n-1) | |
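
The two-way procedure can be traced on a small table. The sketch below is a minimal illustration, assuming one observation per cell and the conventional ratios Fc = MSC/MSE and Fr = MSR/MSE; it does not implement the inversion of the ratio when MSE is the larger mean square, and the 2 x 2 data are hypothetical.

```python
def two_way_anova(table):
    """Two-way ANOVA sums of squares for table[i][j] = value in row i, column j."""
    r, c = len(table), len(table[0])
    n = r * c
    total = sum(sum(row) for row in table)  # T
    cf = total ** 2 / n                     # correction factor T^2/n

    sst = sum(x * x for row in table for x in row) - cf
    col_totals = [sum(row[j] for row in table) for j in range(c)]
    row_totals = [sum(row) for row in table]
    ssc = sum(t * t for t in col_totals) / r - cf
    ssr = sum(t * t for t in row_totals) / c - cf
    sse = sst - ssc - ssr

    msc = ssc / (c - 1)
    msr = ssr / (r - 1)
    mse = sse / ((c - 1) * (r - 1))  # must be non-zero for the F ratios to exist
    return {"SST": sst, "SSC": ssc, "SSR": ssr, "SSE": sse,
            "Fc": msc / mse, "Fr": msr / mse}

# Hypothetical data: 2 rows x 2 columns.
result = two_way_anova([[1, 2], [3, 6]])
print(result)
# {'SST': 14.0, 'SSC': 4.0, 'SSR': 9.0, 'SSE': 1.0, 'Fc': 4.0, 'Fr': 9.0}
```

The computed Fc and Fr are then compared with the table values of F at the degrees of freedom listed above.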

Q.10. Two research workers classified some people in income groups on the basis of sampling studies.
Their results are as follow:

Show that the sampling technique of at least one research worker is defective.

Answer:-

a.)
Chi-square test of goodness of fit

The test is applied when you have one categorical variable from a single population. It is used to
determine whether sample data are consistent with a hypothesized distribution.

For example, suppose a company printed baseball cards. It claimed that 30% of its cards were
rookies; 60%, veterans; and 10%, All-Stars. We could gather a random sample of baseball cards
and use a chi-square goodness of fit test to see whether our sample distribution differed
significantly from the distribution claimed by the company.

Precautions

In order to use a chi-square hypothesis test properly, one has to be extremely careful and keep in
mind certain precautions.

First, the sample size should be large enough. If the expected frequencies are too small, the value of
χ² gets over-estimated. This will result in the rejection of the hypothesis in several cases.

Another point to note is that if proportions or percentages are used instead of absolute frequencies,
the theoretical χ² distribution would not be applicable.

In most of the cases, the problem of χ² involves simple calculations. However, for large sets of data
the chi-square test involves very comprehensive calculations. In all such cases, a computer should
be used. Several statistical software packages contain routines for carrying out chi-square tests.

Role in business decision making

Goodness-of-fit tests are often used in business decision making. In order to calculate a chi-square
goodness-of-fit, it is necessary to first state the null hypothesis and the alternative hypothesis,
choose a significance level (such as α = 0.05) and determine the critical value. It can be applied in
a wide area including surveys, business decision making, quality control, biological research,
medical research, etc. Also, chi-square tests are commonly used in studies dealing with
demographics, Likert scales, and other discrete data. It is also used to estimate the confidence
interval for a normally distributed population’s standard deviation from the sample standard
deviation; or for other tests like ANOVA and Friedman’s Rank ANOVA.

b.)

Solution:
Let us take the hypothesis that the sampling techniques adopted by the two research workers are similar.
This being so, the expectation of investigator A classifying the people in

(i) Poor income group= (200x300)/500 =120


(ii) Middle income group= (200x150)/500=60
(iii) Rich income group=(200x50)/500=20

Similarly, the expectation of investigator B classifying the people in

(iv) Poor income group= (300x300)/500 =180


(v) Middle income group= (300x150)/500=90
(vi) Rich income group=(300x50)/500=30

We can now calculate value of χ2 as follows:

Group | Observed freq (Oij) | Expected freq (Eij) | Oij - Eij | (Oij - Eij)²/Eij
Investigator A | | | |
Poor | 160 | 120 | 40 | 1600/120 = 13.33
Middle | 30 | 60 | -30 | 900/60 = 15.00
Rich | 10 | 20 | -10 | 100/20 = 5.00
Investigator B | | | |
Poor | 140 | 180 | -40 | 1600/180 = 8.89
Middle | 120 | 90 | 30 | 900/90 = 10.00
Rich | 40 | 30 | 10 | 100/30 = 3.33

Hence,

χ² = ∑[(Oij - Eij)² / Eij] ≈ 55.56

Degree of freedom=(c-1)(r-1)

=(3-1)(2-1)=2

The table value of χ2 for two degree of freedom at 5 percent level of significance is 5.991.

The calculated value of χ2 is much higher than this table value which means that the calculated
value cannot be said to have arisen just because of chance. It is significant. Hence, the hypothesis
does not hold good. This means that the sampling techniques adopted by the two investigators differ
and are not similar. Hence, the sampling technique of at least one research worker must be defective.
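
The whole calculation can be reproduced from the observed frequencies in the solution table. In the sketch below each expected frequency is (row total × column total)/grand total, and the critical value 5.991 is the table value quoted above; the variable names are illustrative.

```python
# Observed frequencies: rows = investigators A and B, columns = poor, middle, rich.
observed = [[160, 30, 10],
            [140, 120, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of a cell under the hypothesis of similar techniques.
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

critical_value = 5.991  # table value for 2 degrees of freedom at the 5% level
print(round(chi_square, 2), degrees_of_freedom)  # 55.56 2
print(chi_square > critical_value)               # True
```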

1 Statistics plays a vital role in almost every facet of human life. Describe the functions of Statistics.
Explain the applications of statistics.

Meaning of statistics: 2 marks; Functions of statistics: 4 marks; Applications of statistics: 4 marks; Total: 10 marks

Answer –

MEANING OF STATISTICS
According to Horace Secrist, Statistics may be defined as “an aggregate of facts affected
to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according
to a reasonable standard of accuracy, collected in a systematic manner for a predetermined purpose and
placed in relation to each other”. This definition is both comprehensive and exhaustive.

Prof. Boddington, on the other hand, defined Statistics as “The science of estimates and
probabilities”. This definition, however, is not complete.

FUNCTIONS OF STATISTICS

Statistics is used for various purposes. Let us look at each function of Statistics in detail.

1. Statistics simplifies mass data

The use of statistical concepts helps in simplification of complex data. Using statistical
concepts, the managers can make decisions more easily. The statistical methods help in reducing the
complexity of the data and in the understanding of any huge mass of data.

2. Statistics brings out trends and tendencies in the data

After data is collected, it is easy to analyse the trend and tendencies in the data by using the various
concepts of Statistics.

3. Statistics brings out the hidden relations between variables

Statistical analysis helps in drawing inferences on the data. Statistical analysis brings out the hidden
relations between variables.

4. Statistics makes decision making easier

With the proper application of Statistics and statistical software packages on the collected data,
managers can take effective decisions, which can increase the profits in a business.

5. Statistics makes comparison easier

Without using statistical methods and concepts, collection of data and comparison would be difficult.
Statistics helps us to compare data collected from various sources. Grand totals, measures of central
tendency and measures of dispersion, graphs and diagrams and coefficient of correlation all provide
ample scope for comparison.

APPLICATIONS OF STATISTICS

Statistical methods are applied to specific problems in various fields such as Biology, Medicine,
Agriculture, Commerce, Business, Economics, Industry, Insurance, Sociology and Psychology. In the field
of medicine, statistical tools like t-tests are used to test the efficacy of a new drug or medicine. In the
field of economics, statistical tools such as index numbers, estimation theory and time series analysis are
used in solving economic problems related to wages, price, production and distribution of income.

In Biology, Medicine and Agriculture, Statistical methods are applied in the following:

• Study of the growth of plants
• Movement of fish population in the ocean
• Migration pattern of birds
• Analysis of the effect of newly invented medicines
• Theories of heredity
• Estimation of yield of crop
• Study of the effect of fertilizers on yield
• Birth rate
• Death rate
• Population growth
• Growth of bacteria

2 a) Explain the approaches to define probability. b) State the addition and multiplication rules of
probability giving an example of each case.

a) Explanation of the approaches to define probability

b) Addition and multiplication rules of probability giving an example of each 5


Answer – A/ In some areas, such as mathematics or logic, results of some process can be known with
certainty (e.g., 2+3=5). Most real life situations, however, involve variability and uncertainty. For
example, it is uncertain whether it will rain tomorrow; the price of a given stock a week from today is
uncertain; the number of claims that a car insurance policy holder will make over a one-year
period is uncertain. Uncertainty or "randomness" (meaning variability of results) is usually due to some
mixture of two factors: (1) variability in populations consisting of animate or inanimate objects (e.g.,
people vary in size, weight, blood type etc.), and (2) variability in processes or phenomena (e.g., the
random selection of 6 numbers from 49 in a lottery draw can lead to a very large number of different
outcomes; stock or currency prices fluctuate substantially over time).

Variability and uncertainty make it more difficult to plan or to make decisions. Although they cannot
usually be eliminated, it is however possible to describe and to deal with variability and uncertainty, by
using the theory of probability. This course develops both the theory and applications of probability.

B/

The addition rule of probability states that:

i) If ‘A’ and ‘B’ are any two events then the probability of the occurrence of either ‘A’ or ‘B’ is given by:

P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

ii) If ‘A’ and ‘B’ are two mutually exclusive events then the probability of occurrence of either ‘A’ or ‘B’ is
given by:

P(A ∪ B) = P(A) + P(B)

iii) If ‘A’, ‘B’ and ‘C’ are any three events then the probability of occurrence of either ‘A’ or ‘B’ or ‘C’ is
given by:

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) – P(A ∩ B) – P(B ∩ C) – P(A ∩ C) + P(A ∩ B ∩ C)

In terms of Venn diagram, from the figure 5.3, we can calculate the probability of
occurrence of either event ‘A’ or event ‘B’, given that event ‘A’ and event ‘B’ are dependent events.

From the figure 5.4, we can calculate the probability of occurrence of either ‘A’ or ‘B’,
given that, events ‘A’ and ‘B’ are independent events. From the figure 5.5, we can calculate the
probability of occurrence of either ‘A’ or ‘B’ or ‘C’, given that, events ‘A’, ‘B’ and ‘C’ are dependent events.

iv) If A1, A2, A3, ………, An are ‘n’ mutually exclusive and exhaustive events then the
probability of occurrence of at least one of them is given by:

P(A1 ∪ A2 ∪ ……… ∪ An) = P(A1) + P(A2) + ……… + P(An) = 1

Multiplication rule

If ‘A’ and ‘B’ are two independent events then the probability of occurrence of ‘A’ and ‘B’ is given by:

P(A ∩ B) = P(A) × P(B)
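The question asks for an example of each rule. The sketch below is an illustration that is not part of the original answer: it assumes one card drawn from a standard 52-card deck for the addition rule and two tosses of a fair coin for the multiplication rule. All variable names are hypothetical.

```python
from fractions import Fraction

# Assumed example: draw one card from a standard 52-card deck.
p_king = Fraction(4, 52)            # event A: the card is a king
p_heart = Fraction(13, 52)          # event B: the card is a heart
p_king_and_heart = Fraction(1, 52)  # A and B together: the king of hearts

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_king_or_heart = p_king + p_heart - p_king_and_heart

# Mutually exclusive events: a single card cannot be both a king and a queen,
# so the joint term is zero and P(A or B) = P(A) + P(B).
p_queen = Fraction(4, 52)
p_king_or_queen = p_king + p_queen

# Multiplication rule for independent events: two tosses of a fair coin.
p_head = Fraction(1, 2)
p_two_heads = p_head * p_head

print(p_king_or_heart, p_king_or_queen, p_two_heads)
```

Exact fractions are used here so that the results can be compared directly with a hand calculation.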

Solved Problem 6

i) Show that P(A) = 1 – P(A')


ii) Show that probability is a value between 0 and 1.
iii) Show that P(Ф) = 0 where Ф is null event.
3 a) The procedure of testing hypothesis requires a researcher to adopt several steps. Describe in brief
all such steps.

b) Explain the components of time series. a) Hypothesis testing procedure b) Components of time series

Answer –

Procedure for Hypothesis Testing

To test a hypothesis means to tell (on the basis of the data the researcher has collected) whether or not
the hypothesis seems to be valid. In hypothesis testing the main question is: whether to accept the null
hypothesis or not to accept the null hypothesis? Procedure for hypothesis testing refers to all those steps
that we undertake for making a choice between the two actions i.e., rejection and acceptance of a null
hypothesis.

The various steps involved in hypothesis testing are stated below:

Five Steps in Hypothesis Testing:

1. Specify the Null Hypothesis

2. Specify the Alternative Hypothesis

3. Set the Significance Level (α)

4. Calculate the Test Statistic and Corresponding P-Value

5. Drawing a Conclusion

Step 1: Specify the Null Hypothesis

The null hypothesis (H0) is a statement of no effect, relationship, or difference between two or more
groups or factors. In research studies, a researcher is usually interested in disproving the null hypothesis.

Examples:

• There is no difference in intubation rates across ages 0 to 5 years.

• The intervention and control groups have the same survival rate (or, the intervention does not
improve survival rate).

• There is no association between injury type and whether or not the patient received an IV in
the prehospital setting
Step 2: Specify the Alternative Hypothesis

The alternative hypothesis (H1) is the statement that there is an effect or difference. This is usually the
hypothesis the researcher is interested in proving. The alternative hypothesis can be one-sided (only
provides one direction, e.g., lower) or two-sided. We often use two-sided tests even when our true
hypothesis is one-sided because it requires more evidence against the null hypothesis to accept the
alternative hypothesis.

Examples:

• The intubation success rate differs with the age of the patient being treated (two-sided).

• The time to resuscitation from cardiac arrest is lower for the intervention group than for the
control (one-sided).

• There is an association between injury type and whether or not the patient received an IV in
the prehospital setting (two sided).

Step 3: Set the Significance Level (α)

The significance level (denoted by the Greek letter alpha, α) is generally set at 0.05. This means that
there is a 5% chance that you will accept your alternative hypothesis when your null hypothesis is
actually true. The smaller the significance level, the greater the burden of proof needed to reject the null
hypothesis, or in other words, to support the alternative hypothesis.

Step 4: Calculate the Test Statistic and Corresponding P-Value

In another section we present some basic test statistics to evaluate a hypothesis. Hypothesis testing
generally uses a test statistic that compares groups or examines associations between variables. When
describing a single sample without establishing relationships between variables, a confidence interval is
commonly used.

The p-value describes the probability of obtaining a sample statistic as or more extreme by chance alone
if your null hypothesis is true. This p-value is determined based on the result of your test statistic. Your
conclusions about the hypothesis are based on your p-value and your significance level.

Example:

• P-value = 0.01 This will happen 1 in 100 times by pure chance if your null hypothesis is true.
Not likely to happen strictly by chance.

Step 5: Drawing a Conclusion

1. P-value <= significance level (α) => Reject your null hypothesis in favor of your alternative
hypothesis. Your result is statistically significant.
2. P-value > significance level (α) => Fail to reject your null hypothesis. Your result is not
statistically significant.
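The decision rule of Step 5 can be written as a small function. This is a minimal sketch; the function name `decide` and the returned strings are assumptions made for illustration only.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the Step 5 conclusion for a p-value and a significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # P-value <= alpha: the result is statistically significant.
    if p_value <= alpha:
        return "reject H0"
    # P-value > alpha: there is not enough evidence against H0.
    return "fail to reject H0"


print(decide(0.002))  # a small p-value, as in the survival example
print(decide(0.20))   # a large p-value
```

Note that the function never returns "accept H0", in line with the point that a null hypothesis cannot be proved by this procedure.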

Hypothesis testing is not set up so that you can absolutely prove a null hypothesis. Therefore, when you
do not find evidence against the null hypothesis, you fail to reject the null hypothesis. When you do find
strong enough evidence against the null hypothesis, you reject the null hypothesis. Your conclusions also
translate into a statement about your alternative hypothesis. When presenting the results of a
hypothesis test, include the descriptive statistics in your conclusions as well. Report exact p-values rather
than a certain range. For example, "The intubation rate differed significantly by patient age, with
younger patients having a lower rate of successful intubation (p=0.02)." Here is one more example with
the conclusion stated in several different ways.

Example:

• H0: There is no difference in survival between the intervention and control group.

• H1: There is a difference in survival between the intervention and control group.

• α = 0.05; 20% increase in survival for the intervention group; p-value = 0.002

Conclusion:

• Reject the null hypothesis in favor of the alternative hypothesis.

• The difference in survival between the intervention and control group was statistically
significant.

• There was a 20% increase in survival for the intervention group compared to control (p=0.002).

B/ COMPONENTS OF TIME SERIES -

Any time series can contain some or all of the following components:

1. Trend (T)

2. Cyclical (C)

3. Seasonal (S)

4. Irregular (I)

These components may be combined in different ways. It is usually assumed that they are multiplied or
added, i.e.,

yt = T × C × S × I

yt = T + C + S + I

To correct for the trend in the first case one divides the first expression by the trend (T). In the second
case it is subtracted.
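The two ways of combining the components, and the corresponding trend correction, can be sketched as follows. All component values are hypothetical figures for four quarters, chosen only to illustrate the arithmetic.

```python
# Hypothetical component values for four quarters (illustration only).
trend = [100.0, 102.0, 104.0, 106.0]

# Additive model: components are in the same units as the series.
cyc_add = [2.0, 1.0, -1.0, -2.0]
sea_add = [5.0, -3.0, -4.0, 2.0]
irr_add = [0.5, -0.5, 0.2, -0.2]
y_add = [t + c + s + i for t, c, s, i in zip(trend, cyc_add, sea_add, irr_add)]

# Multiplicative model: components other than the trend are ratios near 1.
cyc_mul = [1.02, 1.01, 0.99, 0.98]
sea_mul = [1.05, 0.97, 0.96, 1.02]
irr_mul = [1.005, 0.995, 1.002, 0.998]
y_mul = [t * c * s * i for t, c, s, i in zip(trend, cyc_mul, sea_mul, irr_mul)]

# Trend correction: subtract T in the additive case, divide by T in the
# multiplicative case. What remains is C + S + I or C x S x I respectively.
detrended_add = [y - t for y, t in zip(y_add, trend)]
detrended_mul = [y / t for y, t in zip(y_mul, trend)]

print(detrended_add)
print(detrended_mul)
```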

Trend component

The trend is the long term pattern of a time series. A trend can be positive or negative depending on
whether the time series exhibits an increasing long term pattern or a decreasing long term pattern. If a
time series does not show an increasing or decreasing pattern then the series is stationary in the mean.

Cyclical component

Any pattern showing an up and down movement around a given trend is identified as a cyclical pattern.
The duration of a cycle depends on the type of business or industry being analyzed.

Seasonal component

Seasonality occurs when the time series exhibits regular fluctuations during the same month (or months)
every year, or during the same quarter every year. For instance, retail sales peak during the month of
December.

Irregular component

This component is unpredictable. Every time series has some unpredictable component that makes it a
random variable. In prediction, the objective is to “model” all the components to the point that the only
component that remains unexplained is the random component.

4 a) What is a Chi-square test? Point out its applications. Under what conditions is this test applicable?
b) Discuss the types of measurement scales with examples.

a) Meaning, applications and conditions b) Types of measurement scales with examples 4 6

Answer –

MEANING OF CHI-SQUARE TEST

The Chi-square test is one of the most commonly used non-parametric tests in statistical work. The Greek
letter χ² is used to denote this test. It describes the magnitude of discrepancy between the observed
and the expected frequencies. The value of χ² is calculated as:

χ² = (O1 – E1)²/E1 + (O2 – E2)²/E2 + … + (On – En)²/En = ∑ (Oi – Ei)²/Ei

Where, O1, O2, O3, …, On are the observed frequencies and E1, E2, E3, …, En are the corresponding expected
or theoretical frequencies.
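As an illustration of the calculation, the sketch below computes the Chi-square statistic from observed and expected frequencies. The data (60 rolls of a die, with an expected frequency of 10 per face) are hypothetical and not taken from the text.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, faces 1 to 6.
observed = [8, 12, 9, 11, 14, 6]
expected = [10] * 6  # a fair die gives 60 / 6 = 10 per face

stat = chi_square(observed, expected)
print(stat)  # compare with the tabulated Chi-square value for 6 - 1 = 5 d.f.
```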
Application –
The Chi-Square test can also be applied to discrete distributions. In using the Chi-Square test, we
need no assumptions regarding the shape of sampling distributions. The applications of the Chi-Square test
include testing:

• The significance of sample variances
• The goodness of fit of a theoretical distribution
• The independence in a contingency table
• Whether the observed results are consistent with the expected segregations in breeding experiments of
genetics

CONDITIONS
The following are the conditions for using the Chi-Square test:

1. The frequencies used in Chi-Square test must be absolute and not in relative terms.

2. The total number of observations collected for this test must be large.

3. Each of the observations which make up the sample of this test must be independent of each other.

4. As the test is based wholly on sample data, no assumption is made concerning the population
distribution. In other words, it is a non-parametric test.

5. The expected frequency of any item or cell must not be less than 5. If it is, the frequencies of adjacent
items or cells should be pooled together in order to make it 5 or more.

6. This test is used only for drawing inferences through test of the hypothesis, so it cannot be used for
estimation of parameter value.

5 Business forecasting acquires an important place in every field of the economy. Explain the
objectives and theories of Business forecasting.

 Meaning of Business forecasting


 Objectives of Business forecasting
 Theories of Business forecasting

Answer -

MEANING OF BUSINESS FORECASTING

Business forecasting provides a guide to long-term strategic planning and helps to inform
decisions about scheduling of production, personnel and distribution. These are common statistical tasks
in business that are often done poorly and frequently confused with planning and setting of goals.
Forecasting of USB-ED introduces participants to forecasting techniques and provides a practical
understanding of the main forecasting tools used by economists, and business, marketing and financial
analysts.

This unique program is designed to provide a balanced mix of theory and practice with the aim of
equipping participants to become operational forecasters, capable of designing, implementing and
evaluating their own forecasting projects. The theories discussed will be cemented by hands-on sessions
in the computer laboratory using industry-standard forecasting software packages.

OBJECTIVES OF FORECASTING

To a very large extent, success or failure would depend upon the ability to successfully
forecast the future course of events. Without some element of continuity between past, present and
future, there would be little possibility of successful prediction. But history is not likely to repeat itself and
we would hardly expect economic conditions next year or over the next 10 years to follow a clear cut
prediction. Yet, past patterns prevail sufficiently to justify using the past as a basis for predicting the
future.

A businessman cannot afford to base his decisions on guesses. Forecasting helps a businessman
in reducing the areas of uncertainty that surround management decision making with respect to costs,
sales, production, profits, capital investment, pricing, expansion of production, extension of credit,
development of markets, increase of inventories and curtailment of loans. These decisions are to be
based on present indications of future conditions.

Theories of Business forecasting

There are many theories, which are usually followed to make business forecasting. In theory of
economic rhythm the available historical data have to be analyzed into their components, i.e. trend,
seasonal, cyclical, and irregular variations.

The propounders of the theory were of the view that the economic phenomenon behaves in a
rhythmic manner and cycles of nearly the same intensity and duration tend to recur.

The secular trend obtained from historical data is projected a number of years into the future on
a graph or with the help of mathematical trend equation. If the phenomenon is cyclical in behavior, the
trend should be adjusted for cyclical movements. When the forecast for a year is to be split into months
or quarters then the forecasters should adjust the projected figure for seasonal variations also with the
help of seasonal indices.
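The projection of a secular trend and the seasonal adjustment described above can be sketched as follows. The annual sales figures and the quarterly seasonal indices are hypothetical, and a straight-line trend fitted by least squares is assumed as the mathematical trend equation.

```python
# Hypothetical annual sales for years 1 to 5 (illustration only).
years = [1, 2, 3, 4, 5]
sales = [100.0, 112.0, 119.0, 131.0, 138.0]

# Fit the straight-line trend y = a + b * x by least squares.
n = len(years)
mean_x = sum(years) / n
mean_y = sum(sales) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, sales)) / sum(
    (x - mean_x) ** 2 for x in years
)
intercept = mean_y - slope * mean_x

# Project the trend one year into the future.
forecast_year6 = intercept + slope * 6

# Split the annual forecast into quarters with hypothetical seasonal indices
# (in per cent, averaging 100 over the four quarters).
seasonal_indices = [90.0, 110.0, 80.0, 120.0]
quarterly = [forecast_year6 / 4 * idx / 100 for idx in seasonal_indices]

print(slope, intercept, forecast_year6)
print(quarterly)
```

Because the seasonal indices average 100, the four quarterly figures add back to the annual forecast.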

Action and Reaction theory is based on Newton’s third law of motion, i.e. for every action there
is an equal and opposite reaction. When we apply this law to business forecasting, it implies that if there
is a depression in a particular field of business, there is bound to be a boom in it sooner or later. It reminds us
of the business cycle, which has four phases, i.e. prosperity, decline, depression and recovery. This
theory regards certain levels of business activity as normal and the forecasters have to estimate the
normal level carefully. According to this theory, if the price of a commodity goes beyond the normal level, it
must also come down below the normal level because of the increased production and supply of that
commodity.

Sequence theory or time lag method is based on the behavior of different businesses, which
show similar movements occurring successively but not simultaneously. As such, this method takes into
account the time lag based on the theory of lead-lag relationship, which holds good in most cases. The
series that usually change earlier serve as a forecast for other related series. This way the element of risk
is considerably reduced.

Answer - Meaning of Analysis of Variance


The term analysis of variance probably sounds familiar to you, especially if you have been schooled in at
least one quantitative methodology course or have been working in the field of social sciences for some
time. Analysis of variance (ANOVA), as the name implies, is a statistical technique that is
intended to analyze variability in data in order to infer the inequality among population means. This may
sound illogical, but there is more to this idea than just what the name implies.

Assumptions

The results of a one-way ANOVA can be considered reliable as long as the following assumptions are
met:
• The response variable is normally distributed (or approximately normally distributed).
• Samples are independent.
• Variances of populations are equal.
• Responses for a given group are independent and identically distributed normal random variables (not a
simple random sample (SRS)).

Formulas/Calculation/Solution to the problem

Let H0: There is no significant difference in the means of three samples

Table : The three samples

T = Sum of all observations = 150

Correction factor = T²/N = 150²/15 = 1500

SST (Total Sum of the Squares) = Sum of squares of all observations – T²/N
= (8² + 7² + 12² + 10² + .......... + 14²) – 1500 = 1600 – 1500 = 100

Sum of the squares between the columns (samples):

SSC = 40

Sum of the squares of the error within columns (samples):

SSE = SST – SSC = 100 – 40 = 60

Variance between samples:

MSC = SSC/(k – 1) = 40/2 = 20

Variance within the samples:

MSE = SSE/(n – k) = 60/12 = 5

The degrees of freedom = (k – 1, n – k) = (2, 12).

[k is the number of columns and n is the total number of observations]
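The remaining arithmetic can be checked with a short script. It uses only the totals quoted in the solution (T = 150, N = 15, sum of squares = 1600, SSC = 40, k = 3); the original table of observations is not reproduced here. The final F ratio is not stated in the text and is shown as the usual next step of a one-way ANOVA.

```python
# Totals taken from the worked solution.
total_sum = 150       # T, sum of all observations
n_obs = 15            # N, total number of observations
k = 3                 # number of columns (samples)
sum_of_squares = 1600 # sum of squares of all observations
ssc = 40              # sum of squares between columns

correction_factor = total_sum ** 2 / n_obs  # T^2 / N
sst = sum_of_squares - correction_factor    # total sum of squares
sse = sst - ssc                             # sum of squares within columns

msc = ssc / (k - 1)      # variance between samples
mse = sse / (n_obs - k)  # variance within samples

# Assumed next step: the F ratio, to be compared with the tabulated F value
# for (k - 1, n - k) = (2, 12) degrees of freedom.
f_ratio = msc / mse

print(correction_factor, sst, sse, msc, mse, f_ratio)
```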

Q7 Distinguish between Classification and Tabulation. Explain the structure and components of a
Table with an example.
Meaning of Classification and Tabulation
Differences between Classification and Tabulation
Structure and Components of a Table with an example
Answer.
Meaning of Classification and Tabulation
Classification
According to Secrist, “Classification is the process of arranging data into sequences and groups according
to their common characteristics or separating them into different but related parts”. According to Stockton
and Clark, “The process of grouping large number of individual facts and observations, on the basis of
similarity among the items is called Classification”.
Tabulation
Tabulation follows classification. It is a logical or systematic listing of related data in rows and columns.
The row of a table represents the horizontal arrangement of data and column represents the vertical
arrangement of data. The presentation of data in tables should be simple, systematic and unambiguous.
The objectives of tabulation are to:
 Simplify complex data
 Highlight important characteristics
 Present data in minimum space
 Facilitate comparison
 Bring out trends and tendencies
 Facilitate further analysis
Differences between Classification and Tabulation
Table depicts the few differences between classification and tabulation.
Table: Differences between Classification and Tabulation

Structure and Components of a Table with an example


Table and figure depict the parts of a table along with the explanation of each tab (tabs from 1 to 10).

Tab 1: Table number


Table number is to identify the table for reference. When there are many tables in an analysis, then table
numbers are helpful in identifying the tables.
Tab 2: Title
Title indicates the scope and the nature of contents in a concise form. In other words, title of a table gives
information about the data contained in the body of the table. Title should not be lengthy.
Tab 3 and Tab 4: Captions
Captions are the headings and subheadings describing the data present in the columns.
Tab 5 and Tab 6: Stubs
Stubs are the headings and subheadings of rows.
Tab 7: Body of the table
Body of the table contains numerical information.
Tab 8: Totals
The sub-totals for each separate classification and a general total for all combined classes should be given
at the bottom or right side of the figures whose totals are taken. Ruling and spacing separate columns and
rows. However, totals are separated from main body by thick lines.
Tab 9: Head note
Head note is given below the title of the table to indicate the units of measurement of the data and is enclosed
in brackets.
Tab 10: Source note
Source note indicates the source from which data is taken. The source note related to table is placed at the
bottom on the left hand corner.

Q8 a) Describe the characteristics of Normal probability distribution.


b) In a sample of 120 workers in a factory, the mean and standard deviation of wages were Rs. 11.35
and Rs.3.03 respectively. Find the percentage of workers getting wages between Rs.9 and Rs.17 in
the whole factory assuming that the wages are normally distributed.
Characteristics of Normal probability distribution
Formula/Computation/Solution to the problem
Answer.

a) Characteristics of Normal probability distribution


The following are some of the characteristics of Normal distribution:
1. Normal distribution is a continuous probability distribution
2. Its probability density function is given by:

f(x) = [1 / (σ√(2π))] × e^(–(x – µ)² / (2σ²)), for –∞ < x < ∞

3. Its mean is µ and standard deviation is σ, where µ and σ are the parameters of the distribution
4. It is a bell-shaped curve and is symmetric about its mean, as depicted in figure.

Fig.: Normal Distribution Curve

a. It is symmetrical (Non-skew). That is β1 = 0
b. The mean, median and mode are equal
5. The Mean divides the curve into two equal portions
6. Its quartile deviation, Q.D. ≈ 2/3 σ
7. Its mean deviation, M.D. ≈ 4/5 σ
8. The X – axis is an asymptote to the curve [Asymptote is a straight line that touches the curve at
infinity]
9. The points of inflexion occur at µ ± σ
10. It is a unimodal distribution
11. Mean, Median and Mode coincide
12. The area under normal curve within certain limits is depicted in table. The graphical representation of
the table is depicted in figure.

Fig. : Areas under the Normal Distribution Curve

b) Formula/Computation/Solution to the problem

Z1 = (x1 – µ)/σ
= (9 – 11.35)/3.03
= –0.78

Z2 = (x2 – µ)/σ
= (17 – 11.35)/3.03
= 1.86

From tables
Area between z = –0.78 and z = 0 is 0.2823 (by symmetry, the same as the area between z = 0 and z = 0.78)
Area between z = 0 and z = 1.86 is 0.4686
Area covered by the workers getting wages between Rs. 9 and Rs. 17
= 0.2823 + 0.4686
= 0.7509

Required percentage = 0.7509 × 100
= 75.09%, i.e. about 75%
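The table look-up can be cross-checked numerically. The sketch below is an illustration, not part of the original solution: it computes the standard normal cumulative probability from the error function in Python's standard library instead of reading it from tables, so the result differs slightly from the table-based figure because the z values are not rounded to two decimals.

```python
import math


def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd = 11.35, 3.03  # wages in Rs., from the problem statement

z1 = (9 - mean) / sd    # lower limit, a negative z value
z2 = (17 - mean) / sd   # upper limit

prob = phi(z2) - phi(z1)  # P(9 < X < 17)
print(round(z1, 2), round(z2, 2), round(100 * prob, 1))
```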

Q9 a) The procedure of testing hypothesis requires a researcher to adopt several steps. Describe in
brief all such steps.
b) Distinguish between:
i. Stratified random sampling and Systematic sampling
ii. Judgment sampling and Convenience sampling
Hypothesis testing procedure
Differences

Answer.
Steps for procedure of testing hypothesis
Five Steps in Hypothesis Testing:
1. Specify the Null Hypothesis
2. Specify the Alternative Hypothesis
3. Set the Significance Level (α)
4. Calculate the Test Statistic and Corresponding P-Value
5. Drawing a Conclusion

Step 1: Specify the Null Hypothesis

The null hypothesis (H0) is a statement of no effect, relationship, or difference between two or more groups
or factors. In research studies, a researcher is usually interested in disproving the null hypothesis.

Examples:
 There is no difference in intubation rates across ages 0 to 5 years.
 The intervention and control groups have the same survival rate (or, the intervention does not
improve survival rate).
 There is no association between injury type and whether or not the patient received an IV in the
prehospital setting

Step 2: Specify the Alternative Hypothesis

The alternative hypothesis (H1) is the statement that there is an effect or difference. This is usually the
hypothesis the researcher is interested in proving. The alternative hypothesis can be one-sided (only
provides one direction, e.g., lower) or two-sided. We often use two-sided tests even when our true
hypothesis is one-sided because it requires more evidence against the null hypothesis to accept the
alternative hypothesis.

Examples:
 The intubation success rate differs with the age of the patient being treated (two-sided).
 The time to resuscitation from cardiac arrest is lower for the intervention group than for the
control (one-sided).
 There is an association between injury type and whether or not the patient received an IV in the
prehospital setting (two sided).
Step 3: Set the Significance Level (α)

The significance level (denoted by the Greek letter alpha, α) is generally set at 0.05. This means that there
is a 5% chance that you will accept your alternative hypothesis when your null hypothesis is actually true.
The smaller the significance level, the greater the burden of proof needed to reject the null hypothesis, or
in other words, to support the alternative hypothesis.

Step 4: Calculate the Test Statistic and Corresponding P-Value

In another section we present some basic test statistics to evaluate a hypothesis. Hypothesis testing generally
uses a test statistic that compares groups or examines associations between variables. When describing a
single sample without establishing relationships between variables, a confidence interval is commonly
used.

The p-value describes the probability of obtaining a sample statistic as or more extreme by chance alone if
your null hypothesis is true. This p-value is determined based on the result of your test statistic. Your
conclusions about the hypothesis are based on your p-value and your significance level.

Example:

 P-value = 0.01 This will happen 1 in 100 times by pure chance if your null hypothesis is true. Not
likely to happen strictly by chance.
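How a p-value follows from a test statistic can be sketched for one simple case. The example below assumes, purely for illustration, a two-sided test whose statistic z follows the standard normal distribution when the null hypothesis is true; the function name is hypothetical.

```python
import math


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for a standard normal Z, i.e. the two-sided p-value."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Larger test statistics give smaller p-values.
for z in (1.0, 1.96, 2.576):
    print(z, round(two_sided_p_value(z), 4))
```

A statistic of about 1.96 corresponds to a p-value of about 0.05, and about 2.576 to a p-value of about 0.01, which is why these values are commonly used as cut-offs.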

Step 5: Drawing a Conclusion

1. P-value <= significance level (α) => Reject your null hypothesis in favor of your alternative
hypothesis. Your result is statistically significant.
2. P-value > significance level (α) => Fail to reject your null hypothesis. Your result is not statistically
significant.

Hypothesis testing is not set up so that you can absolutely prove a null hypothesis. Therefore, when you
do not find evidence against the null hypothesis, you fail to reject the null hypothesis. When you do find
strong enough evidence against the null hypothesis, you reject the null hypothesis. Your conclusions also
translate into a statement about your alternative hypothesis. When presenting the results of a hypothesis
test, include the descriptive statistics in your conclusions as well. Report exact p-values rather than a certain
range. For example, "The intubation rate differed significantly by patient age, with younger patients having a
lower rate of successful intubation (p=0.02)." Here is one more example with the conclusion stated in
several different ways.

Example:
• H0: There is no difference in survival between the intervention and control group.
• H1: There is a difference in survival between the intervention and control group.
• α = 0.05; 20% increase in survival for the intervention group; p-value = 0.002

Conclusion:
• Reject the null hypothesis in favor of the alternative hypothesis.
• The difference in survival between the intervention and control group was statistically significant.
• There was a 20% increase in survival for the intervention group compared to control (p=0.002).

Difference between Stratified random sampling and Systematic sampling & Judgement sampling and
Convenience sampling
Stratified random sampling
This sampling design is most appropriate if the population is heterogeneous with respect to characteristic
under study or the population distribution is highly skewed. We subdivide the population into several
groups or strata such that:
i) Units within each stratum are more homogeneous
ii) Units between strata are heterogeneous
iii) Strata do not overlap, in other words, every unit of the population belongs to one and only one stratum
The criteria used for stratification are geographical, sociological, age, sex, income etc. The population of
size ‘N’ is divided into ‘k’ strata relatively homogenous of size N1, N2…….Nk such that ‘N1 + N2
+……… + Nk = N’.
Then, we draw a simple random sample from each stratum either proportional to size of stratum or equal
units from each stratum.
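Proportional allocation can be sketched as follows. The strata names and sizes are hypothetical; the sketch assumes units in each stratum are numbered from 1 and uses a fixed random seed only so that the run is repeatable.

```python
import random


def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to strata in proportion to stratum size.

    Rounding is used, so in general the allocations may not add up exactly
    to sample_size; the example sizes below divide evenly.
    """
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in strata_sizes.items()
    }


# Hypothetical population of N = 1000 split into k = 3 strata.
strata_sizes = {"urban": 500, "semi-urban": 300, "rural": 200}
allocation = proportional_allocation(strata_sizes, sample_size=100)

# Draw a simple random sample (without replacement) from each stratum.
rng = random.Random(0)
sample = {
    name: rng.sample(range(1, strata_sizes[name] + 1), allocation[name])
    for name in strata_sizes
}

print(allocation)
```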

Systematic sampling
This design is recommended if we have a complete list of sampling units arranged in some systematic order
such as geographical, chronological or alphabetical order.
Suppose the population size is ‘N’. The population units are serially numbered ‘1’ to ‘N’ in some systematic
order and we wish to draw a sample of ‘n’ units. Then we divide units from ‘1’ to ‘N’ into ‘n’ groups such
that each group has ‘K’ units. This implies ‘nK = N’ or ‘K = N/n’. From the first group, we select a unit at
random. Suppose the unit selected is the 6th unit; thereafter we select every Kth unit after it, i.e. units
6 + K, 6 + 2K and so on. If ‘K’ is 20, ‘n’ is 5 and ‘N’ is 100 then units selected are 6, 26, 46, 66, 86.
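The selection procedure can be sketched in a few lines. The function name and its `start` parameter are assumptions made for illustration; when no start is supplied, a random unit from the first K units is chosen.

```python
import random


def systematic_sample(population_size, sample_size, start=None):
    """Return the serial numbers chosen by systematic sampling."""
    if population_size % sample_size != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    k = population_size // sample_size  # sampling interval K = N / n
    if start is None:
        start = random.randint(1, k)    # random unit from the first K units
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and K")
    # Take the starting unit and every K-th unit after it.
    return [start + j * k for j in range(sample_size)]


print(systematic_sample(100, 5, start=6))
```

With N = 100, n = 5 and a starting unit of 6, this reproduces the units listed in the text.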

Judgment sampling
The choice of sample items depends exclusively on the judgment of the investigator. The investigator’s
experience and knowledge about the population will help to select the sample units. It is the most suitable
method if the population size is small. The table depicts the merits and demerits of judgement sampling.

Convenience sampling
The sample units are selected according to the convenience of the investigator. It is also called “chunk”
which refers to the fraction of the population being investigated, which is selected neither by probability
nor by judgment. Moreover, a list or framework should be available for the selection of the sample. It is
used to make pilot studies. However, there is a high chance of bias being introduced.
Q10 a) What is regression analysis? How does it differ from correlation analysis?
b) Calculate Karl Pearson’s coefficient of correlation between X series and Y series.

x 110 120 130 120 140 135 155 160 165 155
y 12 18 20 15 25 30 35 20 25 10

Meaning of Regression and Correlation


Differences
Formula/ Computation/ Solution to the problem

Answer.
Meaning of Regression and Correlation

Regression analysis
According to M. M. Blair, Regression is defined as, “the measure of the average relationship between two
or more variables in terms of the original units of the data”. Regression analysis – in statistics, this includes
any technique for learning about the relationship between one or more dependent variables Y and one or
more independent variables X. Regression analysis is used to estimate the values of the dependent variables
from the values of the independent variables. Regression analysis is used to get a measure of the error
involved while using the regression line as a basis for estimation. The regression coefficient Y on X is the
coefficient of the variable ‘X’ in the line of regression Y on X. Regression coefficients are used to calculate
the correlation coefficient. The square of correlation is the product of regression coefficients.

Correlation
Correlation analysis attempts to study the relationship between the two variables ‘X and ‘Y’. In regression,
it is attempted to quantify the dependence of one variable on the other. For example, if there are two
variables ‘X’ and ‘Y’ and ‘Y’ depends on ‘X’, then the dependence is expressed in the form of the equations.
When two or more variables move in sympathy with the other, then they are said to be correlated. If both
variables move in the same direction, then they are said to be positively correlated. If the variables move in
the opposite direction, then they are said to be negatively correlated. If they move haphazardly, then there
is no correlation between them. Correlation analysis deals with the following:
 Measuring the relationship between variables.
 Testing the relationship for its significance.
 Giving confidence interval for population correlation measure.
The correlation between two variables may be due to the following causes:
 Due to small sample sizes, Correlation may be present in sample and not in population.
 Due to a third factor, like in the case, Correlation between yield of rice and tea may be due to a
third factor - ‘rain’.

Differences

Correlation and regression analysis are related in the sense that both deal with relationships among
variables.

The correlation coefficient is a measure of linear association between two variables. Values of the
correlation coefficient are always between -1 and +1. A correlation coefficient of +1 indicates that two
variables are perfectly related in a positive linear sense, a correlation coefficient of -1 indicates that two
variables are perfectly related in a negative linear sense, and a correlation coefficient of 0 indicates that
there is no linear relationship between the two variables. The term correlation is used when
1) Both variables are random variables, and
2) The end goal is simply to find a number that expresses the relation between the variables.
Regression analysis involves identifying the relationship between a dependent variable and one or more
independent variables. The term regression is used when
1) One of the variables is a fixed variable, and
2) The end goal is to use the measure of relation to predict values of the random variable based on values
of the fixed variable.

Formula/ Computation/ Solution to the problem

S. No    X       Y      XY       X²        Y²
1        110     12     1320     12100     144
2        120     18     2160     14400     324
3        130     20     2600     16900     400
4        120     15     1800     14400     225
5        140     25     3500     19600     625
6        135     30     4050     18225     900
7        155     35     5425     24025     1225
8        160     20     3200     25600     400
9        165     25     4125     27225     625
10       155     10     1550     24025     100
N=10     ∑X=1390 ∑Y=210 ∑XY=29730 ∑X²=196500 ∑Y²=4968

\[ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}} \]

\[ r = \frac{10(29730) - (1390)(210)}{\sqrt{10(196500) - (1390)^2}\,\sqrt{10(4968) - (210)^2}} \]

\[ r = \frac{297300 - 291900}{\sqrt{1965000 - 1932100}\,\sqrt{49680 - 44100}} = \frac{5400}{\sqrt{32900}\,\sqrt{5580}} \approx \frac{5400}{13549.24} \]

r ≈ 0.3985 Answer.
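
The arithmetic above can be checked with a short Python sketch. It uses only the ten (X, Y) pairs from the table; the variable names are illustrative. It also confirms the relation stated earlier, that the square of the correlation coefficient equals the product of the two regression coefficients.

```python
from math import sqrt, isclose

# Data from the worked table above
x = [110, 120, 130, 120, 140, 135, 155, 160, 165, 155]
y = [12, 18, 20, 15, 25, 30, 35, 20, 25, 10]
n = len(x)

# Raw sums used in Karl Pearson's formula
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

cov_term = n * sum_xy - sum_x * sum_y    # numerator: 5400
var_x_term = n * sum_x2 - sum_x ** 2     # 32900
var_y_term = n * sum_y2 - sum_y ** 2     # 5580

r = cov_term / (sqrt(var_x_term) * sqrt(var_y_term))

# Regression coefficients of Y on X and of X on Y
b_yx = cov_term / var_x_term
b_xy = cov_term / var_y_term

print(f"r = {r:.4f}")
print("r squared equals b_yx * b_xy:", isclose(r ** 2, b_yx * b_xy))
```

A value of r close to 0.4 indicates a weak positive linear relationship between X and Y.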

Q11. Briefly explain the methods and theories of Business forecasting.


Meaning of Business forecasting
Methods of Business forecasting
Theories of Business forecasting
Answer.
Meaning of Business forecasting
Business forecasting provides a guide to long-term strategic planning and helps to inform decisions about
scheduling of production, personnel and distribution. These are common statistical tasks in business that
are often done poorly and frequently confused with planning and setting of goals.

Methods of Business forecasting


The following are the main methods of business forecasting.
1. Business barometers
2. Time series analysis
3. Extrapolation
4. Regression analysis
5. Modern econometric methods
6. Exponential smoothing method

Business Barometers
Business indices are constructed to study and analyse business activities, and future conditions are
predicted on their basis. As business indices are indicators of future conditions, they are also
known as ‘business barometers’ or ‘economic barometers’. With the help of these business barometers the
trend of fluctuations in business conditions is understood, and a decision relating to the problem can be
taken by forecasting.
Time series analysis
Time series analysis is also used for business forecasting. Forecasting through time series analysis is
possible only when business data for a number of years are available and reflect a definite trend and
seasonal variation. By time series analysis the long-term (secular) trend and the seasonal and cyclical
variations are ascertained, analyzed and separated from the data of various years.
Extrapolation
Extrapolation is the simplest method of business forecasting. By extrapolation, a businessman estimates the
likely trend of demand for his goods and the likely trend of future prices. The accuracy of extrapolation
depends on two factors:
• Knowledge about the fluctuations of the figures
• Knowledge about the course of events relating to the problem under consideration
Regression analysis
The regression approach offers many valuable contributions to the solution of the forecasting problem. It
is the means by which we select, from among the many possible relationships between variables in a
complex economy, those that will be useful for forecasting.
A regression relationship may involve one predicted or dependent variable and one independent variable
under simple regression, or it may involve relationships between the variable to be forecast and several
independent variables under multiple regression.
Modern econometric methods
Econometric techniques, which originated in the eighteenth century, have recently gained popularity for
forecasting. Econometrics refers to the application of mathematical economic theories and statistical
procedures to economic data to verify economic theorems. Models take the form of a set of simultaneous
equations. The values of the constants in such equations are supplied by a study of statistical time series,
and a large number of equations may be necessary to produce an adequate model.

Exponential smoothing method


This method is often regarded as one of the best methods of business forecasting for short horizons.
Exponential smoothing is a special kind of weighted average in which the weights decline exponentially
with the age of the observation, so that the most recent data carry the greatest weight. It is found
extremely useful in short-term forecasting of inventories and sales.
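
A minimal sketch of simple exponential smoothing is given below. The sales figures and the smoothing constant alpha = 0.5 are hypothetical and are chosen only to illustrate the recurrence S_t = alpha × y_t + (1 − alpha) × S_(t−1), with the first smoothed value set equal to the first observation (one common convention, assumed here).

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed series S_t = alpha*y_t + (1 - alpha)*S_(t-1), with S_0 = y_0."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = [series[0]]
    for value in series[1:]:
        # The newest observation gets weight alpha; older ones decay geometrically
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


sales = [100, 110, 105, 120]          # hypothetical monthly sales
smoothed = exponential_smoothing(sales, alpha=0.5)
forecast_next = smoothed[-1]          # one-step-ahead forecast

print(smoothed)
print(forecast_next)
```

The last smoothed value serves as the forecast for the next period. A larger alpha makes the forecast respond faster to recent changes, and a smaller alpha gives a smoother series.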
Theories of Business forecasting

There are many theories that are commonly followed in business forecasting.
In the theory of economic rhythm, the available historical data are analyzed into their components, i.e.
trend, seasonal, cyclical and irregular variations. The propounders of the theory held that economic
phenomena behave in a rhythmic manner and that cycles of nearly the same intensity and duration tend to recur.
The secular trend obtained from historical data is projected a number of years into the future on a graph or
with the help of a mathematical trend equation. If the phenomenon is cyclical in behavior, the trend should
be adjusted for cyclical movements. When the forecast for a year is to be split into months or quarters,
the forecaster should also adjust the projected figure for seasonal variations with the help of seasonal
indices.
The action and reaction theory is based on Newton’s third law of motion, i.e. for every action there is an
equal and opposite reaction. Applied to business forecasting, it implies that if there is a
depression in a particular field of business, there is bound to be a boom in it sooner or later. It reminds us
of the business cycle, which has four phases, i.e. prosperity, decline, depression and recovery.
This theory regards certain levels of business activity as normal, and the forecaster has to estimate the
normal level carefully. According to this theory, if the price of a commodity rises above the normal level, it
must later fall below the normal level because of the increased production and supply of that
commodity.
The sequence theory, or time lag method, is based on the behavior of different businesses that show
similar movements occurring successively but not simultaneously. The method therefore takes the time lag
into account, based on the theory of lead-lag relationships, which holds good in most cases. The series that
usually change earlier serve as a forecast for other related series. In this way the element of risk is
considerably reduced.
Q12 Construct Fisher’s Ideal Index for the given information and check whether Fisher’s formula
satisfies Time Reversal and Factor Reversal Tests.

Items    P0    Q0    P1    Q1
A        16    5     20    6
B        12    10    18    12
C        14    8     16    10
D        20    6     22    10
E        80    3     90    5
F        40    2     50    5
Formula of Fishers Ideal Index
Computation of Fisher’s Ideal Index
Fisher’s formula satisfies Time Reversal Test
Fisher’s formula satisfies Factor Reversal Test
Answer.
Formula of Fishers Ideal Index

This method is a combination of Laspeyres’ and Paasche’s methods. The geometric mean of
Laspeyres’ index and Paasche’s index gives the index suggested by Fisher. Fisher’s index number is given
by:

\[ P_{01} = \sqrt{L \times P} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}} \]

Where,
L = ∑p1q0 / ∑p0q0 is Laspeyres’ price index and P = ∑p1q1 / ∑p0q1 is Paasche’s price index.

Computation of Fisher’s Ideal Index

\[ L \times P = \frac{20\times5 + 18\times10 + 16\times8 + 22\times6 + 90\times3 + 50\times2}{16\times5 + 12\times10 + 14\times8 + 20\times6 + 80\times3 + 40\times2} \times \frac{20\times6 + 18\times12 + 16\times10 + 22\times10 + 90\times5 + 50\times5}{16\times6 + 12\times12 + 14\times10 + 20\times10 + 80\times5 + 40\times5} \]

\[ = \frac{100 + 180 + 128 + 132 + 270 + 100}{80 + 120 + 112 + 120 + 240 + 80} \times \frac{120 + 216 + 160 + 220 + 450 + 250}{96 + 144 + 140 + 200 + 400 + 200} \]

\[ = \frac{910}{752} \times \frac{1416}{1180} = \frac{1288560}{887360} = 1.45212766 \]

\[ P_{01} = \sqrt{1.45212766} \approx 1.2050 \]

Expressed as a percentage of the base period, Fisher’s Ideal Index ≈ 120.50 Answer

Fisher’s formula satisfies Time Reversal Test

Time reversal test


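This test requires that the formula for calculating an index number give the same ratio between one
period of comparison and the other, whichever of the two is taken as the base. Symbolically,
P01 × P10 = 1, where P01 is the index for the current period on the base period and P10 is the index for
the base period on the current period (both expressed as ratios, not percentages).
For the given data, P10 = √[(1180/1416) × (752/910)], so
P01 × P10 = √[(910/752) × (1416/1180) × (1180/1416) × (752/910)] = 1, and the test is satisfied.
This test is satisfied by Fisher’s ideal index, the simple geometric mean of price relatives, the weighted
geometric mean of price relatives and the Marshall-Edgeworth index number.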

Fisher’s formula satisfies Factor Reversal Test

Factor reversal test


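The formula should permit the interchange of prices and quantities without giving inconsistent results.
Symbolically, P01 × Q01 = ∑p1q1 / ∑p0q0, where Q01 is the quantity index obtained by interchanging p and q
in the price index formula.
For the given data, Q01 = √[(1180/752) × (1416/910)], so
P01 × Q01 = √[(910/752) × (1416/1180) × (1180/752) × (1416/910)] = 1416/752 = ∑p1q1 / ∑p0q0, and the test
is satisfied.

This test is satisfied by Fisher’s ideal index.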
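
The index and both reversal tests can also be checked numerically. The sketch below uses the price and quantity table from the question; the helper name `agg` and the variable names are illustrative, and the indices are kept as ratios (not multiplied by 100) so that the tests can be stated directly.

```python
from math import sqrt, isclose

# (p0, q0, p1, q1) for items A to F, taken from the table in the question
items = [
    (16, 5, 20, 6),
    (12, 10, 18, 12),
    (14, 8, 16, 10),
    (20, 6, 22, 10),
    (80, 3, 90, 5),
    (40, 2, 50, 5),
]


def agg(price_col, qty_col):
    """Sum of price x quantity over all items for the chosen columns."""
    return sum(row[price_col] * row[qty_col] for row in items)


p0q0, p0q1 = agg(0, 1), agg(0, 3)
p1q0, p1q1 = agg(2, 1), agg(2, 3)

p01 = sqrt((p1q0 / p0q0) * (p1q1 / p0q1))   # Fisher price index, current on base
p10 = sqrt((p0q1 / p1q1) * (p0q0 / p1q0))   # same formula with the periods interchanged
q01 = sqrt((p0q1 / p0q0) * (p1q1 / p1q0))   # same formula with p and q interchanged

print("Fisher index:", round(p01 * 100, 2))
print("Time reversal test holds:", isclose(p01 * p10, 1.0))
print("Factor reversal test holds:", isclose(p01 * q01, p1q1 / p0q0))
```

Both checks print True, which agrees with the algebra: the cross terms cancel under the square root in each product.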


Ques.13. A statistical survey is a scientific process of collection and analysis of numerical
data. Explain the stages of statistical survey. Describe the various methods for collecting
data in a statistical survey.
Ans:- Definition of Statistical Survey:- A statistical survey is a scientific process of collection and
analysis of numerical data. Statistical surveys are used to collect information about units in a
population, and they involve asking questions of individuals. Surveys of human
populations are common in government, health, social science and marketing sectors.

Stages of Statistical Survey


Statistical surveys involve two stages namely – Planning and Execution.
Figure shows the two broad stages of Statistical Survey.

Stages of Statistical Survey


Planning a Statistical Survey
The relevance and accuracy of data obtained in a survey depends upon the care taken in
planning. A properly planned investigation can lead to the best results with least cost and time.
Figure gives the explanation of steps involved in the planning stage.
Fig. Steps Involved in Planning of a Statistical Survey

Execution of statistical survey


Controlled methods should be adopted at every stage of carrying out the investigation to check
the accuracy, coverage, methods of measurements, analysis and interpretation.
The collected data should be edited, classified, tabulated and presented in the form of diagrams
and graphs. The data should be carefully and systematically analyzed and interpreted.

Collection of primary data is done by a suitable method as per the following:


1. Direct personal observation
2. Indirect oral interview
3. Information through agencies
4. Information through mailed questionnaires
5. Information through a schedule filled by investigators
Each of these methods is described below:
1. Direct Personal Observation – In the direct personal observation method, as illustrated in
figure 2.5, the investigator collects data by having direct contact with the units of investigation.
The accuracy of data depends upon the ability, training and attitude of the investigator. The
direct personal observation method is suitable where:
• The scope of investigation is narrow
• Investigation is confidential and requires personal attention of the investigator
• Accuracy of data is important
2. Indirect oral interview – Indirect oral interview is used when the area to be covered is large.
The investigator collects the data from a third party or a witness or the head of an institution.
This method is generally used by the police department in cases related to enquiries on the cause
of fires, thefts or murders. In this method, the investigator contacts witnesses or neighbors or
friends or some other third parties who are capable of supplying the necessary information. Enquiry
committees appointed by governments use this method to get people’s views and every possible detail
regarding the enquiry. This method suits best when direct sources do not exist or cannot be relied
upon or would be unwilling to take part in the survey.

3. Collecting information through agencies – The method of collecting information through local
agencies or correspondents is generally adopted by newspapers and television channels. Local
agents are appointed in different parts of the area under investigation, and they send the desired
information at regular intervals. This method is illustrated in the figure. It is used where the
area to be covered is very large and periodic information is required. However, one disadvantage
of this method is that the information is likely to be biased.

4. Information collected through mailed questionnaires – Often, information is collected
through questionnaires. The questionnaires contain questions pertaining to the
investigation. They are sent to the respondents with a covering letter soliciting cooperation from
the respondents (respondents are the people who respond to questions in the questionnaire). The
respondents are asked to give correct information and to mail the questionnaire back. The
objectives of the investigation are explained in the covering letter along with the assurance to
keep the information confidential.

Ques.14a) Explain the approaches to define probability.


b) In a bolt factory machines A, B, C manufacture 25, 35 and 40 percent of the total output.
Of their total output 5, 4 and 2 percent are defective respectively. A bolt is drawn at
random and is found to be defective. What are the probabilities that it was manufactured
by machines A, B and C?

a) Explanation of the approaches to define probability

Ans:- In some areas, such as mathematics or logic, results of some process can be known with
certainty (e.g., 2+3=5). Most real life situations, however, involve variability and uncertainty.
For example, it is uncertain whether it will rain tomorrow; the price of a given stock a week from
today is uncertain; the number of claims that a car insurance policy holder will make
over a one-year period is uncertain. Uncertainty or "randomness" (meaning variability of results)
is usually due to some mixture of two factors: (1) variability in populations consisting of animate
or inanimate objects (e.g., people vary in size, weight, blood type etc.), and (2) variability in
processes or phenomena (e.g., the random selection of 6 numbers from 49 in a lottery draw can
lead to a very large number of different outcomes; stock or currency prices fluctuate substantially
over time).

Variability and uncertainty make it more difficult to plan or to make decisions. Although they
cannot usually be eliminated, it is however possible to describe and to deal with variability and
uncertainty, by using the theory of probability. This course develops both the theory and
applications of probability.

b) Applying Bayes theorem and calculating the probabilities

Ans:-

Bayes’ theorem states that if A1, A2, …, An are ‘n’ mutually exclusive and exhaustive
events with prior probabilities P(A1), P(A2), …, P(An) respectively, and ‘B’ is an event for which the
conditional probabilities of occurrence of B given A1, B given A2, …, B given An
are P(B/A1), P(B/A2), …, P(B/An) respectively, then the posterior probability of occurrence of Ai
given that ‘B’ has already occurred is given by:

\[ P(A_i/B) = \frac{P(A_i)\,P(B/A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B/A_j)}, \qquad i = 1, 2, \dots, n \]
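
For the bolt factory in part (b), each posterior probability is the machine's share of output multiplied by its defect rate, divided by the total probability of drawing a defective bolt. The sketch below works this out with exact fractions; the dictionary names are illustrative and the figures are those given in the question.

```python
from fractions import Fraction

# Share of total output (prior) and defect rate for each machine, from the question
prior = {"A": Fraction(25, 100), "B": Fraction(35, 100), "C": Fraction(40, 100)}
defect_rate = {"A": Fraction(5, 100), "B": Fraction(4, 100), "C": Fraction(2, 100)}

# Joint probability that a bolt comes from machine m AND is defective
joint = {m: prior[m] * defect_rate[m] for m in prior}

# Total probability of drawing a defective bolt
p_defective = sum(joint.values())

# Posterior probability of each machine, given that the bolt is defective
posterior = {m: joint[m] / p_defective for m in prior}

for machine, p in posterior.items():
    print(machine, p, round(float(p), 4))
```

The total probability of a defective bolt is 0.0125 + 0.0140 + 0.0080 = 0.0345, so the posterior probabilities for machines A, B and C are 25/69, 28/69 and 16/69, that is, approximately 0.3623, 0.4058 and 0.2319.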
Ques.15 a). The procedure of testing hypothesis requires a researcher to adopt several
steps. Describe in brief all such steps.
b) A sample of 400 items is taken from a normal population whose mean as well as variance
is 4. If the sample mean is 4.5, can the sample be regarded as a truly random sample?

a) Hypothesis testing procedure

Ans:-

Procedure for Hypothesis Testing

To test a hypothesis means to tell (on the basis of the data the researcher has collected) whether
or not the hypothesis seems to be valid. In hypothesis testing the main question is whether or not to
accept the null hypothesis. The procedure for hypothesis testing refers to all those steps that we
undertake for making a choice between the two actions, i.e., rejection and acceptance of a null
hypothesis.

The various steps involved in hypothesis testing are stated below:

(i) Making a formal statement: This step consists in making a formal statement of the null
hypothesis (H0) and also of the alternative hypothesis (Ha). This means that hypotheses should be
clearly stated, considering the nature of the research problem. For instance, Mr. Mohan of the
Civil Engineering Department wants to test the load bearing capacity of an old bridge, which
must be more than 10 tons. In that case he can state his hypotheses as under:

Null Hypothesis H0: µ = 10 tons

Alternative Hypothesis Ha: µ > 10 tons

Take another example. The average score in an aptitude test administered at the national level is
80. To evaluate a state’s education system, 100 of the state’s students were selected on a random
basis, and their average score was 75. The state wants to know if there is a significant difference
between the local scores and the national scores. In such a situation the hypotheses may be stated
as under:

Null Hypothesis H0: µ = 80

Alternative Hypothesis Ha: µ ≠ 80

The formulation of hypotheses is an important step, which must be accomplished with due care
in accordance with the object and nature of the problem under consideration. It also indicates
whether we should use a one-tailed test or a two-tailed test. If Ha is of the type greater than (or of
the type lesser than), we use a one-tailed test, but when Ha is of the type “whether greater or
smaller”, we use a two-tailed test.

(ii) Selecting a significance level: The hypotheses are tested at a pre-determined level of
significance, which should be specified in advance. Generally, in practice, either the 5% level or
the 1% level is adopted for the purpose. The factors that affect the level of significance are:

(a) the magnitude of the difference between sample means

(b) the size of the samples

(c) the variability of measurements within samples

(d) whether the hypothesis is directional or non-directional (a directional hypothesis is one
which predicts the direction of the difference between, say, means).

In brief, the level of significance must be adequate in the context of the purpose and nature of
enquiry.

Procedure in Hypothesis testing:


b) Calculation and solution to the problem


Ans:-

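H0: µ = 4, H1: µ ≠ 4

Here the population variance is 4, so σ = 2, and n = 400, so √n = 20.

\[ Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{4.5 - 4}{2/20} = \frac{0.5}{0.1} = 5 \]

Note: Since the sample size is large, the normal test is applicable.

Since the calculated value of Z (5) is greater than even the 1% tabulated value of Z, i.e. 2.58, the
null hypothesis is rejected. The sample cannot be regarded as a truly random sample.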
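
The same test can be sketched in Python. The figures are those of the problem; the two-tailed 1% critical value is obtained from the standard normal distribution in the standard library instead of a printed table, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 4             # hypothesised population mean
variance = 4        # population variance, so sigma = 2
n = 400             # sample size
sample_mean = 4.5

sigma = sqrt(variance)
standard_error = sigma / sqrt(n)             # 2 / 20 = 0.1
z = (sample_mean - mu0) / standard_error     # 0.5 / 0.1

# Two-tailed critical value at the 1% level of significance (about 2.58)
z_critical = NormalDist().inv_cdf(1 - 0.01 / 2)

reject_h0 = abs(z) > z_critical
print(round(z, 2), round(z_critical, 2), reject_h0)
```

Because |Z| = 5 exceeds the critical value, the sketch reaches the same conclusion as the manual calculation: H0 is rejected.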

Ques.16. a) What is a Chi-square test? Point out its applications. Under what
conditions is this test applicable?
b) What are the components of time series? Enumerate the methods of determining
trend in time series.

Ans:-

Meaning of Chi-square test

The Chi-square test is one of the most commonly used non-parametric tests in statistical work.
The Greek letter χ² is used to denote this test. χ² describes the magnitude of the discrepancy
between the observed and the expected frequencies. The value of χ² is calculated as:

\[ \chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \]

Where, O1, O2, O3, …, On are the observed frequencies and E1, E2, E3, …, En are the
corresponding expected or theoretical frequencies.

In the test for independence, the null hypothesis is that the row and column variables are
independent of each other. We have studied earlier that hypothesis testing is done under the
assumption that the null hypothesis is true.
The following are the properties of the test for independence:
• The data are the observed frequencies.
• The data are arranged in the form of a contingency table.
• The degrees of freedom ‘ν’ can be calculated as
ν = (Number of rows − 1) × (Number of columns − 1).
• The test for independence has a Chi-square distribution and is always a right-tail test.
• The expected value is computed by taking the row total, multiplying it by the column total and
dividing by the grand total. That is given by:

E = (Row Total × Column Total) / Grand Total
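
The steps of the test for independence can be sketched as follows. The 2 × 2 contingency table is hypothetical and serves only to show how expected frequencies, the χ² statistic and the degrees of freedom are obtained.

```python
# Hypothetical 2 x 2 contingency table of observed frequencies
# (rows: two groups, columns: two responses); the figures are illustrative only.
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of each cell = row total x column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(expected, chi_square, degrees_of_freedom)
```

For this table every expected frequency is 25, χ² = 4 and ν = 1. The 5% table value of χ² for one degree of freedom is 3.841, so the hypothesis of independence would be rejected for these illustrative figures.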

Condition
The following are the conditions for using the Chi-Square test:
1. The frequencies used in Chi-Square test must be absolute and not in relative terms.
2. The total number of observations collected for this test must be large.
3. Each of the observations which make up the sample of this test must be independent of each
other.
4. As the χ² test is based wholly on sample data, no assumption is made concerning the population
distribution. In other words, it is a non-parametric test.

Components of a time series

Any time series can contain some or all of the following components:
1. Trend (T)
2. Cyclical (C)
3. Seasonal (S)
4. Irregular (I)
These components may be combined in different ways. It is usually assumed that they are
multiplied or added, i.e.,
yt = T × C × S × I
yt = T + C + S + I
To correct for the trend in the first case one divides the first expression by the trend (T). In the
second case it is subtracted.
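
The two ways of correcting for the trend can be illustrated with a small sketch. Both the observed values and the trend values are hypothetical; in practice the trend would first have to be estimated from the data.

```python
# Hypothetical quarterly observations and a hypothetical trend estimate;
# neither series comes from the text.
observed = [102.0, 115.5, 99.0, 126.0]
trend = [100.0, 105.0, 110.0, 120.0]

# Multiplicative model y = T x C x S x I: divide by the trend
ratio_to_trend = [y / t for y, t in zip(observed, trend)]

# Additive model y = T + C + S + I: subtract the trend
deviation_from_trend = [y - t for y, t in zip(observed, trend)]

print([round(v, 2) for v in ratio_to_trend])
print(deviation_from_trend)
```

Under the multiplicative model the detrended values are ratios around 1, whereas under the additive model they are deviations around 0 in the units of the original series.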
Trend component
The trend is the long-term pattern of a time series. A trend can be positive or negative depending
on whether the time series exhibits an increasing long-term pattern or a decreasing long-term
pattern. If a time series does not show an increasing or decreasing pattern, then the series is
stationary in the mean.
Cyclical component
Any pattern showing an up and down movement around a given trend is identified as a cyclical
pattern. The duration of a cycle depends on the type of business or industry being analyzed.
Seasonal component
Seasonality occurs when the time series exhibits regular fluctuations during the same month (or
months) every year, or during the same quarter every year. For instance, retail sales peak during
the month of December.
Irregular component
This component is unpredictable. Every time series has some unpredictable component that
makes it a random variable. In prediction, the objective is to “model” all the components to the
point that the only component that remains unexplained is the random component.

Ques.17 What do you mean by cost of living index? Discuss the methods of
construction of cost of living index with an example for each.
Ans:-

Meaning of cost of living index

Cost of living is the cost of maintaining a certain standard of living. A cost-of-living index is
a price index that measures relative cost of living over time. Such indexes are constructed
to have a value of 100 in a given year (or period or place), called the base. An index value of
110 indicates that the current cost of living is ten percent higher than in the base year.
Because the index provides measure of the change in the cost of living, it has no units.

Methods of constructing cost of living index with an example for each

The construction of the price index numbers involves the following steps or problems:

1. Selection of Base Year:

The first step or problem in preparing index numbers is the selection of the base year. The
base year is defined as that year with reference to which the price changes in other years are
compared and expressed as percentages. The base year should be a normal year. In other words, it
should be free from abnormal conditions like wars, famines, floods, political instability, etc.

Base year can be selected in two ways:

(a) through fixed base method in which the base year remains fixed; and

(b) through chain base method in which the base year goes on changing, e.g., for 1980 the base
year will be 1979, for 1979 it will be 1978, and so on.
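
The difference between the two methods can be seen in a small sketch. The prices below are hypothetical average prices for four consecutive years; only the logic of the comparison follows the description above.

```python
# Hypothetical average prices for four consecutive years (not from the text)
prices = {1977: 50.0, 1978: 55.0, 1979: 66.0, 1980: 82.5}
years = sorted(prices)

# Fixed base method: every year is compared with the same base year
base_year = years[0]
fixed_base = {y: prices[y] * 100 / prices[base_year] for y in years}

# Chain base method: every year is compared with the year before it,
# e.g. 1980 on 1979 and 1979 on 1978
chain_base = {
    y: prices[y] * 100 / prices[prev]
    for prev, y in zip(years, years[1:])
}

print(fixed_base)
print(chain_base)
```

With these figures the fixed base indices (base 1977) are 100, 110, 132 and 165, while the chain base indices for 1978, 1979 and 1980 are 110, 120 and 125, each measured against the immediately preceding year.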

2. Selection of Commodities:

The second problem in the construction of index numbers is the selection of the commodities.
Since all commodities cannot be included, only representative commodities should be selected
keeping in view the purpose and type of the index number.

In selecting items, the following points are to be kept in mind:

(a) The items should be representative of the tastes, habits and customs of the people.

(b) Items should be recognizable.

(c) Items should be stable in quality over two different periods and places.

(d) The economic and social importance of various items should be considered.

(e) The items should be fairly large in number.

(f) All those varieties of a commodity which are in common use and are stable in character should
be included.

3. Collection of Prices:

After selecting the commodities, the next problem is the collection of their prices. The questions
to be settled are:

(a) from where the prices are to be collected;

(b) whether to choose wholesale prices or retail prices;

(c) whether to include taxes in the prices or not, etc.

While collecting prices the following points are to be noted:

(a) Prices are to be collected from those places where a particular commodity is traded in large
quantities.

(b) Published information regarding the prices should also be utilized.

(c) In selecting individuals and institutions who would supply price quotations, care should be
taken that they are not biased.

(d) Selection of wholesale or retail prices depends upon the type of index number to be prepared.
Wholesale prices are used in the construction of a general price index and retail prices are used in
the construction of a cost-of-living index number.

(e) Prices collected from various places should be averaged.
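
Once the base year, commodities and prices have been fixed, the index itself is a simple computation. The sketch below illustrates two commonly taught methods, the aggregate expenditure method and the family budget method. The basket, prices and quantities are hypothetical and are not taken from the text.

```python
# Hypothetical consumption basket: (item, base price p0, base quantity q0, current price p1)
basket = [
    ("Food", 20.0, 10, 25.0),
    ("Clothing", 50.0, 2, 60.0),
    ("Fuel", 10.0, 5, 12.0),
    ("Rent", 100.0, 1, 110.0),
]

# Aggregate expenditure method: cost of the base-year basket at current prices
# relative to its cost at base-year prices
current_cost = sum(p1 * q0 for _, p0, q0, p1 in basket)
base_cost = sum(p0 * q0 for _, p0, q0, p1 in basket)
index_aggregate = current_cost / base_cost * 100

# Family budget method: weighted average of price relatives,
# with base-year expenditure (p0 x q0) as the weight of each item
weighted_sum = sum((p1 / p0 * 100) * (p0 * q0) for _, p0, q0, p1 in basket)
total_weight = sum(p0 * q0 for _, p0, q0, p1 in basket)
index_family_budget = weighted_sum / total_weight

print(round(index_aggregate, 2), round(index_family_budget, 2))
```

Both methods give an index of 120 for this basket, meaning that the cost of living is twenty percent higher than in the base year. The two results agree because the family budget weights are the base-year expenditures.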

Ques.18 a) What is analysis of variance? What are the assumptions of this technique?
b) Three samples below have been obtained from normal populations with equal
variances. Test the hypothesis at 5% level that the population means are equal.

A     B     C
8     7     12
10    5     9
7     10    13
14    9     12
11    9     14

[The table value of F at 5% level of significance for ν1 = 2 and ν2 = 12 is 3.88]

Ans:-

Meaning of Analysis of Variance


The term analysis of variance probably sounds familiar to you, especially if you have been
schooled in at least one quantitative methodology course or have been working in the field of
social sciences for some time. Analysis of variance (ANOVA), as the name implies, is a statistical
technique that is intended to analyze variability in data in order to infer the inequality among
population means. This may sound illogical, but there is more to this idea than just what the name
implies.
Assumptions

The results of a one-way ANOVA can be considered reliable as long as the following
assumptions are met:

• Response variables are normally distributed (or approximately normally distributed).
• Samples are independent.
• Variances of populations are equal.
• Responses for a given group are independent and identically distributed normal random
variables (not a simple random sample (SRS)).

Formulas/Calculation/Solution to the problem

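Let H0: There is no significant difference in the means of the three samples.

Table: The three samples

         A     B     C
         8     7     12
         10    5     9
         7     10    13
         14    9     12
         11    9     14
Total    50    40    60

T = Sum of all observations = 50 + 40 + 60 = 150

\[ \text{Correction factor} = \frac{T^2}{N} = \frac{150^2}{15} = 1500 \]

\[ SST\ (\text{Total Sum of Squares}) = \text{Sum of squares of all observations} - \frac{T^2}{N} = (8^2 + 7^2 + 12^2 + 10^2 + \dots + 14^2) - 1500 = 1600 - 1500 = 100 \]

Sum of squares between the columns (samples):

\[ SSC = \frac{50^2 + 40^2 + 60^2}{5} - \frac{T^2}{N} = 1540 - 1500 = 40 \]

Sum of squares of the error within columns (samples):
SSE = SST – SSC = 100 – 40 = 60

Variance between samples:

\[ MSC = \frac{SSC}{k - 1} = \frac{40}{2} = 20 \]

Variance within the samples:

\[ MSE = \frac{SSE}{n - k} = \frac{60}{12} = 5 \]

The degrees of freedom = (k – 1, n – k) = (2, 12).
[k is the number of columns and n is the total number of observations]

\[ F = \frac{MSC}{MSE} = \frac{20}{5} = 4 \]

Since the calculated value of F (4) is greater than the table value (3.88) at the 5% level of
significance, H0 is rejected: the population means are not all equal.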
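
The one-way ANOVA computation can be reproduced with a short sketch. The three samples and the table value of F are those given in the question; the variable names are illustrative.

```python
# The three samples from the question
samples = {
    "A": [8, 10, 7, 14, 11],
    "B": [7, 5, 10, 9, 9],
    "C": [12, 9, 13, 12, 14],
}

all_values = [v for values in samples.values() for v in values]
n = len(all_values)     # total number of observations
k = len(samples)        # number of samples (columns)

total = sum(all_values)
correction_factor = total ** 2 / n

# Total, between-samples and within-samples sums of squares
sst = sum(v * v for v in all_values) - correction_factor
ssc = sum(sum(vals) ** 2 / len(vals) for vals in samples.values()) - correction_factor
sse = sst - ssc

msc = ssc / (k - 1)     # variance between samples
mse = sse / (n - k)     # variance within samples
f_ratio = msc / mse

f_table = 3.88          # 5% table value for (2, 12) d.f., as given in the question
print(sst, ssc, sse, f_ratio, f_ratio > f_table)
```

The sketch gives SST = 100, SSC = 40, SSE = 60 and F = 4. Since 4 exceeds the table value of 3.88, the hypothesis of equal population means is rejected at the 5% level.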
MCQS
1. In which of the following situations would you like to use Statistics?
a) Buying a house
b) Purchasing medicine prescribed by a doctor
c) Investing funds in several options
d) Attending relatives marriages

Ans- c

2. Out of the following, which one does not refer to a mass of data?
a) Banking Statistics
b) Mathematical Statistics
c) Agricultural Statistics
d) Income Statistics

Ans- b

3. Which of the following statements is most appropriate?


a) Nature believed in statistics
b) Nature created statistics
c) Nature believed in variation
d) Nature believed in symmetrical variation

Ans- c

4. Which of the following statements is true?


a) Statistics enlarges physical vision
b) Statistics helps in estimation
c) Statistics quantifies uncertainty
Ans- b

5. The origin of statistics can be traced to


a) State
b) Commerce
c) Economics

Ans- a

6. According to the definition of Statistics given by Croxton and Cowden, what are the four
components of Statistics?

Ans- The four components of Statistics are collection, presentation, analysis and interpretation of
data.

7. ‘Statistics may be called the science of counting’ is the definition given by


a) Croxton
b) A.L.Bowley
c) Boddington
d) Webster

Ans- b

8. In the olden days statistics was confined only to _______.

Ans- State affairs

10. Answer the following:


a) Should the same degree of accuracy be applied while measuring the height of a mountain and the
height of a person?
b) Does Statistics deal with qualitative data?

Ans- a) No

b) No

11. Categorise the following data as qualitative or quantitative data


a) The number of transactions occurring in an ATM per day
b) The popular brand name in cars is Maruthi

Ans- a) Quantitative data


b) Qualitative data

12. The total sale of a product in Area A is 840 for 30 working days. The total sale of the same product in
Area B is 784 for 28 working days. Should Statistics be applied to get an appropriate picture regarding
the comparison of sales?

Ans- Yes

13. What are the main stages in a survey?

Ans- Planning and execution

14. Training of investigators belongs to which stage?

Ans- Planning

15. Analysis of data is a part of the execution of survey. Is this correct?

Ans- Yes

16. Classify the following as finite or infinite population.


i) Production of a product in a factory for a day
ii) Number of points in this page
iii) The set of rational numbers
iv) The weight of new born babies measured up to first decimal place in a state during the first
week of February 2008

Ans- i) Finite ii) Infinite iii) Infinite iv) Finite

17. Classify the following as an attribute or a variable.


i) Eye colour of human beings
ii) Number of pages in a book of various subjects
Ans- i) Attribute ii) Variable

18. Classify the following as discrete or continuous variable


i) Number of shares sold each day in a stock market.
ii) Temperatures recorded every half hour at a regional meteorological centre.

Ans- i) Discrete ii) Continuous

19. Statistics can best be considered as


i) both art and science
ii) Art
iii) science
iv) neither art nor science

Ans- i) both art & science

20. Data that possess numerical properties are known as


i) quantitative data
ii) qualitative data
iii) Primary data
iv) Parametric data

Ans- i) quantitative data

21. A tool of all science in research and making an intelligent judgement is


i) statistics
ii) collection
iii) data
iv) judgement

Ans- i) statistics

22. State True or False:


i) Census conducted by Government of India is an example of primary data.
ii) TV News Bulletins gather information on any event through their agents.
iii) Schedules make respondents record their answers.
iv) A covering letter to the questionnaire brings confidence in respondents.
v) Questions in questionnaire should be lengthy.

Ans- i) True ii) True iii) False iv) True v) False

23. State whether each of the following variables is qualitative or quantitative and indicate the
measurement scale that is appropriate for each.
i) Age
ii) Gender
iii) Class Rank
iv) Make of automobile
v) Number of people favouring the death penalty

Ans- i) Quantitative, ratio ii) Qualitative, nominal iii) Qualitative, ordinal iv) Qualitative, nominal
v) Quantitative, ratio

24. State whether each of the following variables is qualitative or quantitative and indicate the
measurement scale that is appropriate for each.
i) Annual sales
ii) Soft drink size (small, medium, large)
iii) Employee classification (GSI through GSIS)
iv) Earning per share
v) Methods of payments (cash, check, credit card)

Ans- i) Quantitative, ratio ii) Qualitative, nominal iii) Qualitative, ordinal iv) Quantitative, ratio v) Qualitative, nominal

25. Classification is a systematic __________ of the units according to their ____________ __________.

Ans- Grouping, common characteristics

26. Classification reduces _________ of the data.

Ans- Bulk

27. Classification of data that are non-measurable is known as ____ ___.

Ans- Attribute

28. Data arranged logically according to size is known as _________.


Ans- Series

29. Manifold classification involves more than _________ variables.

Ans- Two

30. Data arranged according to time of occurrence is known as _______.

Ans- Chronological classification

31. Geographical classification means classification of data according to:


i) Location ii) Time iii) Attributes iv) Class intervals

Ans- i) location

32. Classification is a process of arranging the data in:


i) Different columns ii) Different rows iii) Different rows and columns iv) Grouping of related facts
in different classes

Ans- iv) grouping of related facts in different classes

33. The data that can be classified on the basis of time is:
i) Geographical ii) Chronological iii) Qualitative iv) Quantitative

Ans- ii) chronological

34. When the collected data is grouped with reference to time, we have:

a) Quantitative classification b) Qualitative classification
c) Geographical classification d) Chronological classification

Solution – Chronological classification

35. Most quantitative classifications are:

a) Chronological b) Geographical
c) Frequency distribution d) None of these

Solution – Frequency distribution

36. Caption stands for:

a) Numerical information b) The column headings
c) The row headings d) The table headings

Solution – The column headings

37. A simple table contains data on:

a) Two characteristics b) Several characteristics
c) One characteristic d) Three characteristics

Solution – One characteristic

38. The headings of the rows given in the first column of a table are called:
a) Stubs b) Captions
c) Titles d) Reference notes

Solution – Stubs

39. Geographical classification means classification of data according to _______.

Solution – Place

40. The data recorded according to standard of education like illiterate, primary, secondary,
graduate, technical, etc., will be known as _______ classification.

Solution – Qualitative

41. An arrangement of data into rows and columns is known as _______.

Solution – Tabulation

42. Tabulation follows ______.

Solution – Classification

43. State True or False


i. Tabulation presents the data in a minimum space.
ii. Tabulation is a process of analysis
iii. General purpose table deals with specific objectives.
iv. Derived tables deal with total, percentages, ratios, etc
v. Row of a table is represented by the vertical arrangement of data.

Ans- i. True ii. False iii. False iv. True v. False

44. i) If the data readings are 3, 4, 5, 6, 7, then it is called _________ variable. Height is generally
__________ variable.
ii) There are ____________ derived frequency distributions for any frequency distribution.
iii) Width of class-interval is given by the difference between ________ and ______.
iv) There are ________ marginal distributions for a distribution.
v) __________ formula is used to calculate the number of class-intervals.
vi) The relative frequency distribution is obtained from frequency distribution by calculating
___________.

Ans- i. Discrete variable, Continuous variable
ii. Five
iii. Upper class limit and lower class limit
iv. Two
v. Sturges'
vi. f/N
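
As a rough illustration of Sturges' formula from item v, the sketch below computes the number of class intervals k = 1 + 3.322 log10(N) and the resulting class width Range/k. Rounding k up to a whole number and the sample readings are assumptions made for illustration; they are not part of the question.

```python
import math

def sturges_classes(n):
    """Number of class intervals by Sturges' rule, rounded up (assumed convention)."""
    return math.ceil(1 + 3.322 * math.log10(n))

def class_width(values):
    """Class-interval width = range / number of classes."""
    return (max(values) - min(values)) / sturges_classes(len(values))

# Hypothetical readings, for illustration only
readings = [3, 4, 5, 6, 7, 9, 12, 15, 18, 21]
print(sturges_classes(len(readings)))  # classes suggested for N = 10
print(class_width(readings))           # range 18 spread over those classes
```

Other rounding conventions for k are also used in practice, so the class count may differ by one from a textbook answer.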

45. i) Diagrams give an accurate value. (True/False)


ii) Pie diagram is drawn according to degree subtended at the centre of a circle. (True/False)
iii) Simple bar diagram is drawn for multiple characteristics. (True/False)

Ans- i) False ii. True iii. False

46. A graph in the form of steps is known as


i) Frequency ii) Frequency polygon iii) Pie iv) Histogram

Ans- iv) Histogram

47. The diagram which is used to show percentage breakdown is
i) A circle ii) A square iii) A pie iv) A rectangle

Ans- iii) Pie

48. A line graph indicates


i) Comparison ii) Variation iii) Range iv) All the above

Ans- iv) All the above

49. Which of the following is not a type of bar chart?


i) Multiple ii) Percentages iii) Subdivided iv) Ogive

Ans- iv) Ogive

50. State whether the following questions are ‘True’ or ‘False’.


i. For a given set of values if we add a constant 5 to every value, then the arithmetic mean is
affected.
ii. Arithmetic mean can be calculated for distribution with open-end classes.
iii. Arithmetic mean is affected by extreme values.
iv. Arithmetic mean of 12, 16, 23, 25, 28, 32 is 22.

Ans- i- T, ii- F, iii – T, iv – F
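
Statements i and iv can be checked numerically. The short sketch below, using only Python's standard library, shows that adding a constant to every value shifts the arithmetic mean by that constant, and that the mean of the six values in statement iv is not 22.

```python
from statistics import mean

values = [12, 16, 23, 25, 28, 32]
base = mean(values)                       # 136 / 6
shifted = mean([v + 5 for v in values])   # every value increased by 5

print(round(base, 2))      # 22.67, so statement iv is false
print(round(shifted, 2))   # 27.67, the mean moved by exactly 5
```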


51. Different methods give different averages which are known as the
i) measures of central tendency ii) statistics iii) measures of dispersion iv) skewness

Ans- i) measures of central tendency

52. (a) Find the Arithmetic mean 68,41,75,91,53,86,59


i) 67.57 ii) 47.57 iii) 37.57 iv) 27.57

(b) The average computed by considering the relative importance of each of values to the total
value, is called i) arithmetic mean ii) geometric mean iii) weighted arithmetic mean iv) harmonic
average.

Ans- (a) i) 67.57 (b). iii)
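
A minimal sketch of both parts: the simple arithmetic mean of the seven values in (a), and a weighted arithmetic mean as described in (b). The function names and the weights in the second call are hypothetical and chosen only for illustration.

```python
def arithmetic_mean(values):
    """Sum of the values divided by their count."""
    return sum(values) / len(values)

def weighted_mean(values, weights):
    """Each value contributes in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

data = [68, 41, 75, 91, 53, 86, 59]
print(round(arithmetic_mean(data), 2))   # 67.57

# Hypothetical values and weights, for illustration only
print(weighted_mean([60, 80], [1, 3]))   # 75.0
```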

53. State whether the following questions are true ‘T’ or false ‘F’.
i) Mode is based on all values
ii) Mode = 3 Median – Mean
iii) Geometric mean is used when we are interested in rate of growth of any phenomena.
iv) Harmonic mean exists if one of the values is zero.
v) A.M < G.M < H.M for any two values ‘a’ and ‘b’.
vi) Arithmetic mean can be calculated accurately even when the distribution has open-end class.
vii) Mode can be located graphically.
viii) Mode is used when data is on interval scale.

Ans- i – F, ii – F, iii – T, iv – F, v – F, vi – F, vii – T, viii – T

54. If the values of the variables are arranged in ascending order of magnitude, the middle term is
i) mean ii) mode iii) median iv) quartile

Ans- iii) median

55. In a symmetrical distribution the mean, median and mode


i) differ ii) coincide iii) mean-median = mode iv) differ by 0.5

Ans- ii) coincide

56. The relation between mean, median and mode is given by


i) Mode= 3 Median-2 Mean ii) Mode=2 Mean-Median iii) Mode= 3Median –Mean iv) Mode= Mean-
Median

Ans- i) Mode= 3 Median-2 Mean

57. The harmonic mean of 30 and 20 is


i) 25 ii) 24 iii) 20 iv) 30

Ans- ii) 24
58. If assumed mean A = 32.5, h = 8, Σfd = -13 and Σf = 90
i) mean = 35.31 ii) mean=31.35 iii) mean = 33.15 iv) mean=35.35

Ans- ii) mean=31.35
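
The step-deviation formula behind this answer is mean = A + (Σfd / Σf) × h. The sketch below evaluates it for the given figures, assuming the garbled symbols in the question stand for Σfd and Σf; the function name is hypothetical.

```python
def step_deviation_mean(assumed_mean, class_width, sum_fd, sum_f):
    """Mean by the step-deviation method: A + (sum(fd) / sum(f)) * h."""
    return assumed_mean + (sum_fd / sum_f) * class_width

m = step_deviation_mean(32.5, 8, -13, 90)
print(round(m, 2))  # 31.34
```

The computed value is about 31.34, so option ii (31.35) is the closest of the listed choices.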

59. In any distribution when the original items differ in size, the values of AM, GM and HM would also
differ in the following order
i) AM>GM>HM ii) AM=GM=HM iii) AM<HM<GM iv) AM.GM>HM

Ans- i) AM>GM>HM
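
The ordering can be seen on the two values from question 57. This sketch uses the `statistics` module from Python's standard library (version 3.8 or later is assumed for `geometric_mean`).

```python
from statistics import mean, geometric_mean, harmonic_mean

values = [30, 20]
am = mean(values)             # (30 + 20) / 2
gm = geometric_mean(values)   # sqrt(30 * 20)
hm = harmonic_mean(values)    # 2 / (1/30 + 1/20)

print(am, round(gm, 2), hm)   # AM is 25, GM is about 24.49, HM is 24
```

The harmonic mean of 24 matches the answer to question 57, and the three values fall in the order AM > GM > HM.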

60. State whether the following questions are ‘True’ or ‘False’.


i) Quartiles are positional value.
ii) Quartiles help us to find percentage of readings below or above a certain value.
iii) Q2 = P50 = D7 = Median

Ans- i – T, ii – T, iii – F

61. State whether the following questions are true, ‘T’ or false, ‘F’.
i) The cost of living index numbers calculated are based on weighted averages.
ii) Many of the items which we use in our life can be assigned weights.
Ans- i- T, ii – T

62. To which approach does the following probability estimates belong:


i. Probability that India will win the game
ii. Probability that Mr. Ram will resign from the post
iii. Probability of drawing a red card
iv. Probability that you will go to America this year

Ans- i) Relative frequency


ii) Subjective
iii) Classical
iv) Subjective

63. Find the probabilities in the following cases:


i. Getting an even number when a die is thrown
ii. Selecting two ‘y’ from the letters x, x, x, x, y, y, y
iii. Selecting a king and queen from a pack of cards, when two cards are drawn at a time
iv. Getting 53 Mondays in ordinary year
Ans- i) ½ ii) 1/7 iii) 8/663 iv) 1/7
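
The four answers can be reproduced by counting favourable and total cases. The sketch below is one way to do so with exact fractions; the variable names are hypothetical, and the last part assumes 1 January is equally likely to fall on any day of the week.

```python
from fractions import Fraction
from math import comb

# i. Even faces of a die: 2, 4, 6 out of 6
p_even = Fraction(3, 6)

# ii. Two 'y' chosen from 4 x's and 3 y's
p_two_y = Fraction(comb(3, 2), comb(7, 2))

# iii. One king and one queen when two cards are drawn together
p_king_queen = Fraction(4 * 4, comb(52, 2))

# iv. 53 Mondays in a 365-day year, by trying each possible weekday for 1 January
def mondays_in_year(start_day, days=365):
    # start_day: 0 = Monday, ..., 6 = Sunday
    return sum(1 for d in range(days) if (start_day + d) % 7 == 0)

p_53_mondays = Fraction(sum(mondays_in_year(s) == 53 for s in range(7)), 7)

print(p_even, p_two_y, p_king_queen, p_53_mondays)  # 1/2 1/7 8/663 1/7
```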

64. Given P(A) = 0.6, P(B) = 0.7, and P(A ∩ B) = 0.5, find P(A ∪ B).

Ans- 0.8

65. State whether the following questions are true or false:


i. Bayes’ probability estimates sample value
ii. Conditional probability can incorporate costs
iii. Bayes’ probability gives up to date information

Ans- i) False ii) False iii) True

66. State whether the following statements are true ‘T’ or false ‘F’.
i) The sum of probabilities sometimes will be greater than 1.
ii) The amount of time you study for an exam is a discrete random variable.
iii) The Bernoulli distribution has only one parameter ‘p’.

Ans- i – F, ii- F, iii- T

67. State whether the following statements are true or false.


i) Mean of binomial distribution is ‘npq’.
ii) ‘n’ and ‘p’ are the parameters of binomial distribution.
iii) If the mean and variance of a binomial distribution are 6 and 5, then p = 1/6.
iv) Each trial in a binomial experiment has the different probability of success, p.

Ans- i-F, ii- T, iii- T, iv- F
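
Statement iii follows from mean = np and variance = npq, so q = variance / mean. A small sketch of that step with exact fractions (the function name is hypothetical):

```python
from fractions import Fraction

def binomial_params(mean, variance):
    """Recover n and p from mean = n*p and variance = n*p*q."""
    q = Fraction(variance) / Fraction(mean)
    p = 1 - q
    n = Fraction(mean) / p
    return n, p

n, p = binomial_params(6, 5)
print(n, p)  # 36 1/6
```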

68. State whether the following statements are true ‘T’ or false ‘F’.
i) ‘X’ is a Poisson variate if p < 0.1 and n > 10
ii) Example of bimodal distribution is Poisson distribution

Ans- i- T, ii- T

69. State whether the following statements are true ‘T’ or false ‘F’.
i) Quartile deviation of normal distribution is 4/5 σ
ii) Mean and standard deviation of a standard normal distribution are ‘1’ and ‘0’
iii) Mean, median and mode coincide in a normal distribution

Ans- i- F, ii- F, iii- T


70. State whether the following statements are true ‘T’ or false ‘F’.
i) Population is aggregate of objects under study.
ii) Sampling method consume time and resources.
iii) Any summarised figure from population is known as statistics.
iv) We adopt sampling technique in our activities.
v) Population is a subset of sample.
vi) An unbiased sample gives an accurate prediction of characteristics of an entire population.
vii) The standard deviation of sampling distribution of a statistic is known as standard error of that
statistic.
viii) Standard error is used as a reliability measure.
ix) Faulty selection of sample contributes to sampling error.
x) Personal bias increases the non-sampling errors.
xi) Unbiased errors are cumulative in nature.
xii) Biased errors are also known as compensatory errors.

Ans- i- T, ii- F, iii- T, iv- T, v- F, vi- T, vii- T, viii - T, ix- T, x- T, xi- F, xii- F

71. State whether the following statements are true ‘T’ or false ‘F’.
i) Sample in which units are selected by judgment is known as probability sample.
ii) Judgment sampling does not give representativeness of a sample.
iii) Large sample size always results in minimising the standard error.
iv) A sampling plan that divides the population into well-defined groups from which random
samples are drawn is known as cluster sampling.
v) The principles of simple random sampling are the theoretical basis for statistical inference.
vi) If the mean of a certain population is 20, it is likely that most of the sample means will be 20.
vii) Any sampling distribution can be totally described by its mean and standard deviation.
ix) The central limit theorem assures that the sampling distribution of mean is always normal.
x) Stratified sampling is used when each group considered are more homogenous within itself and
heterogeneous between group.

Ans- i- F, ii- T, iii- T, iv- F, v- T, vi- F, vii- F, ix- T, x- T

72. Madhu, a frugal student, wants to buy a used bike. After randomly selecting 125 wanted
advertisements, he found the average price of the bike to be Rs. 3250 with a standard deviation of
Rs. 615. Establish an interval estimate for the average price of bike so that Madhu can be:
i) 68.3% certain that the population mean lies in this interval.
ii) 95.5% certain that the population mean lies in this interval.

Ans-
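
One way to work this out is sketched below, assuming the usual large-sample interval x̄ ± z·(s/√n), with z = 1 for 68.3% and z = 2 for 95.5%. The function name is hypothetical.

```python
from math import sqrt

def mean_interval(sample_mean, sample_sd, n, z):
    """Interval estimate: sample mean plus or minus z standard errors."""
    se = sample_sd / sqrt(n)   # standard error of the mean
    return sample_mean - z * se, sample_mean + z * se

low_68, high_68 = mean_interval(3250, 615, 125, 1)
low_95, high_95 = mean_interval(3250, 615, 125, 2)

print(round(low_68, 2), round(high_68, 2))  # 3194.99 3305.01
print(round(low_95, 2), round(high_95, 2))  # 3139.99 3360.01
```

Under these assumptions the standard error is about Rs. 55.01, giving roughly Rs. 3195 to Rs. 3305 for i) and Rs. 3140 to Rs. 3360 for ii).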
73. Given the following confidence levels, express the lower and upper limits of the confidence
interval for these levels in terms of x̄ and σ_x̄ (use the normal distribution tables).
i) 54 percent
ii) 75 percent
iii) 94 percent
iv) 98 percent

Ans-
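
The limits have the form x̄ ± z·σ_x̄, where z leaves half of the remaining probability in each tail. As a sketch, the z values can be obtained from `statistics.NormalDist` in Python's standard library instead of the printed tables; the function name is hypothetical.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """z such that the central area under the standard normal curve equals level."""
    return NormalDist().inv_cdf((1 + level) / 2)

for level in (0.54, 0.75, 0.94, 0.98):
    z = round(z_for_confidence(level), 2)
    print(f"{level:.0%}: x̄ - {z}σ_x̄ to x̄ + {z}σ_x̄")
```

This gives z of about 0.74, 1.15, 1.88 and 2.33 for the four levels.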

74. For the following sample sizes and confidence levels, find the approximate ‘t’ values for
constructing confidence intervals (use the ‘t’ table).
i) n = 28; 95%
ii) n = 8; 98%
iii) n = 13; 90%
iv) n = 25; 95%

Ans-. i) 2.052
ii) 2.998
iii) 1.782
iv) 2.064

75. i) Null hypothesis states that there is a significant difference between observed and hypothetical
values. (True/False)
ii) 1% level of significance means we are ready to reject a true hypothesis in 99% of cases.
(True/False)

iii) If the null hypothesis is H0: μ = μs or H0: p = ps or H0: μ1 = μ2 or H0: p1 = p2, then it is a two-tailed
test. (True/False)

iv) If the calculated value of a statistic is not in the rejection region R, then Ho is accepted.
(True/False)

v) 1 - β is called the power of the test. (True/False)

vi) If n1 = 300, n2 = 500, x̄1 = 50, x̄2 = 60, σ1 = 10, σ2 = 12 are the results of two samples taken from two cities
A and B, then we test for the difference between means under different populations. (True/False)
vii) If n < 30, then we do not apply z test unless, population S.D is known. (True/False)

Ans- i. False

ii. False

iii. True

iv. True
v. True

vi. True

vii. True

76. i) ‘t’ distribution is __________ probability distribution.

ii) ‘t’ distribution’s parameter is __________.

iii) ‘t’ distribution has ___________ areas at the tail than normal distribution.

iv) The mean and variance of the ‘t’ distribution are ________ and ________.

Ans- i. Continuous
ii. Degrees of freedom
iii. Larger
iv. Zero, greater than one

Qno - 1. The means of two samples of sizes 50 and 100 are 54.1 and 50.3 respectively, and their
standard deviations are 8 and 7 respectively. Obtain the SD for the combined group.

Answer -

Qno The mean wage is Rs. 75 per day, SD wage is Rs. 5 per day for a group of 1000 workers and the
same is Rs. 60 and Rs. 4.5 for the other group of 1500 workers. Find mean and standard deviation for
the entire group.
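
Both combined-group problems above use the same formulas: the combined mean is the size-weighted mean of the group means, and the combined variance is [n1(σ1² + d1²) + n2(σ2² + d2²)] / (n1 + n2), where d is each group mean's deviation from the combined mean. A sketch, with a hypothetical function name:

```python
from math import sqrt

def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Combined mean and SD of two groups from their sizes, means and SDs."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    var = (n1 * (sd1 ** 2 + (mean1 - mean) ** 2)
           + n2 * (sd2 ** 2 + (mean2 - mean) ** 2)) / n
    return mean, sqrt(var)

# Two samples of sizes 50 and 100
mean_a, sd_a = combine_groups(50, 54.1, 8, 100, 50.3, 7)
print(round(mean_a, 2), round(sd_a, 2))   # 51.57 7.56

# Wage groups of 1000 and 1500 workers
mean_b, sd_b = combine_groups(1000, 75, 5, 1500, 60, 4.5)
print(round(mean_b, 2), round(sd_b, 2))   # 66.0 8.73
```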
Qno The arithmetic means of the runs scored by three batsmen are 50, 48 and 12 respectively. The SDs of their
runs are 15, 12 and 2 respectively. Who is the most consistent of the three batsmen? If one of
these three is to be selected, who should be selected?

Evaluation Criteria
1. A lower CV indicates a more consistent player; hence the most consistent player is Player C.

2. Highest run scorer: x̄_A = 50 (Player A)
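
The comparison rests on the coefficient of variation, CV = (SD / mean) × 100. A short sketch with the three batsmen labelled A, B and C in the order given (the labels and variable names are assumptions):

```python
players = {"A": (50, 15), "B": (48, 12), "C": (12, 2)}   # (mean runs, SD)

# Coefficient of variation in percent for each player
cv = {name: 100 * sd / mean for name, (mean, sd) in players.items()}

most_consistent = min(cv, key=cv.get)                        # lowest CV
highest_scorer = max(players, key=lambda p: players[p][0])   # highest mean

print({name: round(value, 2) for name, value in cv.items()})
print(most_consistent, highest_scorer)  # C A
```

The CVs come to 30%, 25% and about 16.67%, so C is the most consistent while A has the highest average.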

Qno A student while computing the coefficient of variation obtained the mean and SD of 100
observations as 40 and 5.1 respectively. It was later discovered that he had wrongly copied an
observation as 50 instead of 40. Calculate the correct coefficient of variation.

Answer –
Qno. The mean and SD of 21 observations are 30 and 5 respectively. It was subsequently noted that one
of the observations, 10, was incorrect. Omit it and determine the mean and SD of the rest.
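
Both correction problems above can be handled by recovering Σx and Σx² from the reported mean and SD, adjusting them, and recomputing. The sketch below assumes the SD is the population form (divisor n); the function names are hypothetical.

```python
from math import sqrt

def to_sums(n, mean, sd):
    """Recover sum(x) and sum(x^2) from n, mean and population SD."""
    return n * mean, n * (sd ** 2 + mean ** 2)

def from_sums(n, total, total_sq):
    """Mean and population SD from n, sum(x) and sum(x^2)."""
    mean = total / n
    return mean, sqrt(total_sq / n - mean ** 2)

# Case 1: 50 was recorded instead of 40, so n stays at 100
total, total_sq = to_sums(100, 40, 5.1)
mean1, sd1 = from_sums(100, total - 50 + 40, total_sq - 50 ** 2 + 40 ** 2)
cv1 = 100 * sd1 / mean1
print(round(mean1, 2), round(sd1, 2), round(cv1, 2))   # 39.9 5.0 12.53

# Case 2: the observation 10 is omitted, so n drops from 21 to 20
total, total_sq = to_sums(21, 30, 5)
mean2, sd2 = from_sums(20, total - 10, total_sq - 10 ** 2)
print(round(mean2, 2), round(sd2, 2))                  # 31.0 2.29
```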

MB0040 MCQs
Q1. In a business context, managers are required to justify decisions on the basis of _______.

ANS - Data

Q2. Statistical concepts and statistical thinking enable them to:

• Solve problems in almost every domain
• ____________.
• Reduce guesswork

ANS - Support their decisions

Q3. Descriptive Statistics is used to present the __________which is summarised quantitatively.

ANS - General description of data

Q4.____________, defined Statistics as “The science of estimates and probabilities”.

ANS - Prof. Boddington

Q5. According to ______________, “Statistics is the science of collection, presentation, analysis and
interpretation of numerical data from logical analysis”.

ANS - Croxton and Cowden

Q6. Qualitative data deals with meanings while quantitative data deals with________.

ANS – Numbers.

Q7. Data is the facts and figures that are collected,_____________.

ANS - Analysed and interpreted

Q8. Elements are the entities on which _________.

ANS - Data are collected

Q9. Sample is a subset of the ___________.

ANS - Population

Q10. Statistics is the art and science of collecting, analysing, presenting and____________.

ANS - Interpreting data


Q11. Statistics is broadly categorised into two parts based on their functions, namely, ___________and
Inferential Statistics

ANS - Descriptive Statistics

Q12. In which of the following situations would you like to use Statistics?

a) Buying a house

b) Purchasing medicine prescribed by a doctor

c) Investing funds in several options

d) Attending relatives' marriages

ANS - c) Investing funds in several options

Q13. Out of the following, which one does not refer to a mass of data?

a) Banking Statistics

b) Mathematical Statistics

c) Agricultural Statistics

d) Income Statistics

Ans - b) Mathematical statistics

Q14. Which of the following statements is most appropriate?

a) Nature believed in statistics

b) Nature created statistics

c) Nature believed in variation

d) Nature believed in symmetrical variation

Ans - c) Nature believed in variation

Q15. Which of the following statements is true?

a) Statistics enlarges physical vision

b) Statistics helps in estimation

c) Statistics quantifies uncertainty


d) Statistics is of no use to humanity.

Ans - c) Statistics quantifies uncertainty

Q16.The origin of statistics can be traced to

a) State

b) Commerce

c) Economics

d) Industry

Ans - a) State

Q17._____________, “Statistics is a science which deals with the method of collecting, classifying,
presenting, comparing and interpreting the numerical data to throw light on enquiry”.

Ans - According to Seligman

Q18. According to the definition of Statistics given by Croxton and Cowden, what are the four
components of Statistics?

Ans - The four components of Statistics are collection, presentation, analysis and interpretation of data.

Qno19. ‘Statistics may be called the science of counting’ is the definition given by

a) Croxton

b) A.L.Bowley

c) Boddington

d) Webster

Ans - b) A. L. Bowley

Q20. In the olden days statistics was confined only to _______.

Ans - State affairs

Q21. Mention some other areas where there is a scope of applying statistics.

Ans - Industrial Quality control, Investment policies, to find market potential for a product.
Q22 Answer the following:

a) Should the same degree of accuracy be applied while measuring the height of a mountain and the
height of a person?

b) Does Statistics deal with qualitative data?

Ans - a) No

b) No

Q23. Categorise the following data as qualitative or quantitative data

a) The number of transactions occurring in an ATM per day

b) The popular brand name in cars is Maruthi

Ans – a) Quantitative data

b) Qualitative data

Q24. The total sale of a product in Area A is 840 for 30 working days. The total sale of the same product
in Area B is 784 for 28 working days. Should Statistics be applied to get an appropriate picture regarding
the comparison of sales?

Ans - Yes
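
The totals alone are not comparable because the two areas had different numbers of working days. A small sketch of the per-day averages, which is the kind of statistical summary the question has in mind (variable names are hypothetical):

```python
sales = {"Area A": (840, 30), "Area B": (784, 28)}   # (total sale, working days)

# Average sale per working day for each area
per_day = {area: total / days for area, (total, days) in sales.items()}
print(per_day)  # {'Area A': 28.0, 'Area B': 28.0}
```

Although Area A has the larger total, both areas average 28 units per working day.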

Q25. Statistical data is numerical data or quantitative data but not_________.

Ans - Qualitative data

Q26. Statistics is broadly divided into Descriptive and___________.

Ans - Inferential Statistics

Q27. We define the term ‘survey’ as a measurement procedure to gather______.

Ans - People’s opinions

Q28. Statistical surveys involve two stages namely – Planning and___________.

ANS – Execution

Q29. Sample can never be larger than the population from which the__________.
Ans - Sample was taken

Q30. A quantitative characteristic is a characteristic which is numerically expressed; otherwise it is
a ___________.

ANS - Qualitative characteristic

Q31. What is the meaning of an interval scale?

Ans - An interval scale is a scale of measurement where the distance between any two adjacent units of
measurement (or 'intervals') is the same but the zero point is arbitrary.

Q32. Data collected for the first time keeping in view the objective of the survey is_________.

Ans - Primary data

Qn33 Ratio variables are very similar_______.

Ans - To interval variables

Q34. A statistical survey is a scientific process of _____________.

Ans - Collection and analysis of numerical data

Q35. What are the main stages in a survey?

Ans - Planning and execution

Qno36. Training of investigators belongs to which stage?

ANS - Planning

Qno37. Analysis of data is a part of the execution of survey. Is this correct?

ANS - Yes

Qno38. Classify the following as finite or infinite population.

i) Production of a product in a factory for a day

ii) The set of rational numbers

iii) The weight of new born babies measured up to first decimal place in a state during the first week of
February 2008

Ans - i) Finite ii) Infinite iii) Finite


Qno39.Classify the following as an attribute or a variable. i) Eye colour of human beings ii) Number of
pages in a book of various subjects.

Ans – i) Attribute ii) Variable

Qno40. Classify the following as discrete or continuous variables.

i) Number of shares sold each day in a stock market.


ii) Temperatures recorded every half hour at a regional meteorological centre.

Ans - i) Discrete ii) Continuous

Q41. Statistics can best be considered as i) both Art and Science ii) Art iii) Science iv) neither Art nor
Science

ANS - i) both Art & Science

Q42. Data that possess numerical properties are known as i) Quantitative data ii) Qualitative data iii)
Primary data iv) Parametric data

ANS - i) Quantitative data

Q43. A tool of all science in research and making an intelligent judgement is i) Statistics ii) Collection iii)
Data iv) Judgement

ANS - i) Statistics

QN44. State whether the following data are Primary or Secondary.

i) An official of the Census Board of India is preparing a report on census of population based on the
survey data that is collected by the Census Board.

ii) An HR representative of a software company is deciding on the time taken to perform a particular job
on a project on the basis of random observations collected by him.

iii) A neurologist is examining the relationship between cigarette smoking and brain tumor based on the
data published in a famous neurology journal.

ANS – i) Primary data, ii) Primary data, iii) Secondary data

Q45. When population under investigation is infinite, we should use

i) sample method

ii) census method

iii) neither census nor sample method


iv) both a & b

ANS – Sample method

Q46. State True or False:

i) Census conducted by Government of India is an example of primary data.

ii) TV News Bulletins gather information on any event through their agents.

iii) Schedules make respondents record their answers.

iv) A covering letter to the questionnaire brings confidence in respondents.

v) Questions in questionnaire should be lengthy.

ANS - i) True ii) True iii) False iv) True v) False

Q47. State whether each of the following variables is qualitative or quantitative.

i) Age

ii) Gender

iii) Class Rank

iv) Number of people favouring the death penalty

ANS - i) Quantitative ii) Qualitative iii) Qualitative iv) Quantitative

Q48. State whether each of the following variables is qualitative or quantitative and indicate the
measurement scale that is appropriate for each.

i) Annual sales

ii) Soft drink size (small, medium, large)

iii) Employee classification (GSI through GSIS)

iv) Earning per share

v) Methods of payments (cash, check, credit card)

ANS - i) Quantitative, Ratio, ii) Qualitative, Nominal, iii) Qualitative, Ordinal, iv) Quantitative, Ratio, v)
Qualitative, Nominal
Q49. The Colgate-Palmolive Company started as a small soap and candle shop in __________ in 1806.

Ans - New York City

Q50. Three methods of classification are

One-way classification , Two-way classification, ______

Ans - Manifold classification

Q51. Cumulative frequency distribution is a frequency distribution that indicates ___________that lie
above or below the specified values of the class intervals.

Ans - Directly the number of units

Q52. Frequency distribution of more than two variables is known as _________.

Ans - Multivariate frequency distribution

Q53. A diagram is a visual form for presentation of ____________.

Ans - Statistical data

Q54. Histogram is graphical presentation of a____________.

Ans - Frequency distribution

Q55. Frequency distribution of more than two variables is known as____________.

ANS - Multivariate frequency distribution

Q56. Ogive is a graph of a_____________.

Ans - Cumulative distribution

Q57. Pie chart is a graphical device for depicting data summaries based on the _________into sector
that corresponds to the relative frequency for each class.

Ans - subdivision of a circle

Q58. Classification is a systematic __________ of the units according to their ____________
__________.

Ans - Grouping, common characteristics

Q59.Classification reduces _________ of the data.

Ans - Bulk

Q60. Classification of data that are non-measurable is known as _______.

Ans - Attributes
Q61. Classification done according to two attributes or variables is _________.

Ans - Two-Way Classification

Q62. Manifold classification involves more than _________ variables.

Ans - Two

Q63. Data arranged according to time of occurrence is known as _______.

Ans - Chronological classification

Q64. Geographical classification means classification of data according to: i) Location ii)
Time iii) Attributes iv) Class intervals

Ans - i) location

Q65. Classification is a process of arranging the data into: i) Different columns ii) Different rows
iii) Different rows and columns iv) Groups of related facts in different classes

Ans - iv) groups of related facts in different classes

Q66. The data that can be classified on the basis of time is: i) Geographical ii) Chronological
iii) Qualitative iv) Quantitative

Ans - ii) chronological

Q67. State True or False

i. Tabulation presents the data in a minimum space.
ii. Tabulation is a process of analysis.
iii. General purpose table deals with specific objectives.
iv. Derived tables deal with total, percentages, ratios, etc.
v. Row of a table is represented by the vertical arrangement of data.

Ans - i) True, ii) False, iii) False, iv) True, v) False

Q68. i) If the data readings are 3, 4, 5, 6, 7, then it is called _________ variable. Height is generally
__________ variable.
ii) There are ____________ derived frequency distributions for any frequency distribution.
iii) Width of class-interval is given by the difference between ________ and ______.
iv) There are ________ marginal distributions for a distribution.
v) __________ formula is used to calculate the number of class-intervals.
vi) The relative frequency distribution is obtained from frequency distribution by calculating
___________.

Ans - i) Discrete variable, Continuous variable

ii) Five

iii) Upper class limit and lower class limit.


iv) Two
v) Sturges'
vi) f/N

Q69. i) Diagrams give an accurate value. (True/False)

ii) Pie diagram is drawn according to degree subtended at the centre of a circle. (True/False)

iii) Simple bar diagram is drawn for multiple characteristics. (True/False)

Ans - i) False ii) True iii) False

Q70. The graph plotted in the form of series of rectangles is

i) Frequency ii) Frequency polygon iii) Pie iv) Histogram

Ans - iv) Histogram

Q71. The diagram which is used to show percentage breakdown is i) A circle
ii) A square iii) A pie diagram iv) A rectangle

Ans - A Pie diagram

Q72. A line graph indicates i) Comparison ii) Variation iii) Range

iv) All the above


Ans – All the above

Q73. Which of the following is not a type of bar chart?

i) Multiple ii) Percentages iii) Subdivided iv) Ogive

Ans - iv) Ogive

Q74. ‘Range’ represents the differences between the values of the___________.

Ans - Extremes

Q75. Arithmetic mean is defined as the sum of all values divided by_________.

Ans - Number of values

76. The measures of Dispersion are as follows:

i. Range (R)

ii. _________________

iii. Mean Deviation (M.D)

iv. Standard Deviation (S.D)

ANS - ii. Quartile Deviation (Q.D)

77. Arithmetic mean is the sum of observations divided by the number of _______.

Ans - Observations

78. Inter-quartile range is the difference between the third quartile and_________.

Ans - First quartile

Q79 Percentile values divide the distribution into 100 parts of__________.

Ans - Equal frequency


Q80. State whether the following questions are ‘True’ or ‘False’.

i. For a given set of values if we add a constant 5 to every value, then the arithmetic mean is
affected.
ii. Arithmetic mean can be calculated for distribution with open-end classes.
iii. Arithmetic mean is affected by extreme values.
iv. Arithmetic mean of 12, 16, 23, 25, 28, 32 is 22.

ANS - i) True, ii) False, iii) True, iv) False

Q81. A single value within the range of the entire mass of data that is used to represent the whole data
is i) Measures of Central tendency

ii) Statistics

iii) Measures of Dispersion

iv) Skewness

ANS - i) Measures of central tendency

Q82.

ANS –

83. (a) Find the Arithmetic mean 68,41,75,91,53,86,59

i) 67.57 ii) 47.57 iii) 37.57 iv) 27.57

(b) The average computed by considering the relative importance of each of values to the total value, is
called

i) Arithmetic mean ii) Geometric mean iii) Weighted arithmetic mean iv) Harmonic average.

ANS - (a) i) 67.57 (b). iii) weighted arithmetic mean


83. State whether the following questions are ‘True’ or ‘False’.

i) Mode is based on all values

ii) Mode = 3 Median – Mean

iii) Geometric mean is used when we are interested in rate of growth of any phenomena.

iv) Harmonic mean exists if one of the values is zero.

v) A.M < G.M < H.M for any two values ‘a’ and ‘b’.

vi) Arithmetic mean can be calculated accurately even when the distribution has open-end class.

vii) Mode can be located graphically.

viii) Mode is used when data is on interval scale.

ANS - i) False, ii) False, iii) True, iv) False, v) False, vi) False, vii) True, viii) True

84. If the values of the variables are arranged in ascending order of magnitude, the middle term is

i) mean ii) mode iii) median iv) quartile

ANS - median

7. In a symmetrical distribution the mean, median and mode

i) differ ii) coincide iii) mean-median = mode iv) differ by 0.5

ANS - ii) coincide

85. The relation between mean, median and mode is given by i) Mode= 3 Median-2 Mean ii) Mode=2
Mean-Median iii) Mode= 3Median –Mean iv) Mode= Mean- Median

ANS - i) Mode= 3 Median-2 Mean

86. The harmonic mean of 30 and 20 is

i) 25 ii) 24 iii) 20 iv) 30

ANS - ii) 24

87. If assumed mean A = 32.5, i = 8, Σfd = -13 and Σf = 90 i) mean = 35.31 ii) mean = 31.35 iii) mean =
33.15 iv) mean = 35.35

Ans - ii) mean=31.35

Q88. In any distribution when the original items differ in size, the value of Arithmetic mean (AM),
Geometric mean (GM) and Harmonic mean (HM) would also differ in the following order

i) AM>GM>HM

ii) AM=GM=HM
iii) AM<HM<GM

iv) AM.GM>HM

ANS - i) AM>GM>HM

Q89. State whether the following questions are ‘True’ or ‘False’. i) Quartiles are positional value. ii)
Quartiles help us to find percentage of readings below or above a certain value. iii) Q2 = P50 = D7 =
Median

ANS - i) True, ii) True, iii) False

Q90. State whether the following questions are ‘True’ or ‘False’. i) The cost of living index numbers
calculated are based on weighted averages. ii) Many of the items which we use in our life can be
assigned weights.

Ans - i) True, ii) True

Q91. State whether the following questions are ‘True’ or ‘False’

i. Standard deviation is based on all the values.

ii. Standard deviation of a set of values is increased if every value of the set is increased by a constant.

iii. Standard deviation can be calculated for distributions with open-end classes.

iv. Coefficient of variation can be used to compare the variability of two sets of data measuring the same
characteristics.

Ans - i) True, ii) False, iii) False, iv) True

Q92. In an interview conducted by a company, if the probability that an interviewed person is male is
2/3 and female is 1/3. Find the mean and variance of the distribution.

ANS - Solution: Let ‘X’ denote the gender of the interviewed person. X takes the value 1 if the
interviewed person is male and 0 if the person is female, with probabilities p = 2/3 and q = 1/3
respectively (i.e., p + q = 2/3 + 1/3 = 1). X therefore follows a Bernoulli distribution:

X       1     0
P(X)   2/3   1/3

The mean of the Bernoulli distribution is E(X) = p = 2/3, and

the variance is Var(X) = pq = 2/3 x 1/3 = 2/9.
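The same result follows from the definitions E(X) = Σ x P(x) and Var(X) = Σ (x - E(X))² P(x). A minimal sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

p = Fraction(2, 3)   # P(X = 1), interviewed person is male
q = 1 - p            # P(X = 0), interviewed person is female

distribution = {1: p, 0: q}

# Mean and variance computed directly from the definitions.
bernoulli_mean = sum(x * prob for x, prob in distribution.items())
bernoulli_variance = sum((x - bernoulli_mean) ** 2 * prob for x, prob in distribution.items())

print(bernoulli_mean, bernoulli_variance)
```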

Q93. State whether the following statements are ‘True’ or ‘False’.

i) The sum of probabilities sometimes will be greater than 1.

ii) The amount of time you study for an exam is a discrete random variable.

iii) The Bernoulli distribution has only one parameter ‘p’.

Ans - i) False, ii) False, iii) True

Q94. State whether the following statements are ‘True’ or ‘False’.

i) Mean of binomial distribution is ‘npq’.

ii) ‘n’ and ‘p’ are the parameters of Binomial distribution.

iii) If the mean and variance of a Binomial distribution are 6 and 5, then p = 1/6.

iv) Each trial in a binomial experiment has a different probability of success ‘p’.

Ans – i) False, ii) True, iii) True, iv) False
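Statement iii) can be verified from mean = np and variance = npq: dividing the variance by the mean gives q, and p = 1 - q. A short sketch with exact fractions (variable names are illustrative):

```python
from fractions import Fraction

binomial_mean = Fraction(6)      # np
binomial_variance = Fraction(5)  # npq

q = binomial_variance / binomial_mean  # npq / np = q
p = 1 - q
n = binomial_mean / p                  # n = mean / p

print(p, q, n)
```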

Q95. State whether the following statements are ‘True’ or ‘False’ i) ‘X’ is a Poisson variate if p < 0.1 and
n > 10. ii) Poisson distribution is a unimodal distribution.

Ans - i) True, ii) True

Q96. State whether the following statements are ‘True’ or ‘False’.

i) Quartile deviation of normal distribution is 4/5 σ.

ii) Mean and standard deviation of Standard normal distribution are ‘1’ and ‘0’.

iii) Mean, Median and Mode coincide in a Normal distribution.

Ans - i) False, ii) False, iii) True

Qno97. Solved Problem


An unbiased coin is tossed six times. What is the probability that the tosses will result in: i) Exactly two
heads ii) At least five heads iii) At most two heads iv) Not greater than one head v) Not less than five
heads vi) At least one head

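Solution: Let ‘A’ be the event of getting a head. Given that n = 6 and p = q = 1/2, the probability
of getting ‘x’ heads is P(X = x) = 6Cx (1/2)^x (1/2)^(6-x) = 6Cx/64.

i) The probability that the tosses will result in exactly two heads is given by:

P(X = 2) = 6C2/64 = 15/64

Therefore, the probability that the tosses will result in exactly two heads is 15/64.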

ii) The probability that the tosses will result in at least five heads is given by:

P(X ≥ 5) = P(X = 5) + P(X = 6) = (6 + 1)/64 = 7/64

Therefore, the probability that the tosses will result in at least five heads is 7/64.

iii) The probability that the tosses will result in at most two heads is given by:

P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = (1 + 6 + 15)/64 = 22/64 = 11/32

Therefore, the probability that the tosses will result in at most two heads is 11/32.

iv) The probability that the tosses will result in not greater than one head is given by:

P(X ≤ 1) = P(X = 0) + P(X = 1) = (1 + 6)/64 = 7/64

Therefore, the probability that the tosses will result in not greater than one head is 7/64.

v) The probability that the tosses will result in not less than five heads is given by:

P(X ≥ 5) = P(X = 5) + P(X = 6) = 7/64

Therefore, the probability that the tosses will result in not less than five heads is 7/64.

vi) The probability that the tosses will result in at least one head is given by:

P(X ≥ 1) = 1 - P(X = 0) = 1 - 1/64 = 63/64

Therefore, the probability that the tosses will result in at least one head is 63/64.

The graph depicted in figure 6.3 illustrates the binomial distribution of probability of ‘x’ number
of heads occurring when a coin is tossed 6 times.
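
The six parts of this problem can be recomputed from the binomial formula P(X = x) = nCx p^x q^(n-x) with n = 6 and p = 1/2. The sketch below uses exact fractions; the helper name `binomial_pmf` is an illustrative choice, not part of the original solution.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 6, Fraction(1, 2)
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]  # P(X = 0) ... P(X = 6)

exactly_two = pmf[2]
at_least_five = pmf[5] + pmf[6]   # same event as "not less than five"
at_most_two = sum(pmf[:3])
at_most_one = sum(pmf[:2])        # "not greater than one"
at_least_one = 1 - pmf[0]

print(exactly_two, at_least_five, at_most_two, at_most_one, at_least_one)
```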
Q 98. Solved Problem

The probability that an employee will get an occupational disease is 20%. In a firm having five
employees, what is the probability that:

i) None of the employees get the disease

ii) Exactly two will get the disease

iii) More than four will contract the disease

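Solution: Given that n = 5, p = 0.2 and q = 0.8, P(X = x) = 5Cx (0.2)^x (0.8)^(5-x).

i) The probability that none of the employees get the disease is given by:

P(X = 0) = (0.8)^5 = 0.3277

Therefore, the probability that none of the employees get the disease is 0.3277.

ii) The probability that exactly two employees will get the disease is given by:

P(X = 2) = 5C2 (0.2)^2 (0.8)^3 = 10 x 0.04 x 0.512 = 0.2048

Therefore, the probability that exactly two employees will get the disease is 0.2048.

iii) The probability that more than four employees will get the disease is given by:

P(X > 4) = P(X = 5) = (0.2)^5 = 0.00032

Therefore, the probability that more than four employees will get the disease is 0.00032.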

QNO99. For a binomial distribution with n = 5 and p = 0.2.

Find: i) P(X=3)

ii) P(X<4)
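
No worked answer is given for this question, so the following is only a sketch of how the two probabilities could be computed from the binomial formula with n = 5 and p = 0.2. It gives P(X = 3) ≈ 0.0512 and P(X < 4) ≈ 0.9933. The helper name is illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.2

p_equal_3 = binomial_pmf(3, n, p)
# P(X < 4) is the sum of P(X = 0), P(X = 1), P(X = 2) and P(X = 3).
p_less_than_4 = sum(binomial_pmf(x, n, p) for x in range(4))

print(round(p_equal_3, 4), round(p_less_than_4, 4))
```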

QNO100. In a large consignment of electric lamps, 5% are defective. A random sample of 8 lamps is
taken for inspection. What is the probability that it has one or more defectives?

ANS - Solution: Given n = 8 and p = 5/100 = 0.05, let X be the number of defective lamps. Therefore,
by the binomial distribution,

P(X ≥ 1) = 1 - P(X = 0) = 1 - (0.95)^8 ≈ 0.3366

QN101. Poisson process is obtained when the Binomial experiment is conducted many number
of_______.

ANS - Times

QNO102. A sample in which items are chosen without knowing their probability of selection__________.

ANS - Non-probability sample

QNO103. Non-sampling errors are attributed to factors that can be controlled and eliminated
by__________.

ANS - suitable actions

QNO 104.Quota sampling is a type of___________.

ANS - Judgment sampling

QNO.105.

ANS - i) The Z test

QNO. 106.A table with 4 rows and 2 columns has the degrees of freedom of _____________.

ANS - 3, i.e., (4 - 1) x (2 - 1)

QNO107.
ANS - ii) The Chi-Square test

QNO.108 If there are four rows and five columns in classification for χ² test, then the number of
degrees of freedom is equal to __________.

ANS - iii) (r - 1) x (c - 1) = (4 - 1) x (5 - 1) = 12

QNO109. If the calculated value is less than the tabulated value, then the null hypothesis is
__________.

ANS - Accepted

QNO110. Chi-Square test is a___________.

ANS - Non-parametric test

QNO111. The objectives of ANOVA are ?

ANS -

1. To obtain a measure of the total variation between or among the components

2. To find a measure of variation between or among the components. Then, the significance of the
difference between the variations in two series or more may be measured

QNO112. ANOVA is mainly carried on under the following two classifications:

i) One-way analysis of variance or One-way classification

ii) __________________________.

ANS - Two-way analysis of variance or Two-way classified data or manifold classification

QNO113. ANOVA is a statistical technique used to evaluate the variances between _________ or more
sample means.

ANS - Three

QNO114. ANOVA is classified into one-way ANOVA and__________.

ANS - Two-way ANOVA


Qno115 . Business forecasting is based on___________________.

Ans - Past and present economic condition of the business

Qno116. The forecasting through time series analysis is possible only when the _____________which
reflects a definite trend and seasonal variation.

Ans - Business data of various years are available

Qno117. Extrapolation is based on two assumptions:

1. _____________________.

2. There is regularity in fluctuations and the rise and fall is uniform

Ans - There is no sudden jump in figures from one period to another

Qno118. The straight line arithmetic trend assumes that growth will be a____________.

Ans - Constant amount each year

Qno119. The various methods of constructing index numbers can be classified into two groups. They
are: unweighted index numbers and____________.

Ans - weighted index numbers

Qno120. In the Explicit method the weights are laid down on the basis of _______ of importance of
commodities.

Ans - One outward evidence

121. In which of the following situations would you like to use Statistics?

a) Buying a house

b) Purchasing medicine prescribed by a doctor

c) Investing funds in several options


d) Attending relatives’ marriages

Ans- c

122. Out of the following, which one does not refer to a mass of data?

a) Banking Statistics

b) Mathematical Statistics

c) Agricultural Statistics

d) Income Statistics

Ans- b

123. Which of the following statements is most appropriate?

a) Nature believed in statistics

b) Nature created statistics

c) Nature believed in variation

d) Nature believed in symmetrical variation

Ans- c

124. Which of the following statements is true?

a) Statistics enlarges physical vision

b) Statistics helps in estimation

c) Statistics quantifies uncertainty


Ans- b

125. The origin of statistics can be traced to

a) State

b) Commerce

c) Economics

Ans- a

126. According to the definition of Statistics given by Croxton and Cowden, what are the four
components of Statistics?

Ans- The four components of Statistics are collection, presentation, analysis and interpretation of data.

127. ‘Statistics may be called the science of counting’ is the definition given by

a) Croxton

b) A.L.Bowley

c) Boddington

d) Webster

Ans- b
128. In the olden days statistics was confined only to _______.

Ans- State affairs

129. Answer the following:

a) Should the same degree of accuracy be applied while measuring the height of a mountain and the
height of a person?

b) Does Statistics deal with qualitative data?

Ans- a) No

b) No

130. Categorise the following data as qualitative or quantitative data

a) The number of transactions occurring in an ATM per day

b) The popular brand name in cars is Maruthi

Ans- a) Quantitative data

b) Qualitative data

131. The total sale of a product in Area A is 840 for 30 working days. The total sale of the same product
in Area B is 784 for 28 working days. Should Statistics be applied to get an appropriate picture regarding
the comparison of sales?

Ans- Yes

132. What are the main stages in a survey?

Ans- Planning and execution


133. Training of investigators belongs to which stage?

Ans- Planning

136. Analysis of data is a part of the execution of survey. Is this correct?

Ans- Yes

137. Classify the following as finite or infinite population.

i) Production of a product in a factory for a day

ii) Number of points in this page

iii) The set of rational numbers

iv) The weight of new born babies measured up to first decimal place in a state during the first week of
February 2008

Ans- i) Finite ii) Infinite iii) Infinite iv) Finite

138. Classify the following as an attribute or a variable.

i) Eye colour of human beings

ii) Number of pages in a book of various subjects

Ans- i) Attribute ii) Variable


139. Classify the following as discrete or continuous variable

i) Number of shares sold each day in a stock market.

ii) Temperatures recorded every half hour at a regional meteorological centre.

Ans- i) Discrete ii) Continuous

140. Statistics can best be considered as

i) both art and science

ii) Art

iii) science

iv) neither art nor science

Ans- i) both art & science

141. Data that possess numerical properties are known as

i) quantitative data

ii) qualitative data

iii) Primary data

iv) Parametric data

Ans- i) quantitative data

142. A tool of all science in research and making an intelligent judgement is


i) statistics

ii) collection

iii) data

iv) judgement

Ans- i) statistics

143. State True or False:

i) Census conducted by Government of India is an example of primary data.

ii) TV News Bulletins gather information on any event through their agents.

iii) Schedules make respondents record their answers.

iv) A covering letter to the questionnaire brings confidence in respondents.

v) Questions in questionnaire should be lengthy.

Ans- i) True ii) True iii) False iv) True v) False

144. State whether each of the following variables is qualitative or quantitative and indicate the
measurement scale that is appropriate for each.

i) Age

ii) Gender

iii) Class Rank

iv) Make of automobile

v) Number of people favouring the death penalty


Ans- i) Quantitative, ratio ii) Qualitative, nominal iii) Qualitative, ordinal iv) Qualitative, nominal v) Quantitative, ratio

145. State whether each of the following variables is qualitative or quantitative and indicate the
measurement scale that is appropriate for each.

i) Annual sales

ii) Soft drink size (small, medium, large)

iii) Employee classification (GS1 through GS18)

iv) Earning per share

v) Methods of payments (cash, check, credit card)

Ans- i) Quantitative, ratio ii) Qualitative, ordinal iii) Qualitative, ordinal iv) Quantitative, ratio v) Qualitative, nominal

146. Classification is a systematic __________ of the units according to their ____________ __________.

Ans- Grouping, common characteristics

147. Classification reduces _________ of the data.

Ans- Bulk

148. Classification of data that are non-measurable is known as ____ ___.

Ans- Attribute

149. Data arranged logically according to size is known as _________.


Ans- Series

150. Manifold classification involve more than _________ variables.

Ans- Two

151. Data arranged according to time of occurrence is known as _______.

Ans- Chronological classification

152. Geographical classification means classification of data according to:

i) Location ii) Time iii) Attributes iv) Class intervals

Ans- i) location

153. Classification is a process of arranging the data in:

i) Different columns ii) Different rows iii) Different rows and columns iv) Grouping of related facts in
different classes

Ans- iv) grouping of related facts in different classes

154. The data that can be classified on the basis of time is:

i) Geographical ii) Chronological iii) Qualitative iv) Quantitative

Ans- ii) chronological


155. When the collected data is grouped with reference to time, we have:

a) Quantitative classification b) Qualitative classification

c) Geographical classification d) Chronological classification

Solution - Chronological classification

156. Most quantitative classifications are:

a) Chronological b) Geographical

c) Frequency distribution d) None of these

Solution - Frequency distribution

157. Caption stands for:

a) Numerical information b) The column headings

c) The row headings d) The table headings

Solution - The column headings

158. A simple table contains data on:

a) Two characteristics b) Several characteristics

c) One characteristic d) Three characteristics

Solution - One characteristic

159. The headings of the rows given in the first column of a table are called:

a) Stubs b) Captions

c) Titles d) Reference notes

Solution - Stubs

160. Geographical classification means classification of data according to _______.

Solution - Place

161. The data recorded according to standard of education like illiterate, primary, secondary, graduate,
technical, etc., will be known as _______ classification.

Solution - Qualitative

162. An arrangement of data into rows and columns is known as _______.

Solution - Tabulation

163. Tabulation follows ______.

Solution - Classification

164. State True or False

i. Tabulation presents the data in a minimum space.

ii. Tabulation is a process of analysis

iii. General purpose table deals with specific objectives.

iv. Derived tables deal with total, percentages, ratios, etc

v. Row of a table is represented by the vertical arrangement of data.

Ans- i. True ii. False iii. False iv. True v. False

165. i) If the data readings are 3, 4, 5, 6, 7, then it is called _________ variable. Height is generally
__________ variable.

ii) There are ____________ derived frequency distributions for any frequency distribution.

iii) Width of class-interval is given by the difference between ________ and ______.

iv) There are ________ marginal distributions for a distribution.

v) __________ formula is used to calculate the number of class-intervals.

vi) The relative frequency distribution is obtained from frequency distribution by calculating
___________.

Ans- i. Discrete variable, continuous variable

ii. Five

iii. Upper class limit and lower class limit

iv. Two

v. Sturges’

vi. F/N
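
Sturges' formula gives the number of classes as k = 1 + 3.322 log N (logarithm to base 10), and the class width then follows as Range / k. The sketch below illustrates this with N = 100 observations and a range of 60, both of which are assumed values chosen only for the example; the function names are illustrative.

```python
import math


def sturges_class_count(n_observations):
    """Number of classes suggested by Sturges' formula: 1 + 3.322 * log10(N)."""
    return 1 + 3.322 * math.log10(n_observations)


def class_width(data_range, n_observations):
    """Size of the class interval: range divided by the number of classes."""
    return data_range / sturges_class_count(n_observations)


k = sturges_class_count(100)   # N = 100 is an assumed sample size
width = class_width(60, 100)   # a range of 60 is an assumed value

print(round(k, 3), round(width, 2))
```

In practice the class count is rounded to a convenient whole number before the width is fixed.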

166. i) Diagrams give an accurate value. (True/False)

ii) Pie diagram is drawn according to degree subtended at the centre of a circle. (True/False)

iii) Simple bar diagram is drawn for multiple characteristics. (True/False)

Ans- i) False ii) True iii) False

167. A graph in the form of steps is known as

i) Frequency ii) Frequency polygon iii) Pie iv) Histogram

Ans- iv) Histogram

168. The diagram used to show percentage breakdown is

i) A circle ii) A square iii) A pie iv) A rectangle

Ans- iii) Pie

169. A line graph indicates

i) Comparison ii) Variation iii) Range iv) All the above

Ans- iv) All the above

170. Which of the following is not a type of bar chart?

i) Multiple ii) Percentages iii) Subdivided iv) Ogive

Ans- iv) Ogive

171. State whether the following questions are ‘True’ or ‘False’.

i. For a given set of values if we add a constant 5 to every value, then the arithmetic mean is affected.

ii. Arithmetic mean can be calculated for distribution with open-end classes.

iii. Arithmetic mean is affected by extreme values.

iv. Arithmetic mean of 12, 16, 23, 25, 28, 32 is 22.

Ans- i- T, ii- F, iii – T, iv – F

172. Different methods give different averages which are known as the

i) measures of central tendency ii) statistics iii) measures of dispersion iv) skewness

Ans- i) measures of central tendency

173. (a) Find the arithmetic mean of 68, 41, 75, 91, 53, 86, 59

i) 67.57 ii) 47.57 iii) 37.57 iv) 27.57

(b) The average computed by considering the relative importance of each of values to the total value, is
called i) arithmetic mean ii) geometric mean iii) weighted arithmetic mean iv) harmonic average.

Ans- (a) i) 67.57 (b) iii) weighted arithmetic mean

174. State whether the following questions are true ‘T’ or false ‘F’.

i) Mode is based on all values

ii) Mode = 3 Median – Mean

iii) Geometric mean is used when we are interested in rate of growth of any phenomena.

iv) Harmonic mean exists if one of the values is zero.

v) A.M < G.M < H.M for any two values ‘a’ and ‘b’.

vi) Arithmetic mean can be calculated accurately even when the distribution has open-end class.

vii) Mode can be located graphically.

Ans- i – F, ii – F, iii – T, iv – F, v – F, vi – F, vii – T

175. If the values of the variables are arranged in ascending order of magnitude, the middle term is

i) mean ii) mode iii) median iv) quartile

Ans- iii) median

176. In a symmetrical distribution the mean, median and mode


i) differ ii) coincide iii) mean-median = mode iv) differ by 0.5

Ans- ii) coincide

177. The relation between mean, median and mode is given by

i) Mode = 3 Median - 2 Mean ii) Mode = 2 Mean - Median iii) Mode = 3 Median - Mean iv) Mode = Mean - Median

Ans- i) Mode= 3 Median-2 Mean

178. The harmonic mean of 30 and 20 is

i) 25 ii) 24 iii) 20 iv) 30

Ans- ii) 24

179. If assumed mean A = 32.5, h = 8, Σfd = -13 and Σf = 90, then

i) mean = 35.31 ii) mean=31.35 iii) mean = 33.15 iv) mean=35.35

Ans- ii) mean=31.35

180. In any distribution when the original items differ in size, the value of AM, GM and HM would also
differ in the following order

i) AM>GM>HM ii) AM=GM=HM iii) AM<HM<GM iv)AM.GM>HM

Ans- i) AM>GM>HM

181. State whether the following questions are ‘True’ or ‘False’.

i) Quartiles are positional value.

ii) Quartiles help us to find percentage of readings below or above a certain value.

iii) Q2 = P50 = D7 = Median

Ans- i – T, ii – T, iii – F

182. State whether the following questions are true, ‘T’ or false, ‘F’.

i) The cost of living index numbers calculated are based on weighted averages.

ii) Many of the items which we use in our life can be assigned weights.

Ans- i – T, ii – T

183. To which approach do the following probability estimates belong?

i. Probability that India will win the game

ii. Probability that Mr. Ram will resign from the post

iii. Probability of drawing a red card

iv. Probability that you will go to America this year

Ans- i) Relative frequency

ii) Subjective

iii) Classical

iv) Subjective

184. Find the probabilities in the following cases:

i. Getting an even number when a die is thrown

ii. Selecting two ‘y’ from the letters x, x, x, x, y, y, y

iii. Selecting a king and queen from a pack of cards, when two cards are drawn at a time

iv. Getting 53 Mondays in an ordinary year

Ans- i) ½ ii) 1/7 iii) 8/663 iv) 1/7
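
Each of these answers follows from the classical definition, favourable cases divided by total cases. A minimal sketch with exact fractions (variable names are illustrative; part iv assumes each weekday is equally likely for the one day left over after 52 full weeks):

```python
from fractions import Fraction
from math import comb

# i. Even numbers on a die: 2, 4, 6 out of six faces.
even_on_die = Fraction(3, 6)

# ii. Two 'y' from x, x, x, x, y, y, y: choose 2 of the 3 y's out of any 2 of 7 letters.
two_y = Fraction(comb(3, 2), comb(7, 2))

# iii. One king and one queen when two cards are drawn together from 52.
king_and_queen = Fraction(comb(4, 1) * comb(4, 1), comb(52, 2))

# iv. An ordinary year has 365 = 52 * 7 + 1 days, so one extra day decides it.
extra_days = 365 % 7
monday_53 = Fraction(extra_days, 7)

print(even_on_die, two_y, king_and_queen, monday_53)
```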

185. Given P(A) = 0.6, P(B) = 0.7, and P(A ∩ B) = 0.5. Find P(A U B)?

Ans- 0.8

186. State whether the following questions are true or false:

i. Bayes’ probability estimates sample value

ii. Conditional probability can incorporate costs

iii. Bayes’ probability gives up to date information

Ans- i) False ii) False iii) True

187. State whether the following statements are true ‘T’ or false ‘F’.

i) The sum of probabilities sometimes will be greater than 1.

ii) The amount of time you study for an exam is a discrete random variable.

iii) The Bernoulli distribution has only one parameter ‘p’.

Ans- i – F, ii- F, iii- T

188. State whether the following statements are true or false.


i) Mean of binomial distribution is ‘npq’.

ii) ‘n’ and ‘p’ are the parameters of binomial distribution.

iii) If the mean and variance of a binomial distribution are 6 and 5, then p = 1/6.

iv) Each trial in a binomial experiment has the different probability of success, p.

Ans- i-F, ii- T, iii- T, iv- F

189. State whether the following statements are true ‘T’ or false ‘F’.

i) ‘X’ is a Poisson variate if p < 0.1 and n > 10

ii) Example of bimodal distribution is Poisson distribution

Ans- i- T, ii- T

190. State whether the following statements are true ‘T’ or false ‘F’.

i) Quartile deviation of normal distribution is 4/5 σ

ii) Mean and standard deviation of a standard normal distribution are ‘1’ and ‘0’

iii) Mean, median and mode coincide in a normal distribution

Ans- i- F, ii- F, iii- T

191. State whether the following statements are true ‘T’ or false ‘F’.

i) Population is aggregate of objects under study.

ii) Sampling methods consume time and resources.

iii) Any summarised figure from population is known as statistics.

iv) We adopt sampling technique in our activities.

v) Population is a subset of sample.

vi) An unbiased sample gives an accurate prediction of characteristics of an entire population.


vii) The standard deviation of sampling distribution of a statistic is known as standard error of that
statistic.

viii) Standard error is used as a reliability measure.

ix) Faulty selection of sample contributes to sampling error.

x) Personal bias increases the non-sampling errors.

xi) Unbiased errors are cumulative in nature.

xii) Biased errors are also known as compensatory errors.

Ans- i- T, ii- F, iii- T, iv- T, v- F, vi- T, vii- T, viii - T, ix- T, x- T, xi- F, xii- F

192. State whether the following statements are true ‘T’ or false ‘F’.

i) Sample in which units are selected by judgment is known as probability sample.

ii) Judgment sampling does not give representativeness of a sample.

iii) Large sample size always results in minimising the standard error.

iv) A sampling plan that divides the population into well-defined groups from which random samples are
drawn is known as cluster sampling.

v) The principles of simple random sampling are the theoretical basis for statistical inference.

vi) If the mean of a certain population is 20, it is likely that most of the sample means will be 20.

vii) Any sampling distribution can be totally described by its mean and standard deviation.

ix) The central limit theorem assures that the sampling distribution of mean is always normal.

x) Stratified sampling is used when each group considered is more homogeneous within itself and
the groups are heterogeneous between themselves.

Ans- i- F, ii- T, iii- T, iv- F, v- T, vi- F, vii- F, ix- T, x- T

193. Madhu, a frugal student, wants to buy a used bike. After randomly selecting 125 wanted
advertisements, he found the average price of the bike to be Rs. 3250 with a standard deviation of Rs.
615. Establish an interval estimate for the average price of bike so that Madhu can be:

i) 68.3% certain that the population mean lies in this interval.
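
No worked answer is given for this question. As a sketch under the usual large-sample assumption, a 68.3% interval corresponds to the sample mean plus or minus one standard error, where the standard error is s / √n. With n = 125, mean 3250 and s = 615 this gives roughly Rs. 3195 to Rs. 3305. The variable names are illustrative.

```python
import math

n = 125             # number of advertisements sampled
sample_mean = 3250  # average price in Rs.
sample_sd = 615     # standard deviation in Rs.
z = 1.0             # about 68.3% of a normal distribution lies within 1 standard error

standard_error = sample_sd / math.sqrt(n)
lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

print(round(standard_error, 2), round(lower, 2), round(upper, 2))
```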



194. For the following sample sizes and confidence levels, find the approximate ‘t’ values for
constructing confidence intervals (use the ‘t’ table).

i) n = 28; 95%

ii) n = 8; 98%

iii) n = 13; 90%

iv) n = 25; 95%

Ans- i) 2.052 ii) 2.998 iii) 1.782 iv) 2.064

195. i) Null hypothesis states that there is a significant difference between observed and hypothetical
values. (True/False)

ii) 1% level of significance means we are ready to reject a true hypothesis in 99% of cases. (True/False)

iii) If the Null hypothesis is Ho: μ = μs or Ho: p = ps or Ho: μ1 = μ2 or Ho: p1 = p2, then it is a two-tailed test.
(True/False)

iv) If the calculated value of a statistic is not in the rejection region R, then Ho is accepted. (True/False)

v) 1 - β is called the power of the test. (True/False)

vi) If n1 = 300, n2 = 500, x̄1 = 50, x̄2 = 60, σ1 = 10, σ2 = 12 are the results of two samples taken from two cities A
and B, then we test for the difference between means under different populations. (True/False)

vii) If n < 30, then we do not apply the z test unless the population S.D. is known. (True/False)

Ans- i. False

ii. False

iii. True

iv. True

v. True

vi. True

vii. True

196. i) ‘t’ distribution is __________ probability distribution.

ii) ‘t’ distribution’s parameter is __________.

iii) ‘t’ distribution has ___________ areas at the tail than normal distribution.

iv) The mean and variance of the ‘t’ distribution are ________ and ________.

Ans- i. Continuous

ii. Degrees of freedom

iii. Larger

iv. Zero, greater than one
