(Book ID B1731)
Section A
I. A random variable takes the values -3, -2, 1, 0, 4, 6 with probabilities 1/12, 2/12, 3/12, 4/12, 1/12,
1/12 respectively. The mean (expected value) and the variance are _______________.
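The blank can be checked directly from the definitions E(X) = Σ x·P(x) and Var(X) = E(X²) − [E(X)]². The sketch below is only a verification aid; the variable names are illustrative and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Values and probabilities exactly as stated in the question.
values = [-3, -2, 1, 0, 4, 6]
probs = [Fraction(k, 12) for k in (1, 2, 3, 4, 1, 1)]
assert sum(probs) == 1  # the probabilities form a valid distribution

mean = sum(x * p for x, p in zip(values, probs))               # E(X)
second_moment = sum(x * x * p for x, p in zip(values, probs))  # E(X^2)
variance = second_moment - mean ** 2                           # Var(X)

print(mean, variance)  # 1/2 23/4
```

Under these definitions the mean is 0.5 and the variance is 5.75.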
II. i. Size of the class interval is equal to _______________. ii. Tally marks are used to construct
_______________.
III. Steps in construction of cost of living index numbers involve the following in the order of:
a. Conduct family budget inquiry, select the class of people, obtain price quotations, define the scope of
the index, prepare a frame or list of persons.
b. Define scope of the index, Select the class of people, prepare a frame or list of persons, conduct
family budget inquiry, obtain price quotations.
c. Select the class of people, define scope of the index, conduct family budget inquiry, obtain price
quotations, prepare a frame or list of persons
d. Prepare a frame or list of persons, obtain price quotations, conduct family budget inquiry, define
scope of the index, select the class of people
Answer- c. Select the class of people, define scope of the index, conduct family budget inquiry, obtain
price quotations, prepare a frame or list of persons
a. 25
b. 29
c. 32
d. 12
Answer - a. 25
i. The quantitative characteristic that varies from unit to unit is called a variable.
ii. A variable that assumes all the values in the range is known as discrete variable.
VI. i. The totality of all units in a survey is called _______________. ii. A ______________ is a part or a
subset of the population.
a. i - Unit, ii - Statistic
b. i - Variable, ii - Unit
c. i - Population, ii - Sample
d. i - Statistic, ii – Population
VII. In bivariate data on ‘x’ and ‘y’, variance of ‘x’ = 49, variance of ‘y’ = 9 and covariance Cov(x, y) = -17.5.
The coefficient of correlation between ‘x’ and ‘y’ is
a. 0.833
b. -0.833
c. 0.933
d. -0.933
Answer - b. -0.833
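The answer follows from r = Cov(x, y) / (σx · σy). A minimal check, using only the figures given in the question (the variable names are illustrative):

```python
import math

var_x, var_y, cov_xy = 49.0, 9.0, -17.5

# Standard deviations are the square roots of the variances: 7 and 3.
r = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

print(round(r, 3))  # -0.833
```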
VIII. i. Questions that are answered only if the respondent gives a particular response to a previous
question are ___________
ii. Questions where the respondents’ answers are limited to a fixed set of responses
are________________
ii - Contingency questions
d. i- always zero, ii – 15
X. From a random sample of 36 New Delhi civil service personnel, the mean age and the sample
standard deviation were found to be 40 years and 4.5 years respectively. 95% confidence interval for the
mean age of civil personnel in New Delhi is:
a. 40 ± 1.47
b. 42 ± 2.47
c. 52 ± 3.37
d. 55 ± 5.57
Answer – a. 40 ± 1.47
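The margin of error can be reproduced with the large-sample formula x̄ ± z·s/√n. The sketch assumes the usual two-sided 95% critical value z = 1.96, which the question does not state explicitly.

```python
import math

n, mean_age, s = 36, 40.0, 4.5
z = 1.96  # assumed standard normal critical value for a two-sided 95% interval

margin = z * s / math.sqrt(n)  # 1.96 * 4.5 / 6
lower, upper = mean_age - margin, mean_age + margin

print(round(margin, 2))  # 1.47
```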
1. Statistics that is used to make valid inferences from the data for effective decision making among
managers or professionals is
a. Descriptive Statistics
b. Inferential Statistics
a. Webster
b. Boddington
c. A. L Bowley
3. A professor asked the students in a class their ages. On the basis of this information, the
professor states that the average age of all the students in the university is 21 years. This is
an example of
a. a census
b. descriptive statistics
c. an experiment
d. Inferential Statistics
a. Parameter
b. Sample
c. Statistics
d. Census
Answer - a. Parameter
12. 16. Algebraic sum of deviations of a set of values taken from their mean is
a. 1
b. 2
c. 3
d. 0
Answer- d. 0
b. A measure of variability
6. To compare the homogeneity or stability or consistency of two or more data sets we use
a. Arithmetic Mean
b. Standard Deviation
c. Coefficient of Variation
d. Mean Deviation
7. Which of the following represents the fiftieth percentile, or the middle point in a set of
numbers arranged in order of magnitude?
a. Mode
b. Median
c. Mean
d. Variance
Answer - b. Median
a. 8
b. 7
c. 6
d. 5
ANSWER - c. 6
a. 0 to 1
b. -1 to 1
c. 1 to 2
d. -1 to 0
ANSWER - a. 0 to 1
a. E(X)= Σ Xi P (Xi)
b. E(X)= Σ Xi P (Xi2)
times.
a. Probability distribution
b. Normal distribution
c. Poisson process
d. Binomial process
12. If X is a Poisson variate, such that P(X = 1) = P(X = 2), find P(X = 0).
a. 0.04979
b. 0.13534
c. 0.2382
d. 0.14937
ANSWER - b. 0.13534
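For a Poisson variate, P(X = 1) = P(X = 2) gives λe^(−λ) = λ²e^(−λ)/2, so λ = 2 and P(X = 0) = e^(−2). A short numerical check (the helper name `poisson_pmf` is illustrative):

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variate with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2.0  # solves P(X = 1) = P(X = 2)
assert math.isclose(poisson_pmf(1, lam), poisson_pmf(2, lam))

p0 = poisson_pmf(0, lam)
print(round(p0, 5))  # 0.13534
```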
a. Binomial distribution
b. Bernoulli distribution
c. Poisson distribution
d. Normal distribution
14. Which sampling theory states that, “other things being equal, as the sample size increases,
d. Principle of validity
a. Normal distributions
b. Standard deviation
d. Binomial distribution
a. Type I error
b. Type II error
c. Producer's risk
d. Right decision
17. Which business forecasting method is used when business indices are constructed to study and
analyse the business activities on the basis of which future conditions are predetermined?
a. Business barometers
c. Extrapolation
d. Regression analysis
18. The results of Chi-square test cannot be accurate if the cell frequencies in a contingency
a. 50
b. 5
c. 20
d. 10
Answer - b. 5
19. Which business forecasting method is used when business indices are constructed to
study and analyse the business activities on the basis of which future conditions are
predetermined?
a. Business barometers
c. Extrapolation
d. Regression analysis
20. The long- term oscillations that represent consistent rise and decline in the values of the
b. Seasonal variations
c. Cyclic variations
d. Random variables
a. Positive correlation
b. Negative correlation
c. Zero correlation
d. multiple correlation
22. If the Standard deviation and Mean of the distribution are 2.64 and 53 respectively, the
Co-efficient of Variation is
a. 4.98%
b. 5.54%
c. 6.64%
d. 8.14%
Answer - a. 4.98%
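The coefficient of variation is the standard deviation expressed as a percentage of the mean, CV = (σ / mean) × 100. A one-line check with the figures from the question:

```python
std_dev, mean = 2.64, 53.0

cv_percent = std_dev / mean * 100  # coefficient of variation in percent

print(round(cv_percent, 2))  # 4.98
```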
23. Steps in construction of cost of living index numbers involve the following in the order
of:
a. Conduct family budget inquiry, select the class of people, obtain price quotations, define the scope of
the index, prepare a frame or list of persons.
b. Define scope of the index, Select the class of people, prepare a frame or list of persons, conduct
family budget inquiry, obtain price quotations.
c. Select the class of people, define scope of the index, conduct family budget inquiry, obtain price
quotations, prepare a frame or list of persons
d. Prepare a frame or list of persons, obtain price quotations, conduct family budget inquiry, define
scope of the index, select the class of people
Answer - c. Select the class of people, define scope of the index, conduct family budget inquiry, obtain
price quotations, prepare a frame or list of persons
24. 2% of the fuses manufactured by a firm are expected to be defective. The probability that a box
containing 200 fuses contains defective fuses is
a. 0.9817
b. 0.5124
c. 0.4523
d. 0.2222
Answer - a. 0.9817
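The stated answer is consistent with reading the question as "at least one defective fuse" and using the Poisson approximation with λ = np = 200 × 0.02 = 4. Both the reading and the approximation are assumptions; the exact binomial value is computed alongside for comparison.

```python
import math

n, p = 200, 0.02
lam = n * p  # Poisson mean, 4.0

# P(at least one defective) = 1 - P(no defectives)
p_poisson = 1 - math.exp(-lam)  # Poisson approximation
p_binomial = 1 - (1 - p) ** n   # exact binomial value, slightly larger

print(round(p_poisson, 4))  # 0.9817
```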
a) np and npq
b) n and p
c) nq and npq
d) npq and np
26. i. Questions that are answered only if the respondent gives a particular response to a
ii. Questions where the respondents’ answers are limited to a fixed set of responses
are________________
27. The median value of the following set of values 22, 16, 18, 13, 15, 19, 17, 20, 23 is
a. 19
b. 18
c. 15
d. 13
Answer - b. 18
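With nine observations the median is the fifth value after sorting. A quick check using Python's standard library:

```python
import statistics

data = [22, 16, 18, 13, 15, 19, 17, 20, 23]

ordered = sorted(data)            # [13, 15, 16, 17, 18, 19, 20, 22, 23]
middle = ordered[len(data) // 2]  # fifth of nine values

print(middle, statistics.median(data))  # 18 18
```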
28. i. The theory of Business forecasting based on the assumption that most of the business
data have the lag and lead relationship, that is, changes in business are successive but not
simultaneous is_________________
ii. The theory of business forecasting based on the assumption that history repeats itself
and hence assumes that all economic and business events behave in a rhythmic order
is____________
29. If average height of 30 men is 158 cm and average height of another group of 40 men is 162 cm, the
average height of the combined group is
a. 150.29
b. 160.29
c. 170.29
d. 180.29
Answer - b. 160.29
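The combined mean weights each group mean by its size: (n1·x̄1 + n2·x̄2) / (n1 + n2). A minimal check:

```python
n1, mean1 = 30, 158.0
n2, mean2 = 40, 162.0

combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)  # 11220 / 70

print(round(combined_mean, 2))  # 160.29
```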
30. i. If the statistical data are classified according to the time of its occurrence, the type of
classification is_________________
Part A
1. Source note
2. Head note
3. Captions
4. Title
Part B
A. The headings and subheadings describing the data present in the columns.
B. Indicates the scope and the nature of contents in a concise form.
C. Indicates the source from which the data is taken and is placed at the bottom on the left
hand corner.
D. It is given below the title of the table to indicate the units of measurement of the data
and is enclosed in brackets.
32. From a random sample of 36 New Delhi civil service personnel, the mean age and the sample
standard deviation were found to be 40 years and 4.5 years respectively. 95% confidence interval for
the mean age of civil personnel in New Delhi is:
a. 40 ± 1.47
b. 42 ± 2.47
c. 52 ± 3.37
d. 55 ± 5.57
Answer - a. 40 ± 1.47
33. In a competition, two judges assigned the ranks for seven candidates. The Spearman’s
a. 0.25
b. 0.55
c. 0.35
d. 0.75
Answer - d. 0.75
34. Heights of students are normally distributed with mean 165 cm and standard deviation 5
cm. The probability that the height of a student is greater than 177 cm is:
a. 0.0082
b. 1
c. 1.2
d. 0.5
Answer - a. 0.0082
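Standardising gives z = (177 − 165) / 5 = 2.4, and the required probability is the upper tail P(Z > 2.4). The sketch below evaluates that tail with the complementary error function instead of a printed normal table.

```python
import math

mu, sigma, x = 165.0, 5.0, 177.0

z = (x - mu) / sigma                          # 2.4
upper_tail = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z) for a standard normal

print(round(upper_tail, 4))  # 0.0082
```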
d. i- always zero, ii - 15
36. Given
a. 151.92
b. 161.92
c. 171.92
d. 181.92
Answer - a. 151.92
b. 120.12
c. 90.35
d. 82.45
37. The time series given below shows the figures of production (in m. tonnes) of a sugar factory. The
best fit for the following data is the straight line trend represented by the equation Y= a+bX
i) The value of a is
a. 50
b. 80
c. 45
d. 90
Answer - d. 90
a. 0
b. 10
c. 1
d. 2
Answer - d. 2
Part A
1. Mode
2. Deciles
3. Median
4. Percentile
Part B
A. It divides the distribution into 100 parts of equal frequency.
B. It can be determined graphically (Ogives) and is not affected by extreme values.
C. It divides the arrayed set of variates into ten portions of equal frequency and they are
sometimes used to characterise the data for some specific purpose.
a. r = 0.50
b. r = 0.596
c. r = 0.699
d. r = -1.1
Answer - c. r = 0.699
i. P(X = 0) is
a. 0.0625
b. 0.078
c. 0.78
d. 0.008
Ans - a. 0.0625
ii. P(X ≥ 2) is
a. 0.5825
b. 0.6875
c. 0.2875
d. 0.010
Answer - b. 0.6875
Section B
Answer:-
a) Experiment: - An operation that results in a definite outcome is called an experiment. Tossing a coin is
an experiment if it shows either head (H) or tail (T) on falling. Since only the outcomes H and T are
anticipated, tossing a coin that is likely to stand on its edge (figure) on a particular surface is not an
experiment.
c) Sample space: - The set of all possible outcomes of a random experiment is the sample space. The
sample space is denoted by S. The outcomes of the random experiment (elements of the sample space)
are called sample points or outcomes or cases.
d) Event: - Event is a subset of the sample space. Events are denoted by A, B, C, etc. An event which
does not contain any outcome is a null event (impossible event).
e) Equally likely events (equiprobable events):- Two or more events are equally likely if they have equal
chance of occurrence. That is, equally likely events are such that none of them has a greater chance of
occurrence than the others.
Q.3.The incidence of occupational disease in an industry is such that the workers have a 20% chance
of suffering from it. What is the probability that out of six workers, 4 or more will contract the
disease?
Answer:-
The probability that at the most two workers contract the disease is
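Treating the number of affected workers as binomial with n = 6 and p = 0.2, the probabilities involved in Q.3 can be evaluated numerically. This is a sketch for checking the arithmetic, not the worked solution from the original answer; the helper name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial variate with parameters n and p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 6, 0.2

p_at_least_4 = sum(binom_pmf(k, n, p) for k in range(4, n + 1))  # 4 or more workers
p_at_most_2 = sum(binom_pmf(k, n, p) for k in range(0, 3))       # at most two workers

print(round(p_at_least_4, 5))  # 0.01696
print(round(p_at_most_2, 5))   # 0.90112
```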
Answer:-
i. The frequencies used in Chi-Square test must be absolute and not in relative terms.
ii. The total number of observations collected for this test must be large.
iii. Each of the observations which make up the sample of this test must be independent of each other.
iv. As the χ² test is based wholly on sample data, no assumption is made concerning the population
distribution. In other words, it is a non-parametric test.
v. The χ² test is wholly dependent on degrees of freedom. As the degrees of freedom increase, the
Chi-Square distribution curve becomes more symmetrical.
Answer:-
Answer:-
(i) Mean forecast: - It is the simplest method of forecasting in which for the time period t, we forecast
the value of the series to be equal to the mean of the series, that is, Yt = Ȳ, where Ȳ is the mean of the
observed values.
In this method the trend effect and cyclic effects are not taken into account.
(ii) Naive forecast: - In this method we forecast the value, for the time period t, to be equal to the actual
value observed in the previous period, that is, time period (t-1). This is given as: Yt = Y(t-1).
(iii) Linear trend forecast: - This method uses the method of least squares to obtain a linear relationship
between time and the response value. The forecast is given by Yt = a + bX, where X is found from the
value of t, and a and b are constants.
(iv) Non-linear trend forecast: - In this method a non-linear relationship between the time and the
response value is found by the method of least squares. The value of forecast ‘Yt’ for the time
period ‘t’ is given as:
Yt = a + bX + cX²
where the X-value is calculated from the value of ‘t’, and a, b and c are constants.
(v) Forecasting with exponential smoothing: - Exponential smoothing is the forecasting method in
which the observation values are constantly updated and used to revise a forecast. As the observations
get older, they get exponentially decreasing weights. Exponential smoothing is of many types, such as
single, double, triple exponential smoothing.
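The forecasting methods listed above can be sketched in a few lines. The series, the smoothing constant alpha = 0.5 and all function names are illustrative assumptions, and the exponential smoothing shown is the single (simple) variant initialised with the first observation.

```python
def mean_forecast(series):
    """Forecast equal to the mean of all observed values."""
    return sum(series) / len(series)

def naive_forecast(series):
    """Forecast equal to the most recent observed value."""
    return series[-1]

def linear_trend_forecast(series, steps_ahead=1):
    """Least-squares line Y = a + bX with X = 0, 1, ..., n-1, extrapolated."""
    n = len(series)
    x_bar = (n - 1) / 2
    y_bar = sum(series) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(series))
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a + b * (n - 1 + steps_ahead)

def exponential_smoothing_forecast(series, alpha=0.5):
    """Single exponential smoothing; older values get geometrically smaller weights."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

sales = [10, 12, 14, 16, 18]  # hypothetical series, not from the source

print(mean_forecast(sales))                   # 14.0
print(naive_forecast(sales))                  # 18
print(linear_trend_forecast(sales))           # 20.0
print(exponential_smoothing_forecast(sales))  # 16.125
```

On a steadily rising series like this one, the mean and smoothed forecasts lag behind the trend, while the linear trend forecast continues it.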
Section C
Answer:-
Sample units are drawn in such a way that each and every unit in the population has an equal and
independent chance of being included in the sample.
i. Lottery method – we identify each and every unit with distinct numbers by allotting an identical
card.
ii. The use of table of random numbers – There are several random number tables. They are
Tippett’s random number table, Fisher and Yates’ tables, Kendall and Babington Smith’s random
tables, Rand Corporation random numbers etc.
This sampling design is most appropriate if the population is heterogeneous with respect to
characteristic under study or the population distribution is highly skewed.
3. Systematic sampling:-
This design is used when the population units are arranged in some systematic order such as geographical, chronological or alphabetical order.
4. Cluster sampling:-
The total population is divided into recognisable sub-divisions, known as clusters.
5. Multi-stage sampling:-
The total population is divided into several stages. The sampling process is carried out through several
stages.
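Two of the probability sampling designs above can be illustrated with Python's standard library. This is a sketch under stated assumptions: the population of 100 numbered units is hypothetical, and systematic sampling is taken in its usual form of a random start followed by every k-th unit, which the text above does not spell out.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units, each unit having an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start within the first interval, then every k-th unit."""
    k = len(population) // n  # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

population = list(range(1, 101))  # hypothetical units numbered 1..100

srs = simple_random_sample(population, 10, seed=1)
sys_sample = systematic_sample(population, 10, seed=1)

print(srs)
print(sys_sample)
```

Fixing the seed only makes the illustration repeatable; in practice the draw would be left unseeded.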
1. Judgment sampling:-
The choice of sample items depends exclusively on the judgment of the investigator.
Merits
1. Most useful for small population.
2. Most useful to study some unknown traits of a population some of whose characteristics are
known.
3. Helpful in solving day-to-day problems.
Demerits
1. It is not a scientific method.
2. It has a risk of investigator’s bias being introduced.
2. Convenience sampling:-
The sample units are selected according to the convenience of the investigator. It is also called “chunk”
which refers to the fraction of the population being investigated.
3. Quota sampling:-
Quotas are set up according to some specified characteristic such as age groups or income groups. From
each group a specified number of units are sampled according to the quota allotted to the group.
Q.9. Discuss the various steps involved in the analysis of variance in two way classification.
Answer:-
Correction factor =
8. Test statistics F =
9. Decision: If the computed value of F > table (critical) value of F for degrees of freedom (k-1, n - k) at
the α level of significance (5% or 1%), then we reject H0 and conclude that the population means are not
all equal. Otherwise we do not reject H0 and conclude that there is no evidence that the population means differ.
1. a) Assume the means of all columns are equal. That is, the effects of all factors in the first kind of
treatment are equal.
b) Assume the means of all rows are equal. That is, the effects of all factors in the second kind of
treatment are equal.
Q.10. Two research workers classified some people in income groups on the basis of sampling studies.
Their results are as follow:
Show that the sampling technique of at least one research worker is defective.
Answer:-
Answer.
a.)
Chi-square test of goodness of fit
The test is applied when you have one categorical variable from a single population. It is used to
determine whether sample data are consistent with a hypothesized distribution.
For example, suppose a company printed baseball cards. It claimed that 30% of its cards were
rookies; 60%, veterans; and 10%, All-Stars. We could gather a random sample of baseball cards
and use a chi-square goodness of fit test to see whether our sample distribution differed
significantly from the distribution claimed by the company. The sample problem at the end of the
lesson considers this example.
Precautions
In order to use a chi-square hypothesis test properly, one has to be extremely careful and keep in
mind certain precautions.
First, the sample size should be large enough. If the expected frequencies are too small, the value of
χ² gets over-estimated. This will result in the rejection of the hypothesis in several cases.
Another point to note is that if proportions or percentages are used instead of absolute frequencies, the
theoretical distribution would not be applicable.
In most cases, computing χ² involves simple calculations. However, for large sets of data
the chi-square test involves very extensive calculations. In all such cases, a computer should
be used. Several statistical software packages contain routines for carrying out chi-square tests.
Goodness-of-fit tests are often used in business decision making. In order to calculate a chi-square
goodness-of-fit, it is necessary to first state the null hypothesis and the alternative hypothesis,
choose a significance level (such as α = 0.05) and determine the critical value. It can be applied in
a wide area including surveys, business decision making, quality control, biological research,
medical research, etc. Also, chi-square tests are commonly used in studies dealing with
demographics, Likert scales, and other discrete data. It is also used to estimate the confidence
interval for a normally distributed population’s standard deviation from the sample standard
deviation; or for other tests like ANOVA and Friedman’s Rank ANOVA.
b.)
Solution:
Let us take the hypothesis that the sampling technique adopted by research workers is similar.
This being so, the expectation of A investigator classifying the people in
Hence,
Degree of freedom=(c-1)(r-1)
=(3-1)(2-1)=2
The table value of χ² for two degrees of freedom at 5 percent level of significance is 5.991.
The calculated value of χ² is much higher than this table value which means that the calculated
value cannot be said to have arisen just because of chance. It is significant. Hence, the hypothesis
does not hold good. This means that the sampling techniques adopted by two investigators differ
and are not similar. Naturally, then the technique of one must be superior to that of the other.
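The procedure in this solution (expected frequencies from row and column totals, χ² summed over cells, comparison with the table value for (c-1)(r-1) degrees of freedom) can be sketched as below. The observed counts are illustrative only, since the data table of Q.10 is not reproduced here; the function name is also an assumption.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof

# Illustrative counts: rows = two investigators, columns = three income groups.
observed = [
    [160, 30, 10],
    [140, 120, 40],
]

chi2, dof = chi_square_independence(observed)
CRITICAL_5PCT_2DF = 5.991  # table value quoted in the solution above

print(round(chi2, 2), dof, chi2 > CRITICAL_5PCT_2DF)  # 55.56 2 True
```

With these illustrative counts the statistic far exceeds 5.991, so the hypothesis of similar sampling techniques would be rejected.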
1 Statistics plays a vital role in almost every facet of human life. Describe the functions of Statistics.
Explain the applications of statistics.
Answer –
MEANING OF STATISTICS
According to Horace Secrist, Statistics may be defined as “an aggregate of facts affected
to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according
to a reasonable standard of accuracy, collected in a systematic manner for a predetermined purpose and
placed in relation to each other”. This definition is both comprehensive and exhaustive.
Prof. Boddington, on the other hand, defined Statistics as “The science of estimates and
probabilities”. This definition is also not complete.
FUNCTIONS OF STATISTICS
Statistics is used for various purposes. Let us look at each function of Statistics in detail.
The use of statistical concepts helps in simplification of complex data. Using statistical
concepts, the managers can make decisions more easily. The statistical methods help in reducing the
complexity of the data and in the understanding of any huge mass of data.
After data is collected, it is easy to analyse the trend and tendencies in the data by using the various
concepts of Statistics.
Statistical analysis helps in drawing inferences on the data. Statistical analysis brings out the hidden
relations between variables.
4. Decision making power becomes easier
With the proper application of Statistics and statistical software packages on the collected data,
managers can take effective decisions, which can increase the profits in a business.
Without using statistical methods and concepts, collection of data and comparison would be difficult.
Statistics helps us to compare data collected from various sources. Grand totals, measures of central
tendency and measures of dispersion, graphs and diagrams and coefficient of correlation all provide
ample scope for comparison.
APPLICATIONS OF STATISTICS
Statistical methods are applied to specific problems in various fields such as Biology, Medicine,
Agriculture, Commerce, Business, Economics, Industry, Insurance, Sociology and Psychology. In the field
of medicine, statistical tools like t-tests are used to test the efficiency of the new drug or medicine. In the
field of economics, statistical tools such as index numbers, estimation theory and time series analysis are
used in solving economic problems related to wages, price, production and distribution of income.
In Biology, Medicine and Agriculture, Statistical methods are applied in the following:
2 a) Explain the approaches to define probability. b) State the addition and multiplication rules of
probability giving an example of each case.
Variability and uncertainty make it more difficult to plan or to make decisions. Although they cannot
usually be eliminated, it is however possible to describe and to deal with variability and uncertainty, by
using the theory of probability. This course develops both the theory and applications of probability.
b)
The addition rule of probability states that:
i) If ‘A’ and ‘B’ are any two events then the probability of the occurrence of either ‘A’ or ‘B’ is given by:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
ii) If ‘A’ and ‘B’ are two mutually exclusive events then the probability of occurrence of either ‘A’ or ‘B’ is
given by:
P(A ∪ B) = P(A) + P(B)
iii) If ‘A’, ‘B’ and ‘C’ are any three events then the probability of occurrence of either ‘A’ or ‘B’ or ‘C’ is
given by:
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(A ∩ C) + P(A ∩ B ∩ C)
In terms of Venn diagram, from the figure 5.3, we can calculate the probability of
occurrence of either event ‘A’ or event ‘B’, given that event ‘A’ and event ‘B’ are not mutually exclusive.
From the figure 5.4, we can calculate the probability of occurrence of either ‘A’ or ‘B’,
given that events ‘A’ and ‘B’ are mutually exclusive. From the figure 5.5, we can calculate the
probability of occurrence of either ‘A’ or ‘B’ or ‘C’, given that events ‘A’, ‘B’ and ‘C’ are not mutually exclusive.
iv) If A1, A2, A3, …, An are ‘n’ mutually exclusive and exhaustive events then the
probability of occurrence of at least one of them is given by:
P(A1 ∪ A2 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An) = 1
Multiplication rule
If ‘A’ and ‘B’ are two independent events then the probability of occurrence of ‘A’ and ‘B’ is given by:
P(A ∩ B) = P(A) × P(B)
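An example of each rule can be verified by enumerating equally likely outcomes. The dice events below are illustrative choices, not taken from the original answer: for one die, A = "even number" and B = "greater than 3"; for two dice, A = "first die is even" and B = "second die shows 6".

```python
from fractions import Fraction
from itertools import product

def prob(event, space):
    """Probability of an event over a list of equally likely outcomes."""
    return Fraction(sum(1 for outcome in space if event(outcome)), len(space))

# Addition rule on a single die (events that are not mutually exclusive).
die = list(range(1, 7))
p_a = prob(lambda x: x % 2 == 0, die)                  # 1/2
p_b = prob(lambda x: x > 3, die)                       # 1/2
p_a_and_b = prob(lambda x: x % 2 == 0 and x > 3, die)  # 1/3
p_a_or_b = prob(lambda x: x % 2 == 0 or x > 3, die)    # 2/3
assert p_a_or_b == p_a + p_b - p_a_and_b

# Multiplication rule on two dice (independent events).
two_dice = list(product(die, die))
p_first_even = prob(lambda o: o[0] % 2 == 0, two_dice)            # 1/2
p_second_six = prob(lambda o: o[1] == 6, two_dice)                # 1/6
p_both = prob(lambda o: o[0] % 2 == 0 and o[1] == 6, two_dice)    # 1/12
assert p_both == p_first_even * p_second_six

print(p_a_or_b, p_both)  # 2/3 1/12
```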
Solved Problem 6
b) Explain the components of time series. a) Hypothesis testing procedure b) Components of time series
Answer –
To test a hypothesis means to tell (on the basis of the data the researcher has collected) whether or not
the hypothesis seems to be valid. In hypothesis testing the main question is: whether to accept the null
hypothesis or not to accept the null hypothesis? Procedure for hypothesis testing refers to all those steps
that we undertake for making a choice between the two actions i.e., rejection and acceptance of a null
hypothesis
5. Drawing a Conclusion
The null hypothesis (H0) is a statement of no effect, relationship, or difference between two or more
groups or factors. In research studies, a researcher is usually interested in disproving the null hypothesis.
Examples:
• The intervention and control groups have the same survival rate (or, the intervention does not
improve survival rate).
• There is no association between injury type and whether or not the patient received an IV in
the prehospital setting
Step 2: Specify the Alternative Hypothesis
The alternative hypothesis (H1) is the statement that there is an effect or difference. This is usually the
hypothesis the researcher is interested in proving. The alternative hypothesis can be one-sided (only
provides one direction, e.g., lower) or two-sided. We often use two-sided tests even when our true
hypothesis is one-sided because it requires more evidence against the null hypothesis to accept the
alternative hypothesis.
Examples:
• The intubation success rate differs with the age of the patient being treated (two-sided).
• The time to resuscitation from cardiac arrest is lower for the intervention group than for the
control (one-sided).
• There is an association between injury type and whether or not the patient received an IV in
the prehospital setting (two sided).
The significance level (denoted by the Greek letter alpha, α) is generally set at 0.05. This means that
there is a 5% chance that you will accept your alternative hypothesis when your null hypothesis is
actually true. The smaller the significance level, the greater the burden of proof needed to reject the null
hypothesis, or in other words, to support the alternative hypothesis.
In another section we present some basic test statistics to evaluate a hypothesis. Hypothesis testing
generally uses a test statistic that compares groups or examines associations between variables. When
describing a single sample without establishing relationships between variables, a confidence interval is
commonly used.
The p-value describes the probability of obtaining a sample statistic as or more extreme by chance alone
if your null hypothesis is true. This p-value is determined based on the result of your test statistic. Your
conclusions about the hypothesis are based on your p-value and your significance level.
Example:
• P-value = 0.01 This will happen 1 in 100 times by pure chance if your null hypothesis is true.
Not likely to happen strictly by chance.
1. P-value <= significance level (α) => Reject your null hypothesis in favor of your alternative
hypothesis. Your result is statistically significant.
2. P-value > significance level (α) => Fail to reject your null hypothesis. Your result is not
statistically significant.
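The two-way decision rule above amounts to a single comparison. A minimal sketch, where the function name `decide` and the returned wording are illustrative:

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is at most alpha."""
    if not (0.0 <= p_value <= 1.0) or not (0.0 < alpha < 1.0):
        raise ValueError("p_value must lie in [0, 1] and alpha in (0, 1)")
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.01))  # reject H0
print(decide(0.20))  # fail to reject H0
```

Note that the second outcome is "fail to reject", not "accept": the rule never proves the null hypothesis.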
Hypothesis testing is not set up so that you can absolutely prove a null hypothesis. Therefore, when you
do not find evidence against the null hypothesis, you fail to reject the null hypothesis. When you do find
strong enough evidence against the null hypothesis, you reject the null hypothesis. Your conclusions also
translate into a statement about your alternative hypothesis. When presenting the results of a
hypothesis test, include the descriptive statistics in your conclusions as well. Report exact p-values rather
than a certain range. For example, "The intubation rate differed significantly by patient age, with
younger patients having a lower rate of successful intubation (p=0.02)." Here are two more examples with
the conclusion stated in several different ways.
Example:
• H0: There is no difference in survival between the intervention and control group.
• H1: There is a difference in survival between the intervention and control group.
• α = 0.05; 20% increase in survival for the intervention group; p-value = 0.002
Conclusion:
• The difference in survival between the intervention and control group was statistically
significant.
• There was a 20% increase in survival for the intervention group compared to control (p=0.002).
Any time series can contain some or all of the following components:
1. Trend (T)
2. Cyclical (C)
3. Seasonal (S)
4. Irregular (I)
These components may be combined in different ways. It is usually assumed that they are multiplied or
added, i.e.,
yt = T × C × S × I
yt = T + C + S + I
To correct for the trend in the first case one divides the first expression by the trend (T). In the second
case it is subtracted.
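The trend correction described above can be sketched numerically. All figures are hypothetical, and for brevity only a trend and a seasonal component are combined (the cyclical and irregular components are left out).

```python
import math

trend = [100.0, 110.0, 120.0, 130.0]       # hypothetical trend values T
seasonal_factor = [0.9, 1.1, 0.9, 1.1]     # multiplicative seasonal component
seasonal_offset = [-5.0, 5.0, -5.0, 5.0]   # additive seasonal component

# Build the two kinds of series.
y_mult = [t * s for t, s in zip(trend, seasonal_factor)]  # y = T x S
y_add = [t + s for t, s in zip(trend, seasonal_offset)]   # y = T + S

# Correct for trend: divide in the multiplicative case, subtract in the additive case.
detrended_mult = [y / t for y, t in zip(y_mult, trend)]
detrended_add = [y - t for y, t in zip(y_add, trend)]

# What remains is the seasonal component in each case.
assert all(math.isclose(d, s) for d, s in zip(detrended_mult, seasonal_factor))
assert all(math.isclose(d, s) for d, s in zip(detrended_add, seasonal_offset))
```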
Trend component
The trend is the long term pattern of a time series. A trend can be positive or negative depending on
whether the time series exhibits an increasing long term pattern or a decreasing long term pattern. If a
time series does not show an increasing or decreasing pattern then the series is stationary in the mean.
Cyclical component
Any pattern showing an up and down movement around a given trend is identified as a cyclical pattern.
The duration of a cycle depends on the type of business or industry being analyzed.
Seasonal component
Seasonality occurs when the time series exhibits regular fluctuations during the same month (or months)
every year, or during the same quarter every year. For instance, retail sales peak during the month of
December.
Irregular component
This component is unpredictable. Every time series has some unpredictable component that makes it a
random variable. In prediction, the objective is to “model” all the components to the point that the only
4 a) What is a Chi-square test? Point out its applications. Under what conditions is this test applicable?
b) Discuss the types of measurement scales with examples.
Answer –
χ² = Σ (Oi − Ei)² / Ei, summed over i = 1, 2, …, n,
where O1, O2, O3, …, On are the observed frequencies and E1, E2, E3, …, En are the corresponding expected
or theoretical frequencies.
Application –
The Chi-Square test can also be applied for the discrete distributions. In using Chi-Square test, we
need no assumptions regarding the shape of sampling distributions. The applications of Chi- Square test
include testing:
CONDITIONS
The following are the conditions for using the Chi-Square test:
1. The frequencies used in Chi-Square test must be absolute and not in relative terms.
2. The total number of observations collected for this test must be large.
3. Each of the observations which make up the sample of this test must be independent of each other.
4. As the χ² test is based wholly on sample data, no assumption is made concerning the population
distribution. In other words, it is a non-parametric test.
6. The expected frequency of any item or cell must not be less than 5. If it is, the frequencies of adjacent
items or cells should be pooled together in order to make it 5 or more.
8. This test is used only for drawing inferences through test of the hypothesis, so it cannot be used for
estimation of parameter value.
5 Business forecasting acquires an important place in every field of the economy. Explain the
objectives and theories of Business forecasting.
Answer -
Business forecasting provides a guide to long-term strategic planning and helps to inform
decisions about scheduling of production, personnel and distribution. These are common statistical tasks
in business that are often done poorly and frequently confused with planning and setting of goals.
Forecasting of USB-ED introduces participants to forecasting techniques and provides a practical
understanding of the main forecasting tools used by economists, and business, marketing and financial
analysts.
This unique program is designed to provide a balanced mix of theory and practice with the aim of
equipping participants to become operational forecasters, capable of designing, implementing and
evaluating their own forecasting projects. The theories discussed will be cemented by hands-on sessions
in the computer laboratory using industry-standard forecasting software packages.
OBJECTIVE OF FORECASTING
To a very large extent, success or failure would depend upon the ability to successfully
forecast the future course of events. Without some element of continuity between past, present and
future, there would be little possibility of successful prediction. But history is not likely to repeat itself and
we would hardly expect economic conditions next year or over the next 10 years to follow a clear cut
prediction. Yet, past patterns prevail sufficiently to justify using the past as a basis for predicting the
future.
A businessman cannot afford to base his decisions on guesses. Forecasting helps a businessman
in reducing the areas of uncertainty that surround management decision making with respect to costs,
sales, production, profits, capital investment, pricing, expansion of production, extension of credit,
development of markets, increase of inventories and curtailment of loans. These decisions are to be
based on present indications of future conditions.
Several theories are commonly used in business forecasting. Under the theory of economic rhythm,
the available historical data are analyzed into their components, i.e. trend, seasonal, cyclical,
and irregular variations.
The propounders of the theory were of the view that the economic phenomenon behaves in a
rhythmic manner and cycles of nearly the same intensity and duration tend to recur.
The secular trend obtained from historical data is projected a number of years into the future on
a graph or with the help of mathematical trend equation. If the phenomenon is cyclical in behavior, the
trend should be adjusted for cyclical movements. When the forecast for a year is to be split into months
or quarters then the forecasters should adjust the projected figure for seasonal variations also with the
help of seasonal indices.
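The projection described above can be sketched in a few lines. This is only an illustration: the annual sales figures and the quarterly seasonal indices are hypothetical, a straight-line least-squares trend is assumed, and no cyclical adjustment is made.

```python
# Hypothetical annual sales; the figures are illustrative only.
years = [1, 2, 3, 4, 5]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

n = len(years)
mean_t = sum(years) / n
mean_y = sum(sales) / n

# Least-squares straight-line trend y = a + b*t.
numerator = sum((t - mean_t) * (y - mean_y) for t, y in zip(years, sales))
denominator = sum((t - mean_t) ** 2 for t in years)
b = numerator / denominator
a = mean_y - b * mean_t

# Project the secular trend one year into the future (t = 6).
annual_forecast = a + b * 6

# Hypothetical seasonal indices (in percent, averaging 100) for four quarters.
seasonal_indices = [90.0, 110.0, 120.0, 80.0]

# Split the annual figure into quarters and adjust each by its seasonal index.
quarterly_forecast = [annual_forecast / 4 * idx / 100 for idx in seasonal_indices]

print(annual_forecast, quarterly_forecast)
```

Because the assumed indices average 100, the quarterly figures add back up to the annual forecast.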
Action and Reaction theory is based on Newton’s third law of motion, i.e. for every action there
is an equal and opposite reaction. Applied to business forecasting, it implies that if there is a
depression in a particular field of business, there is bound to be a boom in it sooner or later. It reminds us
of the business cycle, which has four phases, i.e. prosperity, decline, depression and recovery. This
theory regards certain levels of business activity as normal and the forecasters have to estimate the
normal level carefully. According to this theory, if the price of a commodity goes above the normal level, it
must later come down below the normal level because of the increased production and supply of that
commodity.
Sequence theory or time lag method is based on the behavior of different businesses, which
show similar movements occurring successively but not simultaneously.
As such, this method takes into account the time lag based on the theory of lead-lag relationship,
which holds good in most cases. The series that usually change earlier serve as forecasts for other related
series. This way the element of risk is considerably reduced.
Assumptions
The results of a one-way ANOVA can be considered reliable as long as the following assumptions are
met:
Response variable are normally distributed (or approximately normally distributed).
Samples are independent.
Variances of populations are equal.
Responses for a given group are independent and identically distributed normal random variables (not a
simple random sample (SRS)).
Correction factor = $T^2/N = 150^2/15 = 1500$
SST (Total Sum of the Squares) = Sum of squares of all observations $-\ T^2/N$
$= 8^2 + 7^2 + 12^2 + 10^2 + \dots + 14^2 - 1500 = 1600 - 1500 = 100$
Sum of the Squares of Error between the columns (samples):
Q7 Distinguish between Classification and Tabulation. Explain the structure and components of a
Table with an example.
Meaning of Classification and Tabulation
Differences between Classification and Tabulation
Structure and Components of a Table with an example
Answer.
Meaning of Classification and Tabulation
Classification
According to Secrist, “Classification is the process of arranging data into sequences and groups according
to their common characteristics or separating them into different but related parts”. According to Stockton
and Clark, “The process of grouping large number of individual facts and observations, on the basis of
similarity among the items is called Classification”.
Tabulation
Tabulation follows classification. It is a logical or systematic listing of related data in rows and columns.
The row of a table represents the horizontal arrangement of data and column represents the vertical
arrangement of data. The presentation of data in tables should be simple, systematic and unambiguous.
The objectives of tabulation are to:
Simplify complex data
Highlight important characteristics
Present data in minimum space
Facilitate comparison
Bring out trends and tendencies
Facilitate further analysis
Differences between Classification and Tabulation
Table depicts the few differences between classification and tabulation.
Table: Differences between Classification and Tabulation
3. Its mean is µ and standard deviation is σ, where µ and σ are the parameters of the distribution
4. It is a bell-shaped curve and is symmetric about its mean, as depicted in figure.
Z1 = (x1 - µ)/σ
= (9 - 11.35)/3.03
= -0.78
Z2 = (x2 - µ)/σ = (17 - 11.35)/3.03
= 1.86
From tables
Area between z = -0.78 and z = 0 (by symmetry, the same as between z = 0 and z = 0.78) is 0.2823
Area between z = 0 and z = 1.86 is 0.4686
Area covered by the workers getting wages between Rs 9 and Rs 17
= 0.2823 + 0.4686
= 0.7509
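The table lookup above can be cross-checked numerically. The sketch below assumes only the mean (11.35) and standard deviation (3.03) used in the working, and computes the standard normal CDF from the error function instead of a printed table, so the result differs slightly from the table-based figure because z is not rounded to two decimals.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd = 11.35, 3.03   # values used in the worked example
low, high = 9.0, 17.0    # wage limits in Rs

z1 = (low - mean) / sd   # negative, because 9 lies below the mean
z2 = (high - mean) / sd

# Proportion of workers with wages between the two limits.
area = phi(z2) - phi(z1)

print(round(z1, 2), round(z2, 2), round(area, 4))
```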
Q9 a) The procedure of testing hypothesis requires a researcher to adopt several steps. Describe in
brief all such steps.
b) Distinguish between:
i. Stratified random sampling and Systematic sampling
ii. Judgment sampling and Convenience sampling
Hypothesis testing procedure
Differences
Answer.
Steps for procedure of testing hypothesis
Five Steps in Hypothesis Testing:
1. Specify the Null Hypothesis
2. Specify the Alternative Hypothesis
3. Set the Significance Level (α)
4. Calculate the Test Statistic and Corresponding P-Value
5. Drawing a Conclusion
The null hypothesis (H0) is a statement of no effect, relationship, or difference between two or more groups
or factors. In research studies, a researcher is usually interested in disproving the null hypothesis.
Examples:
There is no difference in intubation rates across ages 0 to 5 years.
The intervention and control groups have the same survival rate (or, the intervention does not
improve survival rate).
There is no association between injury type and whether or not the patient received an IV in the
prehospital setting
The alternative hypothesis (H1) is the statement that there is an effect or difference. This is usually the
hypothesis the researcher is interested in proving. The alternative hypothesis can be one-sided (only
provides one direction, e.g., lower) or two-sided. We often use two-sided tests even when our true
hypothesis is one-sided because it requires more evidence against the null hypothesis to accept the
alternative hypothesis.
Examples:
The intubation success rate differs with the age of the patient being treated (two-sided).
The time to resuscitation from cardiac arrest is lower for the intervention group than for the
control (one-sided).
There is an association between injury type and whether or not the patient received an IV in the
prehospital setting (two sided).
Step 3: Set the Significance Level (α)
The significance level (denoted by the Greek letter alpha, α) is generally set at 0.05. This means that there
is a 5% chance that you will accept your alternative hypothesis when your null hypothesis is actually true.
The smaller the significance level, the greater the burden of proof needed to reject the null hypothesis, or
in other words, to support the alternative hypothesis.
In another section we present some basic test statistics to evaluate a hypothesis. Hypothesis testing generally
uses a test statistic that compares groups or examines associations between variables. When describing a
single sample without establishing relationships between variables, a confidence interval is commonly
used.
The p-value describes the probability of obtaining a sample statistic as or more extreme by chance alone if
your null hypothesis is true. This p-value is determined based on the result of your test statistic. Your
conclusions about the hypothesis are based on your p-value and your significance level.
Example:
P-value = 0.01 This will happen 1 in 100 times by pure chance if your null hypothesis is true. Not
likely to happen strictly by chance.
1. P-value <= significance level (α) => Reject your null hypothesis in favor of your alternative
hypothesis. Your result is statistically significant.
2. P-value > significance level (α) => Fail to reject your null hypothesis. Your result is not statistically
significant.
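The two-case decision rule above can be written as a small helper. This is a minimal sketch; the function name `decide` and its return strings are illustrative choices, not part of any standard library.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is at most alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"


# The example p-value of 0.01 is below the usual 5% level.
print(decide(0.01))
print(decide(0.20))
```

Note that the rule never returns "accept H0": failing to reject is not the same as proving the null hypothesis.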
Hypothesis testing is not set up so that you can absolutely prove a null hypothesis. Therefore, when you
do not find evidence against the null hypothesis, you fail to reject the null hypothesis. When you do find
strong enough evidence against the null hypothesis, you reject the null hypothesis. Your conclusions also
translate into a statement about your alternative hypothesis. When presenting the results of a hypothesis
test, include the descriptive statistics in your conclusions as well. Report exact p-values rather than a certain
range. For example, "The intubation rate differed significantly by patient age with younger patients having a
lower rate of successful intubation (p=0.02)." Here is an example with the conclusion stated in
several different ways.
Example:
H0: There is no difference in survival between the intervention and control group.
H1: There is a difference in survival between the intervention and control group.
α = 0.05; 20% increase in survival for the intervention group; p-value = 0.002
Conclusion:
Reject the null hypothesis in favor of the alternative hypothesis.
The difference in survival between the intervention and control group was statistically significant.
There was a 20% increase in survival for the intervention group compared to control (p=0.002).
Difference between Stratified random sampling and Systematic sampling & Judgement sampling and
Convenience sampling
Stratified random sampling
This sampling design is most appropriate if the population is heterogeneous with respect to characteristic
under study or the population distribution is highly skewed. We subdivide the population into several
groups or strata such that:
i) Units within each stratum are more homogeneous
ii) Units between strata are heterogeneous
iii) Strata do not overlap, in other words, every unit of the population belongs to one and only one stratum
The criteria used for stratification are geographical, sociological, age, sex, income etc. The population of
size ‘N’ is divided into ‘k’ strata relatively homogenous of size N1, N2…….Nk such that ‘N1 + N2
+……… + Nk = N’.
Then, we draw a simple random sample from each stratum either proportional to size of stratum or equal
units from each stratum.
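A minimal sketch of proportional allocation followed by simple random sampling within each stratum is given below. The stratum sizes and the overall sample size are hypothetical, and the rounding step is an assumption that can leave the total one or two units off the target when the shares are not whole numbers.

```python
import random


def proportional_allocation(stratum_sizes, n):
    """Share a sample of size n among strata in proportion to their sizes."""
    total = sum(stratum_sizes)
    return [round(n * size / total) for size in stratum_sizes]


# Hypothetical population of N = 1000 units split into k = 3 strata.
stratum_sizes = [500, 300, 200]
allocation = proportional_allocation(stratum_sizes, n=50)

# Draw a simple random sample (without replacement) inside each stratum.
# Units in each stratum are labelled 1..N_i purely for illustration.
rng = random.Random(0)
samples = [
    rng.sample(range(1, size + 1), take)
    for size, take in zip(stratum_sizes, allocation)
]

print(allocation)
```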
Systematic sampling
This design is recommended if we have a complete list of sampling units arranged in some systematic order
such as geographical, chronological or alphabetical order.
Suppose the population size is ‘N’. The population units are serially numbered ‘1’ to ‘N’ in some systematic
order and we wish to draw a sample of ‘n’ units. Then we divide units from ‘1’ to ‘N’ into ‘K’ groups such
that each group has ‘n’ units. This implies ‘nK = N’ or ‘K = N/n’. From the first group, we select a unit at
random. Suppose the unit selected is the 6th unit; thereafter we select every Kth unit, i.e. units 6 + K,
6 + 2K, and so on. If ‘K’ is 20, ‘n’ is 5 and ‘N’ is 100 then units selected are 6, 26, 46, 66, 86.
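The selection rule can be checked with a short sketch. The function name is illustrative, and the random start is passed in explicitly so that the worked example (start at unit 6) can be reproduced.

```python
def systematic_sample(N, n, start):
    """Return the n selected unit numbers for a population numbered 1..N."""
    K = N // n  # sampling interval, K = N / n
    if not 1 <= start <= K:
        raise ValueError("the random start must lie in the first group, 1..K")
    # Take the start unit and then every K-th unit after it.
    return [start + i * K for i in range(n)]


print(systematic_sample(N=100, n=5, start=6))
```

In practice the start would be drawn at random from 1 to K, for example with `random.randint(1, K)`.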
Judgment sampling
The choice of sample items depends exclusively on the judgment of the investigator. The investigator’s
experience and knowledge about the population will help to select the sample units. It is the most suitable
method if the population size is small. The table depicts the merits and demerits of judgement sampling.
Convenience sampling
The sample units are selected according to the convenience of the investigator. It is also called “chunk”
which refers to the fraction of the population being investigated, which is selected neither by probability
nor by judgment. Moreover, a list or framework should be available for the selection of the sample. It is
used to make pilot studies. However, there is a high chance of bias being introduced.
Q10 a) What is regression analysis? How does it differ from correlation analysis?
b) Calculate Karl Pearson’s coefficient of correlation between X series and Y series.
x 110 120 130 120 140 135 155 160 165 155
y 12 18 20 15 25 30 35 20 25 10
Answer.
Meaning of Regression and Correlation
Regression analysis
According to M. M. Blair, Regression is defined as, “the measure of the average relationship between two
or more variables in terms of the original units of the data”. Regression analysis – in statistics, this includes
any technique for learning about the relationship between one or more dependent variables Y and one or
more independent variables X. Regression analysis is used to estimate the values of the dependent variables
from the values of the independent variables. Regression analysis is used to get a measure of the error
involved while using the regression line as a basis for estimation. The regression coefficient Y on X is the
coefficient of the variable ‘X’ in the line of regression Y on X. Regression coefficients are used to calculate
the correlation coefficient. The square of correlation is the product of regression coefficients.
Correlation
Correlation analysis attempts to study the relationship between the two variables ‘X’ and ‘Y’. In regression,
it is attempted to quantify the dependence of one variable on the other. For example, if there are two
variables ‘X’ and ‘Y’ and ‘Y’ depends on ‘X’, then the dependence is expressed in the form of the equations.
When two or more variables move in sympathy with the other, then they are said to be correlated. If both
variables move in the same direction, then they are said to be positively correlated. If the variables move in
the opposite direction, then they are said to be negatively correlated. If they move haphazardly, then there
is no correlation between them. Correlation analysis deals with the following:
Measuring the relationship between variables.
Testing the relationship for its significance.
Giving confidence interval for population correlation measure.
The correlation between two variables may be due to the following causes:
Due to small sample sizes, Correlation may be present in sample and not in population.
Due to a third factor, like in the case, Correlation between yield of rice and tea may be due to a
third factor - ‘rain’.
Differences
Correlation and regression analysis are related in the sense that both deal with relationships among
variables.
The correlation coefficient is a measure of linear association between two variables. Values of the
correlation coefficient are always between -1 and +1. A correlation coefficient of +1 indicates that two
variables are perfectly related in a positive linear sense, a correlation coefficient of -1 indicates that two
variables are perfectly related in a negative linear sense, and a correlation coefficient of 0 indicates that
there is no linear relationship between the two variables. The correlations term is used when
1) Both variables are random variables, and
2) The end goal is simply to find a number that expresses the relation between the variables
Regression analysis involves identifying the relationship between a dependent variable and one or more
independent variables. The regression term is used when
1) One of the variables is a fixed variable, and
2) The end goal is use the measure of relation to predict values of the random variable based on values of
the fixed variable
S. No    X      Y     XY      X²       Y²
1        110    12    1320    12100    144
2        120    18    2160    14400    324
3        130    20    2600    16900    400
4        120    15    1800    14400    225
5        140    25    3500    19600    625
6        135    30    4050    18225    900
7        155    35    5425    24025    1225
8        160    20    3200    25600    400
9        165    25    4125    27225    625
10       155    10    1550    24025    100
N=10   ∑X=1390   ∑Y=210   ∑XY=29730   ∑X²=196500   ∑Y²=4968
$$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}$$
$$r = \frac{10(29730) - (1390)(210)}{\sqrt{10(196500) - (1390)^2}\,\sqrt{10(4968) - (210)^2}}$$
$$r = \frac{297300 - 291900}{\sqrt{1965000 - 1932100}\,\sqrt{49680 - 44100}} = \frac{5400}{\sqrt{32900}\,\sqrt{5580}}$$
r ≈ 0.3985 Answer.
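The hand calculation can be verified with a short script. It uses only the X and Y series given in the question and the same raw-sums formula; the variable names are illustrative. It reproduces the column totals of the table and gives a coefficient close to 0.40.

```python
import math

x = [110, 120, 130, 120, 140, 135, 155, 160, 165, 155]
y = [12, 18, 20, 15, 25, 30, 35, 20, 25, 10]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Karl Pearson's coefficient from the raw sums.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2, round(r, 4))
```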
Business Barometers
Business indices are constructed to study and analyse the business activities on the basis of which future
conditions are predetermined. As business indices are the indicators of future conditions, they are also
known as ‘business barometers’ or ‘economic barometers’. With the help of these business barometers the
trend of fluctuations in business conditions is understood and a decision can be taken relating to the
problem by forecasting.
Time series analysis
Time series analysis is also used for the purpose of making business forecasting. The forecasting through
time series analysis is possible only when the business data of various years are available which reflects a
definite trend and seasonal variation. By time series analysis the long term trend, secular trend, seasonal
and cyclical variations are ascertained, analyzed and separated from the data of various years.
Extrapolation
Extrapolation is the simplest method of business forecasting. By extrapolation, a businessman finds out the
possible trend of demand of his goods and also about the future price trends. The accuracy of extrapolation
depends on two factors:
Knowledge about the fluctuations of the figures
Knowledge about the course of events relating to the problem under consideration
Regression analysis
The regression approach offers many valuable contributions to the solution of the forecasting problem. It
is the means by which we select from among the many possible relationships between variables in a
complex economy, which will be useful for forecasting.
Regression relationship may involve one predicted or dependent variable and one independent variable
under simple regression, or it may involve relationships between the variable to be forecasted and several
independent variables under multiple regressions.
Modern econometric methods
Econometric techniques, which originated in the eighteenth century, have recently gained popularity for
forecasting. Econometrics refers to the application of mathematical economic theories and statistical
procedures to economic data to verify economic theorems. Models take the form of a set of simultaneous
equations. The values of the constants in such equations are supplied by a study of statistical time series,
and a large number of equations may be necessary to produce an adequate model.
Q12 Construct Fisher’s Ideal Index for the given information and check whether Fisher’s formula
satisfies Time Reversal and Factor Reversal Tests.
Items P0 Q0 P1 Q1
A 16 5 20 6
B 12 10 18 12
C 14 8 16 10
D 20 6 22 10
E 80 3 90 5
F 40 2 50 5
Formula of Fishers Ideal Index
Computation of Fisher’s Ideal Index
Fisher’s formula satisfies Time Reversal Test
Fisher’s formula satisfies Factor Reversal Test
Answer.
Formula of Fishers Ideal Index
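This method is a combination of Laspeyres’ and Paasche’s methods. The geometric mean of
Laspeyres’ index and Paasche’s index gives the index suggested by Fisher. Fisher’s index number is given
by:
$$P_{01} = \sqrt{L \times P} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}}$$
(multiplied by 100 when it is expressed as a percentage)
Where,
L is Laspeyres’ price index and P is Paasche’s price index.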
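$$\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1} = \frac{20 \times 5 + 18 \times 10 + 16 \times 8 + 22 \times 6 + 90 \times 3 + 50 \times 2}{16 \times 5 + 12 \times 10 + 14 \times 8 + 20 \times 6 + 80 \times 3 + 40 \times 2} \times \frac{20 \times 6 + 18 \times 12 + 16 \times 10 + 22 \times 10 + 90 \times 5 + 50 \times 5}{16 \times 6 + 12 \times 12 + 14 \times 10 + 20 \times 10 + 80 \times 5 + 40 \times 5}$$
$$= \frac{100 + 180 + 128 + 132 + 270 + 100}{80 + 120 + 112 + 120 + 240 + 80} \times \frac{120 + 216 + 160 + 220 + 450 + 250}{96 + 144 + 140 + 200 + 400 + 200}$$
$$= \frac{910}{752} \times \frac{1416}{1180} = \frac{1288560}{887360} = 1.45212766$$
$$P_{01} = \sqrt{1.45212766} \approx 1.205$$
i.e. about 120.5 when expressed as a percentage. Answer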
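The question also asks whether Fisher's formula satisfies the Time Reversal and Factor Reversal tests. The sketch below is one way to check this numerically for the given data. It assumes the usual statements of the tests, namely P01 × P10 = 1 and P01 × Q01 = ∑p1q1 / ∑p0q0; the helper names are illustrative.

```python
import math

# Prices and quantities for items A to F, as given in the question.
p0 = [16, 12, 14, 20, 80, 40]
q0 = [5, 10, 8, 6, 3, 2]
p1 = [20, 18, 16, 22, 90, 50]
q1 = [6, 12, 10, 10, 5, 5]


def dot(a, b):
    """Sum of products of corresponding entries."""
    return sum(x * y for x, y in zip(a, b))


def fisher(base_p, base_q, cur_p, cur_q):
    """Fisher's ideal index (as a ratio) of the current period on the base period."""
    laspeyres = dot(cur_p, base_q) / dot(base_p, base_q)
    paasche = dot(cur_p, cur_q) / dot(base_p, cur_q)
    return math.sqrt(laspeyres * paasche)


P01 = fisher(p0, q0, p1, q1)   # price index, period 1 on period 0
P10 = fisher(p1, q1, p0, q0)   # time reversed: period 0 on period 1
Q01 = fisher(q0, p0, q1, p1)   # quantity index: prices and quantities swapped

value_ratio = dot(p1, q1) / dot(p0, q0)

print(round(P01, 4), round(P01 * P10, 6), round(P01 * Q01, 6), round(value_ratio, 6))
```

For this data the product P01 × P10 comes out as 1 and P01 × Q01 equals the value ratio 1416/752, so both tests are satisfied.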
Ans:- In some areas, such as mathematics or logic, results of some process can be known with
certainty (e.g., 2+3=5). Most real life situations, however, involve variability and uncertainty.
For example, it is uncertain whether it will rain tomorrow; the price of a given stock a week from
today is uncertain; the number of claims that a car insurance policy holder will make
over a one-year period is uncertain. Uncertainty or "randomness" (meaning variability of results)
is usually due to some mixture of two factors: (1) variability in populations consisting of animate
or inanimate objects (e.g., people vary in size, weight, blood type etc.), and (2) variability in
processes or phenomena (e.g., the random selection of 6 numbers from 49 in a lottery draw can
lead to a very large number of different outcomes; stock or currency prices fluctuate substantially
over time).
Variability and uncertainty make it more difficult to plan or to make decisions. Although they
cannot usually be eliminated, it is however possible to describe and to deal with variability and
uncertainty, by using the theory of probability. This course develops both the theory and
applications of probability.
Ans:-
Bayes’ theorem states that if A1, A2, ..., An are ‘n’ mutually exclusive and exhaustive
events with prior probabilities P(A1), P(A2), ..., P(An) respectively and ‘B’ be an event for which the
Ans:-
To test a hypothesis means to tell (on the basis of the data the researcher has collected) whether
or not the hypothesis seems to be valid. In hypothesis testing the main question is: whether to
accept the null hypothesis or not to accept the null hypothesis? Procedure for hypothesis testing
refers to all those steps that we undertake for making a choice between the two actions i.e.,
rejection and acceptance of a null hypothesis
(i) Making a formal statement: The step consists in making a formal statement of the null
hypothesis (H0) and also of the alternative hypothesis (Ha). This means that hypotheses should be
clearly stated, considering the nature of the research problem. For instance, Mr. Mohan of the
Civil Engineering Department wants to test the load bearing capacity of an old bridge which
must be more than 10 tons. In that case he can state his hypotheses as under:
Take another example. The average score in an aptitude test administered at the national level is
80. To evaluate a state’s education system, the average score of 100 of the state’s students selected
on random basis was 75. The state wants to know if there is a significant difference between
the local scores and the national scores. In such a situation the hypotheses may be stated as under:
Null Hypothesis H0: µ = 80
Alternative Hypothesis Ha: µ ≠ 80
The formulation of hypotheses is an important step, which must be accomplished with due care
in accordance with the object and nature of the problem under consideration. It also indicates
whether we should use a one-tailed test or a two-tailed test. If Ha is of the type greater than (or of
the type lesser than), we use a one-tailed test, but when Ha is of the type “whether greater or
smaller”, then we use a two-tailed test.
(ii) Selecting a significance level: The hypotheses are tested on a pre-determined level of
significance and as such the same should be specified. Generally, in practice, either 5% level or
1% level is adopted for the purpose. The factors that affect the level of significance are
Ques.16. a) What is a Chi-square test? Point out its applications. Under what
conditions is this test applicable?
b) What are the components of time series? Enumerate the methods of determining
trend in time series.
Ans:-
The Chi-square test is one of the most commonly used non-parametric tests in statistical work.
The Greek letter χ² is used to denote this test. χ² describes the magnitude of discrepancy
between the observed and the expected frequencies. The value of χ² is calculated as:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
Where, O1, O2, O3, ..., On are the observed frequencies and E1, E2, E3, ..., En are the
corresponding expected or theoretical frequencies.
In the test for independence, the null hypothesis is that the row and column variables are
independent of each other. We have studied earlier, that the hypothesis testing is done under the
assumption that the null hypothesis is true.
The following are the properties of the test for independence:
The data are the observed frequencies
The data is arranged in the form of a contingency table
The degrees of freedom ‘ν’ can be calculated as:
ν = (Number of rows − 1) × (Number of columns − 1), where ‘ν’ is the degrees of freedom
The test for independence has a Chi-Square distribution and is always a right tail test.
The expected value is computed by taking the row total, multiplying it
with the column total and dividing by the grand total. That is given by:
$$E = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$
Condition
The following are the conditions for using the Chi-Square test:
1. The frequencies used in Chi-Square test must be absolute and not in relative terms.
2. The total number of observations collected for this test must be large.
3. Each of the observations which make up the sample of this test must be independent of each
other.
4. As the χ² test is based wholly on sample data, no assumption is made concerning the population
distribution. In other words, it is a non-parametric test.
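The test for independence described above can be sketched as follows. The 2 × 2 table of observed frequencies is hypothetical. The code computes each expected frequency as row total × column total / grand total, sums (O − E)²/E over all cells, and takes (rows − 1)(columns − 1) degrees of freedom.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical observed frequencies (absolute counts, not percentages).
observed = [[10, 20],
            [20, 10]]
chi2, df = chi_square_independence(observed)
print(round(chi2, 4), df)
```

The statistic is then compared with the tabulated right-tail critical value for the chosen significance level; for 1 degree of freedom at the 5% level that value is 3.841.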
Any time series can contain some or all of the following components:
1. Trend (T)
2. Cyclical (C)
3. Seasonal (S)
4. Irregular (I)
These components may be combined in different ways. It is usually assumed that they are
multiplied or added, i.e.,
yt = T × C × S × I
yt = T + C + S + I
To correct for the trend in the first case one divides the first expression by the trend (T). In the
second case it is subtracted.
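A small numerical illustration of the two corrections is given below. Both the observed series and the trend values are hypothetical; in practice the trend would first be estimated, for example by a moving average or a fitted line.

```python
# Hypothetical observed values and their estimated trend values.
y = [110.0, 126.0, 99.0, 132.0]
trend = [100.0, 105.0, 110.0, 120.0]

# Multiplicative model: dividing by T leaves C x S x I (a ratio around 1).
detrended_ratio = [obs / t for obs, t in zip(y, trend)]

# Additive model: subtracting T leaves C + S + I (a deviation around 0).
detrended_diff = [obs - t for obs, t in zip(y, trend)]

print([round(v, 2) for v in detrended_ratio])
print(detrended_diff)
```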
Trend component
The trend is the long term pattern of a time series. A trend can be positive or negative depending
on whether the time series exhibits an increasing long term pattern or a decreasing long term
pattern. If a time series does not show an increasing or decreasing pattern then the series is
stationary in the mean.
Cyclical component
Any pattern showing an up and down movement around a given trend is identified as a cyclical
pattern. The duration of a cycle depends on the type of business or industry being analyzed.
Seasonal component
Seasonality occurs when the time series exhibits regular fluctuations during the same month (or
months) every year, or during the same quarter every year. For instance, retail sales peak during the
month of December.
Irregular component
This component is unpredictable. Every time series has some unpredictable component that
makes it a random variable. In prediction, the objective is to "model" all the components to the point that
the only component that remains unexplained is the random component.
Ques.17 What do you mean by cost of living index? Discuss the methods of
construction of cost of living index with an example for each.
Ans:-
Cost of living is the cost of maintaining a certain standard of living. A cost-of-living index is
a price index that measures relative cost of living over time. Such indexes are constructed
to have a value of 100 in a given year (or period or place), called the base. An index value of
110 indicates that the current cost of living is ten percent higher than in the base year.
Because the index provides measure of the change in the cost of living, it has no units.
The construction of the price index numbers involves the following steps or problems:
1. Selection of Base Year:
The first step or the problem in preparing the index numbers is the selection of the base year. The
base year is defined as that year with reference to which the price changes in other years are
compared and expressed as percentages. The base year should be a normal year. In other words, it
should be free from abnormal conditions like wars, famines, floods, political instability, etc.
The base year can be selected:
(a) through fixed base method in which the base year remains fixed; and
(b) through chain base method in which the base year goes on changing, e.g., for 1980 the base
year will be 1979, for 1979 it will be 1978, and so on.
2. Selection of Commodities:
The second problem in the construction of index numbers is the selection of the commodities.
Since all commodities cannot be included, only representative commodities should be selected
keeping in view the purpose and type of the index number.
(a) The items should be representative of the tastes, habits and customs of the people.
(c) Items should be stable in quality over two different periods and places.
(d) The economic and social importance of various items should be considered
(f) All those varieties of a commodity which are in common use and are stable in character should
be included.
3. Collection of Prices:
After selecting the commodities, the next problem is regarding the collection of their prices:
(a) Prices are to be collected from those places where a particular commodity is traded in large
quantities.
(b) Published information regarding the prices should also be utilized.
(c) In selecting individuals and institutions who would supply price quotations, care should be
taken that they are not biased.
(d) Selection of wholesale or retail prices depends upon the type of index number to be prepared.
Wholesale prices are used in the construction of general price index and retail prices are used in
the construction of cost-of-living index number.
Ques.18 a) What is analysis of variance? What are the assumptions of this technique?
b) Three samples below have been obtained from normal populations with equal
variances. Test the hypothesis at 5% level that the population means are equal.
A B C
8 7 12
10 5 9
7 10 13
14 9 12
11 9 14
Ans:-
The results of a one-way ANOVA can be considered reliable as long as the following
assumptions are met:
Correction factor = $T^2/N = 150^2/15 = 1500$
SST (Total Sum of the Squares) = Sum of squares of all observations $-\ T^2/N$
$= 8^2 + 7^2 + 12^2 + 10^2 + \dots + 14^2 - 1500 = 1600 - 1500 = 100$
Sum of the Squares between the columns (samples):
SSC = $\frac{50^2}{5} + \frac{40^2}{5} + \frac{60^2}{5} - 1500 = 1540 - 1500 = 40$
Sum of the squares of the Error within columns (samples):
SSE = SST – SSC = 100 – 40 = 60
Variance between samples:
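The whole one-way ANOVA calculation for samples A, B and C can be reproduced with the sketch below, which follows the same correction-factor route as the working above. The variable names are illustrative, and the 5% critical value of about 3.89 for (2, 12) degrees of freedom is quoted from standard F tables, not computed.

```python
samples = {
    "A": [8, 10, 7, 14, 11],
    "B": [7, 5, 10, 9, 9],
    "C": [12, 9, 13, 12, 14],
}

all_obs = [v for values in samples.values() for v in values]
N = len(all_obs)        # total number of observations
k = len(samples)        # number of samples (columns)
T = sum(all_obs)        # grand total

correction_factor = T ** 2 / N
sst = sum(v * v for v in all_obs) - correction_factor
ssc = sum(sum(vals) ** 2 / len(vals) for vals in samples.values()) - correction_factor
sse = sst - ssc

msc = ssc / (k - 1)     # variance between samples
mse = sse / (N - k)     # variance within samples
F = msc / mse

F_CRITICAL_5PCT = 3.89  # table value for (2, 12) degrees of freedom
print(correction_factor, sst, ssc, sse, msc, mse, F, F > F_CRITICAL_5PCT)
```

With these data the computed F ratio is slightly above the table value, so the hypothesis of equal population means would be rejected at the 5% level.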
Ans- c
2. Out of the following, which one does not refer to a mass of data?
a) Banking Statistics
b) Mathematical Statistics
c) Agricultural Statistics
d) Income Statistics
Ans- b
Ans- c
Ans- a
6. According to the definition of Statistics given by Croxton and Cowden, what are the four
components of Statistics?
Ans- The four components of Statistics are collection, presentation, analysis and interpretation of
data.
Ans- b
Ans- a) No
b) No
12. The total sale of a product in Area A is 840 for 30 working days. The total sale of the same product in
Area B is 784 for 28 working days. Should Statistics be applied to get an appropriate picture regarding
the comparison of sales?
Ans- Yes
Ans- Planning
Ans- Yes
Ans- i) statistics
23. State whether each of the following variables is qualitative or quantitative and indicate the
measurement scale that is appropriate for each.
i) Age
ii) Gender
iii) Class Rank
iv) Make of automobile
v) Number of people favouring the death penalty
Ans- i) Quantitative, ratio ii) Qualitative, nominal iii) Qualitative, ordinal iv) Qualitative, nominal v) Quantitative, ratio
24. State whether each of the following variables is qualitative or quantitative and indicates the
measurement scale that is appropriate for each.
i) Annual sales
ii) Soft drink size (small, medium, large)
iii) Employee classification (GSI through GSIS)
iv) Earning per share
v) Methods of payments (cash, check, credit card)
Ans- i) Quantitative, ratio ii) Qualitative, nominal iii) Qualitative, ordinal iv) Quantitative, ratio v) Qualitative, nominal
25. Classification is a systematic __________ of the units according to their ____________ __________.
Ans- Bulk
Ans- Attribute
Ans- Two
Ans- i) location
33. The data that can be classified on the basis of time is:
i) Geographical ii) Chronological iii) Qualitative iv) Quantitative
Solution–Chronological classification
Solution–Frequency distribution
Solution–One characteristic
38. The headings of the rows given in the first column of a table are called:
a) Stubs b) Captions
c) Titles d) Reference notes
Solution-Stubs
Solution–Place
Solution –Qualitative
Solution –Classification
44. i) If the data readings are 3, 4, 5, 6, 7, then it is called _________ variable. Height is generally
__________ variable.
ii) There are ____________ derived frequency distributions for any frequency distribution.
iii) Width of class-interval is given by the difference between ________ and ______.
iv) There are ________ marginal distributions for a distribution.
v) __________ formula is used to calculate the number of class-intervals.
vi) The relative frequency distribution is obtained from frequency distribution by calculating
___________.
47. The diagram used to show a percentage breakdown is
i) A circle ii) A square iii) A pie iv) A rectangle
(b) The average computed by considering the relative importance of each of values to the total
value, is called i) arithmetic mean ii) geometric mean iii) weighted arithmetic mean iv) harmonic
average.
53. State whether the following questions are true ‘T’ or false ‘F’.
i) Mode is based on all values
ii) Mode = 3 Median – Mean
iii) Geometric mean is used when we are interested in rate of growth of any phenomena.
iv) Harmonic mean exists if one of the values is zero.
v) A.M < G.M < H.M for any two values ‘a’ and ‘b’.
vi) Arithmetic mean can be calculated accurately even when the distribution has open-end class.
vii) Mode can be located graphically.
54. If the values of the variables are arranged in ascending order of magnitude, the middle term is
i) mean ii) mode iii) median iv) quartile
Ans- ii) 24
58. If assumed mean A = 32.5, h = 8, $\sum fd = -13$ and $\sum f = 90$
i) mean = 35.31 ii) mean=31.35 iii) mean = 33.15 iv) mean=35.35
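As a quick check, the step-deviation formula mean = A + h × (Σfd / Σf) can be evaluated directly. This is a minimal sketch that assumes the garbled symbols in the question stand for Σfd and Σf; the variable names are illustrative.

```python
# Step-deviation (assumed mean) method: mean = A + h * (sum of f*d) / (sum of f)
A = 32.5       # assumed mean
h = 8          # class width
sum_fd = -13   # sum of f*d (assumed reading of "fd")
sum_f = 90     # total frequency (assumed reading of "f")

mean = A + h * sum_fd / sum_f
print(round(mean, 4))
```

The result is roughly 31.34, which is closest to option ii).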
59. In any distribution when the original items differ in size, the value of AM, GM and HM would also
differ in the following order
i) AM>GM>HM ii) AM=GM=HM iii) AM<HM<GM iv)AM.GM>HM
Ans- i) AM>GM>HM
Ans- i – T, ii – T, iii – F
61. State whether the following questions are true, ‘T’ or false, ‘F’.
i) The cost of living index numbers calculated are based on weighted averages.
ii) Many of the items which we use in our life can be assigned weights.
Ans- i- T, ii – T T-True
64. Given P(A) = 0.6, P(B) = 0.7, and P(A ∩ B) = 0.5. Find P(A ∪ B).
Ans- 0.8
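The answer follows from the addition rule of probability; the worked step is:

```latex
P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.6 + 0.7 - 0.5 = 0.8
```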
66. State whether the following statements are true ‘T’ or false ‘F’.
i) The sum of probabilities sometimes will be greater than 1.
ii) The amount of time you study for an exam is a discrete random variable.
iii) The Bernoulli distribution has only one parameter ‘p’.
68. State whether the following statements are true ‘T’ or false ‘F’.
i) ‘X’ is a Poisson variate if p < 0.1 and n > 10
ii) Example of bimodal distribution is Poisson distribution
Ans- i- T, ii- T
69. State whether the following statements are true ‘T’ or false ‘F’.
i) Quartile deviation of normal distribution is 4/ 5
ii) Mean and standard deviation of a standard normal distribution are ‘1’ and ‘0’
iii) Mean, median and mode coincide in a normal distribution
Ans- i- T, ii- F, iii- T, iv- T, v- F, vi- T, vii- T, viii - T, ix- T, x- T, xi- F, xii- F
71. State whether the following statements are true ‘T’ or false ‘F’.
i) Sample in which units are selected by judgment is known as probability sample.
ii) Judgment sampling does not give representativeness of a sample.
iii) Large sample size always results in minimising the standard error.
iv) A sampling plan that divides the population into well-defined groups from which random
samples are drawn is known as cluster sampling.
v) The principles of simple random sampling are the theoretical basis for statistical inference.
vi) If the mean of a certain population is 20, it is likely that most of the sample means will be 20.
vii) Any sampling distribution can be totally described by its mean and standard deviation.
ix) The central limit theorem assures that the sampling distribution of mean is always normal.
x) Stratified sampling is used when each group considered are more homogenous within itself and
heterogeneous between group.
72. Madhu, a frugal student, wants to buy a used bike. After randomly selecting 125 wanted
advertisements, he found the average price of the bike to be Rs. 3250 with a standard deviation of
Rs. 615. Establish an interval estimate for the average price of bike so that Madhu can be:
i) 68.3% certain that the population mean lies in this interval.
ii) 95.5% certain that the population mean lies in this interval.
Ans-
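A minimal sketch of the calculation, assuming the usual large-sample reading in which 68.3% and 95.5% correspond to one and two standard errors on either side of the sample mean. The helper name `interval` is illustrative.

```python
import math

n = 125            # number of advertisements sampled
x_bar = 3250.0     # sample mean price (Rs.)
s = 615.0          # sample standard deviation (Rs.)

# Standard error of the mean
se = s / math.sqrt(n)


def interval(z):
    """Interval estimate x_bar +/- z standard errors."""
    return (x_bar - z * se, x_bar + z * se)


ci_68 = interval(1)   # about 68.3% confidence
ci_95 = interval(2)   # about 95.5% confidence
print(round(se, 2), ci_68, ci_95)
```

The standard error comes to about Rs. 55.01, giving roughly Rs. 3194.99 to Rs. 3305.01 for i) and Rs. 3139.99 to Rs. 3360.01 for ii).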
73. Given the following confidence levels, express the lower and upper limits of the confidence
interval for these levels in terms of $\bar{x}$ and $\sigma_{\bar{x}}$ (use the normal distribution tables).
i) 54 percent
ii) 75 percent
iii) 94 percent
iv) 98 percent
Ans-
74. For the following sample sizes and confidence levels, find the approximate ‘t’ values for
constructing confidence intervals (use the ‘t’ table).
i) n = 28; 95%
ii) n = 8; 98%
iii) n = 13; 90%
iv) n = 25; 95%
Ans-. i) 2.052
ii) 2.998
iii) 1.782
iv) 2.064
75. i) Null hypothesis states that there is a significant difference between observed and hypothetical
values. (True/False)
ii) 1% level of significance means we are ready to reject a true hypothesis in 99% of cases.
(True/False)
iii) If the Null hypothesis Ho: = s or Ho: p = ps or Ho: 1= 2 or Ho: p1 = p2 then it is two-tailed
test. (True/False)
iv) If the calculated value of a statistic is not in the rejection region R, then Ho is accepted.
(True/False)
vi) If n1 = 300, n2 = 500, $\bar{x}_1$ = 50, $\bar{x}_2$ = 60, $\sigma_1$ = 10, $\sigma_2$ = 12 are the results of two samples taken from two cities
A and B, then we test for the difference between means under different populations. (True/False)
vii) If n < 30, then we do not apply z test unless, population S.D is known. (True/False)
Ans- i. False
ii. False
iii. True
iv. True
v. True
vi. True
vii. True
iii) ‘t’ distribution has ___________ areas at the tail than normal distribution.
iv) The mean and variance of the ‘t’ distribution are ________ and ________.
Ans- i. Continuous
ii. Degrees of freedom
iii. Larger
iv. Zero, greater than one
Qno - 1. The means of two samples of sizes 50 and 100 are 54.1 and 50.3 respectively, and their
standard deviations are 8 and 7 respectively. Obtain the SD for the combined group.
Answer -
Qno The mean wage is Rs. 75 per day, SD wage is Rs. 5 per day for a group of 1000 workers and the
same is Rs. 60 and Rs. 4.5 for the other group of 1500 workers. Find mean and standard deviation for
the entire group.
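Both this question and the previous one use the combined mean and combined standard deviation of two groups. The sketch below applies the standard formula, assuming each SD was computed with divisor n; the function name `combine` is illustrative.

```python
import math


def combine(n1, m1, s1, n2, m2, s2):
    """Combined mean and SD of two groups from their sizes, means and SDs."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # Each group contributes its own variance plus the squared shift of its mean
    var = (n1 * (s1 ** 2 + (m1 - mean) ** 2)
           + n2 * (s2 ** 2 + (m2 - mean) ** 2)) / n
    return mean, math.sqrt(var)


# Two samples of sizes 50 and 100
samples_mean, samples_sd = combine(50, 54.1, 8, 100, 50.3, 7)
# Two groups of 1000 and 1500 workers
wage_mean, wage_sd = combine(1000, 75, 5, 1500, 60, 4.5)
print(round(samples_mean, 2), round(samples_sd, 2))
print(round(wage_mean, 2), round(wage_sd, 2))
```

Under these assumptions the combined SD of the two samples is about 7.56, and the wage data give a combined mean of Rs. 66 with an SD of about Rs. 8.73.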
Qno The arithmetic means of the runs scored by three batsmen are 50, 48 and 12 respectively. The SDs of their
runs are 15, 12 and 2 respectively. Who is the most consistent of the three batsmen? If one of
these three is to be selected, who should be selected?
Evaluation Criteria
1. A lower CV indicates a more consistent player; hence the most consistent player is Player C.
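The comparison can be sketched as follows. The labels A, B and C for the three batsmen are an assumption taken from the order in which the figures are listed.

```python
# Coefficient of variation: CV = (SD / mean) * 100
batsmen = {
    "A": {"mean": 50, "sd": 15},
    "B": {"mean": 48, "sd": 12},
    "C": {"mean": 12, "sd": 2},
}

cv = {name: b["sd"] / b["mean"] * 100 for name, b in batsmen.items()}
most_consistent = min(cv, key=cv.get)   # smallest CV = most consistent

for name, value in cv.items():
    print(name, round(value, 2))
print("Most consistent:", most_consistent)
```

The CVs work out to 30%, 25% and about 16.67%. Consistency alone favours the third batsman, although his average of 12 runs is much lower than the others, which matters if the selection is based on scoring ability.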
Qno A student while computing the coefficient of variation obtained the mean and SD of 100
observations as 40 and 5.1 respectively. It was later discovered that he had wrongly copied an
observation as 50 instead of 40. Calculate the correct coefficient of variation.
Answer –
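One way to work this out is to recover Σx and Σx² from the reported mean and SD, swap the wrong value for the right one, and recompute. This is a sketch that assumes the SD was computed with divisor n.

```python
import math

n, mean, sd = 100, 40.0, 5.1
wrong, right = 50, 40

# Recover the totals implied by the reported mean and SD
total = n * mean
total_sq = n * (sd ** 2 + mean ** 2)

# Replace the wrongly copied observation
corrected_total = total - wrong + right
corrected_total_sq = total_sq - wrong ** 2 + right ** 2

corrected_mean = corrected_total / n
corrected_sd = math.sqrt(corrected_total_sq / n - corrected_mean ** 2)
corrected_cv = corrected_sd / corrected_mean * 100
print(round(corrected_mean, 2), round(corrected_sd, 2), round(corrected_cv, 2))
```

The corrected mean is 39.9 and the corrected SD is 5, so the corrected coefficient of variation is about 12.53%.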
Qno. The mean and SD of 21 observations are 30 and 5 respectively. It was subsequently noted that one
of the observations 10 was incorrect. Omit it and determine the mean and SD of the rest.
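A sketch of the calculation, again assuming the SD uses divisor n: remove the incorrect observation from Σx and Σx² and recompute over the remaining 20 values.

```python
import math

n_obs, old_mean, old_sd = 21, 30.0, 5.0
dropped = 10   # the incorrect observation to omit

sum_x = n_obs * old_mean - dropped
sum_x_sq = n_obs * (old_sd ** 2 + old_mean ** 2) - dropped ** 2

m = n_obs - 1
new_mean = sum_x / m
new_sd = math.sqrt(sum_x_sq / m - new_mean ** 2)
print(new_mean, round(new_sd, 2))
```

For the remaining 20 observations this gives a mean of 31 and an SD of about 2.29.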
MB 40 MCQS
Q1.In business context, managers are required to justify decisions on the basis of_______.
ANS - Data
Q5. According to______________, “Statistics is the science of collection, presentation, analysis and
interpretation of numerical data from logical analysis”.
Q6. Qualitative data deals with meanings while quantitative data deals with________.
ANS – Numbers.
ANS - Population
Q10. Statistics is the art and science of collecting, analysing, presenting and____________.
Q12. In which of the following situations would you like to use Statistics?
a) Buying a house
Q13. Out of the following, which one does not refer to a mass of data?
a) Banking Statistics
b) Mathematical Statistics
c) Agricultural Statistics
d) Income Statistics
a) State
b) Commerce
c) Economics
d) Industry
Ans - a) State
Q17._____________, “Statistics is a science which deals with the method of collecting, classifying,
presenting, comparing and interpreting the numerical data to throw light on enquiry”.
Q18. According to the definition of Statistics given by Croxton and Cowden, what are the four
components of Statistics?
Ans - The four components of Statistics are collection, presentation, analysis and interpretation of data.
Qno19. ‘Statistics may be called the science of counting’ is the definition given by
a) Croxton
b) A.L.Bowley
c) Boddington
d) Webster
Ans - b) A. L. Bowley
Q21. Mention some other areas where there is a scope of applying statistics.
Ans - Industrial Quality control, Investment policies, to find market potential for a product.
Q22 Answer the following:
a) Should the same degree of accuracy be applied while measuring the height of a mountain and the
height of a person?
Ans - a) No
b) No
b) Qualitative data
Q24. The total sale of a product in Area A is 840 for 30 working days. The total sale of the same product
in Area B is 784 for 28 working days. Should Statistics be applied to get an appropriate picture regarding
the comparison of sales?
Ans - . Yes
ANS – Execution
Q29. Sample can never be larger than the population from which the__________.
Ans - Sample was taken
Ans - An interval scale is a scale of measurement where the distance between any two adjacent units of
measurement (or 'intervals') is the same but the zero point is arbitrary.
Q32. Data collected for the first time keeping in view the objective of the survey is_________.
ANS - Planning
ANS - Yes
iii) The weight of new born babies measured up to first decimal place in a state during the first week of
February 2008
Q41. Statistics can best be considered as i) both Art and Science ii) Art iii) Science iv) neither Art nor
Science
Q42. Data that possess numerical properties are known as i) Quantitative data ii) Qualitative data iii)
Primary data iv) Parametric data
Q43. A tool of all science in research and making an intelligent judgement is i) Statistics ii) Collection iii)
Data iv) Judgement
ANS - i) Statistics
i) An official of the Census Board of India is preparing a report on census of population based on the
survey data that is collected by the Census Board.
ii) An HR representative of a software company is deciding on the time taken to perform a particular job
on a project on the basis of random observations collected by him.
iii) A neurologist is examining the relationship between cigarette smoking and brain tumor based on the
data published in a famous neurology journal.
i) sample method
ii) TV News Bulletins gather information on any event through their agents.
Q47. 13. State whether each of the following variables is qualitative or quantitative.
i) Age
ii) Gender
Q48. State whether each of the following variables is qualitative or quantitative and indicates the
measurement scale that is appropriate for each.
i) Annual sales
iv) Earning per share v) Methods of payments (cash, check, credit card)
ANS - i) Quantitative, Ratio, ii) Qualitative, Nominal, iii) Qualitative, Ordinal, iv) Quantitative, Ratio, v)
Qualitative, Nominal
Q49. The Colgate-Palmolive Company started as a small soap and candle shop in __________ in 1806
Q51. Cumulative frequency distribution is a frequency distribution that indicates ___________that lie
above or below the specified values of the class intervals.
Q57. Pie chart is a graphical device for depicting data summaries based on the _________into sector
that corresponds to the relative frequency for each class.
Ans - Bulk
Ans - Attributes
Q61. Classification done according to two attributes or variables is _________.
Ans - Two
Q64. Geographical classification means classification of data according to: i) Location ii)
Time iii) Attributes iv) Class intervals
Ans - i) location
Q65. Classification is a process of arranging the data into: i) Different columns ii) Different rows
iii) Different rows and columns iv) Groups of related facts in different classes
Q66. The data that can be classified on the basis of time is: i) Geographical ii) Chronological
iii) Qualitative iv) Quantitative
Q67. State True or False i. Tabulation presents the data in a minimum space. ii. Tabulation is a process
of analysis iii. General purpose table deals with specific objectives. iv. Derived tables deal with total,
percentages, ratios, etc
Q68. i) If the data readings are 3, 4, 5, 6, 7, then it is called _________ variable. Height is generally
__________ variable. ii) There are ____________ derived frequency distributions for any frequency
distribution. iii) Width of class-interval is given by the difference between ________ and ______. iv)
There are ________ marginal distributions for a distribution. v) __________ formula is used to calculate
the number of class-intervals. vi) The relative frequency distribution is obtained from frequency
distribution by calculating ___________.
ii) Five
Q69. i) Diagrams give an accurate value. (True/False) ii) Pie diagram is drawn according to degree
subtended at the centre of a circle. (True/False)
Q71. The diagram used to show a percentage breakdown is i) A circle
ii) A square iii) A pie diagram iv) A rectangle
Ans - Extremes
Q75. Arithmetic mean is defined as the sum of all values divided by_________.
i. Range (R)
ii_________________
Ans - Observations
78. Inter-quartile range is the difference between the third quartile and_________.
Q79 Percentile values divide the distribution into 100 parts of__________.
i. For a given set of values if we add a constant 5 to every value, then the arithmetic mean is
affected.
ii. Arithmetic mean can be calculated for distribution with open-end classes.
iii. Arithmetic mean is affected by extreme values.
iv. Arithmetic mean of 12, 16, 23, 25, 28, 32 is 22.
Q81. A single value within the range of the entire mass of data that is used to represent the whole data
is i) Measures of Central tendency
ii) Statistics
iv) Skewness
Q82.
ANS –
(b) The average computed by considering the relative importance of each of values to the total value, is
called
i) Arithmetic mean ii) Geometric mean iii) Weighted arithmetic mean iv) Harmonic average.
iii) Geometric mean is used when we are interested in rate of growth of any phenomena.
iv) Harmonic mean exists if one of the values is zero. v) A.M < G.M < H.M for any two values ‘a’ and ‘b’.
vi) Arithmetic mean can be calculated accurately even when the distribution has open-end class.
vii) Mode can be located graphically. viii) Mode is used when data is on interval scale.
ANS - i) False, ii) False, iii) True, iv) False, v) False, vi) False, vii) True, viii) True
84. If the values of the variables are arranged in ascending order of magnitude, the middle term is
ANS - median
85. The relation between mean, median and mode is given by i) Mode= 3 Median-2 Mean ii) Mode=2
Mean-Median iii) Mode= 3Median –Mean iv) Mode= Mean- Median
ANS - ii) 24
87. If assumed mean A=32.5, i=8, fd =-13 and f= 90 i) mean = 35.31 ii) mean=31.35 iii) mean =
33.15 iv) mean=35.35
Q88. In any distribution when the original items differ in size, the value of Arithmetic mean (AM),
Geometric mean (GM) and Harmonic mean (HM) would also differ in the following order
i) AM>GM>HM
ii) AM=GM=HM
iii) AM<HM<GM
iv) AM.GM>HM
ANS - i) AM>GM>HM
Q89. State whether the following questions are ‘True’ or ‘False’. i) Quartiles are positional value. ii)
Quartiles help us to find percentage of readings below or above a certain value. iii) Q2 = P50 = D7 =
Median
Q90. State whether the following questions are ‘True’ or ‘False’. i) The cost of living index numbers
calculated are based on weighted averages. ii) Many of the items which we use in our life can be
assigned weights.
ii. Standard deviation of a set of values is increased if every value of the set is increased by a constant.
iii. Standard deviation can be calculated for distributions with open-end classes.
iv. Coefficient of variation can be used to compare the variability of two sets of data measuring the same
characteristics.
Q92. In an interview conducted by a company, if the probability that an interviewed person is male is
2/3 and female is 1/3. Find the mean and variance of the distribution.
ANS - Solution Let, ‘X’ denote gender of the interviewed person. If interviewed person is male then X
takes value 1 and if interviewed person is a female X takes value 0, with probabilities 2/3 and 1/3
respectively (i.e., p+q=2/3 +1/3=1). And X follows Bernoulli distribution as shown in the following table:
Mean of Bernoulli distribution is E(X) = p = 2/3 and variance is V(X) = pq = (2/3)(1/3) = 2/9.
ii) The amount of time you study for an exam is a discrete random variable.
iii) If the mean and variance of a Binomial distribution are 6 and 5, then p = 1/6. iv) Each trial in a
binomial experiment has the different probability of success ‘p’.
Q95. State whether the following statements are ‘True’ or ‘False’ i) ‘X’ is a Poisson variate if p < 0.1 and
n > 10. ii) Poisson distribution is a unimodal distribution.
ii) Mean and standard deviation of Standard normal distribution are ‘1’ and ‘0’.
Therefore, the probability that the tosses will result in at least five heads is 7/64.
ii) The probability that the tosses will result in at least five heads is given by:
Therefore, the probability that the tosses will result in at least five heads is 7/64.
iii) The probability that the tosses will result in at most two heads is given by:
Therefore, the probability that the tosses will result in at most two heads is 11/32.
iv) The probability that the tosses will result in not greater than one head is given by:
Therefore, the probability that the tosses will result in not greater than one head is 7/64.
v) The probability that the tosses will result in not less than five heads is given by:
Therefore, the probability that the tosses will result in not less than five heads is 7/64.
vi) The probability that the tosses will result in at least one head is given by:
Therefore, the probability that the tosses will result in at least one head is 63/64.
The graph depicted in figure 6.3 illustrates the binomial distribution of probability of ‘x’ number
of heads occurring when a coin is tossed 6 times.
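The probabilities quoted above can be reproduced with exact fractions. This sketch assumes a fair coin (p = 1/2) tossed 6 times, which is what the denominators of 64 imply; the helper names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 6
p = Fraction(1, 2)


def pmf(k):
    """P(X = k) for a binomial variable with parameters n and p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob(ks):
    """Total probability over an iterable of head counts."""
    return sum(pmf(k) for k in ks)


at_least_five = prob(range(5, n + 1))
at_most_two = prob(range(0, 3))
not_more_than_one = prob(range(0, 2))
at_least_one = 1 - pmf(0)
print(at_least_five, at_most_two, not_more_than_one, at_least_one)
```

"Not less than five heads" is the same event as "at least five heads", which is why both answers are 7/64.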
Q 98. Solved Problem
The probability that an employee will get an occupational disease is 20%. In a firm having five
employees, what is the probability that:
Therefore, the probability that none of the employees get the disease is 0.3277.
ii) The probability that exactly two employees will get the disease is given by:
Therefore, the probability that exactly two employees will get the disease is 0.2048.
iii) The probability that more than four employees will get the disease is given by:
Therefore, the probability that more than four employees will get the disease is 0.00032.
Find: i) P(X=3)
ii) P(X<4)
QNO100. In a large consignment of electric lamps, 5% are defective. A random sample of 8 lamps is
taken for inspection. What is the probability that it has one or more defectives?
ANS - Solution Given n = 8, p = 5/100 = 0.05 X: number of defective lamps Therefore by binomial
distribution,
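The rest of the solution is missing here. A minimal sketch of the usual approach uses the complement rule, P(X ≥ 1) = 1 − P(X = 0), with the n and p given above.

```python
n_lamps = 8
p_defective = 0.05

# P(no defectives) = q^n, so P(one or more defectives) = 1 - q^n
p_none = (1 - p_defective) ** n_lamps
p_one_or_more = 1 - p_none
print(round(p_none, 4), round(p_one_or_more, 4))
```

Under these figures the probability of one or more defectives is about 0.3366.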
QN101, Poisson process is obtained when the Binomial experiment is conducted many number
of_______.
ANS - Times
QNO102. A sample in which items are chosen without knowing their probability of selection__________.
ANS - Non-probability sample
QNO103. Non-sampling errors are attributed to factors that can be controlled and eliminated
by__________.
QNO.105.
QNO. 106.A table with 4 rows and 2 columns has the degrees of freedom of _____________.
QNO107.
ANS - ii) The Chi-Square test
QNO.108 If there are four rows and five columns in classification for the χ² test, then the number of
degrees of freedom is equal to __________.
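For a contingency table with r rows and c columns, the degrees of freedom are (r − 1)(c − 1). Applied to the 4 × 2 table in QNO. 106 and the 4 × 5 table in QNO. 108:

```latex
\text{df} = (r - 1)(c - 1), \qquad
(4 - 1)(2 - 1) = 3, \qquad
(4 - 1)(5 - 1) = 12
```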
QNO109. If the calculated value is less than the tabulated value, then the null hypothesis is
__________.
ANS - ii) 6
ANS -
2. To find a measure of variation between or among the components. Then, the significance of the
difference between the variations in two series or more may be measured
ii) __________________________.
QNO113. ANOVA is a statistical technique used to evaluate the variances between _________or more
sample means,
ANS - Three
Qno116. The forecasting through time series analysis is possible only when the _____________which
reflects a definite trend and seasonal variation.
1. _____________________.
Qno118. The straight line arithmetic trend assumes that growth will be a____________.
Qno119. The various methods of constructing index numbers can be classified into two groups. They
are: unweighted index numbers and____________.
Qno 120.In the Explicit method the weights are laid down on the basis of _______ of importance of
commodities
121. In which of the following situations would you like to use Statistics?
a) Buying a house
Ans- c
122. Out of the following, which one does not refer to a mass of data?
a) Banking Statistics
b) Mathematical Statistics
c) Agricultural Statistics
d) Income Statistics
Ans- b
Ans- c
a) State
b) Commerce
c) Economics
Ans- a
126. According to the definition of Statistics given by Croxton and Cowden, what are the four
components of Statistics?
Ans- The four components of Statistics are collection, presentation, analysis and interpretation of data.
127. ‘Statistics may be called the science of counting’ is the definition given by
a) Croxton
b) A.L.Bowley
c) Boddington
d) Webster
Ans- b
128. In the olden days statistics was confined only to _______.
a) Should the same degree of accuracy be applied while measuring the height of a mountain and the
height of a person?
Ans- a) No
b) No
b) Qualitative data
131. The total sale of a product in Area A is 840 for 30 working days. The total sale of the same product
in Area B is 784 for 28 working days. Should Statistics be applied to get an appropriate picture regarding
the comparison of sales?
Ans- Yes
Ans- Planning
Ans- Yes
iv) The weight of new born babies measured up to first decimal place in a state during the first week of
February 2008
ii) Art
iii) science
i) quantitative data
ii) collection
iii) data
iv) judgement
Ans- i) statistics
ii) TV News Bulletins gather information on any event through their agents.
144. State whether each of the following variables is qualitative or quantitative and indicate the
measurement scale that is appropriate for each.
i) Age
ii) Gender
145. State whether each of the following variables is qualitative or quantitative and indicates the
measurement scale that is appropriate for each.
i) Annual sales
Ans- i) Quantitative, ratio ii) Qualitative, nominal iii) Qualitative, ordinal iv) Quantitative, ratio v) Qualitative, nominal
Ans- Bulk
Ans- Attribute
Ans- Two
Ans- i) location
i) Different columns ii) Different rows iii) Different rows and columns iv) Grouping of related facts in
different classes
154. The data that can be classified on the basis of time is:
Solution–Chronological classification
a) Chronological b) Geographical
Solution–Frequency distribution
Solution–One characteristic
159. The headings of the rows given in the first column of a table are called:
a) Stubs b) Captions
Solution-Stubs
160. Geographical classification means classification of data according to _______.
Solution–Place
161. The data recorded according to standard of education like illiterate, primary, secondary, graduate,
technical, etc., will be known as _______ classification.
Solution –Qualitative
Solution –Tabulation
Solution –Classification
165. i) If the data readings are 3, 4, 5, 6, 7, then it is called _________ variable. Height is generally
__________ variable.
ii) There are ____________ derived frequency distributions for any frequency distribution.
iii) Width of class-interval is given by the difference between ________ and ______.
vi) The relative frequency distribution is obtained from frequency distribution by calculating
___________.
ii. Five
iv. Two
v. Sturge’s
vi. F/N
ii) Pie diagram is drawn according to degree subtended at the centre of a circle. (True/False)
168. The diagram used to show a percentage breakdown is
i) A circle ii) A square iii) A pie iv) A rectangle
i. For a given set of values if we add a constant 5 to every value, then the arithmetic mean is affected.
ii. Arithmetic mean can be calculated for distribution with open-end classes.
172. Different methods give different averages which are known as the
i) measures of central tendency ii) statistics iii) measures of dispersion iv) skewness
Ans- i) measures of central tendency
(b) The average computed by considering the relative importance of each of values to the total value, is
called i) arithmetic mean ii) geometric mean iii) weighted arithmetic mean iv) harmonic average.
174. State whether the following questions are true ‘T’ or false ‘F’.
iii) Geometric mean is used when we are interested in rate of growth of any phenomena.
v) A.M < G.M < H.M for any two values ‘a’ and ‘b’.
vi) Arithmetic mean can be calculated accurately even when the distribution has open-end class.
175. If the values of the variables are arranged in ascending order of magnitude, the middle term is
i) Mode= 3 Median-2 Mean ii) Mode=2 Mean-Median iii) Mode= 3Median –Mean iv) Mode= Mean-
Median
Ans- ii) 24
180. In any distribution when the original items differ in size, the value of AM, GM and HM would also
differ in the following order
Ans- i) AM>GM>HM
ii) Quartiles help us to find percentage of readings below or above a certain value.
iii) Q2 = P50 = D7 = Median
Ans- i – T, ii – T, iii – F
182. State whether the following statements are true, ‘T’ or false, ‘F’.
i) The cost of living index numbers calculated are based on weighted averages.
ii) Many of the items which we use in our life can be assigned weights.
Ans- i- T, ii – T T-True
ii. Probability that Mr. Ram will resign from the post
ii) Subjective
iii) Classical
iv) Subjective
iii. Selecting a king and queen from a pack of cards, when two cards are drawn at a time
iv. Getting 53 Mondays in ordinary year
185. Given P(A) = 0.6, P(B) = 0.7, and P(A ∩ B) = 0.5. Find P(A ∪ B).
Ans- 0.8
187. State whether the following statements are true ‘T’ or false ‘F’.
ii) The amount of time you study for an exam is a discrete random variable.
iii) If the mean and variance of a binomial distribution are 6 and 5, then p = 1/6.
iv) Each trial in a binomial experiment has the different probability of success, p.
189. State whether the following statements are true ‘T’ or false ‘F’.
Ans- i- T, ii- T
190. State whether the following statements are true ‘T’ or false ‘F’.
ii) Mean and standard deviation of a standard normal distribution are ‘1’ and ‘0’
191. State whether the following statements are true ‘T’ or false ‘F’.
Ans- i- T, ii- F, iii- T, iv- T, v- F, vi- T, vii- T, viii - T, ix- T, x- T, xi- F, xii- F
192. State whether the following statements are true ‘T’ or false ‘F’.
iii) Large sample size always results in minimising the standard error.
iv) A sampling plan that divides the population into well-defined groups from which random samples are
drawn is known as cluster sampling.
v) The principles of simple random sampling are the theoretical basis for statistical inference.
vi) If the mean of a certain population is 20, it is likely that most of the sample means will be 20.
vii) Any sampling distribution can be totally described by its mean and standard deviation.
ix) The central limit theorem assures that the sampling distribution of mean is always normal.
x) Stratified sampling is used when each group considered are more homogenous within itself and
heterogeneous between group.
193. Madhu, a frugal student, wants to buy a used bike. After randomly selecting 125 wanted
advertisements, he found the average price of the bike to be Rs. 3250 with a standard deviation of Rs.
615. Establish an interval estimate for the average price of bike so that Madhu can be:
194. For the following sample sizes and confidence levels, find the approximate ‘t’ values for
constructing confidence intervals (use the ‘t’ table).
i) n = 28; 95%
ii) n = 8; 98%
Ans-. i) 2.052
195. i) Null hypothesis states that there is a significant difference between observed and hypothetical
values. (True/False)
ii) 1% level of significance means we are ready to reject a true hypothesis in 99% of cases. (True/False)
iii) If the Null hypothesis Ho: = s or Ho: p = ps or Ho: 1 = 2 or Ho: p1 = p2 then it is two-tailed test.
(True/False)
iv) If the calculated value of a statistic is not in the rejection region R, then Ho is accepted. (True/False)
vi) If n1 = 300, n2 = 500, $\bar{x}_1$ = 50, $\bar{x}_2$ = 60, $\sigma_1$ = 10, $\sigma_2$ = 12 are the results of two samples taken from two cities A
and B, then we test for the difference between means under different populations. (True/False)
vii) If n < 30, then we do not apply z test unless, population S.D is known. (True/False)
Ans- i. False
ii. False
iii. True
iv. True
v. True
vi. True
vii. True
iii) ‘t’ distribution has ___________ areas at the tail than normal distribution.
iv) The mean and variance of the ‘t’ distribution are ________ and ________.
Ans- i. Continuous
iii. Larger