1. Introduction:
1.1 Our knowledge, our attitudes and our actions are based to a very large extent on observations of a few
samples. This is equally true in everyday life, in scientific research and also in audit. A person’s opinion of
an institution that conducts thousands of transactions every day is often determined by the one or two
encounters which he or she has had with the institution in the course of several years. In science and human
affairs alike we lack the resources to study more than a fragment of the phenomena that might advance our
knowledge. Sampling consists of selecting some part of a population to observe so that one may estimate
something about the whole population. For example, to estimate the amount of recoverable oil in a region, a
few (highly expensive) sample holes are drilled. The situation is similar in a national opinion survey, in
which only a sample of the people in the population is contacted, and the opinions in the sample
are used to estimate the proportions holding the various opinions in the whole population. To estimate the
prevalence of a rare disease, the sample might consist of a number of medical institutions, each of which
has records of patients treated. Sampling is the science that guides quantitative studies of content, behavior,
performance, materials and causes of differences.
1.2 Some obvious questions for such studies are how best to obtain the sample and make the observations and,
once the sample data are in hand, how best to use them to estimate the characteristic of the whole
population. Obtaining the observations involves the question of sample size, how to select the sample, what
observational methods to use, and what measurements to record. These are the issues, which are
scientifically addressed, in statistical sampling.
1.3 In the basic statistical sampling setup, the population consists of a known, finite number of N units - such as
transactions, households, people etc. With each unit a value of the variable of interest is associated, which may be
referred to as the x-value of the unit. The x-value of each unit in the population is viewed as a fixed, unknown
quantity. The units in the population are identifiable and may be labeled with numbers 1, 2, ..., N. Only a
sample of the units in the population is selected and observed. The data collected consist of the x-value for
each unit in the sample, together with the unit’s label. The procedure by which the sample of units is
selected from the population is called sampling design. The usual inference problem in sampling is to
estimate some summary characteristics of interest of the population, such as the mean or the total of the x-
values after observing only the sample.
1.4 The basic statistical sampling view assumes that the variable of interest is measured on every unit in the
sample without error, so that the errors in the estimates occur only because just part of the population is
included in the sample. Such errors are referred to as sampling errors. But in real survey situations
non-sampling errors may also arise, due to non-response, measurement error, fatigue, detectability
problems etc.
2. Sampling in Audit.
2.1 In the early stages of the development of independent audit, it was not an uncommon practice for an
auditor to perform a 100% examination of the entries and records of the entities audited. However, as the
economy grew, it quickly became apparent that a 100 % examination of the tremendous volume of entries was
unwarranted and uneconomical. This developed into the test or test check approach, which is both widely
accepted and widely used in audit. It is quite obvious that such a method, involving examination of a portion of
a large quantity of entries in order to draw conclusions about the larger group, is a sampling operation, even
though the word “sample” is not generally used in connection with a test.
2.2 When sampling became a widely accepted tool in audit then another concept called ‘Risk Assessment’
came almost simultaneously, which allowed auditor to focus on risky areas through an objective analysis of
available information about the auditable unit. This is somewhat similar to auditor’s judgment based on
auditor’s experience and skill. The main idea of risk analysis is to identify risky areas in an objective way so
that the auditor can focus on more risky area and optimally use available resources to meet overall audit
objectives. Obviously sampling based only on risk assessment is non-statistical sampling. Such non-statistical
sampling has been called by different names in the literature: 1) Judgmental sampling, 2) Convenience
sampling, 3) Purposive sampling or 4) Haphazard sampling.
2.3 Auditors may choose a non-statistical sampling plan–that is, they may want to rely on judgment or
specific knowledge about the population in selecting units for audit. Auditors cannot use the results from the
judgmental (non-statistical) sample to draw conclusions about the population in general. Alternatively, the
specific knowledge (or judgment) about the population units may be used to develop a statistically valid
sampling plan based on which one can draw conclusion about the population. One method could be to use this
knowledge to divide the population into several homogeneous sub-groups called strata, and from each stratum
some units may be sampled using some statistical procedure for audit. Such sampling plan will improve the
audit procedure.
2.4 The users of the audit report expect fairness in the selection procedure and more transparent reporting.
The auditors on the other hand are interested in commenting about the nature of problems in the population from
the audit findings in the sample. The statistical sampling, which provides estimates including the reliability of
the estimates of character of interest, is the scientific solution to these problems. The audit reports based on this
scientific approach are defensible. This enhances the acceptability and effectiveness of audit report.
4.2. In Systematic sampling, the sample is chosen by selecting a random starting point and then picking every
I-th unit in succession from the sampling frame, where I is the sampling interval. The sampling interval is the ratio of
population size to sample size, rounded to the nearest integer. Systematic sampling is less costly and
easier to implement than Simple Random Sampling (SRS), because random selection is done only once. Systematic sampling is of two
types: (a) Linear Systematic Sampling and (b) Circular Systematic Sampling.
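The two variants can be sketched in Python as follows. This is a minimal illustration rather than a prescribed procedure: the function names, the 1-based unit labels and the use of Python's pseudo-random generator in place of a random number table are assumptions made for the example.

```python
import random

def sampling_interval(N, n):
    """Sampling interval I = N / n, rounded to the nearest integer (at least 1)."""
    return max(1, int(N / n + 0.5))

def linear_systematic_sample(N, n, rng=random):
    """Linear systematic sampling: random start in 1..I, then every I-th unit."""
    interval = sampling_interval(N, n)
    start = rng.randint(1, interval)
    # The realised sample size can differ slightly from n
    # when N is not an exact multiple of the interval.
    return list(range(start, N + 1, interval))

def circular_systematic_sample(N, n, rng=random):
    """Circular systematic sampling: random start in 1..N, wrapping round the frame."""
    interval = sampling_interval(N, n)
    start = rng.randint(1, N)
    # Treat the frame as a circle so that exactly n labels are produced.
    return [(start - 1 + j * interval) % N + 1 for j in range(n)]
```

In this sketch the circular variant always returns exactly n labels, whereas the linear variant returns as many units as fit between the random start and the end of the frame.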
4.3. Stratified sampling is a two-step process in which the population is partitioned into sub-populations, or
strata. The strata should be mutually exclusive and collectively exhaustive in that every population unit
should be assigned to one and only one stratum and no population unit should be omitted. From each
stratum units are selected by any random procedure, usually following SRS. The population units in each
stratum should be as homogeneous as possible. A major objective of stratified sampling is to increase
reliability without increasing cost.
4.4 In Cluster sampling the target population is first divided into mutually exclusive and collectively
exhaustive sub-populations, or clusters. Then a random sample of clusters is selected, based on a
probability sampling technique such as SRS. For each selected cluster, either all the units are included in
the sample or a sample of units is drawn. Units within each cluster should be as heterogeneous as the
population, i.e. heterogeneity within the cluster should be the same as that in the population, but the clusters
themselves should be as homogeneous as possible (similar to one another); each cluster should be a small-scale
representation of the population.
4.5 Probability Proportional to Size (PPS) sampling assigns a higher probability of selection to
population units with larger sizes (size may be total expenditure, total population etc.). In other words, the
entities with larger sizes, based on some characteristic, will have higher chances of selection. Monetary
Unit Sampling (MUS) in audit is an example of PPS sampling with the money value of transactions as the size
measure. If repetition is allowed it is called Probability Proportional to Size With Replacement (PPSWR)
sampling. MUS is actually PPS-Systematic sampling.
4.6 Multi Stage Sampling: Sometimes, as in the case of cluster sampling, it is not possible to draw the ultimate
units of interest directly, as the sampling frame of such units is not available. However, a list of some suitable bigger
units, called primary stage units (psu's) or first stage units (fsu's), each comprising several smaller units called
second stage units (ssu's), may be available, from which a sample of psu's may be selected. Instead of
completely testing all the ssu's in the selected psu's, only some selected ssu's are then studied. This is
called two-stage sampling. If a sample of tertiary units is selected from each selected ssu, the sampling
plan is called three-stage sampling. Similarly, higher order multistage designs are also possible.
5. Estimation (Extrapolation) procedure & Sample size.
5.1 Unstratified Mean Per Unit (MPU): The unstratified MPU is used to project an estimated total value of the
population from a sample. After a sample is selected with SRS and a value is determined for each sample item, the sample
mean $\bar{x}$ of the sample values is multiplied by the number of items in the population, N, to produce an estimate of
the total value of the sampled population. Assuming normality, the optimum sample size under SRS is

$$ n = \left( \frac{Z_r \cdot SD \cdot N}{A} \right)^2 , $$

where $Z_r$ = confidence level coefficient [Refer table 1 pg. 5], A = margin of error, N = population size and
SD = Standard Deviation,

$$ SD = \sqrt{ \frac{ \sum_{j=1}^{N} x_j^2 - N \bar{x}^2 }{ N } } . $$
Because MPU without stratification produces large sample sizes relative to other sampling methods, its use
in survey sampling is limited.
Z Score - Table
Confidence Level    Z-value
75%                 1.15
80%                 1.28
85%                 1.44
90%                 1.65
92%                 1.75
94%                 1.88
95%                 1.96
96%                 2.05
99%                 2.58
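As a rough illustration of the MPU sample size formula and the projection of the total, the sketch below uses the Z-values from the table above. The function names, the example figures and the choice to round the sample size up to a whole item are assumptions made for the example, not part of the guidance.

```python
import math

# Z-values copied from the Z Score table above (confidence level in %).
Z_VALUES = {75: 1.15, 80: 1.28, 85: 1.44, 90: 1.65, 92: 1.75,
            94: 1.88, 95: 1.96, 96: 2.05, 99: 2.58}

def mpu_sample_size(N, sd, margin, confidence_pct=95):
    """n = (Z_r * SD * N / A)^2, rounded up to a whole item (rounding is an assumption)."""
    z = Z_VALUES[confidence_pct]
    return math.ceil((z * sd * N / margin) ** 2)

def mpu_projected_total(sample_values, N):
    """Projected population total = N * sample mean."""
    return N * sum(sample_values) / len(sample_values)

# Hypothetical figures: 1,000 items, SD of 50, margin of error of 10,000 on the total.
print(mpu_sample_size(1000, 50, 10000, 95))
print(mpu_projected_total([10, 20, 30], 100))
```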
5.2 Stratified Mean Per Unit: When the population is highly variable (large standard deviation), technically
called heterogeneous population, unstratified MPU may produce very large sample sizes. Stratification of
the population, as explained earlier, produces an estimate that has desired level of reliability with reduced
sample size. Using the following formula the sample sizes for each stratum may be optimally determined
as:

$$ n_i = \frac{ (N_i \cdot SD_i) \sum_{k} (N_k \cdot SD_k) }{ (A / Z_r)^2 + \sum_{k} N_k \cdot SD_k^2 } $$

where the sums run over all strata k.
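The allocation can be illustrated with a short sketch. The stratum sizes, standard deviations, margin of error and the rounding up of each stratum sample size are hypothetical choices for the example; only the formula itself comes from the text.

```python
import math

def stratum_sample_sizes(sizes, sds, margin, z):
    """Per-stratum sample sizes n_i from stratum sizes N_i and standard deviations SD_i."""
    weights = [N_i * sd_i for N_i, sd_i in zip(sizes, sds)]   # N_i * SD_i
    total_weight = sum(weights)                               # sum of N_k * SD_k
    denominator = (margin / z) ** 2 + sum(
        N_i * sd_i ** 2 for N_i, sd_i in zip(sizes, sds)      # sum of N_k * SD_k^2
    )
    # Rounding up is an assumption; the text does not say how to round.
    return [math.ceil(w * total_weight / denominator) for w in weights]

# Hypothetical population: three strata with increasing variability.
print(stratum_sample_sizes([500, 300, 200], [10, 40, 100], margin=5000, z=1.96))
```

Strata with a larger product of size and standard deviation receive proportionally more of the sample, which is the point of the allocation.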
5.3 Unstratified Proportion of audit objections (errors): The projected number of audit objections in the
population is the sample proportion of errors multiplied by the number of items in the population. The optimum
sample size under SRS is

$$ n = \frac{ Z_r^2 \cdot P (1 - P) }{ A^2 } , $$

where $Z_r$ = confidence level coefficient, A = margin of error and P = proportion of errors that is expected
in the population.
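A minimal sketch of this calculation follows. The function names and the example figures (expected error rate of 5%, margin of error of 2%) are assumptions for illustration, as is rounding the result up.

```python
import math

def proportion_sample_size(z, expected_rate, margin):
    """n = Z_r^2 * P * (1 - P) / A^2, rounded up to a whole item (an assumption)."""
    return math.ceil(z ** 2 * expected_rate * (1 - expected_rate) / margin ** 2)

def projected_objections(sample_errors, sample_size, N):
    """Projected number of objections = sample proportion of errors * population size."""
    return N * sample_errors / sample_size

print(proportion_sample_size(1.96, 0.05, 0.02))
print(projected_objections(6, 200, 10000))
```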
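5.4 Stratified Proportion of audit objections (errors):
For three strata, the projected proportion of audit objections in the population is equal to

$$ \frac{ N_1 p_1 + N_2 p_2 + N_3 p_3 }{ N } , $$

where $p_i$ is the proportion of audit objections in the i-th stratum, $N_i$ is the size of the i-th stratum
and $N = N_1 + N_2 + N_3$. The numerator $N_1 p_1 + N_2 p_2 + N_3 p_3$ is the projected number of audit objections.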
5.5 Estimation with PPSWR:
The estimate of the population total of the character x is

$$ \hat{X} = \frac{1}{n} \sum_{i=1}^{n} \frac{x_i}{p_i} , $$

where $p_i$ is the probability of selecting the i-th sampled unit.
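This estimator (commonly known as the Hansen-Hurwitz estimator) can be sketched in a few lines. The function name and the example values are hypothetical.

```python
def ppswr_total_estimate(x_values, probs):
    """Estimate of the population total: (1/n) * sum of x_i / p_i over the sampled units."""
    if len(x_values) != len(probs):
        raise ValueError("one selection probability is needed per sampled unit")
    n = len(x_values)
    return sum(x / p for x, p in zip(x_values, probs)) / n

# Hypothetical sample of two units with selection probabilities 0.1 and 0.2.
print(ppswr_total_estimate([50, 80], [0.1, 0.2]))
```

When the x-value of every unit is exactly proportional to its selection probability, each term x_i / p_i equals the population total, so the estimate has no sampling variation; this is why a size measure closely related to x is preferred.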
$$ \text{Population Estimate} = \hat{Y} = \frac{N}{n} \sum_{i=1}^{n} \frac{M_i}{m_i} \sum_{j=1}^{m_i} y_{ij} $$
Case-II: 1st stage samples are drawn with PPSWR and the 2nd stage samples are drawn with SRSWOR:
Let $y_{ij}$ be the measure of the characteristic of interest (e.g. completion of roads in km) for the
i-th DPIU and j-th Package, where i = 1, 2, ..., n and j = 1, 2, ..., $m_i$. $x_i$ is the size measure and X is the total of all
size measures.

$$ \text{Population Estimate} = \hat{Y} = \frac{1}{n} \sum_{i=1}^{n} \frac{X}{x_i} \cdot \frac{M_i}{m_i} \sum_{j=1}^{m_i} y_{ij} $$
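The two-stage estimates can be sketched as below. The data layout (one dictionary per sampled DPIU) and all figures are hypothetical; M is taken here to be the total number of Packages in a DPIU and the length of the list y to be the number sampled, which is an assumption about notation that the surviving text does not spell out.

```python
def two_stage_estimate_equal_prob(psus, N):
    """First stage drawn with equal probability: (N / n) * sum of (M_i / m_i) * sum_j y_ij."""
    n = len(psus)
    return (N / n) * sum(p["M"] / len(p["y"]) * sum(p["y"]) for p in psus)

def two_stage_estimate_ppswr(psus, X):
    """First stage drawn with PPSWR: (1 / n) * sum of (X / x_i) * (M_i / m_i) * sum_j y_ij."""
    n = len(psus)
    return sum(
        (X / p["x"]) * (p["M"] / len(p["y"])) * sum(p["y"]) for p in psus
    ) / n

# Hypothetical sample of two DPIUs: size measure x, total Packages M, sampled values y.
sample = [
    {"x": 20, "M": 10, "y": [2, 4]},
    {"x": 30, "M": 6, "y": [5, 5, 5]},
]
print(two_stage_estimate_equal_prob(sample, N=8))
print(two_stage_estimate_ppswr(sample, X=100))
```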
From the Binomial table (in Annex-III) it is observed that the required sample size is 77. If the number of
deviations in the sample of 77 is 0 or 1, the auditor can conclude with 90% confidence that the deviation rate in the
population is not more than the tolerable rate of 5%; in other words, internal controls are reliable. Otherwise, the
auditor cannot conclude with 90% confidence that the deviation rate is within 5%, and the internal control is treated as unreliable.
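The figure of 77 can be checked with a short calculation, assuming the table is built from the cumulative binomial distribution: the sample size is the smallest n for which the chance of seeing no more than the allowed number of deviations, when the true rate equals the tolerable rate, does not exceed the 10% risk. The function names are hypothetical.

```python
from math import comb

def acceptance_risk(n, allowed, tolerable_rate):
    """P(at most `allowed` deviations in n items) when the true rate equals the tolerable rate."""
    return sum(
        comb(n, k) * tolerable_rate ** k * (1 - tolerable_rate) ** (n - k)
        for k in range(allowed + 1)
    )

def min_sample_size(allowed, tolerable_rate, risk):
    """Smallest n whose acceptance risk does not exceed `risk`."""
    n = allowed + 1
    while acceptance_risk(n, allowed, tolerable_rate) > risk:
        n += 1
    return n

print(min_sample_size(allowed=1, tolerable_rate=0.05, risk=0.10))  # 77, as in the table
print(min_sample_size(allowed=0, tolerable_rate=0.05, risk=0.10))  # 45, as in the table
```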
Step 1 The internal control assessment is done to assign % to CR and subsequently used in the beta risk
equation.
Step 2. Appropriate variable sampling plan has to be selected based on audit objective and population
characteristics.
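Step 3. If SRSWOR is the sampling plan then the sample size (n) is obtained from

$$ n_o = \left( \frac{U_r \cdot SD \cdot N}{A} \right)^2 . $$

If $n_o / N$ is high then the sample size can be further reduced as

$$ n = \frac{ n_o }{ 1 + \dfrac{n_o}{N} } . $$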
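A small sketch of Step 3, with hypothetical figures, shows how much the correction reduces the sample when the initial size is a sizeable fraction of the population. The function name and the final rounding up are assumptions.

```python
import math

def srswor_sample_size(N, sd, margin, u_r):
    """Return (n_o, n): the initial size and the size after the finite-population reduction."""
    n_o = (u_r * sd * N / margin) ** 2
    n = n_o / (1 + n_o / N)
    return n_o, n

# Hypothetical figures: 1,000 items, SD of 50, margin of error of 5,000, U_r = 1.96.
n_o, n = srswor_sample_size(1000, 50, 5000, 1.96)
print(round(n_o, 2), math.ceil(n))
```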
Step 9. Perform a test of the sample. The sample mean book value and the population mean book value should not
be substantially different. If they are, a new sample should be selected, discarding the first one, or the sample design should
be changed.
Step 10. Perform audit procedures on the sample items selected for substantive tests.
Step 11. Analyze misstatements noted in the sample to determine their cause and nature and whether a systematic
pattern exists. A systematic misstatement is a recurring misstatement that does not occur randomly.
Step 12. Calculate SD of the sample observations.
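Step 13. Calculate Achieved Precision:

$$ A' = U_r \cdot \frac{SD}{\sqrt{n}} \cdot N \cdot \sqrt{1 - \frac{n}{N}} $$

$$ A'' = A' + TM \left( 1 - \frac{A'}{A} \right) $$

$$ \bar{x} = \frac{ \sum \text{of each audited value} }{ n } $$

Step 16. Calculate Estimated Audited Value (EAV):

$$ \hat{X} = N \bar{x} $$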
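The evaluation formulas above can be put together in a short sketch. The function name and the example values are hypothetical, and the use of the sample standard deviation with an n - 1 divisor is an assumption, since the steps do not say which divisor is intended.

```python
import math
import statistics

def evaluate_mpu_sample(audited_values, N, u_r, planned_precision, tolerable_misstatement):
    """Achieved precision A', adjusted precision A'' and estimated audited value N * mean."""
    n = len(audited_values)
    mean = sum(audited_values) / n
    sd = statistics.stdev(audited_values)  # assumption: n - 1 divisor
    achieved = u_r * (sd / math.sqrt(n)) * N * math.sqrt(1 - n / N)
    adjusted = achieved + tolerable_misstatement * (1 - achieved / planned_precision)
    return {
        "mean": mean,
        "sd": sd,
        "achieved_precision": achieved,
        "adjusted_precision": adjusted,
        "estimated_audited_value": N * mean,
    }

# Hypothetical audited values for a population of 1,000 items.
result = evaluate_mpu_sample([100, 110, 90, 105, 95], N=1000, u_r=1.96,
                             planned_precision=8000, tolerable_misstatement=10000)
print(result)
```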
6.4.2.1 It may be noted that the systematic misstatements (also called nonrandom misstatements) are excluded
from statistical evaluation (please see step 17)
6.4.3. Monetary Unit Sampling (MUS):
Sampling methods used by auditors have evolved over the years. The trend now is to use less rigorous
sampling techniques to reduce cost. However, it has been empirically demonstrated by experiments that
MUS is substantially more capable of detecting material error and can be used for both proportional test of
controls and substantive testing. The use of qualitative analysis that documents the nature and cause
of each misstatement found in a sample can mitigate some of the risk associated with sampling. The
use of a statistical approach such as PPS can further reduce this risk, and, at the same time, permit
the use of a smaller sample.
"The auditor has a responsibility to plan and perform the audit to obtain reasonable assurance about
whether the financial statements are free of material misstatement, whether caused by error or
fraud."
6.4.3.2 Finally:
MUS's job is only to warn us of a possible fire, not to assess the extent of the fire or estimate the damage.
This requires classical forms of statistical sampling and extracts the price of a much larger sample. The
auditor's response to the alarm is essentially the same regardless of the degree by which MUS's projected
potential misstatement exceeds tolerable misstatement (assuming the excess is more than trivial).
However, qualitative as well as quantitative analyses are equally important. The auditor should identify
and document the nature and cause of each misstatement found in the sample. It takes finding only one
misstatement of a particular type for the auditor to become aware that that kind of misstatement is
occurring, at which point the auditor can apply additional procedures to determine the extent of
misstatements of that type. One misstatement may indicate a breakdown in a control procedure that
suggests other errors of a similar nature, and might in fact have implications elsewhere in the audit. A
second misstated item might clue the auditor to an inappropriate accounting principle that probably affects
all similar transactions. By working with the client to identify and correct other similar errors, the potential
misstatement might be reduced to an acceptable level. If not, other kinds of tests that serve the same audit
objectives, such as appropriate analytical procedures, may provide the additional evidence needed to
support the corrected book value of the account. Of course, if the possibility of fraud is indicated, further
effort and more careful consideration are required.
6.4.3.3
The PPS sampling approach in auditing was developed to convert misstatement rates into money value.
Goodfellow, Loebbecke and Neter outline the method for PPS sampling evaluation of the maximum
misstatement rates found with the Poisson distribution. Poisson probabilities are obtained from an
idealized mathematical process generating occasional random events (in audit the misstatement rate is small,
less than 5%).
Let BV = Book Value; TM = tolerable misstatement; SR = Sampling Risk; RFx = the reliability factor
corresponding to X anticipated misstatements in the population; N = Population Size.
Step 5: Make a decision about the acceptability of reported book value by comparing MVM with TM
[RF values may be obtained from Poisson table as in Annex –V]
6.5 The statistical sampling described above may also be categorized into three broad categories: Attribute,
Variable and Probability Proportional to Size (PPS) sampling. Attribute sampling is used primarily to estimate
the number of incidences or in tests of controls. In contrast, variable sampling and PPS sampling are most frequently
used to estimate a population average or total or to test the monetary value of account balances.
6.5.1 There are some other types of attribute sampling that are being used in audit:
Discovery sampling is a sampling plan which selects a sample of a given size, accepts the population if the
sample is error free, and rejects the population if it contains at least one error. With discovery sampling the
auditor may not be interested in determining how many errors there are in the population. Where there is a
possibility of avoidance of the internal control system, it may be sufficient to disclose one example to
precipitate further action or investigation.
Involves sampling a universe in increments and examining each incremental sample before
deciding when to stop.
Is appropriate for preliminary sampling and survey audit testing.
Allows auditors to determine from the smallest possible sample size if an error rate exceeds a
predetermined level.
Provides assurance, within a fixed degree of confidence, that the error rate in a population is less
than a predetermined acceptable error rate.
Does not provide an estimate of actual error rate; however, it can readily be converted into
attribute sampling, which can be used to estimate actual error rate.
Bias: Difference between the true value and the expected value of the estimate.
Cluster: Partitioning of the population into sub-populations, called clusters, in such a way that within each cluster the variation is high. It is a convenient but less efficient sampling design, often known as area sampling.
Coefficient of variation (CV): Ratio of S.D. to mean. It is unit free and generally expressed in percentage terms. This measure is widely used to measure the reliability of estimates in survey sampling. Also see Standard Deviation and Mean.
Confidence level: The certainty with which the estimate lies within the margin of error.
Estimate: Value projected to the population from the sample observations.
Estimation sampling: Use of sample observations to estimate some characteristics of interest in the population.
Expected rate of occurrence of error: The rate of errors (audit objections) expected in the population.
Extrapolation: Projection to the population from the sample.
Heterogeneity: Variation in the population is high. It is the opposite of homogeneity.
Homogeneity: Variation in the population is low. It is generally measured by the standard deviation (S.D.); a smaller S.D. indicates more homogeneity.
Materiality: The value of error that an auditor is willing to accept while still concluding that the audit objective is achieved. The smaller the materiality, the larger the sample size.
Margin of error: A measure of the difference between the estimate from the sample and the population value.
Mean: Average of observations. Symbolically, Mean = $\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$
Multi-Stage Sampling: Statistical sampling at different levels that is capable of generating estimates at various levels. Mostly used in large-scale sample surveys.
Monetary Unit Sampling (MUS): MUS gives transactions with larger recorded amounts proportionally more opportunity to be selected than transactions with smaller recorded amounts.
Non-random error: Errors that are systematic in nature.
Symbolically, Variance = $\frac{1}{N} \sum_{i=1}^{N} (X_i - \bar{X})^2 = \frac{1}{N} \left( \sum_{i=1}^{N} X_i^2 - N \bar{X}^2 \right)$
8. Suggested reading:
1) Audit Sampling : An Introduction (fifth edition)-D. M. Guy, D.R. Carmichael and R Whittington
7) http://saiindia.gov.in/cag/sites/default/files/Rupe_Trail_First_Edition_0.pdf
Annex-I
Methodologies of selection of samples for SRSWOR &PPSWR
(Using Random Number Table)
Repeat Step 3 and Step 4 until we select n distinct schools (please note the sample selected more than once may be ignored)
Please note in this way we may not get exactly n number of samples.
A table of cumulative totals of the sizes of the units is made. Let $T_i = x_1 + x_2 + x_3 + \dots + x_i$, where $x_i$ is the size measure of
the i-th unit. A random number, say R, is drawn between 1 and $T_N$ (= Total size). The unit 'i' is selected if $T_{i-1} < R \le T_i$. The
process is repeated n (sample size) times.
4.2 Let us take the following example of selecting districts with PPSWR, with the size measure being expenditure in the scheme
under review. Let the sample size be three.
(Expenditure figures are all fictitious)
List of districts in Punjab along with their expenditure:
Sr.No.  Name of the District   Expenditure under the scheme (00 '000)   Cumulative Total
1       Amritsar                368                                      368
2       Bathinda                1095                                     1463
3       Faridkot                1009                                     2472
4       Fategarh Sahib          1536                                     4008
5       Firozpur                3419                                     7427
6       Gurdaspur               534                                      7961
7       Hoshiarpur              621                                      8582
8       Jalandhar               534                                      9116
9       Kapurthala              323                                      9439
10      Ludhiana                223                                      9662
11      Mansa                   278                                      9940
12      Moga                    660                                      10600
13      Muktsar                 1474                                     12074
14      Nawanshahr              1613                                     13687
15      Patiala                 1038                                     14725
16      Rupnagar                527                                      15252
17      Sangrur                 2131                                     17383
        Total                   17383
Select random numbers of 5 digits between 00001 and 17383 (=Total Expenditure), (Random number selection procedure is the
same as indicated above)
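The cumulative-total method applied to the district list can be sketched as follows. The expenditure figures are the fictitious ones from the list; the function names are hypothetical, and Python's pseudo-random generator stands in for the Random Number Table, so the districts drawn will not match a table-based selection.

```python
import bisect
import itertools
import random

DISTRICTS = [
    ("Amritsar", 368), ("Bathinda", 1095), ("Faridkot", 1009),
    ("Fategarh Sahib", 1536), ("Firozpur", 3419), ("Gurdaspur", 534),
    ("Hoshiarpur", 621), ("Jalandhar", 534), ("Kapurthala", 323),
    ("Ludhiana", 223), ("Mansa", 278), ("Moga", 660), ("Muktsar", 1474),
    ("Nawanshahr", 1613), ("Patiala", 1038), ("Rupnagar", 527),
    ("Sangrur", 2131),
]

NAMES = [name for name, _ in DISTRICTS]
CUMULATIVE = list(itertools.accumulate(size for _, size in DISTRICTS))  # T_1 .. T_N

def district_for(r):
    """Return the district i with T_(i-1) < r <= T_i."""
    # bisect_left finds the first cumulative total that is >= r.
    return NAMES[bisect.bisect_left(CUMULATIVE, r)]

def ppswr_select(n, rng):
    """Draw n districts with replacement, probability proportional to expenditure."""
    draws = [rng.randint(1, CUMULATIVE[-1]) for _ in range(n)]
    return [(r, district_for(r)) for r in draws]

print(ppswr_select(3, random.Random(1)))
```

Because selection is with replacement, the same district can be drawn more than once.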
Let Page No. 1 be selected in the Random Number Table (in Annex-VI).
Table for sample selection
Random Number    Decision    District selected    Reason
Step 1: A Simple Random Sample Without Replacement (SRSWOR) of size n = ([Reliability Factor
(zero deviations)] / Margin of error) is first selected for audit. Occurrences of deviations (audit objections) are
noted.
Step 2: Final size of the sample say, ‘m’ is calculated using Poisson distribution table on the basis of deviations
observed in step 1, as m= ([Reliability Factor (No. of deviations in step 1)] / Margin of error).
Remaining (m – 150) cases were selected with SRSWOR from the remaining cases in the population for audit.
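The two steps can be sketched as below. The reliability factors are taken from the 5% column of the Poisson table in Annex-V; the 2% margin of error is an assumption chosen for illustration because 3.0 / 0.02 reproduces the first-step size of 150 mentioned above. The function name and the rounding up are also assumptions.

```python
import math

# Reliability factors for 0..5 deviations, 5% column of the Poisson table (Annex-V).
RELIABILITY_FACTORS_5PCT = {0: 3.0, 1: 4.8, 2: 6.3, 3: 7.8, 4: 9.2, 5: 10.6}

def two_step_sample_sizes(margin, deviations_found, factors=RELIABILITY_FACTORS_5PCT):
    """Return (n, m, additional): first-step size, final size and extra cases to select."""
    # round() guards against floating-point noise before rounding up.
    n = math.ceil(round(factors[0] / margin, 9))
    m = math.ceil(round(factors[deviations_found] / margin, 9))
    return n, m, max(0, m - n)

# Two deviations found among the first 150 cases.
print(two_step_sample_sizes(0.02, 2))
```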
Here, in the beginning, the entire population is partitioned into n random groups and from each group a unit is
selected independently by the PPS method. This is an improved method with certain advantages. Please note that the
estimation formulae are different.
Annex-III
Determination of Sample Size: Reliability, 90%
Binomial Table (Risk of Assessing Control Risk Too Low 10%)
(Allowable numbers of deviations are in parentheses)
Expected Population    Tolerable Rate
Deviation Rate         2% 3% 4% 5% 6% 7% 8% 9% 10% 15% 20%
0.00% 114(0) 76(0) 57(0) 45(0) 38(0) 32(0) 28(0) 25(0) 22(0) 15(0) 11(0)
0.25 194(1) 129(1) 96(1) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
0.50 194(1) 129(1) 96(1) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
0.75 265(2) 129(1) 96(1) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
1.00 * 176(2) 96(1) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
1.25 * 221(3) 132(2) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
1.50 * * 132(2) 105(2) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
1.75 * * 166(3) 105(2) 88(2) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
2.00 * * 198(4) 132(3) 88(2) 75(2) 48(1) 42(1) 38(1) 25(1) 18(1)
2.25 * * * 132(3) 88(2) 75(2) 65(2) 42(1) 38(1) 25(1) 18(1)
2.50 * * * 158(4) 110(3) 75(2) 65(2) 58(2) 38(1) 25(1) 18(1)
2.75 * * * 209(6) 132(4) 94(3) 65(2) 58(2) 52(2) 25(1) 18(1)
3.00 * * * * 132(4) 94(3) 65(2) 58(2) 52(2) 25(1) 18(1)
3.25 * * * * 153(5) 113(4) 82(3) 58(2) 52(2) 25(1) 18(1)
3.50 * * * * 194(7) 113(4) 82(3) 73(3) 52(2) 25(1) 18(1)
3.75 * * * * * 131(5) 98(4) 73(3) 52(2) 25(1) 18(1)
4.00 * * * * * 149(6) 98(4) 73(3) 65(3) 25(1) 18(1)
5.00 * * * * * * 160(8) 115(6) 78(4) 34(2) 18(1)
6.00 * * * * * * * 182(11) 116(7) 43(3) 25(2)
7.00 * * * * * * * * 199(14) 52(4) 25(2)
* sample size is too large to be cost effective
Source: AICPA, Auditing Guide, Audit Sampling (New York, 2001)
Annex-IV
Normal Curve Area Table
Standard
Deviation .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .0000 .0040 .0080 .0120 .0159 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0753
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2257 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2518 .2549
0.7 .2580 .2612 .2642 .2673 .2704 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2995 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389
1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4083 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319
1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4430 .4441
1.6 .4452 .4463 .4474 .4485 .4495 .4505 .4515 .4525 .4535 .4545
1.7 .4554 .4564 .4573 .4582 .4591 .4599 .4608 .4616 .4625 .4633
1.8 .4641 .4649 .4656 .4664 .4671 .4678 .4686 .4693 .4699 .4706
1.9 .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4758 .4762 .4767
2.0 .4773 .4778 .4783 .4788 .4793 .4798 .4803 .4808 .4812 .4817
2.1 .4821 .4826 .4830 .4834 .4838 .4842 .4846 .4850 .4854 .4857
2.2 .4861 .4865 .4868 .4871 .4875 .4878 .4881 .4884 .4887 .4890
2.3 .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.4 .4918 .4920 .4922 .4925 .4927 .4929 .4931 .4932 .4934 .4936
2.5 .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
2.6 .4953 .4955 .4956 .4957 .4959 .4960 .4961 .4962 .4963 .4964
2.7 .4965 .4966 .4967 .4968 .4969 .4970 .4971 .4972 .4973 .4974
2.8 .4974 .4975 .4976 .4977 .4977 .4978 .4979 .4980 .4980 .4981
2.9 .4981 .4982 .4983 .4984 .4984 .4984 .4985 .4985 .4986 .4986
3.0 .4986 .4987 .4987 .4988 .4988 .4988 .4989 .4989 .4989 .4990
3.1 .4990 .4991 .4991 .4991 .4992 .4992 .4992 .4992 .4993 .4993
Annex –V: Poisson Table
sampling error
Number of Deviations 10% 5% 2.5%
0 2.4 3.0 3.7
1 3.9 4.8 5.6
2 5.4 6.3 7.3
3 6.7 7.8 8.8
4 8.0 9.2 10.3
5 9.3 10.6 11.7
6 10.6 11.9 13.1
7 11.8 13.2 14.5
8 13.0 14.5 15.8
9 14.3 16.0 17.1
10 15.5 17.0 18.4
11 16.7 18.3 19.7
12 18.0 19.5 21.0
13 19.0 21.0 22.3
14 20.2 22.0 23.5
15 21.4 23.4 24.7
16 22.6 24.3 26.0
17 23.8 26.0 27.3
18 25.0 27.0 28.5
19 26.0 28.0 29.6
20 27.1 29.0 31.0
21 28.3 30.3 32.0
22 29.3 31.5 33.3
23 30.5 32.6 34.6
24 31.4 33.8 35.7
25 32.7 35.0 37.0
26 34.0 36.1 38.1
27 35.0 37.3 39.4
28 36.1 38.5 40.5
29 37.2 39.6 41.7
30 38.4 40.7 42.9
31 39.1 42.0 44.0
32 40.3 43.0 45.1
33 41.5 44.2 46.3
34 42.7 45.3 47.5
35 43.8 46.4 48.8
36 45.0 47.6 49.9
37 46.1 48.7 51.0
38 47.2 49.8 52.1
39 48.3 51.0 53.4
Source: Adapted from a table developed by Marvin Tummins and Robert H. Strawser, “A Confidence Limits Tables for Attribute Sampling,” Accounting Review
(October 1976), pp. 907-912.
Pawan Dhamija
Statistical Advisor
What is Statistics ?
(a) Descriptive Statistics: Deals with
Collection, Organisation, presentation,
Summarisation and analysis of Data
(b) Inferential Statistics: In addition deals
with drawing of inference about a set of data
(Population) when only a part of data
(Sample) is observed.
Statistics: Collecting, Summarizing, Presenting, Analyzing, Generalizing
Sampling, Census and Statistical Inference
Measures of Central Tendency (Averages)
Measures the “centre” of the data set
Single number that can be taken as a
representative of the entire data set
Measures commonly used for averages are:
Mean
Median
Mode
Which measure to use depends on nature of data
It is okay to report more than one measure.
Measures of Central Tendency: Mean
Median: “Middle observation” according to its rank in
data i.e. after arranging data in ascending order.
Better than mean if extreme observations are present i.e.
for skewed data.
If n is odd: Median = ((n+1)/2)-th item.
Mode: Value that occurs most frequently
Good for Qualitative data like intelligence, beauty, honesty.
The value/observation with highest freq. gives mode.
For e.g. for observations: 1, 3, 5, 7, 9, 13, 5, 7, 8, 2, 10, 5
In ascending order: 1, 2, 3, 5, 5, 5, 7, 7, 8, 9, 10, 13;
Observation 5 occurs the maximum number of time (has
highest frequency) so mode = 5
However, if there are two or more observations with
highest frequency we use: Mode = 3*median - 2*mean
If data are symmetric (explained later):
Where N = Σf
Mean, Median and Mode of Freq. Distribution
Xi      fi    fi*Xi   c.f.
1       4     4       4
2       6     12      10
3       5     15      15
4       4     16      19
5       4     20      23
6       3     18      26
7       4     28      30
8       6     48      36
9       4     36      40
10      3     30      43
11      3     33      46
12      4     48      50
Total   50    308
Mean = 308/50 = 6.16
N = 50 (even); N/2 = 25
Median = 1/2*(25th + 26th) item = ½(6+6) = 6
Highest freq. = 6, occurring twice, so use
Mode = 3*median - 2*mean = 3*6 - 2*6.16 = 5.68
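The worked example can be reproduced with a short sketch. The function name is hypothetical; the rule Mode = 3*median - 2*mean is applied only when the highest frequency is shared by more than one value, as on the slide.

```python
def frequency_summary(values, freqs):
    """Mean, median and mode of a frequency distribution (values in ascending order)."""
    N = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / N

    def item(k):
        """k-th observation (1-based) located through the cumulative frequencies."""
        cumulative = 0
        for x, f in zip(values, freqs):
            cumulative += f
            if k <= cumulative:
                return x
        raise IndexError("k exceeds the number of observations")

    if N % 2 == 1:
        median = item((N + 1) // 2)
    else:
        median = (item(N // 2) + item(N // 2 + 1)) / 2

    top = max(freqs)
    modal_values = [x for x, f in zip(values, freqs) if f == top]
    if len(modal_values) == 1:
        mode = modal_values[0]
    else:
        mode = 3 * median - 2 * mean  # rule from the slide for tied frequencies
    return mean, median, mode

xs = list(range(1, 13))
fs = [4, 6, 5, 4, 4, 3, 4, 6, 4, 3, 3, 4]
print(frequency_summary(xs, fs))
```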
Dispersion: Measures of Variability
Measure the “spread or variation”
[Heterogeneity] in the data
Measures commonly used for variability are:
Variance
Standard Deviation
Range
Semi Inter-quartile Range
or Quartile Deviation (QD)
Measures of Variability: Variance
$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 $$
Where 'n' is the sample size and $\bar{X}$ is the sample mean;
For sample size n > 30, 'n' may be used instead of 'n-1'.
Population variance is denoted by S². This is usually
unknown.
s² (Sample Variance) can be used as an estimator
(projected value) of S² (Population Variance).
Standard Deviation (SD)
Square root of the variance
s = √(s²) = sample SD
S = √(S²) = population SD; usually unknown
Merits of SD: Expressed in the same units as the
mean (instead of squared units like the variance)
Demerit: It is difficult to calculate
Range and Quartile Deviation (QD)
Range = Maximum - Minimum
QD = ½*(Q3 – Q1)
Where Q3 is third Quartile and Q1 first quartile
25% of observations are below Q1 and 75% below Q3
QD is more powerful than the range, as the range is based on just 2
items and QD is based on 50% of the items.
SD is the best and most useful measure of Variation;
however if there are outliers (i.e. if the data are highly
skewed) it should not be used.
Find Standard Deviation and Variance of no. of
errors in the vouchers [Solution on next slide]
Mean (m) = Σf*x / Σf = 137/18 ≈ 7.6
Sampling: Some Facts
For very small samples (e.g., <5 observations),
summary statistics (mean, SD etc.) are not meaningful.
Simply list the data.
Beware that poor samples may provide a distorted view
of the population
In general, larger samples are more representative of
the population but they need more resources; so we
have to trade off between sample size and feasibility
(available resources).
Probability(P)
Measures the likelihood with which an event occurs
Sampling
The Process of selection of some members of a
population to generate precise and valid
estimates of population parameters like
averages or proportions.
Sample
A sample is a part of the population, selected by the
investigator/auditor as its representative to gather
information on certain characteristics of the original
population
Sampling Terms
Sampling unit (Basic sampling unit)
Example: vouchers, cheques, bills, districts, audit units
Sampling frame
List of all sampling units in the population
Sampling scheme
Method used to select sampling units from the sampling
frame
Parameter : Population characteristic like average, proportion
based on all the units in the population; it is constant/fixed.
Statistic: Sample characteristic like average, proportion based
on sample values; it varies from sample to sample.
3
Advantages of Survey Sampling
less expensive; Saves Time
The quality of information is maintained.
Possible to determine the extent of error due to
Sampling
Non Sampling errors are likely to be less
Even Census Results are verified by sampling
Law of Statistical Regularity states that a
moderately large number of items chosen at random
from a large group are almost sure, on average, to
possess the characteristics of the large group;
this forms the basis of sampling.
4
Disadvantages of Survey Sampling
Results of a sample survey are subject to error due
to sampling.
A sample may not properly represent the various
subgroups of a population.
Sometimes the sampling methods may become
complicated requiring the services of an expert.
Note: If time & money are not important factors
and if population under consideration is not too
large, census is better than any sampling method.
5
Types of sampling
Non-Statistical sampling
Statistical sampling
6
Non-Statistical Sampling
Units in the study population do not have a known
probability of being included in the sample
Subjective/Biased samples
Used when (i) the number of elements in the population
is either unknown or units in the population can not be
identified and (ii) there are time/ resource constraints
Advantages:
Practical and easy to conduct
Disadvantages:
Not representative of the population
Not possible to (i) assess the validity of estimates or (ii)
determine sample size using statistical methods
7
Some Non-Statistical Sampling
Techniques
I Accidental/Haphazard Sampling
The auditor selects a sample (audit units, bills, vouchers,
districts) without any conscious bias; the sample is expected
to be representative of the population, e.g. avoiding the first
and last voucher in a bundle.
II Judgmental/Purposive Sampling
The auditor selects a sample (audit units, bills, vouchers,
departments) which in his or her opinion contains the most
errors, say vouchers with the highest values or vouchers of
some particular treasury.
8
Statistical Sampling
Each unit in the study population has a known probability
(may not be equal) of being included in the sample.
Advantages:
It provides estimates free from personal bias
It permits application of objective methods of minimizing
error under the resource constraints.
Allows valid conclusions to be drawn about the population
Disadvantages:
Needs a sampling frame
More difficult to apply than non-statistical sampling
9
1. Simple random sampling (SRS)
The most commonly used Statistical sampling
Principle: Equal chance for each sampling unit to be
included in the sample
Procedure
1. Identify all sampling units in the population
2. Determine sample size (n) using appropriate formula/table
3. Draw (n) units using random tables
or computer programs like Excel or IDEA.
Advantages
Simple
Sampling error easily measured
Disadvantages
Need complete list of units
Not always best representative
10
SRS with replacement (SRSWR)
First unit is randomly selected from population
The sampled unit is replaced in the population
Then second unit is drawn; probability of selection
of an element remains unchanged after each draw
The procedure is repeated until the requisite
sample of size 'n' is drawn.
In practice SRSWR is not attractive; Same units
can be selected more than once which may not
add any value/additional information
But in mathematical terms, it is simpler to relate
the sample to population by SRSWR.
11
SRS without replacement (SRSWOR)
Unlike SRSWR, once an element is selected as a
sample unit, it is not replaced in the population
The selected sample units are distinct
SRSWOR provides two advantages:
Elements are not repeated
The variance of the estimate is smaller (efficiency is
higher) than under SRSWR with the same sample size
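The difference between the two schemes can be sketched with Python's standard `random` module. The population of 500 numbered vouchers, the sample size of 15 and the seed are illustrative assumptions; the slides mention random tables, Excel or IDEA as the usual tools.

```python
import random

N = 500    # population size, vouchers numbered 1..N (illustrative)
n = 15     # required sample size (illustrative)

rng = random.Random(2024)          # seed fixed only so the draw can be reproduced
frame = list(range(1, N + 1))      # sampling frame: voucher serial numbers

# SRSWOR: a selected unit is not returned to the population, so all units are distinct
srswor = rng.sample(frame, k=n)

# SRSWR: every draw is made from the full population, so repeats are possible
srswr = rng.choices(frame, k=n)

print("SRSWOR:", sorted(srswor))
print("SRSWR :", sorted(srswr))
```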
12
Use the following Random Number Table to draw a
simple random sample (i) of 15 vouchers without
replacement and (ii) of 45 vouchers with replacement;
from a treasury having 500 vouchers.
Part of Random Number table
2952 6641 3992 9792 7979 5911 3170
5624 4167 9524 1545 1396 7203 5356
1300 2693 2370 7483 3408 2762 3563
1089 6913 7691 0560 5246 1112 6107
6008 8126 4233 8776 2754 9143 1405
9025 7002 6111 8816 6446
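One possible way to work part (i) is sketched below. The reading rule is an assumption, since the slide does not state one: take the first three digits of each four-digit entry, accept it if it lies between 001 and 500, and skip out-of-range values and repeats.

```python
# Random number table as printed on the slide, read row by row
table = """
2952 6641 3992 9792 7979 5911 3170
5624 4167 9524 1545 1396 7203 5356
1300 2693 2370 7483 3408 2762 3563
1089 6913 7691 0560 5246 1112 6107
6008 8126 4233 8776 2754 9143 1405
9025 7002 6111 8816 6446
""".split()

N, n = 500, 15
sample = []
for entry in table:
    voucher = int(entry[:3])       # assumed rule: use the first three digits
    # reject numbers outside 1..N and, for sampling without replacement, repeats
    if 1 <= voucher <= N and voucher not in sample:
        sample.append(voucher)
    if len(sample) == n:
        break

print(sample)
```

For part (ii), sampling with replacement, repeats would be kept rather than skipped. The part of the table shown has only 40 entries, so a longer stretch of the table would be needed to reach 45 vouchers under this rule.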
13
2. Systematic Sampling
Principle:
In this method, first unit is drawn by random numbers;
thereafter, every kth (k = N/n is sampling interval) unit is
drawn. It gives equal chance of selection to each unit
Procedure
1. Prepare a list of all elements in the study population (N)
2. Decide the sample size (n)
3. Determine the sampling interval ‘k’ as the integer
nearest to N/n
4. Have the random start by choosing an integer ‘r’
between 1 and k.
5. Select every kth unit starting with the unit corresponding
to the number ‘r’.
14
Systematic Sampling contd..
Say, Target Population N= 54000 vouchers (Sampling Frame)
Sample size n = 6000
Sampling interval (k) = Target population / Sample size
= 54000/6000 = 9
Number all vouchers of the population
Select one number between 1 and 9 (here k = 9) at random
Say number 5 is selected; then the 5th voucher is selected
Next the 5 + 9 = 14th, then the 14 + 9 = 23rd voucher is selected, and so on …
Selected vouchers in the numbered list 1, 2, 3, …, 54000:
5, 14, 23, 32, …
[Circular, multiple or random systematic methods also used]
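The procedure can be sketched as a short Python function. The function name and its `start` parameter are illustrative; the figures N = 54000, n = 6000 and the random start 5 come from the example above. Only the plain linear method is shown, not the circular or other variants.

```python
import random

def systematic_sample(N, n, start=None, rng=random):
    """Return the 1-based positions chosen by linear systematic sampling."""
    k = round(N / n)                   # sampling interval: integer nearest to N/n
    # random start r between 1 and k, unless one is supplied
    r = start if start is not None else rng.randint(1, k)
    return list(range(r, N + 1, k))    # r, r + k, r + 2k, ... up to N

# Figures from the slide: N = 54000 vouchers, n = 6000, so k = 9; random start r = 5
selected = systematic_sample(54000, 6000, start=5)
print(selected[:4], len(selected))
```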
15
Systematic Sampling (contd..)
Advantages –
Requires less time, sometimes less costly than SRS.
Ensures representativeness across list
Easy to implement
Disadvantages-
Works well only if the complete and up-to-date frame
is available and if the units are randomly arranged in
the frame; for this reason the units are arranged in
some order say alphabetical or in increasing/decreasing
order of value before selecting a sample.
16
3. Stratified sampling
Principle
Classify population into homogeneous subgroups (strata)
Stratification may be done on the basis of income, age,
rural-urban, Revenue-Capital, Treasuries, major heads, etc.
Draw a sample (not necessarily of equal size) from each stratum
Combine results of all strata
Advantages
More precise if the variable of interest is associated with the strata; e.g. in MUS
sampling, the variable is the value of vouchers, which is related to
the strata, so it is likely to yield better results than SRS
All subgroups are represented, allowing a separate conclusion about
each of them, say a separate conclusion for each
state/district/treasury
Administrative convenience
17
3. Stratified sampling (contd..)
Disadvantages
Sampling error difficult to measure
Loss of precision if small numbers sampled in
individual heterogeneous strata
Example of stratified sampling:
(i) To select BPL households for a social audit; divide
the population of BPL into three categories (strata) say
top 25%, Middle 50% and Bottom 25% and select
separate samples from 3 categories/strata. (ii) Monetary
Unit Sampling (MUS) is also a case of Stratified
Sampling where the population is divided into 2 strata –
High value and low value vouchers/items
18
Allocation of Sample size in Stratified sampling
Proportional Allocation: ni = (n/N)*Ni
where ni is the size of the sample from the ith stratum, Ni is the population size of the ith
stratum, n is the total sample size and N is the population size
Optimum allocation ni’s are chosen so as to
(a) Maximise the precision for fixed sample size n; Neyman’s
Allocation
(b) Maximise the precision for fixed cost
(c) Minimise the total cost for fixed desired precision
Disproportionate Allocation
No. of items selected from a stratum is independent of its size.
A large sample would be required from a stratum if
1. Stratum size Ni is large.
2. Stratum variability Si (variance/Std. Dev.) is large.
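The two points above (larger Ni and larger Si both call for a larger sample) are combined in Neyman's allocation. Its usual textbook form, which is not written out on the slide, is ni = n × Ni Si / Σ Nj Sj. A minimal sketch, with hypothetical stratum sizes and standard deviations:

```python
def neyman_allocation(n, sizes, sds):
    """Allocate a total sample of n across strata in proportion to N_i * S_i."""
    weights = [N_i * S_i for N_i, S_i in zip(sizes, sds)]
    total = sum(weights)
    # rounding to whole units; in general the rounded values may not add up to n
    return [round(n * w / total) for w in weights]

# Hypothetical strata: sizes N_i and standard deviations S_i (illustrative only)
sizes = [300, 200, 500]
sds = [10, 20, 5]
allocation = neyman_allocation(95, sizes, sds)
print(allocation)
```

Here the second stratum is the smallest but the most variable, so it receives the largest share of the sample.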
19
Exercise: Proportional Allocation
Number of Vouchers coming from 3 treasuries are 300,
200 and 500 respectively. Draw a proportional stratified
sample of size 60 using the random number table given in
slide No.13.
Solution:
Here N1 = 300, N2 = 200 and N3 = 500; N = 1000
using ni = (n/N)*Ni; i = 1, 2, 3
n1 = (60/1000)* 300 = 18, n2 = (60/1000)* 200 = 12
and n3 = (60/1000)* 500 = 30.
Thus a sample of 18, 12 and 30 will be selected from
these three strata using random number table.
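The same allocation can be checked in a few lines of Python. This sketch draws the vouchers with a pseudo-random generator instead of the random number table the exercise asks for, and it assumes the vouchers are numbered 1..Ni within each treasury.

```python
import random

def proportional_allocation(n, sizes):
    """n_i = (n / N) * N_i for each stratum, rounded to whole units."""
    N = sum(sizes)
    return [round(n * N_i / N) for N_i in sizes]

sizes = [300, 200, 500]                   # vouchers in the three treasuries
alloc = proportional_allocation(60, sizes)
print(alloc)                              # sample sizes n1, n2, n3

# Draw an SRSWOR of the allocated size within each treasury (stratum)
rng = random.Random(7)                    # seed fixed only for reproducibility
samples = [rng.sample(range(1, N_i + 1), k=n_i) for N_i, n_i in zip(sizes, alloc)]
```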
20
4. Cluster Sampling
The population is divided into non-overlapping
groups known as Clusters.
Clusters are commonly formed on the basis of
geographical /administrative/political boundaries, e.g.
GPs, Blocks, Departments may act as clusters.
Procedure
List all the clusters/groups of sampling units of the
study population
Select Random Sample of clusters
Survey all or proportion of sampling units of
selected clusters
E.g. selecting some districts from a state and
auditing them, leaving out the other districts
21
Cluster Sampling (contd..)
Advantages
Simple: Complete list of units (sampling frame) is
required only for clusters selected in the sample
Less travel/resources required
Disadvantages
Imprecise if clusters homogeneous (Large sample as
compared to SRS is required for the same precision)
Sampling error difficult to measure
22
The two stages of a Cluster Sample
First stage: Probability proportional to size (PPS)
• Find the number of clusters to be included
• Compute cumulative totals of the populations for each cluster
with a grand total
• Divide the grand total by the number of clusters to be selected to obtain the
sampling interval (k)
• Choose a random number between 1 and k and identify the first
cluster
• Add the sampling interval and identify the second cluster
• By repeating the same procedure, identify all the clusters to be
selected
Second stage
In each selected cluster select a random sample of required
number of units using a sampling frame of Basic Sampling Units
in the cluster.
23
Selection of PPS Sample
Let’s take treasuries as clusters/strata; the objective is to
select 30 clusters/strata, i.e. 30 treasuries, using PPS, the size
being the no. of vouchers in a treasury.
Procedure: List all treasuries with the number of vouchers in
them; find the cumulative totals of the number of vouchers:
Treasury   No. of vouchers   Cumulative total
1          34                34
2          60                94
3          30                124
4          76                200
5          315               515
… and so on
Total                        4,715
Divide the grand total (4,715) by the number of clusters to select (30):
4,715 / 30 = 157.1; the sampling interval ‘k’ is 157
24
Selection of PPS Sample contd…
Find a random number between 1 and 157, say 123
Select the first cluster corresponding to 123 in the cumulative total.
Select the remaining clusters from the cumulative distribution
by adding 157 (sampling interval) each time.
Treasury   No. of vouchers   Cum. total   Cluster selected
3          30                124          * selected (1st: 123)
4          76                200
5          315               515          ** selected twice (2nd: 123+157=280; 3rd: 280+157=437)
Second Stage: In each selected cluster (treasury) choose required
number of vouchers by random or systematic selection.
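The first-stage selection can be sketched in Python as follows. The function name is illustrative, and only the five treasuries shown on the slide are used, because the rest of the list behind the total of 4,715 is not given.

```python
from bisect import bisect_left
from itertools import accumulate

def pps_systematic(sizes, k, start):
    """Return 1-based cluster numbers hit by start, start + k, start + 2k, ..."""
    cum = list(accumulate(sizes))      # cumulative totals of the cluster sizes
    picks = []
    point = start
    while point <= cum[-1]:
        # first cluster whose cumulative total is >= the selection point
        picks.append(bisect_left(cum, point) + 1)
        point += k
    return picks

# Only the five treasuries listed on the slide (cumulative total 515)
vouchers = [34, 60, 30, 76, 315]
picked = pps_systematic(vouchers, k=157, start=123)
print(picked)
```

A large cluster such as treasury 5 spans more than one sampling interval, which is why it can be selected more than once.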
25
Stratified Sampling Vs Cluster Sampling
• In both stratified and cluster sampling, the population is
divided into well-defined groups.
Stratified sampling is used when each group has small
variation (more homogeneity) within itself but wide
variation between the groups.
Cluster sampling is used in the opposite case, when there
is considerable variation within each group but the groups
are essentially similar to each other.
In stratified sampling an estimate for each stratum is
also available, but not in cluster sampling.
26
Stratified Cluster Sampling - Cont.
• Suppose in a state there are 20 Districts;
• We take a sample of 15 villages in each of the 20
Districts to study the implementation of MGNREGA
• In all 300 villages are selected and studied
• This is an example of stratified sampling when
estimates of the desired characteristics for each of the
Districts (Strata) would also be available
• On the other hand let us select 5 districts out of 20 and
take a sample of 60 villages in each of the selected District
• In all 300 villages are selected and studied
• This is an example of Cluster Sampling
• In this case estimates of the desired characteristics for
each of the Districts (Cluster) would not be available
27
Multistage sampling – an example
To obtain a sample of ‘n’ households in the country: the first
stage units may be states, the second stage units (SSUs) districts
from selected states, the third stage units villages from selected
districts, and the ultimate stage units households in the villages
Advantages
Most feasible approach for large populations
No complete listing of units is required at various stages;
second stage frame is required only for the selected first
stage units.
This leads to great saving in operational cost.
Disadvantages
Several sampling lists
Sampling error difficult to measure
May be less efficient compared to a suitable single stage
sampling of the same size.
28
Sampling in AUDIT - III
Audit Sampling
Application of an audit procedure to less than 100%
of the items/transactions for the purpose of
evaluating some characteristic of the items/
transactions under audit.
Use of audit sampling may not be possible/
advisable in auditing procedures involving
scanning accounting records for unusual items
(outliers), inquiries (Satyam case), most
analytical/detailed procedures, etc.
10
Need of Statistical Sampling in Auditing
11
Advantages of Statistical Sampling
Offer a means of estimating errors/misstatement in
quantifiable and reliable manner
Takes into account risk and materiality for determining
sample size and cost.
Offers a means of arriving at an optimum sample size to
avoid under or over auditing
Properly designed sampling estimates are unbiased and
transparent.
Helps in forming an opinion about the extent of audit
objection/value of misstatement [non-levy/short-levy of
taxes] in the population with a specified sampling risk (say
5%); e.g. with 95% confidence we can say that errors
in vouchers are between 2.6% and 3.0%
12
Sampling Error
No sample is a perfect mirror image of population
The estimates obtained vary from sample to sample.
The sampling variance is the measure of variability of a
sample estimator like variance of average or proportion.
The square root of the variance of the sample estimator
is called the standard error of the estimator.
The lesser the value of standard error, the more efficient
would be the estimator.
Use of Proper sample design and sufficient sample size
reduce sampling error and increase the efficiency of the
estimate(s) obtained by sampling.
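A small sketch of the standard error for the two estimators named above, using the usual simple-random-sampling formulas: s/√n for a mean and √(p(1 − p)/n) for a proportion. The finite population correction is ignored here, and the figures are hypothetical.

```python
import math
import statistics

def se_mean(values):
    """Standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical case: 3% of sampled vouchers in error, for two sample sizes
se_400 = se_proportion(0.03, 400)
se_1600 = se_proportion(0.03, 1600)
print(se_400, se_1600)
```

Quadrupling the sample size halves the standard error, which shows how a sufficient sample size reduces sampling error.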
13
Non-Sampling Errors in Audit
include any misjudgement or mistakes by the auditor that may lead
to incorrect conclusion(s) based on audit. They occur even if full
population is examined.
By careful planning & supervision and by using appropriate audit
technique non sampling errors can be reduced but they can't be
eliminated. Some of the cases of non-sampling errors in audit are:
Selecting inappropriate audit procedures to achieve specific
objective. For e.g. an auditor checks controller’s signature on
voucher & not disbursement approval.
Auditor may fail to recognize misstatements (errors) included in
documents that he examines. Can you think of some examples?
selecting inappropriate population for e.g. selecting only BPL
households for audit of a scheme involving payment of subsidy.
Auditor makes an error in evaluation (say totalling mistake or
16
Attribute Sampling Plans or Attribute Sampling
An attribute is a qualitative characteristic which can not be
measured quantitatively. However, the population may be
classified into various classes w. r. t. the attribute
Attribute Sampling is used in Tests of Controls (TOC) i.e. to
find out no. of deviations, proportion (%) of deviations etc. –
it deals with ‘How Many’
Examples of attribute sampling:(a) Whether the Financial and
Accounting system of Canteen Stores Department (CSD)
adheres to the laid down standards & procedures?
(b) If the system of identifying targeted beneficiary in Social
Security scheme was in place and was working effectively?
(c) Verifying signatures or approval stamp on a bill/voucher
(d) Entries posted in the correct account/Head?
17
Attribute Sampling - Types
Fixed sample size attribute sampling - objective is
to perform test of control to estimate the
deviation/error rate of a population.
Sequential (stop or go) attribute sampling used for
not so common cases; it prevents oversampling.
Discovery/Exploratory Sampling: observing at least
one deviation - very rare cases.
Block Sampling: It includes all items in a selected
time period/group say all vouchers of January or
all vouchers of a particular department/treasury.
18
Sampling Risks -- Tests of Controls
Rows: what the test of controls sample indicates. Columns: actual extent of operating effectiveness of the control procedure.
Sample indicates                          Actually adequate                                                  Actually inadequate
Operating effectiveness is adequate       Correct decision                                                   Incorrect decision (risk of assessing control risk too low) - (β)
Operating effectiveness is inadequate     Incorrect decision (risk of assessing control risk too high) - (α)    Correct decision
19
Practical Illustration of Attribute Sampling
Objective: To find out if the controls are operating
effectively or not?
We Assume
Risk of Assessing Control Risk Too Low (β) - 5%; it means
95% reliability
Tolerable Deviation Rate — 9 %
21
Sample Size and Evaluating Attributes
Sampling Results
Sample size using the table (previous slide) is 68 (2)
It means auditor should select a sample of 68 items.
Bracketed number - (2) means; if 2 or less
deviations are observed in a sample of 68, we may
conclude that audit objective has been
accomplished. We may conclude like:
“I believe that the deviation/error rate in the
population is less than 9 percent.” We will be wrong
5 % of the time when the deviation is exactly 9 %.
If the deviation rate is in excess of 9 % we will be
wrong even less than 5 % of the time.
Planned assessed level of control risk is achieved.
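The statement about being wrong 5% of the time can be checked with a simple binomial model. This is a sketch under the assumption that deviations occur independently at a constant rate; it does not reproduce the sample size table referred to above, and the function names are illustrative.

```python
from math import comb

def acceptance_risk(n, c, p):
    """P(at most c deviations in a sample of n) when the true deviation rate is p."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

def smallest_sample_size(c, p, beta):
    """Smallest n for which the acceptance risk at rate p is no more than beta."""
    n = c + 1
    while acceptance_risk(n, c, p) > beta:
        n += 1
    return n

# Plan from the slide: n = 68, accept if 2 or fewer deviations, tolerable rate 9%
risk = acceptance_risk(68, 2, 0.09)
print(round(risk, 4), smallest_sample_size(2, 0.09, 0.05))
```

Under this model the risk of accepting at a true rate of exactly 9% is just under 5%, and it falls further as the true rate rises above 9%.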
22
Evaluating Attributes Sampling Results
23
Variable Sampling Plans
Variable (or quantitative) sampling is used when
the objective is to estimate a quantity (like
amount of loss to government , average loss per
transaction, etc.); it deals with “How Much”
It is used primarily for substantive testing. Most
commonly used variable sampling plan is
Probability Proportional to Size (PPS) or Monetary
Unit Sampling (MUS)
MUS: It is a hybrid plan combining the
characteristics of attribute and variable sampling.
24
Sampling Risk/Errors for
Substantive Tests – Variable Sampling
[Decision table; only one row is recoverable]
No material misstatements: Correct decision / Incorrect decision (risk of assessing control risk too low) - (β)
25
Risk of incorrect rejection (Alpha risk) or
Risk of Assessing Control Risk Too High
Risk that sample supports the conclusion that the account
balance is materially misstated when it is not.
Corresponds to risk of rejecting a correct null hypothesis.
10% Alpha risk means there is a prob./chance of 0.1
(10%) of concluding that there is a misstatement while
actually there is none.
Arises when the sample indicates a higher level of
errors/risk than is actually the case.
This situation is usually resolved by performing additional audit work,
i.e. taking a larger sample.
It affects audit efficiency but should not affect the validity of
the resulting audit conclusion.
26
Risk of incorrect acceptance (Beta risk)
Risk of Assessing Control Risk Too Low
Material error is not detected in a population because the
sample failed to select sufficient items containing errors.
Corresponds to the risk of not rejecting a false null hypothesis.
In audit, which is more serious: alpha or beta?
It affects audit effectiveness.
A 10% beta risk means there is a prob./chance of 0.1 (10%) of concluding that
there is no misstatement while actually there is a
misstatement; it also indicates 100 – 10 = 90% reliability.
To control this risk we increase precision and hence
sample size.
27
Sample Size required for Un-stratified MPU*
$$n_0 = \left(\frac{U_r \cdot SD \cdot N}{A}\right)^2$$
If $n_0/N$ is high, the sample size can be
further reduced for SRSWOR as
$$n = \frac{n_0}{1 + \dfrac{n_0}{N}}$$
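A minimal sketch of the two-step calculation follows. The symbol meanings are assumptions, since the slide does not define them: U_r is read as the confidence coefficient (1.96 for 95%), SD as the estimated standard deviation per item, N as the number of items, and A as the desired precision for the population total. All input values are hypothetical.

```python
import math

def mpu_sample_size(U_r, sd, N, A):
    """n0 = (U_r * SD * N / A)^2, then n = n0 / (1 + n0 / N) for SRSWOR."""
    n0 = (U_r * sd * N / A) ** 2
    n = n0 / (1 + n0 / N)              # reduction for sampling without replacement
    return n0, n

# Hypothetical inputs: 95% confidence, SD = Rs. 100, 1000 items, precision Rs. 20000
n0, n = mpu_sample_size(1.96, 100, 1000, 20000)
print(round(n0, 2), math.ceil(n))
```

The result is rounded up because a fractional item cannot be sampled.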
31
Sixty- two additional sample items are added to
the pilot sample of 30 to yield the total sample
of 92. The 62 additional sample items are
selected using SRSWOR.
•A standard deviation based on 92 items is
calculated. Assume that standard deviation is
Rs. 136.
35
Attribute Sampling: Sample Size
Objective is to estimate the number (proportion) of
audit objections (errors); estimated number of audit
objections in the population is sample proportion of
error multiplied by number of items in the population.
The optimum sample size under SRSWR is:
$$n = \frac{Z_r^2 \cdot P(1 - P)}{A^2}$$
where $Z_r$ = confidence level coefficient = 1.96
for 95% level of confidence, A = margin of error
(we are prepared to accept normally 10 or 20%),
P = proportion of errors expected in the
population.
36
Estimating P: The formula requires the
knowledge of P, expected proportion of errors in the
population. However, this is what we are trying to
estimate and is unknown
Ways to estimate proportion :
A pilot or preliminary sample. Observations used
in the pilot study can be counted as part of the
final sample
Estimates may be available from previous audit
reports and the upper bound of P can be used in
the formula
If impossible to obtain a better estimate, set p =
0.5 in the formula to yield maximum value of n
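The effect of the assumed P on the sample size can be seen in a short sketch of the SRSWR formula n = Z² P(1 − P)/A². The 10% margin of error and the trial values of P are illustrative.

```python
import math

def attribute_sample_size(p, margin, z=1.96):
    """n = z^2 * p * (1 - p) / margin^2, rounded up to a whole item."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

# Sample size for several assumed error proportions at a 10% margin of error
sizes_by_p = {p: attribute_sample_size(p, margin=0.10) for p in (0.1, 0.3, 0.5)}
print(sizes_by_p)
```

Because P(1 − P) peaks at P = 0.5, that choice gives the largest and therefore the most conservative sample size.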
37
Reporting the Results
When reporting the results of a Sample it
is important to cover the following key
factors:
The Sample Size
The Sample selection methodology
The Estimate(s) resulting from the Sample
The precision (Std. Error) and confidence
intervals for the Estimate(s)
38
Monetary Unit Sampling MUS
MUS is nothing but Probability Proportional to
Size (PPS) - systematic sampling, where one
assigns high inclusion probability to the
transactions having high value.
In MUS method the sampling unit is not an
invoice or any other physical unit, but an
individual rupee. However, when the individual
rupee is selected, the auditor does not verify
just that particular rupee, but the rupee acts as
a hook and drags the whole invoice with it.
Difficult to apply/understand manually and can be
explained using IDEA
39
Control Measures for Non
Sampling Errors
1
Sampling Errors arise due to:
• POPULATION SPECIFICATION ERROR—when the researcher does not
understand who (s)he should survey. For e.g. in a survey about
breakfast cereal consumption - who should be surveyed? The mother
makes the purchase decision, but the children influence her choice.
• SAMPLE FRAME ERROR—when the wrong sub-population is used to
select a sample. For e.g. if the sample frame is from car registrations
and telephone directories. The results may be wrongly predicted.
• SELECTION ERROR—This occurs when only those that are interested
respond. It can be controlled by pre-survey contact requesting
cooperation, actual surveying, post survey follow-up if a response is
not received.
• NON-RESPONSE—Non-response errors occur when respondents are
different than those who do not respond. The extent of this non-
response error can be checked through follow-up surveys etc.
• SAMPLING ERRORS—These errors occur because of variation in the
number or representativeness of the sample that responds.
• These errors can be controlled by (1) careful sample designs, (2) large
samples and (3) multiple contacts to assure representative response.
2
Non Sampling Errors: Types
Conceptual Errors:
• Lack of qualified and suitable enumerators
• Lack of proper training of field staff to
make them thorough with the concepts
and definitions involved
Errors of Recording/ Transcription: Due to
carelessness and negligence of the auditor
Errors of Inaccurate Measurement: Due to
erroneous figure of measurement given by the
informant/auditee
3
Non Sampling Errors: Types Contd…
Errors in Totalling:
When there are many items to be totalled
up
Totalling of subtotals may quite often lead
to such errors
Errors of Omission:
When field worker fails to ask certain
questions in the block
Due to non availability of required
information
4
Non Sampling Errors: Types Cont.
Bias of the Interviewer:
Due to inadequate training or partial
understanding of instructions
Putting a question in a specific way or
telling suggestive answers
Errors of Inconsistency: When data are
inconsistent with the similar information
collected in some other part of the same
schedule
5
Non Sampling Errors: Types Contd.
Response Error:
Due to wrong notion present in the mind of
respondent
Due to some kind of fear
Due to wrong understanding of questions
Due to Illiteracy
Due to lack of clarity in questions
Due to deliberate poor response.
Error due to Prestige/ Self interest: Due to
prestige, pride or self interest, informant may
introduce bias by upgrading education,
expenditure and downgrading age, income, etc.
6
Non Sampling Errors: Types Contd…
Errors due to recall lapse: If the recall period is
longer, answers may be based on guess or averages
Error due to absence of right informant
Error due to incorrect identification of sampling
units (say wrong marking of boundaries)
• Boundaries not correctly identified due to lack of
adequate effort or due to misguidance by some
person
Errors due to longer reference period:
• Inclusion of information pertaining to period
out of reference period
• Exclusion of information pertaining to the
period within the reference period .
7
Methods of Controlling Non Sampling Errors
Recruitment of proper primary field worker who
has got:
• Aptitude for field work
• Good knowledge of the survey area/local
language
• Proper academic qualification
• Tactfulness and resourcefulness
Training: Required for facilitating and
understanding the sampling design, various
concepts & definitions, schedules of enquiry and
procedure of data collection
8
Methods of Controlling Non Sampling Errors - Cont.
• Purpose of training/workshops is to bring uniformity
in concepts and procedures
• Active participation by primary field workers and
supervisors
Inspection/ Supervision:
• On the spot verification
• Instant feedback to the investigating staff
• Inspection norms
Probing:
• Probing questions should be simple
• Should not create any sort of bitterness
Cross checking
9
Methods of Controlling Non Sampling Errors - Cont.
Scrutiny and Super Scrutiny
Monthly Meetings
Feedback Reports
Role of experienced field staff in improvement
of quality
Amendments in the Schedules and Clarifications:
Pilot Survey
10
Risk Based Audit Approach Session 7.1
This session is the last of the sessions on (a) Monetary Unit Sampling:
audit sampling. Session 5.1 was on the Monetary Unit Sampling (MUS) is a
basic concepts of sampling and Session 6.1 sampling method in which the sampling
on the application of attribute sampling for unit is not an invoice or any other physical
control procedures. This session will cover unit, but an individual rupee. However,
the basics of another type of sampling, the when the individual rupee is selected, the
Monetary Unit Sampling and its application auditor does not verify just that particular
for substantive test of details. rupee, but the rupee acts as a hook and
drags the whole invoice with it. For
example, if, as a result of sample selection, Rs.365 is selected for testing and that rupee falls in voucher number 14, then that voucher will be audited and its quality assigned to the sampling unit.

Let us assume that there are 6 items out of which 2 items are to be selected. The values of the 6 items are 100, 200, 300, 400, 500 and 1000. If attribute sampling is used to select the 2 items, then all the items have an equal chance of selection, as the sampling unit would be the individual item. On the other hand, if MUS is used, then the total value of the 6 items works out to Rs.2500, i.e., there are 2500 sampling units. As 2 items are to be selected, the sampling interval would be 2500/2 (Rs.1250). This means that one rupee out of every 1250 rupees would be selected. In such a case, the chance of the 1000 rupee item getting picked up is 10 times that of the 100 rupee item. Thus MUS has a bias towards high value items.

(b) Most Likely Error:
Most Likely Error (MLE) is an estimate of the error in the population. Initially, MLE will be estimated based on past experience and used for determining the sample size. After carrying out the substantive test of details, the MLE will be projected based on actual sample results and used for drawing audit conclusions.

Actually, Monetary Unit Sampling (MUS) is used widely both for
• Test of Controls
• Account Balance

In this session, we will discuss MUS for Substantive Test of Details covering the following important points:
• Monetary Unit Sampling (MUS), advantages and limitations of MUS and its application to substantive test of details;
• Determining the sample size for substantive test of details; and
• Evaluation of results of substantive test of details.

Learning Objectives

At the end of the session, you would be able to apply MUS for substantive test of details, provided that the steps are followed correctly and the advantages and limitations of MUS are kept in mind.

The session is expected to provide only a broad overview of MUS and is not meant to impart expertise on MUS. Participants may read additional study material suggested in the bibliography for further knowledge.
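The six-item illustration in this note (two selections from items worth 100, 200, 300, 400, 500 and 1000) can be reproduced with a minimal Python sketch. It assumes systematic selection, i.e. one rupee is taken from every interval of Rs.1250; the function name mus_select and the starting rupee of 365 are illustrative choices, not part of the note.

```python
from itertools import accumulate

def mus_select(values, n, start):
    """Systematic monetary unit selection (illustrative sketch).

    values: item amounts in rupees
    n:      number of rupees (sampling units) to select
    start:  selected rupee within the first interval (assumed, normally random)
    Returns the sampling interval and the 0-based indices of the items hit.
    """
    total = sum(values)
    interval = total / n
    cumulative = list(accumulate(values))
    hits = []
    for k in range(n):
        rupee = start + k * interval
        # The item containing this rupee is the first one whose
        # cumulative value reaches it.
        idx = next(i for i, c in enumerate(cumulative) if rupee <= c)
        hits.append(idx)
    return interval, hits

values = [100, 200, 300, 400, 500, 1000]
interval, hits = mus_select(values, n=2, start=365)

# Chance that any one selected rupee falls in each item: value / total.
probs = [v / sum(values) for v in values]

print("sampling interval:", interval)
print("items selected:", [values[i] for i in hits])
print("ratio of chances, 1000 vs 100 rupee item:", probs[5] / probs[0])
```

Because each rupee is a sampling unit, an item's chance of being hit is proportional to its value, which is the bias towards high value items described in the note.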
Note 7.1 1
Risk Based Audit Approach Session 7.1
Application of Statistical Sampling in Performance Audit of National Rural Employment Guarantee Act (MGNREGA)
NREGA Background information
MGNREGA: All-India coverage of rural areas;
Administrative set-up in the country: State level, District level, Block level, GP level, Beneficiary level.
Through MGNREGA, the govt. committed to provide employment (at min. wages) to every rural family that demands such work and whose adult members volunteer to do such work.
Objectives: (i) enhancement of livelihood security of households (hhs) in rural areas by providing guaranteed wage employment; (ii) creation of durable assets;
Principal implementing agency is the Gram Panchayat (GP).
Audit Objectives
• Structural mechanisms were in place and adequate capacity building measures were taken by the Central and state govts. for implementation of the Act;
• Procedures for preparing perspective (long term) and annual plans at different levels for estimating the likely demand for work were adequate and effective;
• Funds were released, accounted for and utilised by the governments in compliance with the provisions of the Act and other extant/existing rules;
• Process of registration of households, allotment of job cards and allocation of employment in compliance with the Act and rules was effective.
Audit Objectives – Cont.
• Livelihood security was provided by giving 100 days of employment to
households in rural areas on demand, and wages, as declared, were paid;
• MGNREGS works were efficiently and effectively executed in a
time-bound manner and in compliance with the Act and Rules,
and durable assets were created, maintained and accounted for;
• Convergence of the Scheme with other rural development
programmes as envisaged was effectively achieved in enhancing
the employment opportunities under MGNREGS;
• All required records at various levels were properly maintained
and MGNREGS MIS data was accurate, reliable and timely;
• Transparency was maintained by involving all stakeholders in
various stages of its implementation;
• Effective mechanism at Centre and state level existed to assess
the impact of MGNREGS on individual households, local labour
market, migration cycle and efficacy of assets created.
Sampling plan
Selection of Districts (1st Stage Unit): Each state was stratified into 2-5 strata depending on geographical contiguity; within a stratum, 25% of districts were selected by SRSWOR, subject to a minimum of 2 districts.
Selection of Rural Blocks (2nd SU): Within each selected district, 2-3 blocks were selected, again by SRSWOR (2 blocks if the number of blocks in the selected district is < 10).
Selection of GPs (3rd SU): Within each selected block, 25% of GPs (max. 10) were selected by PPSWOR, with the size measure being the number of job cards or a similar proxy parameter such as the number of applicants under NREGA, the BPL population or the population size.
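The stage-wise sample sizes and the two selection methods named in this plan can be sketched in Python as below. Several details are assumptions rather than part of the plan: fractions are rounded up, 3 blocks are taken whenever a district has 10 or more blocks, PPSWOR is implemented as successive draws with probability proportional to the remaining sizes, and the job-card counts and GP names are made up for illustration.

```python
import math
import random

def n_districts(stratum_size):
    """25% of districts in a stratum, minimum 2, never more than exist."""
    return min(stratum_size, max(2, math.ceil(0.25 * stratum_size)))

def n_blocks(blocks_in_district):
    """2 blocks if the district has fewer than 10 blocks, else 3 (assumed)."""
    return min(blocks_in_district, 2 if blocks_in_district < 10 else 3)

def n_gps(gps_in_block):
    """25% of GPs in a block, capped at 10."""
    return min(gps_in_block, 10, math.ceil(0.25 * gps_in_block))

def srswor(units, n, rng):
    """Simple random sampling without replacement."""
    return rng.sample(list(units), n)

def ppswor(size_by_unit, n, rng):
    """Successive draws without replacement, each draw made with
    probability proportional to the size of the units still remaining."""
    remaining = dict(size_by_unit)
    chosen = []
    for _ in range(min(n, len(remaining))):
        names = list(remaining)
        weights = [remaining[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen

rng = random.Random(2024)  # fixed seed so a run can be repeated

# Hypothetical block with 8 GPs; size measure = number of job cards.
job_cards = {"GP-A": 120, "GP-B": 450, "GP-C": 80, "GP-D": 300,
             "GP-E": 210, "GP-F": 95, "GP-G": 640, "GP-H": 150}
selected_gps = ppswor(job_cards, n_gps(len(job_cards)), rng)
print("GPs selected:", selected_gps)
```

For districts and blocks, the same pattern applies with srswor in place of ppswor, for example srswor(district_names, n_districts(len(district_names)), rng) for a hypothetical list district_names.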
Sampling Plan
Selection of Works (Final SU - I): For each selected
GP, 10 works (including incomplete works and works
sanctioned in different years) were selected using
SRSWOR; care was taken to select different types of
works like rural connectivity, afforestation, canal
works, wasteland development etc.
Selection of beneficiaries (Final SU - II): For each
selected GP, 10 beneficiaries were to be selected
using systematic sampling (min. 2 SC/ST).
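Systematic selection of 10 beneficiaries from a GP's list can be sketched as follows. This is only an illustration: the list of job-card numbers is hypothetical, a random start within the first interval is assumed, and the way the minimum of 2 SC/ST beneficiaries was enforced is not described here, so it is left out of the sketch.

```python
import random

def systematic_sample(units, n, rng):
    """Pick n units at a fixed interval after a random start.

    If the list has n or fewer units, all of them are returned.
    """
    units = list(units)
    if len(units) <= n:
        return units
    step = len(units) / n          # sampling interval (may be fractional)
    start = rng.random() * step    # random start within the first interval
    picks = []
    for i in range(n):
        # Clamp guards against a floating-point edge case at the list end.
        idx = min(int(start + i * step), len(units) - 1)
        picks.append(units[idx])
    return picks

rng = random.Random(7)  # fixed seed so a run can be repeated

# Hypothetical beneficiary list for one GP (137 job cards).
beneficiaries = [f"JC-{i:03d}" for i in range(137)]
sample = systematic_sample(beneficiaries, 10, rng)
print(sample)
```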
Approaches discussed by EPoD, CID
• Three indicators could have been used for selection of Districts: (i) NREGA Exp. (MUS), (ii) Rural Population and (iii) Person Days Worked.
• They would have taken into account, respectively, (i) spending, with each rupee of spending equally likely to be selected, (ii) the potential population which qualifies for NREGA and (iii) the intensity of implementation in a district.
• Districts could have been stratified based on one of these 3 variables, with the 4th quartile (top 25% of values) assigned a high probability of selection, up to 100%.
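The idea of giving the top quartile of districts a selection probability of up to 100% can be illustrated with a small Python sketch. Everything specific in it is an assumption: the expenditure figures and district names are made up, the top quartile is taken as the highest-ranked 25% of districts (rounded up) and selected with certainty, and the remaining districts are sampled by SRSWOR with an arbitrary count of 2.

```python
import math
import random

def split_top_quartile(value_by_district):
    """Rank districts by the chosen variable and split off the top 25%."""
    ranked = sorted(value_by_district, key=value_by_district.get, reverse=True)
    cut = math.ceil(0.25 * len(ranked))
    return ranked[:cut], ranked[cut:]

rng = random.Random(11)  # fixed seed so a run can be repeated

# Hypothetical NREGA expenditure by district (any unit).
expenditure = {"D1": 10, "D2": 20, "D3": 30, "D4": 40,
               "D5": 50, "D6": 60, "D7": 70, "D8": 80}

certainty, rest = split_top_quartile(expenditure)
# Top quartile taken with probability 100%; the others by SRSWOR.
selected = certainty + rng.sample(rest, 2)
print("selected with certainty:", certainty)
print("full sample:", selected)
```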
Approaches discussed by CID – Cont.
• Selecting one District from each state where no District is selected by the above technique, so that each state is represented in the sample.
• The rural population approach would have uncovered a greater number of areas with high rural population but low MGNREGA coverage.
• The person days sample has 2 shortcomings as compared to the total Exp. approach: (i) it may cover fewer districts where material expenses exceeded 40% of total NREGA Exp.; (ii) the proportion of households reaching 100 days of work is artificially high.
Potential Extensions of methodology
• 100% Sampling in Key Areas like districts
with highest exp.
• Sampling on Multiple variables like Rural
population and NREGA Exp. taken together
• Using flags to sample Key Districts like
districts where material expenses exceed
40%
• Purposeful selection of works near Pradhan’s
residence.