Sampling Research Data

Statistical Sampling in Audit
1. Introduction:
1.1 Our knowledge, our attitudes and our actions are based to a very large extent on observations of a few
samples. This is equally true in everyday life, in scientific research and in audit. A person’s opinion of
an institution that conducts thousands of transactions every day is often determined by the one or two
encounters which he or she has had with the institution in the course of several years. In science and human
affairs alike we lack the resources to study more than a fragment of the phenomena that might advance our
knowledge. Sampling consists of selecting some part of a population to observe so that one may estimate
something about the whole population. For example, to estimate the amount of recoverable oil in a region, a
few (highly expensive) sample holes are drilled. The situation is similar in a national opinion survey, in
which only a sample of the people in the population is contacted, and the opinions in the sample
are used to estimate the proportions holding the various opinions in the whole population. To estimate the
prevalence of a rare disease, the sample might consist of a number of medical institutions, each of which
has records of patients treated. Sampling is the science that guides quantitative studies of content, behavior,
performance, materials and causes of differences.
1.2 Some obvious questions for such studies are how best to obtain the sample and make the observations and,
once the sample data are in hand, how best to use them to estimate the characteristic of the whole
population. Obtaining the observations involves the question of sample size, how to select the sample, what
observational methods to use, and what measurements to record. These are the issues that statistical
sampling addresses scientifically.
1.3 In the basic statistical sampling setup, the population consists of a known, finite number N of units, such as
transactions, households or people. With each unit is associated a value of the variable of interest, which may be
referred to as the x-value of the unit. The x-value of each unit in the population is viewed as a fixed, unknown
quantity. The units in the population are identifiable and may be labeled with the numbers 1, 2, …, N. Only a
sample of the units in the population is selected and observed. The data collected consist of the x-value for
each unit in the sample, together with the unit’s label. The procedure by which the sample of units is
selected from the population is called the sampling design. The usual inference problem in sampling is to
estimate some summary characteristic of interest of the population, such as the mean or the total of the
x-values, after observing only the sample.
1.4 The basic statistical sampling view assumes that the variable of interest is measured without error on every
unit in the sample, so that errors in the estimates occur only because just part of the population is
included in the sample. Such errors are referred to as sampling errors. In real survey situations
non-sampling errors may also arise, for example from non-response, measurement error, fatigue or detectability
problems.
2. Sampling in Audit.
2.1 In the early stages of the development of independent audit, it was not an uncommon practice for an
auditor to perform a 100% examination of the entries and records of the entities audited. However, as the
economy grew, it quickly became apparent that a 100% examination of the tremendous volume of entries was
unwarranted and uneconomical. This developed into the test or test check approach, which is both widely
accepted and widely used in audit. It is quite obvious that such a method, involving examination of a portion of
a large quantity of entries in order to draw conclusions about the larger group, is a sampling operation, even
though the word “sample” is not generally used in connection with a test.
2.2 As sampling became a widely accepted tool in audit, another concept, ‘Risk Assessment’,
emerged almost simultaneously. It allows the auditor to focus on risky areas through an objective analysis of
available information about the auditable unit. This is somewhat similar to the auditor’s judgment based on
experience and skill. The main idea of risk analysis is to identify risky areas in an objective way so
that the auditor can focus on the riskier areas and make optimal use of available resources to meet the overall audit
objectives. Sampling based only on risk assessment is, however, non-statistical sampling. Non-statistical
sampling goes by different names in the literature: 1) Judgmental sampling, 2) Convenience
sampling, 3) Purposive sampling or 4) Haphazard sampling.
2.3 Auditors may choose a non-statistical sampling plan; that is, they may want to rely on judgment or
specific knowledge about the population in selecting units for audit. Auditors cannot use the results from a
judgmental (non-statistical) sample to draw statistically valid conclusions about the population in general.
Alternatively, the specific knowledge (or judgment) about the population units may be used to develop a
statistically valid sampling plan from which one can draw conclusions about the population. One method is to use this
knowledge to divide the population into several homogeneous sub-groups called strata, and to sample
some units from each stratum using a statistical procedure. Such a sampling plan will improve the
audit procedure.
2.4 The users of the audit report expect fairness in the selection procedure and more transparent reporting.
The auditors, on the other hand, are interested in commenting on the nature of problems in the population from
the audit findings in the sample. Statistical sampling, which provides estimates of the characteristic of interest
together with the reliability of those estimates, is the scientific solution to these problems. Audit reports based on this
scientific approach are defensible, which enhances the acceptability and effectiveness of the audit report.
3. Definition & advantages of statistical sampling.

3.1 The essential features of statistical sampling are:
(i) The sample items should have a known probability of selection, for example by random selection.
(ii) The sample results should be evaluated mathematically, that is, in accordance with probability
theory.
Meeting only one of these requirements does not make the application statistical. For example,
practitioners and others will sometimes state that they are using statistical sampling solely because a
random number method is employed to select the sample. However, this is not statistical sampling, as no
attempt has been made to evaluate the sample findings mathematically (condition (ii)).
3.2 Statistical sampling allows auditors to calculate the reliability of the sample and the risk of relying on it. It
permits auditors to optimize the sample size given the mathematically measured risk they are willing to
accept. In this way both over-auditing and under-auditing can be avoided. It enables auditors to make objective
statements about the sampled population on the basis of sample observations. In other words, the sample
findings can be projected to the population.
4. Various Statistical Sampling methods.

4.1 The simplest form of random/statistical sampling consists of selecting the sample unit by unit (or item by
item), ensuring an equal probability of selection for every unit at each draw. This technique of selection is
termed Simple Random Sampling (SRS). SRS is of two types:
4.1.1 In Simple Random Sampling With Replacement (SRSWR) a unit is selected from the sampling
frame (list of units in the population); the unit is put back and the next unit is selected; the process is
repeated until a sample of the desired size is selected. As a result, a unit can be included more
than once.
4.1.2 In Simple Random Sampling Without Replacement (SRSWOR) a unit selected for inclusion
in the sample is removed from the sampling frame before the next unit is selected; therefore a unit cannot
be selected again.
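The difference between the two schemes can be illustrated with a minimal Python sketch. The sampling frame (unit labels 1 to 20), the seed and the sample size are illustrative assumptions, not values from this note.

```python
import random

def srswr(frame, n, rng):
    """Simple random sample WITH replacement: a unit may appear more than once."""
    return [rng.choice(frame) for _ in range(n)]

def srswor(frame, n, rng):
    """Simple random sample WITHOUT replacement: every selected unit is distinct."""
    return rng.sample(frame, n)

rng = random.Random(2024)      # fixed seed so the draw can be reproduced
frame = list(range(1, 21))     # hypothetical sampling frame: units labeled 1..20
with_repl = srswr(frame, 5, rng)
without_repl = srswor(frame, 5, rng)
print("SRSWR :", with_repl)
print("SRSWOR:", without_repl)
```

Recording the seed alongside the selected labels lets a reviewer regenerate exactly the same sample.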
4.2 In Systematic sampling, the sample is chosen by selecting a random starting point and then picking every
I-th unit in succession from the sampling frame, where I is the sampling interval. The sampling interval is the ratio of
population size to sample size, rounded to the nearest integer. Systematic sampling is less costly and
easier to implement than SRS, because random selection is done only once. Systematic sampling is of two
types: (a) Linear Systematic Sampling and (b) Circular Systematic Sampling.
4.3. Stratified sampling is a two-step process in which the population is partitioned into sub-populations, or
strata. The strata should be mutually exclusive and collectively exhaustive in that every population unit
should be assigned to one and only one stratum and no population unit should be omitted. From each
stratum units are selected by any random procedure, usually following SRS. The population units in each
stratum should be as homogeneous as possible. A major objective of stratified sampling is to increase
reliability without increasing cost.
4.4 In Cluster sampling the target population is first divided into mutually exclusive and collectively
exhaustive sub-populations, or clusters. Then a random sample of clusters is selected, based on a
probability sampling technique such as SRS. For each selected cluster, either all the units are included in
the sample or a sample of units is drawn. Units within each cluster should be as heterogeneous as the
population, i.e. heterogeneity within a cluster should be the same as that in the population, while the clusters
themselves should be as similar to one another as possible; each cluster should be a small-scale representation of
the population.
4.5 Probability Proportional to Size (PPS) sampling assigns a higher probability of selection to
population units with larger sizes (size may be total expenditure, total population etc.). In other words,
entities that are larger on some size measure have a higher chance of selection. Monetary
Unit Sampling (MUS) in audit is an example of PPS sampling with the money value of transactions as the size
measure. If repetition is allowed, it is called Probability Proportional to Size With Replacement (PPSWR)
sampling. MUS is in fact PPS-systematic sampling.
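One common way to carry out a PPS-systematic (MUS-style) selection is to lay the units end to end by their size measure and pick equally spaced monetary units after a random start. The sketch below assumes that approach; the function name, the book values and the seed are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_systematic(sizes, n, rng):
    """Return the indices of the units hit by n equally spaced monetary units."""
    cum = list(accumulate(sizes))        # running total of the size measure
    interval = cum[-1] / n               # sampling interval in monetary units
    start = rng.uniform(0, interval)     # random start within the first interval
    hits = []
    for k in range(n):
        point = start + k * interval
        # first unit whose cumulative total reaches the selected monetary unit;
        # min() guards against floating-point overshoot at the very end
        hits.append(min(bisect_left(cum, point), len(cum) - 1))
    return hits

rng = random.Random(7)
book_values = [100, 900, 50, 450, 500]   # hypothetical transaction values
selected = pps_systematic(book_values, 4, rng)
print(selected)
```

A unit larger than the sampling interval (here the 900 item against an interval of 500) is certain to be selected and may be hit more than once.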
4.6 Multi Stage Sampling: Sometimes, as in the case of cluster sampling, it is not possible to draw the ultimate
units of interest directly, as a sampling frame of such units is not available. However, a list of some suitable bigger
units, called primary stage units (psu’s) or first stage units (fsu’s), each comprising several smaller
second stage units (ssu’s), may be available, from which a sample of psu’s may be selected. Instead of
testing all the ssu’s in the selected psu's, only a sample of ssu's is then studied. This is
called two-stage sampling. If a sample of tertiary units is selected from each selected ssu, the sampling
plan is called three-stage sampling. Higher order multistage designs are similarly possible.
5. Estimation (Extrapolation) procedure & Sample size.
5.1 Unstratified Mean Per Unit (MPU): Unstratified MPU is used to project a sample mean to an estimate of the
population total. After a sample is selected with SRS and a value is determined for each sample item, the sample
mean $\bar{x}$ of the sample values is multiplied by the number of items in the population N to produce an estimate of the
total value of the sampled population. Assuming normality, the optimum sample size under SRS is
$$n = \left(\frac{Z_r \cdot SD \cdot N}{A}\right)^2,$$
where $Z_r$ = confidence level coefficient (see the Z Score table below), A = margin of error, N = population size and
SD = standard deviation,
$$SD = \sqrt{\frac{\sum_{j=1}^{N} x_j^2 - N\bar{x}^2}{N}},$$
with $\bar{x}$ in this last expression denoting the mean of the N values.
Because MPU without stratification produces large sample sizes relative to other sampling methods, its use
in survey sampling is limited.
Z Score Table
Confidence Level    Z-value
75%                 1.15
80%                 1.28
85%                 1.44
90%                 1.65
92%                 1.75
94%                 1.88
95%                 1.96
96%                 2.05
99%                 2.58
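As a rough illustration of the sample-size formula in 5.1, the Python sketch below looks up the coefficient in the Z Score table and rounds the result up to a whole unit. The standard deviation, population size and margin of error are made-up inputs.

```python
import math

# Z-values copied from the Z Score table above
Z_TABLE = {0.75: 1.15, 0.80: 1.28, 0.85: 1.44, 0.90: 1.65, 0.92: 1.75,
           0.94: 1.88, 0.95: 1.96, 0.96: 2.05, 0.99: 2.58}

def mpu_sample_size(confidence, sd, N, A):
    """n = (Z_r * SD * N / A)^2, rounded up to the next whole unit."""
    z = Z_TABLE[confidence]
    return math.ceil((z * sd * N / A) ** 2)

# Hypothetical inputs: SD of 40 per item, 5,000 items, margin of error 50,000 on the total
n_mpu = mpu_sample_size(0.95, sd=40.0, N=5000, A=50000.0)
print(n_mpu)   # (1.96 * 40 * 5000 / 50000)^2 = 7.84^2 = 61.47, rounded up
```

Because the formula carries no finite population correction, a very tight margin of error can return a value larger than N; in that case a full examination is implied.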
5.2 Stratified Mean Per Unit: When the population is highly variable (large standard deviation), technically
called a heterogeneous population, unstratified MPU may produce very large sample sizes. Stratification of
the population, as explained earlier, produces an estimate that has the desired level of reliability with a reduced
sample size. The sample size for each stratum may be optimally determined as
$$n_i = \frac{(N_i \cdot SD_i)\,\sum_{k} N_k \cdot SD_k}{(A/Z_r)^2 + \sum_{k} N_k \cdot SD_k^2},$$
where $N_i$ = number of units in the ith stratum, $SD_i$ = Standard Deviation of the ith stratum, and the sums run over all strata.
The estimated population total (for three strata) is
Total estimated value = $N_1\bar{x}_1 + N_2\bar{x}_2 + N_3\bar{x}_3$,
where $\bar{x}_i$ is the sample mean of the ith stratum.
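The stratum sample sizes and the stratified total in 5.2 can be sketched in Python as follows. The three strata, their standard deviations and their sample means are invented for illustration, and rounding each stratum size up is an assumption of this sketch.

```python
import math

def stratum_sample_sizes(N_list, sd_list, z, A):
    """n_i = (N_i*SD_i) * sum(N_k*SD_k) / ((A/z)^2 + sum(N_k*SD_k^2)), rounded up."""
    weights = [N_i * sd_i for N_i, sd_i in zip(N_list, sd_list)]
    denom = (A / z) ** 2 + sum(N_i * sd_i ** 2 for N_i, sd_i in zip(N_list, sd_list))
    total_weight = sum(weights)
    return [math.ceil(w * total_weight / denom) for w in weights]

def stratified_total(N_list, mean_list):
    """Estimated population total: sum over strata of N_i times the stratum sample mean."""
    return sum(N_i * m for N_i, m in zip(N_list, mean_list))

# Hypothetical strata: many small items, fewer medium items, a handful of large items
N_list = [1000, 300, 50]
sd_list = [10.0, 50.0, 200.0]
sizes = stratum_sample_sizes(N_list, sd_list, z=1.96, A=10000.0)
total = stratified_total(N_list, [120.0, 800.0, 5000.0])
print(sizes, total)
```

The allocation gives a stratum more sample units when it is either large or highly variable, which is why the 50 large items receive as many units as the 1,000 small ones in this example.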
5.3 Unstratified Proportion of audit objections (errors): The projected number of audit objections in the
population is the sample proportion of errors multiplied by the number of items in the population. The optimum
sample size under SRS is
$$n = \frac{Z_r^2 \cdot P\,(1-P)}{A^2},$$
where $Z_r$ = confidence level coefficient, A = margin of
error and P = proportion of errors that is expected in the population.
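A short Python sketch of the calculation in 5.3, using assumed inputs (95% confidence, an expected error rate of 5% and a margin of error of 2 percentage points) and an invented audit outcome for the projection:

```python
import math

def proportion_sample_size(z, P, A):
    """n = z^2 * P * (1 - P) / A^2, rounded up to the next whole unit."""
    return math.ceil(z ** 2 * P * (1 - P) / A ** 2)

def projected_errors(errors_in_sample, n, N):
    """Sample proportion of errors multiplied by the population size."""
    return errors_in_sample / n * N

n_prop = proportion_sample_size(z=1.96, P=0.05, A=0.02)
# Hypothetical outcome: 23 objections found among the sampled items, population of 20,000
projection = projected_errors(23, n_prop, 20000)
print(n_prop, round(projection))
```

Note that P(1 − P) is largest at P = 0.5, so an auditor with no prior estimate of the error rate obtains the most conservative sample size by using P = 0.5.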
5.4 Stratified Proportion of audit objections (errors):
For three strata, the projected proportion of audit objections in the population is
$$\frac{N_1 p_1 + N_2 p_2 + N_3 p_3}{N},$$
where $p_i$ is the proportion of audit objections in the ith stratum and $N = N_1 + N_2 + N_3$.
The projected number of audit objections is the numerator, $N_1 p_1 + N_2 p_2 + N_3 p_3$.
5.5 Estimation with PPSWR:
The estimate of the population total of the character x is
$$\hat{X} = \frac{1}{n}\sum_{i=1}^{n}\frac{x_i}{p_i},$$
where $p_i$ is the probability of selecting the ith sample unit.
The estimate of the population mean is
$$\hat{\bar{X}} = \frac{\hat{X}}{N}.$$
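The PPSWR estimator in 5.5 amounts to averaging the ratios x_i / p_i over the draws. A minimal Python sketch with invented values and selection probabilities:

```python
def ppswr_total(x_values, p_values):
    """Estimated population total: average of x_i / p_i over the n draws."""
    n = len(x_values)
    return sum(x / p for x, p in zip(x_values, p_values)) / n

def ppswr_mean(x_values, p_values, N):
    """Estimated population mean: estimated total divided by the population size N."""
    return ppswr_total(x_values, p_values) / N

# Hypothetical sample of three draws with their selection probabilities
x_sample = [120.0, 300.0, 45.0]
p_sample = [0.10, 0.25, 0.05]
total_hat = ppswr_total(x_sample, p_sample)      # (1200 + 1200 + 900) / 3
mean_hat = ppswr_mean(x_sample, p_sample, N=20)
print(total_hat, mean_hat)
```

When the x-values are closely proportional to the size measure, the ratios x_i / p_i are nearly equal and the estimate has low variability, which is the rationale for PPS selection.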
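5.6 Estimation with two-stage sampling design:
Case-I: Samples are drawn with SRSWOR in both the stages:
Let $y_{ij}$ be the measure of the characteristic of interest for the jth second stage unit of the ith first stage
unit, where i = 1, 2, …, n and j = 1, 2, …, $m_i$. Here N is the number of first stage units in the population, and
$M_i$ and $m_i$ are the numbers of second stage units in the population and in the sample for the ith selected first stage unit.
$$\text{Population Estimate} = \hat{Y} = \frac{N}{n}\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i} y_{ij}$$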
Case-II: 1st stage samples are drawn with PPSWR and the 2nd stage samples are drawn with SRSWOR:
Let $y_{ij}$ be the measure of the characteristic of interest (e.g. completion of roads in km) for the
ith DPIU and jth Package, where i = 1, 2, …, n and j = 1, 2, …, $m_i$; $x_i$ is the size measure and X is the total of all
size measures.
$$\text{Population Estimate} = \hat{Y} = \frac{1}{n}\sum_{i=1}^{n}\frac{X}{x_i}\,\frac{M_i}{m_i}\sum_{j=1}^{m_i} y_{ij}$$
6. Concept of “Testing of Hypothesis” (Test of control / substantive testing in audit):
6.2 Some relevant terminologies:
6.2.1 Test of control sampling – Risk Matrix
Decision    Control operating effectively                      Control not operating effectively
Accept      Correct decision                                   Risk of assessing control risk too low (beta)
Reject      Risk of assessing control risk too high (alpha)    Correct decision
– Alpha risk: risk of incorrect rejection – relates to audit efficiency; auditee’s risk
– Beta risk: risk of incorrect acceptance – relates to audit effectiveness

6.2.2 Audit risk model
OAR = IR x CR x AP x TD
where OAR is the overall audit risk, IR the inherent risk, CR the control risk, AP the analytical procedures risk
and TD the test of details (beta) risk.
The auditor assesses IR, CR and AP and chooses a desired level of OAR. Given OAR, IR, CR and
AP, the auditor can use the audit risk model to quantify how much beta risk he/she is prepared to tolerate:
TD = OAR/(IR x CR x AP)
•Example:
OAR = 1%, CR = 35%, IR = 75%, AP = 80%
Beta risk = TD = 0.01/(0.75 x 0.35 x 0.80) ≈ 5% (one tailed)
6.2.3 Tolerable rate:
The maximum deviation rate that the auditor would be willing to accept. This
is used in compliance tests.
6.2.4 Materiality:
The value of error that an auditor is willing to accept and still conclude that the audit
objective is achieved. The smaller the materiality, the larger the sample size. This is used
in substantive testing.
6.3 Test of control / Compliance testing:

Assume that an auditor wants to test credit approval on 20,000 sales invoices processed during the year.
He or she needs a statistical sample that will give 90% confidence that not more than 5% of the sales invoices
were not approved. The auditor estimates from previous experience that about 1% of invoices are deviations (not
approved).
Expected deviation rate = 1%
Tolerable rate = 5%
Confidence level = 90%
For SRSWOR sampling the Hypergeometric distribution is appropriate, but statistical tables are not easily
constructed for this distribution. One can use Binomial tables as a close approximation for large samples. If the
expected rate of deviation is very low, one can even use the Poisson table for large samples.
From the Binomial table (in Annex-III) the required sample size is 77. If the number of
deviations in the 77 sampled items is 0 or 1, the auditor can conclude with 90% confidence that the deviation rate does
not exceed the tolerable rate of 5%; in other words, internal controls are reliable. Otherwise the auditor cannot conclude
with 90% confidence that the deviation rate is within 5%, and the internal control is treated as unreliable.
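The table lookup in this example can be cross-checked directly from the Binomial distribution. The Python sketch below searches for the smallest sample size at which observing no more than the allowed number of deviations would be sufficiently unlikely if the true rate were at the tolerable rate. Treating the allowed number of deviations as an input, and the function names, are assumptions of this sketch; it does not reproduce the Annex table.

```python
from math import comb

def prob_at_most(c, n, p):
    """Binomial probability of observing at most c deviations in n items at rate p."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(c + 1))

def attribute_sample_size(tolerable_rate, confidence, allowed_deviations):
    """Smallest n with P(at most allowed_deviations | tolerable_rate) <= 1 - confidence."""
    n = allowed_deviations + 1
    while prob_at_most(allowed_deviations, n, tolerable_rate) > 1 - confidence:
        n += 1
    return n

n_one = attribute_sample_size(0.05, 0.90, allowed_deviations=1)
n_zero = attribute_sample_size(0.05, 0.90, allowed_deviations=0)
print(n_one, n_zero)
```

With one deviation allowed the search returns 77, matching the figure quoted above; if no deviation at all were allowed, a smaller sample would suffice, at the cost of rejecting the control on a single deviation.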
6.4 Audit Hypothesis Model / substantive testing.

The audit hypothesis approach statistically discriminates between the hypothesis that the amount as
represented is correctly stated and the alternative hypothesis that the amount is materially misstated. This is also
known as substantive testing. An essential first step in planning a test of hypothesis/substantive test in
audit is to make a preliminary judgment about the amount that will be considered material to the account or
transactions being audited. This amount is called the tolerable misstatement. The hypothesis to be tested is then
that “the value of misstatement in the account balance is equal to the tolerable misstatement” as against the
alternative hypothesis that “the value of misstatement is greater than the tolerable misstatement”.
6.4.1 The Audit Hypothesis Model can be categorized into four separate phases.
–Internal Control Assessment
–Substantive test planning
–Substantive test execution
–Substantive test evaluation
6.4.2 Steps in Audit Hypothesis Model (let the hypothesis be that the book value is materially correct):
Classical approach
Step 1. The internal control assessment is done to assign a percentage to CR, which is subsequently used in the
beta risk equation.
Step 2. An appropriate variable sampling plan has to be selected based on the audit objective and population
characteristics.
Step 3. If SRSWOR is the sampling plan, then the sample size is
$$n_0 = \left(\frac{U_r \cdot SD \cdot N}{A}\right)^2.$$
If $n_0/N$ is high, the sample size can be further reduced as
$$n = \frac{n_0}{1 + n_0/N}.$$
Step 4. $U_r$ is determined based on the acceptable alpha risk.
Step 5. ‘SD’ may be estimated using a pilot sample of size 30.
Step 6. ‘N’ is the population size.
Step 7. ‘A’ must be calculated based on the desired or calculated beta risk as
$$A = TM \times \frac{U_r}{U_r + Z_\beta},$$
where A = precision, $U_r$ = alpha risk coefficient, $Z_\beta$ = beta risk coefficient ($U_r$ and $Z_\beta$ are obtained from the
Normal Curve Area Table in Annex-IV) and TM = tolerable misstatement.
Step 8. Select the sample by SRSWOR.
Step 9. Perform a test of the sample. The sample mean book value and the population mean book value should not
be substantially different. If they are, a new sample should be selected, discarding the first one, or the sample design
should be changed.
Step 10. Perform audit procedures on the sample items selected for substantive tests.
Step 11. Analyze misstatements noted in the sample to determine their cause and nature and whether a systematic
pattern exists. A systematic misstatement is a recurring misstatement that does not occur randomly.
Step 12. Calculate the SD of the sample observations.
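Step 13. Calculate the Achieved Precision:
$$A' = U_r \cdot \frac{SD}{\sqrt{n}} \cdot N \cdot \sqrt{1 - \frac{n}{N}}$$
Step 14. If A' is not equal to A, calculate the adjusted precision A'':
$$A'' = A' + TM\left(1 - \frac{A'}{A}\right)$$
Step 15. Calculate the mean of the audited values:
$$\bar{x} = \frac{\sum x}{n}$$
Step 16. Calculate the Estimated Audited Value (EAV):
$$\hat{X} = N\bar{x}$$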
Step 17. Calculate the Decision Interval:
Book Value (adjusted for any systematic (nonrandom) differences) ± A''
Step 18. Rule:
** If the EAV falls within this interval, conclude that the statistical evidence supports the book value.
** Otherwise, conclude that the statistical evidence does not support the material correctness of the
book value.
6.4.2.1 It may be noted that systematic misstatements (also called nonrandom misstatements) are excluded
from the statistical evaluation (please see Step 17).
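The arithmetic of Steps 3, 7, 13, 14 and 17-18 can be sketched in Python as follows. Several choices here are assumptions rather than part of the procedure above: the alpha coefficient is taken as two-sided and the beta coefficient as one-sided, both computed from the standard normal distribution instead of the Annex table; the finite population reduction in Step 3 is always applied; and every figure in the example is invented.

```python
import math
from statistics import NormalDist

def planned_precision(tm, alpha, beta):
    """Step 7: A = TM * U_r / (U_r + Z_beta). Returns (A, U_r)."""
    u = NormalDist().inv_cdf(1 - alpha / 2)   # assumed two-sided alpha coefficient
    z_beta = NormalDist().inv_cdf(1 - beta)   # assumed one-sided beta coefficient
    return tm * u / (u + z_beta), u

def sample_size(u, sd, N, A):
    """Step 3: n0 = (U_r*SD*N/A)^2, then n = n0 / (1 + n0/N), rounded up."""
    n0 = (u * sd * N / A) ** 2
    return math.ceil(n0 / (1 + n0 / N))

def achieved_precision(u, sd, n, N):
    """Step 13: A' = U_r * SD/sqrt(n) * N * sqrt(1 - n/N)."""
    return u * sd / math.sqrt(n) * N * math.sqrt(1 - n / N)

def adjusted_precision(a_achieved, a_planned, tm):
    """Step 14: A'' = A' + TM * (1 - A'/A)."""
    return a_achieved + tm * (1 - a_achieved / a_planned)

def supports_book_value(book_value, eav, a_adjusted):
    """Steps 17-18: does the EAV fall within book value +/- A''?"""
    return abs(eav - book_value) <= a_adjusted

# Hypothetical engagement: 4,000 items, book value 2,000,000, TM 100,000
A_plan, U_r = planned_precision(tm=100000.0, alpha=0.10, beta=0.05)
n_cls = sample_size(U_r, sd=120.0, N=4000, A=A_plan)          # pilot SD of 120
A_ach = achieved_precision(U_r, sd=110.0, n=n_cls, N=4000)    # SD of audited sample
A_adj = adjusted_precision(A_ach, A_plan, tm=100000.0)
decision = supports_book_value(2000000.0, eav=4000 * 495.0, a_adjusted=A_adj)
print(n_cls, round(A_ach), round(A_adj), decision)
```

In this example the sample SD turned out smaller than the pilot SD, so the achieved precision is tighter than planned and the adjustment in Step 14 widens the decision interval accordingly.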
6.4.3. Monetary Unit Sampling (MUS):
Sampling methods used by auditors have evolved over the years. The trend now is to use less rigorous
sampling techniques to reduce cost. However, it has been empirically demonstrated by experiments that
MUS is substantially more capable of detecting material error and can be used for both proportional test of
controls and substantive testing. The use of qualitative analysis that documents the nature and cause
of each misstatement found in a sample can mitigate some of the risk associated with sampling. The
use of a statistical approach such as PPS can further reduce this risk, and, at the same time, permit
the use of a smaller sample.
"The auditor has a responsibility to plan and perform the audit to obtain reasonable assurance about
whether the financial statements are free of material misstatement, whether caused by error or
fraud."
6.4.3.1 Caveats on MUS

MUS sampling may be especially useful in the audit of accounts receivable and inventory. However, it is
not appropriate for accounts receivable if there are a large number of unapplied credits or for inventory
where the auditor anticipates a significant number of audit differences, or where the detection of an
understated balance is an important consideration.
Moreover, MUS evaluation technique is so sensitive to any errors found in the sample that it tends to
overstate the allowance for sampling risk and thereby project a potential misstatement that could be two or
three times the actual misstatement. The reason PPS tends to exaggerate its projection of misstatement is
that it does not simply extrapolate the total error found in the sample. Instead, it looks at each erroneous
item individually, and projects a misstatement amount proportional to that item's percentage of error rather
than its amount of error. Thus, an item with a Rs.100 book value but an audit value of Rs.10 is considered
90% misstated and results in the same projected misstatement as a Rs.1,000 item 90% misstated. Though
this appears illogical, under the PPS selection method, the Rs.1,000 item has ten times more chance of
being selected for audit than does the Rs.100 item. So when errors are found among the relatively few
small items that have been selected, they are given proportionately more weight. But in so doing, PPS
subjects the auditor to a high risk of incorrect rejection, that is, the risk of rejecting an account balance that
is not materially misstated. To put it bluntly, PPS is prone to false alarms.
6.4.3.2 Finally:
MUS's job is only to warn us of a possible fire, not to assess the extent of the fire or estimate the damage.
This requires classical forms of statistical sampling and extracts the price of a much larger sample. The
auditor's response to the alarm is essentially the same regardless of the degree by which MUS's projected
potential misstatement exceeds tolerable misstatement (assuming the excess is more than trivial).
However, qualitative as well as quantitative analyses are equally important. The auditor should identify
and document the nature and cause of each misstatement found in the sample. It takes finding only one
misstatement of a particular type for the auditor to become aware that that kind of misstatement is
occurring, at which point the auditor can apply additional procedures to determine the extent of
misstatements of that type. One misstatement may indicate a breakdown in a control procedure that
suggests other errors of a similar nature, and might in fact have implications elsewhere in the audit. A
second misstated item might clue the auditor to an inappropriate accounting principle that probably affects
all similar transactions. By working with the client to identify and correct other similar errors, the potential
misstatement might be reduced to an acceptable level. If not, other kinds of tests that serve the same audit
objectives, such as appropriate analytical procedures, may provide the additional evidence needed to
support the corrected book value of the account. Of course, if the possibility of fraud is indicated, further
effort and more careful consideration are required.
6.4.3.3
The PPS sampling approach in auditing was developed to convert misstatement rates into money values.
Goodfellow, Loebbecke and Neter outline the method for PPS sampling evaluation of the maximum
misstatement rates found with the Poisson distribution. Poisson probabilities are obtained from an
idealized mathematical process generating occasional random events (in audit the misstatement rate is small,
less than 5%).
Let BV = Book Value; TM = tolerable misstatement; SR = Sampling Risk; RFx = the reliability factor
corresponding to X anticipated misstatements in the population; N = Population Size.
Step 1: Sample size n = (RF0 * BV)/TM
Step 2: Draw a PPS-systematic sample of size n from the population.
Step 3: Audit the n physical units.
Step 4: Evaluate the sample and determine the tainting t of each misstated unit,
where t = (amount of misstatement)/(reported book value of the unit).
Arrange the taintings from highest to lowest as t1, t2, ….
The maximum possible value of misstatement (MVM) with confidence (1 − SR) is
MVM = (BV * RF0/n) * 1 + (BV * (RF1 − RF0)/n) * t1 + (BV * (RF2 − RF1)/n) * t2 + …
Step 5: Make a decision about the acceptability of the reported book value by comparing MVM with TM.
[RF values may be obtained from the Poisson table as in Annex-V]
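Steps 1, 4 and 5 can be sketched in Python as below. Instead of reading reliability factors from the Annex table, the sketch computes RFx as the Poisson mean at which the probability of observing at most x misstatements equals the sampling risk; that this matches the table is an assumption, as are the book value, tolerable misstatement and taintings used.

```python
import math

def poisson_cdf(x, lam):
    """P(at most x events) for a Poisson distribution with mean lam."""
    term = total = math.exp(-lam)
    for k in range(1, x + 1):
        term *= lam / k
        total += term
    return total

def reliability_factor(x, sampling_risk):
    """RF_x: Poisson mean with P(at most x events) = sampling_risk, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if poisson_cdf(x, mid) > sampling_risk:
            lo = mid        # probability still too high, so the mean must be larger
        else:
            hi = mid
    return (lo + hi) / 2

def mus_sample_size(bv, tm, sampling_risk):
    """Step 1: n = RF0 * BV / TM, rounded up."""
    return math.ceil(reliability_factor(0, sampling_risk) * bv / tm)

def max_misstatement(bv, n, taintings, sampling_risk):
    """Step 4: MVM = BV*RF0/n + sum over ranked taintings of BV*(RF_k - RF_{k-1})/n * t_k."""
    rf = [reliability_factor(x, sampling_risk) for x in range(len(taintings) + 1)]
    mvm = bv * rf[0] / n
    for k, t in enumerate(sorted(taintings, reverse=True), start=1):
        mvm += bv * (rf[k] - rf[k - 1]) / n * t
    return mvm

# Hypothetical account: BV 1,000,000, TM 50,000, sampling risk 5%
n_mus = mus_sample_size(1000000.0, 50000.0, 0.05)
mvm_clean = max_misstatement(1000000.0, n_mus, [], 0.05)
mvm_errors = max_misstatement(1000000.0, n_mus, [0.2, 0.5], 0.05)
print(n_mus, round(mvm_clean), round(mvm_errors))
```

With no misstatements found, MVM stays just under the tolerable misstatement and the book value is acceptable under Step 5; two modest taintings are enough to push MVM well above it.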
6.5 The statistical sampling methods described above may also be grouped into three broad categories: Attribute,
Variable and Probability Proportional to Size (PPS) sampling. Attribute sampling is used primarily to estimate the
number of incidences or in tests of controls. In contrast, variable sampling and PPS sampling are most frequently
used to estimate a population average or total or to test the monetary value of account balances.
6.5.1 There are some other types of attribute sampling that are being used in audit:
6.5.1.1 Discovery Sampling
Discovery sampling is a sampling plan which selects a sample of a given size, accepts the population if the
sample is error free, and rejects the population if the sample contains at least one error. With discovery sampling the
auditor may not be interested in determining how many errors there are in the population. Where there is a
possibility of avoidance of the internal control system, it may be sufficient to disclose one example to
precipitate further action or investigation.
6.5.2 Stop or Go Sampling (also called Sequential Sampling)
• Involves sampling a universe in increments and examining each incremental sample before
deciding when to stop.
• Is appropriate for preliminary sampling and survey audit testing.
• Allows auditors to determine from the smallest possible sample size if an error rate exceeds a
predetermined level.
• Provides assurance, within a fixed degree of confidence, that the error rate in a population is less
than a predetermined acceptable error rate.
• Does not provide an estimate of the actual error rate; however, it can readily be converted into
attribute sampling, which can be used to estimate the actual error rate.
8. Glossary of statistical terminologies:
Alpha Risk: Risk of incorrect rejection.
Beta Risk: Risk of incorrect acceptance.
Bias: Difference between the true value and the expected value of the estimate.
Cluster: Partitioning of the population into sub-populations, called clusters, in such a way that the variation within each cluster is high. It is a convenient but less efficient sampling design, often known as area sampling.
Coefficient of variation (CV): Ratio of S.D. to mean. It is unit free and generally expressed as a percentage. This measure is widely used to measure the reliability of estimates in survey sampling. Also see Standard Deviation and Mean.
Confidence level: The certainty with which the estimate lies within the margin of error.
Estimate: Value projected to the population from the sample observations.
Estimation sampling: Use of sample observations to estimate some characteristics of interest in the population.
Expected rate of occurrence of error: The rate of errors (audit objections) expected in the population.
Extrapolation: Projection to the population from the sample.
Heterogeneity: Variation in the population is high. It is the opposite of homogeneity.
Homogeneity: Variation in the population is low. It is generally measured by the standard deviation (S.D.); a smaller S.D. indicates more homogeneity.
Materiality: The value of error that an auditor is willing to accept and still conclude that the audit objective is achieved. The smaller the materiality, the larger the sample size.
Margin of error: A measure of the difference between the estimate from the sample and the population value.
Mean: Average of observations. Symbolically, Mean = $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$.
Multi-Stage Sampling: Statistical sampling at different levels that is capable of generating estimates at various levels. Mostly used in large-scale sample surveys.
Monetary Unit Sampling (MUS): MUS gives transactions with larger recorded amounts proportionally more opportunity to be selected than units with smaller recorded amounts.
Non-random error: Errors that are systematic in nature.
Non-sampling error: Error generated by failure to measure the true characteristic(s), which may be due to (i) non-response, (ii) measurement error, (iii) fatigue etc. It can be controlled by better training and management.
Non-Statistical Sampling: Selection of units based on the judgment of an individual, where the probability of selection of any unit is not known. Results of judgmental sampling cannot be used to draw statistically valid inferences about the population.
Population: All the elements (units/transactions) under consideration; in other words, the entire set of transactions/entities about which the auditor wants to draw conclusions.
Population Size: Number of elements in the population.
Probability: The branch of mathematics that measures the likelihood that an event will occur. Probabilities are expressed as numbers between 0 and 1. The probability of an impossible event is 0, while an event that is certain to occur has a probability of 1.
Probability Proportional to Size (PPS) Sampling: Selection of sample units with unequal probability; the selection probability is higher where the size measure is high.
Reliability: The probability that the value of the feature of interest in the sample is representative of the entire population, i.e. within the desired margin of error.
Sampling Design: An organized method of sample selection together with plans for analyzing and interpreting the results.
Sampling error: Error generated by failure to select a representative sample. It is measurable under statistical sampling.
Sampling Frame: List of all elements in the population. Often the list may include some information other than the identity of the element.
Simple Random Sampling (SRS): Selection of sample units with equal probability.
Standard Deviation (S.D.): Positive square root of the variance.
Statistical assurance: The measure of reliance an auditor places on inferences drawn using the sample. It is commonly expressed as a probability statement with a margin of error and confidence level.
Statistical Sampling: Any sample for which the selection of transactions or units to be included is independent of the sampler and to which probability theory can be applied.
Stratification: Partitioning of the population into sub-populations, called strata, in such a way that the variation within each stratum is low.
Testing of hypothesis: A method of testing whether a hypothesis can be rejected as statistically incorrect.
Variance: The average of the squared deviations of the observations from the mean. Symbolically, Variance = $\frac{1}{N}\sum_{i=1}^{N}(X_i - \bar{X})^2 = \frac{\sum_{i=1}^{N} X_i^2 - N\bar{X}^2}{N}$.
8. Suggested reading:
1) Audit Sampling: An Introduction (fifth edition) - D. M. Guy, D. R. Carmichael and R. Whittington
2) Statistical Sampling and Risk Analysis in Auditing - Peter Jones
3) Theory and Methods of Survey Sampling - Parimal Mukhopadhyay
4) Sampling Theory & Methods - M. N. Murthy
5) Survey Sampling - Kish
6) Sampling Techniques - W. G. Cochran
7) http://saiindia.gov.in/cag/sites/default/files/Rupe_Trail_First_Edition_0.pdf
Annex-I
Methodologies of selection of samples for SRSWOR and PPSWR
(Using Random Number Table)
1. Simple Random Sampling Without Replacement (SRSWOR)
Let there be N auditable units in a stratum, from which n units are to be selected.
Step 1: Prepare or obtain the list of units and assign a serial number to each unit.
Step 2: Open a page of the random number table at random.
Step 3: Select a random number (let it be r) with as many digits as N from the page, starting from the top left
corner of the table and proceeding sequentially from left to right.
Step 4: If r is between 1 and N, the unit corresponding to the r-th serial number is selected. If not,
take the next random number in the sequence as per Step 3.
Repeat Step 3 and Step 4 until n distinct units are selected (a unit drawn more than once is ignored).
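The SRSWOR steps above can be sketched in Python as follows. This is only an illustration: the function name, the choice of N = 50 and n = 3, and the rule of reading the leading digits of each five-digit table entry are assumptions, not part of the prescribed procedure. The entries are the first row of the random number table in Annex-VI.

```python
def srswor_from_table(table_entries, N, n):
    """Pick n distinct serial numbers in 1..N from a stream of table entries."""
    width = len(str(N))  # assumption: read as many leading digits as N has
    selected = []
    for entry in table_entries:
        r = int(entry[:width])
        # Step 4: accept r only if it is a valid, not yet selected, serial number
        if 1 <= r <= N and r not in selected:
            selected.append(r)
        if len(selected) == n:
            return selected
    raise ValueError("table entries exhausted before n units were selected")


# First row of the random number table (kept as strings to preserve leading zeros)
row = ["10097", "32533", "76520", "13586", "34673",
       "54876", "80959", "09117", "39292", "74945"]
sample = srswor_from_table(row, N=50, n=3)
print(sample)
```

With these inputs the two-digit readings are 10, 32, 76 and 13; 76 is rejected because it exceeds N.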
2. Linear Systematic Sampling
Step 1: Calculate the sampling interval I = [N/n].
Step 2: Open a page of the random number table at random.
Step 3: Select a random number R between 1 and I from the page, starting from the top left corner of the
table and proceeding sequentially from left to right.
Step 4: The selected units are R, R+I, R+2I, R+3I, ... until n units are obtained or the sampling
frame is exhausted.
Please note that this method may not yield exactly n units.
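A minimal sketch of linear systematic selection is given below. The function name and the example figures are hypothetical. Treating [N/n] as the integer part of N/n is an assumption; the interval can also be passed explicitly, which shows how a different rounding of N/n can leave fewer than n units.

```python
def linear_systematic(N, n, R, interval=None):
    """Return serial numbers R, R+I, R+2I, ... up to n units or the end of the frame."""
    I = interval if interval is not None else N // n  # assumption: [N/n] = integer part
    if not 1 <= R <= I:
        raise ValueError("random start R must lie between 1 and I")
    selected = []
    unit = R
    while unit <= N and len(selected) < n:
        selected.append(unit)
        unit += I
    return selected


full = linear_systematic(N=17, n=5, R=2)               # I = 3
short = linear_systematic(N=17, n=5, R=3, interval=4)  # N/n = 3.4 rounded up to 4
print(full, short)
```

In the second call the frame is exhausted after four units, so the sample falls short of n.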
3. Circular Systematic Sampling
Step 1: Calculate the sampling interval I = [N/n].
Step 2: Open a page of the random number table at random.
Step 3: Select a random number R between 1 and N from the page, starting from the top left corner of the
table and proceeding sequentially from left to right.
Step 4: The selected units are R, R+I, R+2I, R+3I, ... until n units are obtained.
Step 5: If at any stage R+kI > N, subtract N, so that R+kI-N is the next
unit, and then proceed further, i.e. in a circular manner.
Please note that this method always yields exactly n units.
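The circular variant can be sketched with modular arithmetic. The function name and the figures N = 17, n = 5, R = 15 are illustrative assumptions, and [N/n] is again taken as the integer part of N/n.

```python
def circular_systematic(N, n, R):
    """Return n serial numbers R, R+I, R+2I, ... wrapping past N back to 1."""
    I = N // n  # assumption: [N/n] = integer part
    if not 1 <= R <= N:
        raise ValueError("random start R must lie between 1 and N")
    # (R - 1 + k*I) % N + 1 maps any value above N back into 1..N
    return [(R - 1 + k * I) % N + 1 for k in range(n)]


circular = circular_systematic(N=17, n=5, R=15)
print(circular)
```

Here 15 + 3 = 18 exceeds N = 17, so the second unit is 18 - 17 = 1, and the selection continues from there.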
4. Probability Proportional to Size With Replacement (PPSWR)
4.1 Cumulative total method:
A table of cumulative totals of the sizes of the units is made. Let T_i = x_1 + x_2 + x_3 + ... + x_i (with T_0 = 0), where x_i is the size measure of
the i-th unit. A random number R is drawn between 1 and T_N (= total size). Unit i is selected if T_{i-1} < R ≤ T_i. The
process is repeated n (sample size) times; because sampling is with replacement, the same unit may be drawn more than once.
4.2 Consider the following example: selection of districts by PPSWR, with expenditure under the scheme
under review as the size measure. Let the sample size be three.
(Expenditure figures are all fictitious)
List of districts in Punjab along with their expenditure
Sr.No. Name of the District Expenditure under the scheme (00 ‘000) Cumulative Total
1 Amritsar 368 368
2 Bathinda 1095 1463
3 Faridkot 1009 2472
4 Fatehgarh Sahib 1536 4008
5 Firozpur 3419 7427
6 Gurdaspur 534 7961
7 Hoshiarpur 621 8582
8 Jalandhar 534 9116
9 Kapurthala 323 9439
10 Ludhiana 223 9662
11 Mansa 278 9940
12 Moga 660 10600
13 Muktsar 1474 12074
14 Nawanshahr 1613 13687
15 Patiala 1038 14725
16 Rupnagar 527 15252
17 Sangrur 2131 17383
Total 17383
Select 5-digit random numbers between 00001 and 17383 (= total expenditure). The random number selection procedure is the
same as indicated above.
Suppose page 1 of the Random Number Table (Annex-VI) is selected.
Table for sample selection
Random Number: 10097. Decision: Selected. District selected: Moga.
Reason: 10097 is greater than the cumulative total against Sl. No. 11 (9940) but less than or equal to 10600.
Random Number: 32533. Decision: No selection. District selected: none.
Reason: the number is greater than 17383.
Random Number: 13586. Decision: Selected. District selected: Nawanshahr.
Reason: 13586 is greater than the cumulative total against Sl. No. 13 (12074) but less than or equal to 13687.
Random Number: 09117. Decision: Selected. District selected: Kapurthala.
Reason: 9117 is greater than the cumulative total against Sl. No. 8 (9116) but less than or equal to 9439.
Hence the selected districts are Moga, Nawanshahr and Kapurthala, with selection probabilities
(660/17383), (1613/17383) and (323/17383) respectively.
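The cumulative total method can be checked with a short Python sketch using the fictitious district expenditures from the example. The function name is hypothetical; the random numbers are the first eight entries of the first row of the table in Annex-VI, and every entry above the total of 17383 is simply rejected.

```python
from bisect import bisect_left
from itertools import accumulate

districts = [
    ("Amritsar", 368), ("Bathinda", 1095), ("Faridkot", 1009),
    ("Fatehgarh Sahib", 1536), ("Firozpur", 3419), ("Gurdaspur", 534),
    ("Hoshiarpur", 621), ("Jalandhar", 534), ("Kapurthala", 323),
    ("Ludhiana", 223), ("Mansa", 278), ("Moga", 660), ("Muktsar", 1474),
    ("Nawanshahr", 1613), ("Patiala", 1038), ("Rupnagar", 527),
    ("Sangrur", 2131),
]


def ppswr(units, random_numbers, n):
    """Select n units with probability proportional to size, with replacement."""
    names = [name for name, _ in units]
    cumulative = list(accumulate(size for _, size in units))  # T_1, ..., T_N
    total = cumulative[-1]
    chosen = []
    for r in random_numbers:
        if 1 <= r <= total:
            # bisect_left finds the first i with T_i >= r, i.e. T_{i-1} < r <= T_i
            chosen.append(names[bisect_left(cumulative, r)])
            if len(chosen) == n:
                break
    return chosen


row = [10097, 32533, 76520, 13586, 34673, 54876, 80959, 9117]
selected = ppswr(districts, row, n=3)
print(selected)
```

Because the draw is with replacement, the function does not discard a district that is drawn twice.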
Annex-II
Other statistical sampling techniques
Two-stage sequential sampling
Step 1: A Simple Random Sample Without Replacement (SRSWOR) of size n = [Reliability Factor
(zero deviations)] / Margin of error is first selected for audit. Occurrences of deviations (audit objections) are
noted.
Step 2: The final sample size, say m, is calculated using the Poisson distribution table (Annex-V) on the basis of the deviations
observed in Step 1, as m = [Reliability Factor (no. of deviations in Step 1)] / Margin of error.
The remaining (m – n) cases are then selected by SRSWOR from the remaining cases in the population for audit.
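The arithmetic of the two stages can be illustrated as below. The reliability factors are the 5% column of the Poisson table in Annex-V; the 2% margin of error, the two observed deviations, the function name and the rounding up to a whole number of cases are all assumptions made for the sake of the example.

```python
import math
from decimal import Decimal

# 5% column of the Poisson table (Annex-V), keyed by number of deviations
RELIABILITY_FACTOR_5PCT = {0: "3.0", 1: "4.8", 2: "6.3", 3: "7.8", 4: "9.2", 5: "10.6"}


def sample_size(deviations, margin_of_error):
    """Reliability factor divided by margin of error, rounded up (assumed rounding)."""
    factor = Decimal(RELIABILITY_FACTOR_5PCT[deviations])
    return math.ceil(factor / Decimal(margin_of_error))


n = sample_size(0, "0.02")          # Step 1: initial sample, zero deviations assumed
m = sample_size(2, "0.02")          # Step 2: two deviations found in the first sample
additional_cases = max(m - n, 0)    # further cases to draw by SRSWOR
print(n, m, additional_cases)
```

Decimal arithmetic is used only to avoid floating-point noise when the quotient is a whole number.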
RAO-HARTLEY-COCHRAN METHOD (RHC, 1962) - a PPSWOR scheme for a sample of size n
The entire population is first partitioned at random into n groups, and from each group one unit is
selected independently by the PPS method. It is an improved method with certain advantages. Please note that the
estimation formulae are different.
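The selection step of the RHC scheme can be sketched as follows. This covers selection only, not the estimation formulae. The function name, the near-equal group sizes and the example size measures are assumptions for illustration.

```python
import random


def rhc_select(sizes, n, seed=None):
    """Split the units into n random groups, then draw one unit per group by PPS."""
    rng = random.Random(seed)
    units = list(sizes)
    rng.shuffle(units)                          # random partition of the population
    groups = [units[g::n] for g in range(n)]    # n groups of near-equal size (assumption)
    picks = []
    for group in groups:
        weights = [sizes[u] for u in group]     # PPS within the group
        picks.append(rng.choices(group, weights=weights, k=1)[0])
    return picks


# Hypothetical size measures for eight auditable units
sizes = {"U1": 368, "U2": 1095, "U3": 1009, "U4": 1536,
         "U5": 3419, "U6": 534, "U7": 621, "U8": 534}
rhc_sample = rhc_select(sizes, n=3, seed=1)
print(rhc_sample)
```

Since each unit belongs to exactly one group, the n selected units are always distinct, which is what makes the scheme a without-replacement design.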
Annex-III
Determination of Sample Size: Reliability 90%
Binomial Table (Risk of Assessing Control Risk Too Low: 10%)
(Allowable numbers of deviations are in parentheses)
Rows: Expected Population Deviation Rate. Columns: Tolerable Rate.
Rate 2% 3% 4% 5% 6% 7% 8% 9% 10% 15% 20%
0.00% 114(0) 76(0) 57(0) 45(0) 38(0) 32(0) 28(0) 25(0) 22(0) 15(0) 11(0)
0.25 194(1) 129(1) 96(1) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
0.50 194(1) 129(1) 96(1) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
0.75 265(2) 129(1) 96(1) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
1.00 * 176(2) 96(1) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
1.25 * 221(3) 132(2) 77(1) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
1.50 * * 132(2) 105(2) 64(1) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
1.75 * * 166(3) 105(2) 88(2) 55(1) 48(1) 42(1) 38(1) 25(1) 18(1)
2.00 * * 198(4) 132(3) 88(2) 75(2) 48(1) 42(1) 38(1) 25(1) 18(1)
2.25 * * * 132(3) 88(2) 75(2) 65(2) 42(1) 38(1) 25(1) 18(1)
2.50 * * * 158(4) 110(3) 75(2) 65(2) 58(2) 38(1) 25(1) 18(1)
2.75 * * * 209(6) 132(4) 94(3) 65(2) 58(2) 52(2) 25(1) 18(1)
3.00 * * * * 132(4) 94(3) 65(2) 58(2) 52(2) 25(1) 18(1)
3.25 * * * * 153(5) 113(4) 82(3) 58(2) 52(2) 25(1) 18(1)
3.50 * * * * 194(7) 113(4) 82(3) 73(3) 52(2) 25(1) 18(1)
3.75 * * * * * 131(5) 98(4) 73(3) 52(2) 25(1) 18(1)
4.00 * * * * * 149(6) 98(4) 73(3) 65(3) 25(1) 18(1)
5.00 * * * * * * 160(8) 115(6) 78(4) 34(2) 18(1)
6.00 * * * * * * * 182(11) 116(7) 43(3) 25(2)
7.00 * * * * * * * * 199(14) 52(4) 25(2)
* Sample size is too large to be cost effective.
Source: AICPA, Auditing Guide, Audit Sampling (New York, 2001)
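Several entries of the table above can be reproduced from the binomial distribution. The sketch below assumes that each entry is the smallest sample size for which the probability of observing no more than the allowable number of deviations, when the true rate equals the tolerable rate, does not exceed the 10% risk. The function names are hypothetical, and how the source table links the expected rate to the allowable number of deviations is not modelled.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def min_sample_size(tolerable_rate, allowable_deviations, risk=0.10):
    """Smallest n with P(X <= allowable deviations | tolerable rate) <= risk."""
    n = allowable_deviations + 1
    while binomial_cdf(allowable_deviations, n, tolerable_rate) > risk:
        n += 1
    return n


sizes_zero_dev = [min_sample_size(rate, 0) for rate in (0.02, 0.03, 0.05, 0.10)]
size_one_dev = min_sample_size(0.05, 1)
print(sizes_zero_dev, size_one_dev)
```

These values agree with the 114(0), 76(0), 45(0), 22(0) and 77(1) entries of the table.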
Annex-IV
Normal Curve Area Table
Standard
Deviation .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .0000 .0040 .0080 .0120 .0159 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0753
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2257 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2518 .2549
0.7 .2580 .2612 .2642 .2673 .2704 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2995 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389
1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4083 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319
1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4430 .4441
1.6 .4452 .4463 .4474 .4485 .4495 .4505 .4515 .4525 .4535 .4545
1.7 .4554 .4564 .4573 .4582 .4591 .4599 .4608 .4616 .4625 .4633
1.8 .4641 .4649 .4656 .4664 .4671 .4678 .4686 .4693 .4699 .4706
1.9 .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4756 .4762 .4767
2.0 .4773 .4778 .4783 .4788 .4793 .4798 .4803 .4808 .4812 .4817
2.1 .4821 .4826 .4830 .4834 .4838 .4842 .4846 .4850 .4854 .4857
2.2 .4861 .4865 .4868 .4871 .4875 .4878 .4881 .4884 .4887 .4890
2.3 .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.4 .4918 .4920 .4922 .4925 .4927 .4929 .4931 .4932 .4934 .4936
2.5 .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
2.6 .4953 .4955 .4956 .4957 .4959 .4960 .4961 .4962 .4963 .4964
2.7 .4965 .4966 .4967 .4968 .4969 .4970 .4971 .4972 .4973 .4974
2.8 .4974 .4975 .4976 .4977 .4977 .4978 .4979 .4980 .4980 .4981
2.9 .4981 .4982 .4983 .4984 .4984 .4984 .4985 .4985 .4986 .4986
3.0 .4986 .4987 .4987 .4988 .4988 .4988 .4989 .4989 .4989 .4990
3.1 .4990 .4991 .4991 .4991 .4992 .4992 .4992 .4992 .4993 .4993
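Entries of the normal curve area table can be recomputed with the error function from the Python standard library. The sketch assumes, as the tabulated values indicate, that each entry is the area under the standard normal curve between the mean and z standard deviations; the function name is hypothetical.

```python
import math


def area_mean_to_z(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * math.erf(z / math.sqrt(2))


checks = {z: round(area_mean_to_z(z), 4) for z in (1.0, 1.645, 1.96, 1.97, 2.55)}
print(checks)
```

An area of 0.45 on each side of the mean (z of about 1.645) corresponds to 90% of the distribution, and 0.475 on each side (z = 1.96) to 95%.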
Annex-V: Poisson Table
(Reliability factors by number of deviations and sampling error)
Number of Deviations 10% 5% 2.5%
0 2.4 3.0 3.7
1 3.9 4.8 5.6
2 5.4 6.3 7.3
3 6.7 7.8 8.8
4 8.0 9.2 10.3
5 9.3 10.6 11.7
6 10.6 11.9 13.1
7 11.8 13.2 14.5
8 13.0 14.5 15.8
9 14.3 16.0 17.1
10 15.5 17.0 18.4
11 16.7 18.3 19.7
12 18.0 19.5 21.0
13 19.0 21.0 22.3
14 20.2 22.0 23.5
15 21.4 23.4 24.7
16 22.6 24.3 26.0
17 23.8 26.0 27.3
18 25.0 27.0 28.5
19 26.0 28.0 29.6
20 27.1 29.0 31.0
21 28.3 30.3 32.0
22 29.3 31.5 33.3
23 30.5 32.6 34.6
24 31.4 33.8 35.7
25 32.7 35.0 37.0
26 34.0 36.1 38.1
27 35.0 37.3 39.4
28 36.1 38.5 40.5
29 37.2 39.6 41.7
30 38.4 40.7 42.9
31 39.1 42.0 44.0
32 40.3 43.0 45.1
33 41.5 44.2 46.3
34 42.7 45.3 47.5
35 43.8 46.4 48.8
36 45.0 47.6 49.9
37 46.1 48.7 51.0
38 47.2 49.8 52.1
39 48.3 51.0 53.4
Source: Adapted from a table developed by Marvin Tummins and Robert H. Strawser, “A Confidence Limits Tables for Attribute Sampling,” Accounting Review (October 1976), pp. 907-912.
This table is particularly applicable when the population size is greater than 1000
and the estimated population deviation rate is less than 5 percent.
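The factors in the table appear to be Poisson upper limits rounded up to one decimal place: the value λ for which the probability of observing k or fewer deviations equals the stated sampling error. That reading is an assumption, and the sketch below (with hypothetical function names) simply solves for λ by bisection so the first few entries can be compared.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    term = math.exp(-lam)
    total = term
    for j in range(1, k + 1):
        term *= lam / j
        total += term
    return total


def reliability_factor(k, risk):
    """Solve P(X <= k | lam) = risk for lam by bisection."""
    lo, hi = 0.0, 200.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if poisson_cdf(k, mid) > risk:
            lo = mid    # probability still too high, so lam must be larger
        else:
            hi = mid
    return hi


factor_0_5pct = reliability_factor(0, 0.05)    # compare with 3.0 in the table
factor_1_5pct = reliability_factor(1, 0.05)    # compare with 4.8
factor_0_10pct = reliability_factor(0, 0.10)   # compare with 2.4
print(round(factor_0_5pct, 4), round(factor_1_5pct, 4), round(factor_0_10pct, 4))
```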

Annex-VI
Random Numbers
00000 10097 32533 76520 13586 34673 54876 80959 09117 39292 74945
00001 37542 04805 64894 74296 24805 24037 20636 10402 00822 91665
00002 08422 68953 19645 09303 23209 02560 15953 34764 35080 33606
00003 99019 02529 09376 70715 38311 31165 88676 74397 04436 27659
00004 12807 99970 80157 36147 64032 36653 98951 16877 12171 76833
00005 66065 74717 34072 76850 36697 36170 65813 39885 11199 29170
00006 31060 10805 45571 82406 35303 42614 86799 07439 23403 09732
00007 85269 77602 02051 65692 68665 74818 73053 85247 18623 88579
00008 63573 32135 05325 47048 90553 57548 28468 28709 83491 25624
00009 73796 45753 03529 64778 35808 34282 60935 20344 35273 88435
00010 98520 17767 14905 68607 22109 40558 60970 93433 50500 73998
00011 11805 05431 39808 27732 50725 68248 29405 24201 52775 67851
00012 83452 99634 06288 98083 13746 70078 18475 40610 68711 77817
00013 88685 40200 86507 58401 36766 67951 90364 76493 29609 11062
00014 99594 67348 87517 64969 91826 08928 93785 61368 23478 34113
00015 65481 17674 17468 50950 58047 76974 73039 57186 40218 16544
00016 80124 35635 17727 08015 45318 22374 21115 78253 14385 53763
00017 74350 99817 77402 77214 43236 00210 45521 64237 96286 02655
00018 69916 26803 66252 29148 36936 87203 76621 13990 94400 56418
00019 09893 20505 14225 68514 46427 56788 96297 78822 54382 14598
00020 91499 14523 68479 27686 46162 83554 94750 89923 37089 20048
00021 80336 94598 26940 36858 70297 34135 53140 33340 42050 82341
00022 44104 81949 85157 47954 32979 26575 57600 40881 22222 06413
00023 12550 73742 11100 02040 12860 74697 96644 89439 28707 25815
00024 63606 49329 16505 34484 40219 52563 43651 77082 07207 31790
00025 61196 90446 26457 47774 51924 33729 65394 59593 42582 60527
00026 15474 45266 95270 79953 59367 83848 82396 10118 33211 59466
00027 94557 28573 67897 54387 54622 44431 91190 42592 92927 45973
00028 42481 16213 97344 08721 16868 48767 03071 12059 25701 46670
00029 23523 78317 73208 89837 68935 91416 26252 29663 05522 82562
00030 04493 52494 75246 33824 45862 51025 61962 79335 65337 12472
00031 00549 97654 64051 88159 96119 63896 54692 82391 23287 29529
00032 35963 15307 26898 09354 33351 35462 77974 50024 90103 39333
00033 59808 08391 45427 26842 83609 49700 13021 24892 78565 20106
00034 46058 85236 01390 92286 77281 44077 93910 83647 70617 42941
00035 32179 00597 87379 25241 05567 07007 86743 17157 85394 11838
00036 69234 61406 20117 45204 15956 60000 18743 92423 97118 96338
00037 19565 41430 01758 75379 40419 21585 66674 36806 84962 85207
00038 45155 14938 19476 07246 43667 94543 59047 90033 20826 69541
00039 94864 31994 36168 10851 34888 81553 01540 35456 05014 51176
00040 98086 24826 45240 28404 44999 08896 39094 73407 35441 31880
00041 33185 16232 41941 50949 89435 48581 88695 41994 37548 73043
00042 80951 00406 96382 70774 20151 23387 25016 25298 94624 61171
00043 79752 49140 71961 28296 69861 02591 74852 20539 00387 59579
00044 18633 32537 98145 06571 31010 24674 05455 61427 77938 91936
00045 74029 43902 77557 32270 97790 17119 52527 58021 80814 51748
00046 54178 45611 80993 37143 05335 12969 56127 19255 36040 90324
00047 11664 49883 52079 84827 59381 71539 09973 33440 88461 23356
00048 48324 77928 31249 64710 02295 36870 32307 57546 15020 09994
00049 69074 94138 87637 91976 35584 04401 10518 21615 01848 76938
00050 09188 20097 32825 39527 04220 86304 83389 87374 64278 58044
00051 90045 85497 51981 50654 94938 81997 91870 76150 68476 64659
00052 73189 50207 47677 26269 62290 64464 27124 67018 41361 82760
00053 75768 76490 20971 87749 90429 12272 95375 05871 93823 43178
00054 54016 44056 66281 31003 00682 27398 20714 53295 07706 17813
00055 08358 69910 78542 42785 13661 58873 04618 97553 31223 08420
00056 28306 03264 81333 10591 40510 07893 32604 60475 94119 01840
00057 53840 86233 81594 13628 51215 90290 28466 68795 77762 20791
00058 91757 53741 61613 62269 50263 90212 55781 76514 83483 47055
00059 89415 92694 00397 58391 12607 17646 48949 72306 94541 37408
00060 77513 03820 86864 29901 68414 82774 51908 13980 72893 55507
00061 19502 37174 69979 20288 55210 29773 74287 75251 65344 67415
00062 21818 59313 93278 81757 05686 73156 07082 85046 31853 38452
00063 51474 66499 68107 23621 94049 91345 42836 09191 08007 45449
00064 99559 68331 62535 24170 69777 12830 74819 78142 43860 72834
00065 33713 48007 93584 72869 51926 64721 58303 29822 93174 93972
00066 85274 86893 11303 22970 28834 34137 73515 90400 71148 43643
00067 84133 89640 44035 52166 73852 70091 61222 60561 62327 18423
00068 56732 16234 17395 96131 10123 91622 85496 57560 81604 18880
00069 65138 56806 87648 85261 34313 65861 45875 21069 85644 47277
00070 38001 02176 81719 11711 71602 92937 74219 64049 65584 49698
00071 37402 96397 01304 77586 56271 10086 47324 62605 40030 37438
00072 97125 40348 87083 31417 21815 39250 75237 62047 15501 29578
00073 21826 41134 47143 34072 64638 85902 49139 06441 03856 54552
00074 73135 42742 95719 09035 85794 74296 08789 88156 64691 19202
00075 07638 77929 03061 18072 96207 44156 23821 99538 04713 66994
00076 60528 83441 07954 19814 59175 20695 05533 52139 61212 06455
00077 83596 35655 06958 92983 05128 09719 77433 53783 92301 50498
00078 10850 62746 99599 10507 13499 06319 53075 71839 06410 19362
00079 39820 98952 43622 63147 64421 80814 43800 09351 31024 73167
00080 59580 06478 75569 78800 88835 54486 23768 06156 04111 08408
00081 38508 07341 23793 48763 90822 97022 17719 04207 95954 49953
00082 30692 70668 94688 16127 56196 80091 82067 63400 05462 69200
00083 65443 95659 18288 27437 49632 24041 08337 65676 96299 90836
00084 27267 50264 13192 72294 07477 44606 17985 48911 97341 30358
00085 91307 06991 19072 24210 36699 53728 28825 35793 28976 66252
00086 68434 94688 84473 13622 62126 98408 12843 82590 09815 93146
00087 48908 15877 54745 24591 35700 04754 83824 52692 54130 55160
00088 06913 45197 42672 78601 11883 09528 63011 98901 14974 40344
00089 10455 16019 14210 33712 91342 37821 88325 80851 43667 70883
00090 12883 97343 65027 61184 04285 01392 17974 15077 90712 26769
00091 21778 30976 38807 36961 31649 42096 63281 02023 08816 47449
00092 19523 59515 65122 59659 86283 68258 69572 13798 16435 91529
00093 67245 52670 35583 16563 79246 86686 76463 34222 26655 90802
00094 60584 47377 07500 37992 45134 26529 26760 83637 41326 44344
00095 53853 41377 36066 94850 58838 73859 49364 73331 96240 43642
00096 24637 38736 74384 89342 52623 07992 12369 18601 03742 83873
00097 83080 12451 38992 22815 07759 51777 97377 27585 51972 37867
00098 16444 24334 36151 99073 27493 70939 85130 32552 54846 54759
00099 60790 18157 57178 65762 11161 78576 45819 52979 65130 04860
00100 03991 10461 93716 16894 66083 24653 84609 58232 88618 19161
00101 38555 95554 32886 59780 08355 60860 29735 47762 71299 23853
00102 17546 73704 92052 46215 55121 29281 59076 07936 27954 58909
00103 32643 52861 95819 06831 00911 98936 76355 93779 80863 00514
00104 69572 68777 39510 35905 14060 40619 29549 69616 33564 60780
00105 24122 66591 27699 06494 14845 46672 61958 77100 90899 75754
00106 61196 30231 92962 61773 41839 55382 17267 70943 78038 70267
00107 30532 21704 10274 12202 39685 23309 10061 68829 55986 66485
00108 03788 97599 75867 20717 74416 53166 35208 33374 87539 08823
00109 48228 63379 85783 47619 53152 67433 35663 52972 16818 60311
00110 60365 94653 35075 33949 42614 29297 01918 28316 98953 73231
00111 83799 42402 56623 34442 34994 41374 70071 14736 09958 18065
00112 32960 07405 36409 83232 99385 41600 11133 07586 15917 06253
00113 19322 53845 57620 52606 66497 68646 78138 66559 19640 99413
00114 11220 94747 07399 37408 48509 23929 27482 45476 85244 35159
00115 31751 57260 68980 05339 15470 48355 88651 22596 03152 19121
00116 88492 99382 14454 04504 20094 98977 74843 93413 22109 78508
00117 30934 47744 07481 83828 73788 06533 28597 20405 94205 20380
00118 22888 48893 27499 98748 60530 45128 74022 84617 82037 10268
00119 78212 16993 35902 91386 44372 15486 65741 14014 87481 37220
00120 41849 84547 46850 52326 34677 58300 74910 64345 19325 81549
00121 46352 33049 69248 93460 45305 07521 61318 31855 14413 70951
00122 11087 96294 14013 31792 59747 67277 76503 34513 39663 77544
00123 52701 08337 56303 87315 16520 69676 11654 99893 02181 68161
00124 57275 36898 81304 48585 68652 27376 92852 55866 88448 03584
00125 20857 73156 70284 24326 79375 95220 01159 63267 10622 48391
00126 15633 84924 90415 93614 33521 26665 55823 47641 86225 31704
00127 92694 48297 39904 02115 59589 49067 66821 41575 49767 04037
00128 77613 19019 88152 00080 20554 91409 96277 48257 50816 97616
00129 38688 32486 45134 63545 59404 72059 43947 51680 43852 59693
00130 25163 01889 70014 15021 41290 67312 71857 15957 68971 11403
00131 65251 07629 37239 33295 05870 01119 92784 26340 18477 65622
00132 36815 43625 18637 37509 82444 99005 04921 73701 14707 93997
00133 64397 11692 05327 82162 20247 81759 45197 25332 83745 22567
00134 04515 25624 95096 67946 48460 85558 15191 18782 16930 33361
00135 83761 60873 43253 84145 60833 25983 01291 41349 20368 07126
00136 14387 06345 80854 09279 43529 06318 38384 74761 41196 37480
00137 51321 92246 80088 77074 88722 56736 66164 49431 66919 31678
00138 72472 00008 80890 18002 94813 31900 54155 83436 35352 54131
00139 05466 55306 93128 18464 74457 90561 72848 11834 79982 68416
00140 39528 72484 82474 25593 48545 35247 18619 13674 18611 19241
00141 81616 18711 53342 44276 75122 11724 74627 73707 58319 15997
00142 07586 16120 82641 22820 92904 13141 32392 19763 61199 67940
00143 90767 04235 13574 17200 69902 63742 78464 22501 18627 90872
00144 40188 28193 29593 88627 94972 11598 62095 36787 00441 58997
00145 34414 82157 86887 55087 19152 00023 12302 80783 32624 68691
00146 63439 75363 44989 16822 36024 00867 76378 41605 65961 73488
00147 67049 09070 93399 45547 94458 74284 05041 49807 20288 34060
00148 79495 04146 52162 90286 54158 34243 46978 35482 59362 95938
00149 91704 30552 04737 21031 75051 93029 47665 64382 99782 93478
00150 94015 46874 32444 48277 59820 96163 64654 25843 41145 42820
00151 74108 88222 88570 74015 25704 91035 01755 14750 48968 38603
00152 62880 87873 95160 59221 22304 90314 72877 17334 39283 04149
00153 11748 12102 80580 41867 17710 59621 06554 07850 73950 79552
00154 17944 05600 60478 03343 25852 58905 57216 39618 49856 99326
00155 66067 42792 95043 52680 46780 56487 09971 59481 37006 22186
00156 54244 91030 45547 70818 59849 96169 61459 21647 87417 17198
00157 30945 57589 31732 57260 47670 07654 46376 25366 94746 49580
00158 69170 37403 86995 90307 94304 71803 26825 05511 12459 91314
00159 08345 88975 35841 85771 08105 59987 87112 21476 14713 71181
00160 27767 43584 85301 88977 29490 69714 73035 41207 74699 09310
00161 13025 14338 54066 15243 47724 66733 47431 43905 31048 56699
00162 80217 36292 98525 24335 24432 24896 43277 58874 11466 16082
00163 10875 62004 90391 61105 57411 06368 53856 30743 08670 84741
00164 54127 57326 26629 19087 24472 88779 30540 27886 61732 75454
00165 60311 42824 37301 42678 45990 43242 17374 52003 70707 70214
00166 49739 71484 92003 98086 76668 73209 59202 11973 02902 33250
00167 78626 51594 16453 94614 39014 97066 83012 09832 25571 77628
00168 66692 13986 99837 00582 81232 44987 09504 96412 90193 79568
00169 44071 28091 07362 97703 76447 42537 98524 97831 65704 09514
00170 41468 85149 49554 17994 14924 39650 95294 00556 70481 06905
00171 94559 37559 49678 53119 70312 05682 66986 34099 74474 20740
00172 41615 70360 64114 58660 90850 64618 80620 51790 11436 38072
00173 50273 93113 41794 86861 24781 89683 55411 85667 77535 99892
00174 41396 80504 90670 08289 40902 05069 95083 06783 28102 57816
00175 25807 24260 71529 78920 72682 07385 90726 57166 98884 08583
00176 06170 97965 88302 98041 21443 41808 68984 83620 89747 98882
00177 60808 54444 74412 81105 01176 28838 36421 16489 18059 51061
00178 80940 44893 10408 36222 80582 71944 92638 40333 67054 16067
00179 19516 90120 46759 71643 13177 55292 21036 82808 77501 97427
00180 49386 54480 23604 23554 21785 41101 91178 10174 29420 90438
00181 06312 88940 15995 69321 47458 64809 98189 81851 29651 84215
00182 60942 00307 11897 92674 40405 68032 96717 54244 10701 41393
00183 92329 98932 78284 46347 71209 92061 39448 93136 25722 08564
00184 77936 63574 31384 51924 85561 29671 58137 17820 22751 36518
00185 38101 77756 11657 13897 95889 57067 47648 13885 70669 93406
00186 39641 69457 91339 22502 92613 89719 11947 56203 19324 20504
00187 84054 40455 99396 63680 67667 60631 69181 96845 38525 11600
00188 47468 03577 57649 63266 24700 71594 14004 23153 69249 05747
00189 43321 31370 28977 23896 76479 68562 62342 07589 08899 05985
00190 64281 61826 18555 64937 13173 33365 78851 16499 87064 13075
00191 66847 70495 32350 02985 86716 38746 26313 77463 55387 72681
00192 72461 33230 21529 53424 92581 02262 78438 66276 18396 73538
00193 21032 91050 13058 16218 12470 56500 15292 76139 59526 52113
00194 95362 67011 06651 16136 01016 00857 55018 56374 35824 71708
00195 49712 97380 10404 55452 34030 60726 75211 10271 36633 68424
00196 58275 61764 97586 54716 50259 46345 87195 46092 26787 60939
00197 89514 11788 68224 23417 73959 76145 30342 40277 11049 72049
00198 15472 50669 48139 36732 46874 37088 73465 09819 58869 35220
00199 12120 86124 51247 44302 60883 52109 21437 36786 49226 77837
00200 19612 78430 11661 94770 77603 65669 86868 12665 30012 75989
00201 39141 77400 28000 64238 73258 71794 31340 26256 66453 37016
00202 64756 80457 08747 12836 03469 50678 03274 43423 66677 82556
00203 92901 51878 56441 22998 29718 38447 06453 25311 07565 53771
00204 03551 90070 09483 94050 45938 18135 36908 43321 11073 51803
00205 98884 66209 06830 53656 14663 56346 71430 04909 19818 05707
00206 27369 86882 53473 07541 53633 70863 03748 12822 19360 49088
00207 59066 75974 63335 20483 43514 37481 58278 26967 49325 43951
00208 91647 93783 64169 49022 98588 09495 49829 59068 38831 04838
00209 83605 92419 39542 07772 71568 75673 35185 89759 44901 74291
00210 24895 88530 70774 35439 46758 70472 70207 92675 91623 61275
00211 35720 26556 95596 20094 73750 85788 34264 01703 46833 65248
00212 14141 53410 38649 06343 57256 61342 72709 75318 90379 37562
00213 27416 75670 92176 72535 93119 56077 06886 18244 92344 31374
00214 82071 07429 81007 47749 40744 56974 23336 88821 53841 10536
00215 21445 82793 24831 93241 14199 76268 70883 68002 03829 17443
00216 72513 76400 52225 92348 62308 98481 29744 33165 33141 61020
00217 71479 45027 76160 57411 13780 13632 52308 77762 88874 33697
00218 83210 51466 09088 50395 26743 05306 21706 70001 99439 80767
00219 68749 95148 94897 78636 96750 09024 94538 91143 96693 61886
00220 05184 75763 47075 88158 05313 53439 14908 08830 60096 21551
00221 13651 62546 96892 25240 47511 58483 87342 78818 07855 39269
00222 00566 21220 00292 24069 25072 29519 52548 54091 21282 21296
Source : 222 lines of random numbers from the random number table of RAND
Statistical Issues in Relation to Audit
Statistics are no substitute for judgment
Statistics are used much like a drunk uses a lamppost: for support, not for illumination.
Pawan Dhamija
Statistical Advisor
What is Statistics ?
(a) Descriptive Statistics: Deals with
Collection, Organisation, presentation,
Summarisation and analysis of Data
(b) Inferential Statistics: In addition deals
with drawing of inference about a set of data
(Population) when only a part of data
(Sample) is observed.
Statistics
Descriptive Statistics: Collecting, Summarizing, Presenting, Analyzing. Draws conclusions about the subjects studied.
Inferential Statistics: Collecting, Summarizing, Presenting, Analyzing, Generalizing. Draws conclusions about the items or group which is bigger than what has been observed.
Why Statistics in audit
To develop an appreciation of averages and
variability.
To turn data into information.
To develop an understanding of the ideas of statistical
reliability/precision, probability, risk/errors, etc.
To use these ideas to develop a proper sampling
design, including the decision about sample size, and
to draw valid inferences based on the sample.
Population and Sample
Population: Entire group of people or objects
(vouchers, bills, audit entities) to which the
researcher/auditor wishes to generalize the
study/audit findings.
Sample: A sample is a part of the population,
selected by the investigator/auditor to gather
information on certain characteristics of the original
population.
Sampling, Census and Statistical Inference
Sampling: The process of selecting a sample from a
population to generate precise and valid estimates of
population parameters.
Census (100% enumeration): The process of collecting
relevant information/data in respect of each and every
member/unit of the population.
Statistical Inference: Drawing conclusions (inferences)
about a population based on an examination/audit of
sample(s) taken from the population.
Describing Sample/Population
• Descriptive Statistics
  • Measures of Central Tendency
  • Measures of Variability/Dispersion
  • Other descriptive measures, like minimum and maximum values
• Sample size (n)
• Percentile: the 25th percentile P is the value of the variable
X such that 25% of the observations are below P.
• Median = 50th percentile; Q1 = 25th percentile; Q3 = 75th percentile.
Measures of Central Tendency (Averages)
• Measure the “centre” of the data set
• A single number that can be taken as
representative of the entire data set
• Measures commonly used for averages are:
  • Mean
  • Median
  • Mode
• Which measure to use depends on the nature of the data
• It is okay to report more than one measure.
Measures of Central Tendency: Mean
• The mean is given by the sum of the observations
divided by the number of observations, e.g. the mean
of 1, 3, 5, 7, 9 is (1+3+5+7+9)/5 = 5.
• If the data are made up of n observations X_1, X_2, ..., X_n,
we can calculate the mean as: \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i
where Σ denotes summation, n is the sample size and \bar{X} is the sample mean.
• The mean fulfills all the conditions of a good average.
However, it is strongly affected by extreme values, so
it should not be used for skewed (explained later) data.
Measures of Central Tendency: Mean Cont.
• The population mean is usually unknown, so we try to
make inferences about it.
• According to statistical sampling theory, the sample mean can
be taken as a projection/estimate of the population mean.
• e.g. the average misstatement in book values based on a
sample is an estimator (projection) of the average misstatement
in book value in the population.
Median: the “middle observation” according to its rank in the
data, i.e. after arranging the data in ascending order.
• Better than the mean if extreme observations are present, i.e.
for skewed data.
• If n is odd: Median = the ((n+1)/2)-th item.
  If n is even: Median = ½ × [(n/2)-th item + (n/2 + 1)-th item],
  after the data have been ordered.
• e.g. observations: 1, 3, 5, 7, 9, 13, 5, 7, 8, 2, 10, 5
  in ascending order: 1, 2, 3, 5, 5, 5, 7, 7, 8, 9, 10, 13
  n = 12 (even); n/2 = 12/2 = 6
  Median = ½ × (6th + 7th item) = ½ × (5 + 7) = 6
Mode: the value that occurs most frequently
• Good for qualitative data like intelligence, beauty, honesty.
• The value/observation with the highest frequency gives the mode.
e.g. for observations: 1, 3, 5, 7, 9, 13, 5, 7, 8, 2, 10, 5
in ascending order: 1, 2, 3, 5, 5, 5, 7, 7, 8, 9, 10, 13
Observation 5 occurs the maximum number of times (has the
highest frequency), so mode = 5.
• However, if two or more observations share the
highest frequency, the empirical relation Mode = 3 × median - 2 × mean is used.
• If the data are symmetric (explained later):
mean = median = mode
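The three averages for the twelve observations used in the median and mode examples can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

observations = [1, 3, 5, 7, 9, 13, 5, 7, 8, 2, 10, 5]

mean_value = statistics.mean(observations)      # sum of observations / n = 75 / 12
median_value = statistics.median(observations)  # average of the 6th and 7th ordered items
mode_value = statistics.mode(observations)      # most frequent observation

print(mean_value, median_value, mode_value)
```

The mean (6.25) lies above the median (6) and the mode (5) because of the large value 13, which illustrates how extreme observations pull the mean.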
Frequency Distribution:
A tabular statement with two columns; the first column describes
the variable categories/values and the second column
gives the frequencies (i.e. the number of times the variable
takes that particular value). Example of a frequency distribution:
No. of errors (x) 0-5 5-10 10-15 15-20 20-25 25-30 30-35 35-40
No. of vouchers (f) 328 350 720 664 598 524 378 244
It indicates that there are 328 vouchers containing 0-5 (5 excluded)
errors. The arithmetic mean for a frequency distribution is given by:
\bar{X} = \frac{1}{N}\sum f_i X_i
where N = Σf and, for grouped classes, X_i is the class mid-point.
Mean, Median and Mode of Freq. Distribution
Xi fi fi*Xi c.f.
1 4 4 4
2 6 12 10
3 5 15 15
4 4 16 19
5 4 20 23
6 3 18 26
7 4 28 30
8 6 48 36
9 4 36 40
10 3 30 43
11 3 33 46
12 4 48 50
Total 50 308
Mean = 308/50 = 6.16
N = 50 (even); N/2 = 25
Median = ½ × (25th + 26th item) = ½ × (6 + 6) = 6
The highest frequency, 6, occurs twice, so use
Mode = 3 × median - 2 × mean = 3 × 6 - 2 × 6.16 = 5.68
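The worked figures for this frequency distribution can be reproduced with a few lines of Python. The variable names are illustrative; the fallback to the empirical relation when the highest frequency is tied follows the rule stated on the slide.

```python
import statistics

# value -> frequency, taken from the table above
freq = {1: 4, 2: 6, 3: 5, 4: 4, 5: 4, 6: 3, 7: 4, 8: 6, 9: 4, 10: 3, 11: 3, 12: 4}

N = sum(freq.values())
fd_mean = sum(x * f for x, f in freq.items()) / N

# Expand the table into the ordered list of N observations to locate the middle items
expanded = sorted(x for x, f in freq.items() for _ in range(f))
fd_median = statistics.median(expanded)

top = max(freq.values())
modal_values = [x for x, f in freq.items() if f == top]
if len(modal_values) == 1:
    fd_mode = modal_values[0]
else:
    fd_mode = 3 * fd_median - 2 * fd_mean   # empirical relation used when the mode is tied

print(N, fd_mean, fd_median, round(fd_mode, 2))
```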
Dispersion: Measures of Variability
• Measure the “spread or variation”
(heterogeneity) in the data
• Measures commonly used for variability are:
  • Variance
  • Standard Deviation
  • Range
  • Semi Inter-quartile Range
  or Quartile Deviation (QD)
Measures of Variability: Variance
• The sample variance (s^2) may be calculated from the
data. It is the average of the squared deviations of the
observations from the mean:
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2
• where n is the sample size and \bar{X} is the sample mean.
For sample size n > 30, n may be used instead of n-1.
• The population variance is denoted by S^2. This is usually
unknown.
• s^2 (sample variance) can be used as an estimator
(projected value) of S^2 (population variance).
Standard Deviation (SD)
• Square root of the variance
• s = √(s^2) = sample SD
• S = √(S^2) = population SD; usually unknown
• Merit of SD: it is expressed in the same units as the
mean (instead of squared units like the variance)
• Demerit: it is difficult to calculate
Range and Quartile Deviation (QD)
• Range = Maximum - Minimum
• QD = ½ × (Q3 - Q1)
• where Q3 is the third quartile and Q1 the first quartile;
25% of the observations are below Q1 and 75% are below Q3.
• QD is a better measure than the range, as the range is based on just 2
items while QD is based on the middle 50% of the items.
• SD is the best and most useful measure of variation;
however, if there are outliers (i.e. if the data are highly
skewed) it should not be used.
Find the standard deviation and variance of the number of
errors in the vouchers (solution follows)
No. of errors (Xi) Number of vouchers (fi)
6 5
7 4
8 4
9 3
10 2
Total 18
Solution
No. of errors (A)   Number of vouchers (B)   C: A*B   D: x - m    E: (x-m)²    F: B*E
x                   f                        f*x      x - 7.6     (x-7.6)²     f*(x-7.6)²
6                   5                        30       -1.6        2.56         12.80
7                   4                        28       -0.6        0.36         1.44
8                   4                        32       0.4         0.16         0.64
9                   3                        27       1.4         1.96         5.88
10                  2                        20       2.4         5.76         11.52
Total Σ             18                       137      2           -            32.28
Mean (m) = Σf*x / Σf = 137/18 = 7.6
Var. = Σf*(x-m)² / Σf = 32.28/18 = 1.79
Std. Dev. = √Var. = √(32.28/18) = 1.339
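The calculation in the solution table can be checked with a short script. This is only a sketch: the variable names are illustrative, and it uses the unrounded mean (137/18) rather than the rounded 7.6 used on the slide, so intermediate figures differ slightly in the later decimals.

```python
import math

# Frequency table from the exercise: number of errors -> number of vouchers.
freq = {6: 5, 7: 4, 8: 4, 9: 3, 10: 2}

n = sum(freq.values())                                  # Σf
mean = sum(x * f for x, f in freq.items()) / n          # Σf*x / Σf
# Variance as on the slide: weighted squared deviations divided by Σf.
variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / n
std_dev = math.sqrt(variance)

print(n, round(mean, 2), round(variance, 2), round(std_dev, 3))
```

Dividing by Σf follows the slide; dividing by Σf − 1 instead would give the sample variance defined earlier for small samples.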

20
Skewness: Measures of symmetry of data set
Skewness measures lack of symmetry of data
Positive or right skewed: Longer right tail
Negative or left skewed: Longer left tail
21
Sampling: Some Facts
 For very small samples (e.g., <5 observations),
summary statistics (mean, SD etc.) are not meaningful.
Simply list the data.
 Beware that poor samples may provide a distorted view
of the population
 In general, larger samples are better representative of
the population but they need more resources; so we
have to trade off between sample size and feasibility
(available resources).
22
Probability (P)
 Measures the likelihood with which an event occurs:
$$ P = \frac{\text{favourable number of cases}}{\text{total number of cases}} $$
P lies between 0 and 1
Probability of an impossible event = 0
Probability of a certain event = 1
23
Probability Distributions (P. D.)
 A probability distribution describes the behavior of the
character of interest (called variable);
 It identifies possible values of the variable and
provides information about the probability with which
these values (or ranges of values) will occur.
 For e.g. in voucher sampling we have following P.D.
No. of errors   0      1      2      3      4      More than 4   Total
Prob.           0.50   0.20   0.10   0.06   0.04   0.10          1.00
 Important P.Ds are Binomial, Poisson and Normal

24
Normal Curve and 90% value
Sampling and its types
P. K. Dhamija Stat. Advisor
1
Sampling
The Process of selection of some members of a
population to generate precise and valid
estimates of population parameters like
averages or proportions.
Sample
A sample is a part of the population, selected by the
investigator/auditor as its representative to gather
information about the population
2
Sampling Terms
 Sampling unit (Basic sampling unit)
Example: vouchers, cheques, bills, districts, audit units
 Sampling frame
List of all sampling units in the population
 Sampling scheme
Method used to select sampling units from the sampling
frame
 Parameter : Population characteristic like average, proportion
based on all the units in the population; it is constant/fixed.
 Statistic: Sample characteristic like average, proportion based
on sample values; it varies from sample to sample.
3
Advantages of Survey Sampling
 less expensive; Saves Time
 The quality of information is maintained.
 Possible to determine the extent of error due to
Sampling
 Non Sampling errors are likely to be less
 Even Census Results are verified by sampling
 Law of Statistical Regularity lays down that a
moderately large number of items chosen at random
from a large group are almost sure, on average, to
possess the characteristics of the large group.
4
Disadvantages of Survey Sampling
 Results of a sample survey are subject to error due
to sampling.
 A sample may not properly represent the various
subgroups of a population.
 Sometimes the sampling methods may become
complicated requiring the services of an expert.
 Note: If time & money are not important factors
and if population under consideration is not too
large, census is better than any sampling method.
5
Types of sampling
Non-Statistical sampling
Statistical sampling
6
Non-Statistical Sampling
 Units in the study population do not have a known
probability of being included in the sample
 Subjective/Biased samples
 Used when (i) the number of elements in the population
is either unknown or units in the population can not be
identified and (ii) there are time/ resource constraints
Advantages:
 Practical and easy to conduct
Dis-advantages:
 Not representative of the population
 Not possible to (i) assess the validity of estimates (ii)
Determine sample size
7
Some Non-Statistical Sampling
Techniques
I Accidental/Haphazard Sampling
The auditor selects a sample (audit units, bills, vouchers)
without any conscious bias; the sample is expected to be
representative of the population. For example, avoiding the first
and last voucher in a bundle.
II Judgmental/Purposive Sampling
The auditor selects a sample (audit units, bills, vouchers)
which in his or her opinion contains the most errors, say
vouchers with the highest values or vouchers of some
particular treasury.
8
Statistical Sampling
 Each unit in the study population has a known probability
(may not be equal) of being included in the sample.
Advantages:
 It provides estimates free from personal bias
 It permits application of objective methods of minimizing
error under the resource constraints.
 Allows valid conclusions to be drawn about the population
Disadvantages:
 Needs a sampling frame
 More difficult to apply than non-statistical (non-probability) sampling
9
1. Simple random sampling (SRS)
The most commonly used Statistical sampling
 Principle: Equal chance for each sampling unit to be
included in the sample
 Procedure
1. Identify all sampling units in the population
2. Determine sample size (n) using appropriate formula/table
3. Draw (n) units using random tables
or computer programs like Excel or IDEA.
 Advantages
 Simple
 Sampling error easily measured
 Disadvantages
 Need complete list of units
 Not always best representative
10
SRS with replacement (SRSWR)
 First unit is randomly selected from population
 The sampled unit is replaced in the population
 Then second unit is drawn; probability of selection
of an element remains unchanged after each draw
 The procedure is repeated until the requisite
sample of size 'n' is drawn.
 In practice SRSWR is not attractive; Same units
can be selected more than once which may not
add any value/additional information
 But in mathematical terms, it is simpler to relate
the sample to population by SRSWR.
11
SRS without replacement (SRSWOR)
 Unlike SRSWR, once an element is selected as a
sample unit, it is not replaced in the population
 The selected sample units are distinct
 SRSWOR provides two advantages:
 Elements are not repeated
 Variance estimation is smaller (efficiency is
higher) than SRSWR with same sample size
12
Use the following Random Number Table to draw a
simple random sample (i) of 15 vouchers without
replacement and (ii) of 45 vouchers with replacement;
from a treasury having 500 vouchers.
Part of Random Number table
2952 6641 3992 9792 7979 5911 3170
5624 4167 9524 1545 1396 7203 5356
1300 2693 2370 7483 3408 2762 3563
1089 6913 7691 0560 5246 1112 6107
6008 8126 4233 8776 2754 9143 1405
9025 7002 6111 8816 6446
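Where a computer is used instead of the printed random number table, the same two draws can be sketched as below. This is an illustration only: the function names and the seed are hypothetical, and Python's pseudo-random generator stands in for the table.

```python
import random

def srs_without_replacement(population_size, n, seed=None):
    """Draw n distinct voucher numbers from 1..population_size (SRSWOR)."""
    rng = random.Random(seed)
    return rng.sample(range(1, population_size + 1), n)

def srs_with_replacement(population_size, n, seed=None):
    """Draw n voucher numbers from 1..population_size; repeats are allowed (SRSWR)."""
    rng = random.Random(seed)
    return [rng.randint(1, population_size) for _ in range(n)]

# Treasury with 500 vouchers, as in the exercise.
wor = srs_without_replacement(500, 15, seed=1)
wr = srs_with_replacement(500, 45, seed=1)
print(sorted(wor))
print(sorted(wr))
```

Fixing the seed makes the selection reproducible, which helps when the sample has to be documented in working papers.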
13
2. Systematic Sampling
Principle:
In this method, first unit is drawn by random numbers;
thereafter, every kth (k = N/n is sampling interval) unit is
drawn. It gives equal chance of selection to each unit
Procedure
1. Prepare a list of all elements in the study population (N)
2. Decide the sample size (n)
3. Determine the sampling interval ‘k’ as the integer
nearest to N/n
4. Have the random start by choosing an integer ‘r’
between 1 and k.
5. Select every kth unit starting with the unit corresponding
to the number ‘r’. 14
Systematic Sampling contd..
Say, Target Population N= 54000 vouchers (Sampling Frame)
Sample size n = 6000
Sampling Interval (K) = Target Population / Sample size
= 54000/6000 = 9
Number all vouchers of the population
Select one number between 1 and 9 (here K = 9) at random
Say number 5 is selected; then the 5th voucher is selected
Next the 5 + 9 = 14th, then the 14 + 9 = 23rd voucher is selected, and so on …
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35 ,…….
[Circular, multiple or random systematic methods also used]
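The linear systematic procedure above can be sketched in a few lines. The function name is hypothetical, and the random start is passed in explicitly so that the slide's example (start 5, interval 9) can be reproduced.

```python
def systematic_sample(N, n, start):
    """Select every k-th unit from 1..N, beginning at 'start' (1 <= start <= k)."""
    k = round(N / n)  # sampling interval: integer nearest to N/n
    if not 1 <= start <= k:
        raise ValueError("random start must lie between 1 and k")
    return list(range(start, N + 1, k))

# Example from the slide: 54000 vouchers, sample of 6000, random start 5.
sample = systematic_sample(54000, 6000, start=5)
print(sample[:3], len(sample))  # [5, 14, 23] 6000
```

In practice the start would itself be drawn at random, for instance with `random.randint(1, k)`.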
15
Systematic Sampling (contd..)
 Advantages –
 Requires less time, sometimes less costly than SRS.
 Ensures representativeness across list
 Easy to implement
 Disadvantages-
Works well only if the complete and up-to-date frame
is available and if the units are randomly arranged in
the frame; for this reason the units are arranged in
some order say alphabetical or in increasing/decreasing
order of value before selecting a sample.
16
3. Stratified sampling
 Principle
 Classify population into homogeneous subgroups (strata)
 Stratification may be done on the basis of income, age,
rural-urban, Revenue-Capital, Treasuries, major heads, etc.
 Draw sample (not necessarily equal) from each strata
 Combine results of all strata
 Advantages
 More precise if variable associated with strata e.g. In MUS
sampling, variable is value of vouchers which is related to
strata so it is likely to yield better results than SRS
 All subgroups represented, allowing separate conclusion about
each of them; say separate conclusion for each
state/District/treasury
 Administrative convenience 17
3. Stratified sampling (contd..)
 Disadvantages
Sampling error difficult to measure
 Loss of precision if small numbers sampled in
individual heterogeneous strata
 Example of stratified sampling:
 (i) To select BPL households for a social audit; divide
the population of BPL into three categories (strata) say
top 25%, Middle 50% and Bottom 25% and select
separate samples from 3 categories/strata. (ii) Monetary
Unit Sampling (MUS) is also a case of Stratified
Sampling where the population is divided into 2 strata –
High value and low value vouchers/items
18
Allocation of Sample size in Stratified sampling
 Proportional Allocation: ni = (n/N)*Ni
where ni is the size of the sample from the ith stratum, Ni is the population of the ith
stratum; n is the sample size and N is the population size
 Optimum allocation ni’s are chosen so as to
(a) Maximise the precision for fixed sample size n; Neyman’s
Allocation
(b) Maximise the precision for fixed cost
(c) Minimise the total cost for fixed desired precision
 Disproportionate Allocation
No. of items selected from a stratum is independent of its size.
 A large sample would be required from a stratum if
1. Stratum size Ni is large.
2. Stratum variability Si (variance/Std. Dev.) is large.
19
Exercise: Proportional Allocation
 Number of Vouchers coming from 3 treasuries are 300,
200 and 500 respectively. Draw a proportional stratified
sample of size 60 using the random number table given in
slide No.13.
 Solution:
Here N1 = 300, N2 = 200 and N3 = 500; N = 1000
using ni = (n/N)*Ni; i = 1, 2, 3
n1 = (60/1000)* 300 = 18, n2 = (60/1000)* 200 = 12
and n3 = (60/1000)* 500 = 30.
Thus a sample of 18, 12 and 30 will be selected from
these three strata using random number table.
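The allocation step of this exercise can be sketched as follows. The function name is illustrative; note that with other figures the rounded allocations may not add up exactly to n and would need a manual adjustment.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample of n across strata in proportion to stratum size."""
    N = sum(stratum_sizes)
    # ni = (n / N) * Ni, rounded to the nearest whole unit.
    return [round(n * Ni / N) for Ni in stratum_sizes]

# Three treasuries with 300, 200 and 500 vouchers; total sample of 60.
allocation = proportional_allocation([300, 200, 500], 60)
print(allocation)  # [18, 12, 30]
```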
20
4. Cluster Sampling
 The population is divided into non-overlapping
groups known as Clusters.
 Clusters are commonly formed on the basis of
geographical /administrative/political boundaries, e.g.
GPs, Blocks, Departments may act as clusters.
Procedure
 List all the clusters/groups of sampling units of the
study population
 Select Random Sample of clusters
 Survey all or proportion of sampling units of
selected clusters
 For e.g. selecting some Districts from a state and
auditing them leaving other Districts 21
Cluster Sampling (contd..)
 Advantages
 Simple: Complete list of units (sampling frame) is
required only for clusters selected in the sample
 Less travel/resources required
 Disadvantages
 Imprecise if clusters homogeneous (Large sample as
compared to SRS is required for the same precision)
 Sampling error difficult to measure
22
The two stages of a Cluster Sample
 First stage: Probability proportional to size (PPS)
• Find the number of clusters to be included
• Compute cumulative totals of the populations for each cluster
with a grand total
• Divide the grand total by the number of clusters and obtain the
sampling interval (k)
• Choose a random number less than k and identify the first
cluster
• Add the sampling interval and identify the second cluster
• By repeating the same procedure, identify all the clusters to be
selected
 Second stage
 In each selected cluster select a random sample of required
number of units using a sampling frame of Basic Sampling Units
in the cluster.
23
Selection of PPS Sample
Let’s take treasuries as clusters/strata, the objective is to
select 30 clusters/strata i.e. 30 treasuries using PPS; size
being no. of vouchers in a treasury.
Procedure: List all Treasuries with number of vouchers in
them; find the cumulative totals of number of vouchers:
Treasury no. of vouchers Cumulative total
1 34 34
2 60 94
3 30 124
4 76 200
5 315 515 and so on
Total 4,715
Divide the grand total (4,715) by the number of clusters to select (30):
4,715 / 30 = 157.17; the sampling interval ‘k’ is 157
24
Selection of PPS Sample contd…
Find a three digit random number [less than 157] say 123
Select the first cluster corresponding to 123 in cumulative Tot.
Select the remaining clusters from the cumulative distribution
by adding 157 (sampling interval) each time.
Treasury   No. of vouchers   Cum. total   Cluster selected
3          30                124          * selected (1st: 123)
4          76                200
5          315               515          ** selected twice (2nd: 123 + 157 = 280; 3rd: 280 + 157 = 437)
Second Stage: In each selected cluster (treasury) choose required
number of vouchers by random or systematic selection.
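The first-stage PPS selection can be sketched as below, using only the five treasuries listed on the slide (the remaining treasuries of the 4,715-voucher population are not shown, so the run stops at the cumulative total of 515). The function name is hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate

def pps_systematic(sizes, interval, start):
    """Return 1-based cluster numbers hit by start, start+interval, ... on the cumulative scale."""
    cum = list(accumulate(sizes))  # cumulative totals of cluster sizes
    selected = []
    point = start
    while point <= cum[-1]:
        # First cluster whose cumulative total reaches the selection point.
        selected.append(bisect_left(cum, point) + 1)
        point += interval
    return selected

sizes = [34, 60, 30, 76, 315]  # vouchers in treasuries 1..5
picked = pps_systematic(sizes, interval=157, start=123)
print(picked)  # [3, 5, 5]
```

Treasury 5 appears twice because its size (315) exceeds the sampling interval, matching the "selected twice" entry in the table.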
25
Stratified Sampling Vs Cluster Sampling
• In both stratified and cluster sampling, the population is
divided into well-defined groups.
 Stratified sampling is used when each group has small
variation (more homogeneity) within itself but wide
variation between the groups.
 Cluster sampling is used in the opposite case, when there
is considerable variation within each group but the groups
are essentially similar to each other.
 In stratified sampling an estimate for each stratum is
also available, but not in cluster sampling.
26
Stratified Sampling Vs Cluster Sampling - Cont.
• Suppose in a state there are 20 Districts;
• We take a sample of 15 villages in each of the 20
Districts to study the implementation of MGNREGA
• In all 300 villages are selected and studied
• This is an example of stratified sampling when
estimates of the desired characteristics for each of the
Districts (Strata) would also be available
• On the other hand let us select 5 districts out of 20 and
take a sample of 60 villages in each of the selected District
• In all 300 villages are selected and studied
• This is an example of Cluster Sampling
• In this case estimates of the desired characteristics for
each of the Districts (Cluster) would not be available
27
Multistage sampling – an example
 To obtain a sample of ‘n’ households in the country: the first
stage units may be states, the second stage units (SSUs) districts
from selected states, the third stage units villages from selected
districts, and the ultimate stage units households in the villages
 Advantages
 Most feasible approach for large populations
 No complete listing of units is required at various stages;
second stage frame is required only for the selected first
stage units.
 This leads to great saving in operational cost.
 Disadvantages
 Several sampling lists
 Sampling error difficult to measure
 May be less efficient compared to a suitable single stage
sampling of the same size.
28
Sampling in AUDIT - III
P. K. Dhamija Stat. Advisor
1
Advantages of Survey Sampling
 less expensive; Saves Time
 The quality of information is maintained.
 Possible to determine the extent of error due to
Sampling
 Non Sampling errors are likely to be less
 Even Census Results are verified by sampling
 Law of Statistical Regularity states that a
moderately large number of items chosen at random
from a large group are almost sure, on average, to
possess the characteristics of the large group.
This forms the basis of sampling.
4
Non-Statistical Sampling
 Units in the study population do not have a known
probability of being included in the sample
 Subjective/Biased samples
 Used when (i) the number of elements in the population
is either unknown or units in the population can not be
identified and (ii) there are time/ resource constraints
Advantages:
 Practical and easy to conduct
Dis-advantages:
 Not representative of the population
 Not possible to (i) assess the validity of estimates (ii)
Determine sample size using statistical methods.
7
Some Non-Statistical Sampling
Techniques
I Accidental/Haphazard Sampling
The auditor selects a sample (audit units, bills, vouchers,
Districts) without any conscious bias; the sample is
expected to be representative of the population. For example,
avoiding the first and last voucher in a bundle.
II Judgmental/Purposive Sampling
The auditor selects a sample (audit units, bills, vouchers,
Departments) which in his or her opinion contains the most
errors, say vouchers with the highest values or vouchers of
some particular treasury.
8
Audit Sampling
 Application of audit procedure to less than 100
% of the items/transactions for the purpose of
evaluating some characteristic of the items/
transactions under audit.
 Use of audit sampling may not be possible/
advisable in auditing procedures involving
scanning accounting records for unusual items
(outliers), inquiries (Satyam case), most
analytical/detailed procedures, etc.
10
Need of Statistical Sampling in Auditing
 No auditor can check 100% of auditable entities

because of resource constraints
 To provide assurance based on test checks
 Audit methodology is under increasing scrutiny
so auditors need to use scientific/statistical tools
& techniques
 Conclusions based on statistical sampling can
stand scrutiny of auditee/ other professionals.
11
Advantages of Statistical Sampling
 Offer a means of estimating errors/misstatement in
quantifiable and reliable manner
 Takes into account risk and materiality for determining
sample size and cost.
 Offers a means of arriving at an optimum sample size to
avoid under or over auditing
 Properly designed sampling estimates are unbiased and
transparent.
 Helps in forming an opinion about the extent of audit
objection/ value of misstatement [non-levy/short-levy of
taxes] in the population with specified sampling risk (say
5%); e.g. with 95% confidence we can say that errors
in vouchers are between 2.6% and 3.0%
12
 .
Sampling Error
 No sample is a perfect mirror image of population
 The estimates obtained vary from sample to sample.
 The sampling variance is the measure of variability of a
sample estimator like variance of average or proportion.
 The square root of the variance of the sample estimator
is called the standard error of the estimator.
 The lesser the value of standard error, the more efficient
would be the estimator.
 Use of Proper sample design and sufficient sample size
reduce sampling error and increase the efficiency of the
estimate(s) obtained by sampling.
13
Non-Sampling Errors in Audit
include any misjudgement or mistakes by the auditor that may lead
to incorrect conclusion(s) based on audit. They occur even if full
population is examined.
By careful planning & supervision and by using appropriate audit
technique non sampling errors can be reduced but they can't be
eliminated. Some of the cases of non-sampling errors in audit are:
 Selecting inappropriate audit procedures to achieve specific
objective. For e.g. an auditor checks controller’s signature on
voucher & not disbursement approval.
 Auditor may fail to recognize misstatements (errors) included in
documents that he or she examines. Can you think of some examples?
 selecting inappropriate population for e.g. selecting only BPL
households for audit of a scheme involving payment of subsidy.
 Auditor makes an error in evaluation (say totalling mistake or
skipping some vouchers containing error) of the results. 14

Factors Affecting Sample Size
 Variability in the target population – measured by Standard
Deviation of the characteristic under audit
 Margin of error or precision: a measure of possible difference
between the sample estimate and actual population value;
 Feasibility/Resource Constraints - trade off between ideal
sample size and survey cost (cost of sampling per unit, time,
money, human resources)
 Importance of the decision - extent of penalty for making
mistake
 Nature of the analysis/audit intended (exploratory - small
sample or conclusive – large sample)
 Sample sizes used in similar audits in the past
 Expected Incidence/Deviation (error) rates – larger error

rates mean a larger sample size
15
Statistical Sampling Plans used in
Audit
 Attribute Sampling Plans or Attribute
Sampling
 Variable Sampling Plans or Variable
Sampling
16
Attribute Sampling Plans or Attribute Sampling
 An attribute is a qualitative characteristic which can not be
measured quantitatively. However, the population may be
classified into various classes w. r. t. the attribute
 Attribute Sampling is used in Tests of Controls (TOC) i.e. to
find out no. of deviations, proportion (%) of deviations etc. –
it deals with ‘How Many’
Examples of attribute sampling:(a) Whether the Financial and
Accounting system of Canteen Stores Department (CSD)
adheres to the laid down standards & procedures?
(b) If the system of identifying targeted beneficiary in Social
Security scheme was in place and was working effectively?
(c) Verifying signatures or approval stamp on a bill/voucher
(d) Entries posted in the correct account/Head?
17
Attribute Sampling - Types
 Fixed sample size attribute sampling - objective is
to perform test of control to estimate the
deviation/error rate of a population.
 Sequential (stop or go) attribute sampling used for
not so common cases; it prevents oversampling.
 Discovery/Exploratory Sampling: observing at least
one deviation - very rare cases.
 Block Sampling: It includes all items in a selected
time period/group say all vouchers of January or
all vouchers of a particular department/treasury.
18
Sampling Risks -- Tests of Controls
Columns: actual extent of operating effectiveness of the control procedure (Adequate / Inadequate)
Rows: what the test of controls sample indicates

Sample indicates                               | Actually adequate                                                  | Actually inadequate
Extent of operating effectiveness is adequate   | Correct decision                                                   | Incorrect decision (risk of assessing control risk too low) - (β)
Extent of operating effectiveness is inadequate | Incorrect decision (risk of assessing control risk too high) - (α) | Correct decision
19
Practical Illustration of Attribute Sampling
Objective: To find out if the controls are operating
effectively or not?
We Assume
 Risk of Assessing Control Risk Too Low (β) - 5%; it means
95% reliability
 Tolerable Deviation Rate — 9 %
 Expected Population Deviation Rate —2%
It means we assume: (i) 95% reliability i.e. 5% statistical

chance of concluding that the control is operating
effectively when it is not (ii) we are ready to tolerate a
maximum deviation/error rate of 9% - it is similar to
materiality (iii) Estimate of population deviation rate is 2%
(based on prior knowledge like an audit conducted in the
past or based on some pilot) 20
Statistical Sample Sizes for Tests of Controls at 5
Percent Risk of Assessing Control Risk Too Low
21
Sample Size and Evaluating Attributes
Sampling Results
 Sample size using the table (previous slide) is 68 (2)
 It means auditor should select a sample of 68 items.
 Bracketed number - (2) means; if 2 or less
deviations are observed in a sample of 68, we may
conclude that audit objective has been
accomplished. We may conclude like:
“I believe that the deviation/error rate in the
population is less than 9 percent.” We will be wrong
5 % of the time when the deviation is exactly 9 %.
If the deviation rate is in excess of 9 % we will be
wrong even less than 5 % of the time.
Planned assessed level of control risk is achieved.
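The "(2)" acceptance number can be sanity-checked under a simple binomial model, which is an assumption here: the published table may have been built on a slightly different basis. The sketch computes the chance of seeing 2 or fewer deviations in 68 items when the true deviation rate is exactly the tolerable 9%; the function name is illustrative.

```python
from math import comb

def prob_at_most(k, n, p):
    """Binomial probability of observing k or fewer deviations in n items at deviation rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Sample of 68, acceptance number 2, tolerable deviation rate 9%.
risk = prob_at_most(2, 68, 0.09)
print(round(risk, 3))
```

Under this model the result comes out just under 0.05, consistent with the 5% risk of assessing control risk too low stated for the table.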
22
Evaluating Attributes Sampling Results
Case 2: more than 2 deviations identified - Since the
bracketed number was (2), audit objective has
not been met. We may conclude like:
“The achieved deviation rate is higher than 9
percent.” Accordingly, auditor may not “rely” on
internal control to the extent planned.
23
Variable Sampling Plans
 Variable (or quantitative) sampling is used when
the objective is to estimate a quantity (like
amount of loss to government , average loss per
transaction, etc.); it deals with “How Much”
 It is used primarily for substantive testing. Most
commonly used variable sampling plan is
Probability Proportional to Size (PPS) or Monetary
Unit Sampling (MUS)
 MUS: It is a hybrid plan combining the
characteristics of attribute and variable sampling.
24
Sampling Risk/Errors for
Substantive Tests – Variable Sampling

Columns: actual conclusion based on 100% check (No material misstatement / Material misstatements)
Rows: what the sample results indicate

Sample results indicate   | Actually no material misstatement           | Actually material misstatements
No material misstatements | Correct decision                            | Incorrect decision: risk of assessing control risk too low (β)
Material misstatements    | Risk of assessing control risk too high (α) | Correct decision
25
‘Risk of incorrect rejection’ (Alpha risk) or
Risk of Assessing Control Risk Too High
 Risk that sample supports the conclusion that the account
balance is materially misstated when it is not.
 Corresponds to risk of rejecting a correct null hypothesis.
 10% Alpha risk means there is a prob./chance of 0.1
(10%) of concluding that there is a misstatement while
actually there is none.
 Arises when the sample indicates a higher level of
errors/risk than is actually the case.
 This situation is usually resolved by additional audit work
being performed, i.e. taking a larger sample
 affects audit efficiency but should not affect the validity of
the resulting audit conclusion
26
‘Risk of incorrect acceptance’ (Beta risk)
Risk of Assessing Control Risk Too Low
 Material error is not detected in a population because the
sample failed to select sufficient items containing errors.
 Corresponds to risk of not rejecting a false null hypothesis
– in audit, which is more serious: alpha or beta?
 It affects audit effectiveness.
 A 10% beta risk means there is a prob./chance of 0.1 (10%) of concluding that
there is no misstatement while actually there is a
misstatement; it also indicates 100 – 10 = 90% reliability.
 To control this risk we increase precision and hence
sample size.
27
Sample Size required for Un-stratified MPU*
$$ n_0 = \left( \frac{U_r \cdot SD \cdot N}{A} \right)^2 $$
If $n_0 / N$ is high, the sample size can be further reduced for SRSWOR as
$$ n = \frac{n_0}{1 + \frac{n_0}{N}} $$
Ur = reliability coefficient: depends on confidence/reliability reqd.
SD = Standard Deviation; A = Precision i.e. materiality
n0 = sample size, N = population size;
SRSWOR: Simple Random Sampling without Replacement
*Un-stratified Mean Per Unit (MPU): Statistical sampling tech. (not
involving stratification) whereby sample mean is calculated and
projected as an estimated total.
28
Exercise -steps of un-stratified MPU
ABC Limited desires 95 per cent reliability and
plans to use unrestricted (un-stratified) random
sampling without replacement to estimate the
value of inventory of a subsidiary. To estimate
mean and standard deviation (SD) of the
inventory population, a pilot sample of 30 items
from the total population of 2000 items was
selected. The pilot sample produced an
arithmetic mean of Rs. 4000 and a (SD) of Rs.
150.
29
Exercise cont.
 95 per cent reliability means firm is willing to
tolerate 5 per cent chance of sampling error.
 That is, 5 % of the time, projection plus/minus
precision may not include the true population total.
 Based on 95 % reliability, reliability coefficient (UR)
is 1.96 – based on normal tables.
 Precision (A) is judgmentally set equal to Rs. 60,000
(termed as materiality) – the amount considered
material for this application.
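With the figures of this exercise (Ur = 1.96, SD = Rs. 150, N = 2000, A = Rs. 60,000), the sample-size calculation can be sketched as below. The function name is hypothetical; the two formulas are the un-stratified MPU formula and its SRSWOR reduction as stated on the sample-size slide.

```python
import math

def mpu_sample_size(u_r, sd, N, A):
    """Return (n0, n): initial MPU sample size and its SRSWOR-reduced value."""
    n0 = (u_r * sd * N / A) ** 2   # n0 = (Ur * SD * N / A)^2
    n = n0 / (1 + n0 / N)          # reduction for sampling without replacement
    return n0, n

n0, n = mpu_sample_size(u_r=1.96, sd=150, N=2000, A=60000)
print(round(n0, 2), math.ceil(n))  # 96.04 92
```

Rounding up gives the total sample of 92 used in the rest of the exercise, i.e. 62 items in addition to the pilot sample of 30.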
30
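Using the formula, we have the sample size as:
$$ n_0 = \left( \frac{1.96 \times 150 \times 2000}{60000} \right)^2 = (9.8)^2 = 96.04 $$
For SRSWOR the sample size reduces to
$$ n = \frac{96.04}{1 + \frac{96.04}{2000}} \approx 91.6, \text{ rounded up to } 92 $$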
31
Sixty- two additional sample items are added to
the pilot sample of 30 to yield the total sample
of 92. The 62 additional sample items are
selected using SRSWOR.
•A standard deviation based on 92 items is
calculated. Assume that standard deviation is
Rs. 136.
Revised precision will be calculated as:

32
33
Exercise cont.
 Calculated precision A’ is less than predefined
precision (materiality) A = Rs. 60000; so the
sample size is adequate. What happens if revised
precision A’ is greater than materiality?
 The mean of the 92 inventory items is calculated
as follows, assuming the sample totals Rs. 370,977:
Mean = Rs. 370,977 / 92 = Rs. 4,032.36
 Estimated Population Total Value (EV) =
Rs. 4,032.36 * 2000 = Rs. 8,064,720.
34
Exercise cont.
 ABC Limited is thus 95 per cent certain

that the true inventory balance of all 2000
inventory items is within Rs. 8,064,720 ±
Rs. 54,273 (calculated precision).
 ABC Limited should book Rs. 8,064,720 as
inventory.
35
Attribute Sampling: Sample Size
Objective is to estimate the number (proportion) of
audit objections (errors); the estimated number of audit
objections in the population is the sample proportion of
errors multiplied by the number of items in the population.
The optimum sample size under SRSWR is:
$$ n = \frac{Z_r^2 \, P (1 - P)}{A^2} $$
Where Zr = confidence level coefficient = 1.96
for 95% level of confidence, A = margin of error
(we are normally prepared to accept 10 or 20%),
P = proportion of errors expected in the
population.
36
 Estimating P: The formula requires the
knowledge of P, expected proportion of errors in the
population. However, this is what we are trying to
estimate and is unknown
 Ways to estimate proportion :
 A pilot or preliminary sample. Observations used
in the pilot study can be counted as part of the
final sample
 Estimates may be available from previous audit
reports and the upper bound of P can be used in
the formula
 If impossible to obtain a better estimate, set p =
0.5 in the formula to yield maximum value of n
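The effect of the choice of P can be illustrated with a small sketch. It assumes the standard proportion formula n = Zr² · P(1 − P) / A² for the sample size under SRSWR; the function name and the example values of P and A are illustrative only.

```python
import math

def attribute_sample_size(z_r, p, a):
    """Sample size for estimating a proportion: n = Zr^2 * P(1-P) / A^2, rounded up."""
    return math.ceil(z_r**2 * p * (1 - p) / a**2)

# 95% confidence (Zr = 1.96) and a 10% margin of error.
n_worst_case = attribute_sample_size(1.96, 0.5, 0.10)  # no prior estimate: P = 0.5
n_prior = attribute_sample_size(1.96, 0.2, 0.10)       # prior estimate: P = 0.2
print(n_worst_case, n_prior)  # 97 62
```

Because P(1 − P) peaks at P = 0.5, that setting gives the largest (most conservative) sample size for a given margin of error.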
37
Reporting the Results
 When reporting the results of a Sample it
is important to cover the following key
factors:
 The Sample Size
 The Sample selection methodology
 The Estimate(s) resulting from the Sample
 The precision (Std. Error) and confidence
intervals for the Estimate(s)
38
Monetary Unit Sampling MUS
 MUS is nothing but Probability Proportional to
Size (PPS) - systematic sampling, where one
assigns high inclusion probability to the
transactions having high value.
 In MUS method the sampling unit is not an
invoice or any other physical unit, but an
individual rupee. However, when the individual
rupee is selected, the auditor does not verify
just that particular rupee, but the rupee acts as
a hook and drags the whole invoice with it.
Difficult to apply/understand manually and can be
explained using IDEA
39
Control Measures for Non
Sampling Errors
1
Sampling Errors arise due to:
• POPULATION SPECIFICATION ERROR: occurs when the researcher does not
understand who should be surveyed. For example, in a survey about
breakfast cereal consumption, who should be surveyed? The mother
makes the purchase decision, but the children influence her choice.
• SAMPLE FRAME ERROR: occurs when the wrong sub-population is used to
select a sample. For example, if the sample frame is drawn from car
registrations and telephone directories, the results may be wrongly
predicted.
• SELECTION ERROR: occurs when only those who are interested
respond. It can be controlled by pre-survey contact requesting
cooperation, the actual survey, and post-survey follow-up if a response is
not received.
• NON-RESPONSE: non-response errors occur when respondents
differ from those who do not respond. The extent of this
non-response error can be checked through follow-up surveys etc.
• SAMPLING ERRORS: these errors occur because of variation in the
number or representativeness of the sample that responds.
They can be controlled by (1) careful sample designs, (2) large
samples and (3) multiple contacts to assure representative response.
Non Sampling Errors: Types
• Conceptual Errors:
  • Lack of qualified and suitable enumerators
  • Lack of proper training of field staff to
  make them thorough with the concepts
  and definitions involved
• Errors of Recording/Transcription: Due to
carelessness and negligence of the auditor
• Errors of Inaccurate Measurement: Due to
erroneous figures of measurement given by the
informant/auditee
• Errors in Totalling:
  • When there are many items to be totalled up
  • Totalling of subtotals may quite often lead
  to such errors
• Errors of Omission:
  • When the field worker fails to ask certain
  questions in the block
  • Due to non-availability of required
  information
• Bias of the Interviewer:
  • Due to inadequate training or partial
  understanding of instructions
  • Putting a question in a specific way or
  suggesting answers
• Errors of Inconsistency: When data are
inconsistent with similar information
collected in some other part of the same
schedule
• Response Error:
  • Due to a wrong notion in the mind of the
  respondent
  • Due to some kind of fear
  • Due to wrong understanding of questions
  • Due to illiteracy
  • Due to lack of clarity in questions
  • Due to deliberately poor response
• Error due to Prestige/Self-interest: Due to
prestige, pride or self-interest, the informant may
introduce bias by overstating education and
expenditure and understating age, income, etc.
• Errors due to recall lapse: If the recall period is
long, answers may be based on guesses or averages
• Error due to absence of the right informant
• Error due to incorrect identification of sampling
units (say, wrong marking of boundaries):
  • Boundaries not correctly identified due to lack of
  adequate effort or due to misguidance by some
  person
• Errors due to longer reference period:
  • Inclusion of information pertaining to a period
  outside the reference period
  • Exclusion of information pertaining to the
  period within the reference period
Methods of Controlling Non Sampling Errors
• Recruitment of suitable primary field workers who
have:
  • Aptitude for field work
  • Good knowledge of the survey area/local
  language
  • Proper academic qualifications
  • Tactfulness and resourcefulness
• Training: Required for understanding the sampling
design, various concepts and definitions, schedules of
enquiry and procedure of data collection
  • Purpose of training/workshops is to bring uniformity
  in concepts and procedures
  • Active participation by primary field workers and
  supervisors
• Inspection/Supervision:
  • On-the-spot verification
  • Instant feedback to the investigating staff
  • Inspection norms
• Probing:
  • Probing questions should be simple
  • Should not create any sort of bitterness
• Cross checking
• Scrutiny and Super Scrutiny
• Monthly Meetings
• Feedback Reports
• Role of experienced field staff in improvement
of quality
• Amendments in the Schedules and Clarifications
• Pilot Survey
Risk Based Audit Approach Session 7.1

Monetary Unit Sampling

Session Overview

This session is the last of the sessions on audit sampling. Session 5.1 was
on the basic concepts of sampling and Session 6.1 on the application of
attribute sampling for control procedures. This session will cover the basics
of another type of sampling, Monetary Unit Sampling, and its application
for substantive test of details.

Monetary Unit Sampling (MUS) is widely used both for
• Test of Controls
• Account Balance

In this session, we will discuss MUS for Substantive Test of Details,
covering the following important points:

• Monetary Unit Sampling (MUS), advantages and limitations of MUS and
its application to substantive test of details
• Determining the sample size for substantive test of details; and
• Evaluation of results of substantive test of details.

Learning Objectives

At the end of the session, you would be able to apply MUS for substantive
test of details, provided that the steps are followed correctly and the
advantages and limitations of MUS are kept in mind.

The session is expected to provide only a broad overview of MUS and is not
meant to impart expertise on MUS. Participants may read additional study
material suggested in the bibliography for further knowledge.

Basic Concepts

Some of the new concepts that would be discussed in this session are
explained below:

(a) Monetary Unit Sampling:
Monetary Unit Sampling (MUS) is a sampling method in which the sampling
unit is not an invoice or any other physical unit, but an individual rupee.
However, when the individual rupee is selected, the auditor does not verify
just that particular rupee; the rupee acts as a hook and drags the whole
invoice with it. For example, if as a result of sample selection rupee
number 365 is selected for testing and that rupee falls in voucher number
14, then that voucher will be audited and its quality assigned to the
sampling unit.

Let us assume that there are 6 items out of which 2 items are to be
selected. The values of the 6 items are 100, 200, 300, 400, 500 and 1000.
If attribute sampling is used to select the 2 items, then all the items have
an equal chance of selection, as the sampling unit would be the individual
item. On the other hand, if MUS is used, then the total value of the 6 items
works out to Rs.2500, i.e., there are 2500 sampling units. As 2 items are to
be selected, the sampling interval would be 2500/2 (Rs.1250). This means
that one rupee out of every 1250 rupees would be selected. In such a case,
the chance of the 1000 rupee item being picked is 10 times that of the 100
rupee item. Thus MUS has a bias towards high value items.

(b) Most Likely Error:
Most Likely Error (MLE) is an estimate of the error in the population.
Initially the MLE will be estimated based on past experience and used for
determining the sample size. After carrying out the substantive test of
details, the MLE will be projected based on actual sample results and used
for drawing audit conclusions.
Note 7.1 1
(c) Basic Precision:
Basic Precision (BP) is the allowance for errors which exist but of which no
evidence is found in the sample. It is dependent on the confidence level
and the size of the sample and is present even where no errors have been
found in the sample.

(d) Precision Gap Widening:
Precision Gap Widening (PGW) is the additional allowance that must be made
in the Precision as a result of errors in the sample.

(e) Planned Precision:
Planned Precision (PP) is the difference between the Upper Error Limit and
the Most Likely Error, i.e., Planned Precision = Materiality - Most Likely
Error = Basic Precision + Precision Gap Widening.

(f) High value and key items:
The auditor may decide that all items/transactions above a particular
monetary value are to be audited 100%. These items are called high value
items. For example, the auditor may decide that all items above Rs.100,000
are to be audited fully. Similarly, the auditor, using his judgement, may
decide that some items are prone to error due to their nature. Such items
are called key items. For example, if there is a complete breakdown of
controls in a particular division in one account area, he may treat all
items relating to that account area from that division as key items and
check all these items. It is to be remembered that all high value and key
items are to be deducted from the population to arrive at the
representative population for sampling.

(g) Maximum Possible Misstatements or Upper Error Limit:
The Upper Error Limit (UEL) is the maximum possible error estimated in the
population as a result of the substantive test of samples. If the Upper
Error Limit is above the materiality limit, then the auditor will either
perform further substantive tests to check whether there is a material
error or conclude that there is a material error.

(h) Tainting:
Tainting is the percentage of error found in monetary terms in a sample
item. For example, if the accounts receivable balance of 'x' as per the
financial statement is Rs.10,000 and its value as per the auditor's findings
is Rs.8,000, then the tainting would be (Rs.10,000 - Rs.8,000)/Rs.10,000 =
20%.

MUS: Advantages, Limitations and Relevance to Substantive Tests

As described previously, substantive tests are those tests of transactions
and balances that seek to provide evidence as to the completeness, accuracy
and validity of information in the financial statements. The objective of
substantive testing is to obtain reasonable assurance that financial
statement assertions individually and together correspond to established
criteria within limits not exceeding materiality. Thus, substantive tests
are intended to determine the monetary effect of errors in the financial
statements. For example, the aim of substantive tests of accounts receivable
could be to check the extent to which the balance is overstated. For
substantive test of details (audit of individual transactions), we must use
sampling to select individual items for examination.

In MUS, as explained earlier, each rupee is treated as a sampling unit and
acts as a hook for the physical unit in which it occurs; conclusions on the
physical unit in monetary terms can be reached. The results from the tests
of the sample are then used to project the most likely error and the upper
error limit in the population. As MUS helps in arriving at audit conclusions
in monetary terms with quantification of the degree of confidence in the
result, it is the preferred method of sampling for substantive test of
details.
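The "rupee as a hook" idea can be made concrete with a small Python sketch of systematic rupee selection, using the six-item illustration (values 100, 200, 300, 400, 500 and 1000, two selections, sampling interval Rs.1250). The function name and the optional `start` argument are illustrative only; a fixed start is used so that the example can be retraced by hand.

```python
import bisect
import itertools
import random


def mus_systematic(values, n, start=None, rng=None):
    """Return the indices of the items hooked by a systematic selection of n rupees.

    values -- positive whole-rupee book values of the items, in ledger order
    start  -- first selected rupee (1..interval); drawn at random when omitted
    """
    total = sum(values)
    interval = total // n                       # sampling interval in whole rupees
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, interval)        # random rupee within the first interval
    cumulative = list(itertools.accumulate(values))
    selected = []
    for k in range(n):
        rupee = start + k * interval            # the k-th selected rupee
        item = bisect.bisect_left(cumulative, rupee)   # item in which that rupee falls
        if item not in selected:                # a large item can be hooked more than once
            selected.append(item)
    return selected


values = [100, 200, 300, 400, 500, 1000]
print(mus_systematic(values, 2, start=365))     # rupees 365 and 1615 are selected
```

With a start of 365 the selected rupees are 365 and 1615, which fall in the third item (rupees 301 to 600) and the sixth item (rupees 1501 to 2500). Over all 1250 possible starts the 1000 rupee item is hooked by 1000 of them and the 100 rupee item by 100, which is the tenfold difference described in the illustration.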
Advantages of MUS

The important advantages of MUS are:
(i) It normally produces smaller sample sizes than other substantive
sampling plans.
(ii) There is no difficulty in expressing a conclusion in monetary terms.
(iii) The application of rupee unit sampling is not contingent on knowledge
of the population size. This permits sample selection to be started before
the total value of the final population is known. As will be explained
later in the session, the average sampling interval can be worked out
without details of population size.
(iv) No rupee stratification is necessary, as this will be accomplished
automatically, thus avoiding problems of determining optimum strata
boundaries and allocation of sample size among strata.
(v) It is relatively easy to apply compared to other sampling plans.
(vi) The problem of detecting large but infrequent errors is solved, since
all items greater than the sampling interval will be selected.

Disadvantages of MUS

The main disadvantages of MUS are the following:
(i) Accounts/items with nil balances will have no chance of selection.
(ii) The more an item is understated, the less likely it is to be selected.
Hence, MUS is less useful for finding understatements.
(iii) A large percentage error in a small transaction can significantly
increase the computed error limit.
(iv) It is very difficult to use MUS in a non-computerized environment, as
totalling the items in the population for the purpose of finding out the
particular item in which the selected rupee falls is an onerous task.
(v) MUS is more time consuming than other sampling plans, as the number of
sampling units (rupees) is higher than in attribute sampling (physical
units like voucher, cheque).

Steps involved in MUS

The stages in MUS are:
(a) determining the sample size;
(b) selecting the sample for performing substantive test of details; and
(c) evaluation of sample test results.

Steps involved in performing these stages are explained below:

(a) Determining the Sample Size
The steps involved in determining the sample size are as follows:
(i) Set the Upper Error Limit (UEL) (mostly equal to materiality).
(ii) Subtract the estimated Most Likely Error (MLE) (usually based on prior
knowledge, i.e. the results of last year's audit; in the absence of sound
prior knowledge approximately 15-20 percent of materiality is a good rule
of thumb).
(iii) Subtract Precision Gap Widening (PGW) (approximately 1/2 materiality
is a good rule of thumb, but can vary depending upon the number and
magnitude of errors anticipated and the assurance level).
(iv) Obtain Basic Precision (BP): BP = UEL - MLE - PGW.
(v) For the given assurance level, use the Assurance table to determine the
basic precision factor.
(vi) Calculate the sampling interval by using the formula: Average Sampling
Interval (ASI) = BP/BP factor.
(vii) Deduct the high value and key value items from the total population
to arrive at the estimated representative population.
(viii) Calculate the sample size by using the formula: Sample size =
representative population / average sampling interval.

Example 7.1.1

An auditor is performing substantive tests on the valuation of buses in the
company. The total value of the fleet is Rs.25,000,000. The materiality
limit established for the audit is Rs.1,000,000. The auditor anticipates an
error of Rs.250,000 (based on past experience). The auditor estimates the
precision gap widening at Rs.150,000. The auditor wants 90 percent
assurance from the substantive tests. The auditor has identified high-value
items of Rs.1,000,000 and key-value items of Rs.500,000 in the population.
Using the table of basic precision and precision gap widening factors
(Appendix - I), the basic precision, average sampling interval and
representative sample size are calculated as explained below:

Calculation of Basic Precision

1. Upper Error Limit (equal to materiality)        Rs.1,000,000
2. Less: anticipated most likely errors            Rs.250,000
3. Estimated precision available (1-2)             Rs.750,000
4. Less: estimated precision gap widening          Rs.150,000
5. Basic precision (3-4)                           Rs.600,000 (A)

Calculation of Average Sampling Interval

1. Basic precision for audit                       Rs.600,000
2. Assurance level expected                        90%
3. Basic precision factor (from table for 90%)     2.3
4. Average sampling interval (1/3)                 Rs.261,000 (B)

Calculation of Sample Size

1. Average sampling interval, (B) as above         Rs.261,000
2. Total population                                Rs.25,000,000
3. Less: High-value items                          Rs.1,000,000
4. Less: Key-value items                           Rs.500,000
5. Representative test population                  Rs.23,500,000
6. Representative sample size (5/1)                90

Thus the representative sample size for substantive test of details will be
90.

(b) Selecting the Sample
Of the four sample selection methods described in Session 5.1 (Basic
concepts of statistical sampling), the two methods that are used in MUS are
systematic selection and cell selection. In systematic selection, one or
two items are selected randomly and the average sampling interval is added
to arrive at the other items to be selected. In cell selection, the
population is divided into various cells and one item is selected from each
cell. Systematic selection ensures that all items whose value is higher
than the average sampling interval are automatically selected. In cell
selection, all items whose value is more than twice the sampling interval
are selected automatically. Thus, the chances of a high-value but
infrequent error escaping audit scrutiny are avoided. After selecting the
sample, refer to Handout 7.1.3 and go through it before we discuss it in
detail.

(c) Evaluating Results of Substantive Test of Details
The steps involved in evaluating the results of substantive test of details
are:
(i) Calculate the percentage of tainting for individual items.
(ii) Add the individual tainting percentages to arrive at the net tainting
percentage.
(iii) Calculate the Most Likely Error (MLE) for the representative sample by
using the formula: MLE = Net tainting percentage * Average Sampling
Interval (ASI).
(iv) Add the errors in the high-value and key-value items to the MLE to
arrive at the total most likely error as follows: Total most likely error =
MLE + error in high value items + error in key value items.
(v) Arrange the taintings according to the percentage of tainting, in
descending order, for overstatement and understatement separately. Only
taintings found in the representative sample are to be used for this
purpose; taintings in high-value and key-value items are to be excluded.
(vi) Calculate the tainting adjusted PGW factor for each tainting sorted as
at (v).
Tainting adjusted PGW factor = Tainting percentage * PGW factor.
(vii) Total the tainting adjusted PGW factors for overstatement and
understatement separately.
(viii) Calculate the PGW for overstatements and understatements separately:
PGW = sum of tainting adjusted PGW factors * ASI.
(ix) Calculate the UELs for overstatement and understatement separately,
where MLE is the total most likely error from step (iv):
UEL(overstatement) = MLE + BP + PGW
UEL(understatement) = MLE - BP - PGW

Example 7.1.2

In Example 7.1.1 discussed earlier, assume that the auditor has identified
the following errors:

Item No.    Book value    Audited value
Representative sample:
14          Rs.5,000      Rs.3,000
24          Rs.7,500      Rs.1,500
16          Rs.4,000      Rs.6,000
High value:
28          Rs.30,000     Rs.21,000

Using the basic precision and precision gap widening factors table
(Appendix 7.1-A), the total most likely error, the upper error limits for
overstatement and understatement and the audit conclusion can be arrived at
as explained below:

Calculation of Total Most Likely Error

1. Net sample error tainting will be:

Item number   Book value   Audit value   Error        Tainting
14            Rs.5,000     Rs.3,000      Rs.2,000     40%
24            Rs.7,500     Rs.1,500      Rs.6,000     80%
16            Rs.4,000     Rs.6,000      (Rs.2,000)   (50%)
Net tainting                                          70%

2. Net most likely error for the representative sample will be:
Net tainting percentage x Average sampling interval

We know that the average sampling interval is Rs.261,000 (B) in Example
7.1.1.

70% x Rs.261,000 = Rs.182,700

3. Error in high-value item = Rs.30,000 - Rs.21,000 = Rs.9,000
4. Total most likely error (2+3) = Rs.191,700 (X)

Calculation of Upper Error Limits

1. Tainting adjusted Precision Gap Widening factors

(a) Overstatements

Taintings   Ranked by size   PGW factors (from table)   Tainting adjusted PGW factors
40%         2nd              0.43                       40% x 0.43 = 0.172
80%         1st              0.59                       80% x 0.59 = 0.472

(b) Understatements

50%         1st              0.59                       50% x 0.59 = 0.295

2. Precision Gap Widening

PGW = Sum of tainting adjusted PGW factors x Average sampling interval

(a) Overstatements = (0.172 + 0.472) x 261,000 = 0.644 x 261,000
= Rs.168,084 (Y)

(b) Understatements = 0.295 x 261,000 = Rs.76,995 (Z)

3. Upper Error Limits

Basic Precision is Rs.600,000 as worked out at (A) in Example 7.1.1

                              Overstatements   Understatements
                              Rs.              Rs.
Basic Precision               600,000          (600,000)
PGW                           168,084          (76,995)
                              (Y) as above     (Z) as above
Total Precision               768,084          (676,995)
Add total most likely error   191,700          191,700
                              (X) as above
Upper Error Limits            959,784          (485,295)

Audit Conclusions

The most likely error in the population is an overstatement of Rs.191,700.
The UELs at 90 percent confidence are that the overstatement is at most
Rs.959,784 and the understatement is at most Rs.485,295. As the UELs are
less than the materiality limit, the auditor can conclude that there is no
material error. If the UEL is more than the materiality, then the auditor
will have to either conclude that there is a material misstatement in the
accounts or increase the quantum of substantive test of details.

Thus MUS is used both at the stage of planning the sample size and
subsequently at the stage of evaluation of sample results of substantive
test of details. We will now discuss Handout 7.1.3, performing substantive
test of details.

Summary

The key points that were discussed in the session are:
• Monetary Unit Sampling (MUS): advantages, limitations and relevance of
MUS for substantive test of details
• Steps involved in MUS
• Determining the sample size for substantive test of details
• Selecting the sample for performing substantive test of details and
• Evaluation of results of substantive test of details.
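The planning arithmetic of Example 7.1.1 can be retraced with a short Python sketch. The function name and argument names are illustrative; the basic precision factor of 2.3 for 90 percent assurance is the one quoted in the example, and rounding the interval to the nearest thousand rupees and the sample size to the nearest whole item are assumptions chosen to match the figures in the example.

```python
def mus_sample_plan(materiality, expected_mle, pgw_allowance, bp_factor,
                    population, high_value, key_value, round_to=1000):
    """Work through the sample size steps for a substantive test of details."""
    # Basic precision: BP = UEL - MLE - PGW, with UEL set equal to materiality
    basic_precision = materiality - expected_mle - pgw_allowance
    # Average sampling interval: ASI = BP / BP factor (rounded as in the example)
    interval = round(basic_precision / bp_factor / round_to) * round_to
    # High-value and key-value items are audited fully, so they leave the population
    representative = population - high_value - key_value
    # Sample size = representative population / average sampling interval
    sample_size = round(representative / interval)
    return {
        "basic_precision": basic_precision,
        "interval": interval,
        "representative_population": representative,
        "sample_size": sample_size,
    }


plan = mus_sample_plan(
    materiality=1_000_000, expected_mle=250_000, pgw_allowance=150_000,
    bp_factor=2.3, population=25_000_000, high_value=1_000_000, key_value=500_000,
)
for name, value in plan.items():
    print(name, value)
```

An auditor who prefers a conservative plan could round the sample size up instead of to the nearest whole number; that choice is not specified in the example.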
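The evaluation steps (i) to (ix) can likewise be applied to the errors of Example 7.1.2 in a short Python sketch. The function name is illustrative, and the PGW factors 0.59 and 0.43 for the first and second ranked taintings are those quoted in the example for 90 percent assurance. Multiplying the unrounded factor sum 0.644 by the interval gives an overstatement PGW of Rs.168,084 and an overstatement UEL of Rs.959,784; rounding the sum to 0.64 first would give Rs.167,040 and Rs.958,740.

```python
def evaluate_mus(sample_items, high_value_items, interval, basic_precision, pgw_factors):
    """Project errors found in a MUS sample to most likely error and upper error limits.

    sample_items, high_value_items -- (book value, audited value) pairs that had errors
    pgw_factors -- PGW factors for the 1st, 2nd, ... ranked tainting (from the table)
    """
    # Steps (i)-(iii): taintings, net tainting and MLE of the representative sample
    taintings = [(book - audited) / book for book, audited in sample_items]
    sample_mle = sum(taintings) * interval
    # Step (iv): add the errors found in fully audited high-value / key-value items
    total_mle = sample_mle + sum(book - audited for book, audited in high_value_items)

    def pgw(ranked):
        # Steps (vi)-(viii): tainting adjusted PGW factors, totalled, times the interval
        if len(ranked) > len(pgw_factors):
            raise ValueError("not enough PGW factors for the number of errors")
        return sum(t * f for t, f in zip(ranked, pgw_factors)) * interval

    # Step (v): rank over- and understatement taintings separately, largest first
    over = sorted((t for t in taintings if t > 0), reverse=True)
    under = sorted((-t for t in taintings if t < 0), reverse=True)
    pgw_over, pgw_under = pgw(over), pgw(under)
    # Step (ix): upper error limits
    return {
        "total_mle": total_mle,
        "pgw_over": pgw_over,
        "pgw_under": pgw_under,
        "uel_over": total_mle + basic_precision + pgw_over,
        "uel_under": total_mle - basic_precision - pgw_under,
    }


result = evaluate_mus(
    sample_items=[(5000, 3000), (7500, 1500), (4000, 6000)],
    high_value_items=[(30000, 21000)],
    interval=261_000, basic_precision=600_000, pgw_factors=[0.59, 0.43],
)
for name, value in result.items():
    print(name, round(value))
```

Both upper error limits stay below the materiality of Rs.1,000,000 in absolute terms, so the audit conclusion of the example holds under either rounding.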
Application of Statistical Sampling in Performance Audit of
National Rural Employment Guarantee Act (MGNREGA)
NREGA Background information
• MGNREGA: All-India coverage of rural areas.
• Administrative set-up in the country: State level, District level,
Block level, GP level, Beneficiary level.
• Through MGNREGA, the govt. committed to provide employment (at min.
wages) to every rural family which demands such work and whose adult
members volunteer to do such work.
• Objectives: (i) enhancement of livelihood security of households (hhs)
in rural areas by providing guaranteed wage employment; (ii) creation of
durable assets.
• Principal implementing agency is the Gram Panchayat (GP).
Audit Objectives
To assess whether:
• Structural mechanisms were in place and adequate capacity building
measures were taken by the Central and state govts. for implementation of
the Act;
• Procedures for preparing perspective (long term) and annual plans at
different levels for estimating the likely demand for work were adequate
and effective;
• Funds were released, accounted for and utilised by the governments in
compliance with the provisions of the Act and other extant rules;
• The process of registration of households, allotment of job cards and
allocation of employment in compliance with the Act and rules was
effective;
• Livelihood security was provided by giving 100 days of employment to hhs
in rural areas on demand, and wages as declared were paid;
• MGNREGS works were efficiently and effectively executed in a time-bound
manner and in compliance with the Act and Rules, and durable assets were
created, maintained and accounted for;
• Convergence of the Scheme with other rural development programmes as
envisaged was effectively achieved in enhancing the employment
opportunities under MGNREGS;
• All required records at various levels were properly maintained and
MGNREGS MIS data was accurate, reliable and timely;
• Transparency was maintained by involving all stakeholders in various
stages of its implementation;
• An effective mechanism existed at Centre and state level to assess the
impact of MGNREGS on individual households, local labour market, migration
cycle and efficacy of assets created.
Sampling plan
• Selection of Districts (1st Stage Unit): Each state was stratified into
2-5 strata depending on geographical contiguity; within each stratum 25% of
districts were selected by SRSWOR, subject to a minimum of 2 districts.
• Selection of Rural Blocks (2nd SU): Within each selected district, 2-3
blocks were selected, again by SRSWOR (2 blocks if the no. of blocks in the
selected district is < 10).
• Selection of GPs (3rd SU): Within each selected block, 25% of GPs (max.
10) were selected by PPSWOR with size measure as number of job cards or any
other similar proxy parameter such as no. of applicants under NREGA, no. of
BPL population or population size.
• Selection of Works (Final SU - I): For each selected GP, 10 works
(including incomplete works and works sanctioned in different years) were
selected using SRSWOR; care was taken to select different types of works
such as rural connectivity, afforestation, canal works, wasteland
development etc.
• Selection of beneficiaries (Final SU - II): For each selected GP, 10
beneficiaries were to be selected using systematic sampling (min. 2 SC/ST).
Thus a multi-stage sampling plan was used for each state.
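The first three stages of this multi-stage plan can be sketched in Python as follows. This is only an illustration under stated assumptions: all names and the district, block and GP data are made up, the 25% shares are rounded up, 3 blocks are taken when a district has 10 or more blocks, and PPSWOR is implemented as successive draws proportional to size, which is one of several possible PPSWOR schemes.

```python
import math
import random


def srswor(units, k, rng):
    """Simple random sample of k units without replacement."""
    return rng.sample(list(units), min(k, len(units)))


def ppswor(sizes, k, rng):
    """Draw k units one at a time with probability proportional to size,
    removing each drawn unit before the next draw (assumed PPSWOR scheme)."""
    pool = dict(sizes)
    chosen = []
    for _ in range(min(k, len(pool))):
        names = list(pool)
        pick = rng.choices(names, weights=[pool[name] for name in names])[0]
        chosen.append(pick)
        del pool[pick]
    return chosen


def select_districts(strata, rng):
    """25% of the districts in each stratum by SRSWOR, subject to a minimum of 2."""
    selected = []
    for districts in strata:
        k = max(2, math.ceil(0.25 * len(districts)))   # rounding up is an assumption
        selected.extend(srswor(districts, k, rng))
    return selected


def select_blocks(blocks, rng):
    """2 blocks if the district has fewer than 10 blocks, otherwise 3 (assumed)."""
    return srswor(blocks, 2 if len(blocks) < 10 else 3, rng)


def select_gps(job_cards_by_gp, rng):
    """25% of GPs (maximum 10) by PPSWOR, with the number of job cards as size."""
    k = min(10, math.ceil(0.25 * len(job_cards_by_gp)))
    return ppswor(job_cards_by_gp, k, rng)


rng = random.Random(2024)     # fixed seed so that the draw can be reproduced
strata = [[f"D{i}" for i in range(12)], [f"E{i}" for i in range(3)]]   # made-up districts
print(select_districts(strata, rng))
print(select_blocks([f"B{i}" for i in range(8)], rng))
print(select_gps({f"GP{i}": 100 + 10 * i for i in range(60)}, rng))
```

Recording the seed alongside the selection makes the sample reproducible, which helps when the selection methodology has to be reported.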
Scope for Improvement
Observations by EPoD [Evidence for Policy Design], Centre for International
Development (CID), Harvard Univ., USA, after finalisation of audit:
• The selection process used (i) ensured equal probability to each district
and (ii) neglected the wide variation in NREGA spending (varies b/w Rs. 0
and 338 crores) and rural population (varies b/w 5620 and 61 lakhs) of
districts.
• FAAM guidelines indicate, “Where either monetary values or assessed risk
vary widely, alternative methods are preferred to SRS”.
Approaches discussed by EPoD, CID
• 3 indicators, viz. (i) NREGA Exp. (MUS), (ii) Rural Population and (iii)
Person Days Worked, could have been used for selection of districts.
• They would have reflected, respectively, (i) each rupee of spending being
equally likely to be selected, (ii) the potential population which
qualifies for NREGA and (iii) the intensity of implementation in a
district.
• Districts could have been stratified based on one of these 3 variables,
with the 4th quartile (top 25% values) assigned a high probability of
selection, up to 100%.
• Selecting one district from states where no district is selected by the
above technique, so that each state is represented in the sample.
• The rural population approach would have uncovered a greater number of
areas with high rural population but low MGNREGA coverage.
• The person days sample has 2 shortcomings compared to the total Exp.
approach: (i) it may cover fewer districts where material expenses exceeded
40% of total NREGA Exp.; (ii) the proportion of households reaching 100
days of work is artificially high.
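A quartile-based selection of districts of the kind discussed by EPoD could look like the following Python sketch. It is a hypothetical illustration, not the method EPoD actually applied: the function name, the made-up spending figures, taking the top quartile with certainty (the "up to 100%" case) and sampling 25% of the remaining districts by SRSWOR are all assumptions.

```python
import random


def select_districts_by_spend(spend_by_district, state_of, rest_fraction=0.25, rng=None):
    """Select districts with certainty in the top spending quartile, sample the
    rest, and top up so that every state is represented at least once."""
    rng = rng or random.Random()
    ranked = sorted(spend_by_district, key=spend_by_district.get, reverse=True)
    cut = max(1, len(ranked) // 4)
    top, rest = ranked[:cut], ranked[cut:]
    chosen = list(top)                                   # 4th quartile: selected with certainty
    chosen += rng.sample(rest, round(rest_fraction * len(rest)))   # SRSWOR among the others
    covered = {state_of[d] for d in chosen}
    for state in sorted(set(state_of.values()) - covered):
        # one district from each state that the steps above left unrepresented
        chosen.append(rng.choice([d for d in ranked if state_of[d] == state]))
    return chosen


# Made-up data: 16 districts in 4 states, spending rising with the district number
spend = {f"D{i}": 100 * (i + 1) for i in range(16)}
states = {f"D{i}": "ABCD"[i // 4] for i in range(16)}
print(select_districts_by_spend(spend, states, rng=random.Random(7)))
```

The same function could be run with rural population or person days worked in place of spending to compare the three indicators.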
Potential Extensions of methodology
• 100% sampling in key areas, such as districts with highest exp.
• Sampling on multiple variables, such as rural population and NREGA Exp.
taken together
• Using flags to sample key districts, such as districts where material
expenses exceed 40%
• Purposeful selection of works near the Pradhan’s residence.

