www.elsevier.com/locate/chemolab
Received 28 August 2003; received in revised form 6 April 2004; accepted 28 May 2004
Abstract
Parts II and III of this series are initiated by a joint discussion of features related to the lot. Part II then delineates the central elements of
the Theory of Sampling for zero-dimensional objects. It is necessary to be brief within the limited format of the present tutorial series, but all
essential model rigour has been maintained. An attempt has been made to focus on the central mathematical theoretical core of TOS while
also showing how this relates directly to sampling practice (materials, equipment and procedures). A highlight of the latter issue concerns
experimental estimation of the Fundamental Sampling Error (FSE). Part II is also fundamental for further developments in Part III, as it
presents a complete overview discussion of the basic sampling operation of the one-dimensional object as well.
© 2004 Elsevier B.V. All rights reserved.
Keywords: Discrete; Quantitative approach; Zero-dimensional objects; Sampling of Particulate matter; Theory of Sampling
1. Joint introduction of parts II and III: three-, two-, one-, zero-dimensional models

- Strictly speaking, all material objects, lots L, occupy a three-dimensional Cartesian space. From a practical as well as a theoretical standpoint, however, it may be useful to represent a physical object by a model of a smaller number of dimensions.
- A three-dimensional model alone can represent bulky lots L, e.g., an ore body and similar.
- Flat objects, such as a sheet of paper or a steel sheet, the thickness of which is:
  - small in comparison with the two dimensions of its surface,
  - practically uniform (with a tolerance of, say, 20%),
  can often be well represented by a two-dimensional model. From a physical and mathematical standpoint, every element of the object is represented by its projection on a plane (often horizontal). We often have occasion to work on lots L, which can be considered as practically two-dimensional.
- Elongated objects such as a rail, a cable or a flux of matter whose length is:
  - very large in comparison with the two dimensions of its cross-section,
  - practically uniform (with a tolerance of, say, 20%),
  can be well represented by a one-dimensional model. From a physical and mathematical standpoint, the lot is here represented by its projection on the axis of elongation.
- Discrete objects such as lots made up of a large number of unspecified units, assumed to be independent from one another, i.e., populations of non-ordered units, can, by extension and by convention, be defined as zero-dimensional objects. There are two cases:
  - Unit masses are more or less uniform (with a tolerance of, say, 20%): here, conventional statistics can be applied.
  - No hypothesis of uniformity of the unit mass is made: conventional statistics cannot be applied. We shall here deal exclusively with this most realistic case.
0169-7439/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.chemolab.2004.05.015
26 P. Gy / Chemometrics and Intelligent Laboratory Systems 74 (2004) 25–38
- Bulky, flat or elongated objects have a common property: their constituent elements are physically tied or correlated to one another. They are therefore liable, often likely, to show an essential spatial correlation within their specific dimensional framework.

The prototype of a bulky three-dimensional compact object is a mineral deposit. Following the works of Krige, Sichel, de Wijs and more generally the South African school of mathematicians who were particularly interested in the first half of the 20th century in the gold deposits found in their country, the French mining engineer Georges Matheron (see Ref. [12]) founded a new science known as "geostatistics": the science of evaluation of mineral deposits. Michel David, another French mining engineer living in Canada (see Ref. [13]), helped significantly to diffuse the knowledge of geostatistics in English-speaking countries. In the present work, we shall, however, not deal much with this science, which today is widely known, acknowledged and respected, and taught everywhere. In this series, we will specifically study only the sampling of zero- and one-dimensional objects.

Whereas the qualitative approach presented in Part I was common to all multi-elemental objects susceptible of being sampled (which excludes only compact solids), the present quantitative approach involves two different mathematical models corresponding to sampling of populations (zero-dimensional objects) or to time series (one-dimensional objects) respectively. It will be necessary to remind the reader of a few salient definitions (extracted from Section 4 of Part I).

1.1. Definitions and notations

- LOT L: From a theoretical standpoint, a lot L of discrete material can be regarded as a set of units. We must take into consideration two kinds of sets.
- SET: The set can be:
  - either a population of non-ordered units (e.g., a batch of stationary material),
  - or a series of ordered units (e.g., a sequence of elementary cross-sections of a flowing stream. The order is then chronological; ordered spatial series can very often be treated with exactly the same mathematical apparatus).

Different mathematical laws apply to these two kinds of sets. It is a grave mistake, which is often very costly, to use inappropriate laws or formulas!

A population of non-ordered units is, by convention, described as a zero-dimensional object. The absence of [...] The zero-dimensional model presented here in Part II deals with this problem.

A series of ordered units can be described as a one-dimensional object. With a flowing stream, the dimension is time: the order between consecutive units is chronological. There usually exists a correlation between the rank of a unit in the series and its composition. If such a correlation does not exist, the mathematical model detects it. The one-dimensional model presented in Part III of this article deals with this problem.

Hence, the subdivision of the quantitative approach:

Part II. Sampling of zero-dimensional objects
Part III. Sampling of one-dimensional objects

In Section 5 of Part I, we defined the concepts of homogeneity and heterogeneity. According to these definitions, sampling would be an exact process if the material being sampled indeed was homogeneous. Unfortunately (but fortunately for the theoretician), homogeneity is a concept that can be defined mathematically but which cannot be observed in the real world. Matter is de facto always heterogeneous and all sampling errors result from one form or another of heterogeneity. This justifies our objective of quantifying the various forms of heterogeneity and looking for tractable relationships between sampling errors and quantified heterogeneities.

2. Zero-dimensional model: contribution made to the heterogeneity of lot L by an unspecified unit $U_m$

In the zero-dimensional model, the lot L can be regarded in general terms as a population of unspecified units $U_m$ which will be defined as the case requires. A is the component of interest. The grade, or concentration, of component A in a given object (lot L, unit U, sample S, etc.) refers to its actual but unknown grade, not to some estimate of it. The following definitions are made:

- $N_U$: number of units $U_m$ in lot L.
- $M_m$: mass of unit $U_m$ (with wet materials, $M_m$ is generally the mass of dry solids).
- $A_m$: mass of component A in unit $U_m$.
- $a_m$: mass proportion of component A in unit $U_m$ (i.e., grade of $U_m$), defined by the identity

$$A_m = a_m M_m \qquad (1)$$

- $M_L$: mass of lot L. $M_L$ is the sum $\sum_m$ of the masses $M_m$ of the $N_U$ units $U_m$, where $\sum_m$ is a sum extended from $m = 1$ to $m = N_U$:

$$M_L = \sum_m M_m \qquad (2)$$
- $M_{m^*}$: mass of the average unit $U_{m^*}$ of lot L, defined as the average of the masses $M_m$:

$$M_{m^*} = \frac{\sum_m M_m}{N_U} = \frac{M_L}{N_U} \qquad (4)$$

- $A_{m^*}$: mass of component A in the average unit $U_{m^*}$ of lot L:

$$A_{m^*} = \frac{A_L}{N_U} \qquad (5)$$

- $a_L$: proportion of component A in lot L (i.e., grade of L), defined as the weighted mean:

$$a_L = \frac{\sum_m a_m M_m}{\sum_m M_m} = \frac{A_L}{M_L} \qquad (6)$$

- $h_m$: contribution to the heterogeneity of lot L made by unit $U_m$, defined as follows:

$$h_m = \frac{a_m - a_L}{a_L}\,\frac{M_m}{M_{m^*}} \quad \text{(dimensionless)} \qquad (7)$$

We note especially that $h_m$ is physically dimensionless and that the sum (and mean) of $h_m$ is zero:

$$\sum_m h_m = 0 \qquad (8)$$

Our attention had been drawn to the quantity $h_m$ when formulating our first "equiprobable" model as early as 1950. Indeed, the variance of the sampling error was found to be proportional to the variance of $h_m$ within the population of $N_U$ units (fragments) $U_m$. The contribution $h_m$ made by unit $U_m$ to the overall heterogeneity of lot L appears therefore as a true vector of sampling errors (almost in the medical sense of the term): the contribution of each unit to the total sampling variance. Our objective, in the more recent presentations of the sampling theory, is to quantify the heterogeneity of the lot L to be sampled and to establish simple, algebraic relationships between the sampling variance(s) and the quantified expressions of the various forms of heterogeneity. We will now define the cardinal concepts involved in this task, $CH_L$ and $DH_L$.

3. First case: unit $U_m$ is a single element $F_i$. Definition of the Constitutional Heterogeneity $CH_L$ of lot L

We here choose, as a first didactic step, to define the unit $U_m$ as comprised by a single element $F_i$, i.e., a solid fragment (hence the origin of the notation, F), a grain, a molecule or an ion.

The Constitutional Heterogeneity $CH_L$ of L is defined as the variance of $h_i$ in the population of $N_F$ elements making up the lot L (with $i = 1, 2, \ldots, N_F$):

$$CH_L = \sigma^2(h_i) = \frac{1}{N_F}\sum_i h_i^2, \quad \text{remembering that } \sum_i h_i = 0 \qquad (9)$$

Like $h_i$, $CH_L$ is physically dimensionless. It can be observed that $CH_L$ is an intrinsic property of L, i.e., of the population of $N_F$ elements that make up the lot at the moment of its sampling. By extension, we will also regard it as a property of the material of which L is made up. With particulate solids, crushing and grinding can increase the value of $CH_L$. Agglomeration can reduce it, but mixing or homogenizing has no effect.

The scale of observation is now such that unit $U_m$ is a group $G_n$ of neighboring elements $F_i$ (with $n = 1, 2, \ldots, N_G$). The group $G_n$ forms a potential increment, I.

The Distributional Heterogeneity $DH_L$ of L is defined as the variance of $h_n$ in the population of $N_G$ groups (potential increments) filling up the domain occupied by lot L:

$$DH_L = \sigma^2(h_n) = \frac{1}{N_G}\sum_n h_n^2, \quad \text{remembering that } \sum_n h_n = 0 \qquad (10)$$

$DH_L$ is physically dimensionless. Unlike $CH_L$, the Distributional Heterogeneity $DH_L$ is not an invariable property of the population of the $N_F$ elements that make up the lot at the moment of its sampling. Any mixing, blending, homogenizing or segregation, whether spontaneous or purposeful, alters the critical element distribution throughout the domain occupied by lot L. While mixing, blending and homogenizing tend to reduce $DH_L$, segregation increases $DH_L$.

5. Relationship between $CH_L$ and $DH_L$

An important theoretical development, shown in Ref. [18], is that $DH_L$ is proportional to $CH_L$:

$$DH_L = \frac{1 + YZ}{1 + Y}\,CH_L \qquad (11)$$

- Y is a dimensionless grouping parameter that characterizes the size of the groups $G_n$ which, when sampling lot L, will become the increments $I_q$:

$$Y \ge 0 \qquad (12)$$
$$1 \ge Z \ge 0 \qquad (14)$$
Fig. 1. Schematic population of 144 units. Illustration of a completely homogeneous material: all units are strictly identical.

Fig. 3. Schematic population of 144 units. Illustration of a completely modular material.
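The definitions of $h_m$, $CH_L$ and $DH_L$ (Eqs. (6) to (10)) can be tried out on a small numerical example. The sketch below is only an illustration: the six unit masses and grades are invented, the helper names are hypothetical, and the groups are formed by simply pairing consecutive units, which is one assumption among many possible ways of defining potential increments.

```python
# Toy lot, invented for illustration: unit masses (g) and grades (mass fraction of A).
masses = [1.0, 2.0, 1.5, 0.5, 3.0, 2.0]
grades = [0.10, 0.02, 0.05, 0.30, 0.01, 0.04]


def lot_grade(masses, grades):
    """Weighted mean grade a_L = sum(a_m * M_m) / sum(M_m), Eq. (6)."""
    return sum(a * m for a, m in zip(grades, masses)) / sum(masses)


def contributions(masses, grades):
    """Heterogeneity contributions h_m of Eq. (7)."""
    mean_mass = sum(masses) / len(masses)  # mass of the average unit, Eq. (4)
    a_lot = lot_grade(masses, grades)
    return [(a - a_lot) / a_lot * m / mean_mass for a, m in zip(grades, masses)]


def heterogeneity(masses, grades):
    """Variance of h over the population; the mean of h is zero, Eq. (8)."""
    h = contributions(masses, grades)
    return sum(x * x for x in h) / len(h)


def group_units(masses, grades, size):
    """Merge consecutive units into groups of `size` (hypothetical increments)."""
    group_masses, group_grades = [], []
    for start in range(0, len(masses), size):
        m = masses[start:start + size]
        a = grades[start:start + size]
        group_masses.append(sum(m))
        group_grades.append(lot_grade(m, a))
    return group_masses, group_grades


h = contributions(masses, grades)
ch_lot = heterogeneity(masses, grades)                   # CH_L, Eq. (9)
dh_lot = heterogeneity(*group_units(masses, grades, 2))  # DH_L, Eq. (10)
print(f"sum of h_m = {sum(h):.2e}")
print(f"CH_L = {ch_lot:.4f}, DH_L = {dh_lot:.4f}")
```

With groups of equal size, each $h_n$ is the mean of the $h_i$ of its members, so $DH_L$ cannot exceed $CH_L$ in this sketch; reordering the units before grouping changes $DH_L$ but leaves $CH_L$ untouched.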
[...] theoretical developments. In practical applications, it is replaced by the Heterogeneity Invariant, $HI_L$, which can nearly always be estimated, at least to its order of magnitude. It is defined as follows:

$$HI_L = CH_L\,\frac{M_L}{N_F} \quad \text{(physical dimension of a mass, expressed in the same unit as } M_L\text{)} \qquad (15)$$

8. The zero-dimensional probabilistic sampling model

This model has been presented in English only once, in Ref. [18] (a 700-page book that nobody reads (sic)), but never before published in an accessible scientific journal. It will perhaps interest the more fundamentally inclined readers of Chemometrics and Intelligent Laboratory Systems to have access to it here. As mentioned, the original demonstration proposed in 1950 was rather awkward and we are very pleased to have this opportunity to present the generalized probabilistic model, Sections 9–15 below. Sections 16–20 pick up the present track again and end the general Part II.

[...] probabilities $P_m$ and that, after each trial, a census is made of the units selected. The lot L is reconstituted prior to the next trial.

- $K$: number of repeated selection trials. K is meant to tend towards infinity.
- $S_k$: sample obtained at the end of the kth trial ($k = 1, 2, \ldots, K$).
- $N_k$: number of units in sample $S_k$.
- $M_k$: mass of active components in sample $S_k$.
- $A_k$: mass of critical component in sample $S_k$.
- $a_k$: critical content of sample $S_k$, with:

$$a_k = A_k / M_k \qquad (16)$$

- $SET_K$: imaginary set obtained by gathering all samples $S_k$ obtained at the end of the Kth trial. Beware of a possible confusion between k and K.
- $N_K$: number of units in $SET_K$.
- $M_K$: mass of active components in $SET_K$.
- $A_K$: mass of critical component in $SET_K$.
- $a_K$: critical content of $SET_K$, with:

$$a_K = A_K / M_K \qquad (17)$$

- $P_m$: number of units $U_m$ observed in $SET_K$.
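The repeated-trial construction above can be mimicked with a short simulation. This is a minimal sketch under stated assumptions, not the author's procedure: the lot is invented, each unit is assumed to be selected independently with its own probability $P_m$, and the function names are hypothetical.

```python
import random

# Toy lot, invented for illustration: unit masses (g) and grades (mass fraction of A).
masses = [1.0, 2.0, 1.5, 0.5, 3.0, 2.0]
grades = [0.10, 0.02, 0.05, 0.30, 0.01, 0.04]
probabilities = [0.5] * len(masses)  # P_m, one selection probability per unit


def run_trials(probabilities, n_trials, seed=0):
    """Repeat the selection K = n_trials times; the lot is whole again before each trial."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_trials):
        # Assumption: each unit is selected independently with probability P_m.
        samples.append([m for m, p in enumerate(probabilities) if rng.random() < p])
    return samples


def mass_and_component(sample):
    """Return (M_k, A_k) for one sample given as a list of unit indices."""
    mass = sum(masses[m] for m in sample)
    component = sum(grades[m] * masses[m] for m in sample)
    return mass, component


samples = run_trials(probabilities, n_trials=2000)
totals = [mass_and_component(s) for s in samples]
mass_set = sum(m for m, _ in totals)         # M_K
component_set = sum(a for _, a in totals)    # A_K
content_set = component_set / mass_set       # a_K = A_K / M_K, Eq. (17)
units_in_set = sum(len(s) for s in samples)  # N_K
print(f"N_K = {units_in_set}, a_K = {content_set:.4f}")
```

Working with the pooled set $SET_K$ avoids dividing by the mass of an individual sample, which may be zero in a trial where no unit happens to be selected.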
- Distribution of the number $N_K$ of units in $SET_K$: $N_K$ can be expressed in two ways:
  - $N_K$ is the sum of the K values of $N_k$:

$$N_K = \sum_k N_k \quad \text{with } k = 1, 2, \ldots, K \qquad (23)$$

The K values have the same distribution, which implies:

$$m(N_K) = K\,m(N_k) \qquad (24)$$

$$\sigma^2(N_K) = K\,\sigma^2(N_k) \qquad (25)$$

- Distribution of the mass $M_k$ of active components in sample $S_k$: From Eqs. (32) and (35) on the one hand and Eqs. (33) and (36) on the other:

$$m(M_k) = \sum_m M_m P_m \qquad (37)$$

$$\sigma^2(M_k) = \sum_m M_m^2 P_m (1 - P_m) \qquad (38)$$

- Distribution of the mass $A_K$ of critical component in $SET_K$: This mass can be expressed in two ways: [...]
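Eqs. (37) and (38) can be checked exactly on a very small lot by enumerating every possible selection outcome. The sketch below assumes, as the form of Eq. (38) suggests, that each unit is selected independently with probability $P_m$; the masses and probabilities are invented and the function name is hypothetical.

```python
from itertools import product

# Invented example: four units with unequal masses and selection probabilities.
masses = [1.0, 2.0, 1.5, 0.5]
probabilities = [0.2, 0.5, 0.7, 0.9]


def exact_moments(masses, probabilities):
    """Mean and variance of the sample mass M_k over all 2**N selection outcomes."""
    mean = 0.0
    second_moment = 0.0
    for outcome in product((0, 1), repeat=len(masses)):
        # Probability of this outcome under independent selection of each unit.
        weight = 1.0
        for picked, p in zip(outcome, probabilities):
            weight *= p if picked else 1.0 - p
        sample_mass = sum(m for m, picked in zip(masses, outcome) if picked)
        mean += weight * sample_mass
        second_moment += weight * sample_mass ** 2
    return mean, second_moment - mean ** 2


mean_enum, var_enum = exact_moments(masses, probabilities)
mean_formula = sum(m * p for m, p in zip(masses, probabilities))               # Eq. (37)
var_formula = sum(m * m * p * (1 - p) for m, p in zip(masses, probabilities))  # Eq. (38)
print(mean_enum, mean_formula)
print(var_enum, var_formula)
```

Enumeration is only practical for a handful of units, but it gives an exact reference against which the closed-form sums can be compared without any simulation noise.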
Geary (1930) and Bastien (1960) have studied the distribution of the quotient of two random variables and reached the following conclusions:

- In the most general case, the distribution law of this quotient does not follow any of the simple laws presented in textbooks of mathematical statistics, and there is no simple way of relating its moments to those of the numerator and denominator.
- The distribution of the quotient, however, tends towards a normal distribution when two conditions are fulfilled:
  (1) Both the numerator and the denominator follow normal or quasi-normal distributions,
  (2) The coefficient of variation (or relative standard deviation) of the denominator remains small in comparison with unity.
- When these two conditions are fulfilled, but only then, the expected value of the quotient is practically equal to the quotient of the expected values of its two terms:

$$m(a_k) \approx \frac{m(A_k)}{m(M_k)} \qquad (48)$$

- Normality of the distributions of $A_k$ and $M_k$. Discussion: The central limit theorem of Laplace–Liapounoff states that the characteristics (such as $A_k$ and $M_k$) of a sample $S_k$ tend to follow a normal distribution as soon as the number $N_k$ of units comprising the sample is large enough, and this property is valid irrespective of the distribution law of the individual characteristics.

The number of fragments in a sample of particulate solids is nearly always a large, often a very large number, to say nothing of the number of molecules and ions in a liquid for example. When the units are no longer single constituents but increments, their number is much smaller and we have to answer the more subjective question: how large is a "large number" in this context? If we reason in terms of consequences, we would say that a number is large when the addition or subtraction of one (more) unit does not alter the result significantly.

According to our experience, 30 could be regarded as a practical minimum and 50 as a reasonably large number of increments making up a sample. When this condition is fulfilled, therefore, we may conclude that the distributions of the numerator, of the denominator and of the quotient are approximately and reasonably normal.

- Relative standard deviation of $M_k$: From Eqs. (37)/(38), we can compute the value of

$$\frac{\sigma(M_k)}{m(M_k)} \qquad (49)$$

We must remind the reader that so far, we have made no assumption regarding the set of probabilities $P_m$. A long demonstration (see Ref. [18]) shows that in the general case, the condition involving the relative standard deviation is not met, but it is when the following conditions are fulfilled:

$$(1)\quad P_m = P = \text{constant, irrespective of } m \qquad (50)$$

$$(2)\quad P \ge P_o:\ \text{the probability of selection is not too small} \qquad (51)$$

Or in other words, when:

(1) The selection is correct,
(2) The average sample mass and with it the actual sample mass is not too small:

$$M_k \approx m(M_k) = P \sum_m M_m = P M_L \ge P_o M_L \qquad (52)$$

Then, the relative standard deviation of $M_k$ is small enough to be regarded as negligible in comparison with unity. The second condition of Bastien is fulfilled, and we may regard the distribution of $a_k$ as approximately normal.

One of the major conclusions of the qualitative approach (Part I) was that sampling simply had to be carried out correctly. That conclusion is confirmed and reinforced by the theoretical quantitative approach above.

For all practical intents and purposes, we may assume that the second condition is fulfilled when the sample is "reproducible enough", with the corollary that both Bastien's conditions are fulfilled when the sample is representative, and only then.

10. Expected value of the critical content $a_S$ of sample $S_k$

- The selection is not assumed to be correct: In order to render the text easier to read, we will write S instead of $S_k$ and adopt the following notations:

$$a^* = m(a_S); \quad A^* = m(A_S) = \sum_m a_m M_m P_m; \quad M^* = m(M_S) = \sum_m M_m P_m \qquad (53)$$

We will now define the relative deviations a, b, c as functions of the mean of $a_S$, $A_S$ and $M_S$, respectively, and write, for a given sample S:

$$a_S = (1 + a)\,a^*; \quad A_S = (1 + b)\,A^*; \quad M_S = (1 + c)\,M^* \qquad (54)$$

$$a_S = \frac{A_S}{M_S} = \frac{(1 + b)\,A^*}{(1 + c)\,M^*} = \frac{1 + b}{1 + c}\,a^* = (1 + a)\,a^* \qquad (55)$$

$$\text{Thence:}\quad (1 + a) = \frac{1 + b}{1 + c} \qquad (56)$$

In this expression, a, b, c are random variables. Our problem is to express a as a function of b and c. This
difficulty can be overcome by expanding the quotient $1/(1 + c)$ into a power series which is strictly convergent as long as c remains smaller than unity ($c < 1$), which we assume, and rapidly convergent when $c \ll 1$. We will write:

$$\frac{1}{1 + c} = 1 - c + c^2 - c^3 + c^4 - \ldots \qquad (57)$$

$$1 + a = (1 + b)\left[1 - c + c^2 - c^3 + c^4 - \ldots\right] \qquad (58)$$

$$a = (b - c) + (c^2 - bc) - (c^3 - bc^2) + \ldots \qquad (59)$$

In this series, each term between brackets is small in comparison with the former. We can therefore define several approximations:

$$\text{First approximation:}\quad a_1 = (b - c) \qquad (60)$$

$$\text{Second approximation:}\quad a_2 = (b - c) + (c^2 - bc) \qquad (61)$$

Or, after a few computations, in more explicit notations (see Ref. [18]):

First approximation:

$$m(a_S)_1 = a^* = \frac{\sum_m a_m M_m P_m}{\sum_m M_m P_m} \ne a_L = \frac{\sum_m a_m M_m}{\sum_m M_m} \qquad (62)$$

Second approximation:

$$m(a_S)_2 = m(a_S)_1 - \frac{\sum_m (a_m - a^*)\,M_m^2\,P_m (1 - P_m)}{\left(\sum_m M_m P_m\right)^2} \qquad (63)$$

- The selection is now assumed to be correct: $P_m = P = \text{constant}$.

$$\text{First approximation:}\quad m(a_S)_1 = a_L \qquad (64)$$

In the first approximation, a Correct Sampling is unbiased.

Second approximation:

$$m(a_S)_2 = a_L - \frac{1 - P}{P} \sum_m (a_m - a_L)\,\frac{M_m^2}{M_L^2} \qquad (65)$$

In the second approximation, a Structurally Correct Sampling Bias is detected.

11. Variance of the critical content $a_S$ of the Correct Sample, S

[...] The second approximation has been given in Ref. [18]. It is only used in theoretical studies.

12. Expected value of the Correct Sampling Error, CSE

Reminder of the general definition of the Total Sampling Error, TSE:

$$TSE = \frac{a_S - a_L}{a_L} \qquad (67)$$

If the selection is now assumed to be correct, $P_m = P = \text{constant}$, the Total Sampling Error, TSE, boils down to what is called the Correct Sampling Error, CSE.

$$\text{Hence}\quad m(TSE) = m(CSE) = \frac{m(a_S) - a_L}{a_L} \qquad (68)$$

First approximation, deduced from Eq. (64):

$$m(CSE)_1 = 0 \qquad (69)$$

Second approximation, deduced from Eq. (65):

$$m(CSE)_2 = -\frac{1 - P}{P} \sum_m \frac{(a_m - a_L)\,M_m^2}{a_L\,M_L^2} \qquad (70)$$

13. Variance of the Correct Sampling Error, CSE

First approximation deduced from Eq. (67) (for practical purposes, we do not need the second one):

$$\sigma^2(CSE)_1 = \frac{\sigma^2(a_S)_1}{a_L^2} = \frac{1 - P}{P} \sum_m \frac{(a_m - a_L)^2}{a_L^2}\,\frac{M_m^2}{M_L^2} \qquad (71)$$

Or, remembering the definition Eq. (7) of $h_m$ and the fact that $m(h_m) = 0$:

$$\sigma^2(CSE)_1 = \frac{1 - P}{P N_U^2} \sum_m h_m^2 = \frac{1 - P}{P N_U}\,\sigma^2(h_m) \qquad (72)$$

which can also be written, substituting $M_S$ for $m(M_S) = P M_L$, as

$$\sigma^2(CSE)_1 = \frac{1 - P}{P N_U}\,CH_L = \frac{1 - P}{P M_L}\,HI_L = \left(\frac{1}{M_S} - \frac{1}{M_L}\right) HI_L \qquad (73)$$

This justifies the introduction and the definitions of $h_m$ and $HI_L$, respectively.
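Two of the steps above lend themselves to a quick numerical check: the quality of the first and second approximations of Eqs. (60) and (61), and the equivalence of the three ways of writing the first-approximation variance in Eqs. (71) to (73). The sketch below uses invented numbers and hypothetical function names, and takes each unit as a single element so that $N_U = N_F$.

```python
def relative_deviation(b, c):
    """Exact a from Eq. (56) with its first and second approximations, Eqs. (60)-(61)."""
    exact = (1 + b) / (1 + c) - 1
    first = b - c
    second = first + (c * c - b * c)
    return exact, first, second


def cse_variance_forms(masses, grades, p):
    """First-approximation variance of CSE written three ways, Eqs. (71)-(73)."""
    n_units = len(masses)
    lot_mass = sum(masses)
    a_lot = sum(a * m for a, m in zip(grades, masses)) / lot_mass
    mean_mass = lot_mass / n_units
    h = [(a - a_lot) / a_lot * m / mean_mass for a, m in zip(grades, masses)]
    ch = sum(x * x for x in h) / n_units  # variance of h_m (its mean is zero)
    hi = ch * lot_mass / n_units          # Heterogeneity Invariant, Eq. (15)
    sample_mass = p * lot_mass            # M_S taken equal to m(M_S) = P * M_L
    direct = (1 - p) / p * sum(
        ((a - a_lot) * m) ** 2 for a, m in zip(grades, masses)
    ) / (a_lot * lot_mass) ** 2
    from_ch = (1 - p) / (p * n_units) * ch
    from_hi = (1 / sample_mass - 1 / lot_mass) * hi
    return direct, from_ch, from_hi


exact, first, second = relative_deviation(b=0.03, c=0.02)
print(f"a = {exact:.6f}, a1 = {first:.6f}, a2 = {second:.6f}")

masses = [1.0, 2.0, 1.5, 0.5, 3.0, 2.0]        # invented unit masses (g)
grades = [0.10, 0.02, 0.05, 0.30, 0.01, 0.04]  # invented grades
forms = cse_variance_forms(masses, grades, p=0.25)
print(forms)
```

For small deviations the second approximation lands much closer to the exact quotient than the first, and the three variance expressions agree to rounding error, which is all the identities in Eqs. (72) and (73) claim.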
15. Practical implementation of the above formulas

- Incorrect sampling: expected value $m(a_S)_1$: Reminder of equality Eq. (62):

$$m(a_S)_1 = \frac{\sum_m a_m M_m P_m}{\sum_m M_m P_m} \qquad (62)$$

Except under special experimental conditions, or in computer simulations, we never know with precision the sets of values of $a_m$, $M_m$ and $P_m$. Furthermore, sums $\sum_m$, usually extended to an innumerable number of terms, are involved. It is therefore practically impossible to estimate the incorrect sampling bias from this formula. Experience shows that with particulate solids but also with solutions of heavy metal ions, the probability of selection $P_m$ is often a function of the physical and chemical properties of unit $U_m$, such as size, density, shape, mass and critical content. A few experiments carried out on iron ores showed that this bias might indeed be very large. The results of these experiments and simulations of the set of probabilities $P_m$ can be found in our books (especially Ref. [13] dated 1967). In addition to this, we have also shown, contrary to what is frequently believed, that it is simply wishful thinking to speak of a constant incorrect sampling bias, comparable to the shifting of the zero of a pair of scales. For these reasons, we reached the conclusion that the only safe strategy with sampling bias is to suppress it by implementing a correct sampling, which we know how to achieve (Part I).

- Incorrect sampling: variance $\sigma^2(a_S)_1$: The same remarks apply to the variance.
- Correct sampling: expected value of CSE:

$$\text{First approximation:}\quad a_S = a_L \rightarrow m(CSE)_1 = 0:\ \text{the sampling is unbiased.} \qquad (74)$$

But in the second approximation, it is not: reminder of equality Eq. (70) [...]

The Fundamental Sampling Error, FSE, is defined as the error generated when the constituents (fragments, molecules, ions) making up the sample have been selected:

- At random, i.e., with a uniform probability P of being selected. The selection is correct.
- Individually and independently: the selected constituents are independent from one another.

It has been shown that FSE is the absolute minimum of CSE. One cannot do better under any circumstances! This justifies the name of the Fundamental Sampling Error, FSE.

- Variance of FSE: devising a simplified formula: From Eqs. (71)/(73), we write:

$$\sigma^2(FSE)_1 = \frac{\sigma^2(a_S)_1}{a_L^2} = \frac{1 - P}{P} \sum_m \frac{(a_m - a_L)^2}{a_L^2}\,\frac{M_m^2}{M_L^2} \qquad (75)$$

So far, we have used only strict formulas, except in the last steps when we introduced approximations. For the same reasons as above, we can never implement this formula in practice. It is to overcome this difficulty that we devised, as early as 1950, an approximate way to estimate its order of magnitude and that, later, we introduced the Heterogeneity Invariant, $HI_L$.

$$\sigma^2(FSE)_1 = \frac{1 - P}{P N_U}\,CH_L = \frac{1 - P}{P M_L}\,HI_L = \left(\frac{1}{M_S} - \frac{1}{M_L}\right) HI_L \qquad (76)$$

We showed in Ref. [18] that $HI_L$ (dimensions of a mass expressed in grams) could be written as:

$$HI_L = c\,b\,f\,g\,d^3 \qquad (77)$$

where:

c: is a "constitutional parameter" (dimensions of specific gravity expressed in g/cm³). It is mathematically defined. It can vary from a fraction of unity to several millions.
[...] Several formulas have been proposed, including the author's own. None seems completely satisfactory yet.

f: is a particle shape parameter (dimensionless):

$$0 \le f \le 1 \qquad (79)$$

In most cases, its value is near 0.50.

g: is a size range parameter (dimensionless):

$$0 \le g \le 1 \qquad (80)$$

In most cases, its value is 0.25. From 0.4 to 0.8, when the material is calibrated (sized).

d: is the "top particle size" defined as the aperture of the square-mesh screen that retains 5% of the material (dimension of length, to be expressed in cm). Also denoted by $d_{95}$ or $d_{max}$.

In expression Eq. (77) of $HI_L$, we have succeeded in transforming a sum extended to a very large and unknown number of terms into a product of five factors only, the order of magnitude of which can usually be estimated. Exception: liberation parameter of gold ores (tricky).

In nearly all places where "The Formula" is described in the literature, one will find complete descriptions of these four parameters and further details on the top particle size, etc.

17. Variance of the Fundamental Sampling Error, FSE: experimental estimation

- Experimental estimation of $HI_L$: introduction: We always recommend an experimental estimation, especially when doubts may arise as to the validity of the estimate of the liberation parameter b. We will now describe what is called the "Method of 50–100 coarse fragments". From a practical standpoint, this experimental method is feasible, and well worth implementing when the top particle size d is coarser than 10 mm.
- Experimental estimation of $HI_L$: method of 50–100 coarse [...] by the coarsest size classes, e.g., the size classes coarser than $d_{max}/2$.
  - The value of $[HI_L]_1$ can be expressed as the product of the Heterogeneity Invariant $HI_{L1}$ of the size class $L_1$ multiplied by the proportion $M_{L1}/M_L$:

$$HI_L = HI_{L1}\,[M_{L1}/M_L] \qquad (83)$$

  - $HI_{L1}$ is an intrinsic property of the material that makes up the size class $L_1$ and can be estimated from a Test Sample, S, on the conditions that:
    - The fragments making up the test sample S have been taken from $L_1$ correctly (at random) and one by one (independently).
    - The number of fragments making up this sample is large enough. Above, this range of large numbers began with 30–50. This explains our choice of a number larger than 50 (or better 100 with a reasonable safety factor) as well as the name of the method. The examples we are going to present below will confirm this point.
    - The major difficulty now lies in the estimation of the proportion $M_{L1}/M_L$. When this proportion is unknown, the following rules of thumb are often acceptable:
      Natural, uncalibrated materials ($g = 0.25$): $M_{L1}/M_L = 0.34$ (or 34%).
      Upward calibrated materials: $M_{L1}/M_L = 0.40$–$0.70$.
- Method of the 50–100 fragments: introduction: We will regard each fragment $F_i$ as a size class in itself, with the consequence that Eq. (81) becomes:

$$HI_S = \sum_i \frac{(a_i - a_S)^2\,M_i^2}{a_S^2\,M_S} \quad \text{(dimension of mass expressed in grams)} \qquad (84)$$

- Method of the 50–100 fragments: practical implementation: The protocol described here is a simplified version of the method described in our books, see Part V.
[...] are presented in Fig. 7. Surprisingly, this curve shows jumps as high as those observed with the ore of precious metals. From point 10 to the end of the curve, the order of magnitude of $HI_S$ remains between 26 and 36 mg. From point 30 onwards, the curve oscillates between 30 and 35 mg.

It is interesting to note the variability of $HI_S$ from one material to the next. It passes from 30 mg with the (slightly) humid pellets to 25 g with the iron ore and to 250 g with the precious metal ore.

- Conclusions of the Method: What is important is to estimate the order of magnitude of $HI_L$, i.e., to know in each case whether the minimum sample weight is 100 g, 10 kg or, say, 1 ton. It would not be realistic to state, e.g., that this minimum weight must be 6.75 kg. In a good number of cases, the simplified formula Eq. (75) is sufficient, but its weak point is the estimation of the liberation parameter b (or k or l according to different authors' choice of terminology). This is especially important when dealing with precious metal ores or traces of any component in the ppm range or below. We remind the reader that the formula proposed for b in our former publications is to be avoided (sic). Francois-Bongarcon has proposed a formula (see literature survey in Part V) which, according to its author, seems to work well with precious metal ores, but the present author has no personal experience of its use. In example 2 (precious metal ore), we believe that the only safe way to estimate $HI_L$ is to implement the 50–100-fragment method as described in this section. The mining and metallurgical company that implemented it on its ores under our control found the results to be in good agreement with the metallurgical balance.

18. Breaking up the Correct Sampling Error, CSE

All sampling errors result from the existence of one form or another of heterogeneity. We have here defined the two main forms of heterogeneity, the constitutional heterogeneity and the distributional heterogeneity. The quantitative approach shows that the correct sampling error CSE is the sum of two components generated by these two forms of heterogeneity. In our probabilistic mathematical model, all elements of lot L are submitted to the selection with a uniform probability P of being selected. They can be taken as:

- either one by one, independently, or
- by groups of neighbouring, potentially correlated elements.

We can now define these two components of CSE:

- Fundamental Sampling Error, FSE: When the condition of independence is fulfilled, i.e., when the fragments making up the sample are selected one by one, the Correct Sampling Error, CSE, is limited to its incompressible minimum that we called the Fundamental Sampling Error, FSE. The fundamental error is the consequence of the sole constitutional heterogeneity. When the selected elements are independent:

$$CSE = FSE \qquad (87)$$

- Grouping and Segregation Error, GSE: In practice, except in experimental studies, this condition of independence is never fulfilled: we never extract the elements only one by one. What we do in practice is to extract increments, I, made of neighbouring elements, and it is these groups which have a uniform probability, P, of being selected: the selection is correct, but the condition of independence is not fulfilled. In Part I, we further discussed some of the primary reasons why, in the Earth's field of gravity, neighbouring elements (fragments, molecules or ions) cannot be assumed to be independent from one another (differential gravity segregation). There is often some intrinsic correlation of the element size, density or shape and its position within the domain occupied by lot L, see e.g. Fig. 5B.

In this case, a second error is added to FSE, namely the Grouping and Segregation Error, GSE. This error is a consequence of the distributional heterogeneity, which is itself a function of the constitutional heterogeneity and of the increment size (the smaller the increment, the smaller GSE). In the general case, applying to all real-world systems, therefore:

$$CSE = FSE + GSE \qquad (88)$$
Fig. 8. Progressive estimate of $HI_S$ for 60 sphalerite (ZnS) flotation concentrate pellets (cf. text).

19. Quantitative approach: zero-dimensional model. Difference between solids, liquids and aerosols

- Particulate solids. Both the qualitative and the quantitative approaches are equally important. We must worry about the mean (qualitative approach: the sampling must be correct, therefore accurate) and the variance as well (quantitative approach). The most dangerous errors, i.e., those which generate sampling biases, take place when the major recommendation of the qualitative approach is not respected. With solids, all sampling must always be correct!
- Liquids. The quantitative approach is here much less important than the qualitative approach. This is due to the fact that we are speaking of sizes in the angstrom range, instead of mm or µm. We must above all worry about the mean to prevent bias.
- Gases and aerosols. Again, both aspects of the theory are valid. Remark: When we studied the particular problems associated with sampling gases and smokes, both from a theoretical and practical standpoint, together with a world leading cement producer, we found that the sampling difficulties were here of a different nature: the specific representativity problems arose from interaction between the sampling tool or device and the gas or smoke, when subsequently cooling down the sample. This interaction depends on the thermodynamics of the total smoke/sampling-tool system. It is difficult, perhaps impossible, to prevent some of the incorrect sampling errors in such a situation. As far as we are aware, this problem has not yet been satisfactorily solved, or addressed in a proper scientific fashion.

[...]

- Defined the Constitutional Heterogeneity, $CH_L$, of lot L considered as a population of single elements. $CH_L$ is the variance of the corresponding population of h.
- Defined the Heterogeneity Invariant, $HI_L$, derived from $CH_L$ for practical purposes and usage.
- Defined the Distributional Heterogeneity, $DH_L$, of lot L considered as a population of groups of neighbouring elements. $DH_L$ is the variance of the corresponding population of h.
- Defined the Total Sampling Error, TSE, generated when selecting constituents in a probabilistic way (non-probabilistic sampling cannot be analyzed theoretically).
- Broken up the Total Sampling Error, TSE, into the sum of two components, CSE and ISE.
- Defined the Correct Sampling Errors, CSEs, observed when the sampling is correct.
- Defined the additional Incorrect Sampling Errors, ISEs, observed when the sampling is incorrect.
- Defined the Fundamental Sampling Error, FSE, as the Correct Sampling Error, CSE, observed in ideal conditions, when the constituents are selected correctly, one by one and independently.
- Shown that the variance of FSE is proportional to the Constitutional Heterogeneity, $CH_L$, and, in practical applications, to the Heterogeneity Invariant, $HI_L$.
- Proposed a practical, experimental method to estimate
HIL and hence the variance of FSE.
5 Defined the Grouping and Segregation Error, GSE, as
20. Quantitative approach—zero-dimensional model. the additional error generated when selecting constit-
Recapitulation and conclusions uents with a uniform probability P, by groups (incre-
ments) of non-independent constituents. The variance
Sampling errors are the consequence of one form or of GSE is proportional to the Distributional Hetero-
another of heterogeneity. geneity, DHL .
Sampling of a homogeneous material would, from the
definition of homogeneity, be an exact operation. But even if The major purpose of Parts I and II is to help the
homogeneity can be defined mathematically, it is never reader understand the delicate mechanisms involved in
observed in real-world systems. the sampling operation. It is hoped that we have shown
In order to express the sampling errors in terms of their how the consecutive steps of the theoretical—and
mean, variance and mean square, we had first to define practical—reasoning are intimately linked to one
mathematically—and then quantify—the various forms of another. This tutorial can only be a summary of the
heterogeneity and then to express the moments of the subject matter. For the reader who wishes to know full
sampling errors as a function of the quantified heterogeneity. details it is necessary to study Refs. [20] and [18] (in
To achieve this purpose, we have: this order).
External references for part II
5 Defined the contribution h of a given unit U to the Bastien, M. (1960). Loi du rapport de deux variables
heterogeneity of the set L of units. Unit U can be normales. Revue de Statistique Appliquée, vol. 8, pp 45–50
either a single constituent F or a group G of (1960).
constituents such as an increment I. The heterogeneity Geary, R.C. (1930). The frequency Distribution of the
contribution h is a function of the mass and Quotient of Two Normal Variables. J. Roy. Stat. Soc. 93,
composition of unit U and lot L. 442 (1930).