Accounting for natural and extraneous variation in the analysis of field experiments
Arthur R. Gilmour, NSW Agriculture, Agricultural Research & Veterinary Centre, Orange, NSW, 2800, Australia
Brian R. Cullis, NSW Agriculture, Agricultural Research Institute, Wagga Wagga, NSW, 2650, Australia
Arunas Verbyla, Department of Statistics, University of Adelaide, Adelaide, SA, 5005, Australia
April 28, 1997
SUMMARY We identify three major components of spatial variation in plot errors from field experiments and extend the two-dimensional spatial procedures of Cullis and Gleeson (1991) to account for them. The components are non-stationary large scale (global) variation across the field, stationary variation within the trial (natural variation or local trend) and extraneous variation which is often induced by experimental procedures and is predominantly aligned with rows and columns. We present a strategy for identifying a model for the plot errors which uses a trellis plot of residuals, a perspective plot of the sample variogram and, where possible, likelihood ratio tests to identify which components are present. We demonstrate the strategy using two illustrative examples. We conclude that while there is no one model that adequately fits all field experiments, the separable autoregressive model is dominant. However, there is often additional identifiable variation present.
Keywords: Spatial analysis, REML, field experiments, variogram
1 Introduction
Since Wilkinson et al. (1983) presented their nearest neighbour method of analysis for field experiments, many alternative approaches have been proposed (Besag and Kempton, 1986; Cressie and Hartfield, 1996; Cullis et al., 1997; Cullis and Gleeson, 1991; Gleeson and Cullis, 1987; Green et al., 1985; Martin, 1990; Zimmerman and Harville, 1991). Some studies have reviewed and evaluated these methods; see for example Lill et al. (1988), Wilson (1994), Kempton et al. (1994) and Grondona et al. (1996). The result is a degree of confusion about methods and models, sometimes resulting in a lack of confidence in using spatial models routinely for the analysis of small plot field experiments.
The motivation for this paper stems from our regular use of spatial analysis procedures for the analysis of over 500 replicated and unreplicated variety trials annually in plant improvement programs in Australia over the last decade. We are convinced of their usefulness in achieving both improved accuracy and efficiency. However, it is apparent that no one spatial model will suit all trials and that there is often identifiable variation introduced during the experiment which is additional to that which would be naturally present. Automatic use of a particular spatial model can lead to quite serious inefficiencies and the simplistic models advocated by Wilkinson et al. (1983), Green et al. (1985) and Besag and Kempton (1986) are often inappropriate.
Previous work which suggested one dimensional models would be adequate for trials with long thin plots may also be misleading. Cullis and Gleeson (1991) advocated two dimensional ARIMA models chosen empirically. Kempton et al. (1994) reported a reanalysis of over 200 trials and demonstrated the need for two dimensional models in most of them. Kempton et al. (1994) argued that choosing among models is unacceptable. They showed that the routine use of the first difference ARIMA(0,1,1) × ARIMA(0,1,1) model was inefficient for some trials.
Several authors have questioned the need for differencing (Martin, 1990; Zimmerman and Harville, 1991). Wilson (1994) re-analysed the datasets used by Cullis and Gleeson (1991) and found that first order separable autoregressive models were often satisfactory. Cullis et al. (1997) acknowledge differencing is unnecessary for many trials. Furthermore, differencing can often lead to the need for more complex modelling of the variance structure for the plot errors (Cullis and Gleeson, 1991). Unlike most geostatistical data, data collected from small plot field experiments can exhibit variation arising from sources other than the natural sources such as soil moisture gradients. However, a common belief in previous models (Cullis and Gleeson, 1991; Martin, 1990; Zimmerman and Harville, 1991 and perhaps Cressie and Hartfield, 1996) is that 'trend' is mainly due to natural variation. Therefore, the aim of this paper is to extend these spatial models to include these extra sources of variation. The extension we propose identifies the need for modelling of at least three types of variation. This model is presented and discussed in section 2.
The general approach we recommend for choosing an appropriate variance model for plot errors starts with fitting a plausible variance model, such as the first order separable autoregressive model denoted by AR1 × AR1. After examining plots of the residuals and their spatial covariance structure, the plot error model is revised to include any patterns detected. The key to our approach is the use of the sample variogram and related likelihood ratio tests to assist the modelling. The variogram is presented and discussed in section 3. In section 4 we present the detailed analysis of two examples which illustrate the methodology. Section 5 presents some conclusions.
2 Spatial mixed linear model

2.1 The model
We begin by assuming we have data for $n$ plots such that the trial is indexed by the rows and columns of an $r \times c$ array. While the array is assumed contiguous, the extension to several separate arrays or irregular arrays is straightforward. We also assume that $y_i(s_i)$, $i = 1, \ldots, n$, is a realisation of a random variable $Y_i(s_i)$ where $\{Y_i(s_i) : s_i \in \mathbb{R}^2\}$, $i = 1, \ldots, n$. For most field experiments, $s_i$ is a two-element vector of the cartesian coordinates of the plot centroids (Zimmerman and Harville, 1991), which are located on a regular grid. If $y$ is the vector of plot data (typically yield) in field order (that is, columnwise by convention), the model for $y$ is
$$ y = X\tau + Zu + \xi + \eta \qquad (1) $$
where $\tau$ ($t \times 1$) is the vector of fixed effects with design matrix $X$ ($n \times t$); $u$ ($b \times 1$) is the vector of random effects with design matrix $Z$ ($n \times b$); $\xi$ ($n \times 1$) is a spatially dependent random error vector and $\eta$ ($n \times 1$) is a zero mean random vector whose elements are pairwise independent. We further assume $(u, \xi, \eta)$ are pairwise independent. Note that this model may be simplified by omitting $u$ and either $\xi$ or $\eta$.
In postulating (1) we recognise the need for complete flexibility and ease of interpretation in modelling spatial variation in field experiments. This requires a knowledge and recognition of the potential sources of spatial variation. Nearest neighbour methods (Wilkinson et al., 1983) and spatial analysis (Cullis and Gleeson, 1991; Gleeson and Cullis, 1987) arose from the assumption of the presence of an underlying (smooth) trend reflecting changes in fertility, moisture status and depth of the soil. These authors included both large scale (global) and small scale (local) spatial inhomogeneities (Cressie, 1991) in the error process $e = \xi + \eta$. The presence of global trend often necessitated differencing of the data, and many approaches relied on the assumption of stationarity in $e$ after either first (Besag and Kempton, 1986) or second (Green et al., 1985) differencing. It is difficult to accurately assess the need for differencing and the evidence supporting it for field experiments is unclear (Wilson, 1994).
Modelling the covariation due to large and small spatial inhomogeneities is analogous to the modelling of trend in geostatistics. In that context, trend is modelled as a mixture of spatial covariances and/or deterministic functions of spatial coordinates (Cressie, 1991). We therefore may include polynomial functions of the spatial coordinates in $X$ to model global trend variation as an alternative to differencing. We may also use smoothing splines by including the appropriate terms in $X$ and $Z$ as shown in the example in section 4.1 and discussed in detail by Verbyla et al. (1997). Therefore, for simplicity, we have omitted differencing from the following development, preferring to use polynomials and splines. This is in contrast to the approach advocated by Cullis and Gleeson (1991).
The other source of variation which frequently occurs in small plot field experiments arises from experimental procedure. We call this extraneous variation. It includes, for example, the effects of serpentine harvesting of plots, the use of multiplot seeders and variation due to unequal plot lengths arising from inaccurate trimming.
These sources of variation are often well described by design factors, such as rows and columns, since the primary cultural operations are performed along rows and/or columns of the field array and often have a recurrent pattern. This decomposition of the spatial variation is not unique and is largely operational. The most important aspect is the separation of global trend (through the fitting of polynomials or cubic smoothing splines) and local trend. We have found the process of modelling variation in field experiments can be facilitated by using the sample variogram and likelihood ratio tests.
2.2 Estimation
We assume that the joint distribution of $(u, \xi, \eta)$ is Gaussian with mean zero and variance
$$ \sigma^2 \begin{bmatrix} G(\gamma) & 0 & 0 \\ 0 & \Sigma(\alpha) & 0 \\ 0 & 0 & \psi I \end{bmatrix} $$
where $\psi = \sigma_\eta^2 / \sigma^2$, $\gamma$ is the vector of variance component ratios corresponding to possible subvectors in $u$, and $\alpha$ is a vector of spatial covariance parameters. The marginal distribution of $y$ is then
$$ y \sim N\left(X\tau,\ \sigma^2 (ZGZ' + R)\right) $$
where $R = R(\phi) = \Sigma + \psi I$, $\phi = (\alpha', \psi)'$.
Models for $\xi$, the local trend vector, can be chosen from the class of separable processes (Cullis and Gleeson, 1991; Martin, 1990). Alternatively, we can use the covariance models used in geostatistics and discussed by Cressie (1991) and Zimmerman and Harville (1991). These models often assume isotropy, which can be inappropriate for modelling the variance structure of plot errors in field experiments. Furthermore, there are computational advantages in assuming separability, resulting in significant savings in computer time for the analysis of larger trials.
Equation (1) is a linear mixed model and Gilmour et al. (1995) present an algorithm for the estimation of variance parameters in these models by Restricted Maximum Likelihood (REML). The algorithm, known as the average information algorithm, is computationally efficient and easily extended to handle a variety of models. We use this algorithm to obtain REML estimates of the variance parameters, empirical generalized least squares (GLS) estimates (signified by ^) of the fixed effects and empirical Best Linear Unbiased Predictors (BLUPs, signified by ~) of the random effects. Asymptotic Wald/F test statistics and standard errors are obtained using the inverse of the coefficient matrix of the mixed model equations as required. The models described in this paper may be fitted using S-PLUS (Becker et al., 1988) functions written by the second author, ASREML (Gilmour et al., 1996) and GENSTAT 5 release 4.1 (Payne et al., 1993).
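The separable structure mentioned above can be illustrated with a small numerical sketch. It assumes first order autoregressive correlation in each direction and builds $R = \Sigma + \psi I$ as a Kronecker product for plots stored columnwise; the function names, grid size and parameter values are hypothetical and chosen only for illustration, and this is not the authors' software.

```python
import numpy as np

def ar1_corr(n, rho):
    """AR1 correlation matrix of order n: entry (i, j) equals rho**|i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def separable_R(r, c, rho_row, rho_col, psi=0.0):
    """Sketch of R = Sigma + psi * I with a separable Sigma.

    Plots are assumed to be ordered columnwise (rows vary fastest within a
    column), so Sigma = kron(column correlation, row correlation).
    """
    sigma = np.kron(ar1_corr(c, rho_col), ar1_corr(r, rho_row))
    return sigma + psi * np.eye(r * c)

# Hypothetical 4 x 3 layout with illustrative parameter values.
R = separable_R(r=4, c=3, rho_row=0.9, rho_col=0.4, psi=0.3)
print(R.shape)   # one row and one column per plot
```

Only the two small correlation matrices need to be stored and factorised, which is one way to see where the computational savings of separability could come from.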
3 Plot error model identification

The efficient estimation of fixed effects in (1) relies on the appropriate choice of the plot error variance model. Identification of error models for field experiments has been discussed by Martin (1990), Cullis and Gleeson (1991) and more recently by Grondona et al. (1996). Martin (1990) and Cullis and Gleeson (1991) suggest the use of the spatial correlation matrix of the whitened or recursive residuals. We have found a perspective plot of the sample variogram more helpful than the spatial correlation matrix, particularly for detecting extraneous effects in the errors. In the next section we will introduce the theoretical variogram which has been widely used in geostatistics (Cressie, 1991) and repeated measures analysis (Diggle et al., 1994). In later sections, we will discuss the implementation and use of the sample variogram in designed field experiments, with particular reference to the adjustment of the sample variogram for the estimation of the fixed effects.
3.1 The Variogram

Given a spatially correlated error process $E(\cdot)$ at points $s$ and $t$, the theoretical variogram (also called the semi-variogram) of $E(\cdot)$ is the function
$$ \omega(s, t) = \tfrac{1}{2}\,\mathrm{var}[E(s) - E(t)] = \tfrac{1}{2}[V(s, s) + V(t, t) - 2V(s, t)] $$
where $s, t \in \mathbb{R}^2$ and $V(\cdot, \cdot)$ is the covariance function of $E(\cdot)$. In most applications, we assume that $E(\cdot)$ is second order stationary, in which case $\omega(s, t) = \omega(s - t)$. To illustrate these concepts, we consider $e = \xi + \eta$ where $\xi$ is a zero mean spatially correlated process with a directional exponential covariance (DEC) structure, distributed independently of $\eta$, which is a zero mean white noise process (Cressie, 1991). Let
$$ l = \begin{pmatrix} l_1 \\ l_2 \end{pmatrix} = \begin{pmatrix} |s_1 - t_1| \\ |s_2 - t_2| \end{pmatrix} $$
be the "distance" between points $s$ and $t$. Then
$$ \omega(s, t) = \omega(l) = \sigma_\eta^2 + \sigma^2 [1 - \exp(-\alpha_1 l_1 - \alpha_2 l_2)], \quad l \neq 0 \qquad (2) $$
$$ \omega(s, t) = \omega(l) = 0, \quad l = 0 \qquad (3) $$
The measurement error term induces a jump discontinuity at $l = 0$. For most field experiments, the displacement vector takes values for $l_1$ of $0, d_1, 2d_1, \ldots, (r - 1)d_1$ and for $l_2$ of $0, d_2, 2d_2, \ldots, (c - 1)d_2$ where $d_1$ and $d_2$ are the plot dimensions. Then (2) can be written as a function of an 'indexed' displacement vector $l^*$ with values for $l_1^*$ of $0, 1, 2, \ldots, r - 1$ and for $l_2^*$ of $0, 1, 2, \ldots, c - 1$. So (2) and (3) become
$$ \omega(l^*) = \begin{cases} \sigma_\eta^2 + \sigma^2 [1 - \exp(-\alpha_1 d_1 l_1^* - \alpha_2 d_2 l_2^*)] = \sigma_\eta^2 + \sigma^2 [1 - \rho_1^{l_1^*} \rho_2^{l_2^*}], & l^* \neq 0 \\ 0, & l^* = 0 \end{cases} \qquad (4) $$
where $\rho_1 = \exp(-\alpha_1 d_1)$ and $\rho_2 = \exp(-\alpha_2 d_2)$. This formulation demonstrates the equivalence of the DEC model and the AR1 × AR1 model for field experiments. The variogram for this model is depicted in Figure 1.
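The equivalence between the DEC form in metres and the AR1 × AR1 form in plot indices can be checked numerically. The sketch below is illustrative only: the function names, the plot dimensions and all parameter values are assumptions, and the variance symbols follow the notation of (2) to (4).

```python
import math

def dec_variogram(l1, l2, alpha1, alpha2, sigma2, sigma2_eta):
    """DEC variogram at displacement (l1, l2), measured in metres."""
    if l1 == 0 and l2 == 0:
        return 0.0
    return sigma2_eta + sigma2 * (1.0 - math.exp(-alpha1 * l1 - alpha2 * l2))

def ar1_variogram(k1, k2, rho1, rho2, sigma2, sigma2_eta):
    """AR1 x AR1 variogram at indexed displacement (k1, k2), measured in plots."""
    if k1 == 0 and k2 == 0:
        return 0.0
    return sigma2_eta + sigma2 * (1.0 - rho1 ** k1 * rho2 ** k2)

# Hypothetical plot dimensions (metres) and parameter values.
d1, d2 = 0.75, 6.0
alpha1, alpha2 = 0.2, 0.15
sigma2, sigma2_eta = 1.0, 0.3
rho1, rho2 = math.exp(-alpha1 * d1), math.exp(-alpha2 * d2)

# Largest absolute difference between the two forms over a 22 x 15 grid of lags.
max_gap = max(
    abs(dec_variogram(k1 * d1, k2 * d2, alpha1, alpha2, sigma2, sigma2_eta)
        - ar1_variogram(k1, k2, rho1, rho2, sigma2, sigma2_eta))
    for k1 in range(22) for k2 in range(15)
)
print(max_gap)
```

The two forms should agree up to floating point rounding, and the value at the smallest non-zero lag exceeds the nugget, which is the jump discontinuity at the origin.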
Figure 1: Variogram for a standardised AR1 × AR1 process, $\rho_x = 0.9$; $\rho_y = 0.4$, $\psi = 0.3$ (axes: x displacement, y displacement; vertical axis: v).

If $\alpha_1 = \alpha_2 = \alpha$, it follows that
$$ \omega(l) = \begin{cases} \sigma^2 [1 - \exp(-\alpha (l_1 + l_2))], & l \neq 0 \\ 0, & l = 0 \end{cases} $$
This is the symmetric DEC model, and is similar to the isotropic exponential covariance (IEC) model (Cressie, 1991, p61) which is
$$ \omega(l) = \begin{cases} \sigma^2 \left[1 - \exp\left(-\alpha \sqrt{l_1^2 + l_2^2}\right)\right], & l \neq 0 \\ 0, & l = 0 \end{cases} $$
3.2 The sample variogram

For the data vector $y$ given in equation (1), the variogram ordinates are
$$ v_{ij} = \tfrac{1}{2} [e_i(s_i) - e_j(s_j)]^2 \quad \forall\ i, j = 1, \ldots, n;\ i \neq j $$
where $e = \{e_i(s_i)\} = y - X\tau - Zu$. When $\tau$ and $u$ are known and under the assumption that $y$ is Gaussian, the sampling distribution of the $v_{ij}$ is
$$ v_{ij} \sim \omega(s_i, s_j)\, \chi^2_1 $$
so that $v_{ij}$ is unbiased for $\omega(s_i, s_j)$.
As implied in section 3.1, there will be many $v_{ij}$ with the same absolute displacement since the plots are arranged in a regular array. The sample variogram is taken as the triple $(l_{ij1}, l_{ij2}, \bar{v}_{ij})$ where $l_{ij1} = |s_{i1} - s_{j1}|$ and $l_{ij2} = |s_{i2} - s_{j2}|$ are the absolute displacements and $\bar{v}_{ij}$ is the sample mean of the $v_{ij}$ with the same absolute displacements. We choose to present the sample variogram, truncated, as a perspective plot of the triples with more than about 30 pairs in each average.
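The construction just described can be sketched in a few lines. The function below is a hypothetical helper, not the authors' code: it takes residuals arranged as a row by column grid, uses indexed displacements rather than metres, and keeps only displacements averaged over more than `min_pairs` pairs.

```python
from collections import defaultdict

def sample_variogram(resid, min_pairs=30):
    """Mean semi-variance for each absolute (row, column) displacement.

    resid[i][j] is the residual of the plot in row i and column j.
    Returns {(l1, l2): mean of 0.5 * (e_a - e_b)**2 over all plot pairs
    with that absolute displacement}, truncated to well-supported lags.
    """
    r, c = len(resid), len(resid[0])
    plots = [(i, j, resid[i][j]) for i in range(r) for j in range(c)]
    total = defaultdict(float)
    count = defaultdict(int)
    for a in range(len(plots)):
        i1, j1, e1 = plots[a]
        for b in range(a + 1, len(plots)):
            i2, j2, e2 = plots[b]
            key = (abs(i1 - i2), abs(j1 - j2))
            total[key] += 0.5 * (e1 - e2) ** 2
            count[key] += 1
    return {k: total[k] / count[k] for k in total if count[k] > min_pairs}

# Toy residuals with a pure linear trend down the rows of an 8 x 8 grid.
toy = [[float(i)] * 8 for i in range(8)]
vg = sample_variogram(toy)
print(vg[(1, 0)], vg[(0, 1)], vg[(2, 3)])
```

In the toy grid the semi-variance depends only on the row displacement and keeps rising with it, which is the kind of systematic change that signals non-stationarity in a perspective plot.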
3.3 The effect of estimation of $\tau$ and $u$

The result that $v_{ij}$ is unbiased for $\omega(s_i, s_j)$ is based on the assumption that $\tau$ and $u$ are known. In practice, we replace $\tau$ and $u$ by their GLS estimate ($\hat{\tau}$) and BLUP ($\tilde{u}$) respectively, so that the BLUP of the residual vector is given by
$$ \tilde{e} = y - X\hat{\tau} - Z\tilde{u} = RPy $$
where $P = R^{-1} - R^{-1} W C^{-1} W' R^{-1}$; $W = [X\ Z]$; $C = W' R^{-1} W + G^*$ is the coefficient matrix from the mixed model equations and is partitioned by virtue of the partition in $W$; $G^*$ is a square matrix of order $t + b$ partitioned conformably with $W' R^{-1} W$ and is zero except in the lower diagonal block corresponding to $Z' R^{-1} Z$ where it equals $G^{-1}$. Under the assumption of a Gaussian distribution for $y$,
$$ \tilde{e} \sim N\left(0,\ \sigma^2 (R - W C^{-1} W')\right) $$
assuming $(\gamma, \phi)$ is known. The variogram ordinates $v_{ij}$ can be expressed as a quadratic form in $y$, that is
$$ v_{ij} = (a_{ij}' e)(a_{ij}' e) = e' a_{ij} a_{ij}' e = e' A_{ij} e $$
and similarly
$$ \tilde{v}_{ij} = (a_{ij}' \tilde{e})(a_{ij}' \tilde{e}) = \tilde{e}' a_{ij} a_{ij}' \tilde{e} = \tilde{e}' A_{ij} \tilde{e} $$
where $A_{ij}$ ($n \times n$) has $\tfrac{1}{2}$ in positions $\{i, i\}$ and $\{j, j\}$, $-\tfrac{1}{2}$ in positions $\{i, j\}$ and $\{j, i\}$ and is zero elsewhere. Taking the expectation
$$ E(\tilde{v}_{ij}) = \sigma^2\, \mathrm{trace}[A_{ij} (R - W C^{-1} W')] $$
$$ = \sigma^2 \left\{ \mathrm{trace}[A_{ij} R] - \mathrm{trace}[A_{ij} W C^{-1} W'] \right\} $$
$$ = \sigma^2 a_{ij}' R a_{ij} - \sigma^2 a_{ij}' W C^{-1} W' a_{ij} $$
$$ = E(v_{ij}) - \sigma^2 a_{ij}' W C^{-1} W' a_{ij} $$
Thus $\tilde{v}_{ij}$ is biased. However the bias can be removed by considering the spectral decomposition of $W C^{-1} W'$ which has $t + b$ non-zero eigenvalues. Let
$$ W C^{-1} W' = \sum_{k=1}^{t+b} \lambda_k w_k w_k' $$
then
$$ E(\tilde{v}_{ij}) = E(v_{ij}) - \sigma^2 a_{ij}' \Big( \sum_k \lambda_k w_k w_k' \Big) a_{ij} = E(v_{ij}) - \sigma^2 \sum_k \lambda_k (a_{ij}' w_k)^2 = E(v_{ij}) - \sigma^2 \sum_k \lambda_k w_k' A_{ij} w_k \qquad (5) $$
Thus the bias in $\tilde{v}_{ij}$ is easily calculated as the weighted sum of the variogram ordinates for each of the $t + b$ eigenvectors $w_k$. In practice we are concerned with the general shape of the variogram and so it is often sufficient to use only the largest $r$ eigenvalues and their corresponding eigenvectors, where $r$ is much smaller than $t + b$. (In our examples, we have used them all.)
This derivation assumes $(\gamma, \phi)$ are known. In practice these are replaced by their REML estimates and so (5) is approximate. The effect of the estimation of $(\gamma, \phi)$ on the distribution of $\tilde{e}$ (and functions of $\tilde{e}$) is an important problem. Kenward and Roger (1997) have examined this issue for the testing of fixed effects in REML.
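The bias term in (5) can be checked on a toy problem. The sketch below is an assumption-laden illustration rather than the authors' implementation: the design matrices are random, $R$ is taken as the identity, $G$ is an arbitrary multiple of the identity, and all names are hypothetical. It compares the spectral form of the bias with the direct quadratic form $a_{ij}' W C^{-1} W' a_{ij}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t, b = 12, 2, 3                      # hypothetical numbers of plots and effects
X = rng.normal(size=(n, t))             # hypothetical fixed-effects design
Z = rng.normal(size=(n, b))             # hypothetical random-effects design
W = np.hstack([X, Z])
R_inv = np.eye(n)                       # assumption: R = I, for illustration only
G_star = np.zeros((t + b, t + b))
G_star[t:, t:] = np.eye(b) / 0.5        # G^{-1} for an assumed G = 0.5 * I
C = W.T @ R_inv @ W + G_star            # mixed model coefficient matrix
M = W @ np.linalg.solve(C, W.T)         # W C^{-1} W'

lam, w = np.linalg.eigh(M)              # spectral decomposition of W C^{-1} W'
nonzero = lam > 1e-10                   # keep the t + b non-zero eigenvalues
lam, w = lam[nonzero], w[:, nonzero]

def contrast(i, j):
    """Vector a_ij with a_ij a_ij' = A_ij, so a_ij' e = (e_i - e_j) / sqrt(2)."""
    a = np.zeros(n)
    a[i], a[j] = 1 / np.sqrt(2), -1 / np.sqrt(2)
    return a

def bias_spectral(i, j, sigma2=1.0):
    """Bias as the weighted sum over eigenvectors, as in the last form of (5)."""
    a = contrast(i, j)
    return sigma2 * float(np.sum(lam * (a @ w) ** 2))

def bias_direct(i, j, sigma2=1.0):
    """Bias as the quadratic form sigma^2 a' W C^{-1} W' a."""
    a = contrast(i, j)
    return sigma2 * float(a @ M @ a)

print(bias_spectral(0, 5), bias_direct(0, 5))
```

Truncating `lam` and `w` to the largest few eigenvalues would give the approximate correction described in the text.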
4 Examples
In this section we present the analysis of two examples. The emphasis is on the choice of models for global, extraneous and natural variation. We proceed in a sequential manner. The aim is to account for the dominant sources of variation, as indicated by a trellis plot of residuals, a perspective plot of the sample variogram (indexed by displacement within rows and columns), plots of BLUPs across rows and columns, and possibly by REML likelihood ratio tests. Unfortunately, the latter cannot always be used because the fixed part of the full model may change, thereby changing the basis of the REML log-likelihood.
4.1 Wheat trial, South Australia

These data are kindly provided by Mr G Hollamby and are taken from the 1994 yield trials for the evaluation of advanced breeding lines and commercial varieties. A total of 107 varieties were sown in 3 replicates in a near complete block design. Each replicate occupied 5 columns, with 22 plots (denoted as rows) per column. Table 1 presents the variety codes and yield data in the field layout. Three varieties (82=Tincurran, 89=VF655 and 104=WW1477) were sown twice in each replicate. Plots were sown to a length of 6 metres and were 0.75 metres wide; they were trimmed to 4.2 metres length before harvest. The total trial area excluding outside buffers was 15 × 6 by 22 × 0.75 square metres.
Table 2 presents an overview of the sequence of models fitted for this data set, together with their REML log-likelihoods. A complete block analysis where blocks are the replicates is included in Table 2 for comparison. The lines drawn horizontally within Table 2 indicate where the base for the REML log-likelihood changes; likelihood ratio tests cannot be made between models in these different parts of the table. Table 3 gives the estimates of the spatial variance parameters in the various models fitted. The results in Tables 2 and 3 will be referred to in the following discussion.
We have found that a reasonable initial model for the errors of yield data from small plot (replicated) field experiments is the DEC (2) or AR1 × AR1 (4) model; see also Grondona et al. (1996). We therefore start by fitting this model. Using a complete block analysis as the baseline, the AR1 × AR1 error model increases the REML log-likelihood by 132.0. Figure 2 is a trellis plot of the residuals after fitting the AR1 × AR1 error model; the residuals are plotted against column number within each row.
Table 1: Variety codes and yields for SA wheat trial in field order. For each of the 15 field columns, the first line gives the variety codes and the second line the yields of the 22 plots (rows).

Column 1
variety: 4 10 17 16 21 32 33 34 72 74 75 81 83 106 107 3 5 6 8 9 11 13
yield:   483 526 557 564 498 510 344 600 466 370 448 513 481 430 355 445 447 400 460 371 420 453
Column 2
variety: 14 30 15 18 19 20 23 24 25 26 27 28 35 36 37 38 39 71 73 76 77 78
yield:   400 405 442 444 557 446 470 515 364 477 377 525 408 605 437 440 446 534 510 603 442 506
Column 3
variety: 79 80 97 102 40 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58
yield:   569 514 487 557 750 710 773 747 683 701 658 682 667 699 585 645 642 630 729 714 651 820
Column 4
variety: 59 60 61 62 63 64 65 66 67 68 69 70 88 103 104 98 2 1 89 87 84 85
yield:   734 767 693 790 714 747 815 762 684 827 819 689 758 705 565 799 758 774 652 684 656 792
Column 5
variety: 89 90 86 91 92 93 94 95 96 7 99 100 101 29 104 105 22 31 41 12 82 82
yield:   571 726 633 595 684 744 740 705 734 637 837 657 765 689 537 633 629 614 714 684 722 791
Column 6
variety: 104 18 70 22 56 41 37 53 86 40 50 16 80 78 26 95 96 17 82 87 44 58
yield:   642 687 625 619 781 729 708 650 620 785 741 847 657 704 662 710 817 754 925 821 767 875
Column 7
variety: 62 92 91 36 57 68 23 102 33 47 105 2 104 28 90 32 21 39 76 84 79 89
yield:   665 715 644 716 696 608 785 631 617 758 776 736 631 782 702 638 635 592 534 706 596 735
Column 8
variety: 4 67 19 103 35 14 107 11 73 30 6 29 75 97 82 10 25 7 52 3 83 88
yield:   738 727 749 719 655 535 780 661 687 757 763 754 910 733 868 791 696 712 850 803 802 793
Column 9
variety: 74 71 100 27 42 98 12 77 72 13 64 59 8 5 85 45 101 89 65 99 34 51
yield:   678 509 532 443 680 644 516 572 584 624 651 611 593 651 558 594 690 602 626 747 637 569
Column 10
variety: 15 81 106 61 69 66 38 9 24 55 60 46 93 31 63 43 54 48 1 49 20 94
yield:   536 735 709 724 637 613 556 618 754 845 906 813 750 740 635 743 711 819 845 818 734 844
Column 11
variety: 5 4 104 46 63 3 48 98 60 8 51 71 15 22 14 86 18 12 69 87 84 35
yield:   568 508 469 628 561 599 637 696 581 553 666 531 547 617 428 549 571 554 668 657 566 578
Column 12
variety: 2 42 81 91 57 25 26 72 80 73 45 52 53 102 17 54 19 96 92 7 10 38
yield:   706 726 686 559 618 527 715 680 473 546 724 627 700 546 673 657 661 733 599 567 663 530
Column 13
variety: 27 89 28 56 37 99 97 68 64 49 74 24 95 21 103 75 94 55 11 67 30 85
yield:   298 439 363 358 317 491 256 328 396 288 315 361 387 350 351 298 485 447 376 395 381 442
Column 14
variety: 79 9 66 82 82 59 101 83 62 90 106 104 65 31 76 78 50 70 88 77 23 33
yield:   371 328 452 399 435 387 503 446 422 379 458 296 487 465 335 369 469 518 573 493 541 423
Column 15
variety: 47 29 100 93 89 107 39 32 40 58 13 34 61 20 43 36 44 41 105 6 1 16
yield:   194 246 283 307 294 257 238 285 254 302 397 393 428 319 458 396 381 343 458 438 508 551
Table 2: Overview of models fitted to the SA wheat data

Model  Global/Extraneous†                      Natural     Variance    REML             Change
                                                           parameters  log-likelihood‡
------------------------------------------------------------------------------------------------
CB     block                                               2           -1128.4
1                                              AR1 × AR1   3           -996.4           132
------------------------------------------------------------------------------------------------
2      lin(col) + spl(col)                     AR1 × AR1   4           -985.9
2a     lin(col) + spl(col) + column            AR1 × AR1   5           -969.5           16.4
------------------------------------------------------------------------------------------------
3      lin(col) + spl(col) + colcode + column  AR1 × AR1   5           -955.4
3a     lin(col) + spl(col) + colcode + column  AR1 × AR1   6           -950.2           5.2
       + row
------------------------------------------------------------------------------------------------
4      lin(col) + spl(col) + colcode + column  AR1 × AR1   6           -944.1
       + lin(row) + rowcode + row
5      lin(col) + spl(col) + colcode + column  AR1 × AR1   7           -939.1           5.0
       + lin(row) + rowcode + row              + measurement error
6      lin(col) + spl(col) + colcode + column  AR1 × AR1§  6           -939.1           0.0
       + lin(row) + rowcode + row              + measurement error

† lin(col) and lin(row) represent the linear regression of yield on the column index and row index included in $X$, colcode and rowcode represent the factors defined in the text also included in $X$, column and row represent factors based on the column and row indices included in $Z$ and spl(col) represents the random spline component included in $Z$.
‡ The likelihood ratio test may not be used to compare models separated by a horizontal line.
§ Symmetric DEC model, i.e. $\alpha_1 = \alpha_2$ in equation (2).
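As a worked illustration of the REML likelihood ratio comparisons available within one block of Table 2, the sketch below computes the statistic for model 2 against model 2a, which differ by one variance parameter. The chi-square(1) reference distribution is an assumption made here for illustration; the text does not state which reference distribution was used, and for a variance component tested at zero this reference is generally regarded as conservative. The helper name is hypothetical.

```python
import math

def reml_lrt(loglik_reduced, loglik_full):
    """Return (statistic, p) for nested models differing by one variance parameter.

    The p-value uses P(chi-square(1) > x) = erfc(sqrt(x / 2)); this reference
    distribution is an assumption of the sketch, not taken from the paper.
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Model 2 versus model 2a (REML log-likelihoods from Table 2).
stat, p = reml_lrt(-985.9, -969.5)
print(round(stat, 1), p)
```

The comparison is only meaningful between models with the same fixed effects, which is why it is restricted to rows of Table 2 that are not separated by a horizontal line.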
Table 3: REML estimates of variance parameters for the models fitted to the SA wheat data

       Global/Extraneous                Natural variation
Model  spl(col)   column     row        row        column                measurement
       variance†  variance†  variance†  ρ‡         ρ‡         σ²†        error†
1                                       0.904      0.424      19714
2      6882                             0.791      0.267      8584
3      4696       1382                  0.433      0.360      3599
4      4703       1362       174        0.473      0.169      2954
5      4487       911        198        0.885      0.283      1288       2525
6      4556       1055       192        0.863      0.307      1259       2331

† Units for all variance components are (g/plot)$^2$. The spline variance is an average variance component obtained from the diagonal elements of $Z(u_s) Z(u_s)'$.
‡ $\rho_j = \exp(-\alpha_j d_j)$, $j = 1, 2$. For model 6, $\alpha_r = \alpha_c = 0.1965$.
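The footnote relation between the decay parameter and the autoregressive parameters can be checked against the model 6 row of Table 3. The sketch assumes that the relevant spacings between plot centroids are the sown plot width (0.75 metres, between rows) and the sown plot length (6 metres, between columns); that reading of the trial description is an assumption.

```python
import math

alpha = 0.1965               # common decay parameter for model 6 (Table 3 footnote)
d_row, d_col = 0.75, 6.0     # assumed centroid spacings in metres

rho_row = math.exp(-alpha * d_row)
rho_col = math.exp(-alpha * d_col)
print(round(rho_row, 3), round(rho_col, 3))
```

Under these assumptions the values agree with the tabulated 0.863 and 0.307 to within the rounding of the reported decay parameter.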
A loess smoother (Becker et al., 1988) is included in each panel of the trellis plot, to highlight any trend within each row of the design. There is clearly a smooth quadratic-like trend across columns in each row. We turn to the sample variogram. The distinct property of the sample variogram which indicates non-stationarity, and hence the presence of effects, is the tendency for systematic changes in semi-variance across one or both directions (in our case row and column displacement) in the variogram. Figure 3(a) is the sample variogram of the residuals after fitting the initial model. The labels row and column displacement in Figure 3 refer to the distance (in metres) between plot centroids. The sample variogram shows that the semi-variance within the same column appears fairly constant across all rows (we return to this later), while the semi-variance within each row has two components, an increasing trend and steps. The increasing trend is due to the smooth global quadratic pattern seen in Figure 2. The steps indicate that an additional component due to columns may be required, something that was not apparent in Figure 2. In summary, the sample variogram displays non-stationarity, and the need for at least a smooth column effect in the error model.

Figure 2: Trellis plots of residuals from fitting the AR1 × AR1 model to the SA wheat data (one panel per row, 1 to 22; horizontal axis: column; vertical axis: residual yield, m1).

Figure 3: Sample variograms calculated from four models (a) model 1: AR1 × AR1, (b) model 2: lin(col) + spl(col) + AR1 × AR1, (c) model 3: lin(col) + spl(col) + colcode + column + AR1 × AR1, (d) model 5: lin(col) + spl(col) + colcode + column + lin(row) + rowcode + row + AR1 × AR1 + measurement error, fitted to the SA wheat data (axes: row displacement, column displacement).

We account for the smooth global trend by using a cubic smoothing spline (model 2), indexed by column number. Verbyla et al. (1997) illustrate how smoothing splines can be formulated as a mixed model and as such have a natural place in a parametric modelling framework. The mixed model formulation necessitates the inclusion of an intercept and slope in $\tau$ and an associated vector of $c - 2$ (13 in this case) random effects, $u_s$. The smoothing parameter is a variance ratio and the BLUPs of the random effects of the smoothing spline $u_s$ are used to calculate the spline (Green and Silverman, 1994, p22).
Figure 3(b) is the sample variogram of the residuals from the fit of model 2. The sample variogram no longer increases within columns, so that the inclusion of the cubic smoothing spline has accounted for the global variation. The plot error variance (Table 3) has been reduced from 19714 to 8584, and the sill (or plateau) of the sample variogram has been reduced from approximately 40000 to 10000. However, there is a clear indication of additional column effects in the shape of Figure 3(b). There appears to be a cyclic pattern within columns. To accommodate this additional column variability, a random column effect was included in the model. The assumption of an independent, identically distributed Gaussian distribution for this extraneous effect is discussed below. Including this random column effect increases the REML log-likelihood by 16.4 (model 2(a) compared to model 2 in Table 2).
Figure 4 is a plot of the BLUPs of the column effects from this model. Again, there is clear evidence of a pattern. The BLUPs 2 and 4 columns apart have the same sign and are similar in magnitude. Consultation with the breeder reveals that a plausible reason for this pattern lies in the experimental procedure. The plots were trimmed to an assumed equal length well before harvest. This operation is conducted by driving down the centre of the pathway between columns with a spray boom. Plots in adjacent columns are trimmed by spraying the ends of each plot with herbicide, thereby killing wheat plants within this region.
The driver positions the vehicle in the centre of the pathway by sighting white pegs at each end of the pathway. This operation is difficult to accomplish accurately, and any consistent deviation from the centre of the pathway can result in an alternating sequence of long/short plot lengths and also in nonrectangular plot regions; these will be exhibited in the yields at harvest. In these data, the effects are accentuated every fourth column. The breeder explained that, starting from the left, the driver had sprayed every second pathway in a serpentine manner, and then sprayed the others starting from the right, viz
$$ \uparrow_1\ \downarrow_{16}\ \downarrow_2\ \uparrow_{15}\ \uparrow_3\ \downarrow_{14}\ \downarrow_4\ \uparrow_{13}\ \uparrow_5\ \downarrow_{12}\ \downarrow_6\ \uparrow_{11}\ \uparrow_7\ \downarrow_{10}\ \downarrow_8\ \uparrow_9 $$
where the arrows indicate direction and the subscripts indicate order (see Figure 5). Thus, plots were trimmed by one of the 4 combinations $\uparrow\downarrow$, $\downarrow\downarrow$, $\downarrow\uparrow$ and $\uparrow\uparrow$. Because of the cyclic pattern and strong supporting evidence, the assumption of independent column effects is clearly inappropriate. To overcome this problem, we include a factor (as a fixed effect called colcode) to describe the 4 phase sequence, with levels 123412341234123. This is model 3 of Table 2.

Figure 4: Plot of the BLUPs of the column effects estimated by fitting model 2(a) to the SA wheat data (horizontal axis: column number; vertical axis: BLUP).

The REML estimate of plot error variance is now 3599 and the autoregressive parameter for rows is reduced from 0.791 (model 2) to 0.433 (model 3). The sample variogram is given in Figure 3(c). The column effects are essentially removed, the sill dropping to approximately 4000. However, there is now a suggestion of row effects showing up as a valley in the variogram at lag 3 and lag 6, together with a tendency for the variogram to rise with row displacement; these were not noticed earlier as they were masked by the much stronger effects associated with columns. This highlights the need for a sequential approach to model identification.
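The colcode factor for the trimming pattern can be generated directly from the column index. This is a minimal sketch with hypothetical variable names; it assumes only that the four-phase sequence starts at level 1 in the first column, as the level string in the text indicates.

```python
n_columns = 15

# Level 1 to 4 for each field column, cycling every fourth column.
colcode = [(j % 4) + 1 for j in range(n_columns)]

print("".join(str(level) for level in colcode))
```

Columns sharing a level were trimmed by the same combination of spray directions, which is why a fixed four-level factor is used for this effect instead of independent random column effects alone.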
column
Figure 5: The spraying order used for trimming the SA wheat plots.

We include a random row effect (model 3(a)). The REML log-likelihood increases by 5.15. Figure 6 is a plot of the BLUPs of the row effects. Two aspects require attention. Firstly, there is a linear trend across rows. Secondly, there is a pattern when the rows are taken in threes. Again, the breeder has an explanation. In Figure 6, the points have been labelled L, M and R, denoting the left, middle and right of triplets of plots. The trial was sown in a serpentine manner using a combine which sows three plots at a time. Thus the sequence of sowing is (RML) (LMR) (RML) (LMR), although the first two plots, RM, were guard plots and are not part of the data used in the analysis. Consider the first triplet (RML). The operator positions the tractor in the centre of the M plot, and the distance between the centroids of the neighbouring R and L plots and the M plot is exactly 0.75 metres (matching the plot width). The physical gaps between the plots themselves are therefore fixed within triplets. The difficulty arises when the operator moves to the next triplet (LMR). In this case, while the centroid distances within this triplet are again 0.75 metres, the distance from its L centroid to the L centroid of the preceding triplet can vary, simply because of the judgement required by the operator. Thus the gap between the two L plots, and hence between triplets, may differ from the gap between plots within triplets. This introduces variation with respect to L and R plots. In fact, as Figure 6 shows, the middle plots have a significantly lower yield relative to the left and right plots. This suggests a systematic effect due to sowing. After lengthy discussion with the breeder, we include a factor (called rowcode) with levels 2212212, matching L (LMR) (RML), where 1 is the middle plot of a triplet and 2 is the left or right plot of a triplet, and in addition include a linear effect in rows (model 4). Note that the REML estimate of the error variance has decreased to 2954 (Table 3), and the estimates of the autoregressive parameters have changed.

Figure 6: Plot of the BLUPs of the row effects estimated by fitting model 3(a) to the SA wheat data. Axes: BLUP against row number; points labelled L, M and R.
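The construction of the rowcode factor from the sowing sequence can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `sowing_positions` and `rowcode` and the `n_guard` argument are hypothetical, and the sketch assumes that the serpentine pattern (RML) (LMR) repeats unchanged along the trial and that the first two sown plots are guards.

```python
def sowing_positions(n_rows, n_guard=2):
    """Label each analysed row as 'L', 'M' or 'R' within its sowing triplet.

    Assumes serpentine sowing: passes alternate between R, M, L and L, M, R,
    and the first `n_guard` sown plots are guard plots that are discarded.
    """
    labels = []
    triplet = 0
    while len(labels) < n_rows + n_guard:
        order = "RML" if triplet % 2 == 0 else "LMR"
        labels.extend(order)  # adds the three single-character labels
        triplet += 1
    return labels[n_guard:n_guard + n_rows]


def rowcode(n_rows, n_guard=2):
    """Code 1 for the middle plot of a triplet, 2 for a left or right plot."""
    return [1 if p == "M" else 2 for p in sowing_positions(n_rows, n_guard)]


if __name__ == "__main__":
    print(sowing_positions(7))
    print(rowcode(7))
```

Under these assumptions the first seven analysed rows are labelled L, L, M, R, R, M, L, which reproduces the coding 2212212 quoted in the text.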
Figure 7 depicts the first column (the points) and row (the +) of the sample variogram from models 4 (variogram not shown) and 5 (Figure 3(d)) with the fitted values from models 4 and 5. There is a systematic lack of fit for model 4, brought about by the apparent discontinuity at (0,0). Inclusion of a nugget variance or measurement error in model 5 significantly improves the fit (Figure 7), increasing the REML log-likelihood by 6.1. We also see in Figure 7 that the fitted lines for rows and columns, expressed as a displacement distance, are nearly coincident, suggesting that the covariance structure of the natural spatial variation may be direction independent. Fitting the symmetric DEC model (model 6 of Table 2), we find the reduction in REML log-likelihood is 0.04. This is our final model. The IEC model gives a REML log-likelihood 0.42 lower than the symmetric DEC model for these data.

Figure 7: Plot of the observed (points) and fitted (lines) margins of the sample variogram of the residuals after fitting models 4 and 5 to the SA wheat data. Panels: model 4 and model 5; horizontal axis: displacement.

Table 4 presents the GLS estimates of the global and extraneous fixed spatial trend components in model 6, together with their standard errors and Wald type F statistics. The fixed effects for trimming (colcode in the table) and sowing (rowcode in the table) are clearly important, as are the linear row and column effects. We have avoided formal tests of these fixed effects as we are mindful of the asymptotic nature of the tests and the need for an adjustment in the denominator degrees of freedom (see Kenward and Roger, 1997).

The GLS estimates of the variety effects for each of the 5 spatial models and complete block analysis are presented in Figure 8. There is substantial discrepancy between the variety effects, especially with respect to the complete block analysis. Table 5 gives the bias for model j relative to model 6 as defined by
$$\mathrm{av}\left( \left| \hat{\tau}^{(j)} - \hat{\tau}^{(6)} \right| \right) \qquad (6)$$

where $\hat{\tau}^{(j)}$ are the estimated variety effects for model j. Table 5 also includes the average standard error of difference, which is a rough guide to precision, but we do not recommend its use for choosing between models.
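The bias summary in equation (6) is simply the average absolute difference between two sets of estimated variety effects. A minimal sketch, assuming the two sets are supplied as equal-length lists in the same variety order (the function name and the example numbers are hypothetical and are not taken from the trial data):

```python
def average_absolute_bias(effects_j, effects_ref):
    """Average of |effect under model j - effect under the reference model|."""
    if len(effects_j) != len(effects_ref):
        raise ValueError("both models must give one effect per variety")
    diffs = [abs(a - b) for a, b in zip(effects_j, effects_ref)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Made-up effects for three varieties under model j and the reference model.
    model_j = [10.0, -5.0, 2.0]
    reference = [8.0, -1.0, 2.0]
    print(average_absolute_bias(model_j, reference))
```

Because the summary averages absolute differences, it measures how far the estimates move between models, not whether the rankings change.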
Figure 8: Comparison of variety effects estimated by fitting several variance models to the SA wheat data. Panels: m1, m2, m3, m4, m5, m6 and cb.

The relative bias is still substantial for most spatial models, with the rankings of varieties changing significantly. This underlines the importance and relevance of modelling the spatial variation. The use of a standard AR1 × AR1 model or a CB model when clearly unsatisfactory could result in a significant change in variety selection by the breeder.
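For readers less familiar with the separable AR1 × AR1 structure referred to above, the following sketch builds its correlation matrix as a Kronecker product of two one-dimensional AR1 correlation matrices. It is an illustration under stated assumptions, not the fitting code used for the paper: the function names are hypothetical, plots are assumed to be ordered as rows within columns, and the parameter values in the example are arbitrary.

```python
def ar1_corr(n, rho):
    """n x n AR1 correlation matrix with entries rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kronecker(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def ar1_x_ar1(n_cols, rho_col, n_rows, rho_row):
    """Separable correlation matrix for plots ordered as rows within columns.

    The plot in row r and column c has index c * n_rows + r.
    """
    return kronecker(ar1_corr(n_cols, rho_col), ar1_corr(n_rows, rho_row))


if __name__ == "__main__":
    corr = ar1_x_ar1(n_cols=2, rho_col=0.5, n_rows=3, rho_row=0.2)
    # Correlation between plot (row 0, col 0) and plot (row 1, col 1).
    print(corr[0][1 * 3 + 1])
```

Separability means the correlation between two plots is the product of a row term and a column term, so only two correlation parameters are needed whatever the size of the trial.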
4.2 Slate Hall Spring Wheat Trial No. 2, 1978

Table 4: REML estimates of fixed effect parameters and associated test statistics for the SA wheat data

Model term      Coefficient   Standard error   Wald F statistic
colcode *+         -75.7          20.3
colcode ++          14.7          19.3
colcode +*          -4.9          20.2
colcode **          65.9          22.0              6.25
rowcode M          -16.1           4.4
rowcode L, R        16.1           4.4             13.48
lin(col)†          -13.4           3.1             18.31
lin(row)             3.0           1.3              5.33

†Units for lin(col) and lin(row) are g/plot/index.

Table 5: Summaries of variety effects for the SA wheat data.

Model   Bias    Wald F(106, ν) statistic    ν      Average sed†
CB      47.3            1.43               223        93.7
1       13.2            7.31               223        36.7
2       12.7            5.75               222        38.9
3       10.7            5.46               219        39.4
4        4.8            5.55               217        37.8
5        0.4            5.07               217        37.5
6         -             5.08               217        37.6

†This average ignores the fact that three varieties have double replication.

These data are taken from the trials analysed by Kempton et al. (1994). The trial was a balanced lattice square with 25 varieties sown in 6 replications. The trial layout, lattice blocking (for replicate 1) and variety randomization are depicted in Table 6. In the analysis below, the layout is taken as 15 rows by 10 columns. The plot size was 1.5 m by 4 m, giving an experimental area of 15 × 1.5 m by 10 × 4 m.

Tables 7 and 8 present an overview of the sequence of models fitted and REML estimates of the variance components for each model. As in Example 4.1, horizontal lines in Table 7 separate models for which the REML log-likelihoods are not comparable. The lattice analysis is included in Table 7 for comparison.

We begin with the AR1 × AR1 model as the variance model for the plot errors. Fitting this model, we find the REML log-likelihood is -670.4 (Table 7) compared to -671.7 for the lattice analysis. The REML estimates of the autoregressive coefficients (Table 8 under model 1) are suggestive of the presence of extraneous variation, as the largest correlation coefficient is not associated with the shortest distance between plot centroids, that is, with row displacement. The residuals from the fit of model 1 are plotted against row number for each column in Figure 9. A loess smooth, superimposed to indicate any possible global trend across rows within the columns, shows a general decline. This is supported by the sample variogram given in Figure 10(a), where there is a consistent increase in semivariance with increased row displacement. Inclusion of a linear row covariate reduces the REML estimate of the residual variance considerably (model 2, Table 8). The sample variogram of the residuals from the fit of model 2 is given in Figure 10(b) and indicates changes with row displacement and an increasing semivariance with column displacement when row displacement is 0, providing further evidence for row effects. Adding row effects (model 3) increases the REML log-likelihood by 4.8 units and substantially reduces the column autoregressive coefficient from 0.669 to 0.371 (Table 8). The sample variogram of the residuals for this model is presented in Figure 10(c).
This is in reasonable agreement with the theoretical variogram for an AR1 × AR1 process, although the edging of the variogram for low column displacement suggests the possibility of column effects. In particular, notice that the semivariance consistently increases to lag 1 then drops to lag 2. Including column effects (model 4) increased the REML log-likelihood by 2.3 units, a modest
Table 6: Variety codes and yields for the Slate Hall data
13 15 11 12 14 14 1 22 18 10 25 11 9 2 18 3245 3230 3125 3365 2945 2865 2795 2770 2590 2310 2305 2650 2625 2960 2555 : : : : : 23 25 21 22 24 3 20 11 7 24 7 23 16 14 5 : 8 : 10 : 6 : 7 : 9 6 23 19 15 2 13 4 22 20 6 3520 3305 3240 3155 2700 2825 2700 2840 3190 2515 3040 2335 3060 3000 2885 : : : : : 18 20 16 17 19 17 9 5 21 13 1 17 15 8 24 : : : : : 3 5 1 2 4 25 12 8 4 16 19 10 3 21 12 9 14 4 19 24 19 11 22 8 5 10 4 11 17 23 2850 3335 3085 3055 2390 2985 3090 2730 3325 2415 2850 2535 2820 2415 2450 7 12 2 17 22 6 3 14 25 17 21 20 2 8 14 3070 3705 3110 3155 3125 3005 2895 2945 2235 2425 2650 2805 2815 3115 2870 8 13 3 18 23 23 20 1 12 9 12 6 18 24 5 3355 3560 2885 3150 3045 3015 3295 2305 3370 2360 3330 2500 2890 2310 2755 6 11 1 16 21 2 24 10 16 13 3 22 9 15 16 3030 3455 3165 3095 2805 2940 2400 2745 2760 2725 2645 2735 2930 3285 2600 10 15 5 20 25 15 7 18 4 21 19 13 25 1 7 3155 3205 2855 3190 2440 3185 3105 2520 2815 2420 2910 2620 2395 2905 2660
3055 2745 2885 2970 2295 2470 2825 2645 3125 1910 2965 2450 2925 3225 2930
3085 2990 3120 3155 2770 2770 2805 2965 2715 2385 2765 2450 3180 3145 2110
3380 3390 3265 3080 2925 2640 3410 3365 3055 2245 2725 2610 3005 2575 3260
Table 7: Overview of models fitted to the Slate Hall 1978 wheat trial

          Variance model                                                  Variance      REML log-likelihood‡
Model     Global/Extraneous†            Natural                           parameters    value       change
lattice   rep + rep.row + rep.column                                          4         -671.7
1                                       AR1 × AR1                             3         -670.4       1.3
--------------------------------------------------------------------------------------------------------
2         lin(row)                      AR1 × AR1                             3         -664.5
3         lin(row) + row                AR1 × AR1                             4         -659.7       4.8
4         lin(row) + row + column       AR1 × AR1                             5         -657.4       2.3
5         lin(row) + row + column       AR1 × AR1 + measurement error         6         -657.0       0.4

†lin(row) represents the linear regression of yield on the row index included in X; rep, row and column represent factors based on the replicate, row and column indexes included in Z.
‡The likelihood ratio test may not be used to compare models separated by a horizontal line.
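The changes in REML log-likelihood in Table 7 can be converted into approximate likelihood ratio tests when two nested models differ by one variance parameter and share the same fixed effects. The sketch below is illustrative only. This section does not state the reference distribution used, so the one degree of freedom chi-square distribution, and the halving of the p-value when the added parameter is a variance component tested on the boundary of its parameter space, are assumptions made here. The function names are hypothetical.

```python
import math


def chi2_1_sf(x):
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def lrt_one_parameter(loglik_change, boundary=False):
    """Likelihood ratio statistic and approximate p-value for one added parameter.

    loglik_change: increase in REML log-likelihood from adding the parameter.
    boundary: if True, halve the p-value (assumed convention for a variance
              component whose null value of zero lies on the boundary).
    """
    stat = 2.0 * loglik_change
    p_value = chi2_1_sf(stat)
    if boundary:
        p_value *= 0.5
    return stat, p_value


if __name__ == "__main__":
    # Changes taken from Table 7: model 3 (row), model 4 (column),
    # model 5 (measurement error).
    for label, change in [("row", 4.8), ("column", 2.3), ("measurement error", 0.4)]:
        stat, p = lrt_one_parameter(change, boundary=True)
        print(f"{label}: statistic = {stat:.1f}, approximate p = {p:.4f}")
```

Only models within the same block of Table 7 should be compared this way, since REML log-likelihoods from models with different fixed effects are not comparable.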
Table 8: REML estimates of the variance parameters for the Slate Hall wheat trial

         Global      Extraneous               Natural variation                  Error
Model    lin(row)    row         column       row      column    measurement    variance
         slope       variance    variance
1                                             0.260    0.745                     62027
2        -30.4                                0.179    0.669                     47117
3        -31.8       20616                    0.234    0.371                     25830
4        -31.7       20293       2518         0.125    0.439                     23945
5        -31.0       18953       2428         0.200    0.606     5745            19729
Figure 9: Trellis plots of the residuals from fitting the AR1 × AR1 model to the Slate Hall wheat data. Each panel plots residual yield (model 1) against row number for one column.
Figure 10: Sample variograms calculated from fitting (a) model 1: AR1 × AR1, (b) model 2: lin(row) + AR1 × AR1, (c) model 3: lin(row) + rows + AR1 × AR1, (d) model 5: lin(row) + rows + columns + AR1 × AR1 + measurement error to the Slate Hall wheat data. Axes in each panel: row displacement and column displacement.
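A sample variogram of the kind shown in Figure 10 can be sketched as below. This is a simplified illustration, not the definition given in Section 3.2: it assumes a complete rectangular array of residuals with no missing plots, pools positive and negative displacements by taking absolute row and column differences, and uses the hypothetical function name `sample_variogram`.

```python
def sample_variogram(resid):
    """Average half squared difference of residuals at each (row, column) lag.

    resid: list of rows, each a list of residuals (complete rectangular grid).
    Returns a matrix v where v[i][j] is the semivariance at row displacement i
    and column displacement j; v[0][0] is set to 0.
    """
    n_rows, n_cols = len(resid), len(resid[0])
    sums = [[0.0] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]
    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    for k, (r1, c1) in enumerate(cells):
        for r2, c2 in cells[k + 1:]:  # each unordered pair of plots once
            i, j = abs(r1 - r2), abs(c1 - c2)
            sums[i][j] += 0.5 * (resid[r1][c1] - resid[r2][c2]) ** 2
            counts[i][j] += 1
    return [
        [sums[i][j] / counts[i][j] if counts[i][j] else 0.0 for j in range(n_cols)]
        for i in range(n_rows)
    ]


if __name__ == "__main__":
    residuals = [[0.0, 1.0], [2.0, 3.0]]  # toy 2 x 2 grid, not trial data
    for row in sample_variogram(residuals):
        print(row)
```

The first row and first column of the returned matrix correspond to the margins plotted in Figures 7 and 11. Ordinates at large displacements are averages over few pairs and should be read with caution.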
amount. Finally, we include a measurement error term; there is little change in the REML log-likelihood. The sample variogram is given in Figure 10(d). Figure 11 presents the first row and column of the sample variograms for models 4 and 5 together with the fitted semivariances. The minor improvement in the REML log-likelihood from model 4 to 5 is reflected in this figure.
Figure 11: Plot of the observed (points) and fitted (lines) margins of the sample variogram of the residuals after fitting models 4 and 5 to the Slate Hall wheat data. Panels: model 4 and model 5; horizontal axis: displacement.

In our experience, the inclusion of the measurement error component often results in only a minor improvement in the REML log-likelihood. We believe it makes biological and statistical sense to include it in the model whenever possible. However, when the autoregressive parameters are near zero, it is often very difficult or impossible to estimate this component. The regular spatial arrangement probably also contributes to the difficulty of estimating it using REML in some trials. This is an interesting problem requiring further work.
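The effect of a measurement error term on the fitted variogram can be illustrated with a short sketch. It assumes the plot error is the sum of a stationary AR1 × AR1 process with variance `sigma2` and an independent measurement error with variance `nugget`, so that the semivariance at a non-zero displacement is sigma2 (1 - rho_row^i rho_col^j) + nugget. The function name is hypothetical, and reading the Table 8 columns for model 5 as these arguments is an assumption made for the example.

```python
def ar1xar1_variogram(row_lag, col_lag, sigma2, rho_row, rho_col, nugget=0.0):
    """Semivariance of an AR1 x AR1 process plus independent measurement error.

    row_lag, col_lag: non-negative integer displacements in rows and columns.
    The variogram is 0 at zero displacement and jumps by `nugget` immediately
    away from the origin, which produces the discontinuity at (0,0).
    """
    if row_lag == 0 and col_lag == 0:
        return 0.0
    return sigma2 * (1.0 - rho_row ** row_lag * rho_col ** col_lag) + nugget


if __name__ == "__main__":
    # Values read from Table 8, model 5 (assumed mapping of columns to arguments).
    params = dict(sigma2=19729.0, rho_row=0.200, rho_col=0.606, nugget=5745.0)
    for lag in range(4):
        print(lag,
              round(ar1xar1_variogram(lag, 0, **params), 1),
              round(ar1xar1_variogram(0, lag, **params), 1))
```

When both autoregressive parameters are close to zero the correlated part contributes almost sigma2 at every non-zero lag, so only the sum sigma2 + nugget is well determined, which is consistent with the estimation difficulty described above.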
Table 9 presents a summary of the bias in variety effect estimates, calculated as described in equation (6) relative to the variety effects for model 5, together with Wald F-statistics and the average sed from the 5 spatial models and the lattice analysis. The generalised least squares estimates of the variety effects from these analyses are plotted in Figure 12. The estimated variety effects are in fairly close agreement, in contrast to the previous example. The higher replication, more robust design and less trend are all contributing factors.

Table 9: Summary of variety effects for the Slate Hall wheat trial

Model     Bias    Wald F(24, ν) statistic    ν      Average sed
lattice   19.4           14.5               125        81.4
1         15.5           24.6               125        76.2
2         15.0           22.0               124        78.4
3         10.3           19.8               124        79.2
4          3.7           21.3               124        76.0
5           -            19.9               124        76.4
5 Conclusions
The primary aim of this paper has been to extend the spatial models proposed by Cullis and Gleeson (1991) and other approaches such as Zimmerman and Harville (1991) and Martin (1990). More recently, Cressie and Hartfield (1996) considered the spatial analysis of field trials using a conditionally specified Gaussian model for the plot errors. Our extension attempts to recognise the sources of variation in field experiments. These sources, namely large scale variation (global trend), extraneous variation and small scale (local or natural) variation, are common in data from small plot field experiments.

Given the examples in this paper and our experience over the last decade, it is not appropriate to assume one spatial model will suit all trials. The nonstationarity observed in some field trials is due to global and/or extraneous variation. Although differencing is one method of handling it, it is often wasteful of degrees of freedom and information on varieties. We believe it is more appropriate to model global trend using polynomials and spline smoothers, and to account for extraneous variation by extending spatial models to include design effects. This approach often leads to an understanding of the causes of the extraneous variation, which in turn leads to improved experimental technique aimed at reducing the variation in future trials. It is in direct contrast to the approaches advocated by several authors (Cullis and Gleeson, 1991; Wilkinson et al., 1983; Green et al., 1985). Kempton et al. (1994) suggest that only one spatial model should be fitted to all trials. They show that the ARIMA(0,1,1) × ARIMA(0,1,1) first difference spatial model is inefficient for many trials. This inefficiency may be due to the fact that it discards variety information unless there are strong row/column effects, or rows and columns are orthogonal to varieties (Kempton et al., 1994). Consequently it performed poorly in trials with less spatial variation.

Figure 12: Comparison of variety effects estimated by fitting several variance models to the Slate Hall wheat data. Panels: m1, m2, m3, m4, m5 and lattice.

Our approach is to choose a spatial model which is consistent, to the best of our knowledge, with the data. A reanalysis of the data presented by Kempton et al. (1994) showed that the AR1 × AR1 model had a higher REML log-likelihood than the incomplete block analysis in 99 of 163 trials. However, extraneous row and/or column variation was important in 44 trials and linear row and/or column covariates were important in 83 trials.

Our model is not dissimilar to the model proposed by Cressie and Hartfield (1996), although our approach to modelling is quite different. We include design effects only if there is evidence to suggest they are needed. There is some similarity between the AR1 × AR1 variance model for the plot errors we use and the conditionally specified Gaussian model they present. Again, we would use any variance model which was consistent with the data.

The general superiority of the AR1 × AR1 model over the IB model justifies its use as an initial model for spatial analysis. We have found that assuming independence of plot errors for an initial model can be misleading in the subsequent identification of the variance model for plot errors. Use of the AR1 × AR1 model as an initial model allows for a more accurate assessment of the presence of global and extraneous variation or outliers. Knowledge of the experiment and close collaboration with the scientist are strongly encouraged during the modelling process.
We note that spatial models were developed for analysing quantitative traits such as yield in field trials. These models may not necessarily be appropriate for other traits such as disease incidence, or in experiments with diverse treatments where different processes contribute to the error structure.

The frequency of extraneous variation in field experiments, expressed as row and column effects, both for these data and for other data from field experiments conducted in Australia, vindicates the use of designs such as alpha-latinised row-column designs (Williams and John, 1989). It also highlights the need for experimenters to become aware of the consequences of their cultural and plot techniques on the analysis of field experiments. Work is currently underway to search for improved designs which have high efficiency assuming an AR1 × AR1 + row + column + measurement error variance structure.
ACKNOWLEDGEMENTS
We thank Rob Kempton for making the UK variety trial data available. The financial support of the Grains Research and Development Corporation of Australia is gratefully acknowledged. We dedicate this paper to the memory of our colleague Cheryl Wilson, whose work is referenced and who was accidentally and tragically killed in July, 1996.
References
Becker, R. A., Chambers, J. M., & Wilks, A. R. 1988. The new S language. Wadsworth and Brooks/Cole.
Besag, J., & Kempton, R. A. 1986. Statistical analysis of field experiments using neighbouring plots. Biometrics, 42, 231-251.
Cressie, N. A. C. 1991. Statistics for spatial data. John Wiley and Sons.
Cressie, N. A. C., & Hartfield, M. N. 1996. Conditionally-specified Gaussian models for spatial statistical analysis of field trials. Journal of Agricultural, Biological and Environmental Statistics, 1, 0-0.
Cullis, B. R., & Gleeson, A. C. 1991. Spatial analysis of field experiments - an extension to two dimensions. Biometrics, 47, 1449-1460.
Cullis, B. R., Gogel, B. J., Verbyla, A. P., & Thompson, R. 1997. Spatial analysis of multi-environment early generation trials. Biometrics, 53, 00-00. (submitted).
Diggle, P. J., Liang, K-Y., & Zeger, S. L. 1994. Analysis of longitudinal data. Clarendon Press, Oxford.
Gilmour, A. R., Thompson, R., & Cullis, B. R. 1995. Average Information REML, an efficient algorithm for variance parameter estimation in linear mixed models. Biometrics, 51, 1440-1450.
Gilmour, A. R., Thompson, R., Cullis, B. R., & Welham, S. 1996. ASREML. Biometric Bulletin. NSW Agriculture.
Gleeson, A. C., & Cullis, B. R. 1987. Residual maximum likelihood (REML) estimation of a neighbour model for field experiments. Biometrics, 43, 277-288.
Green, P. J., Jennison, C., & Seheult, A. H. 1985. Analysis of field experiments by least squares smoothing. Journal of the Royal Statistical Society, Series B, 47, 299-315.
Green, P. J., & Silverman, B. W. 1994. Nonparametric Regression and Generalized Linear Models. A roughness penalty approach. Chapman and Hall.
Grondona, M. O., Crossa, J., Fox, P. N., & Pfeiffer, W. H. 1996. Analysis of variety yield trials using two-dimensional separable ARIMA processes. Biometrics, 52, 763-770.
Kempton, R. A., Seraphin, J. C., & Sword, A. M. 1994. Statistical analysis of two-dimensional variation in variety yield trials. Journal of Agricultural Science, Cambridge, 122, 335-342.
Kenward, M. G., & Roger, J. H. 1997. The precision of fixed effects estimates from restricted maximum likelihood. Biometrics, 53, 000-000.
Lill, W. J., Gleeson, A. C., & Cullis, B. R. 1988. Relative accuracy of a neighbour model for field trials. Journal of Agricultural Science, Cambridge, 111.
Martin, R. J. 1990. The use of time-series models and methods in the analysis of agricultural field trials. Communications in Statistics, 19, 55-81.
Payne, R. W., Lane, P. W., Digby, P. G. N., Harding, S. A., Leech, P. K., Morgan, G. W., Todd, A. D., Thompson, R., Tunnicliffe Wilson, G., Welham, S. J., & White, R. P. 1993. Genstat 5 Release 3 Reference Manual. Oxford: Oxford University Press.
Verbyla, A. P., Cullis, B. R., Kenward, M. G., & Welham, S. J. 1997. Smoothing splines in the analysis of designed experiments and longitudinal data. submitted, 00, 00-00.
Wilkinson, G. N., Eckert, S. R., Hancock, T. W., & Mayo, O. 1983. Nearest neighbour (NN) analysis of field experiments. Journal of the Royal Statistical Society, Series B, 45, 151-211.
Williams, E. R., & John, J. A. 1989. Construction of row and column designs with contiguous replicates. Applied Statistics, 38, 149-154.
Wilson, C. A. 1994. An errors-in-variables spatial mixed model. Tech. rept. University of New England.
Zimmerman, D. L., & Harville, D. A. 1991. A random field approach to the analysis of field plot experiments. Biometrics, 47, 223-239.
