We have seen this model before (a log-log model), and it can introduce log-normal bias into the regression model. However, this bias can be corrected (see Baskerville 1972, Flewelling and Pienaar 1981, Snowdon 1991).
$$Y_i = \beta_0 \left(1 - \exp(-\beta_1 X_i)\right)$$
Normal Equations
Recall from OLS, we want to minimize:
$$Q = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2$$
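where $Y_i$ = vector of dependent variables, $X_i$ = vector of the independent variables, $\beta$ = vector of the regression parameters, and $\varepsilon_i$ = vector of error terms; i.e.:

$$Y_i = \begin{bmatrix} Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{iq} \end{bmatrix}, \quad
X_i = \begin{bmatrix} X_{i1} \\ X_{i2} \\ \vdots \\ X_{iq} \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix}, \quad
\varepsilon_i = \begin{bmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{iq} \end{bmatrix}$$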
Now, minimize Q:
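$$Q = \sum_{i=1}^{n} \left[ Y_i - f(X_i, \beta) \right]^2$$

$$\frac{\partial Q}{\partial \beta_k} = -2 \sum_{i=1}^{n} \left[ Y_i - f(X_i, \beta) \right] \left[ \frac{\partial f(X_i, \beta)}{\partial \beta_k} \right], \quad k = 0, 1, \ldots, p-1$$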
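Setting these partial derivatives equal to zero and replacing $\beta$ with $g$, the $p \times 1$ vector of least squares estimates $g_k$, gives the normal equations.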
The normal equations are nonlinear in gk. They have no closed form solutions. So, iterative procedures are necessary to solve the equations (e.g., Gauss - Newton, Marquardt, Method of Steepest Descent).
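As an illustration of the iterative idea, the sketch below applies Gauss-Newton to the two-parameter mean function $g_0(1 - \exp(-g_1 X))$. The function names, starting values, and synthetic data are assumptions made for this example. Step halving is a common safeguard added here; it is not something the notes prescribe.

```python
import math


def f(x, g0, g1):
    """Assumed mean function for this sketch: g0 * (1 - exp(-g1 * x))."""
    return g0 * (1.0 - math.exp(-g1 * x))


def sse(xs, ys, g0, g1):
    """Least squares criterion Q evaluated at (g0, g1)."""
    return sum((y - f(x, g0, g1)) ** 2 for x, y in zip(xs, ys))


def gauss_newton(xs, ys, g0, g1, max_iter=100, tol=1e-10):
    """Iterate g <- g + (D'D)^-1 D'e, halving the step if Q would increase."""
    q = sse(xs, ys, g0, g1)
    for _ in range(max_iter):
        # Accumulate the 2x2 matrix D'D and the 2x1 vector D'e.
        a00 = a01 = a11 = c0 = c1 = 0.0
        for x, y in zip(xs, ys):
            ex = math.exp(-g1 * x)
            d0 = 1.0 - ex        # partial derivative of f w.r.t. g0
            d1 = g0 * x * ex     # partial derivative of f w.r.t. g1
            e = y - g0 * d0      # current residual
            a00 += d0 * d0
            a01 += d0 * d1
            a11 += d1 * d1
            c0 += d0 * e
            c1 += d1 * e
        # Solve the linearized 2x2 system by Cramer's rule.
        det = a00 * a11 - a01 * a01
        s0 = (a11 * c0 - a01 * c1) / det
        s1 = (a00 * c1 - a01 * c0) / det
        # Step halving: shrink the step until Q does not increase.
        lam = 1.0
        while lam > 1e-8 and sse(xs, ys, g0 + lam * s0, g1 + lam * s1) > q:
            lam *= 0.5
        g0 += lam * s0
        g1 += lam * s1
        q = sse(xs, ys, g0, g1)
        if abs(lam * s0) + abs(lam * s1) < tol:
            break
    return g0, g1


if __name__ == "__main__":
    # Synthetic, noise-free data generated from g0 = 100, g1 = 0.05.
    xs = [5.0 * i for i in range(1, 13)]
    ys = [f(x, 100.0, 0.05) for x in xs]
    print(gauss_newton(xs, ys, 80.0, 0.03))
```

Poor starting values can still cause slow convergence or failure, which is one reason packages also offer alternatives such as Marquardt's method.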
$$MSE = \frac{\sum_{i=1}^{n} \left[ Y_i - f(X_i, g) \right]^2}{n - p}$$
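MSE is biased, but the bias is small when n is large. Because the normal equations have no closed-form solution, exact variances, confidence intervals, $R^2$, F-tests, etc. are not available. However, we have a theorem that allows us to calculate asymptotic variances (see Theorem 13.32 on page 528):

Theorem: When $\varepsilon_i \sim N(0, \sigma^2)$ and n is large, the sampling distribution of $g$ is approximately normal with $E[g] \approx \beta$ and approximate variance-covariance matrix $\sigma^2 (D'D)^{-1}$, where $D$ is the $n \times p$ matrix of partial derivatives $\partial f(X_i, \beta) / \partial \beta_k$ evaluated at $g$.

Thus, the estimated variance-covariance matrix is $s^2(g) = MSE \, (D'D)^{-1}$. So, we can calculate asymptotic t-values: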
$$\frac{g_k - \beta_k}{s(g_k)} \sim t(n - p), \quad k = 0, 1, \ldots, p-1$$

where $s(g_k)$ is the asymptotic standard error of $g_k$.
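The sketch below shows one way the asymptotic standard errors could be computed from MSE and the inverse of $D'D$. It assumes the two-parameter mean function $g_0(1 - \exp(-g_1 X))$ and that the estimates have already been obtained by some iterative fit. The function name and the data are hypothetical.

```python
import math


def asymptotic_se(xs, ys, g0, g1):
    """Return (MSE, s(g0), s(g1)) for the assumed model g0 * (1 - exp(-g1 * x)).

    The estimated variance-covariance matrix is MSE * (D'D)^-1, where D holds
    the partial derivatives of the mean function evaluated at (g0, g1).
    """
    n, p = len(xs), 2
    a00 = a01 = a11 = sse = 0.0
    for x, y in zip(xs, ys):
        ex = math.exp(-g1 * x)
        d0 = 1.0 - ex        # partial derivative w.r.t. g0
        d1 = g0 * x * ex     # partial derivative w.r.t. g1
        sse += (y - g0 * d0) ** 2
        a00 += d0 * d0
        a01 += d0 * d1
        a11 += d1 * d1
    mse = sse / (n - p)
    det = a00 * a11 - a01 * a01
    # Diagonal of the inverse of the 2x2 matrix [[a00, a01], [a01, a11]].
    var_g0 = mse * a11 / det
    var_g1 = mse * a00 / det
    return mse, math.sqrt(var_g0), math.sqrt(var_g1)


if __name__ == "__main__":
    # Hypothetical data; (100, 0.05) stands in for estimates from a prior fit.
    xs = [10.0, 20.0, 30.0, 40.0, 50.0]
    ys = [40.1, 62.5, 78.4, 85.9, 92.3]
    mse, se0, se1 = asymptotic_se(xs, ys, 100.0, 0.05)
    # Asymptotic t-values for the hypotheses beta_k = 0.
    print(100.0 / se0, 0.05 / se1)
```

These t-values are only approximate because the result rests on large-sample theory.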
Weighted NLS
When we have non-constant variances, we can use weights to eliminate heteroscedasticity. Recall that in weighted linear regression $b = (X'WX)^{-1} X'WY$. When the variances, $\sigma_i^2$, are not constant, we choose weights, $w_i$, that are inversely proportional to $\sigma_i^2$, so that $\sigma_i^2 = \sigma^2 / w_i$. Though there are different ways to detect heteroscedasticity, we will examine residual plots for our example. We will then find weights that will eliminate it. To determine which weight is the best, we will use Furnival's Index of Fit (Furnival 1961), just as we did with weighted linear regression earlier.
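To make the role of the weights concrete, the sketch below builds weights of the form $1/X^{k}$ and evaluates the weighted least squares criterion $\sum w_i (Y_i - \hat{Y}_i)^2$. It also shows that the same criterion results from multiplying both the observations and the fitted values by $\sqrt{w_i}$, which is the transformation route for software without a weight option. All function names and numbers are illustrative assumptions.

```python
import math


def power_weights(xs, power):
    """Weights inversely proportional to x raised to `power`: w_i = 1 / x_i**power."""
    return [1.0 / x ** power for x in xs]


def weighted_sse(ys, fitted, ws):
    """Weighted criterion: sum of w_i * (y_i - fitted_i)^2."""
    return sum(w * (y - fit) ** 2 for y, fit, w in zip(ys, fitted, ws))


def transformed_sse(ys, fitted, ws):
    """Unweighted criterion after scaling both sides by sqrt(w_i)."""
    return sum(
        (math.sqrt(w) * y - math.sqrt(w) * fit) ** 2
        for y, fit, w in zip(ys, fitted, ws)
    )


if __name__ == "__main__":
    ages = [10.0, 20.0, 40.0, 80.0]        # hypothetical ages
    heights = [22.0, 41.0, 66.0, 90.0]     # hypothetical observations
    fitted = [20.0, 43.0, 63.0, 95.0]      # hypothetical fitted values
    for power in (0.5, 1.0, 1.5):
        ws = power_weights(ages, power)
        print(power, weighted_sse(heights, fitted, ws), transformed_sse(heights, fitted, ws))
```

Larger powers downweight the observations at older ages more strongly, where the residual spread is typically greatest.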
Example
The Chapman-Richards sigmoid growth model will be used in our example exercise. This model has seen extensive use in forestry, especially to model tree growth. The Chapman-Richards growth model quantitatively describes the growth of an organism as the difference between its anabolic (constructive) growth and its catabolic (destructive) growth. This relationship can be expressed by the differential equation:
$$\frac{dY}{dt} = \eta Y^{m} - \gamma Y,$$

where:
Y = size of the organism
t = time
anabolic growth = $\eta Y^{m}$ (i.e., proportional to the size of the organism, raised to the power $m$)
catabolic growth = $\gamma Y$ (i.e., proportional to the size of the organism).

This nonlinear first-order differential equation is a Bernoulli equation of the form:

$$\frac{dy}{dx} + a(x)\,y = f(x)\,y^{n}.$$

With the substitution $z = Y^{1-m}$, this Bernoulli equation can be solved as a linear first-order differential equation by separation of variables to give the solution (3) in the earlier section, True Nonlinear Models. To bring this model into the context of our biological example, we will replace Y with S = size of organism and X with A = age of organism.
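The algebra behind that solution can be sketched as follows. The symbol names are chosen here for illustration: the anabolic term is written as $\eta Y^{m}$ and the catabolic term as $\gamma Y$. The sketch also assumes $0 < m < 1$ and the initial condition $Y(0) = 0$, neither of which is stated in the text.

```latex
\begin{align*}
\frac{dY}{dt} &= \eta Y^{m} - \gamma Y
  && \text{(growth equation, assumed symbols)} \\
z = Y^{1-m} \;\Rightarrow\;
\frac{dz}{dt} &= (1-m)\,Y^{-m}\,\frac{dY}{dt} = (1-m)\,(\eta - \gamma z)
  && \text{(linear and separable in } z\text{)} \\
z(t) &= \frac{\eta}{\gamma}\left(1 - e^{-\gamma (1-m)\, t}\right)
  && \text{(using } z(0) = 0\text{)} \\
Y(t) &= \left(\frac{\eta}{\gamma}\right)^{\frac{1}{1-m}}
        \left(1 - e^{-\gamma (1-m)\, t}\right)^{\frac{1}{1-m}}
  && \text{(back-substituting } Y = z^{1/(1-m)}\text{)}
\end{align*}
```

Under these assumptions the solution has the three-parameter form $\beta_0 (1 - \exp(-\beta_1 t))^{\beta_2}$, with $\beta_0 = (\eta/\gamma)^{1/(1-m)}$, $\beta_1 = \gamma(1-m)$, and $\beta_2 = 1/(1-m)$.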
Though considered an empirical model, its parameters do lend themselves to biological interpretation. A sigmoid growth form has an asymptote for the maximum size of an organism:
[Figure: sigmoid curve of SIZE versus AGE, leveling off at the asymptote]
This asymptote is represented by $\beta_0$. The $\beta_1$ and $\beta_2$ parameters together define the shape of the curve. The first derivative of S with respect to A gives the growth rate, which is fastest at the inflection point (i.e., the point of greatest slope on the curve):
$$\frac{\partial S}{\partial A} = \beta_2 \beta_0 \left(1 - \exp(-\beta_1 A)\right)^{\beta_2 - 1} \beta_1 \exp(-\beta_1 A)$$
[Figure: GROWTH versus AGE, with the peak marked as the inflection point]
The second derivative identifies where the growth rate is increasing and decreasing over time:
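$$\frac{\partial^2 S}{\partial A^2} = -\beta_2 \beta_0 \left(1 - \exp(-\beta_1 A)\right)^{\beta_2 - 1} \beta_1^2 \exp(-\beta_1 A) + \beta_2 (\beta_2 - 1) \beta_0 \left(1 - \exp(-\beta_1 A)\right)^{\beta_2 - 2} \left[\beta_1 \exp(-\beta_1 A)\right]^2$$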
[Figure: RATE OF GROWTH versus AGE, crossing 0 at the inflection point]
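Setting the second derivative to zero gives $\exp(-\beta_1 A) = 1/\beta_2$, so for $\beta_2 > 1$ the inflection point falls at $A = \ln(\beta_2)/\beta_1$, where the size is $\beta_0 (1 - 1/\beta_2)^{\beta_2}$. The sketch below checks this numerically. The function names are hypothetical, and the parameter values are the first set of estimates from the results table for the example, used purely as an illustration.

```python
import math


def size(a, b0, b1, b2):
    """Chapman-Richards size at age a: b0 * (1 - exp(-b1 * a)) ** b2."""
    return b0 * (1.0 - math.exp(-b1 * a)) ** b2


def growth(a, b0, b1, b2):
    """First derivative of size with respect to age (the growth rate)."""
    u = 1.0 - math.exp(-b1 * a)
    return b2 * b0 * u ** (b2 - 1.0) * b1 * math.exp(-b1 * a)


def inflection_age(b1, b2):
    """Age at which the growth rate peaks; only defined here for b2 > 1."""
    if b2 <= 1.0:
        raise ValueError("no interior inflection point when b2 <= 1")
    return math.log(b2) / b1


if __name__ == "__main__":
    b0, b1, b2 = 83.5978, 0.0262, 1.3992   # illustrative values from the table
    a_star = inflection_age(b1, b2)
    print("inflection age:", a_star)
    print("size at inflection:", size(a_star, b0, b1, b2))
    print("peak growth rate:", growth(a_star, b0, b1, b2))
```

When $\beta_2 \le 1$ the growth rate is greatest at age zero and the curve has no interior inflection point, which is why the helper raises an error in that case.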
For a detailed review of the family of sigmoid growth models as well as the derivation of the generalized formulation of the sigmoid growth model, see Schnute (1981).
For our example, we will use weighted NLS to fit a Chapman-Richards growth model, $HT - 4.5 = b_0 (1 - \exp(-b_1 AGE))^{b_2}$, to 400 height-age measurements of Douglas-fir trees. This will produce what is commonly known in forestry as height-age curves, which show height development over time for trees of a given species. Heteroscedasticity is common in unweighted regression models of this nature, so we will use weights to eliminate it. For our example, I have provided a scatterplot of the predicted values and the residuals from unweighted NLS to show the heteroscedasticity:
The weights are reciprocals of age raised to various powers; we will use $1/A^{0.5}$, $1/A^{1}$, and $1/A^{1.5}$. We will use PROC NLIN to fit the models (NOTE: other computer software packages can perform nonlinear regression, such as SPSS, JMP, BMDP, MINITAB, and SYSTAT; to the best of my knowledge, only SAS performs weighted NLS directly, so in the other packages weighted NLS must be done by transformations).
8.8443
$\sum \ln AGE = 1330.25$
b0                   b1                    b2                  Weight
83.5978 (6.5370)     0.0262 (0.00500)      1.3992 (0.1593)
85.7294 (7.2133)     0.0246 (0.00453)      1.3495 (0.1245)
90.5040 (8.9869)     0.0217 (0.00418)      1.2755 (0.0959)
105.0 (15.6143)      0.0163 (0.00402)      1.1584 (0.0729)     1/X^1.5
226.2 (143.1)        0.00509 (0.00419)     0.9737 (0.0547)     1/X^2
NOTE: asymptotic standard errors are inside the parentheses. The best fit model is indicated in RED font.
BIBLIOGRAPHY
Baskerville, G.L. 1972. Use of logarithmic regression in the estimation of plant biomass. Can. J. For. Res. 2:49-53.
Chapman, D.G. 1961. Statistical problems in population dynamics. In: Proc. Fourth Berkeley Symp. Math. Stat. and Prob. Univ. Calif. Press, Berkeley.
Draper, N.R., and H. Smith. 1998. Applied Regression Analysis, 3rd edition. John Wiley and Sons, Inc., New York.
Flewelling, J.W., and L.V. Pienaar. 1981. Multiplicative regression with lognormal errors. For. Sci. 27:281-289.
Furnival, G.M. 1961. An index for comparing equations used in constructing volume tables. For. Sci. 7:337-341.
Gallant, A.R. 1987. Nonlinear Statistical Models. John Wiley and Sons, Inc., New York.
Greene, W.H. 1992. Econometric Analysis, 2nd edition. Macmillan Publishing Company, New York.
Pienaar, L.V., and K.J. Turnbull. 1973. The Chapman-Richards generalization of Von Bertalanffy's growth model for basal area growth and yield in even-aged stands. For. Sci. 19:2-22.
Richards, F.J. 1959. A flexible growth function for empirical use. J. Exp. Bot. 10(29):290-300.
SAS. 1999. SAS/STAT User's Guide, Version 8. SAS Institute, Inc., Cary, North Carolina.
Schnute, J. 1981. A versatile growth model with statistically stable parameters. Can. J. Fish. Aquat. Sci. 38:1128-1140.
Snowdon, P. 1991. A ratio estimator for bias correction in logarithmic regressions. Can. J. For. Res. 21:720-724.