Stata Library - Panel Data Analysis Using GEE

http://www.ats.ucla.edu/stat/stata/library/gee.htm
Note: This page uses Stata 11 syntax in the examples.
Introduction
Panel data analysis, also known as cross-sectional time-series analysis, looks at a group of people, the 'panel,' on more
than one occasion. Panel studies are essentially equivalent to longitudinal studies, although there may be many response
variables observed at each time point.
These data are from a 1996 study (Gregoire, Kumar, Everitt, Henderson & Studd) on the efficacy of estrogen patches in treating postnatal depression. Women
were randomly assigned to either a placebo control group (group=0, n=27) or estrogen patch group (group=1, n=34). Prior to the first treatment all patients took
the Edinburgh Postnatal Depression Scale (EPDS). EPDS data were collected monthly for six months once the treatment began. Higher scores on the EPDS
are indicative of higher levels of depression.
Before reading in the data we will need to change the size of the largest matrix that Stata can use. We need to do this because one of the analyses requires a
large number of coded variables:
set matsize 160

use http://www.ats.ucla.edu/stat/stata/library/depress, clear
Let the analyses begin

Note that the data are in the wide format. We will collect some information and perform two analyses while the data are in
this format.
sort group
by group: summarize pre dep1 dep2 dep3 dep4 dep5 dep6
-> group = 0

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
     pre |      27    20.77778    3.954874        15         28
    dep1 |      27    16.48148    5.279644         7         26
    dep2 |      22    15.88818    6.124177         4         27
    dep3 |      17    14.12882    4.974648      4.19         22
    dep4 |      17    12.27471    5.848791         2         23
    dep5 |      17    11.40294    4.438702      3.03         18
    dep6 |      17    10.89588     4.68157      3.45         20

-> group = 1

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
     pre |      34    21.24882    3.574432        15         28
    dep1 |      34    13.36794    5.556373         1         27
    dep2 |      31    11.73677    6.575079         1         27
    dep3 |      29    9.134138    5.475564         1         24
    dep4 |      28    8.827857    4.666653         0         22
    dep5 |      28    7.309286    5.740988         0         24
    dep6 |      28    6.590714    4.730158         1         23

corr pre dep1 dep2 dep3 dep4 dep5 dep6
(obs=45)

         |      pre     dep1     dep2     dep3     dep4     dep5     dep6
---------+---------------------------------------------------------------
     pre |   1.0000
    dep1 |   0.1922   1.0000
    dep2 |   0.3904   0.4982   1.0000
    dep3 |   0.3958   0.5258   0.8672   1.0000
    dep4 |   0.1658   0.3933   0.7357   0.7831   1.0000
    dep5 |   0.2848   0.3674   0.7500   0.8520   0.8449   1.0000
    dep6 |   0.2688   0.2795   0.6900   0.7967   0.7894   0.9014   1.0000

7/3/2013 3:15 PM
graph matrix dep1 dep2 dep3 dep4 dep5 dep6, half
Let's check to see if the groups differ on the pretest depression score:
ttest pre, by(group)

Two-sample t test with equal variances

------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |      27    20.77778    .7611158    3.954874    19.21328    22.34227
       1 |      34    21.24882      .61301    3.574432    20.00165      22.496
---------+--------------------------------------------------------------------
combined |      61    21.04033     .476678    3.722975    20.08683    21.99383
---------+--------------------------------------------------------------------
    diff |           -.4710457    .9658499               -2.403707    1.461615
------------------------------------------------------------------------------
Degrees of freedom: 59

                  Ho: mean(0) - mean(1) = diff = 0

     Ha: diff < 0               Ha: diff ~= 0              Ha: diff > 0
       t =  -0.4877                t =  -0.4877              t =  -0.4877
   P < t =   0.3138          P > |t| =   0.6276          P > t =   0.6862

There isn't much of a difference between groups on the pretest so let's continue on to the panel data analysis.
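The pretest comparison can be reproduced from the summary statistics alone. The following is a minimal sketch, assuming only the group sizes, means, and standard deviations printed in the summarize output above; it uses Stata's immediate command ttesti, so no data set needs to be in memory.

```stata
* Immediate two-sample t test: n, mean, and SD for group 0, then for group 1
* (numbers copied from the summarize output; equal variances are assumed by default)
ttesti 27 20.77778 3.954874 34 21.24882 3.574432

* The stored results hold the t statistic and the two-sided p-value
display "t = " %7.4f r(t) "   two-sided p = " %6.4f r(p)
```

The t statistic and p-value should agree with the ttest output above up to rounding of the inputs.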
GEE with Continuous Response Variable

In order to use these data for our panel data analysis, the data must be reorganized into the long form using the reshape command.
reshape long dep, i(subj) j(visit)

(note: j = 1 2 3 4 5 6)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                       61   ->     366
Number of variables                   9   ->       5
j variable (6 values)                     ->   visit
xij variables:
                     dep1 dep2 ... dep6   ->   dep
-----------------------------------------------------------------------------
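To make the wide-to-long step concrete, here is a small sketch on a made-up data set. The two subjects and their scores are hypothetical, not the study data; only the variable names (subj, group, pre, dep1 and so on) follow the tutorial.

```stata
* Hypothetical two-subject data set in wide format (not the study data)
clear
input subj group pre dep1 dep2 dep3
1 0 18 17 15 12
2 1 21 14  . 10
end

* One row per subject-visit; the stub "dep" collects dep1-dep3 into dep
reshape long dep, i(subj) j(visit)
list, sepby(subj)

* reshape keeps the rows whose dep is missing; estimation commands skip them later
count if missing(dep)
```

This is why the reshape report above shows 366 rows (61 subjects times 6 visits) while the models that follow use only the 295 non-missing observations.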
Before we begin the panel data analyses, let's look at some other analyses for comparison. We will begin with a repeated measures analysis of variance. This is
the analysis that requires the larger matrix size.
anova dep group / subj|group visit group#visit /, repeated(visit)

                           Number of obs =     295     R-squared     =  0.7699
                           Root MSE      = 3.39594     Adj R-squared =  0.6980

      Source |  Partial SS    df       MS           F     Prob > F
 ------------+----------------------------------------------------
       Model |  8643.81572    70  123.483082      10.71     0.0000
             |
       group |  548.494938     1  548.494938       5.60     0.0212
  subj|group |  5775.54143    59  97.8905328
 ------------+----------------------------------------------------
       visit |  1050.05444     5  210.010889      18.21     0.0000
 group#visit |  19.3028953     5  3.86057906       0.33     0.8916
             |
    Residual |  2583.26536   224  11.5324346
 ------------+----------------------------------------------------
       Total |  11227.0811   294  38.1873506

Between-subjects error term:  subj|group
                     Levels:  61        (59 df)
     Lowest b.s.e. variable:  subj
     Covariance pooled over:  group     (for repeated variable)

Repeated variable: visit
                                      Huynh-Feldt epsilon        =  0.5930
                                      Greenhouse-Geisser epsilon =  0.5532
                                      Box's conservative epsilon =  0.2000

                                  ------------ Prob > F ------------
      Source |     df      F     Regular    H-F      G-G      Box
 ------------+----------------------------------------------------
       visit |      5    18.21    0.0000   0.0000   0.0000   0.0001
 group#visit |      5     0.33    0.8916   0.7979   0.7840   0.5658
    Residual |    224
 -----------------------------------------------------------------

matrix list e(Srep)

symmetric e(Srep)[6,6]
           c1         c2         c3         c4         c5         c6
r1  31.361171
r2   15.71989  38.927914
r3  13.555927  28.365674   27.90249
r4  9.4625252   22.74371  20.519069  26.403025
r5  8.6149335  23.887935  23.161248   22.47211  28.026157
r6  4.6830378  19.242424  18.721233   18.46616  22.103924  22.204237

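Compound symmetry would require every pair of visits to share one correlation. As a rough check, the sketch below converts three entries of the pooled covariance matrix listed above into correlations by hand; the numbers are copied from that listing and the scalar names are illustrative.

```stata
* Correlation = covariance / sqrt(variance_i * variance_j), entries taken from e(Srep)
scalar r_12 = 15.71989  / sqrt(31.361171 * 38.927914)
scalar r_23 = 28.365674 / sqrt(38.927914 * 27.90249)
scalar r_16 = 4.6830378 / sqrt(31.361171 * 22.204237)

display "visits 1,2: " %5.3f r_12 "   visits 2,3: " %5.3f r_23 "   visits 1,6: " %5.3f r_16
```

The three values come out at roughly 0.45, 0.86, and 0.18, so adjacent visits are far more strongly correlated than distant ones.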
This analysis indicates that both group and visit are significant while the group*visit interaction is not. Some researchers are critical of this type of analysis since
it is based on fixed-effects adjusted for the repeated factor. Also, this repeated measures analysis assumes compound symmetry in the covariance matrix
(which seems to be a stretch in this case). However, we can do worse. The next several analyses are not meant to answer the research question but to show
relationships among several different commands in Stata.
regress dep pre group visit

  Source |       SS       df       MS              Number of obs =     295
---------+------------------------------           F(  3,   291) =   48.05
   Model |  3719.12931     3  1239.70977           Prob > F      =  0.0000
Residual |  7507.95176   291  25.8005215           R-squared     =  0.3313
---------+------------------------------           Adj R-squared =  0.3244
   Total |  11227.0811   294  38.1873506           Root MSE      =  5.0794

--------------------------------------------------------------------------
     dep |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
---------+----------------------------------------------------------------
     pre |   .4769071   .0798565    5.972   0.000     .3197376    .6340767
   group |  -4.290664   .6072954   -7.065   0.000    -5.485912   -3.095416
   visit |  -1.307841    .169842   -7.700   0.000    -1.642116   -.9735667
   _cons |   8.233577   1.803945    4.564   0.000     4.683143    11.78401
--------------------------------------------------------------------------

glm dep pre group visit, fam(gaus) link(iden)
Iteration 1 : deviance = 7507.9518

Residual df  =       291                    No. of obs =       295
Pearson X2   =  7507.952                    Deviance   =  7507.952
Dispersion   =  25.80052                    Dispersion =  25.80052

Gaussian (normal) distribution, identity link

--------------------------------------------------------------------------
     dep |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
---------+----------------------------------------------------------------
     pre |   .4769071   .0798565    5.972   0.000     .3197376    .6340767
   group |  -4.290664   .6072954   -7.065   0.000    -5.485912   -3.095416
   visit |  -1.307841    .169842   -7.700   0.000    -1.642116   -.9735667
   _cons |   8.233577   1.803945    4.564   0.000     4.683143    11.78401
--------------------------------------------------------------------------
(Model is ordinary regression, use regress instead)

We are finally ready to try the panel data analysis using Stata's xtgee command. xtgee allows us to specify various working correlation structures through the
use of the corr option. We will start with a working correlation structure of independence. We don't believe that this is the correct structure but it allows us
to compare results with the OLS regression and the glm results above. The estat wcorrelation command (which we will abbreviate as estat wcorr) will allow us to view
the working correlation matrix.
xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(ind)
Iteration 1: tolerance = 3.270e-15

GEE population-averaged model                   Number of obs      =       295
Group variable:                       subj      Number of groups   =        61
Link:                             identity      Obs per group: min =         1
Family:                           Gaussian                     avg =       4.8
Correlation:                   independent                     max =         6
                                                Wald chi2(3)       =    146.13
Scale parameter:                  25.45068      Prob > chi2        =    0.0000

Pearson chi2(295):                 7507.95      Deviance           =   7507.95
Dispersion (Pearson):             25.45068      Dispersion         =  25.45068

--------------------------------------------------------------------------
     dep |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
---------+----------------------------------------------------------------
     pre |   .4769071   .0793133    6.013   0.000      .321456    .6323582
   group |  -4.290664   .6031641   -7.114   0.000    -5.472844   -3.108484
   visit |  -1.307841   .1686866   -7.753   0.000    -1.638461   -.9772215
   _cons |   8.233577   1.791673    4.595   0.000     4.721962    11.74519
--------------------------------------------------------------------------

estat wcorr

Estimated within-subj correlation matrix R:

        c1      c2      c3      c4      c5      c6
r1  1.0000
r2  0.0000  1.0000
r3  0.0000  0.0000  1.0000
r4  0.0000  0.0000  0.0000  1.0000
r5  0.0000  0.0000  0.0000  0.0000  1.0000
r6  0.0000  0.0000  0.0000  0.0000  0.0000  1.0000

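The three previous analyses yielded identical coefficient estimates (the xtgee standard errors are marginally smaller only because its scale parameter divides by N rather than by the residual degrees of freedom), but the results are probably incorrect. The common thread among them is that they all assume that the observations
within subjects are independent. This seems, on the face of it, to be highly unlikely. Scores on the depression scale are not likely to be independent from one
visit to the next.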
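The small gap between the regress and xtgee standard errors can be checked by hand. The sketch below uses only numbers printed in the outputs above: the residual sum of squares 7507.95176, N = 295, and residual degrees of freedom 291.

```stata
* regress divides the residual sum of squares by N - k = 291 (its residual MS)
display 7507.95176 / 291

* xtgee divides by N = 295 by default (its scale parameter)
display 7507.95176 / 295

* so each xtgee standard error is the regress one times sqrt(291/295), here for pre
display .0798565 * sqrt(291/295)
```

The last line reproduces the xtgee standard error for pre (.0793133) to within rounding.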
We can also try analyzing these data using compound symmetry for the correlational structure. Compound symmetry is obtained using exchangeable for the corr
option in xtgee.
xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(exc)
Group variable:                       subj      Number of obs      =       295
Link:                             identity      Number of groups   =        61
Family:                           Gaussian      Obs per group: min =         1
Correlation:                  exchangeable                     avg =       4.8
Scale parameter:                  25.56569                     max =         6
                                                Wald chi2(3)       =    135.08
                                                Prob > chi2        =    0.0000

--------------------------------------------------------------------------
     dep |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
---------+----------------------------------------------------------------
     pre |   .4599018   .1441533    3.190   0.001     .1773666     .742437
   group |  -4.024676   1.081131   -3.723   0.000    -6.143654   -1.905698
   visit |  -1.226764   .1175009  -10.440   0.000    -1.457062   -.9964666
   _cons |   8.432806   3.120987    2.702   0.007     2.315783    14.54983
--------------------------------------------------------------------------

estat wcorr

        c1      c2      c3      c4      c5      c6
r1  1.0000
r2  0.5554  1.0000
r3  0.5554  0.5554  1.0000
r4  0.5554  0.5554  0.5554  1.0000
r5  0.5554  0.5554  0.5554  0.5554  1.0000
r6  0.5554  0.5554  0.5554  0.5554  0.5554  1.0000

Note in particular the change in the standard errors between this analysis and the previous one. Next, what if we impose no preconceived notions about the
correlations among the responses over time? In this next example, we will request an unstructured correlation matrix. This is equivalent to the assumptions made
in a multivariate analysis.
xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(unstr)
Group and time vars:            subj visit      Number of obs      =       295
Link:                             identity      Number of groups   =        61
Family:                           Gaussian      Obs per group: min =         1
Correlation:                  unstructured                     avg =       4.8
Scale parameter:                  25.87029                     max =         6
                                                Wald chi2(3)       =     94.13
                                                Prob > chi2        =    0.0000

--------------------------------------------------------------------------
     dep |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
---------+----------------------------------------------------------------
     pre |   .3399185   .1326684    2.562   0.010     .0798932    .5999437
   group |  -4.134413   .9986306   -4.140   0.000    -6.091693   -2.177133
   visit |  -1.228327   .1492831   -8.228   0.000    -1.520916   -.9357372
   _cons |   11.13045   2.892903    3.848   0.000     5.460464    16.80044
--------------------------------------------------------------------------

estat wcorr

        c1      c2      c3      c4      c5      c6
r1  1.0000
r2  0.4955  1.0000
r3  0.3477  0.8622  1.0000
r4  0.3012  0.7359  0.6677  1.0000
r5  0.2328  0.7431  0.7394  0.7701  1.0000
r6  0.0943  0.5671  0.5625  0.6166  0.7179  1.0000

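Since the standard errors shift with each working correlation, one common safeguard is to request robust (sandwich) standard errors. The following is a minimal sketch on simulated data, not the study data: the sample size, coefficients, and seed are arbitrary assumptions. It also uses xtset to declare the panel once, as an alternative to repeating the i() and t() options used throughout this page.

```stata
* Simulated long-format panel (hypothetical data, not the estrogen-patch study)
clear
set seed 2013
set obs 30
generate subj  = _n
generate group = mod(_n, 2)
expand 6
bysort subj: generate visit = _n
generate dep = 20 - 4*group - visit + rnormal(0, 5)

* Declare the panel structure once
xtset subj visit

* Exchangeable working correlation with robust (sandwich) standard errors
xtgee dep group visit, family(gaussian) link(identity) corr(exchangeable) vce(robust)
estat wcorrelation
```

Because the simulated errors are independent, the estimated exchangeable correlation here should be close to zero; with real repeated measures it would not be.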
Now, let's try a different correlation structure, autoregressive with lag one. This is the correlational structure that is most
likely to be correct considering the repeated measures over time.
xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(ar1)
Group and time vars:            subj visit      Number of obs      =       287
Link:                             identity      Number of groups   =        53
Family:                           Gaussian      Obs per group: min =         2
Correlation:                         AR(1)                     avg =       5.4
Scale parameter:                  25.82413                     max =         6
                                                Wald chi2(3)       =     64.55
                                                Prob > chi2        =    0.0000

--------------------------------------------------------------------------
     dep |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
---------+----------------------------------------------------------------
     pre |   .4268002   .1376156    3.101   0.002     .1570785    .6965219
   group |  -4.218194   1.053504   -4.004   0.000    -6.283023   -2.153364
   visit |  -1.181975   .1907298   -6.197   0.000    -1.555799   -.8081517
   _cons |   9.037864   3.036076    2.977   0.003     3.087264    14.98846
--------------------------------------------------------------------------

estat wcorr

        c1      c2      c3      c4      c5      c6
r1  1.0000
r2  0.6812  1.0000
r3  0.4641  0.6812  1.0000
r4  0.3161  0.4641  0.6812  1.0000
r5  0.2154  0.3161  0.4641  0.6812  1.0000
r6  0.1467  0.2154  0.3161  0.4641  0.6812  1.0000

This analysis probably more closely reflects the correlations among the depression scores over six visits that we observed in our descriptive analysis.
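The AR(1) working correlation matrix is determined by a single number: the correlation between visits k steps apart is the lag-one correlation raised to the power k. A small sketch that rebuilds the matrix from the rounded lag-one value 0.6812 reported by estat wcorr (the scalar and matrix names are illustrative):

```stata
* rho is the lag-one correlation reported by estat wcorr above (rounded)
scalar rho = 0.6812

* Fill a 6 x 6 matrix with rho^|i - j|
matrix R = J(6, 6, .)
forvalues i = 1/6 {
    forvalues j = 1/6 {
        matrix R[`i', `j'] = rho^abs(`i' - `j')
    }
}
matrix list R, format(%6.4f)
```

The entries match the estat wcorr listing up to rounding of rho, which shows how quickly the implied correlation decays across visits.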
Now, let's back up and reconsider the group by visit interaction. We will try a model with the interaction using the ar1 correlations.
xtgee dep pre group visit c.group#c.visit, fam(gaus) link(iden) i(subj) t(visit) corr(ar1)
note: some groups have fewer than 2 observations
      not possible to estimate correlations for those groups
      8 groups omitted from estimation

Iteration 1: tolerance = .08642572
Iteration 2: tolerance = .00129189
Iteration 3: tolerance = .00002644
Iteration 4: tolerance = 5.433e-07

Group and time vars:            subj visit      Number of obs      =       287
Link:                             identity      Number of groups   =        53
Family:                           Gaussian      Obs per group: min =         2
Correlation:                         AR(1)                     avg =       5.4
Scale parameter:                  25.81682                     max =         6
                                                Wald chi2(4)       =     64.83
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         dep |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .4284649   .1377094     3.11   0.002     .1585595    .6983703
       group |   -3.55197   1.654127    -2.15   0.032       -6.794   -.3099395
       visit |  -1.057824   .3044115    -3.47   0.001    -1.654459   -.4611881
             |
    c.group# |
     c.visit |  -.2040059   .3905217    -0.52   0.601    -.9694144    .5614026
             |
       _cons |   8.606923   3.147897     2.73   0.006     2.437158    14.77669
------------------------------------------------------------------------------

The group by visit interaction still is not significant even though this may be a better approach for testing it. So far we have been treating visit as a continuous
variable. Is it possible that our analysis might change if we were to treat visit as a categorical variable, in the way that the anova did? Let's try one more analysis
using Stata's factor variable syntax (i.), introduced in Stata 11, to include dummy coded terms in the model.
xtgee dep pre group i.visit, fam(gaus) link(iden) i(subj) t(visit) corr(ar1)
note:

Iteration 1: tolerance = .12083034
Iteration 2: tolerance = .00138846
Iteration 3: tolerance = .00002034
Iteration 4: tolerance = 2.990e-07

Group and time vars:            subj visit      Number of obs      =       287
Link:                             identity      Number of groups   =        53
Family:                           Gaussian      Obs per group: min =         2
Correlation:                         AR(1)                     avg =       5.4
Scale parameter:                  25.67071                     max =         6
                                                Wald chi2(7)       =     66.85
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         dep |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .4264589   .1372194     3.11   0.002     .1575137    .6954041
       group |  -4.197096   1.050645    -3.99   0.000    -6.256323   -2.137869
             |
       visit |
          2  |   -.964717   .5556079    -1.74   0.083    -2.053689    .1242546
          3  |  -2.790063   .7474989    -3.73   0.000    -4.255134   -1.324992
          4  |  -3.730425   .8528421    -4.37   0.000    -5.401964   -2.058885
          5  |  -5.127078   .9147959    -5.60   0.000    -6.920045   -3.334111
          6  |   -5.84916   .9534054    -6.14   0.000      -7.7178    -3.98052
             |
       _cons |   7.896145   2.998003     2.63   0.008     2.020168    13.77212
------------------------------------------------------------------------------

testparm i.visit

 ( 1)  2.visit = 0
 ( 2)  3.visit = 0
 ( 3)  4.visit = 0
 ( 4)  5.visit = 0
 ( 5)  6.visit = 0

           chi2(  5) =   40.56
         Prob > chi2 =    0.0000

We can test to see whether the categorical version of visit accounts for more variability than the continuous version by including both in the model but using only
k - 2 = 4 dummy variables for time.
xtgee dep pre group c.visit i.visit, fam(gaus) link(iden) i(subj) t(visit) corr(ar1)
note: 6.visit omitted because of collinearity
note: some groups have fewer than 2 observations

Iteration 1: tolerance = .203814
Iteration 2: tolerance = .00172276
Iteration 3: tolerance = .000025
Iteration 4: tolerance = 3.675e-07

Group and time vars:            subj visit      Number of obs      =       287
Link:                             identity      Number of groups   =        53
Family:                           Gaussian      Obs per group: min =         2
Correlation:                         AR(1)                     avg =       5.4
Scale parameter:                  25.67071                     max =         6
                                                Wald chi2(7)       =     66.85
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         dep |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .4264589   .1372194     3.11   0.002     .1575137    .6954041
       group |  -4.197096   1.050645    -3.99   0.000    -6.256323   -2.137869
       visit |  -1.169832   .1906811    -6.14   0.000     -1.54356   -.7961039
             |
       visit |
          2  |    .205115   .5196299     0.39   0.693    -.8133408    1.223571
          3  |  -.4503992    .648481    -0.69   0.487    -1.721399    .8206003
          4  |  -.2209286   .6602134    -0.33   0.738    -1.514923    1.073066
          5  |  -.4477498   .5585628    -0.80   0.423    -1.542513    .6470131
          6  |  (omitted)
             |
       _cons |   9.065977   3.031614     2.99   0.003     3.124124    15.00783
------------------------------------------------------------------------------

testparm i.visit

 ( 1)  2.visit = 0
 ( 2)  3.visit = 0
 ( 3)  4.visit = 0
 ( 4)  5.visit = 0

           chi2(  4) =    1.92
         Prob > chi2 =    0.7506

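The p-values that testparm reports are upper-tail chi-squared probabilities, so they can be recomputed from the printed statistics. A brief sketch using Stata's chi2tail() function and the two test statistics shown above:

```stata
* Joint test of the five visit dummies (categorical visit only)
display chi2tail(5, 40.56)

* Joint test of the four visit dummies once linear visit is also in the model
display chi2tail(4, 1.92)
```

The first value is effectively zero and the second is about 0.75, agreeing with the output up to rounding of the chi-squared statistics.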
These results indicate that the categorical version of visit does not account for significantly more variability than the continuous version. In the final analysis, I
think that I prefer the following model, xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(ar1), of all the analyses run so far. Those results
looked as follows:
------------------------------------------------------------------------------
         dep |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .4268002   .1376156     3.10   0.002     .1570785    .6965219
       group |  -4.218194   1.053504    -4.00   0.000    -6.283023   -2.153364
       visit |  -1.181975   .1907298    -6.20   0.000    -1.555799   -.8081517
       _cons |   9.037864   3.036076     2.98   0.003     3.087264    14.98846
------------------------------------------------------------------------------

The final interpretation of these results is that there is a significant effect for the pretest, i.e., for every one point increase in the pretest score there is about
a 0.4 point increase in the depression score, when controlling for treatment and visit. There is also an effect for the estrogen patch when controlling for pretest
depression and visit. Use of the estrogen patch reduces the depression score by about 4.2 points. Finally, there is also a significant visit effect when controlling for
pretest depression and group membership. The depression score decreases on average by 1.18 points for each visit.
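To see what these coefficients imply, the sketch below plugs them into the linear predictor for an illustrative patient. The pretest score of 21 and the choice of visit 6 are assumptions picked for the example; the coefficients are copied from the table above.

```stata
* Coefficients from the preferred AR(1) model
scalar b_cons  =  9.037864
scalar b_pre   =  .4268002
scalar b_group = -4.218194
scalar b_visit = -1.181975

* Predicted EPDS score at visit 6 for a pretest score of 21 (illustrative values)
scalar yhat_placebo = b_cons + b_pre*21 + b_visit*6
scalar yhat_patch   = yhat_placebo + b_group

display "placebo: " %5.2f yhat_placebo "   estrogen patch: " %5.2f yhat_patch
```

The two predictions differ by exactly the group coefficient, about 4.2 points, at every visit, because the model contains no group by visit interaction.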
GEE with Binary Response Variable

The binary response variable in these examples was created from the data from the 1996 Gregoire, Kumar, Everitt,
Henderson & Studd study on the efficacy of estrogen patches in treating postnatal depression. Women were randomly
assigned to either a placebo control group (group=0, n=27) or estrogen patch group (group=1, n=34). Prior to the first
treatment all patients took the Edinburgh Postnatal Depression Scale (EPDS). EPDS data were collected monthly for six
months once the treatment began. Depression scores greater than or equal to 11 were coded as 1.
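The page does not show how the binary variable was built. One way such a cutoff could be coded, sketched here on made-up scores (the five rows below are hypothetical, and only the variable names dep and depressd come from this page):

```stata
* Hypothetical long-format scores (not the study data)
clear
input subj visit dep
1 1 17
1 2 11
1 3  .
2 1 10
2 2  4
end

* Stata treats missing as larger than any number, so guard the comparison;
* without the if qualifier the missing score would be coded as 1
generate byte depressd = (dep >= 11) if !missing(dep)
tabulate depressd, missing
```

The guard matters for these data, where many visits have missing EPDS scores.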
use http://www.ats.ucla.edu/stat/stata/library/depres01, clear
We will go through a series of analyses pretty much paralleling the models that were run above using the continuous response variable. To get a binary logit type
model we will set family to binomial and link to logit. We will start with the independent correlation structure, followed by exchangeable (compound symmetry) and
then unstructured.
xtgee depressd group visit, i(subj) fam(bin) link(logit) corr(ind)

Iteration 1: tolerance = 2.028e-12

Group variable:                       subj      Number of obs      =       295
Link:                                logit      Number of groups   =        61
Family:                           binomial      Obs per group: min =         1
Correlation:                   independent                     avg =       4.8
                                                               max =         6
                                                Wald chi2(2)       =     52.54
Scale parameter:                         1      Prob > chi2        =    0.0000

Pearson chi2(295):                  295.72      Deviance           =    338.95
Dispersion (Pearson):              1.00245      Dispersion         =  1.148974

------------------------------------------------------------------------------
    depressd |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -1.606602    .277919    -5.78   0.000    -2.151313   -1.061891
       visit |  -.4402142   .0802387    -5.49   0.000    -.5974791   -.2829493
       _cons |    2.38366   .3675414     6.49   0.000     1.663292    3.104028
------------------------------------------------------------------------------

estat wcorr

      |        c1         c2         c3         c4         c5         c6
------+------------------------------------------------------------------
   r1 |         1
   r2 |         0          1
   r3 |         0          0          1
   r4 |         0          0          0          1
   r5 |         0          0          0          0          1
   r6 |         0          0          0          0          0          1

xtgee depressd group visit, i(subj) fam(bin) link(logit) corr(exc)
Iteration 1: tolerance = .0287363
Iteration 2: tolerance = .00515198
Iteration 3: tolerance = .00024259
Iteration 4: tolerance = .00001921
Iteration 5: tolerance = 1.162e-06
Iteration 6: tolerance = 5.879e-08

Group variable:                       subj      Number of obs      =       295
Link:                                logit      Number of groups   =        61
Family:                           binomial      Obs per group: min =         1
Correlation:                  exchangeable                     avg =       4.8
Scale parameter:                         1                     max =         6
                                                Wald chi2(2)       =     45.64
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -1.616323   .4669082    -3.46   0.001    -2.531446   -.7011994
       visit |  -.3984038   .0613331    -6.50   0.000    -.5186145   -.2781931
       _cons |   2.409522   .4456646     5.41   0.000     1.536035    3.283008
------------------------------------------------------------------------------

estat wcorr

      |        c1         c2         c3         c4         c5         c6
------+------------------------------------------------------------------
   r1 |         1
   r2 |  .4518293          1
   r3 |  .4518293   .4518293          1
   r4 |  .4518293   .4518293   .4518293          1
   r5 |  .4518293   .4518293   .4518293   .4518293          1
   r6 |  .4518293   .4518293   .4518293   .4518293   .4518293          1

xtgee depressd group visit, i(subj) t(visit) fam(bin) link(logit) corr(unstr)
Iteration 1: tolerance = .03269114
Iteration 2: tolerance = .00216503
Iteration 3: tolerance = .00040481
Iteration 4: tolerance = .00005372
Iteration 5: tolerance = 6.971e-06
Iteration 6: tolerance = 8.126e-07

Group and time vars:            subj visit      Number of obs      =       295
Link:                                logit      Number of groups   =        61
Family:                           binomial      Obs per group: min =         1
Correlation:                  unstructured                     avg =       4.8
Scale parameter:                         1                     max =         6
                                                Wald chi2(2)       =     32.57
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |    -1.5933   .4553165    -3.50   0.000    -2.485704   -.7008963
       visit |  -.3897561   .0748284    -5.21   0.000    -.5364169   -.2430952
       _cons |   2.311344   .4521761     5.11   0.000     1.425095    3.197593
------------------------------------------------------------------------------

estat wcorr

      |        c1         c2         c3         c4         c5         c6
------+------------------------------------------------------------------
   r1 |         1
   r2 |   .404501          1
   r3 |  .1803076   .6315383          1
   r4 |   .284646   .5602217   .5795466          1
   r5 |  .2789439   .6335593   .4888153   .7378928          1
   r6 |  .0837916   .3078638   .5616702   .4672481   .5587511          1

With these data, just as with the continuous response variable, it might be more reasonable to hypothesize that the correlation structure would be
autoregressive.
xtgee depressd group visit, i(subj) t(visit) fam(bin) link(logit) corr(ar1)

note:

Iteration 1: tolerance = .00583303
Iteration 2: tolerance = .00029981
Iteration 3: tolerance = 7.003e-06
Iteration 4: tolerance = 1.599e-07

Group and time vars:            subj visit      Number of obs      =       287
Link:                                logit      Number of groups   =        53
Family:                           binomial      Obs per group: min =         2
Correlation:                         AR(1)                     avg =       5.4
Scale parameter:                         1                     max =         6
                                                Wald chi2(2)       =     26.04
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |  -1.588712   .4391128    -3.62   0.000    -2.449358   -.7280672
       visit |  -.4036122   .0933711    -4.32   0.000    -.5866163   -.2206082
       _cons |   2.259702   .4961409     4.55   0.000     1.287284     3.23212
------------------------------------------------------------------------------

estat wcorr

      |        c1         c2         c3         c4         c5         c6
------+------------------------------------------------------------------
   r1 |         1
   r2 |  .5643214          1
   r3 |  .3184587   .5643214          1
   r4 |  .1797131   .3184587   .5643214          1
   r5 |  .1014159   .1797131   .3184587   .5643214          1
   r6 |  .0572312   .1014159   .1797131   .3184587   .5643214          1

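The coefficients in the table above are on the log-odds scale. As a rough guide to reading them, the sketch below converts them by hand; the numbers are copied from that table, and the choice of a placebo patient at visit 1 for the predicted probability is an illustrative assumption.

```stata
* Odds ratios are the exponentiated logit coefficients
display exp(-1.588712)
display exp(-.4036122)

* Delta-method standard error of an odds ratio: the odds ratio times the SE of its coefficient
display exp(-1.588712) * .4391128

* Predicted probability of depression for a placebo patient (group = 0) at visit 1
display invlogit(2.259702 - .4036122)
```

The group odds ratio comes out near 0.20, meaning the odds of being classified as depressed in the estrogen patch group are about one fifth of those in the placebo group at any given visit.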
If we want, we can also obtain the results in the odds ratio metric using the eform option.
xtgee, eform
Group and time vars:            subj visit      Number of obs      =       287
Link:                                logit      Number of groups   =        53
Family:                           binomial      Obs per group: min =         2
Correlation:                         AR(1)                     avg =       5.4
Scale parameter:                         1                     max =         6
                                                Wald chi2(2)       =     26.04
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       group |   .2041883   .0896617    -3.62   0.000      .086349    .4828413
       visit |   .6679031   .0623629    -4.32   0.000     .5562061    .8020309
------------------------------------------------------------------------------

Let's add in the pretest and a group by visit interaction.
xtgee depressd pre group visit c.group#c.visit, i(subj) t(visit) fam(bin) link(logit) corr(a
note:

Iteration 1: tolerance = .27128178
Iteration 2: tolerance = .00369124
Iteration 3: tolerance = .00040644
Iteration 4: tolerance = .00002584
Iteration 5: tolerance = 1.892e-06
Iteration 6: tolerance = 1.383e-07

Group and time vars:            subj visit      Number of obs      =       287
Link:                                logit      Number of groups   =        53
Family:                           binomial      Obs per group: min =         2
Correlation:                         AR(1)                     avg =       5.4
Scale parameter:                         1                     max =         6
                                                Wald chi2(4)       =     29.71
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    depressd |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .1231682   .0565583     2.18   0.029      .012316    .2340204
       group |  -1.278468   .7833482    -1.63   0.103    -2.813802    .2568666
       visit |  -.3504923   .1484459    -2.36   0.018    -.6414409   -.0595436
             |
    c.group# |
     c.visit |  -.1279848   .1946883    -0.66   0.511    -.5095669    .2535973
             |
       _cons |  -.4669354   1.271484    -0.37   0.713    -2.958999    2.025128
------------------------------------------------------------------------------

Clearly, there is no interaction, but we'll stick with the pretest for the moment. Next let's try the categorical version of visit and the model that contains both the
categorical and continuous version of visit.
xtgee depressd pre group i.visit, i(subj) fam(bin) link(logit) t(visit) corr(ar1)
note:

Iteration 1: tolerance = .31672971
Iteration 2: tolerance = .00966096
Iteration 3: tolerance = .00059261
Iteration 4: tolerance = 3.397e-06
Iteration 5: tolerance = 9.420e-07

Group and time vars:            subj visit      Number of obs      =       287
Link:                                logit      Number of groups   =        53
Family:                           binomial      Obs per group: min =         2
Correlation:                         AR(1)                     avg =       5.4
Scale parameter:                         1                     max =         6
                                                Wald chi2(7)       =     30.86
                                                Prob > chi2        =    0.0001

------------------------------------------------------------------------------
    depressd |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .1140311    .056433     2.02   0.043     .0034244    .2246378
       group |  -1.692654   .4377388    -3.87   0.000    -2.550607   -.8347021
             |
       visit |
          2  |  -.1751772   .3106588    -0.56   0.573    -.7840573    .4337028
          3  |  -1.015265   .3915632    -2.59   0.010    -1.782715   -.2478151
          4  |  -1.108258   .4287682    -2.58   0.010    -1.948628   -.2678878
          5  |  -1.489162   .4548596    -3.27   0.001    -2.380671    -.597654
          6  |   -2.14973   .4951443    -4.34   0.000    -3.120195   -1.179265
             |
       _cons |  -.4832614    1.18731    -0.41   0.684    -2.810346    1.843823
------------------------------------------------------------------------------

testparm i.visit

 ( 1)  2.visit = 0
 ( 2)  3.visit = 0
 ( 3)  4.visit = 0
 ( 4)  5.visit = 0
 ( 5)  6.visit = 0

           chi2(  5) =   21.92
         Prob > chi2 =    0.0005

xtgee depressd pre group c.visit i.visit, i(subj) fam(bin) link(logit) t(visit) corr(ar1)
note: 6.visit omitted because of collinearity
note: some groups have fewer than 2 observations

Iteration 1: tolerance = .38986389
Iteration 2: tolerance = .0130539
Iteration 3: tolerance = .00083314
Iteration 4: tolerance = 4.664e-06
Iteration 5: tolerance = 1.294e-06
Iteration 6: tolerance = 6.999e-08

Group and time vars:            subj visit      Number of obs      =       287
Link:                                logit      Number of groups   =        53
Family:                           binomial      Obs per group: min =         2
Correlation:                         AR(1)                     avg =       5.4
Scale parameter:                         1                     max =         6
                                                Wald chi2(7)       =     30.86
                                                Prob > chi2        =    0.0001

------------------------------------------------------------------------------
    depressd |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         pre |   .1140311    .056433     2.02   0.043     .0034244    .2246378
       group |  -1.692654   .4377388    -3.87   0.000    -2.550607   -.8347021
       visit |   -.429946   .0990289    -4.34   0.000     -.624039    -.235853
             |
       visit |
          2  |   .2547688   .2901423     0.88   0.380    -.3138998    .8234373
          3  |  -.1553729   .3440849    -0.45   0.652     -.829767    .5190212
          4  |   .1815801   .3544878     0.51   0.608    -.5132033    .8763635
          5  |   .2306217   .3201945     0.72   0.471    -.3969481    .8581914
          6  |  (omitted)
             |
       _cons |  -.0533153   1.201905    -0.04   0.965    -2.409005    2.302375
------------------------------------------------------------------------------

testparm i.visit

 ( 1)  2.visit = 0
 ( 2)  3.visit = 0
 ( 3)  4.visit = 0
 ( 4)  5.visit = 0

           chi2(  4) =    3.04
         Prob > chi2 =    0.5507