
Oleksandra Cheipesh 11720920

Empirical Project in Advanced Microeconometrics

What determines health?


Oleksandra Cheipesh 11720920

Introduction

The growing popularity of healthy lifestyles is making people rethink their health
habits. However, although many people know that eating fruit and vegetables and
avoiding unhealthy habits may improve their health, far fewer are aware of how large
the effects of different health behaviors are. Knowing the impact of health behaviors
is useful not only for preventive medicine but also for economics, since healthy people
are more productive.

Several studies suggest that fruit consumption, physical exercise and good social
relationships have a significant positive impact on health, whereas smoking and
obesity have a negative impact [1-4]. We estimate the effects of several covariates
using panel data models. First, we describe the data; then we estimate the coefficients
with both fixed and random effects estimators. Finally, we discuss possible bias and
limitations.

Data description

The data come from the National Longitudinal Survey of Youth 1997 (the respondents
were first interviewed in 1997). The data set in its original form contains the answers
of 8984 individuals in 2002, 2007 and 2008. After dropping respondents with missing
answers, the sample is reduced to about 6000 individuals (6,104 in the estimation
sample). Furthermore, because of attrition in 2008, only the answers from 2002 and
2007 are analyzed.

The dependent variable is general health status, measured on a scale from 1 (“very
poor health status”) to 5 (“perfect health status”). As Table 1 shows, the health status
of many individuals changed between the two years: in 2007 less than 50% of
respondents (2,750 of 6,104, about 45%) report the same health status as in 2002.

Table 1. Health status in 2002 and 2007


health_new                  health_new in 2007
in 2002            1       2       3       4       5     Total

1                  6      12      10       7       2        37
2                 16     112     154      81      23       386
3                 16     182     649     507     178     1,532
4                 10     112     587   1,054     467     2,230
5                  5      52     286     647     929     1,919

Total             53     470   1,686   2,296   1,599     6,104
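
The share of respondents whose reported health status did not change can be checked directly from the cell counts in Table 1. The sketch below is only an illustration: it assumes that the rows of the table refer to 2002 and the columns to 2007, and the variable names are our own.

```python
# Cell counts copied from Table 1.
# Assumption: rows are health status in 2002, columns are health status in 2007.
table1 = [
    [6, 12, 10, 7, 2],
    [16, 112, 154, 81, 23],
    [16, 182, 649, 507, 178],
    [10, 112, 587, 1054, 467],
    [5, 52, 286, 647, 929],
]

levels = range(len(table1))
total = sum(sum(row) for row in table1)

# Diagonal cells: same answer in both years.
unchanged = sum(table1[i][i] for i in levels)
# Cells above the diagonal: higher status in 2007 than in 2002.
improved = sum(table1[i][j] for i in levels for j in levels if j > i)
# Everything else lies below the diagonal: lower status in 2007.
worsened = total - unchanged - improved

share_unchanged = unchanged / total
print(f"N = {total}, unchanged = {unchanged} ({share_unchanged:.1%})")
print(f"improved = {improved}, worsened = {worsened}")
```

Under the stated orientation, the cells above the diagonal are respondents who report a better status in 2007 and the cells below it are those who report a worse one.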

Explanatory variables include:

• different health behaviors: fruit and vegetable consumption as well as time
spent on television, computer use, exercise and sleep
• stressful events: dummy variables that indicate divorce of the parents, whether
any of the respondent’s relatives was hospitalized and whether the parents
or partner lost a job (these serve mainly as control variables that help to avoid
overestimating the effect of sleep, since they might be correlated with it)
• dummy variables for over- and underweight
• dummy variables for gender and race

About 50% of the sample are male, about 24% are Black and 21% are Hispanic
(table 3). Respondents experienced more stressful events in 2002 than in 2007, and
the share of underweight people decreased by 4 percentage points. In contrast, there
were more overweight respondents in 2007: 47% compared with 36% in 2002 (table 2).

Table 2. Time-varying dummies

Variable  Year   Mean       Std. Err.   [95% Conf. Interval]

dec       2002   .6050131   .0062575    .5927474   .6172788
          2007   .5673329   .006342     .5549016   .5797642

unemp     2002   .0912516   .0036861    .0840262   .098477
          2007   .072903    .0033278    .0663799   .0794261

div       2002   .0488204   .0027584    .0434135   .0542274
          2007   .0286697   .0021361    .0244826   .0328568

over      2002   .3623853   .0061531    .3503243   .3744463
          2007   .4670708   .0063864    .4545525   .4795891

under     2002   .1251638   .0042358    .1168611   .1334666
          2007   .0853539   .0035766    .0783432   .0923645

Table 3. Time-invariant dummies

Variable  Obs      Mean       Std. Dev.   Min

male      12,208   .4957405   .5000023    0
black     12,208   .2445937   .4298637    0
hisp      12,208   .2080603   .4059368    0

As appendix 1 shows, the continuous regressors vary in both 2002 and 2007. The
most common answers, however, are eating fruit 1-3 times per week (about 43-44%),
eating vegetables 1-6 times per week (more than 55%), not exercising at all (about
30%), sleeping 8 hours (about 30%), using a computer for 10 hours or more per week
(28% and 41% in 2002 and 2007 respectively) and watching television for 3 to 10
hours per week (about 50%).

The values of the regressors vary more between individuals than within them.
However, there is enough within-individual variation to use a fixed effects estimator.

Table 4. Within and between summary statistics

Variable           Mean       Std. Dev.   Min         Max        Observations

fruit   overall    2.761386   1.332675    1           7          N = 12208
        between               1.08062     1           7          n = 6104
        within                .7799875    -.238614    5.761386   T = 2

veg     overall    3.216989   1.33557     1           7          N = 12208
        between               1.09168     1           7          n = 6104
        within                .7694663    .2169889    6.216989   T = 2

exer    overall    2.546281   2.35353     0           7          N = 12208
        between               1.867834    0           7          n = 6104
        within                1.431987    -.9537189   6.046281   T = 2

comp    overall    3.949787   1.905789    1           6          N = 12208
        between               1.618743    1           6          n = 6104
        within                1.005942    1.449787    6.449787   T = 2

tel     overall    2.413827   1.159704    1           6          N = 12208
        between               .9298844    1           6          n = 6104
        within                .6930355    -.086173    4.913827   T = 2

sleep   overall    7.004833   1.519883    0           24         N = 12208
        between               1.175673    1.5         16         n = 6104
        within                .9633022    -2.995167   17.00483   T = 2
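
The between and within standard deviations in Table 4 can be illustrated with a small sketch. It assumes the usual definitions for this kind of summary: the between figure is the standard deviation of the person-specific means, and the within figure is the standard deviation of each observation's deviation from the person's own mean, recentred on the grand mean. The toy panel below is hypothetical and is not taken from the NLSY97 data.

```python
from statistics import mean, stdev

# Hypothetical toy panel (NOT the NLSY97 data): person id -> values of one
# regressor observed in 2002 and 2007.
panel = {
    "a": [2, 4],
    "b": [3, 3],
    "c": [1, 2],
    "d": [5, 7],
}

all_values = [x for values in panel.values() for x in values]
grand_mean = mean(all_values)

# Between variation: spread of the person-specific means.
person_means = {pid: mean(values) for pid, values in panel.items()}
sd_between = stdev(list(person_means.values()))

# Within variation: deviation from the person's own mean, recentred on the
# grand mean so the transformed values stay on the original scale.
within_values = [
    x - person_means[pid] + grand_mean
    for pid, values in panel.items()
    for x in values
]
sd_within = stdev(within_values)
sd_overall = stdev(all_values)

print(f"overall = {sd_overall:.4f}")
print(f"between = {sd_between:.4f}")
print(f"within  = {sd_within:.4f}")
```

Person "b" does not change between the two years and therefore contributes nothing to the within variation. A fixed effects estimator relies only on the within part, which is why the non-trivial within standard deviations in Table 4 matter.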

Empirical Strategy and Results

To estimate the coefficients, we include all available regressors and compare the
results of the fixed and random effects estimators. Table 5 shows that the fixed and
random effects coefficients differ, and the Hausman test rejects the null hypothesis
that the random effects estimator is consistent (the p-value is close to zero). We
therefore rely on the fixed effects estimator. This means that we cannot estimate the
coefficients of time-invariant regressors such as gender and race, but we allow the
regressors to be correlated with time-constant unobserved differences across
individuals. For the fixed effects model, we use panel-robust standard errors.
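
For reference, the comparison described above can be written out in a standard textbook form. The notation is our own and does not appear in the original text: y_it is health status, x_it the vector of time-varying regressors and α_i the time-constant individual effect.

```latex
% Panel model with an individual effect, t \in \{2002, 2007\}
y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it}

% Within (fixed effects) transformation: \alpha_i drops out
y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i)

% Hausman statistic comparing fixed (FE) and random (RE) effects estimates
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[\widehat{V}_{FE} - \widehat{V}_{RE}\right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \;\overset{a}{\sim}\; \chi^2(k) \quad \text{under } H_0
```

Random effects is consistent only if α_i is uncorrelated with x_it, whereas the within transformation removes α_i altogether. A large value of H therefore signals that the two sets of estimates differ by more than sampling noise, which is the case here.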

Table 5. Random and fixed effects coefficients

Coefficients
(b) (B) (b-B) sqrt(diag(V_b-V_B))
fixed random Difference S.E.

under -.157623 -.2267092 .0690862 .0259179
over -.2057351 -.3580017 .1522666 .0200661
div -.0028997 -.0062266 .0033269 .0294744
unemp .0062048 -.0695542 .075759 .0201183
dec .006621 -.0285271 .0351481 .0114866
sleep .0280009 .0307666 -.0027658 .004217
tel .0029352 -.0328703 .0358055 .0060611
comp -.0095829 .0321236 -.0417064 .0046407
exer .0287956 .0340391 -.0052436 .0028835
veg .0097362 .0173034 -.0075672 .0059732
fruit .0399884 .0546766 -.0146881 .0058248
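
To see which coefficients differ most between the two estimators, the last two columns of Table 5 can be combined coefficient by coefficient. This is only a rough sketch: the joint Hausman test reported in the text needs the full covariance matrix, which is not shown in the table, so the scalar statistics below are an illustration and not a replacement for that test. The 5% critical value of a chi-squared distribution with one degree of freedom (3.841) is used as the cut-off.

```python
# (b - B) and sqrt(diag(V_b - V_B)) copied from Table 5.
table5 = {
    "under": (0.0690862, 0.0259179),
    "over": (0.1522666, 0.0200661),
    "div": (0.0033269, 0.0294744),
    "unemp": (0.075759, 0.0201183),
    "dec": (0.0351481, 0.0114866),
    "sleep": (-0.0027658, 0.004217),
    "tel": (0.0358055, 0.0060611),
    "comp": (-0.0417064, 0.0046407),
    "exer": (-0.0052436, 0.0028835),
    "veg": (-0.0075672, 0.0059732),
    "fruit": (-0.0146881, 0.0058248),
}

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value of chi-squared with 1 d.f.

# Scalar Hausman-type statistic for each coefficient taken on its own:
# squared difference divided by the estimated variance of the difference.
stats = {name: (diff / se) ** 2 for name, (diff, se) in table5.items()}
flagged = [name for name, s in stats.items() if s > CHI2_1_CRIT_5PCT]

for name, s in sorted(stats.items(), key=lambda item: -item[1]):
    print(f"{name:6s} {s:8.2f}")
print("differ at the 5% level:", ", ".join(flagged))
```

On this coefficient-by-coefficient reading, the largest gaps are for computer use, overweight and television, while the sleep, exercise and vegetable coefficients are similar under both estimators.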

As table 6 shows, only some covariates have statistically significant effects, namely
being over- or underweight, hours of sleep, frequency of exercise and fruit
consumption. Being under- or overweight decreases general health status by
approximately 0.16 and 0.21 points respectively, while each additional hour of sleep
and each additional exercise session per week improve health by about 0.028 points.
Finally, moving up one category of fruit consumption (for example, from 1-3 to 4-6
times per week) has a positive impact of 0.04 points.

In contrast to fruit, the effect of vegetables turned out to be insignificant. This may
be explained by the fact that the respondents were asked about eating vegetables
in general, and not all vegetables have a similar impact on health (for example,
French fries). Moreover, a previous study by C. Mood also found only a barely
significant role of vegetables in health [2].

The control variables for stressful events are insignificant even at the 10%
significance level. Moreover, they are almost uncorrelated with sleep (absolute
correlations below 0.03, appendix 4). Therefore, if we drop these variables in order
to gain efficiency, the other coefficients change only slightly (by less than 0.001
points, appendix 2).

Although the coefficients look small, if an individual stops being over- or
underweight, starts exercising at least 3-4 times per week and increases his or her
fruit consumption from 1 time per week to 2-3 times per day, his or her health status
can improve by roughly 0.4 to 0.5 points. Considering that health is measured on a
scale from one to five, the effects are not so small. However, since more than three
quarters of respondents sleep from 6 to 8 hours a day, the role of sleep in this case is
not very large.
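
The combined effect can be reproduced from the Table 6 coefficients. The sketch below is a back-of-the-envelope calculation under our own assumptions, since the text does not spell out the exact steps: exercise rises from 0 to between 3 and 4 sessions per week, and fruit consumption moves from category 2 ("1 to 3 times" per week) to category 5 or 6 ("2 times per day" or "3 times per day"), that is, by 3 to 4 category steps. The function name is hypothetical.

```python
# Fixed effects coefficients copied from Table 6.
COEF = {
    "over": -0.2057351,
    "under": -0.157623,
    "exer": 0.0287956,
    "fruit": 0.0399884,
}


def combined_gain(weight_dummy, exer_sessions, fruit_steps):
    """Predicted change in health status for a hypothetical lifestyle change.

    weight_dummy: "over" or "under", the dummy that switches from 1 to 0.
    exer_sessions: increase in weekly exercise sessions (assumed scenario).
    fruit_steps: number of fruit categories moved up (assumed scenario).
    """
    return (
        -COEF[weight_dummy]
        + exer_sessions * COEF["exer"]
        + fruit_steps * COEF["fruit"]
    )


low = combined_gain("over", exer_sessions=3, fruit_steps=3)
mid = combined_gain("over", exer_sessions=3.5, fruit_steps=3.5)
high = combined_gain("over", exer_sessions=4, fruit_steps=4)
under_mid = combined_gain("under", exer_sessions=3.5, fruit_steps=3.5)

print(f"overweight scenario:  {low:.2f} to {high:.2f} (midpoint {mid:.2f})")
print(f"underweight scenario: {under_mid:.2f} at the midpoint")
```

Because the model is linear, the total is simply the sum of the individual contributions, so it changes proportionally with whatever scenario is assumed.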

Table 6. Fixed effects regression

R-sq: Obs per group:


within = 0.0283 min = 2
between = 0.0958 avg = 2.0
overall = 0.0717 max = 2

F(11,6103) = 15.10
corr(u_i, Xb) = 0.1263 Prob > F = 0.0000

(Std. Err. adjusted for 6,104 clusters in id)

Robust
health_new Coef. Std. Err. t P>|t| [95% Conf. Interval]

under -.157623 .0377065 -4.18 0.000 -.2315411 -.0837049
over -.2057351 .027645 -7.44 0.000 -.2599291 -.151541
div -.0028997 .0468927 -0.06 0.951 -.094826 .0890266
unemp .0062048 .0360662 0.17 0.863 -.0644976 .0769071
dec .006621 .018979 0.35 0.727 -.0305845 .0438264
sleep .0280009 .007444 3.76 0.000 .013408 .0425938
tel .0029352 .0098653 0.30 0.766 -.0164042 .0222746
comp -.0095829 .0064221 -1.49 0.136 -.0221725 .0030068
exer .0287956 .0048315 5.96 0.000 .0193241 .038267
veg .0097362 .0094653 1.03 0.304 -.0088191 .0282916
fruit .0399884 .0093704 4.27 0.000 .0216192 .0583576
_cons 3.579416 .0708903 50.49 0.000 3.440446 3.718386

sigma_u .75984703
sigma_e .7056016
rho .53696561 (fraction of variance due to u_i)
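
Two of the quantities in Table 6 follow directly from other entries and can be verified by hand. The sketch below recomputes rho, the fraction of variance due to the individual effect u_i, as sigma_u squared over the sum of the squared sigma_u and sigma_e, and the t statistic of the exercise coefficient as the coefficient divided by its robust standard error. The variable names are our own.

```python
# Values copied from Table 6.
sigma_u = 0.75984703   # standard deviation of the individual effect u_i
sigma_e = 0.7056016    # standard deviation of the idiosyncratic error
coef_exer = 0.0287956  # fixed effects coefficient on exer
se_exer = 0.0048315    # robust standard error of that coefficient

# Fraction of the total error variance that is due to u_i.
rho = sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)

# t statistic for the null hypothesis that the exer coefficient is zero.
t_exer = coef_exer / se_exer

print(f"rho = {rho:.4f}")
print(f"t(exer) = {t_exer:.2f}")
```

A rho of about 0.54 means that a little more than half of the unexplained variation in health status is attributed to time-constant differences between individuals.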

Possible Bias and Limitations

Although we use a fixed effects estimator, the results may be biased. First, general
health status is measured subjectively. Therefore, people who started eating fruit and
exercising in 2007 might feel better because of a placebo effect. Second, there may be
omitted variable bias from alcohol consumption, smoking, etc. People who have bad
habits often do not have a healthy diet and do not exercise. For this reason, the effects
of fruit and exercise might be overestimated. Finally, there may be reverse causality
between exercise and general health status: people who are healthier are more likely
to exercise.

Because we analyze data from only two years, lagged regressors cannot be used as
instruments. Analyzing the same data set across more than two years could help to
solve the problem of time-varying endogeneity. The same data set is available for
2009, 2010 and 2011; however, in these years the respondents were not asked about
exercise. If we drop the exercise variable, the other coefficients change (appendix 3),
and we would introduce omitted variable bias, because exercise is correlated with
diet and weight.

Conclusion

In order to estimate the impact of health behaviors, stressful events and weight on
health, panel data estimators are used. The cleaned data set includes about 6000
individuals observed in 2002 and 2007. As the Hausman test rejected the null
hypothesis of the random effects model at the 1% significance level, the fixed effects
estimator was used. The results suggest that sleep, fruit consumption and exercise
have a positive effect on health, while being over- or underweight has a negative
impact.

Since the data cover only 2002 and 2007, lagged regressors cannot be used as
instruments. The results may therefore suffer from endogeneity caused by reverse
causality between health and exercise, omitted variables for other health behaviors
and measurement error (general health status is self-reported). For this reason,
analyzing data across more than two years could be a possible improvement.

Reference List

1. Kerry Sargent-Cox, Nicolas Cherbuin, Lara Morris, Peter Butterworth, Kaarin
J. Anstey, 2013. The effect of health behavior change on self-rated health
across the adult life course: A longitudinal cohort study. Preventive Medicine
58, 75–80
2. Carina Mood, 2013. Life-style and self-rated global health in Sweden: A
prospective analysis spanning three decades. Preventive Medicine 57, 802–806
3. Charlotta Pisinger, Ulla Toft, Mette Aadahl, Charlotte Glümer, Torben
Jørgensen, 2009. The relationship between lifestyle and self-reported health in
a general population: The Inter99 study. Preventive Medicine 49, 418–423
4. James Tsai, Earl S. Ford, Chaoyang Li, Guixiang Zhao, William S. Pearson,
Lina S. Balluz, 2010. Multiple healthy behaviors and optimal self-rated health:
Findings from the 2007 Behavioral Risk Factor Surveillance System Survey.
Preventive Medicine 51, 268–274

Appendix 1. Continuous regressors in 2002, 2007 and 2008

2002 2007 2008


HOW MANY TIMES PER WEEK DOES R EAT FRUIT
(1) I do not typically eat fruit 1032 13.08% 750 10.12% 41 9.83%
(2) 1 to 3 times 3502 44.40% 3198 43.16% 172 41.25%
(3) 4 to 6 times 1521 19.28% 1470 19.84% 88 21.10%
(4) 1 time per day 964 12.22% 1049 14.16% 56 13.43%
(5) 2 times per day 536 6.80% 606 8.18% 36 8.63%
(6) 3 times per day 209 2.65% 211 2.85% 13 3.12%
(7) 4 or more times per day 123 1.56% 126 1.70% 11 2.64%
total 7887 100.00% 7410 100.00% 417 100.00%
HOW MANY TIMES PER WEEK DOES R EAT VEGETABLES
(1) I do not typically eat vegetables 578 7.33% 349 4.71% 25 6.01%
(2) 1 to 3 times 2525 32.00% 2055 27.73% 125 30.05%
(3) 4 to 6 times 2049 25.97% 2049 27.64% 97 23.32%
(4) 1 time per day 1539 19.51% 1563 21.09% 90 21.63%
(5) 2 times per day 831 10.53% 937 12.64% 44 10.58%
(6) 3 times per day 241 3.05% 292 3.94% 14 3.37%
(7) 4 or more times per day 127 1.61% 167 2.25% 21 5.05%
total 7890 100.00% 7412 100.00% 416 100.00%
HOW MANY TIMES PER WEEK DOES R EXERCISE 30 MINUTES OR MORE
(0) 2512 31.88% 2308 31.22% 125 29.90%
(1) 717 9.10% 667 9.02% 27 6.46%
(2) 992 12.59% 907 12.27% 49 11.72%
(3) 1165 14.78% 1149 15.54% 74 17.70%
(4) 601 7.63% 602 8.14% 40 9.57%
(5) 722 9.16% 764 10.34% 42 10.05%
(6) 221 2.80% 179 2.42% 10 2.39%
(7) 950 12.06% 816 11.04% 51 12.20%
total 7880 100.00% 7392 100.00% 418 100.00%
HOW MANY HOURS PER WEEK DOES R USE A COMPUTER
(1) None 2039 25.86% 1156 15.60% 68 16.31%
(2) less than 1 hour a week 597 7.57% 645 8.70% 25 6.00%
(3) 1 to 3 hours a week 1305 16.55% 1209 16.31% 61 14.63%
(4) 4 to 6 hours a week 1129 14.32% 920 12.41% 66 15.83%
(5) 7 to 9 hours a week 552 7.00% 481 6.49% 39 9.35%
(6) 10 hours or more a week 2264 28.71% 3001 40.49% 158 37.89%
total 7886 100.00% 7412 100.00% 417 100.00%
HOW MANY HOURS PER WEEK DOES R WATCH TELEVISION
(1) Less than 2 hours per week 1301 16.49% 1160 15.67% 61 14.56%
(2) 3 to 10 hours a week 3711 47.03% 3821 51.61% 224 53.46%
(3) 11 to 20 hours a week 1518 19.24% 1467 19.81% 91 21.72%
(4) 21 to 30 hours a week 658 8.34% 491 6.63% 23 5.49%
(5) 31 to 40 hours a week 283 3.59% 189 2.55% 5 1.19%
(6) More than 40 hours a week 419 5.31% 276 3.73% 15 3.58%
total 7890 100.00% 7404 100.00% 419 100.00%
HOW MANY HOURS PER NIGHT DOES R SLEEP
(0) 6 0.08% 3 0.04% 0 0.00%
(1) 2 0.03% 2 0.03% 0 0.00%
(2) 20 0.25% 14 0.19% 0 0.00%
(3) 45 0.57% 48 0.65% 6 1.44%
(4) 206 2.61% 258 3.49% 16 3.84%
(5) 648 8.22% 737 9.96% 45 10.79%
(6) 1677 21.28% 1857 25.10% 99 23.74%
(7) 1887 23.95% 1878 25.39% 105 25.18%
(8) 2466 31.30% 2112 28.55% 123 29.50%
(9) 471 5.98% 278 3.76% 15 3.60%
(10) more than 10 451 5.72% 211 2.85% 8 1.92%
total 7879 100.00% 7398 100.00% 417 100.00%

Appendix 2. Fixed effects regression without control variables

Robust
health_new Coef. Std. Err. t P>|t| [95% Conf. Interval]

under -.1572853 .0376525 -4.18 0.000 -.2310975 -.0834732
over -.2055921 .0276332 -7.44 0.000 -.259763 -.1514213
sleep .0279811 .0074298 3.77 0.000 .0134161 .0425461
tel .0029868 .0098593 0.30 0.762 -.0163409 .0223146
comp -.0095622 .006417 -1.49 0.136 -.0221417 .0030173
exer .0287795 .0048299 5.96 0.000 .0193112 .0382478
veg .0097311 .0094623 1.03 0.304 -.0088182 .0282805
fruit .0400305 .0093675 4.27 0.000 .0216669 .0583941
_cons 3.583473 .0700688 51.14 0.000 3.446113 3.720832

sigma_u .75961985
sigma_e .70543704
rho .5369329 (fraction of variance due to u_i)

Appendix 3. Fixed effects regression without exercises

corr(u_i, Xb) = 0.1239 Prob > F = 0.0000

(Std. Err. adjusted for 6,104 clusters in id)

Robust
health_new Coef. Std. Err. t P>|t| [95% Conf. Interval]

under -.1584067 .0378262 -4.19 0.000 -.2325595 -.0842539
over -.214702 .0276864 -7.75 0.000 -.2689772 -.1604269
sleep .0282634 .0074348 3.80 0.000 .0136886 .0428381
tel -.0006503 .009916 -0.07 0.948 -.0200892 .0187886
comp -.0093313 .006445 -1.45 0.148 -.0219659 .0033032
veg .0123858 .0094811 1.31 0.191 -.0062005 .0309722
fruit .0453129 .0093954 4.82 0.000 .0268946 .0637313
_cons 3.643413 .0693155 52.56 0.000 3.50753 3.779296

sigma_u .7646857
sigma_e .70772878
rho .53862487 (fraction of variance due to u_i)

Appendix 4. Correlation between regressors

        under    over     div      unemp    dec      sleep    tel      comp     exer     veg      fruit

under   1.0000
over    -0.2887  1.0000
div     -0.0011  0.0016   1.0000
unemp   0.0025   0.0390   0.0312   1.0000
dec     0.0096   0.0210   -0.0002  0.0495   1.0000
sleep   0.0144   -0.0560  0.0097   -0.0226  -0.0242  1.0000
tel     0.0169   0.0301   -0.0123  0.0135   0.0380   0.0597   1.0000
comp    -0.0165  0.0476   0.0046   -0.0341  0.0031   -0.0902  -0.1289  1.0000
exer    -0.0092  -0.1019  -0.0001  -0.0178  -0.0093  -0.0345  -0.1033  0.0449   1.0000
veg     -0.0365  0.0032   -0.0043  -0.0155  0.0173   -0.0144  -0.0993  0.0940   0.1487   1.0000
fruit   -0.0281  -0.0344  -0.0058  -0.0407  -0.0037  0.0131   -0.0876  0.1017   0.2067   0.4920   1.0000
