
Journal of the Air & Waste Management Association

Copy of e-mail Notification z5p3387


Your article (# 08-00158) from JA&WMA is available for download

_______________________

Dear Author,

Your article, tentatively scheduled for publication in the August 2009 issue of the Journal of the Air &
Waste Management Association, is available in PDF format for your review. To access and download your
article, please go to this URL address:
http://rapidproof.cadmus.com/RapidProof/retrieval/index.jsp
Login: your e-mail address
Password: 99Ht9PpoHZN6

You will need to have Adobe Acrobat Reader software to read the PDF file of your article. This is free
software that can be downloaded at www.adobe.com/products/acrobat/readstep.html.

If you experience technical problems in accessing your PDF file, please contact rapidproof@cadmus.com

Once you have downloaded and printed your article, please read the page proofs carefully, then
-- mark changes or corrections in the margin of the page proofs and answer any queries listed on the last
page of the PDF proof (footnotes A,B,C, etc.) NOTE: ONLY MAJOR ERRORS CAN BE CORRECTED
AT THIS STAGE;
-- proofread tables, figures, and equations carefully; and
-- check that any Greek characters or symbols, especially "mu," have translated correctly.

NEXT STEPS: To ensure that your article is published expeditiously, please complete the following within
48 hours:

1) Return your corrected page proofs and signed "Author Approval" form to me (preferably by overnight
mail) at the address below, and

2) Print out and complete the attached "Page Charge and Reprint Order" form, and mail or fax it along with
your payment to the address indicated on the form (please note that this is a different address from mine).
Payment by credit card or check made out to A&WMA in US Dollars must be received by A&WMA before
your article can be published.

Please contact me directly should you have any questions or concerns. Always include your article number
(# 08-00158) with all correspondence.

Thank you for your prompt response and for choosing to publish in the Journal of the Air & Waste
Management Association.

Sincerely,

Trisha Gage
Senior Production Editor
Cadmus Professional Communications
46 Elfman Drive
Doylestown, PA 18901
Tel: +1-215-489-7120
Fax: +1-215-489-3196
E-mail: gagep@cadmus.com
JOURNAL OF THE AIR & WASTE MANAGEMENT ASSOCIATION
Page Charge and Hard Copy Reprint Order Form
INVOICE #CAD- 3387 Fed ID #25-6048614
Author’s Name: Manuscript Title:
Email:

Delivery Address if also ordering hard copies (no P.O. boxes):

Manuscript Number: 08-00158


Number of Pages: 13

Papers published in the Journal of the Air & Waste Management Association are subject to page charges of US$79 per
printed page for A&WMA members and US$89 per page for all others. After payment of page charges, the primary author
will receive a PDF copy of the published paper*. Authors who would also like hard copy reprints from A&WMA in booklet
format may order them below.
*PDFs may be used to print out and distribute 50 copies of the paper, or to e-mail to 50 recipients, or a combination of the two (e.g., print
out 25 and e-mail 25). To print or distribute additional copies, authors must receive permission from A&WMA and pay any applicable
permission fees. Authors are also prohibited from posting their articles on any Web sites without permission.
Please remit payment to “A&WMA,” c/o Karen Denne, One Gateway Center, 3rd Floor, 420 Fort
Duquesne Blvd., Pittsburgh, PA 15222-1435, USA; phone: +1-412-904-6005; fax: +1-412-232-3450; e-mail:
kdenne@awma.org.
Thank you for choosing to publish in the Journal of the Air & Waste Management Association.
PAGE CHARGES (required for all manuscripts)
Number of Pages: 13 x US$79/page, A&WMA members (Member No. _________________)
x US$89/page, Non-members
Page Charges Total: = US$_____________
COLOR ARTWORK (only due if manuscript contains color figures)
Number of Color Figures: @ US$450 for first color artwork; US$300 for second color artwork; and US$200 for each
additional color artwork.
Color Artwork Total: = US$_____________
HARD COPY REPRINTS (optional)
Total No. of Pages   Base Price (100 copies)     No. Additional Copies (in lots of 100)    Total
1 to 4               = US$150                  + ______ @ US$35/additional 100           = __________
5 to 8               = US$260                  + ______ @ US$70/additional 100           = __________
9 to 12              = US$370                  + ______ @ US$105/additional 100          = __________
>13                  = US$450                  + ______ @ US$140/additional 100          = __________
Example: You would like to order 200 reprints of a 7-page paper. The total cost would be $330 ($260 for the first 100
reprints, plus $70 for an additional 100).
Hard Copy Reprint Total: = US$_____________

GRAND TOTAL (page charges + color artwork + reprints): = US$_____________


Method of Payment (U.S. Funds only)
___ Check made out to A&WMA Credit Card (check one)
in U.S. Dollars ___ American Express ___ MasterCard
___ Discover ___ Visa
Card number
Expiration date
Name on card
Signature

Payment is due upon receipt of this invoice.


Manuscript will not be published prior to payment.
Journal of the
Air & Waste Management Association

AUTHOR’S APPROVAL OF PAGE PROOFS


** Please respond within 48 hours **

Corresponding Author: __________________________________________________

08-00158
Manuscript Title: _______________________________________________________

_______________________________________________________________________

Manuscript Number: ______________

Tentative Journal Issue: ______________

I hereby acknowledge that I have carefully read and examined the page proofs of my article referred
to above, including all text, figures, and tables, and that I have corrected all errors. I fully understand
that it is my responsibility to detect any errors on the page proofs before publication.

__________________________________________ ____________________________
Corresponding Author’s Signature Date

PLEASE COMPLETE AND RETURN THIS FORM,


ALONG WITH ANY CORRECTIONS, TO

Trisha Gage, Editorial Production Services


Cadmus Professional Communications
46 Elfman Drive
Doylestown, PA 18901

Phone: +1-215-489-7120
Fax: +1-215-489-3196
E-mail: gagep@cadmus.com

*Note: Page numbers may change prior to publication*


tapraid4/z5p-jawma/z5p-jawma/z5p00809/z5p3387d09a longd S⫽25 7/14/09 11:25 Art: 08-00158 Input-sw
TECHNICAL PAPER ISSN:1047-3289 J. Air & Waste Manage. Assoc. 59:000 – 000
DOI:10.3155/1047-3289.59.8.1
Copyright 2009 Air & Waste Management Association

Using a Continuous Time Lag to Determine the Associations between Ambient PM2.5 Hourly Levels and Daily Mortality
Joan G. Staniswalis and Hongling Yang
Department of Mathematical Sciences, University of Texas at El Paso, El Paso, TX

Wen-Whai Li
Department of Civil Engineering, University of Texas at El Paso, El Paso, TX

Kerry E. Kelly
Institute for Combustion and Energy Studies, University of Utah, Salt Lake City, UT

ABSTRACT
The authors are interested in understanding the possible association between exposure to short-term fine particulate matter (PM2.5) peaks that have changing physical characteristics throughout the day and observable health outcomes (daily mortality). To this end, modern statistical methods are used here that allow for a continuous time lag between hourly PM2.5 mass concentration and daily mortality. The functional linear regression model was used to study how hourly PM2.5 mass of past days continuously influences the daily mortality count of the current day. Using a Poisson likelihood with the canonical link, the authors found that a 10-µg/m3 increase in the hourly PM2.5 above the hourly average is associated with 1.7% (0.1, 3.4), 2.4% (1.2, 3.7), 1.6% (0.6, 2.7), and 0.8% (−0.2, 1.8) higher risk of mortality on the same day, next day, 2 days, and 3 days later, respectively. The increase in relative risk is statistically significant for lags of 0–2 days, but not at lag 3. The highest association between PM2.5 mass concentration and daily mortality was found to occur in the morning when both mass and PM number concentrations peak at approximately 8:00 a.m. (lag of 15, 39, and 63 hr). This morning time interval corresponds to automobile traffic rush hour that coincides with a morning atmospheric inversion that traps high concentrations of nanoparticles.

IMPLICATIONS
In this paper, the different sampling rates of mortality (daily) and PM2.5 (hourly) are accommodated by a historical functional linear model without any reduction of the hourly PM2.5 data. All lags of PM2.5 are simultaneously considered, thereby avoiding variable selection methods. Daily mortality was significantly associated with hourly PM2.5 (P value = 0.0011) and most highly associated with a morning peak of PM2.5 mass concentration, which is the primary peak in the total number of particles and is dominated by ultrafine particles. The results of the analysis imply that short-term exposure to elevated PM2.5 mass and number concentrations might be the better predictor of daily mortality in El Paso, TX.

INTRODUCTION
Particulate matter (PM) air pollution has been associated with adverse health effects, decreased heart variability in the elderly, and respiratory- and cardiopulmonary-related morbidity and mortality. Among the various sizes of PM, fine PM (PM ≤2.5 µm in aerodynamic diameter, or PM2.5) is of particular concern. Pope and Dockery1 summarized the evaluations of health effects associated with long- and short-term exposures to ambient PM conducted in recent years. They reported that thoracic PM (PM ≤10 µm in aerodynamic diameter, or PM10) is associated with all-cause mortality, lung cancer, and nonmalignant respiratory mortality for males and coronary heart disease in females. Higher health risks may be associated with exposure to PM2.5 rather than to thoracic PM10.

Most recently, the Women’s Health Initiative Observatory Study reported by Miller et al.2 found a 76% increase in the risk of death from cardiovascular disease for every increase of 10 µg/m3 in long-term averaged PM2.5 concentration. Dominici et al.3 evaluated the short-term health effects of PM2.5 and reported positive association between day-to-day variation in PM2.5 and hospital admissions, with the largest effect observed at 0 days lag (same day) for all cardiovascular health outcomes except ischemic heart disease (at 2 days lag time instead) using daily averaged PM2.5 and hospital admission data from 204 counties in the United States. In particular, the risk of heart failure increased by 1.28% per 10-µg/m3 increase in same-day PM2.5. Both studies echoed the earlier statement made by the U.S. Environmental Protection Agency4 that “the new studies support previous conclusions that short-term exposure to fine PM is associated with both mortality and morbidity”, where short-term is referring to 24-hr exposure. Other recent findings show changes in heart rate variability5,6 and triggering of myocardial infarction7,8 associated with 1- to 4-hr peaks in PM10.

The authors are interested in understanding the possible association between exposure to diurnal episodic (hourly) PM2.5 concentrations and short-term observable health outcomes (daily mortality). To this end, modern statistical methods are used that allow for a continuous time lag between hourly PM2.5 and daily mortality. Moreover, the different sampling rates for mortality (daily) and

Volume 59 August 2009 Journal of the Air & Waste Management Association 1
Nanoparticles;Fine Particles;Functional Linear Model;Smooth Distributed Lag Model
Staniswalis, Yang, Li, and Kelly

PM2.5 (hourly) can be accommodated with no reduction of the hourly PM2.5 profile before its inclusion as a covariate in a regression model. The authors propose the use of the functional linear regression model9 to study how hourly PM2.5 of past days continuously influences the daily mortality count of the current day. The authors wish to test the hypothesis that daily mortality rate in El Paso, TX, is associated with exposure to elevated hourly PM2.5 concentrations.

Most studies of the association between PM2.5 and daily mortality have two common elements in the statistical analysis: (1) hourly PM2.5 measurements are collapsed by summary statistics such as the daily mean, and (2) variable selection methods are used to determine the lag of daily average PM2.5 most highly associated with daily mortality. Staniswalis et al.,10 an exception to (1), demonstrated a significant association between PM10, lag 3, and daily mortality in El Paso, TX, using the first principal component score of hourly PM10, an association that would have otherwise been missed with the daily average of PM10. Notable exceptions to (2) involve the use of the distributed lag model (DLM), a regression model including time-lagged variables as covariates. The standard unconstrained DLM includes the average daily PM (PM10 or PM2.5) for all lags from, for example, 0 to q days, as covariates in the regression model. A polynomial DLM avoids the problem of multicollinearity among the lagged PM levels by constraining the coefficients to be a polynomial function of the lag number.11 The smooth DLM replaces the polynomial function with a smooth function of the lag number. Zanobetti et al.12 first applied the smooth DLM to daily summaries of pollution/mortality data from Milan, Italy.

As an extension to the study of Staniswalis et al.,10 this paper examines the association between the newly available PM2.5 and mortality data with the intention to delineate the effects caused by naturally occurring wind events (the major source for elevated PM10 concentration) and diurnal anthropogenic traffic-related pollution (the major source for elevated PM2.5 concentration). The authors use the smooth DLM with hourly PM2.5 measurements, rather than daily mean PM2.5, thereby avoiding (1) and (2) above; namely, arbitrary collapsing of the hourly measurements and variable selection for determining the lag of the pollutant most highly associated with daily mortality. The smooth DLM is an example of a historical functional linear model13; the authors review the historical functional linear model and fitting by penalized regression splines before its application.

DATA
The study was restricted to the 2000–2005 6-yr period for which records of hourly PM2.5 and temperature were obtained from the Aerometric Information Retrieval System for the site located on the edge of the University of Texas at El Paso (CAMS12). CAMS12 and CAMS40 (both state-operated continuous ambient monitoring stations) are the only monitoring stations providing hourly PM2.5 data during the study period. CAMS40 is located near a bus/train terminal in an industrial section of El Paso, which is not heavily populated, and the PM2.5 levels at CAMS40 are not likely to be representative of El Paso. Only PM2.5 data collected at CAMS12 were used for the analysis. PM2.5 at this station may not be a perfect representation of exposure for the general population of El Paso, as it is well known that use of data from a central monitoring site can lead to exposure misclassifications under various circumstances. CAMS12 data are likely to underestimate PM2.5 for the region, but CAMS12 data were the only complete set of PM2.5 data for the region and the study time period. However, PM10 data at this site were excluded from this study because of excessive missing values between July 2000 and October 2001 (>20% of the study period).

B-splines were used to interpolate the hourly PM2.5 mass concentration data to impute any missing observations spanning fewer than 5 hr. Data gaps of 5 or more hours in a given day were imputed with the mean for that day on the basis of the available PM2.5 hourly observations. Twenty-five days were missing all hourly data, so the missing values were imputed with the mean of all available hourly PM2.5 data over the 6-yr period.

Figure 1 is a graphical summary of the hourly PM2.5 data from this time period. Figure 2 displays the directions of maximum variation (first PC direction) of the hourly profiles for PM2.5 for each year of the study period. A pooled estimate of the first PC direction (as in Staniswalis et al.10 for hourly PM10) is not computed because the shapes of the curves are not the same by year. The direction of maximum day-to-day variation in the hourly profiles of PM2.5 is in the size of an 8:00 p.m. evening peak (years 2000, 2004, 2005), a late afternoon peak (year 2003), and an early afternoon peak (years 2001, 2002). Table 1 displays summary statistics of daily average PM2.5 and temperature and hourly PM2.5 for the study period.

Complete mortality data corresponding to all natural deaths in El Paso County were obtained from the Texas Department of Health for the 6-yr interval 2000–2005. Natural deaths were identified with ICD-9 codes (International Classification of Diseases codes) less than 800. For the time period under study, the average number of daily deaths is 11.3, with a variance of 12.6.

Figure 1. Boxplot (quartiles and 95th percentile) of PM2.5 on log10 scale by hour of day for 2000–2005 for El Paso, TX. The hollow box indicates the median, and the bars indicate the 25th and 75th percentiles. The solid box indicates the 95th percentile. [Axes: PM2.5 Concentration (µg/m3) versus Time (hr).]
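
The three-tier imputation rule described in the DATA section can be sketched roughly as follows. This is only an illustration, not the authors' code (their analysis was written in S-PLUS): linear interpolation is swapped in for the B-spline interpolation, the function name `impute_day` and the `None` encoding of missing hours are hypothetical, and treating a short gap at the start or end of a day like a long gap is an added assumption.

```python
def impute_day(hours, overall_mean, max_gap=5):
    """Fill missing hourly PM2.5 values (None) for one day.

    Assumed rules, following the text:
      - gaps shorter than max_gap hours: interpolate between neighbours
        (linear here; the paper uses B-splines)
      - gaps of max_gap or more hours: use that day's mean of observed hours
      - day with no observations at all: use the overall mean
    Short gaps touching the day boundary fall back to the day mean
    (an assumption made for this sketch).
    """
    observed = [v for v in hours if v is not None]
    if not observed:
        return [overall_mean] * len(hours)
    day_mean = sum(observed) / len(observed)

    out = list(hours)
    n = len(hours)
    i = 0
    while i < n:
        if hours[i] is not None:
            i += 1
            continue
        j = i
        while j < n and hours[j] is None:
            j += 1                      # j is the first observed index after the gap
        if (j - i) < max_gap and i > 0 and j < n:
            left, right = hours[i - 1], hours[j]
            span = j - (i - 1)
            for k in range(i, j):
                weight = (k - (i - 1)) / span
                out[k] = left + weight * (right - left)
        else:
            for k in range(i, j):
                out[k] = day_mean
        i = j
    return out


# Illustrative values only (not data from the study).
short_gap_day = [6.0, None, None, 9.0] + [8.0] * 20
filled = impute_day(short_gap_day, overall_mean=9.0)
```

With a B-spline in place of the linear step, short gaps would follow the local curvature of the diurnal profile rather than a straight line between neighbours.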


Figure 2. First harmonic from principal component analysis of PM2.5 for 2000–2005 for El Paso, TX. [Plot: PM2.5 1st PC, Yrs 00-05; unsmoothed PC directions by hour of day. Legend: 2000 1st PC direction, 32%; 2001, 31%; 2002, 43%; 2003, 35%; 2004, 40%; 2005, 42%.]

HISTORICAL FUNCTIONAL LINEAR MODEL
A historical functional linear model13 refers to the special case when the influence of the covariate on the response is of a feed-forward nature. This means that the behavior of daily mortality y(t) depends only on the behavior of hourly PM2.5 x(s) at past times s ≤ S(t). Here the transformation S(t) = 24t converts the daily index t ∈ {1, …, T} of mortality y(t) to the hourly index s of PM2.5 x(s). Daily mortality is modeled as a Poisson distributed variable with mean denoted by µ_t = E[y(t)]. The canonical link for the Poisson distribution is used, meaning that η_t = ln µ_t is the linear predictor, which the authors model as

$$\eta_t = \alpha(t) + \int_0^{S(t)} x(S(t) - u)\,\beta(u)\,du. \tag{1}$$

The intercept function α(t) is the overall summary of other factors influencing mortality y on the current day t. The slope function β(u) reflects the strength of the association between pollution and mortality at a lag of u hours. For example, the value of β(u) at u = 12 hours lag describes the influence of the noon reading of hourly PM2.5 on today’s deaths, whereas β(u) at u = 36 hours lag describes the influence of yesterday’s noon hour reading of PM2.5 on today’s deaths. The relative risk, RR, associated with a 10-µg/m3 uniform increase in the hourly PM across a lag of [T1, T2] hours, while holding other variables fixed, can be computed according to

$$RR = \exp\left(10 \int_{T_1}^{T_2} \beta(u)\,du\right). \tag{2}$$

Parametric models that are linear in daily average PM2.5 assume a functional form for β(u). For example, a linear model in x̄_{t,0}, daily average PM2.5 with zero lag

Table 1. Summary statistics for temperature and PM2.5: mean, standard deviation, minimum, quartiles, and maximum.

Variable                          Mean   SD   Minimum   P25   Median (P50)   P75   Maximum
Daily average temperature (°F)      67   15        26    54             69    80        94
Daily average PM2.5 (µg/m3)          9    6       0.5     6              8    11        99
Hourly PM2.5 (µg/m3), by time interval (hr)
  [0,1)                             10   17         0     4              6    11       580
  [1,2)                              8   10         0     3              6    10       155
  [2,3)                              8    8         0     3              6     9       121
  [3,4)                              7    7         0     3              5     9        91
  [4,5)                              7    7         0     4              6     9       161
  [5,6)                              8    7         0     4              7    10       137
  [6,7)                              9    7       0.1     5              8    12        93
  [7,8)                             10    6       0.3     6              8    12        77
  [8,9)                              9    6         0     5              8    11        57
  [9,10)                             8    6         0     4              7    10        48
  [10,11)                            7    7         0     3              6     9       101
  [11,12)                            6   11         0     2              5     8       399
  [12,13)                            6   13         0     2              4     7       353
  [13,14)                            6   14         0     2              4     7       265
  [14,15)                            7   17         0     2              4     7       396
  [15,16)                            7   21         0     2              4     7       636
  [16,17)                            8   19         0     3              4     8       405
  [17,18)                            9   17         0     3              6    10       417
  [18,19)                           12   16         0     5              8    13       270
  [19,20)                           15   17         0     6             10    18       180
  [20,21)                           15   15         0     6             10    20       189
  [21,22)                           13   16         0     5              9    17       347
  [22,23)                           12   16         0     4              8    14       417
  [23,24)                           10   14         0     4              7    12       287

Notes: SD = standard deviation.


$$\eta_t = \alpha(t) + \beta_0 \bar{x}_{t,0}, \tag{3}$$

is formulated in terms of a historical functional linear model by specifying β(·) to be the boxcar function

$$\beta(u) = \begin{cases} \frac{1}{24}\beta_0 & u \in [0, 24] \text{ hours} \\ 0 & \text{otherwise.} \end{cases} \tag{4}$$

In which case, the relative risk RR according to eq 2 is exp(10β0) for a 10-µg/m3 uniform increase in the hourly PM across [0,24] hours on the current day.

Consider instead the linear model

$$\eta_t = \ln\mu_t = \alpha(t) + \beta_2 \bar{x}_{t,2}, \tag{5}$$

where x̄_{t,2} is daily average PM2.5 at a 2-day lag. In this case β(u) is specified as

$$\beta(u) = \begin{cases} \frac{1}{24}\beta_2 & u \in [48, 72] \text{ hours} \\ 0 & \text{otherwise.} \end{cases} \tag{6}$$

The relative risk RR according to eq 2 is exp(10β2) for a 10-µg/m3 uniform increase in the hourly PM across [48,72) lagged hours. As a final example, if daily average PM2.5 is used to predict daily mortality with a 0-, 1-, and 2-day lag

$$\eta_t = \ln\mu_t = \alpha(t) + \beta_0 \bar{x}_{t,0} + \beta_1 \bar{x}_{t,1} + \beta_2 \bar{x}_{t,2}, \tag{7}$$

then

$$\beta(u) = \begin{cases} \frac{1}{24}\beta_0 & u \in [0, 24] \text{ hours} \\ \frac{1}{24}\beta_1 & u \in [24, 48] \text{ hours} \\ \frac{1}{24}\beta_2 & u \in [48, 72] \text{ hours} \\ 0 & \text{otherwise.} \end{cases} \tag{8}$$

In each of the above parametric examples, the coefficients β0, β1, and β2 could be estimated from the data using the usual time-series methods for Poisson regression and would have the usual interpretation as the coefficient of daily average PM2.5 with 0-, 1-, or 2-day lag, respectively.

In this paper, a functional form for β(u) is not specified, instead it is estimated using nonparametric smoothing methods described in the next section. In the data analysis, the nonparametric estimate of β(u) will be compared with the parametric estimate given by eq 8.

Additional Modeling Assumption
It is reasonable to suppose that only pollution exposures corresponding to time lags u ≤ τ are likely to impact mortality on a given day, leading to the following truncated historical functional linear model for η_t = ln µ_t:

$$\eta_t = \alpha(t) + \int_0^{\min(S(t),\,S(\tau))} x(S(t) - u)\,\beta(u)\,du. \tag{9}$$

In any case, computer implementation of the historical functional linear model requires truncation of the lags used in prediction of the current day. Truncating the past influence of pollution to τ days has the effect of stabilizing the variance of the estimator of β(·); although, it comes at the cost of introducing one more unknown parameter.

An additional modeling assumption is made that further distinguishes the authors’ model from that of Zanobetti et al.12 In a final step taken to reduce the dimensionality of the regression problem, it is assumed that the intercept function α(t) in eq 9 is a periodic function.

$$\alpha(t) = \alpha(t + 365), \quad t \in \{1, \cdots, T\}. \tag{10}$$

This intercept function α(t) accounts for seasonal changes over time t that influence daily mortality y_t. However, assuming periodicity of α(·) necessitates augmenting eq 9 to include a slowly varying nonparametric function s1(t) to model nonperiodic changes over time t; for example, changes in y attributable to population growth over time.

$$\eta_t = \alpha(t) + s_1(t) + \int_0^{\min(S(t),\,S(\tau))} x(S(t) - u)\,\beta(u)\,du + s_2(\mathrm{temperature}_t) + \text{terms for day-of-week}_t. \tag{11}$$

Equation 11 now also includes nonparametric additive terms s2(·) for modeling the dependence of mortality on the daily mean of temperature and indicator variables for day of the week. The nonparametric functions in the additive model given by eq 11 are all modeled as linear combinations of a flexible cubic B-spline basis. Detailed treatment of the estimation of these functions is included in the Appendix for interested readers. The data analysis was written in the S programming language using S-PLUS.

Similarly, other variables, either co-pollutants (e.g., nitrogen dioxide [NO2], ozone, coarse PM [or PM10–2.5]) or atmospheric conditions (mixing height, humidity, wind speed) can be incorporated into the model to evaluate the dependence of mortality on them. In this study, the authors have limited the nonparametric additive term to one for ambient temperature. Dew point was not included in the model because it is highly correlated with temperature, r = 0.6 sample correlation.

RESULTS
The truncated historical functional linear model (eq 11) was fit to the El Paso 2000–2005 pollution/mortality data. Hourly PM2.5 was a significant predictor of daily mortality adjusted for temperature and day of the week. Figure 3 overlays the fitted component functions in the additive model (eq 11) that includes the covariate hourly PM2.5 for τ = 4, 6, 8, 10, and 12 days in the prediction of daily mortality. The parameter τ determines the number of


Figure 3. Estimates of the component curves in eq 11, including lags of up to τ = 4, 6, 8, 10, and 12 for years 2000–2005 for El Paso, TX. The placement of the knots is indicated by the tic marks. The shapes of all of the component functions are consistent across all values of τ. Simulations14 suggest that late truncation is preferred over early truncation to reduce bias; the value τ = 8 was selected for prediction of mortality.

days for which hourly PM2.5 measurements are included as a predictor of daily mortality; for example, τ = 12 includes the hourly PM2.5 measurements for the current day and the previous 11 days. The values τ = 4, 6, 8, 10, and 12 days were chosen arbitrarily for visual inspection. The knots for the periodic intercept function α(t) were placed at 2-month (60 days) intervals, the knots for the slope function β(·) were placed at 2-day intervals, the knots for the smooth dependence of time s1(t) were placed at 2-yr intervals, and those for the smooth dependence on temperature s2(temperature) were placed at 10° intervals. For each component in Figure 3, the placement of the knots is indicated by the tic marks. The values of the smoothing parameter λ were obtained by trial and error and set to λ_α = 10^4, λ_β = 10^7, and the smoothing parameters for the other smooth functions were set to 10^6. A larger value of λ results in a smoother fit of the regression, as compared with a smaller λ, which provides a much rougher fit. The shapes of all of the component functions in the full model are fairly consistent across all values of truncation τ. However, because the dimension of the B-spline basis needed for estimation of the slope β(·) in the full model increases in τ, the power of the scaled deviance test statistic goes down as τ increases. Thus, it was found that the P value for significance of PM2.5 ranges from 0.0003 to 0.003 as τ increases from 4 to 12.

Computer simulations14 suggested that late truncation was preferred over early truncation to reduce bias. Thus, those computer simulations suggest that results shown in Figure 3b are preferred with truncation at τ = 8 days for prediction of mortality. See Figure 4 for the fitted components of eq 11 along with ±2 standard error bars with τ = 8.

The fit in Figure 4a for the intercept function α(t) indicates that daily mortality tends to be highest in January, decreasing in time until a minimum at approximately 180 days (June/July), then increasing for the rest of the year without attaining the global maximum seen in January. The fit in Figure 4b for the slope function β(·) is convex with a global maximum between 0.5 and 3 days, suggesting that the association between daily mortality and PM2.5 is highest for a 0.5- to 3-day lag. The pointwise confidence intervals for the estimate of the slope function β(·) do not contain 0 for lags within approximately 0.5–3 days, signifying that PM2.5 is a significant predictor of daily mortality. Furthermore, Table 2 shows that an increase of 10 µg/m3 in the hourly PM2.5 across a day would be associated with an RR of 1.017 (1.001,1.034), 1.024 (1.012,1.037), 1.016 (1.006,1.027), and 1.008 (0.998,1.018) deaths on the same day and following 3 days, respectively; see eq 2 for RR. The increases in RR for lags of 0–2 days are statistically significant, but not for the 3-day lag. These RR estimates are consistent with those reported based on daily average PM2.5; namely, RR between 0.9 and 1.17, with the exception of one study reporting a nonsignificant finding of 1.32 (U.S. Environmental Protection Agency4 2006, Table 1).

The fit in Figure 4c for s1(t) shows that daily mortality was steadily increasing in time over the 6-yr period, although the rate of increase was lower over 2002–2004. Figure 4d displays the fit for s2(temperature), showing that mortality tends to decrease as temperature increases. Although this seems counterintuitive at first, it is consistent with Figure 4a, which shows that June/July (months with the highest temperatures) on average have fewer deaths, whereas December/January (months with the lowest temperatures) on average have the most deaths. So it seems


[Figure 4 panels. Header: p-value = 0.0011, df = 5 for significance of PM2.5. (a) intercept function: periodic intercept versus time (days); (b) slope function: slope PM2.5 versus lag (days); (c) s1(t): smooth dependence versus time (days); (d) s2(t): smooth dependence versus temperature; (e) terms for day-of-week: change in intercept versus day of week, 1: Saturday, ..., 7: Friday.]


Figure 4. Estimates of the component curves in eq 11. The nonparametric estimates (solid lines) including lags of up to τ = 8, displayed with ±2 pointwise standard errors (dashed lines). (a) Mortality tends to be highest in January, decreasing in time until a minimum in June/July, then increasing for the rest of the year without attaining the global maximum attained in January. (b) The fitted slope function has a global maximum between 0.5 and 3 days, suggesting that the association between daily mortality and PM2.5 is highest for a 0.5- to 3-day lag. The pointwise confidence intervals do not contain 0 for lags within approximately 0.5–3 days, signifying that PM2.5 is a significant predictor of daily mortality. (c) On average, daily mortality is increasing in time, although the rate of increase was lower over the years 2002–2004. (d) Mortality tends to decrease as temperature increases. This is consistent with (a), which shows that June/July on average have fewer deaths, whereas December/January have the most deaths. The estimate of smooth dependence on temperature may be viewed as an adjustment of the smooth intercept function in (a). (e) The coefficients of day of the week suggest that deaths on average significantly increase on Friday.

plausible that the estimated smooth function s2(temperature) is an adjustment of the smooth intercept α(t) shown in Figure 4a. Lastly, Figure 4e displays the coefficients for the indicator variables for day of the week (Saturday through Friday). Only the coefficient for the indicator function for Friday is significantly different from zero, and because the coefficient is greater than zero, on average there are more deaths on Friday than the other days of the week.

Thus far, because this analysis has placed the knots for estimation of the slope β(·) at 2-day intervals, fine details of the effects of hourly PM2.5 on daily mortality are obscured by the rigidity of the B-spline basis. To develop a better understanding of the association between PM2.5 and daily mortality at specific hours, the regression analysis is executed with regularly spaced knots at 2-hr intervals, λ_β = 10^8 and τ = 3 for the estimation of the slope function β(·). The knots and smoothing parameters for the other additive components α(·), s1(t), and s2(temperature) remain as before. Figure 5 displays the function estimates with ±2 standard error bars in solid lines, from which it can be seen that morning hourly PM2.5 levels around 8:00 a.m. (lags of 15, 39, 63 hr) are most highly associated with mortality. The fits for the other terms in the model were similar to those given in Figure 4 and are not displayed.

As a basis for comparison with other published results, relative risk is also calculated based on the parametric Poisson linear model (eq 8) that uses the daily average

Table 2. Predicted effect of a 10-μg/m3 increase in hourly PM2.5 across 1 day. Entries are relative risk (~95% confidence interval).

Model terms for PM2.5     Same Day               1-Day Lag              2-Day Lag              3-Day Lag
Hourly, eq 2, τ = 8       1.017 (1.001, 1.034)   1.024 (1.012, 1.037)   1.016 (1.006, 1.027)   1.008 (0.998, 1.018)
Daily average, eq 8       1.018 (0.997, 1.039)   1.024 (1.004, 1.046)   1.023 (1.003, 1.044)
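
As a rough illustration of how entries such as those in Table 2 arise, the sketch below assumes a log-linear Poisson model, in which a slope per μg/m3 translates to a relative risk of exp(10 × slope) for a 10-μg/m3 increase, and it assumes the interval is formed from ±2 standard errors on the log scale. The function name `relative_risk` and the slope and standard error are hypothetical; they are backed out of the 1-day-lag daily average entry of Table 2 for illustration and are not fitted values reported by the authors.

```python
import math


def relative_risk(slope, se, increment=10.0, z=2.0):
    """Relative risk and approximate interval for an `increment`-unit rise
    in the covariate under a log link (assumed: interval is +/- z * se)."""
    rr = math.exp(increment * slope)
    lower = math.exp(increment * (slope - z * se))
    upper = math.exp(increment * (slope + z * se))
    return rr, lower, upper


# Hypothetical inputs, backed out of the Table 2 entry 1.024 (1.004, 1.046).
slope = math.log(1.024) / 10.0
se = (math.log(1.046) - math.log(1.004)) / (2 * 2.0 * 10.0)

rr, lower, upper = relative_risk(slope, se)
print(f"RR = {rr:.3f}, approximate interval = ({lower:.3f}, {upper:.3f})")
```

Because the interval is symmetric on the log scale, it is slightly asymmetric around the relative risk itself.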

6 Journal of the Air & Waste Management Association Volume 59 August 2009
Staniswalis, Yang, Li, and Kelly

PM2.5 at 0-, 1-, and 2-day lags (Table 2). The changes in relative risk for lags 1–2 are statistically significant, but not for lag 0. The nonparametric model gave statistically significant changes in relative risk for all lags 0–2, but the relative risk estimates themselves at lag 0 are similar. The parametric estimates of the slope function β(·) under eq 8 together with ±2 standard error bars are graphed in Figure 5 in dashed lines. The parametric estimate seems to reflect a mean value of the nonparametric smooth estimates over each of the 24-hr periods, obscuring the relationship between hourly PM2.5 and daily mortality. The findings using hourly and daily average PM2.5 (see Table 2) are in line with those of Franklin et al.,15 who reported a relative risk of 1.012 for all-cause mortality associated with a 10-μg/m3 increase in the previous day's PM2.5 average concentration in a study of 1.3 million deaths in 27 U.S. communities.

[Figure 5 plot: slope vs. lag (days), with 3AM, 8AM, and 1PM marked within each day.]

Figure 5. Estimates of the component curves in eq 11. The solid lines are the nonparametric estimates of β(u), τ = 3, with ±2 pointwise standard errors. The x-axis tick marks display the location of the knots in time. The dashed lines are the parametric estimates of the coefficients of β(u) corresponding to the daily average PM2.5 at lags 0, 1, and 2 days with ±2 pointwise standard errors.

The authors next examined the physical characteristics of PM2.5 at this site. Temporal variations of PM2.5 mass and size-resolved number concentrations collected at this site for two consecutive seasons (October 26 to November 12, 2006 for the fall season and March 29 to April 15, 2007 for the spring season) using an aerodynamic particle sizer, a scanning mobility particle sizer, and two tapered element oscillating microbalance monitors were further analyzed.16,17 A PM2.5 mass peak of lesser magnitude occurring in the morning hours between 6:00 and 9:00 a.m. was observed for 50% of the time during the exploratory study. It was observed that the number concentration peaked in the morning hours, whereas another number concentration peak of lesser magnitude was observed in the evening (Figures 6 and 7), and the mode changed from approximately 20 to 40 nm at night (Figure 8). The total number of particles peaked in the morning as well as in the evening, whereas the mode of the particle size changed from approximately 20 to 40 nm, indicating that different PM sources may be responsible for the mass and number concentrations and/or agglomeration of particles in the atmosphere during the day. The authors thus believe that mortality is associated with exposure to a high number of nanoparticles in the morning hours, most likely generated by traffic-related activities, rather than with the ambiguously defined mass concentration, and hypothesize that number concentration is a better predictor of mortality.

Figure 6. Summary of hourly total number concentrations at CAMS12.

DISCUSSION
The goal in this study is to use PM2.5 as an indicator to shed light on the associations between mortality and pollutants. This analysis uses the raw hourly PM2.5 mass concentrations in a linear model for prediction of daily mortality, avoiding the arbitrary use of the daily mean, which inadvertently obscures the significance of acute (hourly) PM exposure on various health outcomes.10 At


high ambient concentrations, acute short-term (1-hr) exposure to diesel engine exhaust has been shown to produce a well-defined and marked systemic and pulmonary inflammatory response in healthy human volunteers.18,19 Studies have begun to link short-term increases in hourly PM2.5 concentrations with adverse cardiac effects.20 However, few studies have evaluated the effect of short-term increases in PM2.5 concentration on daily mortality.

Figure 7. Normalized diurnal distributions and modes of nanoparticles of sizes 30 nm and below at CAMS12: (a) fall 2006 season, and (b) spring 2007 season.

Our fitted model differs from that of Zanobetti et al.12 in that the smooth function in time included two additive components: a periodic piece of period 365 days, α(t), and a slowly varying function in time, s1(t) (see eq 11). This has the advantage of reducing the dimensionality of the model. Furthermore, Zanobetti et al.12 used daily average PM2.5 in a functional linear model, whereas hourly measurements were used here.

The association over time of PM2.5 and mortality was studied by varying the selection of the knots and the truncation point τ in the historical functional linear model (see eq 11). This is reviewed below. Figure 4 shows that PM2.5 lags of 0.5–3 days are most highly associated with daily mortality. This is consistent with the published literature reviewed by the U.S. Environmental Protection Agency.4 Such a global viewpoint over the effects of time was achieved by (1) placing the knots for estimation of β(·) at regular 2-day intervals, and (2) allowing PM2.5 hourly levels for day t and the previous 7 days (τ = 8) to enter the linear model for prediction of mortality on day t. Figure 5 allows for a more detailed view in time of the relationship between PM2.5 and mortality that is not possible in studies that use the daily average PM. Such a detailed view was obtained by (1) placing the knots for estimation of β(·) at regular 2-hr intervals, and (2) choosing τ = 3 to zoom in on the association between daily mortality and hourly PM2.5 on the current day and previous 2 days. The highest association between PM2.5 mass concentration and mortality occurs with morning PM2.5 levels at approximately 8:00 a.m. (lag of 15 hr) or earlier.

The irrelevance of the even higher diurnal evening PM2.5 mass peak (Figure 1) to mortality and the association between PM2.5 mass and mortality previously reported for El Paso10 deserve some investigation. Staniswalis et al.10 found that the diurnal evening peak (likely caused by traffic-related emissions) and the occasional afternoon peak (caused by high winds) explain 40 and 15%, respectively, of the day-to-day variation in the daily PM10 profiles in El Paso. However, the obvious morning peak (Figure 2 of Staniswalis et al.10) was obscured mathematically in their principal component analysis of a 5-yr record of PM10 and mortality data. They further observed that mortality in El Paso may be underestimated by 20% if daily average PM10 instead of hourly PM10 is used and that relative risk is approximately 20% higher when the evening PM10 peak occurred under low-wind conditions (wind speed <2.5 m/sec).

Figure 8. Normalized average particle size distributions: (a) fall 2006 season, and (b) spring 2007 season.

It is well known in the literature that although the direction of maximum variation (evening peak) may be significantly associated with the response (mortality), the direction of maximum variation (evening peak) is not necessarily the most highly correlated covariate with a given response (mortality).21 Power to detect statistical significance of a covariate is driven by two features of the


covariate: the variance of the covariate and the correlation with the response. A covariate may be highly correlated with a response, but there may not be enough power to detect this when the variance of the covariate is small (a PC direction of lesser variation). Figure 2 shows that indeed the evening peak is still the primary direction of maximum variation in this study explaining the day-to-day variation of PM2.5 mass, although mathematically it is not as dominant as that observed in the PM10 analysis of Staniswalis et al.10

The conclusion that mortality risk associated with either diurnal evening or occasional afternoon PM10 peaks under high-wind conditions is lower than that under low-wind conditions10 and the fact that PM10 mass in the region is composed of mostly PM10–2.5 at 75–90%22–24 imply that the coarse fraction of PM10 may be less potent than PM2.5 and that PM2.5 may be a better predictor than PM10 for daily mortality. Unfortunately, the authors were unable to fit the functional linear model on PM10 data at this CAMS12 site to further validate the authors' premise because of excessive missing data.

The fact that the morning PM2.5 peak is highly correlated with daily mortality, as shown in Figure 5, is interesting, particularly when the evening peak is not implicated in the analysis (Figure 1). The authors thus looked into the physical and chemical difference between morning and evening peaks. Chemical speciation data for the morning and afternoon peaks during this period of time are not available. The most relevant data available are a set of 4-week time-resolved PM2.5 chemical speciation data collected at a station approximately 3.5 mi northwest of the CAMS12 site23 in the winter of 2002. Concentrations of anions, elemental carbon, organic carbon, and elemental composition of PM2.5 for the 3-hr samples collected in the evening (6:00–9:00 p.m.) appear to be greater than those collected at other 3-hr intervals. Concentrations of chemical composition of morning PM2.5 (6:00–9:00 a.m.) are consistently less than those observed in the evening (6:00–9:00 p.m.) and show little difference from those observed throughout the day. Thus, species and magnitudes of chemicals absorbed onto PM2.5 do not appear to be a factor in the dominance of the morning peak on mortality.

CONCLUSIONS
A smooth distributed lag model was used to analyze 6 yr of continuous records of hourly PM2.5 and daily mortality data in El Paso, TX. The authors discovered a statistically significant association between the morning PM2.5 peak at a central monitoring site at 0.5–3-day lag and daily mortality in the general population. By examining the available physical and chemical properties of PM2.5 for the site, the authors believe that the total number of nanoparticles, particularly those in the range of approximately 20 nm, likely generated by traffic-related activities, is the principal factor of the association and that the total number of particles may be a better and simpler indicator for mortality. A limitation of this study is that in El Paso only one PM2.5 monitoring site was available with complete data for the study period. Future studies are needed to confirm the findings for El Paso, to determine the generalizability of the findings to other areas of the United States, and to evaluate the association between daily mortality and other pollutant components such as nitric oxide, carbon monoxide, and NO2.

APPENDIX
This appendix lays out the matrix expression for the linear predictor η_t = ln μ_t and the equations for estimation of the parameters.

Nonparametric Curve Fitting with P-splines
For model identifiability, the usual constraints

\[ \sum_{i=1}^{T} s_1(t_i) = 0 \tag{12} \]

\[ E[s_2(\text{temperature})] = 0 \tag{13} \]

are imposed on the estimates in eq 11. The nonparametric functions in the additive model are all modeled as linear combinations of a cubic B-spline basis,26,27 meaning that the fitted functions are piecewise cubic polynomials smoothly joined together at prespecified points in time called knots. Cubic B-splines provide a convenient flexible basis for the representation of smooth curves; for example, see Ostro et al.,28 who modeled the dependence of daily mortality on time and weather. B-splines have compact local support that can be exploited to obtain stable and efficient algorithms for nonparametric curve fitting. Adjacent B-splines have overlapping support that is exploited in the P-spline curve fitting technology.29 This will be described next.

Let the vector β denote the coefficients in the B-spline representation of the unknown regression functions and the coefficients for the indicator variables of day of the week. The vector β is estimated from the data by maximization with respect to β of the penalized Poisson log-likelihood30,31 with the canonical link function:

\[ \mathcal{L}(\beta; \text{data}) - P(\lambda, \beta). \tag{14} \]

Here the Poisson log-likelihood is conditional on the covariates. The second term in the penalized log-likelihood is a difference penalty that is explicitly given later in this appendix. It has the effect of imposing smoothness on the regression function by requiring that the heights of neighboring B-splines with overlapping support "hold hands" to withstand erratic fluctuations in the data. This difference penalty is similar in flavor to the roughness penalty based on the squared integral of the second derivative of the regression function. A large value of the smoothing parameter λ results in a smoother fit of the regression function, as compared with small values of λ, which provide a much rougher fit.

The estimate of the parameter β that maximizes eq 14 was computed by iteratively reweighted least squares.31 This is explained later in this appendix. The scaled Poisson deviance31 was used to test for significance of the covariate x as a predictor of y in eq 11 for the Poisson mean.
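
The effect of the smoothing parameter on a penalized Poisson fit can be sketched in a few lines. The example below is a minimal illustration, not the authors' code: it uses synthetic counts, a degree-0 (piecewise-constant) B-spline basis in place of the cubic basis to stay dependency-light, a second-order difference penalty, and a least squares starting value. The names `second_difference_matrix`, `penalized_poisson_irls`, and `roughness` are hypothetical. The update inside the loop follows the iteratively reweighted least squares form given in eqs 46 to 48 later in this appendix.

```python
import numpy as np


def second_difference_matrix(k):
    """Return the (k - 2) x k second-order difference matrix with rows (1, -2, 1)."""
    return np.diff(np.eye(k), n=2, axis=0)


def penalized_poisson_irls(chi, y, penalty, n_iter=100, tol=1e-8):
    """Maximize the Poisson log-likelihood minus 0.5 * beta' penalty beta by IRLS."""
    # Starting value (an assumption of this sketch): least squares fit to log counts.
    beta = np.linalg.lstsq(chi, np.log(y + 0.5), rcond=None)[0]
    for _ in range(n_iter):
        eta = chi @ beta
        mu = np.exp(eta)                      # canonical (log) link
        z = eta + (y - mu) / mu               # pseudo-observations
        lhs = chi.T @ (mu[:, None] * chi) + penalty
        rhs = chi.T @ (mu * z)
        beta_new = np.linalg.pinv(lhs) @ rhs  # generalized inverse
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


def roughness(beta):
    """Sum of squared second differences of the coefficients."""
    return float(np.sum(np.diff(beta, n=2) ** 2))


# Synthetic daily counts (made up for illustration, not the El Paso data).
rng = np.random.default_rng(0)
T, K = 200, 40
days = np.arange(T)
y = rng.poisson(np.exp(2.4 + 0.3 * np.sin(2 * np.pi * days / T))).astype(float)

# Degree-0 B-spline basis: one indicator column per 5-day bin.
chi = np.zeros((T, K))
chi[days, days // (T // K)] = 1.0

D2 = second_difference_matrix(K)
fits = {}
for lam in (0.01, 1000.0):
    penalty = lam * D2.T @ D2
    beta_hat = penalized_poisson_irls(chi, y, penalty)
    # Penalized score; it should vanish at the maximizer.
    score = chi.T @ (y - np.exp(chi @ beta_hat)) - penalty @ beta_hat
    fits[lam] = (beta_hat, score)
    print(f"lambda={lam:g}: roughness={roughness(beta_hat):.4g}, "
          f"max |score|={np.max(np.abs(score)):.2e}")
```

The larger smoothing parameter yields coefficients with smaller squared second differences, which is the "holding hands" effect described above; swapping in a cubic B-spline design matrix would change only how `chi` is built.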


Estimating β Alone
Here the method of estimation is developed step by step, beginning with a simplified model for t ∈ {1, …, T} that has no intercept. Set the truncation point for the historical functional linear model at τ ≤ 365 days. Later, the equations are developed for simultaneous estimation of the intercept and the slope functions. So for now, the model is

\[ \eta_t = \int_0^{\min(S(t),\,S(\tau))} x(S(t) - u)\,\beta(u)\,du. \tag{15} \]

Let {N_j(t), j = 1, …, K} denote the cubic B-spline basis for a given knot sequence on [0, S(τ)]. The dependence of K on τ is suppressed from the notation to simplify the presentation that follows. The slope function is approximated by

\[ \beta(u) = \begin{cases} \sum_{j=1}^{K} b_j N_j(u), & u \in [0, S(\tau)] \\ 0, & u > S(\tau). \end{cases} \tag{16} \]

An equivalent formulation to eq 15,

\[ \eta_t = \sum_{j=1}^{K} b_j \phi_j(t; x), \tag{17} \]

is obtained by defining

\[ x_+(S(t) - u) = \begin{cases} x(S(t) - u) & \text{when } u \le S(t) \\ 0 & \text{otherwise,} \end{cases} \tag{18} \]

\[ \phi_j(t; x) = \int_0^{S(\tau)} x_+(S(t) - u)\,N_j(u)\,du. \tag{19} \]

Approximate the integrals defining φ_j(t; x) with a quadrature rule using the Q equally spaced time points {g_q}, q = 1, …, Q, in [0, S(τ)]:

\[ \phi_j(t; x) \approx \frac{S(\tau)}{Q} \sum_{q=1}^{Q} x_+(S(t) - g_q)\,N_j(g_q). \tag{20} \]

Let X′(t) = (x_+(S(t) − g_1), …, x_+(S(t) − g_Q)) and b′ = (b_1, …, b_K), and

\[ N_b = \begin{bmatrix} N_1(g_1) & N_2(g_1) & \cdots & N_K(g_1) \\ N_1(g_2) & N_2(g_2) & \cdots & N_K(g_2) \\ \vdots & \vdots & & \vdots \\ N_1(g_Q) & N_2(g_Q) & \cdots & N_K(g_Q) \end{bmatrix}_{Q \times K}. \tag{21} \]

Set h = S(τ)/Q. Then eq 17 can be written in matrix form:

\[ \eta_t = h \cdot X'(t)\,N_b\,b. \tag{22} \]

Setting h = 1 hr, X(t) can be evaluated from the observed raw pollution data. Define η and X:

\[ \eta = \begin{bmatrix} \eta_{t_1} \\ \eta_{t_2} \\ \vdots \\ \eta_{t_T} \end{bmatrix}_{T \times 1}, \qquad X = \begin{bmatrix} X'(t_1) \\ X'(t_2) \\ \vdots \\ X'(t_T) \end{bmatrix}_{T \times Q}, \tag{23} \]

so that the model (eq 17) for the entire T days can be written as

\[ \eta = h \cdot X N_b b. \tag{24} \]

Estimating α(·) and β(·) Simultaneously
Now let us include the intercept function α(t) with period P = 365 days

\[ \eta_t = \alpha(t) + \int_0^{\min(S(t),\,S(\tau))} x(S(t) - u)\,\beta(u)\,du, \qquad t \in \{1, \ldots, T\} \tag{25} \]

and simultaneously estimate α(·) and β(·). It is assumed that T is a multiple of P. Choose a cubic B-spline basis {M_j(t)}, j = 1, …, d, of dimension d on [0, P] for representation of α(t):

\[ \alpha(t) = \sum_{i=1}^{d} a_i M_i(t \bmod P), \qquad t \in \{1, \ldots, T\}. \tag{26} \]

The matrix form of α(t) in terms of the B-spline basis is:

\[ \begin{bmatrix} \alpha(t_1) \\ \alpha(t_2) \\ \vdots \\ \alpha(t_T) \end{bmatrix} = \begin{bmatrix} N_a \\ N_a \\ \vdots \\ N_a \end{bmatrix}_{T \times d} a, \tag{27} \]

where a′ = (a_1, …, a_d) and

\[ N_a = \begin{bmatrix} M_1(t_1) & M_2(t_1) & M_3(t_1) & \cdots & M_d(t_1) \\ M_1(t_2) & M_2(t_2) & M_3(t_2) & \cdots & M_d(t_2) \\ \vdots & \vdots & \vdots & & \vdots \\ M_1(t_P) & \cdots & \cdots & \cdots & M_d(t_P) \end{bmatrix}_{P \times d}. \tag{28} \]

Incorporating eq 27 into eq 24

\[ \eta = [\,J \;\; X\,] \begin{bmatrix} N_a & 0 \\ 0 & h N_b \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} \tag{29} \]

with J (of dimension T × P) composed by stacking identity matrices of dimension P × P; that is,


\[ J' = [\,I_{P \times P}, \ldots, I_{P \times P}\,]. \tag{30} \]

The number of identity matrices in the stack must match the number of years of data. Set

\[ \tilde{X} = [\,J \;\; X\,], \qquad \tilde{N} = \begin{bmatrix} N_a & 0 \\ 0 & h N_b \end{bmatrix}, \qquad \text{and} \quad \beta = \begin{bmatrix} a \\ b \end{bmatrix}, \tag{31} \]

then

\[ \eta = \tilde{X} \tilde{N} \beta \tag{32} \]

\[ = \chi \beta \tag{33} \]

\[ = [\,\chi_a \;\; \chi_b\,] \begin{bmatrix} a \\ b \end{bmatrix}. \tag{34} \]

Regularized Estimates of α(t) and β(t)
The Poisson log-likelihood of the data is given by

\[ \mathcal{L}(\beta) = \sum_{i=1}^{T} (-\mu_i + y_i \ln \mu_i), \tag{35} \]

where μ_i is the parameter for the Poisson distribution that generated the mortality count y_i. Assume the canonical link function μ_i = exp(χ_i^T β), with χ_i^T denoting the ith row of χ given in eq 33. Thus,

\[ \frac{\partial \mathcal{L}}{\partial \beta} = \sum_{i=1}^{T} (y_i - \mu_i)\,\chi_i \tag{36} \]

\[ = \chi^t (Y - \mu), \tag{37} \]

where μ = (μ_1, …, μ_n)^t.

The roughness penalty29 needed to constrain variation of the functions α(·) and β(·) is described next. Recall that the intercept α(·) has been written as a linear combination of B-splines defined on the interval [0, P] with coefficients denoted by a ∈ R^d. The intercept function is periodically extended to the interval [0, T]. Similarly, the slope function β(·) has been written as a linear combination of B-splines defined on the interval [0, S(τ)] with coefficients denoted by b ∈ R^K. The penalty needed in eq 14 is the sum P(λ_a, a) + P(λ_b, b) of the penalties for each function alone:

\[ P(\lambda_a, a) = \tfrac{1}{2} \lambda_a \|D_a a\|_2^2 \tag{38} \]

\[ P(\lambda_b, b) = \tfrac{1}{2} \lambda_b \|D_b b\|_2^2, \tag{39} \]

where ‖·‖_2 denotes the usual Euclidean norm of a vector. Here D_a and D_b are matrix representations of the second-order difference operator acting on the coefficients a and b in the B-spline representation of α(·) and β(·), respectively. The general form of D_a and D_b is given as

\[ \begin{bmatrix} 1 & -2 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & -2 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & & \vdots \\ 0 & \cdots & \cdots & 0 & 1 & -2 & 1 \end{bmatrix}, \tag{40} \]

where the dimension of the matrix is (d − 2) × d for D_a and (K − 2) × K for D_b.

Define D to be a block diagonal matrix of dimension (d + K) × (d + K):

\[ D = \begin{bmatrix} \lambda_a D_a^t D_a & 0_{d \times K} \\ 0_{K \times d} & \lambda_b D_b^t D_b \end{bmatrix}. \tag{41} \]

The penalty on the Poisson log-likelihood is

\[ P(\lambda_a, a) + P(\lambda_b, b) = \tfrac{1}{2} \left( \lambda_a a^t D_a^t D_a a + \lambda_b b^t D_b^t D_b b \right) \tag{42} \]

\[ = \tfrac{1}{2} \beta^t D \beta. \tag{43} \]

The partial derivative of this penalty with respect to β is given by

\[ \frac{\partial}{\partial \beta} \left( P(\lambda_a, a) + P(\lambda_b, b) \right) = D\beta. \tag{44} \]

Finally, putting together eqs 37 and 44, the maximizer β of eq 14 solves the nonlinear equation

\[ \chi^t (Y - \mu) - D\beta = 0. \tag{45} \]

The estimate of β is computed using iteratively reweighted least squares as follows. Set

\[ V = \mathrm{diagonal}(\mu_1, \ldots, \mu_n), \tag{46} \]

\[ Z = V^{-1}(Y - \mu) + \chi\beta. \tag{47} \]

Here V is the covariance of the vector of independent Poisson observations Y, and Z are the pseudo-observations. V and Z depend on β. The covariance of Z conditional on χ and β is V^{−1}.

Initialize β = β_0, where the components of a are ln(Y) and b = 0.
(1) Let V̂_0 and Ẑ_0 denote the covariance matrix and pseudo-observations evaluated at the current value of β.
(2) Update the current fitted value according to

\[ \hat{\beta} = [\chi^t \hat{V}_0 \chi + D]^{-1} \chi^t \hat{V}_0 \hat{Z}_0. \tag{48} \]

When the inverse does not exist, use the generalized inverse.


(3) Iterate Steps 1 and 2 to convergence.

Upon convergence, the estimates of the component curves of ln-relative risk α̂(t) and β̂(t) are given by

\[ \left[\, J N_a \quad 0 * \begin{bmatrix} 0_{(T-Q) \times Q} \\ N_b \end{bmatrix} \right] \hat{\beta} \tag{49} \]

and

\[ \left[\, 0 * J N_a \quad \begin{bmatrix} 0_{(T-Q) \times Q} \\ N_b \end{bmatrix} \right] \hat{\beta}. \tag{50} \]

The estimates are of the form H_a Ẑ_0 and H_b Ẑ_0, respectively. The pointwise standard errors of the estimated component curves are the square root of the diagonal entries of H_a V̂_0^{−1} H_a^t (intercept) and H_b V̂_0^{−1} H_b^t (slope) at the current value of β.

For testing hypotheses about the significance of β(·) using the scaled deviance, the degrees of freedom associated with a particular fit are taken to be trace(H),

\[ H = \chi (\chi^t V_0 \chi + D)^{-1} \chi^t V_0 \tag{51} \]

from the first step of the iteratively reweighted least squares, when V_0 is a multiple of the identity matrix. Standard errors are computed for the ln-relative risk estimates by exploiting the linearity in Ẑ_0.

Obtaining Regularized Estimates of α(t), β(t), s1(t), s2(temperature), and Day of the Week Simultaneously
Augmenting the model for the mean of the daily mortality just augments the design matrix χ and the block diagonal matrix D as indicated below. For ease of presentation, a common smoothing parameter λ_γ is assumed for estimation of s1(t) and s2(temperature). Allowing for different smoothing parameters is a straightforward extension. Represent s1(t) and s2(temperature) in terms of a B-spline basis with coefficients γ so that now

\[ \eta = [\,\chi \;\; Z\,] \begin{bmatrix} \beta \\ \gamma \end{bmatrix}. \tag{52} \]

Define D to be the block diagonal matrix:

\[ D = \mathrm{diagonal}(\lambda_a D_a^t D_a,\; \lambda_b D_b^t D_b,\; \lambda_\gamma D_\gamma^t D_\gamma), \tag{53} \]

where D_γ is the matrix representation of the second-order difference operator acting on the coefficients γ. The estimates of β and γ are the solutions to

\[ (\chi \;\; Z)^t (Y - \mu) - D \begin{pmatrix} \beta \\ \gamma \end{pmatrix} = 0 \tag{54} \]

and are obtained as before by iteratively reweighted least squares with the starting value γ = 0 for the additional parameters in the augmented model.

Indicator variables for day of the week with the corresponding coefficients are included in the model for the linear predictor η_t by augmenting the design matrix (χ Z), the parameter vector, and the penalty matrix D as described before for including the temperature variable. However, because day of the week is not a smooth function, no smoothing is used in the estimation of the day-of-week effect; that is, the smoothing parameter for day of the week is set to zero.

ACKNOWLEDGMENTS
Joan Staniswalis was partially supported by National Institutes of Health (NIH)-SCORE 2S06 GM008012. Wen-Whai Li and Kerry E. Kelly were partially supported by a grant from the Health Effects Institute (HEI, RFA05-1B). The authors thank JoAnn Lighty, University of Utah, for instrumentational support, and Dr. J.O. Ramsay, McGill University, for his functional analysis software written in S-PLUS that was used to impute missing hourly data. This publication was made possible by grant number S11 ES013339 from the National Institute of Environmental Health Sciences (NIEHS), NIH. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIEHS, NIH, or HEI.

REFERENCES
1. Pope, C.A., III; Dockery, D.W. Health Effects of Fine Particulate Air Pollution: Lines That Connect; J. Air & Waste Manage. Assoc. 2006, 56, 709-742.
2. Miller, K.A.; Siscovick, D.S.; Sheppard, L.; Shepherd, K.; Sullivan, J.H.; Anderson, G.L.; Kaufman, J.D. Long-Term Exposure to Air Pollution and Incidence of Cardiovascular Events in Women; N. Engl. J. Med. 2007, 356, 447-458.
3. Dominici, F.; Peng, R.D.; Bell, M.L.; Pham, L.; McDermott, A.; Zeger, S.L.; Samet, J.M. Fine Particulate Air Pollution and Hospital Admission for Cardiovascular and Respiratory Diseases; JAMA 2006, 295, 1127-1134.
4. Provisional Assessment of Recent Studies on Health Effects of Particulate Matter Exposure; EPA/600/R-06/063; National Center for Environmental Assessment; Office of Research and Development; U.S. Environmental Protection Agency: Research Triangle Park, NC, 2006.
5. Gold, D.R.; Litonjua, A.; Schwartz, J.; Lovett, E.; Larson, A.; Nearing, B.; Allen, G.; Verrier, M.; Cherry, R.; Verrier, R. Ambient Pollution and Heart Rate Variability; Circulation 2000, 101, 1267-1273.
6. Magari, S.R.; Schwartz, J.; Williams, P.L.; Hauser, R.; Smith, T.J.; Christiani, D.C. The Association between Personal Measurements of Environmental Exposure to Particulates and Heart Rate Variability; Epidemiol. 2002, 13, 305-310.
7. Peters, A.; Dockery, D.W.; Muller, J.E.; Mittleman, M.A. Increased Particulate Air Pollution and the Triggering of Myocardial Infarction; Circulation 2001, 103, 2810-2815.
8. Peters, A.; von Klot, S.; Heier, M.; Trentinaglia, I.; Hormann, A.; Wichmann, H.E.; Lowel, H. Exposure to Traffic and the Onset of Myocardial Infarction; N. Engl. J. Med. 2004, 351, 1721-1730.
9. Ramsay, J.O.; Silverman, B.W. Functional Data Analysis; Springer: New York, 1997.
10. Staniswalis, J.G.; Parks, N.J.; Bader, J.O.; Munoz Maldonado, Y. Temporal Analysis of Airborne Particulate Matter Reveals a Dose-Rate Effect on Mortality in El Paso: Indications of a Differential Toxicity for Different Particle Mixtures; J. Air & Waste Manage. Assoc. 2005, 55, 893-902.
11. Schwartz, J. The Distributed Lag between Air Pollution and Daily Deaths; Epidemiol. 2000, 11, 320-326.
12. Zanobetti, A.; Wand, M.P.; Schwartz, J.; Ryan, L.M. Generalized Additive Distributed Lag Models: Quantifying Mortality Displacement; Biostatistics 2000, 1, 279-292.
13. Malfait, N.; Ramsay, J.O. The Historical Functional Linear Model; Can. J. Stats. 2003, 31, 115-128.
14. Yang, H. M.Sci. Thesis, University of Texas at El Paso, El Paso, TX, 2005.
15. Franklin, M.; Zeka, A.; Schwartz, J. Association between PM2.5 and All-Cause and Specific-Cause Mortality in 27 Communities; J. Expo. Sci. Environ. Epidemiol. 2007, 17, 279-287.
16. Gamez, J. M.Sci. Thesis. Department of Civil Engineering, University of Texas at El Paso, El Paso, TX, 2007.


17. Li, W.W.; Gamez, J.; Baca, D.J.; Olvera, H.A.; Garcia, J.H.; Staniswalis, J.G.; Garcia, N.; Garcia, M.; Pingitore, N.E., Jr.; Amaya, M.; Kelly, K.; Lighty, J. Investigation of the Number Concentrations of Ultrafine Particles in the Nocturnal PM Peaks. Presented at the 3rd International Symposium on Nanotechnology, Occupational and Environmental Health, Academia Sinica, Taipei, Taiwan, 2007.
18. Nemmar, A.; Hoet, P.H.M.; Vanquickenborne, B.; Dinsdale, D.; Thomeer, M.; Hoylaerts, M.F.; Vanbilloen, H.; Mortelmans, L.; Nemery, B. Passage of Inhaled Particles into the Blood Circulation in Humans; Circulation 2002, 105, 411-414.
19. Salvi, S.; Blomberg, A.; Rudell, B.; Kelly, F.; Sandstrom, T.; Holgate, S.T.; Frew, A. Acute Inflammatory Responses in the Airways and Peripheral Blood after Short-Term Exposure to Diesel Exhaust in Healthy Human Volunteers; Am. J. Respir. Crit. Care Med. 1999, 159, 702-709.
20. Rosenthal, F.S.; Carney, F.S.; Olinger, M.L. Out-of-Hospital Cardiac Arrest and Airborne Fine Particulate Matter: a Case-Crossover Analysis of Emergency Medical Services Data in Indianapolis, Indiana; Environ. Health Perspect. 2008, 116, 631-636.
21. Massy, W.F. Principal Components Regression in Exploratory Statistical Research; J. Am. Stat. Assoc. 1965, 60, 234-254.
22. Li, W.W.; Orquiz, R.; Pingitore, N.E., Jr.; Garcia, J.E.; Espino, T.T.; Gardea-Torresdey, J.; Chow, J.; Watson, J.G. Analysis of Temporal and Spatial Dichotomous PM Air Samples in the El Paso-Cd. Juarez Air Quality Basin; J. Air & Waste Manage. Assoc. 2001, 51, 1511-1560.
23. Li, W.W.; Cardenas, N.; Walton, J.; Trujillo, D.; Morales, H.; Arimoto, R. PM Source Identification at Sunland Park, New Mexico Using a Simple Heuristic Meteorological and Chemical Analysis; J. Air & Waste Manage. Assoc. 2005, 55, 352-364.
24. Review of the National Ambient Air Quality Standards for Particulate Matter: Policy Assessment of Scientific and Technical Information; EPA-452/R-05-005a; Office of Air Quality Planning and Standards; U.S. Environmental Protection Agency: Research Triangle Park, NC, 2005.
25. De Boor, C. A Practical Guide to Splines; Springer: New York, 1978.
26. Eubank, R.L. Spline Smoothing and Nonparametric Regression; Marcel Dekker: New York, 1988.
27. Ostro, B.; Broadwin, R.; Green, S.; Feng, W.Y.; Lipsett, M. Fine Particulate Air Pollution and Mortality in Nine California Counties: Results from CALFINE; Environ. Health Perspect. 2006, 114, 29-33.
28. Eilers, P.H.; Marx, B.D. Flexible Smoothing with B-Splines and Penalties; Stat. Sci. 1996, 11, 89-121.
29. Green, P.J.; Silverman, B.W. Nonparametric Regression and Generalized Linear Models: a Roughness Penalty Approach; Chapman and Hall/CRC: London, U.K., 1994.
30. McCullagh, P.; Nelder, J.A. Generalized Linear Models; Chapman and Hall/CRC: London, U.K., 1989.

About the Authors
Joan G. Staniswalis is a professor at the University of Texas at El Paso Department of Mathematical Sciences. Hongling Yang is currently a lecturer at the University of Texas at El Paso, having recently completed her doctorate in statistics at the Arizona State University Department of Mathematics in Phoenix, AZ. Wen-Whai Li is a professor at the University of Texas at El Paso Department of Civil Engineering. Kerry Kelly is Associate Director of the Institute for Combustion and Energy Studies at the University of Utah. Please address correspondence to: Joan Staniswalis, University of Texas at El Paso, Department of Mathematical Sciences, Bell Hall, Room 124, El Paso, TX 79968-0514; phone: +1-915-747-6761; e-mail: joan@math.utep.edu.

