
PharmaSUG2010 - Paper CC23

AN ERROR BAR MACRO WITH PROC GPLOT

Shuping Zhang and Xingshu Zhu, Merck and Co. Inc, Upper Gwynedd, PA

Abstract
Error bar plots (e.g., SD, SE, and confidence intervals) are often used to visually represent and explore the relationship among data values, as well as to compare the outcomes of different studies or treatment groups over a period of time. While enhanced features in SAS/GRAPH procedures, including statements and annotations, can be used to make quick and informative graphics, it remains a challenge for SAS programmers to create graphics for more sophisticated analysis purposes, such as error bar plots. To address this challenge, we developed a customized mean measurement macro based on PROC GPLOT that produces more comprehensive graphic output, specifically targeting error bar plots.

Introduction
How data are handled and presented is very important in clinical trial analysis. Data can be presented in different ways, including tables, listings, and graphs. Often the graphic format is preferred because of its direct and immediate visual impact. "A picture is worth a thousand words." This overused phrase applies especially to data collected in early experimental medicine and epidemiology research, where data are sometimes analyzed and presented with more weight on descriptive than on inferential statistics. The important features of these data are presented in either tables or graphs.

Graphics can help to identify patterns and trends during preliminary investigations of the data, including relationships among treatments or over time. Then, using inferential statistics supported by the descriptive statistics, inferences can be made about the target populations, especially regarding treatment groups. Three popular descriptive or inferential statistics, the SD, the SE, and the confidence interval (CI), are widely used in graphs for this purpose. The SD describes the variation in the data, the SE describes the variation in the sampling distribution of the sample mean about the population mean, and the CI attaches precision to the sample mean as an estimate of the population mean.

Although SAS/GRAPH procedures offer certain enhanced features for making quick and informative graphs, it remains a challenge for SAS programmers to create more comprehensive and customized plots, such as error bar plots. In this article, we start with simple SAS code for creating basic mean error bar plots. This code is then extended to an enhanced variety of SAS plots, including multiple treatment groups and visits over a period of time, with one of the SD, the SE, or the CI as the error bar.

Datasets
Two datasets are used in the examples to illustrate the different options for the macro parameters. The data set "Sample1.sas7bdat" is used in examples 1, 2, 4, and 5, and the data set "Sample2.sas7bdat" is used in example 3.

[Data listings: Sample1.sas7bdat and Sample2.sas7bdat]

The dataset Sample1 contains 5 variables. The labels for each variable are as follows:
AN - subject identification number
TREAT_CD - treatment code
TREATMENT - treatment text
PHASE - study phase
WEIGHT - weight (in kg)

The dataset Sample2 contains 6 variables. The labels for each variable are as follows:
TREAT_CD - treatment code
TREATMENT - treatment text
TIME - time in minutes
LSMEAN - least square mean of study score
LCI - lower limit of 95% CI of study score
UCI - upper limit of 95% CI of study score
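Before any plotting, it can help to look at the statistics behind the error bars. The following is a minimal sketch, not part of the original paper, that tabulates the mean, SD, SE, and 95% confidence limits of WEIGHT by phase and treatment with PROC MEANS. It assumes the libref datadir is already assigned to the folder holding Sample1.

```sas
/* Sketch only: summary statistics that underlie the error bars.     */
/* Assumes the libref DATADIR points to the folder with Sample1.     */
proc means data=datadir.sample1 n mean std stderr lclm uclm alpha=0.05 maxdec=2;
    class phase treat_cd;   /* one row per phase and treatment group */
    var weight;             /* analysis variable                     */
run;
```

The STD, STDERR, and LCLM/UCLM columns correspond to the SD, SE, and 95% CI bars discussed in this paper.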

Output graphs and SAS codes

1) Plot of Mean with one standard deviation bar

To create the error bar plot shown above, a dataset needs to be constructed with two variables: one is used to show the mean value, and the other is used for drawing the bars. We used the dataset Sample1 with PROC SQL to accomplish this step before passing the dataset to PROC GPLOT. The INTERPOL option with the value STD1T in the SYMBOL statement is used to create the bars; the INTERPOL option with the value JOIN in the SYMBOL statement is used to connect the means.
proc sql noprint;
    /* Keep the raw weights and attach the mean per phase and treatment */
    create table example1 as
    select phase, treat_cd, weight, mean(weight) as w_mean
    from datadir.sample1
    group by phase, treat_cd;
quit;

goptions reset=global gunit=pct vsize=6in hsize=6in ftext=swiss htext=3;

proc gplot data=example1;
    axis1 label=(h=3 'Phase') w=3 offset=(5,5);
    axis2 label=(a=90 h=3 'Weight') w=3 order=(40 to 110 by 10);
    /* Bars: mean +/- 1 SD with tops, computed from the raw weights */
    symbol1 c=blue i=std1t w=1;
    symbol2 c=red i=std1t w=1;
    plot weight*phase=treat_cd / haxis=axis1 hminor=0 vaxis=axis2 vminor=0 nolegend;
    /* Means: joined dots on a hidden right axis with the same scale */
    axis3 label=none value=none w=3 order=(40 to 110 by 10) major=none;
    symbol3 c=blue i=join v=dot h=2;
    symbol4 c=red i=join v=dot h=2;
    legend1 label=none value=('Study Drug' 'Placebo');
    plot2 w_mean*phase=treat_cd / vaxis=axis3 vminor=0 legend=legend1;
run;
quit;
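As a variation on the code above, the STD interpolation can also draw standard error bars and join the means in a single PLOT statement. The sketch below is an illustration rather than the authors' code: it assumes the M (standard error of the mean) and J (join the means) suffixes of INTERPOL=STD, here written as STD1MJT, and it should be checked against the SYMBOL statement documentation for the SAS release in use.

```sas
/* Sketch only: mean +/- 1 SE bars with joined means, straight from raw data. */
/* Assumes the libref DATADIR is assigned and Sample1 has two treatments.     */
goptions reset=global gunit=pct ftext=swiss htext=3;

symbol1 c=blue i=std1mjt v=none w=1;   /* 1 = one SE, M = SE of mean, J = join, T = tops */
symbol2 c=red  i=std1mjt v=none w=1;
axis1 label=(h=3 'Phase') offset=(5,5);
axis2 label=(a=90 h=3 'Weight');

proc gplot data=datadir.sample1;
    plot weight*phase=treat_cd / haxis=axis1 vaxis=axis2;
run;
quit;
```

This form needs no pre-computed mean variable, but the bars of different groups are drawn at the same X position and can overlap.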

The code above works well when there are only two treatment groups. As the number of treatment groups increases, the outcomes for different treatment groups, phases, or visits need to be sufficiently separated in the graph to avoid confusion caused by overlapping data points. The macro %PlotMB, presented below, accomplishes this task.

2) Macro %PlotMB Graphics Features

a) Macro parameters

The macro %PlotMB has 21 parameters: 3 are required and the remaining 18 are optional. The following table describes each parameter; red indicates a required parameter.
Each entry gives the parameter, its description, and its default, options, or notes.

Ginds
    Description: Input SAS data set for the graph.
    Default / Options / Note: The data set must contain the variables named in the parameters X_var, Y_var, and GroupBy (if GroupBy is not empty).

Y_var
    Description: Variable that defines the Y-axis in the graph.
    Default / Options / Note: For example, weight.

X_var
    Description: Variable that defines the X-axis in the graph.
    Default / Options / Note: For example, phase. X_var must be a categorical variable.

GroupBy
    Description: Variable that defines the groups in the graph.
    Default / Options / Note: For example, treatment groups. Limited to 7 groups.

X_fmt
    Description: Format for decoding X_var.
    Default / Options / Note: If blank, the values of X_var are displayed on the X axis.

Grp_fmt
    Description: Format for decoding GroupBy.
    Default / Options / Note: If blank, the values of GroupBy are used for the legend.

X_label
    Description: Label for the X-axis.
    Default / Options / Note: If blank, the variable name "&X_var" is used.

Y_label
    Description: Label for the Y-axis.
    Default / Options / Note: If blank, the variable name "&Y_var" is used.

Y_order
    Description: Order of values on the Y-axis.
    Default / Options / Note: Only the form N1 to N2 by N is accepted. If blank, the macro automatically chooses an increment of 1, 2, 5, or 10 times a power of 10.

BarType
    Description: Type of bar to draw.
    Default / Options / Note: Valid values are STDERR (standard error bar), STD (one standard deviation bar), CI (95% C.I. bar), and OTHER. If OTHER is specified, the input dataset must also contain the corresponding variables &Y_var._lower and &Y_var._upper; otherwise the macro will not run.

Gtitle
    Description: Graph title.
    Default / Options / Note: Only one title can be provided for the graph.

Goutfile
    Description: Name of the output file.
    Default / Options / Note: If blank, the macro does not write a permanent file to save the graph. No file extension is needed; the macro generates RTF files only.

Goutdir
    Description: The folder in which to save the output file &Goutfile.
    Default / Options / Note: For example, C:\prot001\out_graphs. If Goutfile is not blank, a folder path must be provided in this parameter.

Glegend
    Description: The position of the graph legend, e.g., bottom center outside.
    Default / Options / Note: If blank, the macro automatically picks a position, trying the options in the order "inside to outside", "bottom, top, middle", "right, left, center".

Interpol
    Description: Interpolation method used to connect lines.
    Default / Options / Note: Join, none, spline. Default is JOIN.

symlines
    Description: Special line types for groups.
    Default / Options / Note: For example, symlines = 1 2 3 20 21 for five groups. If blank, the default order is 1, 2, 3, 4, 20, 21, 24.

symshape
    Description: Special symbols for groups.
    Default / Options / Note: For example, symshape = dot circle square triangle star for five groups. If blank, the default order is dot, triangle, square, circle, star, hash, diamond.

symcolor
    Description: Special colors for groups.
    Default / Options / Note: For example, symcolor = red green blue black brown for five groups. If blank, the default order is black, red, blue, green, violet, brown, olive.

HrefLine
    Description: Horizontal reference line on the Y-axis.
    Default / Options / Note: Three fields separated by |, for example, 250 | blue | 4. The 1st field is the position on the Y-axis, the 2nd field is the color, and the 3rd field is the line type.

VrefLine
    Description: Vertical reference line on the X-axis.
    Default / Options / Note: Three fields separated by |, for example, 400 | red | 3. The 1st field is the position on the X-axis, the 2nd field is the color, and the 3rd field is the line type.

annotate
    Description: Annotate dataset for the graph.
    Default / Options / Note: The dataset must follow the annotate dataset structure.
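To illustrate the three-field reference line format described for HrefLine, the sketch below splits a value such as 67.5|blue|3 with %SCAN and passes the pieces to the VREF=, CVREF=, and LVREF= options of the PLOT statement. This is not the source of %PlotMB, which the paper does not list; the macro variable names href_pos, href_color, and href_line are hypothetical, and the dataset example1 is assumed to exist as created in section 1.

```sas
/* Sketch only: split "position|color|line type" and draw a horizontal line. */
/* href_pos, href_color, href_line are hypothetical names.                   */
%let HrefLine   = 67.5|blue|3;
%let href_pos   = %scan(&HrefLine, 1, |);   /* position on the Y-axis */
%let href_color = %scan(&HrefLine, 2, |);   /* line color             */
%let href_line  = %scan(&HrefLine, 3, |);   /* line type              */
%put NOTE: position=&href_pos color=&href_color line=&href_line;

proc gplot data=example1;
    /* VREF= draws a line at a vertical-axis value, that is, a horizontal line */
    plot w_mean*phase=treat_cd / vref=&href_pos cvref=&href_color lvref=&href_line;
run;
quit;
```

A vertical reference line would follow the same pattern with HREF=, CHREF=, and LHREF=.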

b) Examples of macro calls

(1) Simple error bar plot: weight error bar by phase in two treatment groups

The following macro call uses "Sample1.sas7bdat" as the input data set and plots the mean of weight (the Y variable) with a standard error bar for each phase (the X variable) by treatment group (variable "treat_cd"). The standard error bar is the default when BarType is not specified. The legend position is also adjusted automatically and placed at the lower left corner.
%PlotMB( Ginds   = datadir.sample1
        ,Y_var   = weight
        ,X_var   = phase
        ,GroupBy = treat_cd
        ,Grp_fmt = tr1t.
        ,BarType = stderr
        ,Gtitle  = Example 1: Plot Mean of Weight with Standard Error Bar
        );

Output Graph:

Notice that the macro automatically adjusts the weight error bars which appear to overlap at phase=1 and phase=2.

(2) STD plot: weight STD bar by phase in two treatment groups

Instead of the default standard error bar, the following call selects the standard deviation bar through the parameter BarType. Other options, such as the 95% CI, can also be used here.
%PlotMB( Ginds   = datadir.sample1
        ,Y_var   = weight
        ,X_var   = phase
        ,GroupBy = trt
        ,BarType = std
        ,Gtitle  = Example 2: Select STD Bar Type
        );

Output Graph:

(3) OTHER bar plot: weight OTHER bar by phase in five treatment groups

If a user wants to plot a specific bar defined by a lower and an upper bound, for example, LSMEAN with the estimated 95% C.I. bar (see data set "Sample2.sas7bdat"), it can be generated with macro %PlotMB using the "other" bar type. The input dataset must contain the variables LSMEAN, LSMEAN_lower, and LSMEAN_upper, as well as the time point TIME and the treatment group TREAT_CD. The variable TIME indicates the period subjects were on the specific drug treatment. Here TREAT_CD contains five different groups. PROC FORMAT is used to decode TREAT_CD.
data sample2;
    set datadir.sample2;
    lsmean_lower=lcl;
    lsmean_upper=ucl;
run;

%PlotMB( Ginds   = sample2
        ,Y_var   = LSMEAN
        ,X_var   = time
        ,GroupBy = treat_cd
        ,Grp_fmt = tr2t.
        ,BarType = other
        ,Gtitle  = LSMEAN with Estimated 95% C.I.
        );

In addition to using the parameter "BarType" with "OTHER", we also used the following parameters:
Grp_fmt---to decode the treatment group.
GoutFile---name of the output file, called example3.rtf.
GoutDir---the directory where the output file will be saved. Here we save it on the C:\ drive.
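The format passed in Grp_fmt has to exist before the macro is called. The paper does not show its definition, so the following is only a sketch of what a format such as tr2t. could look like. It assumes TREAT_CD is numeric with codes 1 to 5, and the labels are hypothetical placeholders.

```sas
/* Sketch only: a format for decoding five treatment codes.          */
/* Assumes TREAT_CD is numeric (1-5); all labels are hypothetical.   */
proc format;
    value tr2t
        1 = 'Treatment A'
        2 = 'Treatment B'
        3 = 'Treatment C'
        4 = 'Treatment D'
        5 = 'Treatment E';
run;

/* Quick check that the codes decode as intended */
proc freq data=datadir.sample2;
    tables treat_cd;
    format treat_cd tr2t.;
run;
```

If TREAT_CD were a character variable, the format would instead be defined as a character format (value $tr2t ...) and referenced as $tr2t.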

Output Graph:

(4) Example 4: More parameter options can be used to refine the output graph
%PlotMB( Ginds   = datadir.sample
        ,Y_var   = weight
        ,Y_order = 50 to 85 by 5
        ,X_var   = phase
        ,GroupBy = trt
        ,BarType = stderr
        ,Gtitle  = Example 4: Plot Refinement
        ,symlines= '3' '4'
        ,symshape= 'square' 'circle'
        ,symcolor= 'red' 'black'
        ,HrefLine= 67.5|blue |3
        ,VrefLine= 2.5|green|4
        );

Additional parameters can be used as needed. Compared with the output graph in example 1, the following changes were made:
1) the value order on the Y axis is changed through the parameter Y_order.
2) the line types, symbols, and colors are changed through the parameters symlines, symshape, and symcolor.
3) the call adds a horizontal reference line at position 67.5 on the Y axis in blue.
4) the call adds a vertical reference line at position 2.5 on the X axis in green.

Output Graph:


Output multiple BAR-PLOTS to a single file


Combined with ODS output, the macro %PlotMB allows users to output multiple bar plots to a single file. Sample code is shown below:
ods results off;
ods rtf file="&drive\PlotMB\out_graphs\example6.rtf";
/* The %do loop must run inside a macro definition */
%do i = 1 %to &n_Group;
    %PlotMB( Ginds   = &&InDs&i
            ,Y_var   = &&Yvar&i
            ,X_var   = &&Xvar&i
            ,GroupBy = &&GrpVar&i
            ,BarType = &&BarType&i
            ,Gtitle  = &&Title&i
            );
%end;
ods rtf close;
ods results on;
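The loop above relies on indexed macro variables (InDs1, InDs2, and so on) that are resolved indirectly with the && prefix. The sketch below shows how such variables could be set up and how the references resolve, using %PUT instead of %PlotMB so that it runs on its own. The wrapper name ShowCalls and the values assigned are illustrative assumptions, not taken from the paper.

```sas
/* Sketch only: indexed macro variables and their indirect resolution.  */
/* ShowCalls and the values below are hypothetical illustrations.       */
%let n_Group  = 2;
%let InDs1    = datadir.sample1;  %let Yvar1 = weight;  %let BarType1 = stderr;
%let InDs2    = datadir.sample1;  %let Yvar2 = weight;  %let BarType2 = std;

%macro ShowCalls;
    %do i = 1 %to &n_Group;
        /* &&InDs&i resolves first to &InDs1 (or &InDs2), then to its value */
        %put NOTE: call &i uses Ginds=&&InDs&i Y_var=&&Yvar&i BarType=&&BarType&i;
    %end;
%mend ShowCalls;

%ShowCalls
```

Replacing the %PUT statement with the %PlotMB call, and wrapping the loop in the ODS RTF statements, gives the multi-plot pattern shown above.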

Conclusion
We illustrated the use, capabilities, and features of the error bar macro %PlotMB with four different macro calls. Of the 21 parameters built into %PlotMB, only 3 are required, which makes the macro simple to use. In summary, the macro generates the following types of bar plots:

Parameter "BarType"    Type of plot
STDERR                 Mean with standard error bar
STD                    Mean with one standard deviation bar
CI                     Mean with 95% CI bar
OTHER                  Any other bar, defined by user-provided lower and upper bounds

In addition, we demonstrated how to save permanent graphics in a directory, and how to output multiple pages into one file with ODS output option.

References
SAS Institute Inc., SAS 9.1.3 Help and Documentation, Cary, NC: SAS Institute Inc., 2000-2004.

Acknowledgement
Authors would like to thank Amy Gillespie and Thomas Bradstreet for taking the time to review and comment on the manuscript.


Trademark Information
SAS and SAS/GRAPH software are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

Contact
Author Name: Xingshu Zhu
Company: Merck & Co., Inc.
Address: 351 Sumneytown Pike
Email: xingshu_zhu@merck.com

Author Name: Shuping Zhang
Company: Merck & Co., Inc.
Address: 351 Sumneytown Pike
Email: shuping_zhang@merck.com
