Fundamentals
SAS v9.2 SYNTAX
LWPRG I

Browse the descriptor portion of a SDS

PROC CONTENTS DATA=SAS-Data-Set;
RUN;

Browse the data portion of a SDS

PROC PRINT DATA=SAS-Data-Set <NOOBS>;
<VAR variable(s);>
RUN;

Assign a libref :

LIBNAME libref <engine> SAS-data-library;

Browse a SAS-data-library:

PROC CONTENTS DATA=libref._ALL_ <NODS>;
RUN;

Creating SAS Data Sets from SDS, EXCEL or RDBMS

LIBNAME libref SAS-data-library <options>; (from an existing SDS)
LIBNAME libref Physical-file-name <options>; (from an excel spreadsheet)
LIBNAME libref Engine-name <SAS/ACCESS-options>; (from a Relational Database)

DATA output-SAS-Data-Set;
SET input-SAS-Data-Set;
<WHERE where-expression;>
<KEEP/DROP variable-list;>
<LABEL variable=label
variable=label
variable=label
;>
<FORMAT variable(s) format.;>
RUN;


------------------ALTERNATIVE FOR EXCEL AND RDMS----------------

PROC IMPORT OUT=SAS-Data-Set
DATAFILE=external-file-name
DBMS=file-type ;
<GETNAMES=YES;>
RUN ;


Creating an Excel Spreadsheet

LIBNAME librefxls Physical-EXCELfile-name <options>;
DATA librefxls.excelworksheet;
SET input-SAS-Data-Set;
<other statements>;
RUN;
LIBNAME librefxls clear;

------------------ALTERNATIVE----------------

LIBNAME librefxls Physical-EXCELfile-name <options>;
PROC COPY IN=input-libref OUT=librefxls;
SELECT SAS-Data-Set(s);
RUN;
LIBNAME librefxls clear;

------------------ALTERNATIVE----------------

PROC EXPORT DATA=SAS-Data-Set
OUTFILE=external-file-name
DBMS=file-type ;
RUN ;

Creating SAS Data Sets from Raw Data File with list input

DATA Output-SAS-Data-Set;
<LENGTH variable-1 <$> length variable-n <$> length;>
INFILE raw-data-file <DLM=delimiter(s)>;
INPUT variable-1 <$> variable-2 <$> variable-n <$>; /* list input */
INPUT variable-1 : <$>INFORMAT<w>.<d> /* modified list input */
variable-2 : <$>INFORMAT<w>.<d>
variable-n : <$>INFORMAT<w>.<d>;
<other statements>;
RUN;

Default width of variables created with list input is 8 bytes. Set length explicitly with:
- length statement or
- informat in input statement
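
A minimal sketch of modified list input, assuming hypothetical inline data (DATALINES instead of an external raw data file) and made-up data set and variable names. The LENGTH statement keeps Name from being limited to the default 8 bytes, and the colon modifier applies an informat to HireDate.

```sas
/* Hypothetical example: comma-delimited records read with list input */
DATA work.staff;
   LENGTH Name $ 20;              /* without this, Name would be 8 bytes and truncated */
   INFILE DATALINES DLM=',';
   INPUT ID Name $ HireDate :date9. Salary;
   FORMAT HireDate date9.;
   DATALINES;
101,Alexander Smith,15JAN2008,42000
102,Maria Gonzalez-Lopez,03MAR2010,51500
;
RUN;
```

Because the delimiter is a comma, the blank inside each name is read as part of the value.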


Validating data

Some SAS Procedures can be used to detect invalid data:

PROC PRINT step with VAR and WHERE statements
   Detects invalid character and numeric values by subsetting observations based on conditions
PROC FREQ step with TABLES statement
   Detects invalid character and numeric values by looking at distinct values
PROC MEANS step with VAR statement
   Detects invalid numeric values by using summary statistics
PROC UNIVARIATE step with VAR statement
   Detects invalid numeric values by looking at extreme values

PROC PRINT DATA=SAS-Data-Set;
<VAR variable(s);>
<WHERE where-expression;>

RUN;

PROC FREQ DATA=SAS-Data-Set <NLEVELS>;
<TABLES variable(s);>
RUN;

PROC MEANS DATA=SAS-Data-Set <STATISTICS>;
<VAR variable(s);>
RUN;

PROC UNIVARIATE DATA=SAS-Data-Set;
<VAR variable(s);>
RUN;

Manipulate data

DATA Output-SAS-Data-Set <(DROP=variable(s) or KEEP=variable(s))>;
SET Input-SAS-Data-Set <(DROP=variable(s) or KEEP=variable(s)) >;
<LENGTH variable(s) <$> length;>
<KEEP variable(s) ;>
<DROP variable(s) ;>
<VARIABLE=expression ;>
<VARIABLE=function(argument 1, argument 2, argument n);>
<WHERE expression ;>
<IF expression ;>
<IF expression THEN DELETE ;>
<IF expression THEN statement ;
ELSE statement ;>
<IF expression THEN DO ;
executable statements
END ;
ELSE DO ;
executable statements
END ;>
RUN ;

Useful functions based on numeric SAS date values

Varnew =MDY(month,day,year);
returns a SAS date from 3 given integers (month, day and year)

Varnew =TODAY();
returns the SAS date of the current date

Varnew =MONTH(SAS-date-value);
returns the integer (1-12) that represents the month of a SAS-date-value

Varnew =DAY(SAS-date-value);
returns the integer (1-31) that represents the day of a SAS-date-value

Varnew =YEAR(SAS-date-value);
returns the integer (four digits) that represents the year of a SAS-date-value

Varnew =WEEKDAY(SAS-date-value);
returns the integer that represents the weekday of a SAS-date-value (1=Sunday, 7=Saturday)

Varnew =QTR(SAS-date-value);
returns the integer that represents the quarter of a SAS-date-value

Varnew =YRDIF(SAS-begin-date, SAS-end-date, basis);
returns the number of years between SAS-begin-date and SAS-end-date, based on basis
ACT/ACT = months and years have actual number of days
30/360 = months have 30 days, years have 360 days
ACT/360 = months have actual number of days, years have 360 days
ACT/365 = months have actual number of days, years have 365 days
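
The date functions listed here can be tried out in a DATA _NULL_ step. This is only a sketch: the date and the variable names are made up for illustration, and the results are written to the log with PUTLOG.

```sas
/* Hypothetical example: deriving date parts from a SAS date value */
DATA _NULL_;
   hired   = MDY(3, 15, 2010);                 /* 15 March 2010 */
   m       = MONTH(hired);                     /* 3 */
   d       = DAY(hired);                       /* 15 */
   y       = YEAR(hired);                      /* 2010 */
   q       = QTR(hired);                       /* 1 */
   wd      = WEEKDAY(hired);                   /* 1=Sunday ... 7=Saturday */
   service = YRDIF(hired, TODAY(), 'ACT/ACT'); /* years between hired and today */
   PUTLOG hired= date9. m= d= y= q= wd= service= 5.1;
RUN;
```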




Combining SAS Data Sets

A. Sorting SAS Data Sets

PROC SORT DATA=Input-SAS-Data-Set OUT=Output-SAS-Data-Set;
BY <DESCENDING> variable(s);
RUN;

B. Concatenating SAS Data Sets

PROC APPEND BASE=SAS-Data-Set
DATA= SAS-Data-Set <FORCE>;
RUN;

DATA SAS-Data-Set ;
SET SAS-Data-Set1 SAS-Data-Set2 ;
<BY BY-variable ;>
<other SAS Statements >
RUN;

C. Merging SAS Data Sets

DATA SAS-Data-Set ;
MERGE SAS-Data-Set1 SAS-Data-Set2 <(RENAME=(Old-variable=New-variable)
IN=IN-variable)>;
BY BY-variables ;
<other SAS-Statements>
RUN ;


D. Simple joins using data step

DATA match nonmatch1 nonmatch2;
MERGE SDS1(IN=in1) SDS2(IN=in2);
BY Merge_Var;
IF in1 AND in2 THEN OUTPUT match;
IF in1 AND NOT in2 THEN OUTPUT nonmatch1;
IF NOT in1 AND in2 THEN OUTPUT nonmatch2;
RUN;

Enhancing Reports:
A. Global Statements (are valid until you change them, cancel them or close your SAS session)

TITLEn text;
FOOTNOTEn text;
OPTIONS
<DATE><NODATE>
<DTRESET><NODTRESET>
<NUMBER><NONUMBER><PAGENO=N>
<CENTER><NOCENTER>
<PAGESIZE=N><LINESIZE=N>;

B. Formatting Data Values

- Create a user defined format :

PROC FORMAT <LIB=libref.catalog>;
VALUE <$>format-name range1 =label
range2 =label
;
RUN;

- Use permanent formats in different catalogs :

OPTIONS FMTSEARCH=(libref1 libref2.catalog ...) ;

- Apply formats in PROC STEP (temporary) or DATA STEP (permanent) :

FORMAT variable(s) format.;
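
As an illustration of a user-defined format applied temporarily in a PROC step, here is a short sketch. The format names ($sexfmt, agegrp), ranges and labels are invented for the example; sashelp.class is the sample data set shipped with SAS.

```sas
/* Hypothetical formats: names, ranges and labels are illustrative only */
PROC FORMAT;
   VALUE $sexfmt 'F'   = 'Female'
                 'M'   = 'Male'
                 OTHER = 'Unknown';
   VALUE agegrp  LOW -< 13 = 'Under 13'
                 13 - HIGH = '13 and over';
RUN;

/* The FORMAT statement in a PROC step affects this report only */
PROC PRINT DATA=sashelp.class NOOBS;
   VAR Name Sex Age;
   FORMAT Sex $sexfmt. Age agegrp.;
RUN;
```

The same FORMAT statement placed in a DATA step would store the format with the variable in the output data set.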

C. Using ODS

ODS HTML FILE=HTML-file-specification.html <STYLE=style specification><OPTIONS>;
SAS Code that produces OUTPUT
ODS HTML CLOSE;

Other ODS destinations: PDF, RTF, CSVALL, EXCELXP, MSOFFICE2K, LISTING
Sample of styles: beige, sasweb, analysis, statistical, default, journal



2. Producing Summary Reports

A. Frequency Reports : Frequency, Percent, Cumulative Frequency and Cumulative Percent

PROC FREQ DATA=SAS-Data-Set <OPTIONS> ;
TABLES SAS-variable1 SAS-variable2 <OPTIONS> ; (one-way frequency report)
or
TABLES SAS-variable1 * SAS-variable2 <OPTIONS>; (cross-tabular frequency report)
RUN ;

Sample of options of the tables statement: NOCUM, NOPERCENT, NOFREQ, NOROW, NOCOL,
LIST, CROSSLIST, FORMAT=,OUT=, OUTCUM, OUTPCT, CHISQ

Sample of options of the proc Freq statement: NLEVELS, PAGE, COMPRESS



B. Calculating Summary Statistics : Default N, MEAN, STD, MIN, and MAX

PROC MEANS DATA=SAS-Data-Set <STATISTICS><OPTIONS>;
VAR SAS-analysis-variable(s) ;
CLASS SAS-Class-variable(s) ;
<OUTPUT OUT= SAS-Data-Set <OUT OPTIONS> ;>
RUN;

Sample of options: MAXDEC=, FW=, NONOBS, NOPRINT
Sample of statistics:
Descriptive Statistic Keywords
CLM CSS CV LCLM MAX
MEAN MIN MODE N NMISS
KURTOSIS RANGE SKEWNESS STDDEV STDERR
SUM SUMWGT UCLM USS VAR
Quantile Statistic Keywords
MEDIAN|P50 P1 P5 P10 Q1|P25
Q3|P75 P90 P95 P99 QRANGE
Hypothesis Testing Keywords
PROBT T
Sample of out options: NWAY, DESCENDTYPES, CHARTYPE

PROC SUMMARY DATA=SAS-Data-Set <STATISTICS><OPTIONS>;
VAR SAS-analysis-variable(s) ;
CLASS SAS-Class-variable(s) ;
RUN;


C. The TABULATE Procedure

PROC TABULATE DATA=SAS-Data-Set <OUT= SAS-Data-Set> <options>;
CLASS class-variables ;
VAR analysis-variables ;
TABLE page-expression <all>,
row-expression <all>,
column-expression <all></options>;
RUN ;

Comma => go to new table dimension
Blank => concatenate table information
Asterisk => cross, nest, add statistic, subgroup information
The statistics and analysis variables are possible in every dimension, but all have to be in the same dimension.

Sample of statistics: the same statistics as are available for PROC MEANS, plus:
PCTN REPPCTN PAGEPCTN ROWPCTN COLPCTN
PCTSUM REPPCTSUM PAGEPCTSUM ROWPCTSUM COLPCTSUM


Introduction to Graphics

Producing Bar and Pie Charts

PROC GCHART DATA=SAS-Data-Set ;
HBAR|HBAR3D|VBAR|VBAR3D|PIE|PIE3D chart-variable </DISCRETE
SUMVAR=analysis-variable
TYPE=MEAN|SUM
FILL=
EXPLODE=value>;
RUN;
QUIT;

Fundamentals
SAS v9.1.3 SYNTAX from courses LWPRG I & II

LWPRG II

Controlling Input and Output

a. Outputting Multiple Observations with explicit output

DATA Output-SAS-Data-Set;
SET Input-SAS-Data-Set;
<additional SAS statements>
OUTPUT;

OUTPUT;
RUN ;

b. Writing to multiple SAS Data Sets

DATA Output-SAS-Data-Set-1 Output-SAS-Data-Set-n;
SET Input-SAS-Data-Set;
<additional SAS statements>
OUTPUT Output-SAS-Data-Set-1;

OUTPUT Output-SAS-Data-Set-n;
RUN ;

SELECT group with OTHERWISE statement

SELECT <(select-expression)>;
WHEN (value-1<,value-n>)
statement;
<WHEN (value-1<,value-n>)
statement;>
<OTHERWISE statement;>
END;
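
A small sketch of a SELECT group inside a DATA step, assuming the sashelp.class sample data set; the output data set name and the Group variable are hypothetical. Note that the group is closed with END.

```sas
/* Hypothetical example: assign a group label based on the value of Sex */
DATA work.class_groups;
   SET sashelp.class;
   LENGTH Group $ 8;
   SELECT (Sex);
      WHEN ('F') Group = 'Girls';
      WHEN ('M') Group = 'Boys';
      OTHERWISE  Group = 'Unknown';   /* catches any other value */
   END;
RUN;
```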



c. Selecting Variables and Observations

DATA Output-SAS-Data-Set <(DROP=variable(s) or KEEP=variable(s))>;

SET Input-SAS-Data-Set <(DROP=variable(s) or KEEP=variable(s) FIRSTOBS=n OBS=n) >;
< additional statements >
<KEEP variable(s) ;>
<DROP variable(s) ;>
RUN ;



Summarizing Data

a. Creating an Accumulating Total Variable

DATA Output-SAS-Data-Set;
SET Input-SAS-Data-Set;

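/* either: RETAIN with an assignment statement */
RETAIN totalvariable 0;
totalvariable=totalvariable+variable;

/* or: sum statement (retains automatically, starts at 0, ignores missing values) */
totalvariable+variable;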

RUN ;


b. Accumulating Totals for a Group of Data

i. One BY-variable

PROC SORT DATA=Input-SAS-Data-Set OUT=Sorted-SAS-Data-Set;
BY <DESCENDING> BY-variable;
RUN;

DATA Output-SAS-Data-Set;
SET Sorted-SAS-Data-Set;
BY BY-variable;
IF first.BY-variable THEN totalvariable=0;
totalvariable+variable;
IF last.BY-variable;
RUN;
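
The one-BY-variable pattern can be sketched with the sashelp.class sample data set. The output data set names and the TotalWeight variable are assumptions made for this illustration.

```sas
/* Hypothetical example: total Weight per value of Sex */
PROC SORT DATA=sashelp.class OUT=work.class_sorted;
   BY Sex;
RUN;

DATA work.weight_by_sex (KEEP=Sex TotalWeight);
   SET work.class_sorted;
   BY Sex;
   IF FIRST.Sex THEN TotalWeight = 0;   /* reset at the start of each BY group */
   TotalWeight + Weight;                /* sum statement: retained automatically */
   IF LAST.Sex;                         /* keep only the last row of each group */
RUN;
```

The subsetting IF on LAST.Sex leaves one observation per group, holding the group total.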


ii. Multiple BY-variables

PROC SORT DATA=Input-SAS-Data-Set OUT=Sorted-SAS-Data-Set;
BY <DESCENDING> BY-variable-1 BY-variable-2;
RUN;

DATA Output-SAS-Data-Set;
SET Sorted-SAS-Data-Set;
BY BY-variable-1 BY-variable-2;
IF first.BY-variable-2 THEN totalvariable=0;
totalvariable+variable;
IF last.BY-variable-2;
RUN;

Creating SAS Data Sets from an external file:


a. Reading Raw Data Files with formatted input

DATA libref.SAS-Data-Set(s);
INFILE filename <DLM=delimiter(s)><MISSOVER><TRUNCOVER><DSD>;
INPUT @startcol variable1 informat.
+n variable2 informat. ;
RUN ;

MISSOVER: Prevents a new record from being loaded into the input buffer when you have missing values at the end of a record
DSD: Consecutive delimiters are treated as a missing value, and values with embedded delimiters are read if the value is surrounded by double quotes.

Controlling When a Record Loads
iii. Using Linepointers (Load a next record)

DATA Output-SAS-Data-Set;
INFILE raw-data-file;
INPUT variable-1 informat. @n variable-2 informat. / @n variable-n informat.; /* relative line pointer */
INPUT #1 variable-1 informat. @n variable-2 informat. #2 variable-n informat.; /* absolute line pointer */
RUN;
iv. Using (Double) Trailing @ (holds raw data record in input buffer)

DATA Output-SAS-Data-Set;
INFILE raw-data-file;
INPUT variable-1 informat. @n variable-2 informat. @; /* held until an INPUT without @ or a new DATA step iteration */
INPUT variable-1 informat. @n variable-2 informat. @@; /* held until the end of the record */
RUN;
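
A brief sketch of the double trailing @, assuming hypothetical inline data in which several observations share one record. The data set and variable names are made up for illustration.

```sas
/* Hypothetical example: several name/score pairs on each record */
DATA work.scores;
   INPUT Name $ Score @@;   /* @@ holds the record across DATA step iterations */
   DATALINES;
Ann 90 Bob 85 Cid 77
Dee 88
;
RUN;
```

Each iteration reads one Name/Score pair; a new record is loaded only when the current one is exhausted.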

Data Transformations

DATA Output-SAS-Data-Set;
SET Input-SAS-Data-Set;
new-variable=FUNCTION(argument-1, argument-2,argument-n);
RUN ;

d. Manipulating Character Values

Varnew =SUBSTR(string,start<,length>);
length: initial string
gets a substring of <length> characters in a string, starting from start

SUBSTR(string,start<,length>)=value;
length: initial string
replaces the substring with value

Varnew =LENGTH(argument);
length: 8 bytes
returns the length of a non-blank character string, excluding trailing blanks

Varnew =SCAN(string,n<,delimiters>);
length: 200 bytes
returns the n-th word in string, a list of words

Varnew =CHAR(string,position);
length: 1 byte
returns a single character from a specified position in a character string

Varnew =LEFT(argument);
length: initial string
left-aligns argument

Varnew =RIGHT(argument);
length: initial string
right-aligns argument

Varnew =TRIM(argument);
length: initial string
removes trailing blanks from argument

Varnew =TRIMN(argument);
length: initial string
removes trailing blanks from argument and returns a null string if the argument is blank

Varnew =STRIP(argument);
length: initial string
removes all the leading and the trailing blanks from argument

Varnew =COMPBL(argument);
length: initial string
removes multiple blanks from a character string by translating each occurrence of two or more consecutive blanks into a single blank

Varnew =string1 || string2;
length: length of string1 + length of string2
concatenates string1 and string2

Varnew =CAT(string-1, ..., string-n);
length: depends on where you use it
concatenates string-1 through string-n without removing leading or trailing blanks

Varnew =CATX(separator, string-1, ..., string-n);
length: depends on where you use it
concatenates string-1 through string-n with separator and removes leading and trailing blanks

Varnew =CATS(string-1, ..., string-n);
length: depends on where you use it
concatenates string-1 through string-n and removes leading and trailing blanks

Varnew =CATT(string-1, ..., string-n);
length: depends on where you use it
concatenates string-1 through string-n and removes trailing blanks

Varnew =FIND(target, value<, modifiers><, startpos>);
returns the starting position of the first occurrence of value within target
Modifiers: I = case-insensitive search
           T = ignore trailing blanks
Startpos: start searching from this position

Varnew =UPCASE(argument);
length: initial string
returns argument in upper case

Varnew =LOWCASE(argument);
length: initial string
returns argument in lower case

Varnew =PROPCASE(argument<,delimiter(s)>);
length: initial string
returns argument in proper case (starting with a capital for every new word after a delimiter)

Varnew =TRANWRD(source, target, replacement);
length: 200 bytes
replaces or removes all occurrences of target with replacement in source

Varnew =COMPRESS(source<,char>);
length: 200 bytes
removes the characters listed in the char argument from the source
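
To show how a few of these character functions combine, here is a sketch that cleans up a name stored as "last, first". The input value and all variable names are hypothetical.

```sas
/* Hypothetical example: split, recase and rejoin a name */
DATA _NULL_;
   LENGTH full $ 40 fname lname $ 20;
   raw   = '  smith,   JOHN  ';
   lname = PROPCASE(SCAN(raw, 1, ','));   /* first word before the comma */
   fname = PROPCASE(SCAN(raw, 2, ','));   /* second word after the comma */
   full  = CATX(' ', fname, lname);       /* strips blanks, joins with one space: John Smith */
   pos   = FIND(full, 'smith', 'I');      /* case-insensitive search: 6 */
   PUTLOG full= pos=;
RUN;
```

Declaring the lengths up front avoids depending on the default result lengths listed for each function.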

e. Manipulating Numeric Values

Varnew =ROUND(argument<,round-off-unit>);
returns argument rounded to the nearest round-off-unit (e.g. 10, .1, .01)

Varnew =CEIL(argument);
returns the smallest integer greater than or equal to the argument

Varnew =FLOOR(argument);
returns the biggest integer smaller than or equal to the argument

Varnew =INT(argument);
returns the integer portion of argument

Varnew =MIN(argument-1, ..., argument-n);
returns the smallest non-missing value from the arguments

Varnew =MAX(argument-1, ..., argument-n);
returns the biggest non-missing value from the arguments

Varnew =SUM(argument-1, ..., argument-n);
returns the sum of the arguments and ignores missing values

Varnew =MEAN(argument-1, ..., argument-n);
returns the mean of the arguments and ignores missing values

Varnew =N(argument-1, ..., argument-n);
returns the count of non-missing arguments

Varnew =NMISS(argument-1, ..., argument-n);
returns the count of missing arguments



f. Converting Variable Type

DATA new-SAS-data-set;
SET old-SAS-data-set(RENAME=(old-variable-name=new-variable-name));

/* i. Explicit numeric-to-character conversion */
Varnew =PUT(old-numeric-variable, format.);

/* ii. Explicit character-to-numeric conversion */
Varnew =INPUT(old-character-variable, informat.);
RUN;
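
A minimal sketch of both explicit conversions, using made-up values and variable names. The COMMA8. informat and Z5. format are chosen only for illustration.

```sas
/* Hypothetical example: explicit type conversion with INPUT and PUT */
DATA work.converted;
   char_amount = '1,234.50';
   num_id      = 42;
   amount  = INPUT(char_amount, comma8.);   /* character -> numeric: 1234.5 */
   id_text = PUT(num_id, z5.);              /* numeric -> character: '00042' */
RUN;
```

INPUT always takes an informat and PUT always takes a format, which is an easy way to remember the direction of each conversion.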

Debugging Techniques

a. Using Putlog Statement

DATA Output-SAS-Data-Set;
<Sas statements>
PUTLOG text; =>write text to log
PUTLOG variable=; =>write variable=value to log
PUTLOG variable format.; =>write value of variable in indicated format to log
PUTLOG _ALL_; =>write name and value of all variables in PDV to log
RUN;


b. Using Debug Option

DATA Output-SAS-Data-Set /DEBUG;
<Sas statements>
RUN;

Commands in DEBUG mode   Abbreviation     Action
STEP                     ENTER key        Steps through a program one statement at a time.
EXAMINE                  E variable(s)    Displays the value of the variable.
WATCH                    W variable(s)    Suspends execution when the value of the variable changes.
LIST WATCH               L W              Lists variables that are watched.
SET                      SET              Assigns a new value to the specified variable.
QUIT                     Q                Halts execution of the DATA step.

Processing Data Iteratively

a. Do Loop Processing

iii. Iterative Loop Processing

DATA Output-SAS-Data-Set;
DO index-variable=start TO stop <BY increment>;
<additional SAS-statements>
END;
RUN;

iv. Conditional Loop Processing

DATA Output-SAS-Data-Set;
DO WHILE (expression); /* expression evaluated at top of loop */
<additional SAS-statements>
END;
RUN;

DATA Output-SAS-Data-Set;
DO UNTIL (expression); /* expression evaluated at bottom of loop */
<additional SAS-statements> /* executed at least once */
END;
RUN;

v. Iterative DO Statement with a conditional Clause

DATA Output-SAS-Data-Set;
DO index-variable=start TO stop <BY increment>
WHILE|UNTIL (expression);
<additional SAS-statements>
END;
RUN;
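
The iterative DO statement with a conditional clause can be sketched as follows. The scenario (a balance growing by 5% a year) and all names are hypothetical.

```sas
/* Hypothetical example: grow a balance by 5% per year */
DATA work.savings;
   balance = 1000;
   /* stops after 10 years or once the balance reaches 1500, whichever comes first */
   DO year = 1 TO 10 UNTIL (balance >= 1500);
      balance = balance * 1.05;
      OUTPUT;                      /* one observation per year */
   END;
RUN;
```

With UNTIL the condition is checked at the bottom of each iteration, so the year in which the threshold is crossed is still written out.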

b. SAS Array Processing

DATA Output-SAS-Data-set;
SET Input-SAS-Data-set;

/* defining the array */
ARRAY array-name{number-of-elements}<$> <length><array-elements><initial-value-list>;

/* processing the array */
DO index-variable=1 TO number-of-elements;
additional SAS statements using array-name{index-variable}
END;
<DROP index-variable;>


RUN;

c. Using SAS Arrays

vi. Create new variables (with name=array-name-2 using old-variables)

DATA Output-SAS-Data-set (DROP=index-variable);
SET Input-SAS-Data-set;
ARRAY array-name-1{number-of-elements-1} <old-variables>;
ARRAY array-name-2{number-of-elements-2};
DO index-variable=1 TO number-of-elements-2;
array-name-2{index-variable}=expression-using-array-name-1{index-variable};
END;
RUN;

Useful array functions

DIM(array name) Return the number of elements in an array
HBOUND(array name) Return the upper bound of an array
LBOUND(array name) Return the lower bound of an array

Rotate SAS data set

DATA Output-SAS-Data-set (DROP=old-variables);
SET Input-SAS-Data-set;
ARRAY array-name{number-of-elements}< old-variables>;
DO index-variable=1 TO number-of-old-variables;
new-variable=array-name{index-variable};
OUTPUT;
END;
RUN;
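
As a sketch of rotating a data set with an array, assume a hypothetical wide data set with one row per ID and four quarterly columns Q1-Q4. All data set and variable names are invented for the example.

```sas
/* Hypothetical wide input: one row per ID, four quarterly amounts */
DATA work.wide;
   INPUT ID Q1-Q4;
   DATALINES;
1 10 20 30 40
2 15 25 35 45
;
RUN;

/* Rotate: one output row per ID and quarter */
DATA work.long (DROP=Q1-Q4);
   SET work.wide;
   ARRAY sales{4} Q1-Q4;
   DO Quarter = 1 TO DIM(sales);
      Amount = sales{Quarter};
      OUTPUT;                      /* writes ID, Quarter, Amount */
   END;
RUN;
```

Here the index variable Quarter is kept on purpose, because it identifies which original column each Amount came from.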

-------------ALTERNATIVE----------------

PROC TRANSPOSE DATA=Input-SAS-Data-set
<OUT=Output-SAS-Data-set>
<NAME=variable-name>;
<BY <DESCENDING> variable-1
<<DESCENDING> variable-n> <NOTSORTED>;>
<VAR variable(s);>
<ID variable(s);>
RUN;


Combining SAS Data Sets

Using data manipulation techniques with match-merging

DATA match nonmatch1 nonmatch2;
MERGE SDS1(IN=in1) SDS2(IN=in2);
BY Merge_Var;
IF in1 AND in2 THEN OUTPUT match;
IF in1 AND NOT in2 THEN OUTPUT nonmatch1;
IF NOT in1 AND in2 THEN OUTPUT nonmatch2;
RUN;

Other SAS Languages

a. PROC SQL

PROC SQL;
SELECT variable-1,variable-n
FROM Input-SAS-Data-Set-1 <AS alias-inputdataset-1>,
Input-SAS-Data-Set-2 <AS alias-inputdataset-2>
WHERE expression ;
QUIT;

b. SAS Macro

%LET variable-1=value; /* declaration of the macro variable */

/* example of usage of the macro variable with a PROC PRINT */
TITLE "listing report for the variable &variable-1";
PROC PRINT DATA=Input-SAS-Data-Set;
VAR &variable-1;
Run;
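
A concrete sketch of the same idea, assuming the sashelp.class sample data set and a hypothetical macro variable named var. The title uses double quotes so that the macro variable reference is resolved.

```sas
/* Hypothetical example: one macro variable drives both the title and the VAR list */
%LET var = Height;

TITLE "Listing report for the variable &var";
PROC PRINT DATA=sashelp.class;
   VAR &var;
RUN;
TITLE;   /* clear the title afterwards */
```

Changing the %LET value to another variable name, such as Weight, changes the report without touching the rest of the code.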
