SAS Overview
SAS Software
SAS offers a variety of software products. The most commonly used are:

SAS Software    Applications
BASE SAS        Report generation, mathematical and data analysis
SAS/ACCESS      Interface to DB2
SAS/AF          User-friendly windowing applications
SAS/CONNECT     Communication with remote SAS sessions
SAS/FSP         Interactive data entry and retrieval facilities
SAS/GRAPH       Visual representation of data analysis
SAS/NVISION     Creation of 3D objects, animation and prototyping
SAS/SHARE       Concurrent update access to SAS files
Applications
- Statistical and mathematical analysis
- Report generation
- Graphics
- Business forecasting
- Animation
- Modelling
- Data analysis
- Operations research
Features of SAS
Portability
SAS date values are stored as the number of days from Jan. 1, 1960:

Date            SAS date value
Jan. 1, 1959    -365
Jan. 1, 1960    0
Jan. 1, 1961    366
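
The day counts above can be checked with SAS date literals. This is a minimal sketch; the variable names d1959, d1960 and d1961 are hypothetical, and DATA _NULL_ is used so that no data set is created.

```sas
/* A SAS date is the number of days from Jan. 1, 1960 */
DATA _NULL_;
  d1959 = '01JAN1959'D;
  d1960 = '01JAN1960'D;
  d1961 = '01JAN1961'D;
  PUT d1959= d1960= d1961=;   /* unformatted: writes the raw day counts to the log */
  PUT d1961= DATE9.;          /* the same stored value written as a date */
RUN;
```

The first PUT statement should write -365, 0 and 366 to the SAS log; 1960 is a leap year, which is why Jan. 1, 1961 is day 366.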
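- Every statement must end with a semicolon.
- Variable names must begin with a letter or an underscore and should not exceed eight characters (the Version 6 limit; later releases allow up to 32).
- Variable names should not have embedded blanks.
- Variable names should not duplicate SAS automatic variable names such as _N_ and _ERROR_.
- Comments are delimited by /* and */, or written as a statement that begins with * and ends with a semicolon.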
SAS Variables
SAS variables can be classified as follows:
- Numeric variables: store a maximum of 8 bytes.
- Character variables: store a maximum of 200 bytes (the Version 6 limit; later releases allow up to 32,767).
- Macro variables: referenced with an & prefix.
Selected INFORMATS
Informats are special instructions used to read data values into a variable. Some of the informats are:

Informat    Description
$CHARw.     Reads character data, preserving blanks
COMMAw.d    Removes embedded characters such as commas
DATEw.      Reads a date in the form ddmmmyy (for example 01JAN60)
$w.         Reads standard character data
Selected FORMATS
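Formats are special instructions the SAS System uses to write (display) the data values stored in variables. E.g.:

Format       Description
$CHARw.      Writes standard character data
BESTw.       Default format for writing numeric values
COMMAw.d     Writes numeric values with commas separating every three digits
DATEw.       Writes date values in the form ddmmmyy
DOLLARw.d    Writes numeric values with a leading dollar sign and commas separating every three digits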
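
A short sketch of informats and formats working together may help here. The data set name pay, the variable names and the data values are hypothetical; the data lines are laid out so that the date starts in column 12 and the salary in column 22.

```sas
DATA pay;
  /* Informats on INPUT: how the raw text is read into each variable */
  INPUT @1 name $CHAR10. @12 hired DATE9. @22 salary COMMA7.;
  /* Formats: how the stored values are written in output */
  FORMAT hired DATE9. salary DOLLAR10.2;
  CARDS;
Anil       01JAN1960 34,500
Roopa      15MAR1961 26,000
;
RUN;

PROC PRINT DATA=pay;
RUN;
```

The variable hired is stored as a number of days and salary as a plain number; only their printed appearance depends on the FORMAT statement.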
Two automatic variables (numeric) are created for each DATA step. They are:
- _N_: Initially set to 1. Increments by 1 each time the DATA step iterates.
- _ERROR_: Set to 0 by default. Set to 1 when an error, such as an input data error, occurs during the iteration.
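
The automatic variables can be inspected by writing them to the SAS log. This is a minimal sketch with hypothetical input values.

```sas
DATA _NULL_;
  INPUT age;
  /* _N_ counts DATA step iterations; _ERROR_ is 1 only if this iteration hit an error */
  PUT _N_= age= _ERROR_=;
  CARDS;
20
22
54
;
RUN;
```

Each data line produces one line in the log, with _N_ increasing from 1 to 3 and _ERROR_ staying at 0 for valid input.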
SAS Operators
- Comparison Operators
- Logical Operators
- Arithmetic Operators

Arithmetic Operators

Operation         Symbol    E.g.       Meaning
Addition          +         X=Y+Z;     Adds Y and Z
Subtraction       -         X=Y-Z;     Subtracts Z from Y
Multiplication    *         X=Y*Z;     Multiplies Y by Z
Division          /         X=Y/Z;     Divides Y by Z
Exponentiation    **        X=Y**Z;    Raises Y to the Z power
Comparison Operators
Symbol      Mnemonic Operator    Meaning
=           EQ                   Equal to
~= or ^=    NE                   Not equal to
>           GT                   Greater than
<           LT                   Less than
>=          GE                   Greater than or equal to
<=          LE                   Less than or equal to
Logical Operators
Symbol    Mnemonic Operator
&         AND
|         OR
~ or ^    NOT

Other Operators
MIN( )
MAX( )
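
The operators above can be tried out in a single DATA step. This is a sketch with hypothetical variable names and values; MIN( ) and MAX( ) are used here in their function form.

```sas
DATA _NULL_;
  y = 7;
  z = 2;
  total = y + z;                    /* arithmetic: addition */
  power = y ** z;                   /* arithmetic: exponentiation */
  bigger = (y GT z);                /* a comparison returns 1 (true) or 0 (false) */
  both = (y > 5) AND (z <= 2);      /* logical AND of two comparisons */
  neither = NOT (y = 7 OR z = 2);   /* logical NOT applied to an OR */
  largest = MAX(y, z);
  smallest = MIN(y, z);
  PUT total= power= bigger= both= neither= largest= smallest=;
RUN;
```

With these values the log should show total=9, power=49, bigger=1, both=1, neither=0, largest=7 and smallest=2.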
SAS Program
A SAS program is a collection of SAS steps. A SAS step can be either a DATA step or a PROC step.
- A DATA step starts with the keyword DATA. It is used for creating SAS datasets.
- A PROC step starts with the keyword PROC. It is used for processing SAS datasets.

A SAS step boundary can be identified in two ways:
- A RUN statement
- The beginning of the next SAS step

A RUN statement marks the end of a SAS step. If no RUN statement is present, the beginning of the next SAS step marks the end of the previous SAS step.
SAS step boundary terminated by a RUN statement. E.g.:

/* SAS step 1 */
DATA sample;
  INFILE ext;
  INPUT @001 name $ @030 age;
RUN;
/* SAS step 2 */
PROC PRINT DATA = sample;
RUN;

SAS step boundary terminated without a RUN statement. E.g.:

/* SAS step 1 */
DATA sample;
  INFILE ext;
  INPUT @001 name $ @030 age;
/* SAS step 2 */
PROC PRINT DATA = sample;
RUN;
A SAS program can be run in either of two modes:
- Batch Mode
- Interactive Mode
SAS Output
When you execute a SAS program, the output generated by SAS is divided into two major parts:
- SAS Log: contains information about the processing of the SAS program, including any warning and error messages.
- Output: contains reports generated by SAS procedures and DATA steps.

In batch SAS, output is routed to SASLIST by default. The output can also be routed to an external file with the help of the FILE statement.
SAS Output
SAS Datasets
SAS Catalogs
Other Files
SAS datasets cannot be accessed by any programming language other than SAS.
The SAS System can, however, fetch data from mainframe datasets.
Descriptor Information
Observation
Variable
Descriptor Portion
The descriptor portion of a SAS data set contains :
General information about the SAS data set ( Data set name, number of observations, and so on)
Variable attributes ( Variable name, type, length, position, informat, format, label )
SAS Dataset
Parts of a Dataset
Every SAS data set name has three elements:
- Libref
- Data set name
- Member type

The general form of a SAS data set name is:
libref.data-set-name.member-type

where libref is the logical name of a SAS data library, data-set-name is the name of the data set, and member-type is DATA for SAS data files and VIEW for SAS data views. (The member type is assigned by the SAS System.)
SAS Datasets
PROC Steps
Information
Input to a SAS dataset can be any external file: a TSO file, a PS file, a member of a PDS, a VSAM file, another SAS dataset or even an Excel file. Data retrieved from external sources must be converted to a SAS dataset, because the SAS System recognises only SAS datasets.
DATA Step
SAS stores information in the form of SAS datasets. A SAS dataset is created by using the DATA statement.
E.g.: DATA details; The above statement creates a SAS dataset called details. Data can be supplied to a SAS dataset through the INFILE, CARDS or CARDS4 statement.
When no dataset name is specified on a DATA statement, SAS automatically names the datasets created as DATA1, DATA2, ..., DATAn. This is called the DATAn naming convention.
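
The following sketch combines in-stream data (CARDS) with the DATAn naming convention. The variable names and data values are hypothetical; in a fresh SAS session the unnamed step would be expected to create DATA1.

```sas
/* No data set name is given, so SAS applies the DATAn convention */
DATA;
  INPUT name $ age;
  CARDS;
Ramesh 12
Gopal 34
;
RUN;

/* _LAST_ refers to the most recently created data set */
PROC PRINT DATA=_LAST_;
RUN;
```

Naming the data set explicitly on the DATA statement is usually easier to follow, since the DATAn number depends on how many unnamed data sets the session has already created.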
The INFILE statement is used to reference the external file where the raw data is available. Using INPUT statement the data can be retrieved from the external file.
E.g.:

DATA detail;
  INFILE 'uhgxn0.extrnl.file';
  INPUT @001 Name $ @032 Age;
RUN;

These statements create a SAS dataset called detail containing two variables, Name and Age. The values for Name are read starting at column 1 of the external file (uhgxn0.extrnl.file) and the values for Age are read starting at column 32.
IA.AIRCRAFTCAP
Model MF4000 LF5200 LF5200 010012 030006 030008
Combining Datasets
- Concatenation: combines two or more datasets one after the other into a single dataset. This is accomplished using the SET statement.
- Interleaving: combines individual sorted datasets into one sorted dataset. This is accomplished using the SET statement together with a BY statement.
- Merging: combines observations from two or more datasets into a single observation in a new dataset. This is accomplished using the MERGE statement, with a BY statement for match merging.
- Updating: replaces the values of variables in one dataset (the master dataset) with non-missing values from another dataset (the transaction dataset). This is accomplished using the UPDATE statement together with a BY statement.
Concatenation
LIST1
Name      Age
Arijit    20
Mohan     22
Kumar     54

LIST2
Name      Age
Rohit     18
Raj       33
Sekar     17

LIST3
Name      Age
Arijit    20
Mohan     22
Kumar     54
Rohit     18
Raj       33
Sekar     17
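
The slide shows only the data for this case. A minimal sketch of the concatenation itself, assuming LIST1 and LIST2 are built from in-stream data, could look like this:

```sas
DATA list1;
  INPUT name $ age;
  CARDS;
Arijit 20
Mohan 22
Kumar 54
;
RUN;

DATA list2;
  INPUT name $ age;
  CARDS;
Rohit 18
Raj 33
Sekar 17
;
RUN;

/* SET with several data sets reads them one after the other */
DATA list3;
  SET list1 list2;
RUN;

PROC PRINT DATA=list3;
RUN;
```

LIST3 receives all observations of LIST1 followed by all observations of LIST2, in their original order.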
Merging
- One-to-One Merging: no BY statement is used. The SAS System combines the first observation in all datasets named in the MERGE statement into the first observation in the new dataset, the second observation in all datasets into the second observation in the new dataset, and so on.
- Match Merging: datasets are merged according to the variables named in the BY statement.
One-to-One Merging
PAYROLL
Name     Age    Sex
Anil     22     M
Roopa    21     F

INCREASE
Name     Salary
Anil     34500
Roopa    26000
Kiran    22000

DATA newpay;
  MERGE payroll increase;
RUN;

NEWPAY
Name     Age    Sex    Salary
Anil     22     M      34500
Roopa    21     F      26000
Kiran    .             22000
Match Merging
WORK.ONE
X    Y
1    A
2    B
3    C

WORK.TWO
X    Z
1    A1
1    A2
2    B1
3    C1
3    C2

DATA work.three;
  MERGE work.one work.two;
  BY X;
RUN;

WORK.THREE
X    Y    Z
1    A    A1
1    A    A2
2    B    B1
3    C    C1
3    C    C2
Interleaving
LIST1
Name       Age
Anil       33
Sunil      12

LIST2
Name       Age
Karthik    17
Prakash    43

DATA list3;
  SET list1 list2;
  BY name;
RUN;

LIST3
Name       Age
Anil       33
Karthik    17
Prakash    43
Sunil      12
UPDATING
PAYROLL (master dataset)
Name     Salary
Anil     24500
Hari     32000
Kiran    16000

INCREASE (transaction dataset)
Name     Salary
Anil     34500
Hari     .
Kiran    26000

DATA newpay;
  UPDATE payroll increase;
  BY Name;
RUN;

NEWPAY
Name     Salary
Anil     34500
Hari     32000
Kiran    26000
PROCEDURES
Procedures are used to perform operations on SAS data sets.
SAS provides a large number of procedures, each invoked with the keyword PROC. A few of the most widely used procedures, CONTENTS, SORT and PRINT, are discussed below.
CONTENTS Procedure
- Used to browse the descriptor portion of a data set
- Gives information about the data set and the variables present in it

General form of the CONTENTS procedure:
PROC CONTENTS DATA = SAS-data-set;

E.g.:

DATA ONE;
  INPUT @001 NAME $15. @020 AGE 2.;
  CARDS;
RAMESH             12
GOPAL              34
RAJU               07
;
RUN;
PROC CONTENTS DATA = ONE;
RUN;
SORT Procedure
- Rearranges the observations in a SAS data set
- Replaces the input data set with the rearranged observations, or creates a new SAS data set when the OUT= option is specified
- Can sort on multiple variables
- Sorts in ascending (default) or descending order
- Does not generate printed output
- Treats missing values as the smallest possible value
PROC SORT
E.g.:

DATA ONE;
  INPUT @001 NAME $15. @020 AGE 2.;
  CARDS;
RAMESH             12
GOPAL              34
RAJU               07
RAJU               07
;
RUN;
PROC SORT DATA = ONE NODUPLICATES;
  BY NAME;
RUN;

The data set WORK.ONE will contain only three records. The duplicate record for RAJU is deleted.

Name      Age
GOPAL     34
RAJU      07
RAMESH    12
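
The SORT features listed earlier (several BY variables, descending order, writing to a new data set) can be sketched as follows. The output data set name by_age is hypothetical, and the data is a smaller copy of the example above read with list input.

```sas
DATA one;
  INPUT name $ age;
  CARDS;
RAMESH 12
GOPAL 34
RAJU 7
;
RUN;

/* OUT= writes the sorted observations to a new data set and leaves ONE unchanged */
PROC SORT DATA=one OUT=by_age;
  BY DESCENDING age name;   /* DESCENDING applies only to the variable that follows it */
RUN;

PROC PRINT DATA=by_age;
RUN;
```

A PROC PRINT step is included because PROC SORT itself produces no printed output.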
PRINT Procedure
Used to display the contents of a SAS data set.

General form of the PRINT procedure:

PROC PRINT <option-list>;
  VAR variable-list;
  ID variable-list;
  BY variable-list;
  PAGEBY BY-variable;
  SUMBY BY-variable;
  SUM variable-list;
RUN;
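
A small sketch using some of these statements follows. The data set staff and its variables are hypothetical; the data is sorted first because the BY statement in PROC PRINT expects the input to be in BY-variable order.

```sas
DATA staff;
  INPUT name $ dept $ salary;
  CARDS;
Anil A 34500
Hari B 32000
Kiran A 26000
;
RUN;

PROC SORT DATA=staff;
  BY dept;
RUN;

PROC PRINT DATA=staff;
  BY dept;       /* one section of the report per department */
  ID name;       /* identify each row by name instead of the observation number */
  VAR salary;    /* print only this variable */
  SUM salary;    /* add totals for salary */
RUN;
```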
Thank You !