
Functions That Return Values from Previous Observations

Because SAS processes data from raw data files and SAS data sets line by line
(or observation by observation), it is difficult to compare a value in the
present observation with one from a previous observation. Two functions, LAG
and DIF, are useful in this regard.

Let's start with a short program that demonstrates how the LAG function works:

Program 11-13 Demonstrating the LAG and LAGn functions


data look_back;
   input Time Temperature;
   Prev_temp = lag(Temperature);
   Two_back = lag2(Temperature);
datalines;
1 60
2 62
3 65
4 70
;

A listing of data set Look_Back follows:


Listing of LOOK_BACK

Obs  Time  Temperature  Prev_temp  Two_back
  1     1           60          .         .
  2     2           62         60         .
  3     3           65         62        60
  4     4           70         65        62

As you can see from this listing, the LAG function returns the temperature
from the previous time and the LAG2 function returns the temperature from the
time before that. (There is a whole family of LAG functions: LAG, LAG2, LAG3,
and so on.) This program might give you the idea that the LAG function returns
the value of its argument from the previous observation. This is not always
true. The correct definition of the LAG function is that it returns the value
its argument had the last time the LAG function executed. To help clarify this
somewhat clunky-sounding definition, see if you can predict the values of x
and Last_x in the program that follows:

Program 11-14 Demonstrating what happens when you execute a LAG function conditionally

data laggard;
   input x @@;
   if x ge 5 then Last_x = lag(x);
datalines;
9 8 7 1 2 12
;

Here is a listing of data set Laggard:


Listing of LAGGARD

Obs   x  Last_x
  1   9       .
  2   8       9
  3   7       8
  4   1       .
  5   2       .
  6  12       7

OK, are you surprised? The value of Last_x in the first three observations is
clear. But what happened in Observation 6? To understand this, you need to
read the definition carefully. The IF condition is not true in Observations 4
and 5, so the LAG function does not execute there, and Last_x, which is set to
a missing value at each iteration of the DATA step, remains missing. In
Observation 6, the IF condition is true and the LAG function returns the value
of x the last time this function executed, which was back at Observation 3,
where x was equal to 7.

The take-home message is this: Be careful if you execute a LAG function
conditionally. In most cases, you want to execute the LAG function for each
iteration of the DATA step. When you do, this function returns the value of
its argument from the previous observation.
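
One way to follow this advice, sketched here as an illustration rather than as the author's own program, is to call LAG unconditionally and apply the condition only to the result. The data set name Laggard_fixed and the variable Prev_x are hypothetical names introduced for this sketch; the data are those of Program 11-14.

```sas
data laggard_fixed;                  /* hypothetical data set name */
   input x @@;
   Prev_x = lag(x);                  /* executes on every iteration of the DATA step */
   if x ge 5 then Last_x = Prev_x;   /* the condition applies to the result only */
datalines;
9 8 7 1 2 12
;
```

Because LAG now executes for every observation, Prev_x always holds the value of x from the previous observation. In Observation 6, Last_x would therefore be 2 (the value of x in Observation 5) rather than 7.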
A common use of the LAG function is to compute differences between
observations. For example, you can modify Program 11-13 to compute the
difference in temperature from one time to the next, as follows:

Program 11-15 Using the LAG function to compute interobservation differences

data diff;
   input Time Temperature;
   Diff_temp = Temperature - lag(Temperature);
datalines;
1 60
2 62
3 65
4 70
;

Here is a listing of Diff:


Listing of DIFF

Obs  Time  Temperature  Diff_temp
  1     1           60          .
  2     2           62          2
  3     3           65          3
  4     4           70          5

Programmers often use the form:

x - lag(x);

Therefore, a set of DIF functions (DIF, DIF2, DIF3, and so on) is available.
DIF(x) is equal to x - LAG(x). You could, therefore, rewrite Program 11-15
like this:

Program 11-16 Demonstrating the DIF function

data diff;
   input Time Temperature;
   Diff_temp = dif(Temperature);
datalines;
1 60
2 62
3 65
4 70
;
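
The other members of the DIF family work the same way: DIFn(x) is equal to x - LAGn(x). As a minimal sketch, the following variation on Program 11-16 computes both the one-step and the two-step differences. The data set name Diff2 and the variable Diff2_temp are hypothetical names introduced for this illustration.

```sas
data diff2;                           /* hypothetical data set name */
   input Time Temperature;
   Diff_temp  = dif(Temperature);     /* Temperature - lag(Temperature)  */
   Diff2_temp = dif2(Temperature);    /* Temperature - lag2(Temperature) */
datalines;
1 60
2 62
3 65
4 70
;
```

With these data, Diff2_temp is missing in the first two observations, 5 in Observation 3 (65 - 60), and 8 in Observation 4 (70 - 62).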
