By
Robert A. Yaffee
Statistics, Social Science, and Mapping Group
Academic Computing Services
Information Technology Services
New York University
January 2002
Invocation of LIMDEP
We begin the procedure by invoking the LIMDEP program. To invoke LIMDEP,
double-click on the LIMDEP icon. The opening dialog box, seen in Figure 1, appears
on the screen.
Figure 1 Project dialog box
When the user clicks on File and New, the Text/Command Document dialog box,
seen in Figure 2, opens.
The user at this point will see that the Text/Command Document option is
selected. He can click on OK to open a command window, shown in Figure 3.
The Basic Command Syntax
The user can write his program syntax in this window. A basic program is
entered into this window to illustrate our exposition of the fundamental program
command syntax in Figure 4 on the next page. Annotation of the commands follows.
Command Termination
Title Statement
First, we insert a title statement in our program to notify us of the purpose of the
program. Titles may be stacked, one atop the other, just before the statistical
procedure, in order to properly describe it.
Second, we define the output file in which our output will be placed for viewing
and analysis.
Open; output=output1 $
Data Definition
Next we have to define the in-line data and variable names. Reading data files is
done with the READ command. The READ command contains subcommands that
define the number of observations (nobs), the number of variables (nvar), and the
variable names (names).
Input data formats are not needed if the values in the data file are separated by
one blank space. If the values in the data file are not separated by blank spaces,
LIMDEP requires a format specification to be entered. FORTRAN floating point
or exponential formats are employed. LIMDEP does not handle string or integer formats.
If a format were used for this data set, it could be included as follows:
Read; nobs = 30; nvar = 8;
names = id, age, sex, gpa, ses, income, marital, religion;
(F2.0,1X,F2.0,1X,F1.0,1X,F3.1,1X,F2.0,1X,F6.0,2(1X,F1.0)) $
The formulation for the numeric formats is Fw.d, where F=numeric floating point
format, w=column space width of variable, and d = number of decimals to the right of the
decimal point. 1X = 1 blank space between other variables. If there were two spaces
between the variables, then 2X would be used.
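To make the Fw.d and nX rules concrete, here is a minimal Python sketch of how a fixed-width record could be split under such a format. It is an illustration of the format logic only, not LIMDEP code: the function name parse_fixed, the sample record, and the flattened format string (the 2(1X,F1.0) repeat group written out by hand) are all assumptions made for this example.

```python
import re

def parse_fixed(line, fmt):
    """Read one fixed-width record using a flat list of Fw.d and nX descriptors."""
    values, pos = [], 0
    for token in fmt.split(","):
        token = token.strip().upper()
        skip = re.fullmatch(r"(\d*)X", token)
        field = re.fullmatch(r"F(\d+)\.(\d+)", token)
        if skip:
            # nX: skip n columns (a bare X is treated as 1X)
            pos += int(skip.group(1) or 1)
        elif field:
            width, decimals = int(field.group(1)), int(field.group(2))
            text = line[pos:pos + width].strip()
            # Fortran rule: a decimal point in the data overrides d;
            # otherwise the last d digits are taken as the fraction.
            value = float(text) if "." in text else int(text) / 10 ** decimals
            values.append(value)
            pos += width
        else:
            raise ValueError(f"unsupported descriptor: {token}")
    return values

# Hypothetical record for id, age, sex, gpa, ses, income, marital, religion
record = "01 19 1 3.2 45  52000 1 2"
fmt = "F2.0,1X,F2.0,1X,F1.0,1X,F3.1,1X,F2.0,1X,F6.0,1X,F1.0,1X,F1.0"
print(parse_fixed(record, fmt))
```

Note that the income field is six columns wide, so the shorter value 52000 is padded with a leading blank inside its own field.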
In the event that the researcher wishes to read an external spreadsheet file, from
Excel or Lotus, he would use the following Read command.
Figure 4: Basic LIMDEP command syntax
Data
The data follow the file and data definition statements in the Read command.
Data Review
The analyst gives a set of variables (in this example, the complete set) a nickname
or namelist. The nickname or namelist in this case is called xlist.
The listing of the data for the variables is accomplished with the LIST command.
The List command specifies that all of the variables in the namelist called xlist should be
listed. By reviewing this printout, the user can check whether LIMDEP is reading
the data correctly. The listing of the data includes line numbers and observation numbers
to help confirm that all data requested are read.
List; xlist $
Range checks of the discrete and continuous distributions are facilitated with
frequency tables and graphical representation of variables with the Histogram commands.
Part of the histogram command in the program syntax, shown between Figure 5
and 6, contains an Rhs designation. This refers to variables on the right hand side of the
equation list.
Figure 5 shows two panels. The upper panel is called the Trace Window. In it is
a complete list of commands as entered by the programmer and any and all warnings or
error diagnostics pertaining to them. The lower panel is called the Output Window. In it
the reviewer finds the output from the commands after they have been executed.
Figure 5 Listing of data
The first part of the histogram output consists of the frequency and percentage
tables in Figure 6. A range check using these categories can help the analyst spot
miscodes or typographical errors. The second part of the histogram output contains the
histogram shown in Figure 7 below. This graphical depiction facilitates the detection of
typographical errors as well.
Figure 7 Histogram of Gender (x-axis: SEX, coded 0 and 1; y-axis: Frequency)
Histogram for Variable AGE (x-axis: AGE, 18 through 39; y-axis: Frequency)
To evaluate the distributional characteristics of continuous distributions,
summary statistics may be computed. The DSTATS command provides the mean as
well as measures of dispersion of the variable, found in Figure 8. If one invokes the
output = 3 option, covariance and correlation matrices of the variable list are also
included in the output. Their icons are shown in the lower left of Figure 8.
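As a rough illustration of what such a summary contains, the following Python sketch computes a mean and basic dispersion measures for one variable. It is not LIMDEP code; the function name describe and the GPA values are made up for the example.

```python
import statistics

def describe(values):
    """Return the mean and simple dispersion measures for one variable."""
    return {
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "minimum": min(values),
        "maximum": max(values),
        "cases": len(values),
    }

# Hypothetical GPA values
gpa = [2.0, 3.0, 4.0]
print(describe(gpa))
```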
Another form of consistency check may be performed between nominal variables.
It is helpful for the analyst to check to see whether the relationships between discrete
variables are as expected. To do so, he can perform a crosstabulation analysis between
categorical variables by using the Crosstabs command. This Crosstabs command, unlike
most in LIMDEP, permits the inclusion of value labels.
The Crosstabs command begins with the designation of the procedure. The LHS, the
left hand side of the equation, is associated with the gender variable, sex. The values of
sex are male (coded as 0) and female (coded as 1). The RHS is associated with the Adult
variable, the values of which are 0 for child and 1 for adult. The labels for these variables
are inserted in the Labels subcommand. The crosstabulations requested in the output
subcommand are those of the count, the expected, the row percentages, the column
percentages and the total percentages. The count output is the default. The other outputs
are specifically requested in the output subcommand. In that command, shown above, the P
requests the predicted or expected crosstabulations, the R requests the row percentages,
the C requests the column percentages, and the T requests the total percentages. The
output from the above command appears in Figure 10 below.
Figure 10 Crosstabs Count, Expected, and Row Percents
Further down, he gets the crosstabulation of column percents and total percents as well,
shown in Figure 11.
Figure 11 Expected Values
Two tests of significance of the relationship are given in the form of the Pearson and
Likelihood Ratio Chi-square tests shown above in Figure 10. In these ways, the analyst
can perform the consistency checks with the crosstabulation of discrete data. The output
does not contain the nonparametric correlations sometimes associated with
crosstabulations.
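The quantities named above (expected counts, row, column, and total percentages, and the two chi-square statistics) follow from standard formulas. A minimal Python sketch under that assumption is given here; it is not LIMDEP code, and the function name crosstab_stats and the 2 x 2 counts are invented for illustration.

```python
import math

def crosstab_stats(table):
    """Compute expected counts, percentages, and chi-square tests for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    row_pct = [[100 * c / rt for c in row] for row, rt in zip(table, row_totals)]
    col_pct = [[100 * c / ct for c, ct in zip(row, col_totals)] for row in table]
    total_pct = [[100 * c / n for c in row] for row in table]
    cells = [(o, e) for obs_row, exp_row in zip(table, expected)
             for o, e in zip(obs_row, exp_row)]
    pearson = sum((o - e) ** 2 / e for o, e in cells)
    # Likelihood ratio statistic; empty cells contribute zero
    likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in cells if o > 0)
    return {"expected": expected, "row_pct": row_pct, "col_pct": col_pct,
            "total_pct": total_pct, "pearson": pearson,
            "likelihood_ratio": likelihood_ratio}

# Hypothetical counts: rows = sex (0, 1), columns = adult (0, 1)
counts = [[10, 5], [5, 10]]
stats = crosstab_stats(counts)
print(stats["expected"], round(stats["pearson"], 3))
```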
Scatterplots
The researcher might examine the relationship between the dependent variable
and the candidate predictor variables in a hypothesized model. To do so, he could
employ a series of Plot commands:
In the Plot command, the rhs variable is the dependent variable. In this case, a school
administrator is trying to ascertain what variables help explain grade point average (gpa)
on the part of students. GPA is therefore the dependent variable in the study. The
candidate predictor variables are assigned the LHS position. The title for each graph is
included in a title subcommand. In the title, blank spaces are indicated by underscores.
A typical scatterplot output appears as follows.
Scatterplot of GPA against Income (x-axis: INCOME, x10^05; y-axis: GPA)
This relationship is clearly somewhat nonlinear and possibly amenable to a natural log
transformation.
After the data cleaning and exploratory data analysis have been performed, one
may choose to sample, sort, or subset portions of the data for further analysis.
LIMDEP version 7 does all of this with ease, and it also permits the writing out of
data sets. One can sample all of the data set if it is not very large.
Sampling:
The researcher may sample all of the observations in the data set. He merely has
to issue the command:
Sample; all $
Alternatively, he can include a portion of the data set—for example, the first 100
observations with the following command:
Sample; 1-100 $
He may prefer to sample only particular segments of the data set. For example,
he can select the first 100 observations and observations 300 through 400 with the
following command:
Sample; 1-100,300-400 $
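The range notation in these commands can be read as a list of inclusive, 1-based observation numbers. The following Python sketch shows that reading; it is an illustration only, and the function name sample_rows and the assumed data set size of 500 are not from LIMDEP.

```python
def sample_rows(spec, n_obs):
    """Expand a range list such as '1-100,300-400' into 1-based observation numbers."""
    chosen = []
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        last = last or first  # a single number selects one observation
        # Ranges are inclusive and cannot run past the end of the data set
        chosen.extend(range(int(first), min(int(last), n_obs) + 1))
    return chosen

rows = sample_rows("1-100,300-400", n_obs=500)
print(len(rows))  # 100 observations plus 101 observations
```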
Draw; N = 486 $
If the researcher wishes to randomly sample with replacement, he can use the
DRAW command with the Rep option. Suppose he wishes to randomly sample 1998
cases with replacement. He merely enters
Reject; adult = 0 $
Reject; age > 20 $
Include; adult = 1 $
Include; age > 20 $
By doing so, he would subset out of the whole sample only those adults for analysis.
The command must be issued before any of the statistical procedure commands for
which it is to hold.
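The effect of the Reject and Include commands above can be pictured as filters on the list of observations in the current sample. The Python sketch below assumes that Reject removes matching cases from the current sample and that Include adds matching cases from the full data set to it; that reading, the function names, and the four student records are assumptions made for illustration, not LIMDEP code.

```python
def reject(sample, data, condition):
    """Drop from the current sample every case that satisfies the condition."""
    return [i for i in sample if not condition(data[i])]

def include(sample, data, condition):
    """Add to the current sample every case in the data that satisfies the condition."""
    kept = set(sample)
    return [i for i in range(len(data)) if i in kept or condition(data[i])]

# Hypothetical records
students = [
    {"age": 15, "adult": 0},
    {"age": 25, "adult": 1},
    {"age": 40, "adult": 1},
    {"age": 12, "adult": 0},
]
everyone = list(range(len(students)))

adults = reject(everyone, students, lambda s: s["adult"] == 0)
print(adults)  # positions of the cases that remain in the sample
```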
Whenever the researcher might need to sort the data by a variable, say date, he
can employ the SORT command. The variable by which the data are to be sorted in
ascending order is the variable associated with the left hand side, the LHS. The variables
that are to be sorted accordingly are associated with the right hand side, the RHS
subcommand.
Sort; LHS = date; RHS = list of variables to be sorted $
Each variable in the RHS list is separated from the next by a comma, and the command
is terminated by a $.
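The idea of sorting by one variable while carrying the other variables along can be sketched in Python as follows. This is an illustration of the logic only, not LIMDEP code; the function name sort_by and the sample values are hypothetical.

```python
def sort_by(key, others):
    """Sort the key variable ascending and reorder the other variables to match."""
    order = sorted(range(len(key)), key=lambda i: key[i])
    sorted_key = [key[i] for i in order]
    sorted_others = {name: [col[i] for i in order] for name, col in others.items()}
    return sorted_key, sorted_others

# Hypothetical data: date is the LHS variable, gpa and age ride along as RHS variables
date = [2003, 2001, 2002]
carried = {"gpa": [3.1, 2.5, 3.8], "age": [21, 19, 20]}
print(sort_by(date, carried))
```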
Appending Data
Observations can be appended to the data set with the append command. Suppose
that the data to be appended is saved in an ASCII data file, called ADD.DAT.
The general command syntax is
Append; File=filename
; Nvar = number of variables
; Nobs = number of observations
[ ; format = Fortran format, WKS, or binary]
[; Names = names or list of names]
[; by variables ] $
Suppose we have only two variables, the ID variable and the AGE variable. If
we have only three observations but wish to add four more from ADD.DAT, then our
command syntax would be
Figure 13 Appending Observations to a Data Set
Two data sets can be merged by a unique identifying variable. Although not
displayed in the program, both data sets must be presorted by that unique identifying
variable so that their case sequences are identical. The Read command is used to add
variables (columns) from an external data set to your current data set.
Let the identifier be the id variable. After both data sets have been sorted so that
they have the same case sequence, they can be merged with the addition of a second Read
command which reads the variables to be matched with the current sample.
An example of such a program reads two variables from the addvar.dat data set into
the current one.
The output from this merge is displayed.
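The requirement that both data sets share the same case sequence before columns are added can be sketched in Python as follows. This illustrates the matching logic only and is not LIMDEP code; the function name merge_columns, the variable names, and the records are hypothetical.

```python
def merge_columns(current, extra, key="id"):
    """Add the variables in `extra` to `current`, matching cases on a unique key."""
    # Presort both data sets by the identifier so their case sequences line up
    current = sorted(current, key=lambda row: row[key])
    extra = sorted(extra, key=lambda row: row[key])
    if [row[key] for row in current] != [row[key] for row in extra]:
        raise ValueError("case sequences differ; the data sets cannot be merged")
    return [{**a, **b} for a, b in zip(current, extra)]

# Hypothetical data sets sharing the id variable
main = [{"id": 2, "age": 20}, {"id": 1, "age": 19}]
added = [{"id": 1, "gpa": 3.2, "ses": 45}, {"id": 2, "gpa": 2.9, "ses": 38}]
print(merge_columns(main, added))
```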
LIMDEP can also write ASCII or Excel files to a disk. The Write command is
very similar to the Read command. You merely have to give it the Write command, list
the variables you want written to the ASCII file, and provide a file name. For example,
produces the output on the next page. In this case, the data are written to an external
ASCII file, called new.dat. When new.dat is opened in the lower panel of Figure 15,
the analyst can see what was produced. These data can then be used by other programs
or other procedures within LIMDEP.
Figure 15 Writing Data to an ASCII output file
To write an Excel file, the user would merely use a format = xls option along with the
write command.
LIMDEP is not case sensitive. However, names in LIMDEP must begin with a
letter. The names should be 8 characters or fewer in length. To construct the names, the
ASCII character set can be used with the Windows version of LIMDEP. The names
should be constructed from the underscore ‘_’, the 26 letters and 10 digits. The use of
other punctuation can cause unexpected consequences.
Transformations
With rules of nomenclature in mind, the programmer can construct new variables
and transform other variables with the CREATE command. LIMDEP has a wide variety
of functions that can be used to create the new variables. A list of variables can be
constructed with one create command.
Examples of the Create command are:
For example, to construct a minor-adult variable from age, the following Create
command can be employed, if there are no missing values:
Recode; Oldvar;
1,2 = -1;
3 = 0;
4,5 = 1 $
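The Recode command above maps the old values 1 and 2 to -1, 3 to 0, and 4 and 5 to 1. A minimal Python sketch of the same mapping follows; it is not LIMDEP code, and leaving unlisted values unchanged is an assumption made here for illustration.

```python
# Mapping taken from the Recode example: 1,2 -> -1; 3 -> 0; 4,5 -> 1
RULES = {1: -1, 2: -1, 3: 0, 4: 1, 5: 1}

def recode(values, rules):
    """Replace each value by its recoded value; unlisted values are kept (assumption)."""
    return [rules.get(v, v) for v in values]

oldvar = [1, 2, 3, 4, 5]
print(recode(oldvar, RULES))
```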
Variable transformations include the deletion of variables that are no longer used
with the DELETE command.
Variables may also be transformed with the RENAME command. The syntax
format for this command is
By creating, recoding, deleting, renaming, and sorting, existing variables may be
transformed for further analysis.
Missing Values
Missing values must be recoded to -999, which is the LIMDEP internal missing
value code. If other variables use other missing value codes, those codes can be recoded
to -999 and thereby treated as missing. When a value in a particular variable has been
coded as -999, that value will be treated as missing.
This situation could engender a pairwise missing condition unless steps are taken
to skip cases which have missing values. To drop cases that have values coded as -999,
LIMDEP offers the SKIP command.
In this way, that case would be deleted from the statistical analysis.
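Dropping every case that contains the missing value code amounts to listwise deletion. The Python sketch below illustrates that idea; it is not LIMDEP code, and the function name skip_missing and the three records are made up for the example.

```python
MISSING = -999  # the LIMDEP internal missing value code described above

def skip_missing(rows):
    """Keep only the cases in which no variable carries the missing value code."""
    return [row for row in rows if MISSING not in row.values()]

# Hypothetical cases; the second one has a missing income
cases = [
    {"id": 1, "gpa": 3.2, "income": 52000},
    {"id": 2, "gpa": 2.9, "income": MISSING},
    {"id": 3, "gpa": 3.6, "income": 61000},
]
print([row["id"] for row in skip_missing(cases)])
```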
Calculators
Figure 15 Finding the Calculators
The scalar calculator can perform arithmetic calculations. When the user clicks on the
scalar calculator, a pop-down window appears:
To multiply 433 by 34, the user enters the multiplication in the expression box
and hits the enter key. The result appears in the output window below the expression
box.
LIMDEP has its own matrix language and matrix calculator. To define a matrix, the
user selects the matrix calculator from Figure 15. Once the matrix calculator window
pops down, the analyst first defines matrix a as a = [1, 2/4, 9]. Then he defines matrix z
as the inverse of matrix a with the command z = Ginv(a). Matrix a is then squared, and
finally, the Moore-Penrose generalized inverse of a is computed with the MPINV =
G2nv(a) command. The analyst also wishes to compute C = a’a. He enters the commands
in the Expression window of the matrix calculator.
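For readers who want to check these matrix results by hand, here is a small Python sketch. It assumes that the slash in a = [1, 2/4, 9] separates rows, so that a is the 2 x 2 matrix with rows (1, 2) and (4, 9); the helper names are hypothetical and this is not LIMDEP code. Because this a is square and nonsingular, its Moore-Penrose generalized inverse coincides with its ordinary inverse.

```python
from fractions import Fraction

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*x)]

def inverse_2x2(m):
    """Invert a 2 x 2 matrix exactly using its determinant."""
    (p, q), (r, s) = m
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(s, det), Fraction(-q, det)],
            [Fraction(-r, det), Fraction(p, det)]]

a = [[1, 2], [4, 9]]          # assumed reading of a = [1, 2/4, 9]
z = inverse_2x2(a)            # inverse of a
a_squared = matmul(a, a)      # a times a
c = matmul(transpose(a), a)   # C = a'a
print(z, a_squared, c)
```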
Figure 17 Entering a Matrix Computation
After the user hits the enter key, a matrix is created and kept for further computation.
The lower panel lists the matrices created with the commands in Figure 17.
The extensive matrix language makes LIMDEP a powerful device for custom-building
one’s own statistical processes.
A simple example is given of how to run a very basic regression model. First, a
classical ordinary least squares model is run. Second, a robust regression model with
White’s heteroskedasticity-consistent variance estimator is run. The dependent variable
is a continuous variable, Grade Point Average (GPA). The independent variables
include socioeconomic status (SES), income, sex, and age. The program syntax and
output for the ordinary least squares regression procedure are given in Figure 20.
Figure 20 Classical Regression Program Syntax
Figure 21 Robust Regression Output
The robust regression model explains 53.4 percent of the variance after adjustment for
degrees of freedom, that is, for the number of variables in the model. Because this model
is corrected for heteroskedasticity, heteroskedasticity should not be a problem.
Nonetheless, there are three outliers in a sample of 30. Depending on their influence,
these outliers could distort the mean. The small sample size could contribute to making
those outliers influential enough to be problematic. The researcher would be well advised
to obtain a larger sample for his analysis.
Limitations
LIMDEP is a specialized econometrics statistical package and not a general
purpose statistical package. Researchers using this package are interested in models of
limited dependent variables and panel data analysis. The data management features are
somewhat limited. More specifically, the management of titles, value labels, data types,
missing value codes, graphs, and general purpose or overall nonparametric statistical
techniques is less than complete.
LIMDEP has limited labeling capability. Value labels are also not widely or
easily applicable to discrete variables. This might seem unusual in a package designed to
permit the analysis of limited or discrete variables. Indeed, there are no provisions for
general value labels or formats for discrete variables.
To be sure, some data types are not accommodated. LIMDEP does not handle
string or integer variables. It does not manipulate text or labeled data. String or integer
level variables must be converted to numeric or exponential data types for LIMDEP to
process them properly. The user can import ASCII, formatted ASCII, DIF, WKS,
Binary, and Excel (XLS) files (the latter can be copied and pasted) into LIMDEP.
Alternatively, one can use DBMSCOPY or Stat/Transfer to convert files into LIMDEP.
Missing value processing is uniform. There is just one basic missing value code,
-999, and other codes need to be recoded to that value in order to avoid errors.
Principal Advantages
LIMDEP programs count data models that include a family of Poisson regression
models with normal heterogeneity, with underreporting, with excess zeroes, with group
effects and sample selection FIML estimation. The negative binomial models can
accommodate random effects with normal group effects.
For the discrete dependent variables, LIMDEP allows the user to run tobit, logit,
and probit models. These models could handle random effects and heteroskedasticity.
LIMDEP also handles extreme value and multinomial probit models.
Not only does it handle linear regression models, LIMDEP handles nonlinear
regression models as well. It also runs nonparametric, semi-parametric, and parametric
survival models. The latter family of models could handle gamma heterogeneity.
The matrix language provides the advanced user with considerable capability to
custom build his analysis.
Once the LIMDEP program has been written to the Command window, the
programmer can select it and run it. To select the commands to be run, the analyst can
click Edit in the header bar. A drop-down menu will appear.
When the user clicks on Select All, the whole program becomes selected and appears to
be highlighted or blackened, as shown in Figure 23 below.
Figure 23 Selecting All of the Program
Once the program has been selected, the user can click on the green Go icon in the header
bar of Figure 23 to submit the program. The program listing, warnings, and error
diagnostics will appear in the Trace Window. Once the errors have been corrected, the
output will appear in the Output Window, as shown in Figure 5 or Figure 21.
Exiting LIMDEP
The user should save his files and then go to the File option in the header bar. He clicks
on File and then clicks on Exit.