
Data Processing with Stata 14.1 Cheat Sheet
For more info see Stata's reference manual (stata.com)

Basic Syntax

All Stata commands have the same format (syntax):

[by varlist1:] command [varlist2] [=exp] [if exp] [in range] [weight] [using filename] [, options]

    by varlist1:     apply the command across each unique combination of variables in varlist1
    command          function: what are you going to do to varlists?
    varlist2         column(s) to apply the command to
    =exp             save output as a new variable
    if exp           condition: only apply the function if something is true
    in range         apply the function to specific rows
    weight           apply weights
    using filename   pull data from a file (if not loaded)
    , options        special options for command

bysort rep78 : summarize price if foreign == 0 & price <= 9000, detail
    In this example, we want a detailed summary with stats like kurtosis, plus mean and median.

To find out more about any command, such as what options it takes, type help command.

Useful Shortcuts

KEYBOARD BUTTONS
F2
    describe data
Ctrl + 8
    open the data editor
Ctrl + 9
    open a new .do file
Ctrl + D
    highlight text in .do file, then Ctrl + D executes it in the command line
clear
    delete data in memory

AT COMMAND PROMPT
PgUp PgDn
    scroll through previous commands
Tab
    autocompletes variable name after typing part
cls
    clear the console (where results are displayed)

Basic Data Operations

Arithmetic
    +    add (numbers), combine (strings)
    -    subtract
    *    multiply
    /    divide
    ^    raise to a power

Logic
    &          and
    ! or ~     not
    |          or
    ==         equal
    != or ~=   not equal
    <          less than
    <=         less than or equal to
    >          greater than
    >=         greater or equal to

== tests if something is equal
=  assigns a value to a variable

Example data:
    make            foreign   price
    Chevy Colt      0          3,984
    Buick Riviera   0         10,372
    Honda Civic     1          4,499
    Volvo 260       1         11,995

if foreign != 1 & price >= 10000
    selects Buick Riviera
if foreign != 1 | price >= 10000
    selects Chevy Colt, Buick Riviera and Volvo 260

Change Data Types

Stata has 6 data types, and data can also be missing:
    no data      missing
    true/false   byte
    words        string
    numbers      int long float double

To convert between numbers & strings:
gen foreignString = string(foreign)
    1 becomes "1"
tostring foreign, gen(foreignString)
    1 becomes "1"
decode foreign, gen(foreignString)
    1 becomes "foreign" (the value label text)
gen foreignNumeric = real(foreignString)
    "1" becomes 1
destring foreignString, gen(foreignNumeric)
    "1" becomes 1
encode foreignString, gen(foreignNumeric)
    "foreign" becomes a labeled numeric code
recast double mpg
    generic way to convert between types

Set up

pwd
    print current (working) directory
cd "C:\Program Files (x86)\Stata13"
    change working directory
dir
    display filenames in working directory
fs *.dta
    list all Stata files in working directory
capture log close
    close the log on any existing do files (capture can be abbreviated to cap)
log using "myDoFile.do", replace
    create a new log file to record your work and results
search mdesc
    find the package mdesc to install
ssc install mdesc
    install the package mdesc; needs to be done once
    (packages contain extra commands that expand Stata's toolkit)

Import Data

sysuse auto, clear
    load system data (Auto data); for many examples, we use the auto dataset
use "yourStataFile.dta", clear
    load a dataset from the current directory
import excel "yourSpreadsheet.xlsx", /*
*/ sheet("Sheet1") cellrange(A2:H11) firstrow
    import an Excel spreadsheet
import delimited "yourFile.csv", /*
*/ rowrange(2:11) colrange(1:8) varnames(2)
    import a .csv file
webuse set "https://github.com/GeoCenter/StataTraining/raw/master/Day2/Data"
webuse "wb_indicators_long"
    set web-based directory and load data from the web

Explore Data

VIEW DATA ORGANIZATION
describe make price
    display variable type, format, and any value/variable labels
count
count if price > 5000
    number of rows (observations); can be combined with logic
ds, has(type string)
lookfor "in."
    search for variable types, variable name, or variable label
isid mpg
    check if mpg uniquely identifies the data

SEE DATA DISTRIBUTION
codebook make price
    overview of variable type, stats, number of missing/unique values
summarize make price mpg
    print summary statistics (mean, stdev, min, max) for variables
inspect mpg
    show histogram of data, number of missing or zero observations
histogram mpg, frequency
    plot a histogram of the distribution of a variable

BROWSE OBSERVATIONS WITHIN THE DATA
browse or Ctrl + 8
    open the data editor
list make price if price > 10000 & price < .
clist ... (compact form)
    list the make and price for observations with price > $10,000
    Missing values are treated as the largest positive number. To exclude missing values, ask whether the value is less than "."
display price[4]
    display the 4th observation in price; only works on single values
gsort price mpg      (ascending)
gsort -price -mpg    (descending)
    sort in order, first by price then miles per gallon
duplicates report
    finds all duplicate values in each variable
levelsof rep78
    display the unique values for rep78

Summarize Data

tabulate rep78, mi gen(repairRecord)
    one-way table: number of rows with each value of rep78
    mi: include missing values
    gen(repairRecord): create binary variable for every rep78 value in a new variable, repairRecord
tabulate rep78 foreign, mi
    two-way table: cross-tabulate number of observations for each combination of rep78 and foreign
bysort rep78: tabulate foreign
    for each value of rep78, apply the command tabulate foreign
tabstat price weight mpg, by(foreign) stat(mean sd n)
    create compact table of summary statistics
table foreign, contents(mean price sd price) f(%9.2fc) row
    create a flexible table of summary statistics
    f(%9.2fc): formats numbers
    row: displays stats for all data
collapse (mean) price (max) mpg, by(foreign)
    calculate mean price & max mpg by car type (foreign); replaces data

Create New Variables

generate mpgSq = mpg^2
gen byte lowPr = price < 4000
    create a new variable. Useful also for creating binary variables based on a condition (generate byte)
generate id = _n
bysort rep78: gen repairIdx = _n
    _n creates a running index of observations in a group
generate totRows = _N
bysort rep78: gen repairTot = _N
    _N creates a total count of observations (per group)
pctile mpgQuartile = mpg, nq(4)
    create quartiles of the mpg data
egen meanPrice = mean(price), by(foreign)
    calculate mean price for each group in foreign; see help egen for more options
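The notes above on missing values, _n, _N and egen can be tried together on the auto dataset. The following is a minimal sketch rather than part of the original sheet; the variable names repairIdx, repairTot and meanPrice follow the sheet's examples, and sorting by price within rep78 is an assumption made only so the running index has a defined order.

```stata
sysuse auto, clear

* Missing values sort above every number, so "rep78 > 3" alone also counts missing rows
count if rep78 > 3
count if rep78 > 3 & rep78 < .

* Running index and group size within each repair record
bysort rep78 (price): generate repairIdx = _n
by rep78: generate repairTot = _N

* Group mean attached to every row (collapse would replace the data instead)
egen meanPrice = mean(price), by(foreign)
list make price meanPrice in 1/5
```

Comparing the two count results shows how many observations have a missing rep78.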
Tim Essam (tessam@usaid.gov) Laura Hughes (lhughes@usaid.gov) inspired by RStudio's awesome Cheat Sheets (rstudio.com/resources/cheatsheets) geocenter.github.io/StataTraining updated January 2016
Disclaimer: we are not affiliated with Stata. But we like it. CC BY NC
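As a short illustration of the Basic Syntax pattern and the type conversions on the Data Processing sheet, the sketch below runs them on the auto dataset. It is an example under assumptions, not the authors' code: the names rep78Str, rep78Num and foreignText are made up for this illustration.

```stata
sysuse auto, clear

* [by varlist:] command [varlist] [if exp] [, options]
bysort foreign: summarize price if price >= 10000 & !missing(rep78), detail

* number to string and back again; the round trip should return the same values
tostring rep78, generate(rep78Str)
destring rep78Str, generate(rep78Num)
assert rep78Num == rep78

* decode stores the value label text rather than the underlying number
decode foreign, generate(foreignText)
tabulate foreignText
```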
Data Transformation with Stata 14.1 Cheat Sheet
For more info see Stata's reference manual (stata.com)

Select Parts of Data (Subsetting)

SELECT SPECIFIC COLUMNS
drop make
    remove the 'make' variable
keep make price
    opposite of drop; keep only columns 'make' and 'price'

FILTER SPECIFIC ROWS
drop if mpg < 20
drop in 1/4
    drop observations based on a condition (first command) or rows 1-4 (second command)
keep in 1/30
    opposite of drop; keep only rows 1-30
keep if inrange(price, 5000, 10000)
    keep values of price between $5,000 and $10,000 (inclusive)
keep if inlist(make, "Honda Accord", "Honda Civic", "Subaru")
    keep the specified values of make
sample 25
    sample 25% of the observations in the dataset
    (use set seed # command for reproducible sampling)

Reshape Data

webuse set https://github.com/GeoCenter/StataTraining/raw/master/Day2/Data
webuse "coffeeMaize.dta"
    load demo dataset

MELT DATA (WIDE -> LONG)
reshape long coffee@ maize@, i(country) j(year)
    convert a wide dataset to long
    coffee@ maize@: reshape variables starting with coffee and maize
    i(country): unique id variable (key)
    j(year): create new variable which captures the info in the column names

CAST DATA (LONG -> WIDE)
reshape wide coffee maize, i(country) j(year)
    convert a long dataset to wide
    coffee maize: create new variables named coffee2011, maize2012...
    i(country): what will be the unique id variable (key)
    j(year): create new variables with the year added to the column name

WIDE
    country   coffee2011   coffee2012   maize2011   maize2012
    Malawi
    Rwanda
    Uganda

LONG (TIDY)
    country   year   coffee   maize
    Malawi    2011
    Malawi    2012
    Rwanda    2011
    Rwanda    2012
    Uganda    2011
    Uganda    2012

TIDY DATASETS have each observation in its own row and each variable in its own column.
When datasets are tidy, they have a consistent, standard format that is easier to manipulate and analyze.

xpose, clear varname
    transpose rows and columns of data, clearing the data and saving old column names as a new variable called "_varname"

Manipulate Strings

GET STRING PROPERTIES
display length("This string has 29 characters")
    return the length of the string
charlist make    (* user-defined package)
    display the set of unique characters within a string
display strpos("Stata", "a")
    return the position in Stata where a is first found

FIND MATCHING STRINGS
display strmatch("123.89", "1??.?9")
    return true (1) or false (0) if string matches pattern
display substr("Stata", 3, 5)
    return the substring starting at character 3 (up to 5 characters long)
list make if regexm(make, "[0-9]")
    list observations where make matches the regular expression (here, records that contain a number)
list if regexm(make, "(Cad.|Chev.|Datsun)")
    return all observations where make contains "Cad.", "Chev." or "Datsun"
list if inlist(word(make, 1), "Cad.", "Chev.", "Datsun")
    compare the given list against the first word in make;
    return all observations where the first word of the make variable matches one of the listed words
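The string functions listed above can be checked directly with assert, which stops with an error if an expression is false. This is a small sketch, not from the original sheet; the literal "Datsun 210" is just an illustrative value.

```stata
* string properties
assert length("This string has 29 characters") == 29
assert strpos("Stata", "a") == 3

* substr(s, start, length): start at character 3, take up to 5 characters
assert substr("Stata", 3, 5) == "ata"

* ? matches exactly one character in strmatch patterns
assert strmatch("123.89", "1??.?9") == 1

* regexm returns 1 when the regular expression matches anywhere in the string
assert regexm("Datsun 210", "[0-9]") == 1
assert word("Datsun 210", 1) == "Datsun"
```

If every line runs silently, each function behaves as described.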
TRANSFORM STRINGS
display regexr("My string", "My", "Your")
    replace string1 ("My") with string2 ("Your")
replace make = subinstr(make, "Cad.", "Cadillac", 1)
    replace first occurrence of "Cad." with Cadillac in the make variable
display stritrim(" Too much Space")
    replace consecutive spaces with a single space
display trim(" leading / trailing spaces ")
    remove extra spaces before and after a string
display strlower("STATA should not be ALL-CAPS")
    change string case; see also strupper, strproper
display strtoname("1Var name")
    convert string to Stata-compatible variable name
display real("100")
    convert string to a numeric or missing value

Replace Parts of Data

CHANGE COLUMN NAMES
rename (rep78 foreign) (repairRecord carType)
    rename one or multiple variables

CHANGE ROW VALUES
replace price = 5000 if price < 5000
    replace all values of price that are less than $5,000 with 5000
recode price (0 / 5000 = 5000)
    change all prices less than 5000 to be $5,000
recode foreign (0 = 2 "US")(1 = 1 "Not US"), gen(foreign2)
    change the values and value labels then store in a new variable, foreign2

REPLACE MISSING VALUES
mvdecode _all, mv(9999)
    replace the number 9999 with missing value in all variables (useful for cleaning survey datasets)
mvencode _all, mv(9999)
    replace missing values with the number 9999 for all variables (useful for exporting data)

Label Data

Value labels map string descriptions to numbers. They allow the underlying data to be numeric (making logical tests simpler) while also connecting the values to human-understandable text.
label define myLabel 0 "US" 1 "Not US"
label values foreign myLabel
    define a label and apply it to the values in foreign
label list
    list all labels within the dataset

Combine Data

ADDING (APPENDING) NEW DATA
webuse coffeeMaize2.dta, clear
save coffeeMaize2.dta, replace
webuse coffeeMaize.dta, clear
    load demo data
append using "coffeeMaize2.dta", gen(filenum)
    add observations from "coffeeMaize2.dta" to current data and create variable "filenum" to track the origin of each observation
    (both datasets should contain the same variables (columns), e.g. id, blue, pink)

MERGING TWO DATASETS TOGETHER
    both datasets must contain a common variable (id)

ONE-TO-ONE
webuse ind_age.dta, clear
save ind_age.dta, replace
webuse ind_ag.dta, clear
merge 1:1 id using "ind_age.dta"
    one-to-one merge of "ind_age.dta" into the loaded dataset and create variable "_merge" to track the origin

MANY-TO-ONE
webuse hh2.dta, clear
save hh2.dta, replace
webuse ind2.dta, clear
merge m:1 hid using "hh2.dta"
    many-to-one merge of "hh2.dta" into the loaded dataset and create variable "_merge" to track the origin

_merge code
    1 (master)   row only in ind2
    2 (using)    row only in hh2
    3 (match)    row in both

FUZZY MATCHING: COMBINING TWO DATASETS WITHOUT A COMMON ID
reclink        (ssc install reclink)
    match records from different data sets using probabilistic matching
jarowinkler    (ssc install jarowinkler)
    create distance measure for similarity between two strings

Save & Export Data

save "myData.dta", replace
saveold "myData.dta", replace version(12)    (Stata 12-compatible file)
    save data in Stata format, replacing the data if a file with same name exists
export excel "myData.xls", /*
*/ firstrow(variables) replace
    export data as an Excel file (.xls) with the variable names as the first row
export delimited "myData.csv", delimiter(",") replace
    export data as a comma-delimited file (.csv)
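To see how the _merge codes from the many-to-one merge arise, the sketch below builds two tiny datasets by hand instead of downloading the demo files. All values and the variable region are hypothetical and exist only for this illustration; the tempfile macro assumes the lines are run together from a do-file.

```stata
* household-level data (hypothetical values), saved to a temporary file
clear
input hid str5 region
1 "north"
2 "south"
3 "east"
end
tempfile households
save `households'

* individual-level data: several individuals can share one household id
clear
input id hid age
1 1 34
2 1 8
3 2 51
4 4 27
end

merge m:1 hid using `households'
tabulate _merge

* hid 4 has no household record (master only); hid 3 has no individuals (using only)
assert _merge == 1 if hid == 4
assert _merge == 2 if hid == 3
assert _merge == 3 if inlist(hid, 1, 2)
```

Unmatched rows from either side are kept by default, which is why the merged data has five observations rather than four.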
Tim Essam (tessam@usaid.gov) Laura Hughes (lhughes@usaid.gov) inspired by RStudio's awesome Cheat Sheets (rstudio.com/resources/cheatsheets) geocenter.github.io/StataTraining updated March 2016
Data Analysis with Stata 14.1 Cheat Sheet
For more info see Stata's reference manual (stata.com)

Results are stored as either r-class or e-class. See Programming Cheat Sheet.

Summarize Data

Examples use auto.dta (sysuse auto, clear) unless otherwise noted

univar price mpg, boxplot    (ssc install univar)
    calculate univariate summary, with box-and-whiskers plot
stem mpg
    return stem-and-leaf display of mpg
summarize price mpg, detail
    calculate a variety of univariate summary statistics
ci mpg price, level(99)
    compute standard errors and confidence intervals
correlate mpg price
    return correlation or covariance matrix
pwcorr price mpg weight, star(0.05)
    return all pairwise correlation coefficients with sig. levels
mean price mpg
    estimates of means, including standard errors
proportion rep78 foreign
    estimates of proportions, including standard errors for categories identified in varlist
ratio
    estimates of ratio, including standard errors
total price
    estimates of totals, including standard errors

Declare Data

By declaring data type, you enable Stata to apply data munging and analysis functions specific to certain data types.

TIME SERIES (webuse sunspot, clear)
tsset time, yearly
    declare sunspot data to be yearly time series
tsreport
    report time series aspects of a dataset
generate lag_spot = L1.spot
    create a new variable of annual lags of sun spots
tsline spot
    plot time series of sunspots
    [Figure: tsline plot, number of sunspots by year]
arima spot, ar(1/2)
    estimate an auto-regressive model with 2 lags

TIME SERIES OPERATORS
    L.    lag                            x(t-1)
    L2.   2-period lag                   x(t-2)
    F.    lead                           x(t+1)
    F2.   2-period lead                  x(t+2)
    D.    difference                     x(t) - x(t-1)
    D2.   difference of difference       x(t) - x(t-1) - (x(t-1) - x(t-2))
    S.    seasonal difference            x(t) - x(t-1)
    S2.   lag-2 (seasonal difference)    x(t) - x(t-2)

USEFUL ADD-INS
tscollap
    compact time series into means, sums and end-of-period values
carryforward
    carry non-missing values forward from one obs. to the next
tsspell
    identify spells or runs in time series

SURVIVAL ANALYSIS (webuse drugtr, clear)
stset studytime, failure(died)
    declare data to be survival-time data
stsum
    summarize survival-time data
stcox drug age
    estimate a Cox proportional hazard model

PANEL / LONGITUDINAL (webuse nlswork, clear)
xtset id year
    declare national longitudinal data to be a panel
xtdescribe
    report panel aspects of a dataset
xtsum hours
    summarize hours worked, decomposing standard deviation into between and within components
xtline ln_wage if id <= 22, tlabel(#3)
    plot panel data as a line plot
    [Figure: xtline plot, wage relative to inflation, one panel per id]
xtreg ln_w c.age##c.age ttl_exp, fe vce(robust)
    estimate a fixed-effects model with robust standard errors

SURVEY DATA (webuse nhanes2b, clear)
svyset psuid [pweight = finalwgt], strata(stratid)
    declare survey design for a dataset
svydescribe
    report survey data details
svy: mean age, over(sex)
    estimate a population mean for each subpopulation
svy, subpop(rural): mean age
    estimate a population mean for rural areas
svy: tabulate sex heartatk
    report two-way table with tests of independence
svy: reg zinc c.age##c.age female weight rural
    estimate a regression using survey weights
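The time-series operators listed above can be verified on a toy series once the data are declared with tsset. This is a minimal sketch with made-up numbers; the variable names year, y, lagY, leadY, diffY and diff2Y are hypothetical.

```stata
clear
input year y
2001 10
2002 13
2003 19
2004 18
end
tsset year, yearly

generate lagY   = L.y
generate leadY  = F.y
generate diffY  = D.y
generate diff2Y = D2.y

* D.y is y minus its lag; D2.y is the difference of the differences
assert diffY == y - lagY if !missing(lagY)
assert diff2Y == diffY - L.diffY if !missing(L.diffY)
list
```

The first observation has no lag, so lagY and diffY are missing there, and diff2Y is missing for the first two observations.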

Statistical Tests

tabulate foreign rep78, chi2 exact expected
    tabulate foreign and repair record and return chi2 and Fisher's exact statistic alongside the expected values
ttest mpg, by(foreign)
    estimate t test on equality of means for mpg by foreign
prtest foreign == 0.5
    one-sample test of proportions
ksmirnov mpg, by(foreign) exact
    Kolmogorov-Smirnov equality-of-distributions test
ranksum mpg, by(foreign) exact
    equality tests on unmatched data (independent samples)
anova systolic drug    (webuse systolic, clear)
    analysis of variance and covariance
pwmean mpg, over(rep78) pveffects mcompare(tukey)
    estimate pairwise comparisons of means with equal variances, including multiple comparison adjustment

1 Estimate Models (stores results as e-class)

regress price mpg weight, robust
    estimate ordinary least squares (OLS) model of price on mpg and weight, apply robust standard errors
regress price mpg weight if foreign == 0, cluster(rep78)
    regress price only on domestic cars, cluster standard errors
rreg price mpg weight, genwt(reg_wt)
    estimate robust regression to eliminate outliers
probit foreign turn price, vce(robust)
    estimate probit regression with robust standard errors
logit foreign headroom mpg, or
    estimate logistic regression and report odds ratios
bootstrap, reps(100): regress mpg /*
*/ weight gear foreign
    estimate regression with bootstrapping
jackknife r(mean), double: sum mpg
    jackknife standard error of sample mean

ADDITIONAL MODELS
    pca                 principal components analysis (built-in Stata command)
    factor              factor analysis
    poisson, nbreg      count outcomes
    tobit               censored data
    ivregress, ivreg2   instrumental variables (ivreg2 is user-written: ssc install ivreg2)
    diff                difference-in-difference (user-written)
    rd                  regression discontinuity (user-written)
    xtabond, xtabond2   dynamic panel estimator (xtabond2 is user-written)
    psmatch2            propensity score matching (user-written)
    synth               synthetic control analysis (user-written)
    oaxaca              Blinder-Oaxaca decomposition (user-written)

2 Diagnostics (not appropriate with robust standard errors)

estat hettest
    test for heteroskedasticity
ovtest
    test for omitted variable bias
vif
    report variance inflation factor
dfbeta(length)
    calculate measure of influence
rvfplot, yline(0)
    plot residuals against fitted values
avplots
    plot all partial-regression leverage plots in one graph
Type help regress postestimation plots for additional diagnostic plots.

3 Postestimation (commands that use a fitted model)

regress price headroom length
    used in all postestimation examples
display _b[length]
display _se[length]
    return coefficient estimate or standard error for length from most recent regression model
margins, dydx(length)
    return the estimated marginal effect for length
    (returns e-class information when post option is used)
margins, eyex(length)
    return the estimated elasticity of price with respect to length
predict yhat if e(sample)
    create predictions for sample on which model was fit
predict double resid, residuals
    calculate residuals based on last fit model
test headroom = 0
    test linear hypothesis that the headroom estimate equals zero
lincom headroom - length
    test linear combination of estimates (headroom = length)

Estimation with Categorical & Factor Variables
more details at http://www.stata.com/manuals14/u25.pdf

CONTINUOUS VARIABLES    measure something
CATEGORICAL VARIABLES   identify a group to which an observation belongs
INDICATOR VARIABLES     denote whether something is true or false (T / F)

OPERATOR   DESCRIPTION / EXAMPLE
i.         specify indicators
           regress price i.rep78
               specify rep78 variable to be an indicator variable
ib.        specify base indicator
           regress price ib(3).rep78
               set the third category of rep78 to be the base category
fvset      command to change base
           fvset base frequent rep78
               set the base to most frequently occurring category for rep78
c.         treat variable as continuous
           regress price i.foreign#c.mpg i.foreign
               treat mpg as a continuous variable and specify an interaction between foreign and mpg
o.         omit a variable or indicator
           regress price io(2).rep78
               set rep78 as an indicator; omit observations with rep78 == 2
#          specify interactions
           regress price mpg c.mpg#c.mpg
               create a squared mpg term to be used in regression
##         specify factorial interactions
           regress price c.mpg##c.mpg
               create all possible interactions with mpg (mpg and mpg^2)
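The postestimation commands and factor-variable operators above fit together as in the following sketch on the auto dataset. It is an illustration under assumptions, not the authors' workflow; the variable names yhat and resid are arbitrary.

```stata
sysuse auto, clear
regress price headroom length

* stored coefficient, its standard error, and the t statistic computed by hand
display _b[length]
display _se[length]
display _b[length] / _se[length]

* fitted values plus residuals should reproduce the outcome
predict double yhat if e(sample)
predict double resid if e(sample), residuals
assert abs(price - yhat - resid) < 1e-6 if e(sample)

* hypothesis tests on the fitted coefficients
test headroom = 0
lincom headroom - length

* factor-variable notation: indicator for foreign plus a quadratic in mpg
regress price i.foreign c.mpg##c.mpg
```

The hand-computed t statistic should match the value shown for length in the regression table.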
Tim Essam (tessam@usaid.gov) Laura Hughes (lhughes@usaid.gov) inspired by RStudio's awesome Cheat Sheets (rstudio.com/resources/cheatsheets) geocenter.github.io/StataTraining updated March 2016
Data Visualization with Stata 14.1 Cheat Sheet
For more info see Stata's reference manual (stata.com)

BASIC PLOT SYNTAX

graph <plot type> y1 y2 ... yn x [in] [if], <plot options> by(var) xline(xint) yline(yint) text(y x "annotation")

    variables (y first)      y1 y2 ... yn x
    plot-specific options    <plot options>
    facet                    by(var)
    annotations              xline(xint) yline(yint) text(y x "annotation")
    titles                   title("title") subtitle("subtitle") xtitle("x-axis title") ytitle("y axis title")
    axes                     xscale(range(low high) log reverse off noline) yscale(<options>)
    custom appearance        <marker, line, text, axis, legend, background options> scheme(s1mono) play(customTheme)
    plot size                xsize(5) ysize(4)
    save                     saving("myPlot.gph", replace)

The option lists below show the main plot-specific options; see help for the complete set.

ONE VARIABLE (sysuse auto, clear)

CONTINUOUS
histogram mpg, width(5) freq kdensity kdenopts(bwidth(5))
    histogram
    options: bin(#) width(#) density fraction frequency percent addlabels addlabopts(<options>) normal normopts(<options>) kdensity kdenopts(<options>)
kdensity mpg, bwidth(3)
    smoothed histogram
    options: bwidth kernel(<options>) normal normopts(<line options>)

DISCRETE
graph bar (count), over(foreign, gap(*0.5)) intensity(*0.5)
    bar plot (graph hbar draws horizontal bar charts)
    options: (asis) (percent) (count) over(<variable>, <options: gap(*#) relabel descending reverse>) cw missing nofill allcategories percentages stack bargap(#) intensity(*#) yalternate xalternate
graph bar (percent), over(rep78) over(foreign)
    grouped bar plot (graph hbar ...)
    options: (asis) (percent) (count) over(<variable>, <options: gap(*#) relabel descending reverse>) cw missing nofill allcategories percentages stack bargap(#) intensity(*#) yalternate xalternate

DISCRETE X, CONTINUOUS Y
graph bar (median) price, over(foreign)
    bar plot (graph hbar ...)
    options: (asis) (percent) (count) (stat: mean median sum min max ...) over(<variable>, <options: gap(*#) relabel descending reverse sort(<variable>)>) cw missing nofill allcategories percentages stack bargap(#) intensity(*#) yalternate xalternate
graph dot (mean) length headroom, over(foreign) m(1, ms(S))
    dot plot
    options: (asis) (percent) (count) (stat: mean median sum min max ...) over(<variable>, <options: gap(*#) relabel descending reverse sort(<variable>)>) cw missing nofill allcategories percentages linegap(#) marker(#, <options>) linetype(dot | line | rectangle) dots(<options>) lines(<options>) rectangles(<options>) rwidth
graph hbox mpg, over(rep78, descending) by(foreign) missing
    box plot (graph box draws vertical boxplots)
    options: over(<variable>, <options: total gap(*#) relabel descending reverse sort(<variable>)>) missing allcategories intensity(*#) boxgap(#) medtype(line | cline | marker) medline(<options>) medmarker(<options>)
vioplot price, over(foreign)    (ssc install vioplot)
    violin plot
    options: over(<variable>, <options: total missing>) nofill vertical horizontal obs kernel(<options>) bwidth(#) barwidth(#) dscale(#) ygap(#) ogap(#) density(<options>) bar(<options>) median(<options>) obsopts(<options>)

TWO+ CONTINUOUS VARIABLES
graph matrix mpg price weight, half
    scatter plot of each combination of variables
    options: half jitter(#) jitterseed(#) diagonal [aweights(<variable>)]
twoway scatter mpg weight, jitter(7)
    scatter plot
    options: jitter(#) jitterseed(#) sort cmissing(yes | no) connect(<options>) [aweight(<variable>)]
twoway scatter mpg weight, mlabel(mpg)
    scatter plot with labelled values
    options: jitter(#) jitterseed(#) sort cmissing(yes | no) connect(<options>) [aweight(<variable>)]
twoway connected mpg price, sort(price)
    scatter plot with connected lines and symbols (see also line)
    options: jitter(#) jitterseed(#) sort connect(<options>) cmissing(yes | no)
twoway area mpg price, sort(price)
    line plot with area shading
    options: sort cmissing(yes | no) vertical, horizontal base(#)
twoway bar price rep78
    bar plot
    options: vertical, horizontal base(#) barwidth(#)
twoway dot mpg rep78
    dot plot
    options: vertical, horizontal base(#) ndots(#) dcolor(<color>) dfcolor(<color>) dlcolor(<color>) dsize(<markersize>) dsymbol(<marker type>) dlwidth(<strokesize>) dotextend(yes | no)
twoway dropline mpg price in 1/5
    dropped line plot
    options: vertical, horizontal base(#)
twoway rcapsym length headroom price
    range plot (y1 to y2) with capped lines (see also rcap)
    options: vertical horizontal
twoway rarea length headroom price, sort
    range plot (y1 to y2) with area shading
    options: vertical horizontal sort cmissing(yes | no)
twoway rbar length headroom price
    range plot (y1 to y2) with bars
    options: vertical horizontal barwidth(#) mwidth msize(<marker size>)
twoway pcspike wage68 ttl_exp68 wage88 ttl_exp88
    parallel coordinates plot (sysuse nlswide1)
    options: vertical, horizontal
twoway pccapsym wage68 ttl_exp68 wage88 ttl_exp88
    slope/bump plot (sysuse nlswide1)
    options: vertical horizontal headlabel

THREE VARIABLES
twoway contour mpg price weight, level(20) crule(intensity)
    3D contour plot
    options: ccuts(#s) levels(#) minmax crule(hue | chue | intensity) scolor(<color>) ecolor(<color>) ccolors(<colorlist>) heatmap interp(thinplatespline | shepard | none)
regress price mpg trunk weight length turn, nocons
matrix regmat = e(V)
plotmatrix, mat(regmat) color(green)    (ssc install plotmatrix)
    heatmap
    options: mat(<variable>) split(<options>) color(<color>) freq

SUMMARY PLOTS
twoway mband mpg weight || scatter mpg weight
    plot median of the y values
    options: bands(#)
binscatter weight mpg, line(none)    (ssc install binscatter)
    plot a single value (mean or median) for each x value
    options: medians nquantiles(#) discrete controls(<variables>) linetype(lfit | qfit | connect | none) aweight[<variable>]

FITTING RESULTS
twoway lfitci mpg weight || scatter mpg weight
    calculate and plot linear fit to data with confidence intervals
    options: level(#) stdp stdf nofit fitplot(<plottype>) ciplot(<plottype>) range(# #) n(#) atobs estopts(<options>) predopts(<options>)
twoway lowess mpg weight || scatter mpg weight
    calculate and plot lowess smoothing
    options: bwidth(#) mean noweight logit adjust
twoway qfitci mpg weight, alwidth(none) || scatter mpg weight
    calculate and plot quadratic fit to data with confidence intervals
    options: level(#) stdp stdf nofit fitplot(<plottype>) ciplot(<plottype>) range(# #) n(#) atobs estopts(<options>) predopts(<options>)

REGRESSION RESULTS
regress price mpg headroom trunk length turn
coefplot, drop(_cons) xline(0)    (ssc install coefplot)
    plot regression coefficients
    options: baselevels b(<options>) at(<options>) noci levels(#) keep(<variables>) drop(<variables>) rename(<list>) horizontal vertical generate(<variable>)
regress mpg weight length turn
margins, eyex(weight) at(weight = (1800(200)4800))
marginsplot, noci
    plot marginal effects of regression
    options: horizontal noci

Plot Placement

JUXTAPOSE (FACET)
twoway scatter mpg price, by(foreign, norescale)
    options: total missing colfirst rows(#) cols(#) holes(<numlist>) compact [no]edgelabel [no]rescale [no]yrescale [no]xrescale [no]iyaxes [no]ixaxes [no]iytick [no]ixtick [no]iylabel [no]ixlabel [no]iytitle [no]ixtitle imargin(<options>)

SUPERIMPOSE
graph combine plot1.gph plot2.gph...
    combine 2+ saved graphs into a single plot
scatter y3 y2 y1 x, msymbol(i o i) mlabel(var3 var2 var1)
    plot several y values for a single x value
graph twoway scatter mpg price in 27/74 || scatter mpg price /*
*/ if mpg < 15 & price > 12000 in 27/74, mlabel(make) m(i)
    combine twoway plots using ||
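The basic plot syntax, faceting, superimposing and saving options above can be combined as in this sketch. It is one possible arrangement, not the authors' example; the file names mpgWeight.gph and mpgHist.gph are hypothetical and are written to the current working directory.

```stata
sysuse auto, clear

* superimpose a linear fit on a scatter plot, facet by car type, and save the result
twoway (scatter mpg weight) (lfit mpg weight), /*
*/ by(foreign, norescale) xtitle("Weight (lbs.)") ytitle("Miles per gallon") /*
*/ saving("mpgWeight.gph", replace)

* a one-variable plot, saved the same way
histogram mpg, width(5) frequency saving("mpgHist.gph", replace)

* juxtapose the two saved graphs in a single figure
graph combine "mpgWeight.gph" "mpgHist.gph"
```

The /* */ pairs continue a command across lines, so the block is meant to be run from a do-file.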

Laura Hughes (lhughes@usaid.gov) Tim Essam (tessam@usaid.gov) inspired by RStudio's awesome Cheat Sheets (rstudio.com/resources/cheatsheets) geocenter.github.io/StataTraining updated January 2016
Plotting in Stata 14.1: Customizing Appearance
For more info see Stata's reference manual (stata.com)

ANATOMY OF A PLOT

Plots contain many features.
[Figure: annotated example plot labeling the title, subtitle, annotation, marker, marker label, line, y-line, grid lines, tick marks, x-axis and y-axis, axis titles, axis labels, legend, and the graph region, inner graph region, plot region and inner plot region]

scatter price mpg, graphregion(fcolor("192 192 192") ifcolor("208 208 208"))
    specify the fill of the background in RGB or with a Stata color
scatter price mpg, plotregion(fcolor("224 224 224") ifcolor("240 240 240"))
    specify the fill of the plot background in RGB or with a Stata color

SYNTAX

Arguments for the plot objects go in the options portion of these commands, for example:
scatter price mpg, xline(20, lwidth(vthick))

SYMBOLS
    marker         <marker options>
LINES / BORDERS
    line           <line options> xline(...) yline(...)
    marker         <marker options>
    axes           xscale(...) yscale(...)
    tick marks     xlabel(...) ylabel(...)
    grid lines     xlabel(...) ylabel(...)
    legend         legend(region(...))
TEXT
    marker label   <marker options>
    titles         title(...) subtitle(...) xtitle(...) ytitle(...)
    axis labels    xlabel(...) ylabel(...)
    annotation     text(...)
    legend         legend(...)

COLOR

SYMBOLS
    mcolor("145 168 208") mcolor(none)
        specify the fill and stroke of the marker in RGB or with a Stata color
    mfcolor("145 168 208") mfcolor(none)
        specify the fill of the marker
LINES / BORDERS
    lcolor("145 168 208") lcolor(none)
        specify the stroke color of the line or border
    marker         mlcolor("145 168 208")
    tick marks     tlcolor("145 168 208")
    grid lines     glcolor("145 168 208")
TEXT
    color("145 168 208") color(none)
        specify the color of the text
    marker label   mlabcolor("145 168 208")
    axis labels    labcolor("145 168 208")

SIZE / THICKNESS

SYMBOLS
    msize(medium)
        specify the marker size: ehuge, vhuge, huge, vlarge, large, medlarge, medium, medsmall, small, vsmall, tiny, vtiny
LINES / BORDERS
    lwidth(medthick)
        specify the thickness (stroke) of a line: vvvthick, vvthick, vthick, thick, medthick, medium, medthin, thin, vthin, vvthin, vvvthin, none
    marker         mlwidth(thin)
    tick marks     tlwidth(thin)
    grid lines     glwidth(thin)
TEXT
    size(medsmall)
        specify the size of the text:
        28 pt.   vhuge         10 pt.    medsmall
        20 pt.   huge          8 pt.     small
        16 pt.   vlarge        6 pt.     vsmall
        14 pt.   large         4 pt.     tiny
        12 pt.   medlarge      2 pt.     half_tiny
        11 pt.   medium        1.3 pt.   third_tiny
                               1 pt.     quarter_tiny
                               1 pt.     minuscule
    marker label   mlabsize(medsmall)
    axis labels    labsize(medsmall)

APPEARANCE

SYMBOLS
    msymbol(Dh)
        specify the marker symbol:
        O  D  T  S        o  d  t  s
        Oh Dh Th Sh       oh dh th sh
        +  X  p  none  i
LINES / BORDERS
    line, axes     lpattern(dash)
        specify the line pattern: solid, dash, dot, longdash, shortdash, dash_dot, longdash_dot, shortdash_dot, blank
    grid lines     glpattern(dash)
    axes           noline
    axes           off             no axis/labels
    tick marks     noticks
    tick marks     tlength(2)
    grid lines     nogrid nogmin nogmax
TEXT
    marker label   mlabel(foreign)
        label the points with the values of the foreign variable
    axis labels    nolabels
        no axis labels
    axis labels    format(%12.2f)
        change the format of the axis labels
    legend         off
        turn off legend
    legend         label(# "label")
        change legend label text

POSITION

SYMBOLS
    jitter(#)
        randomly displace the markers
    jitterseed(#)
        set seed
LINES / BORDERS
    tick marks     xlabel(#10, tposition(crossing))
        number of tick marks, position (outside | crossing | inside)
TEXT
    marker label   mlabposition(5)
        label location relative to marker (clock position: 0 - 12)

Apply Themes

Schemes are sets of graphical parameters, so you don't have to specify the look of the graphs every time.

USING A SAVED THEME
twoway scatter mpg price, scheme(customTheme)
    Create custom themes by saving options in a .scheme file
help scheme entries
    see all options for setting scheme properties
adopath ++ "~/<location>/StataThemes"
    set path of the folder (StataThemes) where custom .scheme files are saved
set scheme customTheme, permanently
    change the theme and set it as the default scheme

USING THE GRAPH EDITOR
twoway scatter mpg price, play(graphEditorTheme)
    1. Select the Graph Editor
    2. Click Record
    3. Double click on symbols and areas on plot, or regions on sidebar, to customize
    4. Unclick Record
    5. Save theme as a .grec file

Save Plots

graph twoway scatter y x, saving("myPlot.gph", replace)
    save the graph when drawing
graph save "myPlot.gph", replace
    save current graph to disk
graph combine plot1.gph plot2.gph...
    combine 2+ saved graphs into a single plot
graph export "myPlot.pdf", as(pdf)
    export the current graph as an image file; see options to set size and resolution
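Several of the appearance options above can be stacked on a single command. The sketch below is an illustration under assumptions, not the authors' example: the colors and sizes are arbitrary picks from the lists above, the file names customPlot.gph and customPlot.png are hypothetical, and PNG export may not be available in every Stata installation (for example, console-only Stata).

```stata
sysuse auto, clear

* marker color, size and symbol; a thick dashed reference line; formatted axis labels
scatter price mpg, /*
*/ mcolor("145 168 208") msize(medium) msymbol(Dh) /*
*/ xline(20, lwidth(vthick) lpattern(dash)) /*
*/ xlabel(#10, labsize(medsmall)) ylabel(, format(%12.0fc) nogrid) /*
*/ graphregion(fcolor("192 192 192")) legend(off)

* save in Stata's graph format, then export an image with an explicit width
graph save "customPlot.gph", replace
graph export "customPlot.png", as(png) width(1200) replace
```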
Laura Hughes (lhughes@usaid.gov) Tim Essam (tessam@usaid.gov) inspired by RStudio's awesome Cheat Sheets (rstudio.com/resources/cheatsheets) geocenter.github.io/StataTraining updated January 2016