gives the number of observations, mean, standard deviation, and minimum and maximum values
Syntax:
su variable
for specific summary measures,
Syntax:
tabstat variable, statistics(desired summary measures separated by spaces; statistic names must be typed exactly)
example: tabstat variable, statistics(mean range sd var median p25 p50 p75)
or go to Statistics > Summaries, tables, and tests > Other tables > Compact table of summary statistics
can summarize one or more variables
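A minimal sketch of both commands, assuming the auto example dataset that ships with Stata (the dataset and the variables price and mpg are chosen only for illustration):

```stata
* Load an example dataset bundled with Stata (an assumption for this sketch)
sysuse auto, clear

* Default summary: observations, mean, standard deviation, minimum, maximum
su price

* Selected summary measures for two variables at once
tabstat price mpg, statistics(mean range sd variance median p25 p50 p75)
```

Listing several variables after tabstat produces one column per variable, which is convenient for side-by-side comparison.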
Histogram
enter data in one column
Syntax:
histogram variable, width(#) start(LCB) frequency ytitle(Title of y-axis) xtitle(Title of x-axis) xlabel(LCB1 LCB2 LCB3 LCB4)
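As an illustration of the syntax above, the following sketch assumes the auto example dataset and a class width of 5 starting at 10; these values are assumptions picked to suit the mpg variable, not a recommendation:

```stata
* Example dataset bundled with Stata (assumed for illustration)
sysuse auto, clear

* Classes of width 5 starting at 10; the y-axis shows counts instead of densities
histogram mpg, width(5) start(10) frequency ///
    ytitle("Number of cars") xtitle("Mileage (mpg)") ///
    xlabel(10(5)45)
```

The value given in start() must not exceed the smallest observation, so check the minimum with su first.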
Frequency Polygon
make separate columns for the midpoints and frequency
remember: add an extra class with zero frequency before the first class and after the last class, so that the polygon closes on the x-axis
Syntax:
twoway (connected frequency midpoint), xlabel(midpoints separated by spaces)
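A small sketch of a frequency polygon, using a hypothetical frequency table typed in with input (the midpoints and frequencies are made up for illustration, and the variable names midpoint and frequency are assumptions):

```stata
* Hypothetical class midpoints and frequencies, with a zero-frequency
* class added before the first and after the last class
clear
input midpoint frequency
 5  0
15  4
25  9
35  6
45  3
55  0
end

* Frequency on the y-axis, class midpoints on the x-axis
twoway (connected frequency midpoint), ///
    xlabel(5 15 25 35 45 55) ///
    ytitle("Frequency") xtitle("Class midpoint")
```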
Pie Chart
make separate columns for the variable and the percentage/frequency
Syntax:
graph pie percent, over(variable) sort
"sort" arranges the levels in ascending order
"sort descending" arranges the levels in descending order
sorting also moves the "Others" level into the arrangement; for convenience, enter the levels in descending order and place "Others" in the last row, in which case "sort" may be omitted
labels may be inserted manually or added with the option:
plabel(_all percent, size(*#) color(color))
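The pie chart syntax can be sketched as follows, assuming a hypothetical table of four categories whose percentages sum to 100 (the names category and percent and all values are illustrative):

```stata
* Hypothetical categories and percentages
clear
input str12 category percent
"Category A" 45
"Category B" 30
"Category C" 15
"Others"     10
end

* One slice per category, largest first, with percentage labels on every slice
graph pie percent, over(category) sort descending ///
    plabel(_all percent, size(*1.2) color(white))
```

In size(*#), the number is a multiplier of the default label size, so *1.2 means 20% larger than the default.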
Line Graph
make separate columns for each variable, including the time variable
Syntax:
twoway (line var1 time), ytitle(Title of y-axis) ylabel(firstvalue(interval)lastvalue) xtitle(time) xlabel(firstvalue(interval)lastvalue)
for comparison of several levels
twoway (line var1 time) (line var2 time) (line var3 time), ytitle(Title of y-axis) ylabel(firstvalue(interval)lastvalue) xtitle(time) xlabel(firstvalue(interval)lastvalue)
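A minimal sketch of a line graph comparing two series over time, with line as the plot type. The data, and the names year, sales_a, and sales_b, are hypothetical:

```stata
* Hypothetical yearly values for two series
clear
input year sales_a sales_b
2018 120  95
2019 135 110
2020 128 125
2021 150 140
2022 165 150
end

* One line per series, both plotted against the time variable
twoway (line sales_a year) (line sales_b year), ///
    ytitle("Sales") ylabel(80(20)180) ///
    xtitle("Year") xlabel(2018(1)2022)
```

The label rule 80(20)180 places ticks at 80, 100, ..., 180, following the firstvalue(interval)lastvalue pattern.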
Scatterplot
make separate columns for each variable
Syntax:
twoway (scatter var1 var2), ytitle(Title of y-axis) ylabel(firstvalue(interval)lastvalue) xtitle(Title of x-axis) xlabel(firstvalue(interval)lastvalue)
var1 is on the y-axis
var2 is on the x-axis
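For a concrete case, this sketch assumes the auto example dataset, with price as the first variable (vertical axis) and mpg as the second (horizontal axis):

```stata
* Example dataset bundled with Stata (assumed for illustration)
sysuse auto, clear

* The first variable listed goes on the y-axis, the second on the x-axis
twoway (scatter price mpg), ///
    ytitle("Price (USD)") ylabel(0(5000)15000) ///
    xtitle("Mileage (mpg)") xlabel(10(10)40)
```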
Sturges' Rule
$K_{interval} = 1 + 3.32 \log_{10}(n)$
width = range/K
Outliers
an observation x is an outlier if
$x > Q_3 + 1.5(Q_3 - Q_1)$
or
$x < Q_1 - 1.5(Q_3 - Q_1)$
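Sturges' rule and the outlier limits can be computed from the results that summarize stores. This is a sketch assuming the auto example dataset; the scalar names k_sturges, class_width, fence_lo, and fence_hi are hypothetical:

```stata
* Example dataset bundled with Stata (assumed for illustration)
sysuse auto, clear

* The detail option stores the quartiles as r(p25) and r(p75)
quietly summarize price, detail

* Sturges' rule: number of classes and class width
scalar k_sturges   = 1 + 3.32*log10(r(N))
scalar class_width = (r(max) - r(min))/k_sturges

* Outlier limits: 1.5 times the interquartile range beyond Q1 and Q3
scalar fence_lo = r(p25) - 1.5*(r(p75) - r(p25))
scalar fence_hi = r(p75) + 1.5*(r(p75) - r(p25))

display "K = " k_sturges "   width = " class_width
display "lower limit = " fence_lo "   upper limit = " fence_hi

* List the observations that fall outside the limits
list make price if price < scalar(fence_lo) | price > scalar(fence_hi)
```

K is normally rounded to a whole number of classes before the width is chosen. Stata's quartiles may differ slightly from hand-computed ones, because software packages use different percentile rules.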
Measures of Spread
[Absolute] Range
$\text{Range} = HCB - LCB$
$\text{Range} = \text{highest observation} - \text{lowest observation}$
Interpretation: The difference between the highest & lowest values is ______.
[Absolute] Interquartile Range
Difference between P25 and P75 (or Q1 and Q3)
Interpretation: Half of the observations fall within ____ to ____.
[Absolute] Variance
$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
or
$s^2 = \frac{\sum X_i^2 - \frac{(\sum X_i)^2}{n}}{n - 1}$
Interpretation: The ______ varies by _______.
[Absolute] Standard Deviation
ungrouped:
$s = \sqrt{\frac{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2}{n(n - 1)}}$
grouped:
$s = \sqrt{\frac{n \sum_i f_i x_i^2 - \left(\sum_i f_i x_i\right)^2}{n(n - 1)}}$
Interpretation:
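The measures of spread can be requested directly from tabstat, and the ungrouped computational formula for s can be checked by hand. This is a sketch assuming the auto example dataset; price_sq, sum_sq, and s_shortcut are hypothetical names:

```stata
* Example dataset bundled with Stata (assumed for illustration)
sysuse auto, clear

* Range, interquartile range, variance, and standard deviation in one table
tabstat price, statistics(range iqr variance sd)

* Check the computational formula for s (ungrouped data)
generate double price_sq = price^2
quietly summarize price_sq
scalar sum_sq = r(sum)

quietly summarize price
scalar s_shortcut = sqrt((r(N)*sum_sq - r(sum)^2)/(r(N)*(r(N) - 1)))

display "computational formula: " s_shortcut "   summarize: " r(sd)
```

The two displayed values should agree up to rounding, since both use the n - 1 divisor.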
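Measures of Location
Quartile (Qi)
i = 1, 2, 3
quartile class: first class with $cf \ge \frac{n \times i}{4}$
ungrouped:
even
$Q_i = \frac{\left(\frac{n}{4} \times i\right)^{th} + \left[\left(\frac{n}{4} \times i\right) + 1\right]^{th}}{2}$
odd
$Q_i = \left(\frac{(n + 1)i}{4}\right)^{th}$
grouped:
$Q_i = LCB_{Q_i} + \frac{\left(\frac{n}{4} \times i\right) - {<}CF_{Q_i - 1}}{f_{Q_i}} \times c$
Decile (Di)
i = 1, 2, 3, …, 9
decile class: first class with $cf \ge \frac{n \times i}{10}$
ungrouped:
even
$D_i = \frac{\left(\frac{n}{10} \times i\right)^{th} + \left[\left(\frac{n}{10} \times i\right) + 1\right]^{th}}{2}$
odd
$D_i = \left(\frac{(n + 1)i}{10}\right)^{th}$
grouped:
$D_i = LCB_{D_i} + \frac{\left(\frac{n}{10} \times i\right) - {<}CF_{D_i - 1}}{f_{D_i}} \times c$
$LCB_{Q_i}$ is the LCB of the $Q_i$th class
c is the class width
${<}CF_{Q_i - 1}$ is the cf of the class before the $Q_i$th class
$f_{Q_i}$ is the frequency of the $Q_i$th class
the same definitions apply to the $D_i$th class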
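A worked example of the grouped quartile formula, using a hypothetical frequency table (the classes and frequencies are assumed purely for illustration):

```latex
% Hypothetical grouped data (assumed for illustration): n = 40, class width c = 10
% Class      f    cf
% 10-19      5     5
% 20-29     12    17
% 30-39     15    32
% 40-49      8    40
\begin{align*}
\frac{n \times i}{4} &= \frac{40 \times 1}{4} = 10
  \quad\Rightarrow\quad \text{the first class with } cf \ge 10 \text{ is } 20\text{--}29 \\
Q_1 &= LCB_{Q_1} + \frac{\left(\frac{n}{4} \times 1\right) - {<}CF_{Q_1 - 1}}{f_{Q_1}} \times c \\
    &= 19.5 + \frac{10 - 5}{12} \times 10 \\
    &\approx 23.67
\end{align*}
```

Here 19.5 is the lower class boundary of 20–29, 5 is the cumulative frequency of the class before it, and 12 is its own frequency. Read with the interpretation template, 25% of the observations are below or equal to about 23.67.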
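Percentile (Pi)
i = 1, 2, 3, …, 99
percentile class: first class with $cf \ge \frac{n \times i}{100}$
ungrouped:
even
$P_i = \frac{\left(\frac{n}{100} \times i\right)^{th} + \left[\left(\frac{n}{100} \times i\right) + 1\right]^{th}}{2}$
odd
$P_i = \left(\frac{(n + 1)i}{100}\right)^{th}$
grouped:
$P_i = LCB_{P_i} + \frac{\left(\frac{n}{100} \times i\right) - {<}CF_{P_i - 1}}{f_{P_i}} \times c$
Interpretations:
____% of the observations are below or equal to ____ and ____% of the observations are above _____.
Equivalent percentiles, deciles, and quartiles:
Percentile  Decile  Quartile
P90         D9
P80         D8
P75                 Q3
P70         D7
P60         D6
P50         D5      Q2
P40         D4
P30         D3
P25                 Q1
P20         D2
P10         D1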
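In Stata, these measures of location can be requested directly rather than computed by hand. This is a sketch assuming the auto example dataset. Stata's percentile rules are not guaranteed to match the even/odd hand formulas exactly, so small differences are possible:

```stata
* Example dataset bundled with Stata (assumed for illustration)
sysuse auto, clear

* Quartiles: Q1 = P25, Q2 = P50, Q3 = P75
tabstat price, statistics(p25 p50 p75)

* Any percentiles, including deciles such as D1 = P10 and D9 = P90
centile price, centile(10 20 30 40 50 60 70 80 90)
```

tabstat accepts only a fixed set of percentile names such as p10, p25, p50, p75, and p90, so centile is the more flexible choice for deciles like D3 = P30.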