
Variable is a characteristic that changes or varies over time or for different individuals and objects
under consideration. Ex: body temperature, height, age, number of offspring, and religion
Experimental unit (or element of the sample) is the object on which a measurement is taken
*When a variable is measured on a set of experimental units, a set of measurements or DATA result*
Experimental unit is the individual or object on which a variable is measured
Population is the set of all measurements of interest to the investigator
Sample is a subset of measurements selected from the population of interest
Univariate data result when a single variable is measured on a single experimental unit
Bivariate data result when two variables are measured on a single experimental unit
Multivariate data result when more than two variables are measured
*Variables can be classified into one of two types: Qualitative or Quantitative*
Qualitative variables measure a quality or characteristic on each experimental unit (often called
Categorical data)
Quantitative variables measure a numerical quantity or amount on each experimental unit (produce
numerical data)
2 types of Quantitative variables (discrete and continuous)
Discrete can assume only a finite or countable number of values
Continuous can assume the infinitely many values corresponding to the points on a line interval. Ex:
height, weight, time, distance, and volume
Data distribution is the graph created from a statistical table
*When the variable of interest is qualitative or categorical, the statistical table is a list of the categories
being considered along with a measure of how often each value occurred. "How often" can be measured in
three different ways:*
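Frequency is the number of measurements in each category
Relative frequency is the proportion of measurements in each category (frequency divided by n, the total
number of measurements)
Percentage is the relative frequency multiplied by 100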
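The three measures above can be tabulated in a few lines. This is only a sketch: the `ratings` sample and the function name `statistical_table` are made up for illustration and are not from the notes.

```python
from collections import Counter

# Hypothetical sample: one qualitative variable measured on 10 experimental units.
ratings = ["A", "B", "A", "C", "B", "A", "A", "C", "B", "A"]

def statistical_table(values):
    """Return {category: (frequency, relative frequency, percentage)}."""
    n = len(values)
    counts = Counter(values)  # frequency of each category
    return {
        category: (freq, freq / n, 100 * freq / n)
        for category, freq in counts.items()
    }

table = statistical_table(ratings)
for category, (freq, rel, pct) in sorted(table.items()):
    print(f"{category}  frequency={freq}  relative={rel:.2f}  percent={pct:.0f}%")
```

The frequencies add up to n, the relative frequencies to 1, and the percentages to 100.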
Pie chart is the familiar circular graph that shows how the measurements are distributed among the
categories
Bar chart shows the same distribution of measurements among the categories, with the height of the
bar measuring how often a particular category was observed
Line chart is used for time series data; it works best when the data are recorded at regular intervals such
as daily, weekly, monthly, quarterly, or yearly.
Dot plot is the simplest graph for quantitative data
Stem and leaf plot represents a graphical display of the data using the actual numerical values of each
data point
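A stem and leaf plot can be sketched as below. The data set is hypothetical, and splitting each value into a tens-digit stem and a ones-digit leaf is an assumption that only suits non-negative two-digit integers; other data need a different split.

```python
from collections import defaultdict

# Hypothetical two-digit measurements.
data = [23, 25, 31, 37, 38, 42, 45, 45, 51]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its ordered list of leaves (ones digits)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 37 -> stem 3, leaf 7
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(data)
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
```

Every original value can be read back from the display, which is what distinguishes it from a histogram.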
A distribution is:
Symmetric - if the left and right sides of the distribution, when divided at the middle value, form mirror
images.
Skewed to the right - if a greater proportion of the measurements lie to the right of the peak value
(contains a few unusually large measurements).
Skewed to the left - if a greater proportion of the measurements lie to the left of the peak value
(contains a few unusually small measurements)
Unimodal if it has one peak
Bimodal if it has two peaks (usually represent a mixture of two different populations in the data set)
*When comparing graphs created for two data sets, you should compare their scales of measurement,
locations, and shapes, and look for unusual measurements or outliers*
Relative frequency histogram is a bar graph in which the height of the bar shows how often
measurements fall in a particular class or subinterval
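The bar heights of a relative frequency histogram can be computed as in this sketch. The measurements, the class boundaries, and the convention that each class includes its lower boundary but not its upper one are all assumptions made for the example.

```python
# Hypothetical measurements, grouped into 5 equal-width classes:
# [10, 20), [20, 30), [30, 40), [40, 50), [50, 60)
values = [12, 19, 24, 25, 31, 33, 38, 40, 46, 57]
lower, width, num_classes = 10, 10, 5

def relative_frequencies(values, lower, width, num_classes):
    """Return the proportion of measurements falling in each class."""
    counts = [0] * num_classes
    for v in values:
        index = (v - lower) // width  # class that contains v
        if 0 <= index < num_classes:
            counts[index] += 1
    n = len(values)
    return [c / n for c in counts]

heights = relative_frequencies(values, lower, width, num_classes)
for i, h in enumerate(heights):
    start = lower + i * width
    print(f"[{start}, {start + width})  {h:.1f}")
```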
Numerical measures can be calculated for either a sample or a population of measurements. You can
use the data to calculate a set of numbers that will convey a good mental picture of the frequency
distribution
Parameters are numerical measures associated with the population
Statistics are numerical measures calculated from sample measurements
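Measure of center is a measure along the horizontal axis that locates the center of the distribution
Mean is the arithmetic average of a set of measurements: the sum of the measurements divided by the
number of measurements
Median is the value in the middle position when the set of measurements is ordered from
smallest to largest
0.5(n + 1) indicates the position of the median in the ordered data set; when this position is not a whole
number, the median is the average of the two measurements on either side of it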
*The median is less sensitive to extreme values or outliers*
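A small sketch of the mean, the median located at position 0.5(n + 1), and the effect of a single outlier. The two samples are hypothetical, and the function names are chosen only for this example.

```python
def mean(values):
    """Arithmetic average: sum of the measurements divided by n."""
    return sum(values) / len(values)

def median(values):
    """Value at 1-based position 0.5(n + 1) in the ordered data set."""
    ordered = sorted(values)
    n = len(ordered)
    position = 0.5 * (n + 1)
    lower = int(position)  # whole part of the position
    if position == lower:  # n is odd: a single middle value
        return ordered[lower - 1]
    # n is even: average the two values on either side of the position
    return (ordered[lower - 1] + ordered[lower]) / 2

sample = [2, 9, 11, 5, 6]
with_outlier = [2, 9, 110, 5, 6]  # same sample with one extreme value

print(mean(sample), median(sample))
print(mean(with_outlier), median(with_outlier))
```

Replacing 11 with 110 pulls the mean far to the right, while the median stays at the same middle value.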
Mode is the category that occurs most frequently, or the most frequently occurring value of x (usually
used to describe large data sets)
Modal class is the class with the highest peak (highest frequency) in a histogram; the midpoint of that
class is taken as the mode
*It is possible to have more than one mode; these modes would appear as local peaks*
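For ungrouped data, Python's standard `statistics.multimode` returns every value tied for the highest frequency, which is one way to see more than one mode. The data set is hypothetical.

```python
from statistics import multimode

# Hypothetical data set in which 5 and 8 each occur twice.
observations = [3, 5, 5, 7, 8, 8, 9]

modes = multimode(observations)  # all values sharing the highest frequency
print(modes)
```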
Variability or dispersion is a very important characteristic of data; it can help you create a mental picture
of the spread of the data
Range is the simplest measure of variation; for a set of measurements it is defined as the difference
between the largest and the smallest measurement
Variance of a population is the average of the squares of the deviations of the measurements about their mean
Variance of a sample is the sum of the squared deviations of the measurements about their mean
divided by (n - 1)
Standard deviation is equal to the positive square root of the variance
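The measures of variability above can be written out directly from their definitions. This is a sketch on a hypothetical data set; the function names are illustrative only.

```python
import math

def value_range(values):
    """Difference between the largest and the smallest measurement."""
    return max(values) - min(values)

def population_variance(values):
    """Average of the squared deviations about the mean (divide by n)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def sample_variance(values):
    """Sum of the squared deviations about the mean, divided by (n - 1)."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)

measurements = [5, 7, 1, 2, 4]  # hypothetical measurements, mean 3.8

print(value_range(measurements))
print(population_variance(measurements))
print(sample_variance(measurements))
print(math.sqrt(sample_variance(measurements)))  # sample standard deviation
```

Dividing by (n - 1) instead of n always makes the sample variance the larger of the two for the same data.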
Percentile is another measure of relative standing and is most often used for large data sets.
*Percentiles are not very useful for small data sets*
pth percentile is the value of x that is greater than p% of the measurements and is less than the
remaining (100 - p)%
Q1 (lower quartile) is the value of x that is greater than one-fourth of the measurements and is less than
the remaining three-fourths
Q2 (second quartile) is the median
Q3 (upper quartile) is the value of x that is greater than three-fourths of the measurements and is less than
the remaining one-fourth
Interquartile range (IQR) for a set of measurements is the difference between the upper and lower
quartiles; that is, IQR = Q3 - Q1
Five number summary consists of the smallest number, lower quartile, median, upper quartile, and
the largest number presented in order from smallest to largest
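A sketch of the quartiles, IQR, and five number summary. The notes do not say how to locate Q1 and Q3 in the ordered data; this example assumes positions 0.25(n + 1) and 0.75(n + 1) with linear interpolation, which matches the 0.5(n + 1) rule for the median and is what `statistics.quantiles` does with `method="exclusive"`. Other conventions give slightly different quartiles. The data set is hypothetical.

```python
from statistics import quantiles

# Hypothetical sample of n = 10 measurements.
scores = [16, 25, 4, 18, 11, 13, 20, 8, 11, 9]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the p(n + 1) position rule."""
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return (min(values), q1, q2, q3, max(values))

summary = five_number_summary(scores)
iqr = summary[3] - summary[1]  # IQR = Q3 - Q1

print(summary)
print(iqr)
```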
Scatterplot is a two-dimensional extension of the dotplot (which graphs one quantitative variable); it is
used to graph two quantitative variables measured on the same experimental unit
Correlation coefficient r measures the strength of the linear relationship between two variables x and y
r > 0, y increases when x increases and vice versa.
r < 0, y decreases when x increases, or y increases when x decreases
r = 0, then there is no linear relationship between the two variables
r = -1 or 1, all points lie exactly on a straight line
The nearer r is to -1 or 1, the stronger the linear relationship between the 2 variables
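The cases above can be checked numerically. The notes do not give a formula for the correlation coefficient; this sketch assumes the usual Pearson formula, the sum of the products of deviations divided by the square root of the product of the two sums of squared deviations. All data sets are hypothetical.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of paired measurements (assumed formula)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]
rising = [1, 3, 5, 7, 9]    # points on a line with positive slope
falling = [9, 7, 5, 3, 1]   # points on a line with negative slope
curved = [4, 1, 0, 1, 4]    # y = x squared: related, but not linearly

print(correlation(xs, rising))
print(correlation(xs, falling))
print(correlation(xs, curved))
```

The last case gives a coefficient of zero even though y is completely determined by x, which is why the definition speaks only of a linear relationship.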
