
Power and Sample Size

Sameem Iqbal Siddiqui


Center for Economic Research Pakistan

Objectives
Introduce the concept
Figure out why we care
Factors affecting power: understanding the intuition
Possible ways to calculate power

Set up
In an experiment in Punjab, we randomly assign self-employed individuals not involved in agriculture to business training. We would like to see whether this increases their income. From our baseline, we know that mean income for self-employed individuals is about PKR 11,000, and we expect the treatment to increase income by 20%. Given this setup, what sample size would we need to detect that 20% effect?

What are power calculations?


Power calculations give you the sample size required to detect a given treatment effect; in our case, a 20% increase in income.
Definition of power: given a treatment, the probability of being able to detect its effect.
Studies generally use 80% as the benchmark for power. That is, 80% of the time you will be able to detect the treatment effect if it exists.
So, in practice, you will generally fix the power and solve for the sample size needed to achieve it.

Why do we care?
Being sure ex ante that we are able to detect expected treatment effects

Design of Intervention
Main decisions:
1) Level of randomization: individual or group
o Individual (IND): assign treatment at the individual level (e.g., each person)
o Group (G): assign treatment at the group level (e.g., the school or village)

2) Number of treatments
o Simple treatment/control, or
o Alter the intensity of the treatment, or
o Combine treatments

3) Compliance: 100% or lower.
The most important thing to think about is which effects you are interested in capturing, and how the design can help you capture them while achieving the highest power possible.

Factors affecting power for IND


Holding everything else fixed:
o A larger sample size increases power
o A larger effect size increases power
o A larger standard deviation decreases power
o A stricter (lower) significance level decreases power
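These directions can be verified with the standard normal-approximation power formula for a two-sample comparison of means. Below is a minimal Python sketch (the function name is ours; the numbers reuse the means and standard deviation from the sampsi example later in the deck):

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(n_per_group, effect, sd, alpha=0.05):
    """Approximate power of a two-sided, two-sample test for a
    difference in means, with equal group sizes and a common sd."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Effect size scaled by the standard error of the difference in means
    z_effect = effect / (sd * sqrt(2 / n_per_group))
    return NormalDist().cdf(z_effect - z_crit)

# Baseline case: 767 per group, effect = 2283.4, sd = 15956.8
base = power_two_sample(767, 2283.4, 15956.8)   # close to 0.80
```

Changing one argument at a time reproduces each bullet: raising `n_per_group` or `effect` pushes power up, while raising `sd` or tightening `alpha` pushes it down.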

Sample Size Calculations for IND


What's needed                                Where to get it
Mean of the outcome variable                 From pilot, baseline, or other studies
Standard deviation of the outcome variable   From pilot, baseline, or other studies
Significance level                           Usually 5%
Effect size                                  Expected effect; think about a cost-benefit analysis of the program
Power                                        Usually 80%
Level of randomization                       Depends on study design

Note 1: For a group design, you would also need the intracluster correlation coefficient (ICC), which tells you the proportion of the total variance explained by the group. Note 2: For compliance lower than 100%, we would need to adjust the effect size downward.

Ways to Calculate Power


o Stata commands: sampsi, sampclus
o Stata simulations
o Stata MDE-formula do-file
o Excel: MDE formula, with the parameters set up
o Optimal Design software
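The simulation approach listed above can be mirrored in any language: repeatedly draw fake data under the assumed treatment effect, run the test each time, and report the share of rejections. A minimal Python sketch (function name and parameters are ours; it uses a known-variance z-test for simplicity):

```python
import random
from math import sqrt

def simulated_power(n_per_group, control_mean, effect, sd,
                    z_crit=1.96, reps=500, seed=42):
    """Estimate power by Monte Carlo: the share of simulated
    experiments in which a two-sample z-test rejects equal means."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        control = [rng.gauss(control_mean, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(control_mean + effect, sd) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(control) / n_per_group
        se = sd * sqrt(2 / n_per_group)  # known-variance standard error
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / reps
```

Simulation is slower than a formula but generalizes easily to clustering, stratification, or non-normal outcomes, which is why it is often the method of choice for more complex designs.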

Example: Sampsi
Help file: sampsi #1 #2 [, options], where #1 and #2 are the two group means: sampsi <treatment mean> <control mean>, power(0.8) sd(<standard deviation>)
This will give you the N needed in each group to achieve 80% power given mean and standard deviation of outcome variable.

Sampsi output
Estimated sample size for two-sample comparison of means

Test Ho: m1 = m2, where m1 is the mean in population 1
                    and m2 is the mean in population 2
Assumptions:
         alpha =   0.0500  (two-sided)
         power =   0.8000
            m1 =  13700.5
            m2 =  11417.1
           sd1 =  15956.8
           sd2 =  15956.8
         n2/n1 =     1.00

Estimated required sample sizes:
            n1 =      767
            n2 =      767
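The n1 = n2 = 767 figure can be reproduced with the textbook normal-approximation formula for a two-sample comparison of means, which matches sampsi's output here up to rounding. A minimal Python sketch (the function name is ours):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(m1, m2, sd, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided, two-sample comparison
    of means with a common standard deviation:
    n = 2 * (sd * (z_{1-alpha/2} + z_{power}) / (m1 - m2))^2"""
    z = NormalDist().inv_cdf
    return ceil(2 * (sd * (z(1 - alpha / 2) + z(power)) / (m1 - m2)) ** 2)

# Plugging in the sampsi example above:
n = n_per_group(13700.5, 11417.1, 15956.8)   # 767 per group
```

Doubling this per-group figure gives the total of roughly 1,500 reported below.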

Total Sample: ~1500

Appendix

Basic Statistics (1)


What do we mean by detecting an effect?
We are claiming that the mean for the treated is different from the mean of the control.

How do we do this? With a hypothesis test: a test of the significance of some claim.


Null hypothesis H0: treatment mean = control mean
Alternative hypothesis H1: treatment mean != control mean
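In practice this test boils down to standardizing the difference in group means and comparing it to a critical value (1.96 at the 5% level). A minimal large-sample sketch in Python, with made-up illustrative data:

```python
from math import sqrt
from statistics import mean, stdev

def two_sample_z(treated, control):
    """Two-sample z-statistic for H0: equal means, using sample
    standard deviations (large-sample approximation)."""
    n1, n2 = len(treated), len(control)
    se = sqrt(stdev(treated) ** 2 / n1 + stdev(control) ** 2 / n2)
    return (mean(treated) - mean(control)) / se

# Hypothetical income draws (in thousands); clearly separated groups
z = two_sample_z([10, 12, 11, 13, 12, 11, 10, 12],
                 [8, 9, 8, 7, 9, 8, 7, 8])
reject_h0 = abs(z) > 1.96   # True here: we reject equal means
```

If |z| exceeds 1.96 we reject H0 at the 5% level; power is the probability that this happens when the means genuinely differ.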

Basic Statistics (2)


                        Truth
Decision                H0 true           H1 true
Reject H0               Type I error      Right decision
Do not reject H0        Right decision    Type II error

Type I error: wrongly rejecting the null. That is, the null is correct, but we reject it. We want to keep this as small as possible. P(Type I error) = significance level = 5% (traditionally).
Type II error: failing to reject a null that is false. P(Type II error) = beta, generally unknown.
Power: 1 - P(Type II error) = 1 - beta. What is the probability that you would reject a false null? That is, the probability of making the right decision, or of not making a Type II error. Traditionally, we fix this at 80%.
