• RNA extraction
• Probe labeling
– Ex: dye differences
• Printing
– Ex: print-order, plate-order, clone variation
• Hybridization
– Ex: temperature, time, mixing technique
• Human
– Ex: variation between lab researchers
• Scanning
– Ex: laser & detector, chemistry of the fluorescent label
• Image analysis
– Ex: identification, quantification, background methods
• Raw Exploration
• Normalization
– Logarithmic Transformation (adjustment of variances)
– M vs. A plot (rotation of logarithmic transformation)
• This method adjusts the median of the differences to 0.
– Background Transformation (RMA background approach, used for
linear scenarios, to minimize the noise in the observed
plot)
– Averaging normalization techniques
• After normalization of all of the spots in the microarray
chip, we average them to obtain a more stable master
slide.
– Establish the cutting points
• Naïve approach (establish cut-off points by log ratios)
• Justifiable approach (establish cut-off points by t-statistic)
• Statistical analysis
– For each gene i we have the hypothesis test:
– Null (neutral) hypothesis H0,i: Mi = 0
– Alternative hypothesis H1,i: Mi ≠ 0
• Normalization can be classified as linear or non-linear.
– Linear: applied to selected genes or to all genes (global). It is well suited
to consistent data.
– Non-linear: highly precise for data at extreme values, but requires a reference
gene set.
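As a minimal sketch of the simplest linear (global) case, assuming the log-ratios M of one slide are held in a plain Python list, the values can be shifted so that their median becomes 0. The function name `median_center` and the sample values are illustrative and do not come from the slides.

```python
from statistics import median

def median_center(m_values):
    """Shift all log-ratios by the same constant so that their median is 0."""
    shift = median(m_values)
    return [m - shift for m in m_values]

# Made-up log2 ratios for five spots on one slide (hypothetical data).
m_raw = [0.9, 1.1, 1.0, 3.0, -0.5]
m_norm = median_center(m_raw)
```

A single shift like this assumes that most genes are not differentially expressed, which is why it suits consistent data better than data with strong intensity-dependent trends.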
log2R = log2G
Why Log2??...
• M vs. A is basically a
rotation of the log2R vs.
log2G scatter plot.
• Now the quantity of
interest, i.e. the fold
change, is contained in
one variable, namely M!
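The rotation can be sketched with the usual definitions M = log2R - log2G and A = (log2R + log2G) / 2, where R and G are the red and green intensities of a spot. The function name `ma_transform` and the intensities below are assumptions made for illustration.

```python
import math

def ma_transform(red, green):
    """Return (M, A) for one spot from its red and green intensities."""
    log_r = math.log2(red)
    log_g = math.log2(green)
    m = log_r - log_g            # log2 fold change
    a = 0.5 * (log_r + log_g)    # average log2 intensity
    return m, a

# Hypothetical intensities: red is twice green, then green is twice red.
m_up, a_up = ma_transform(800.0, 400.0)
m_down, a_down = ma_transform(400.0, 800.0)
```

With base 2, a two-fold increase and a two-fold decrease give M values of the same size and opposite sign, which is one common reason for preferring log2 over raw ratios.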
[Figure: observed data = fluorescent signal + background noise]
• The RMA method uses the conditional expectation E(Si|Xi=xi),
the expected signal Si given the observed intensity xi, as the
background-corrected intensity for gene i. It is applied to all
genes in the microarray in order to minimize the noise in the
observed signal.
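The slides do not spell out the model behind E(Si|Xi=xi). A minimal sketch, assuming the convolution model usually associated with RMA (observed intensity = exponentially distributed signal with rate `alpha` + normally distributed background with mean `mu` and standard deviation `sigma`), gives a closed form for the expectation. The parameter values are hypothetical; in practice they are estimated from the data, which is not shown here.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rma_background_correct(x, mu, sigma, alpha):
    """E(S | X = x) under the assumed exponential-signal + normal-background model."""
    a = x - mu - sigma ** 2 * alpha
    b = sigma
    numerator = norm_pdf(a / b) - norm_pdf((x - a) / b)
    denominator = norm_cdf(a / b) + norm_cdf((x - a) / b) - 1.0
    return a + b * numerator / denominator

# Hypothetical parameters: background mean 100, background sd 20, signal rate 0.01.
bright = rma_background_correct(1000.0, mu=100.0, sigma=20.0, alpha=0.01)
dim = rma_background_correct(50.0, mu=100.0, sigma=20.0, alpha=0.01)
```

Unlike plain subtraction of a background estimate, the corrected value stays positive even when the observed intensity is below the background mean, so a logarithm can still be taken afterwards.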
• Useful when there are different segmentations of the same
gene.
• Combines all segmentations of the same gene into a single
averaged, transformed unit.
• A t-test can be applied to determine whether the mean of the data
is the same or differs between two conditions.
• ANOVA can be applied to determine whether the mean of the data is
the same or differs across two or more conditions.
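As a rough illustration of the ANOVA case, the one-way F statistic compares the variation between condition means with the variation within conditions. This is a sketch that assumes each condition is a list of values for one gene; the function name and data are made up, and turning F into a p-value (which needs the F distribution) is left out.

```python
from statistics import mean

def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (one list per condition)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical expression values of one gene under two conditions.
f_two = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```

With exactly two conditions, F equals the square of the pooled two-sample t statistic, so the two bullets above describe the same test in that case.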
[Figure: normalization, then average slide]
Naïve approach
• Establish cut-off points
by log ratios.
– This has to be done after the
M vs. A transformation and
background correction.
– The top and bottom 0.5 of
the absolute M values
have to be trimmed off.
Justifiable approach
• Establish cut-off points using the t-statistic
via Significance Analysis of Microarrays*
– T = mean(x) / SE(x)
• Where SE(x) is the standard error of mean(x)
* http://www.ncbi.nlm.nih.gov/pmc/articles/PMC126873/pdf/gb-2002-3-9-research0048.pdf
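A minimal sketch of the plain formula T = mean(x) / SE(x), assuming x holds the replicate M values of a single gene and SE(x) is the sample standard deviation divided by the square root of the number of replicates. This is only the formula on the slide, not the Significance Analysis of Microarrays software itself; the function name and data are illustrative.

```python
import math
from statistics import mean, stdev

def t_statistic(x):
    """One-sample t statistic for H0: mean = 0, from replicate values x."""
    n = len(x)
    standard_error = stdev(x) / math.sqrt(n)   # sample sd / sqrt(n)
    return mean(x) / standard_error

# Hypothetical replicate log-ratios for one gene.
t_gene = t_statistic([1.0, 2.0, 3.0])
```

Dividing by the standard error means that a gene with a large but noisy average log-ratio can rank below a gene with a smaller, more consistent one, which a log-ratio cut-off alone cannot capture.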
1. Good image analysis is essential. Some software is
obsolete and performs poorly.
2. Normalization is needed. We understand more now
than a few years ago.
3. Use at least the t-statistic to identify differentially
expressed genes. Do not rely exclusively on log-ratios.
4. Multiple testing must be considered to control false positives;
adjust your p-values.
5. Talk to a biostatistician before doing the experiments!
They too have a family to feed thanks to your work!
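Point 4 does not name an adjustment method. One widely used choice is the Benjamini-Hochberg procedure, which controls the false discovery rate; the sketch below assumes that choice, and the function name and p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.5]
adj_p = benjamini_hochberg(raw_p)
```

Genes are then called differentially expressed when the adjusted value, rather than the raw p-value, falls below the chosen threshold.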
• Analysis of Microarray Data
– Henrik Bengtsson hb@maths.lth.se
• Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP.
(2003). Summaries of Affymetrix GeneChip probe level data.
Nucleic Acids Res. 31:e15.