HISAT2
Jelena Nadj
Seven Bridges Genomics
March 3rd, 2016
Introduction
Design
EM algorithm
Implementation
HISAT2 overview
Uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and
GCSA (an extension of the BWT to graphs)
Supports genomes of any size, including those larger than 4 billion bases
Contributors: Daehwan Kim, Ben Langmead, Joe Paggi, Geo Pertea & Steven Salzberg
(TopHat2 developers)
Homepage
Github
Kim D, Langmead B and Salzberg SL. HISAT: a fast spliced aligner with low memory
requirements. Nature Methods 2015
GPLv3 license
HISAT2 modules
HISAT2 (aligner)
Can optionally take splice-site, exon, and SNP information (in the HISAT2
format), as well as novel splice sites reported by a previous HISAT2 run
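Feeding splice sites from one run into the next can be sketched as two command lines. The snippet below only builds argument lists and does not run the aligner. The file names and the helper `hisat2_command` are hypothetical, and the flag names are given as they are usually listed in the HISAT2 manual; they are worth checking against `hisat2 --help` for the installed version.

```python
def hisat2_command(index_prefix, reads, out_sam,
                   known_ss=None, novel_ss_out=None, novel_ss_in=None):
    """Build an argument list for one hisat2 run (nothing is executed)."""
    cmd = ["hisat2", "-x", index_prefix, "-U", reads, "-S", out_sam]
    if known_ss:
        # Splice sites known in advance, e.g. extracted from an annotation.
        cmd += ["--known-splicesite-infile", known_ss]
    if novel_ss_out:
        # Ask this run to write the splice sites it discovers.
        cmd += ["--novel-splicesite-outfile", novel_ss_out]
    if novel_ss_in:
        # Reuse splice sites discovered by an earlier run.
        cmd += ["--novel-splicesite-infile", novel_ss_in]
    return cmd


# Hypothetical file names, for illustration only.
first_pass = hisat2_command("genome_index", "reads.fq", "pass1.sam",
                            novel_ss_out="novel_ss.txt")
second_pass = hisat2_command("genome_index", "reads.fq", "pass2.sam",
                             novel_ss_in="novel_ss.txt")
print(" ".join(first_pass))
print(" ".join(second_pass))
```

Either list could be passed to `subprocess.run` once the paths point at real files.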
HISAT2 Index
In contrast to most other aligners, HISAT2 employs two different types of indexes:
A global GFM (FM) index that represents the whole genome
Numerous small local GFM (FM) indexes for regions that collectively cover the
genome; each index represents 56,000 bp, so ~55,000 indexes are needed to
cover the human genome
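The index count follows from simple division. The sketch below assumes a human genome of roughly 3.1 billion bases (a figure not given on the slide) and ignores any overlap between neighbouring regions.

```python
import math

GENOME_BP = 3_100_000_000   # assumed approximate human genome size
LOCAL_INDEX_BP = 56_000     # region covered by one local index (from the slide)


def local_index_count(genome_bp, region_bp):
    """Number of fixed-size regions needed to tile a genome end to end."""
    return math.ceil(genome_bp / region_bp)


n_local = local_index_count(GENOME_BP, LOCAL_INDEX_BP)
print(n_local)  # on the order of 55,000
```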
HISAT2 first tries to identify the positions from which the read may have
originated (on the whole genome)
This is done by first using the global index, which gives a small set of candidates
Searching a global FM index of the human genome incurs many cache misses;
a local index is much smaller and fits in the cache
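To make the idea of an FM index search concrete, here is a toy backward search over a plain string. It is a minimal sketch, not HISAT2's graph index: the suffix array is built by naive sorting and the occurrence table is stored in full, which only works for tiny inputs. The function names are illustrative.

```python
def build_fm_index(text):
    """Return (suffix_array, C, occ) for text, using '$' as the terminator."""
    text += "$"
    # Naive suffix array: sort suffix start positions by the suffix itself.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around at 0).
    bwt = "".join(text[i - 1] for i in sa)
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    # C[c] = number of characters in text that sort strictly before c.
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, C, occ


def backward_search(pattern, sa, C, occ):
    """Return sorted start positions of pattern, matching right to left."""
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


genome = "ACGTACGTTACG"
sa, C, occ = build_fm_index(genome)
hits = backward_search("ACG", sa, C, occ)
print(hits)
```

Each step of the loop jumps to a different part of the occurrence table, which is the access pattern behind the cache-miss argument: the smaller the index, the more of these jumps stay in cache.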
After mapping the longer part, HISAT can usually align the remaining
small anchor within a single local index
a) global + extension
2-pass algorithms
Pseudogenes
HISAT2 vs Others
Speed
Sensitivity
RNA-seq
RNA-seq quantification vs. microarrays or qRT-PCR
Alternative splicing: mapping ambiguity (multiple mapping)
Source: http://dx.doi.org/10.13070/mm.en.3.203
Sequencing errors (error model: mismatches & indels)
Source: doi:10.1186/gb-2011-12-3-r22
EM algorithm
Source: http://research.microsoft.com/en-us/um/people/cmbishop/prml/
Source: http://artint.info/html/ArtInt_255.html
RNA example EM
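A small worked example shows how EM resolves multi-mapped reads. This is a minimal sketch under simplifying assumptions that are not from the slides: every transcript has the same effective length, there is no error or fragment-length model, and each read is described only by the set of transcripts it is compatible with.

```python
def em_abundances(read_compat, n_transcripts, n_iter=200, tol=1e-9):
    """Estimate relative transcript abundances with batch EM.

    read_compat: one tuple per read, listing compatible transcript indices.
    """
    theta = [1.0 / n_transcripts] * n_transcripts  # uniform starting point
    for _ in range(n_iter):
        counts = [0.0] * n_transcripts
        # E-step: split each read across its transcripts in proportion
        # to the current abundance estimates.
        for compat in read_compat:
            z = sum(theta[t] for t in compat)
            for t in compat:
                counts[t] += theta[t] / z
        # M-step: new abundances are the normalised expected counts.
        new_theta = [c / len(read_compat) for c in counts]
        converged = max(abs(a - b) for a, b in zip(theta, new_theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta


# 6 reads unique to transcript 0, 2 unique to transcript 1, 4 ambiguous.
reads = [(0,)] * 6 + [(1,)] * 2 + [(0, 1)] * 4
theta = em_abundances(reads, n_transcripts=2)
print([round(x, 3) for x in theta])
```

In this toy case the ambiguous reads end up shared in the same 3:1 ratio as the uniquely mapped ones, so the estimates settle near 0.75 and 0.25.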
EM flavours
Source: http://cs.stanford.edu/~pliang/papers/online-naacl2009.pdf p. 3
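One flavour worth contrasting with batch EM is stepwise (online) EM, which updates the estimate after every observation instead of after a full pass. The sketch below assumes the interpolation update mu <- (1 - eta_k) * mu + eta_k * s with step size eta_k = (k + 2)^(-alpha) and 0.5 < alpha <= 1. The toy read model (reads as sets of compatible transcripts, equal transcript lengths) is an illustrative assumption and not a description of any particular tool.

```python
def stepwise_em(read_compat, n_transcripts, epochs=200, alpha=0.7):
    """Stepwise EM for relative abundances, one update per read."""
    mu = [1.0 / n_transcripts] * n_transcripts  # running sufficient statistics
    k = 0
    for _ in range(epochs):
        for compat in read_compat:
            # E-step for a single read under the current estimate.
            z = sum(mu[t] for t in compat)
            s = [0.0] * n_transcripts
            for t in compat:
                s[t] = mu[t] / z
            # Interpolate between the old statistics and the new read.
            eta = (k + 2) ** (-alpha)
            mu = [(1 - eta) * m + eta * si for m, si in zip(mu, s)]
            k += 1
    return mu


# Same kind of toy data as a batch example: 6 + 2 unique reads, 4 ambiguous.
reads = [(0,)] * 6 + [(1,)] * 2 + [(0, 1)] * 4
mu = stepwise_em(reads, n_transcripts=2)
print([round(x, 2) for x in mu])
```

Because each read's posterior sums to one, `mu` stays normalised and can be read directly as the abundance estimate. The decaying step size is what lets early, poorly informed updates be gradually forgotten.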
Priors (error model, fragment length, abundances)
Kullback-Leibler divergence at 10^-6
Generative model
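For reference, the Kullback-Leibler divergence between two discrete distributions can be computed as below. Treating 10^-6 as a threshold on the divergence between two abundance estimates is an assumption made for this sketch; the slide does not spell out how the value is used.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length.

    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


THRESHOLD = 1e-6  # assumed reading of the value on the slide

estimate_a = [0.7500, 0.2500]
estimate_b = [0.7501, 0.2499]
close_enough = kl_divergence(estimate_a, estimate_b) < THRESHOLD
print(close_enough)
```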
--frag-len-mean / --frag-len-stddev
--forget-param
--haplotype-file
--calc-covar
--output-align-prob / --output-align-samp
--expr-alpha <float>
--aux-param-file
https://igor.sbgenomics.com/u/dusan_randjelovic/express-1-5-1-demo/apps/#express/
eXpress output
results.xprs
params.xprs
varcov.xprs
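A results file can be inspected with a few lines of standard-library Python. The sketch below assumes that results.xprs is tab-delimited with a header row and that it has `target_id` and `tpm` columns; both the layout and the column names are assumptions to verify against an actual output file. The sample text is made up for illustration.

```python
import csv
import io


def top_targets(xprs_text, n=3, column="tpm"):
    """Return the n targets with the highest value in the given column."""
    rows = list(csv.DictReader(io.StringIO(xprs_text), delimiter="\t"))
    rows.sort(key=lambda r: float(r[column]), reverse=True)
    return [(r["target_id"], float(r[column])) for r in rows[:n]]


# Made-up stand-in for the contents of results.xprs (assumed columns).
sample = (
    "target_id\test_counts\ttpm\n"
    "tx1\t10.0\t100.0\n"
    "tx2\t55.5\t900.0\n"
    "tx3\t0.0\t0.0\n"
)
best = top_targets(sample, n=2)
print(best)
```

For a real file, the text would come from `open("results.xprs").read()` instead of the inline sample.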
eXpress benchmark
Call to action