Introns and Exons

A Stoltzfus
doi: 10.1006/rwgn.2001.0708

An intron (or ‘intervening sequence’) is a segment of RNA excised from a gene transcript, with concomitant ligation of flanking segments called ‘exons.’ This process of excision and ligation, known as ‘splicing,’ is one of several posttranscriptional processing steps that may occur prior to translation. Although ‘intron,’ in the strict sense, refers only to segments excised from RNA (and, by extension, the DNA segments that encode them), there exist developmental analogs of introns that are excised from DNA (the ciliate IES elements) or from protein (the printrons or inteins).

Diversity and Distribution

Introns of some type are found in every kingdom of cellular life, and also in viruses, bacteriophages, and plasmids. Different types of introns have different splicing mechanisms and distinctive patterns of distribution with respect to gene families, subcellular compartments, and taxonomic groups (e.g., protein-spliced tRNA introns are known only from tRNA genes in archaebacterial genomes or eukaryotic nuclear genomes). A single gene may have multiple introns and, rarely, introns of multiple types (e.g., some fungal mitochondrial genes have both group I and group II introns).

The most familiar introns are the ‘spliceosomal’ introns, which are excised by a ribonucleoprotein ‘spliceosome,’ and which typically have the sequence GU...AG. Spliceosomal introns are known only from genes in the eukaryotic nucleus (or nucleomorph) and in eukaryotic viruses. They range in length from less than 20 nt (nucleotides) to over 200 kilo-nt, while exons range in length from less than 10 nt to over 3 kilo-nt. The mean density of introns varies widely, from over 4 introns per kilo-nt of protein-coding sequence in the most intron-dense nuclear genomes (including those of vertebrates and vascular plants), to 0.04 in the yeast Saccharomyces cerevisiae.

Group I and group II introns are collectively known as ‘self-splicing’ introns, because the intron RNA plays a primary role in the biochemistry of splicing, in some cases being sufficient for splicing in vitro. Group I introns are the most broadly distributed mobile elements known, being found in the genomes of eubacteria and their phages, as well as in the nuclear, mitochondrial, and chloroplast genomes of eukaryotes. Group II introns are known in eukaryotic organellar (but not nuclear) genomes, as well as in eubacterial chromosomes and plasmids. Though common in some organellar genomes, self-splicing introns are extremely rare elsewhere, and seem to be entirely absent from most prokaryotic genomes as well as many eukaryotic nuclear genomes.

Role in Gene Expression

In most cases, introns appear to be dispensable. Introns can be removed entirely from mitochondria of S. cerevisiae without obvious ill effect. Nevertheless, in a variety of cases, introns and splicing figure importantly in development. The delay caused by the transcription and splicing of a gene with many long introns can be important (e.g., the knrl gene of Drosophila). The intron may contain within itself some other feature: a DNA regulatory site (e.g., a promoter or enhancer), a structural RNA (e.g., intron-encoded snoRNAs in eukaryotic nuclear genomes), or a protein-coding region (e.g., intron-encoded maturases in organellar group I and II introns and homing endonucleases in bacteriophage introns). Splicing may join parts of two different RNA transcripts, a process known as ‘trans-splicing’ that is common in trypanosomes but rare or absent in most other organisms. Finally, the pattern of splicing of a single transcript may be variable, such that different mRNAs, and different protein products, are produced from the same pre-mRNA. Regulation of such ‘alternative splicing’ schemes plays a crucial role in sex determination in Drosophila. The frequency and importance of alternative splicing in most species is not well understood.

Mutation and Evolution

Introns are passively subject to the same mutational lesions that affect other genomic sequences; in some cases they contribute actively to the mutational process as mobile elements. Nucleotide substitutions that alter splicing have been implicated in many heritable diseases in humans. Such changes usually map to within a few nt of a splice junction. Over evolutionary time-scales, the internal sequences of spliceosomal introns diverge rapidly (by nucleotide substitutions as well as by short insertions and deletions), presumably because the demands of splicing impose no constraint on most internal sites. By contrast, group I and II introns evolve more slowly, and are densely packed with sequences that participate in splicing and mobility.

Rearrangement mutations involving introns also occur, sometimes based on recombination between
repetitive elements within introns. In animal genomes, intron-mediated rearrangements have contributed importantly to the evolution of novel chimaeric genes by so-called ‘exon shuffling.’ On the scale of millions to hundreds of millions of years, homologous genes may diverge by loss and gain of introns. Loss of an intron may occur by way of reverse transcription and recombinational reincorporation of a spliced gene product. Insertion of introns by transposition has been observed experimentally for group I, group II, and spliceosomal introns. For group I and II introns, ‘homing’ to (intronless) allelic sites is also observed.

See also: Eukaryotic Genes; Pre-mRNA Splicing

Invariants, Phylogenetic

W Fitch
Copyright © 2001 Academic Press
doi: 10.1006/rwgn.2001.0710

Phylogenetic invariants is a method first proposed by Lake (1987). The ‘invariants’ derive from the fact that the addition and subtraction of the numbers of certain nucleotide distribution patterns are expected to remain constant (at zero) for all incorrect phylogenies, and thus can be used to distinguish among alternative phylogenetic trees. It is a property that is used on nucleotide sequences taken four at a time. For example, suppose that we had four such sequences that are homologously aligned from left to right, one under the other:

...AGA...
...AGT...
...CTT...
...CTA...

so that for any position in the alignment the four nucleotides produce a (vertical) pattern such as AACC. This might suggest that the first two sequences are sister sequences, meaning that they are more closely related to each other than either of them is to the other two sequences (see Figure 1A). There are 256 possible patterns but some of them carry the same information. For example, the same relationship would be inferred if the pattern were GGTT. The method restricts itself to positions that have exactly two purines (A and/or G) and two pyrimidines (C and/or T) in their pattern, as all the examples used here do. Their relationship is shown by the tree in Figure 1A, where the arrowhead indicates that only a single transversion mutation is required to explain the observed nucleotides at the tips of this tree. (A transversion is a historical change from a purine to a pyrimidine or from a pyrimidine to a purine; all other interchanges are called transitions.)

On the other hand, a pattern such as ACCA would suggest that sequences 1 and 4 were sisters rather than sequences 1 and 2 (see Figure 1B). The two relationships (trees) cannot both be true, but if sequences 1 and 2 really are the true sister sequences, then this third pattern can only have arisen by virtue of two transversions having occurred during the history of these sequences (see Figure 1C).

However, we can estimate how often the misleading case in Figure 1C arises. Note that in Figure 1D we have shown only three of the four nucleotides in the pattern. What could the fourth nucleotide be? As we only consider those patterns with two purines and two pyrimidines, it must be a pyrimidine. Which one? If we assume that there is no bias as to which nucleotide the mutation is to, then it can be either C (as in Figure 1C) or T (as in Figure 1E) with equal probability. But that means that, for the wrong tree, the number of occurrences of a pattern like that in Figure 1C should be the same as the number for the pattern like that in Figure 1E. Hence, subtracting those two numbers should give a number not statistically different from zero for the two tree structures that are wrong. (The third possible tree is for the pattern ACAC, which suggests that sequences 1 and 3 are sisters.)

There are more details to the method but the preceding gives the spirit of it. It is a method that is guaranteed to give the correct answer given sufficient lengths of the sequences being compared. This virtue, however, is more than offset by the answer to the question of how long the sequences must be to get that correct answer. It turns out that the sequences

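[Figure: five four-sequence tree diagrams, panels A to E. Tip labels (sequence number and nucleotide): (A) 1A, 2A, 3C, 4C; (B) 1A, 2C, 3C, 4A; (C) 1A, 2C, 3C, 4A; (D) 1A, 2C, 3?, 4A; (E) 1A, 2C, 3T, 4A. Internal-node labels and mutation arrowheads are not recoverable from the extracted text.]

Figure 1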
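
The counting idea described in this entry can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Lake's full procedure: the names `classify_column`, `invariant_scores`, and `PAIRINGS` are hypothetical, no statistical test is performed, and the way columns in which both like-with-like pairs differ (e.g., AGCT) are added to the "balanced" side is an assumption based on the commonly cited form of the invariant, which the text above does not spell out.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

# The three ways of pairing four sequences (0-based indices),
# one for each possible unrooted tree.
PAIRINGS = {
    "(1,2)|(3,4)": ((0, 1), (2, 3)),
    "(1,3)|(2,4)": ((0, 2), (1, 3)),
    "(1,4)|(2,3)": ((0, 3), (1, 2)),
}


def classify_column(column):
    """Classify one alignment column of four nucleotides.

    Returns (pairing_name, n_differing_pairs), or None when the column
    is not used (gaps, ambiguity codes, or not exactly two purines and
    two pyrimidines).
    """
    column = tuple(base.upper() for base in column)
    if len(column) != 4 or any(base not in BASES for base in column):
        return None
    is_purine = [base in PURINES for base in column]
    if sum(is_purine) != 2:
        return None
    for name, (p, q) in PAIRINGS.items():
        # Exactly one pairing groups purine with purine and
        # pyrimidine with pyrimidine.
        if is_purine[p[0]] == is_purine[p[1]]:
            differing = sum(column[i] != column[j] for i, j in (p, q))
            return name, differing
    return None


def invariant_scores(seqs):
    """Return, for each pairing, balanced minus unbalanced column counts.

    Balanced: both pairs identical (like ACCA) or, by assumption,
    both pairs different (like AGCT). Unbalanced: exactly one pair
    differs (like ACTA). For a wrong tree the difference is expected
    to be near zero when the sequences are long enough.
    """
    counts = Counter()
    for column in zip(*seqs):
        result = classify_column(column)
        if result is not None:
            counts[result] += 1
    scores = {}
    for name in PAIRINGS:
        balanced = counts[(name, 0)] + counts[(name, 2)]
        unbalanced = counts[(name, 1)]
        scores[name] = balanced - unbalanced
    return scores


# The three-column alignment used as the example in the text.
alignment = ["AGA", "AGT", "CTT", "CTA"]
print(invariant_scores(alignment))
```

With only three columns the scores carry no statistical weight; the example just shows which columns are counted for which tree. As the entry notes, real use requires long sequences before the differences for the wrong trees settle near zero.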
