Molecular Systematics of Fishes
Ebook, 922 pages, 9 hours

About this ebook

Sequenced biological macromolecules have revitalized systematic studies of evolutionary history. Molecular Systematics of Fishes is the first authoritative overview of the theory and application of these sequencing data to fishes. This volume explores the phylogeny of fishes at multiple taxonomic levels, uses methods of analysis of molecular data that apply both within and between fish populations, and employs molecule-based phylogenies to address broader questions of evolution. Targeted readers include ichthyologists, marine scientists, and all students, faculty, and researchers interested in fish evolution and ecology and vertebrate systematics.
  • Focuses on the phylogeny and evolutionary biology of fishes
  • Contains phylogenies of fishes at multiple taxonomic levels
  • Applies molecule-based phylogenies to broader questions of evolution
  • Includes methods for critique of analysis of molecular data
Language: English
Release date: Jul 10, 1997
ISBN: 9780080536910
    Book preview

    Molecular Systematics of Fishes - Thomas D. Kocher

    Canada

    Preface

    Fishes are the most diverse group of extant vertebrates, and yet our knowledge of the evolutionary relationships among them is largely incomplete. Over the past few years, molecular genetic methods, particularly PCR amplification and DNA sequencing, have become widely used to study the evolutionary history of fishes. Because of the strong tradition of morphological systematics of fishes, this group is uniquely suitable for testing and evaluating the efficacy of different approaches to elucidating the relationships among taxa.

    This book surveys the use of these new methods at many taxonomic levels, from the structure of local populations to the relationships among the deepest branches of the piscine family tree. The authors bring a diversity of experience and approaches to their analyses, and the result is a collective evaluation of the utility of these techniques for understanding evolutionary patterns and processes. Although this book focuses on fishes, the conclusions should be broadly applicable to the molecular systematics of other groups.

    We thank the authors for seeing this project through to completion. We are indebted to a host of anonymous individuals for constructive critical reviews of each chapter in manuscript form. In an increasingly busy world, it was a delight to see that many careful reviewers are still willing to take the time to coax a higher quality manuscript from their colleagues. In addition to these reviewers, Raymond R. Wilson, Joseph E. Faber, Allyson N. Hubers, Mark D. Chandler, Rachel A. Bartholomew, Rachael A. Callcut, and Gary R. Kutsikovich reviewed the entire volume at various stages. We owe special thanks to Rachel A. Bartholomew and Rachael A. Callcut for helping to prepare the indices, Karen L. Carleton for work on the references, and Craig Albertson for the artwork on the cover jacket.

    Our work on the molecular systematics of fishes has been generously funded by grants from the National Science Foundation, the Alfred Sloan Foundation, the National Geographic Society, the National Research Council, and the NOAA Sea Grant Program. We especially thank our families and students for their patience and understanding during the many periods that our work has required us to be elsewhere—in body or in thought.

    This volume is dedicated to our mentors (especially Richard Rosenblatt, David Hillis, Allan Wilson, and Jeff Mitton) who encouraged, critiqued, and shaped our ideas in molecular systematics. We hope that this volume will contribute to the preservation of fish species so that future generations will be able to wonder at the beauty and diversity of fishes in their natural habitats.

    Thomas D. Kocher,     University of New Hampshire

    Carol A. Stepien,     Case Western Reserve University

    CHAPTER 1

    Molecules and Morphology in Studies of Fish Evolution

    CAROL A. STEPIEN,      Department of Biology Case Western Reserve University, Cleveland, Ohio 44106

    THOMAS D. KOCHER,      Department of Zoology, University of New Hampshire, Durham, New Hampshire 03824

    I. Introduction

    Fishes are the most diverse group of living vertebrates, with more than 24,600 extant species currently known (Nelson, 1994). For more than a century, systematists have sought to organize this diversity by studying aspects of their external and internal morphology. Their patient counting and dissection have achieved remarkable success in identifying groups of evolutionarily related species and provide the foundation and starting point for all current work on the systematics of fishes (for summaries of present status of morphological systematics of fishes see Nelson, 1994; Stiassny et al., 1996).

    The development of molecular techniques has helped invigorate studies of fish systematics. The range of methods developed for molecular systematics (Hillis et al., 1996; Ferraris and Palumbi, 1996) offers new suites of characters for analyzing relationships among fishes (Carvalho and Pitcher, 1995), and these methods have been effectively applied from the level of populations to that of orders. We hope that this book illustrates the broad utility of molecular approaches for addressing questions in fish systematics.

    Morphological studies have been especially successful in defining species and in organizing these species into genera. These groupings have usually been confirmed when examined with molecular approaches. Molecular characters have revealed some cryptic species (reviewed by Avise, 1994) and identified some incorrectly split groups (e.g., species in the clinid kelpfish genus Gibbonsia by Stepien and Rosenblatt, 1991; Stepien et al., Chapter 15). In general, the overall concordance between morphological and molecular studies has been good. Testing for congruence of relationships derived from independent data sets is a particularly robust approach to systematic problems (Miyamoto and Fitch, 1995).

    Although morphological studies have generally been successful in defining genera, it is rare to find studies which present a hypothesis of relationship above the level of the species comprising a genus, primarily due to a lack of congruence of characters. Fortunately, this is one of the strengths of molecular data, and inter- and intrageneric relationships are now being rapidly tested and elucidated. Molecular data are also the primary means used to assess the phylogeographic relationships among populations, examining questions of zoogeographic subdivision and relationships among areas (see Chapter 5 by Nielsen et al., Chapter 8 by Bermingham et al., and Chapter 9 by Faber and Stepien). Studies at these lower systematic levels are shedding more light on the mechanisms underlying the diversity of fishes.

    Both morphological and molecular studies have had particular difficulty discerning higher-level relationships. In both types of data, the central problems are identifying homologous characters and finding a sufficient number of synapomorphies to identify lineages with statistical confidence. Although great strides have been made in identifying appropriate molecules and refining analytical techniques, interpreting relationships among the deepest clades of the piscine phylogeny is still problematic.

    This book is arranged in approximate order of the primary phylogenetic problems addressed, ranging from lower-level (relationships among populations and closely related species) to higher-level systematic questions. The first set of chapters focuses primarily on population- and species-level problems in relation to phylogeography and includes Chapter 3 by Kornfield and Parker (mbuna species flock), Chapter 4 by Sültmann and Mayer (cichlid adaptive radiation), Chapter 5 by Nielsen et al. (Pacific trout Oncorhynchus), Chapter 6 by Wiley and Hagen (sand darters Ammocrypta), Chapter 7 by Sturmbauer et al. (cichlids), and Chapter 8 by Bermingham et al. (biogeographic patterns involving fishes of the Panamanian Isthmus). The next set of chapters addresses the resolving power of DNA for testing middle-level systematic problems (species through family-level questions) and for discriminating among morphology-based hypotheses, including Chapter 9 by Faber and Stepien (Percidae), Chapter 10 by Phillips and Oakley (Salmoninae), Chapter 11 by Parker (Cyprinodontiformes), and Chapter 12 by Bernardi (Fundulidae, Cyprinodontiformes). The final set of chapters focuses on the resolving power of genes for higher-level systematic questions and on evaluating the level of maximum phylogenetic utility. These include Chapter 13 by Naylor et al. (lamniform sharks), Chapter 14 by Orti (Characiformes), Chapter 15 by Stepien et al. (Blennioidei), Chapter 16 by Klein et al. (Cichlidae), and Chapter 17 by Lydeard and Roe (Actinopterygii).

    II. History of Molecular Techniques

    An increasingly sophisticated realm of techniques has been developed since the mid-1970s to study the molecular similarities of organisms. Although preceded by protein sequencing and immunology, the widespread use of molecular techniques in fish systematics really began with the discovery of allozyme polymorphisms.

    A. Allozyme Studies

    Allozyme/isozyme studies involve identifying protein polymorphisms by comparing their similarities and differences in net electric charge. Allozyme and isozyme studies have been one of the most popular approaches in examining population genetic and stock divergence questions in fishes. They have also been especially useful in identifying cryptic species and in testing biogeographic hypotheses. Allozyme/isozyme electrophoresis has the advantage of being relatively rapid, cost effective, and efficient. Another advantage is that the sampling is spread over a variety of presumably independent gene loci. The chief disadvantage of using an allozyme approach is that bands (alleles) that have the same electric charge and migrate to the same point in the gel may not be homologous (i.e., evolutionary convergence). The scoring of gels is often somewhat subjective, and bands are difficult to interpret when weak or close together. Variants have traditionally been assumed to be selectively neutral, enabling hypotheses of separation time to be tested. However, several studies have shown that some allozyme variants are not neutral markers and are under selection (Avise, 1994; Pogson et al., 1995; Powers and Schulte, 1996). Our view is that increasing evidence shows that most (if not all) putatively neutral genetic markers, including allozymes, mtDNA, and microsatellites, are in fact subject to varying amounts of selective constraint. The possibility that loci are under selection does not eliminate their utility in systematics, however. For example, morphologists regularly utilize characters that are the products of selection. In this volume, Nielsen et al. (Chapter 5; Salmonidae) and Stepien et al. (Chapter 15; Blennioidei) examine the congruence of hypotheses derived from allozyme data with other molecular data sets.

    B. Mitochondrial DNA

    The mitochondrial (mt) genome has many properties that make it useful for reconstructing recent phylogenetic history (reviewed by Wilson et al., 1985; Avise, 1994; Simon et al., 1994). The most important feature is its clonal inheritance. Fish mitochondrial genomes are haploid and apparently nonrecombining. The evolution of the molecule therefore corresponds exactly to the model of bifurcating evolutionary trees. Second, mtDNA evolves more quickly than most nuclear genes, allowing the identification of informative phylogenetic characters among even closely related species and populations.

    Two other features of mtDNA are typically listed as advantages for phylogenetic analysis, but both need qualification. First, mtDNA is said to be maternally inherited. Although inheritance is predominantly maternal, several instances of heteroplasmy involving distinct mitochondrial lineages suggest that this is not strictly or universally correct (Magoulas and Zouros, 1993). Second, substitutions in mtDNA have been assumed to accumulate according to a strictly neutral process, an assumption that may no longer be appropriate. Patterns of sequence differentiation suggest that selective sweeps may be common (Ballard and Kreitman, 1994), and laboratory experiments have suggested competitive differences among mitochondrial haplotypes (Hutter and Rand, 1995). Whether these departures from neutral evolution invalidate the concept of molecular clocks remains to be seen.

    Many studies of mtDNA have analyzed restriction fragment length polymorphisms (RFLPs). Whole mtDNA can be digested with specific endonucleases, and the products are then separated by size using gel electrophoresis. In the most comprehensive studies, restriction sites are mapped and their presence or absence (rather than mere sharing of fragment lengths) is scored (Dowling et al., 1990). RFLP studies have been a popular approach in quantifying the degree of divergence within and among populations. In applying this approach to species and higher-level systematic questions, the homology of restriction site characters becomes less certain. A better approach for these comparisons involves direct analysis of DNA sequences.
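
    The difference between scoring fragment lengths and scoring mapped restriction sites can be illustrated with a small in-silico digest. The sketch below is only an illustration, not a method used in this volume: the two short sequences are hypothetical, the function names are our own, and it assumes the sequences are already aligned without insertions or deletions so that site coordinates are comparable. It uses the EcoRI recognition sequence GAATTC, which is cut after the first G, and does not detect sites that span the origin of a circular molecule.

```python
import re

def restriction_sites(seq, site):
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    # A zero-width lookahead also finds overlapping occurrences.
    pattern = "(?=" + re.escape(site.upper()) + ")"
    return [m.start() for m in re.finditer(pattern, seq.upper())]

def fragment_lengths(seq, site, cut_offset, circular=True):
    """Fragment lengths (largest first) after complete digestion.

    cut_offset is the position of the cut inside the recognition site,
    e.g. 1 for EcoRI (G^AATTC).
    """
    cuts = [p + cut_offset for p in restriction_sites(seq, site)]
    if not cuts:
        return [len(seq)]
    if circular:
        # n cuts in a circular molecule give n fragments.
        frags = [b - a for a, b in zip(cuts, cuts[1:])]
        frags.append(len(seq) - cuts[-1] + cuts[0])
    else:
        bounds = [0] + cuts + [len(seq)]
        frags = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(frags, reverse=True)

def site_matrix(haplotypes, site):
    """Score presence (1) or absence (0) of each mapped site per haplotype."""
    found = {name: set(restriction_sites(s, site)) for name, s in haplotypes.items()}
    positions = sorted(set().union(*found.values()))
    matrix = {name: [int(p in s) for p in positions] for name, s in found.items()}
    return positions, matrix

# Hypothetical haplotypes: B has lost the second EcoRI site (GAATTC -> GAGTTC).
hap_a = "AAGAATTCAAAAGAATTCAA"
hap_b = "AAGAATTCAAAAGAGTTCAA"
positions, matrix = site_matrix({"A": hap_a, "B": hap_b}, "GAATTC")
```

    Scoring the matrix of mapped sites records one character change between the two haplotypes, whereas their fragment profiles differ in every band, which is why site data are the safer basis for comparison.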

    C. Polymerase Chain Reaction and DNA Sequencing

    Until the development of the polymerase chain reaction (PCR) (Saiki et al., 1988), sequencing of genes for phylogenetic analysis was rarely performed because of the huge investment required to clone homologous genes from multiple samples. The introduction of primer sequences with wide phylogenetic utility (universal primers; e.g., Kocher et al., 1989) allowed the rapid amplification of particular sequences from a large number of samples and helped create an explosion of studies using DNA sequences to examine phylogenetic questions.

    DNA sequence data have a number of inherent advantages over other kinds of systematic data. First, an essentially unlimited number of sequence characters are potentially available. Fish genomes typically contain on the order of a billion nucleotide pairs, each of which is potentially informative for phylogenetic analysis. Second, these characters are useful for studying relationships among both close and distant relatives. Each gene, as well as individual sites within a gene, evolves at a unique rate because of variation in the level of functional constraint. Slowly evolving genes such as nuclear 18S rDNA may be useful for discerning relationships among highly divergent groups (Hillis and Dixon, 1991). More rapidly evolving areas, such as the mtDNA control region, may be useful for discerning lower-level systematic relationships, such as among populations and species, as shown for percid relationships in the study by Faber and Stepien (Chapter 9). In coding regions, the variation in DNA sequences may be evaluated among first, second, and third codon positions and at the amino acid level in order to increase potential phylogenetic utility at higher systematic levels. The relative strength of the phylogenetic signal among codon positions and between the nucleotide and amino acid levels is critically evaluated by Naylor et al. (Chapter 13) and Lydeard and Roe (Chapter 17).
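
    Evaluating variation separately at first, second, and third codon positions amounts to partitioning the columns of an in-frame alignment. The following minimal sketch shows one way this could be done; the function name and the three short sequences are hypothetical, and it assumes the alignment starts at a first codon position and contains no frame-shifting gaps.

```python
def variable_sites_by_codon_position(alignment):
    """Count variable alignment columns at codon positions 1, 2, and 3.

    alignment: list of equal-length coding sequences, in frame.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to the same length")
    counts = {1: 0, 2: 0, 3: 0}
    for i in range(length):
        # Ignore gaps and ambiguous bases when deciding whether a site varies.
        states = {seq[i].upper() for seq in alignment} - {"-", "N", "?"}
        if len(states) > 1:
            counts[i % 3 + 1] += 1
    return counts

# Hypothetical three-codon alignment.
example = ["ATGGCTAAA", "ATGGCCAAA", "TTGGCAAAG"]
counts = variable_sites_by_codon_position(example)
```

    In this toy alignment most of the variation falls at third positions, the pattern expected when those sites are under the weakest functional constraint.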

    D. Mitochondrial DNA Sequence Regions

    Mitochondrial DNA regions have been well studied in fishes, and knowledge of universal primer sequences (e.g., Kocher et al., 1989; Meyer et al., 1990; Simon et al., 1994; Palumbi, 1996) for amplification by PCR and sequencing has made them very accessible. As illustrated in this volume, they can be effectively used to address many different levels of taxonomic questions, depending on the region sequenced and the use of various correction factors for types and positions of substitutions. Silent sites of mitochondrial protein-coding genes and the nontranscribed control region are shown to be particularly useful for analyzing relationships of recently diverged taxa, such as among populations, species, and genera. In the case of higher-level systematic questions, silent sites and rapidly evolving regions may have experienced multiple substitutions, obscuring phylogenetic signal. At higher taxonomic levels, more slowly evolving regions, such as the 12S and 16S ribosomal RNA genes, may be useful. Alternatively, because substitutions in nonsynonymous nucleotide sites (which alter the encoded amino acids) occur more rarely, these changes may provide a higher signal/noise ratio for deep comparisons.

    The sequence evolution of mtDNA has been relatively well studied in fishes. Base substitution events occur relatively rapidly. MtDNA structure, gene order, and secondary structure are largely conserved in fishes, as well as in other vertebrates. It is inherited as a single unit and thus has been characterized as sampling a single gene, which is a possible disadvantage that may particularly affect population genetic studies. Because the evolutionary history of a single gene can be different from the average history of an entire genome (discussed by Avise, 1994), caution must be used in interpreting mitochondrial gene trees as reflecting the history of populations.

    The cytochrome b gene is probably the best-studied mitochondrial gene in fishes (e.g., Kocher et al., 1989; Meyer et al., 1990; Carr and Marshall, 1991; Block et al., 1993; Zhu et al., 1994; Carr et al., 1995). Like most mitochondrially encoded proteins, it is a transmembrane protein important in the respiratory chain of cellular metabolism. Although it has been widely used, some have questioned the ability of this sequence (especially short subsets of the gene) to resolve phylogenies (Martin et al., 1990; Graybeal, 1993; Meyer, 1994). In this volume, mtDNA sequences from the cytochrome b gene are used to analyze a variety of levels of relationships ranging from population genetics to higher-level systematics. For example, Bermingham et al. (Chapter 8) use cytochrome b data to assess population genetic and phylogeographic questions in tropical damselfishes of the Abudefduf saxatilis species group. Cytochrome b sequences are used to analyze relationships among species and groups of sand darters (family Percidae) (Wiley and Hagen, Chapter 6), among species of salmonids (Phillips and Oakley, Chapter 10), among members of the family Fundulidae (Cyprinodontiformes) (Bernardi, Chapter 12), and among lamniform sharks (Naylor et al., Chapter 13). At higher taxonomic levels, Lydeard and Roe (Chapter 17) test the use of cytochrome b to analyze relationships among actinopterygian fishes, revealing strong phylogenetic signal. By examining their data using different codon positions, Lydeard and Roe achieve greater utility at higher taxonomic levels than does Bernardi (Chapter 12).

    Mitochondrial ribosomal genes (12S and 16S rDNA subunits) are often used to study more distantly related taxa. Substitutions in the small subunit (12S) accumulate relatively slowly, approximating the average for the entire mitochondrial genome, whereas those in the large subunit (16S) accumulate even more slowly (Simon et al., 1994). The 12S rDNA gene is used by Stepien et al. (Chapter 15) to examine relationships among species, genera, tribes, families, and suborders of blennioid fishes, showing strong utility at these different levels and congruence with morphology-based hypotheses. Stepien et al. (12S; Chapter 15), Orti (12S and 16S, characiform fishes; Chapter 14), and Parker (16S, Cyprinodontiformes; Chapter 11) evaluate differences in the amount of phylogenetic signal among stem and loop regions of the ribosomal genes, reporting a greater retention of the phylogenetic signal at higher taxonomic levels in the more slowly evolving stem regions and more useful characters at lower taxonomic levels in the more rapidly changing loop regions.

    The mtDNA control region is involved in the control of mtDNA replication and RNA transcription. It is also called the displacement loop (D-loop) because one of the two strands of the helix is displaced by the synthesis of a new strand during replication. The highly variable left domain region has been believed to be largely selectively neutral, which may account for its very rapid rate of variation. In fishes, the control region is usually long (e.g., 888 to 1223 bp in percids; Faber and Stepien, Chapter 9) and often contains tandemly repeated segments. There is a set of conserved sequence blocks that are probably involved in controlling mtDNA replication and transcription, which may be useful for some systematic studies (see Attardi, 1985; Lee et al., 1995; Faber and Stepien, Chapter 9).

    The highly variable control region has thus been a popular sequence for examining population structure and relationships among closely related species of fishes (e.g., Meyer et al., 1990; Arnason and Rand, 1992; Sturmbauer and Meyer, 1992, 1993; Brown et al., 1993; Stepien, 1995; Lee et al., 1995). In this volume, Sturmbauer et al. (Chapter 7) employ sequence data from the control region to address phylogenetic questions and models of adaptive radiation and biogeography of cichlid fishes in Lake Tanganyika, Africa. Nielsen et al. (Chapter 5) utilize control region variation to discern patterns of geographic structure in the Pacific trout Oncorhynchus mykiss. The utility of control region sequences for discerning higher-level relationships is critically evaluated by Phillips and Oakley (Chapter 10) and by Faber and Stepien (Chapter 9). Although some areas of this rapidly evolving sequence are alignable even among distantly related fishes (see Lee et al., 1995), the high rate of evolution of this sequence appears to preclude analyses beyond the level of closely related species and perhaps genera.

    E. Nuclear DNA Sequences

    Several nuclear DNA regions have been used to address systematic questions among fishes. One of these is the major histocompatibility complex (MHC) used by Klein et al. (Chapter 16) to examine evolutionary hypotheses of the haplochromine flock of cichlids in Lake Victoria, East Africa. MHC molecules are believed to play a central role in the vertebrate immune system by presenting peptides to T lymphocytes, thereby initiating immune response cascades. Because MHC molecules are well known due to their role in the immune system and are highly variable, they also offer a wealth of potential systematic information. There are two classes of MHC molecules (I and II), which each consist of two polypeptide chains (a and b), but differ in structure and function (Bjorkman and Parham, 1990). Klein et al. (Chapter 16) use examples from classes I and II to test phylogenetic utility among recently diverged fish species as well as at higher phylogenetic levels. They also address whether selection causes sequence and allele frequency convergence in MHC genes.

    Stepien et al. (Chapter 15) compare sequence-based trees of blennioid fishes derived from the nuclear internal transcribed spacer (ITS)-1 region of the ribosomal array (Stepien et al., 1993) with trees produced from mitochondrial 12S rDNA gene sequences. A much greater number of variable characters is obtained from the mtDNA 12S gene than from the nuclear ITS-1 region (Stepien et al., 1993), suggesting that nuclear ITS sequences are best used for studying deeper divergences. In contrast, Phillips and Oakley (Chapter 10) find nuclear rDNA spacers to be most useful at lower taxonomic levels (interspecific and subspecific levels). These results suggest that the ITS-1 region may evolve at different rates in different fish groups. Other chapters explore the utility of new genes for phylogenetic analysis. Parker (Chapter 11) tests the relative degree of phylogenetic signal among first, second, and third codon positions of the nuclear tyrosine kinase gene X-src sequences for resolving relationships among the cyprinodontid killifishes. Orti (Chapter 14) compares nuclear DNA sequences from the protein-coding gene ependymin (a major glycoprotein component of the extracellular fluid in the brain of fishes) with mitochondrial 12S and 16S rDNA sequences to test the evolution of characiform fishes at various hierarchical levels. Much work remains in identifying a standard set of nuclear genes for phylogenetic analysis of fishes.

    F. Other Nuclear Techniques

    The introduction of PCR opened other avenues for the analysis of genome sequences. We touch here on two popular methods: randomly amplified polymorphic DNAs (RAPDs) and microsatellite polymorphisms.

    The RAPD method primarily detects sequence changes within the annealing sites of PCR primers, resulting in the presence or absence of amplification products from a particular locus. RAPD polymorphisms usually have a pattern of dominant inheritance (Williams et al., 1990) and can be used to screen for differences among individuals, populations, and species. Sültmann and Mayer (Chapter 4) employ RAPDs to identify polymorphic loci in cichlid groups, followed by locus-specific DNA amplification and sequence determination of the fragments. In this way, they avoid problems with determining homology of fragments among species. They find a large number of insertions and deletions (some of which are species specific) that can be treated as characters along with nucleotide substitutions. Their phylogenies show considerable congruence with morphological hypotheses and other molecular studies. They conclude that RAPDs are able to detect polymorphisms among closely related taxonomic groups, ranging from populations to genera.

    Microsatellite DNAs are highly variable, tandemly repeated DNA sequences with unit repeats one to six bases in length. Length polymorphisms arising from variation in the number of repeats are quantified by sizing PCR-amplified copies of the locus on a polyacrylamide gel. Microsatellites are abundantly distributed throughout the nuclear genome and are highly polymorphic. They follow a Mendelian codominant inheritance pattern. Microsatellites have been widely used to analyze mating systems and population genetic structure (Queller et al., 1993), despite the fact that their pattern of mutation is still poorly understood (Jarne and Lagoda, 1996). In Chapter 5, Nielsen et al. examine the biogeographic variation of nuclear microsatellite repeats in Pacific trout, O. mykiss, in comparison with mtDNA control region sequences. Although their mtDNA data show significant latitudinal and longitudinal correlations, microsatellite data are only weakly associated with longitude (and not at all with latitude). These differences suggest that the evolutionary processes resulting in phylogeographic patterns of genetic variation differentially affect the mitochondrial and nuclear genomes. Kornfield and Parker (Chapter 3) test the utility of microsatellite loci for examining relationships within a rapidly evolving species flock (the mbuna of Lake Malawi), in comparison with results from allozyme, mtDNA RFLP, mtDNA sequence, nuclear DNA sequence, and RAPD data sets. They conclude that microsatellites are the first class of molecular markers to possess sufficient power to elucidate that level of evolutionary history. Sültmann and Mayer (Chapter 4) compare microsatellite allele size frequencies among cichlid species from Lake Victoria. In total, these results suggest that microsatellite loci are applicable to species- and population-level work in rapidly evolving groups, as exemplified by the adaptive radiations of the Cichlidae.

    G. A Look to the Future

    Although new kinds of polymorphisms will be identified as we come to understand the structure of genomes, there is some hope that the techniques used to study these polymorphisms have stabilized. Most investigators are now directly examining DNA sequence polymorphisms, the most fundamental unit of molecular variation. PCR and DNA sequencing will likely be the primary tools of molecular systematics in the foreseeable future. We anticipate that the major differences will be increases in length of sequence examined and the number of genetic loci scored.

    III. Controversy over Analytical Methods

    Systematic biology is well known for its vigorous and highly polarized methodological debates. Although much of the acrimony has subsided, strong proponents of distance and cladistic approaches remain. This polarization is strongly correlated with the type of data sets studied by individual scientists. Morphologists have generally rejected distance approaches. Molecular systematists appear relatively flexible in the approaches taken to recover phylogenetic relationships from their data and have found that the evolution of sequences is often most easily modeled with distance methods. Still, character-state analyses of molecular data abound, and we should be careful not to equate molecular studies with distance analyses or morphological studies with cladistic analyses.

    A. Cladistic Approaches

    The rise of cladistic methodology, as proposed by Hennig (1950, 1966) and popularized by Wiley (1981), has greatly contributed to the development of systematics from a collection of ad hoc procedures to a respectable science. Cladistics has markedly increased objectivity for interpreting the evolutionary history of characters and testing the relative strength of competing systematic hypotheses. This standard methodology has facilitated the comparison of hypotheses proposed by various investigators and support for different types of data sets. Examples of such comparisons occur in almost every chapter of this volume.

    B. Distance Approaches

    Along with the development of molecular techniques, such as allozyme–isozyme electrophoresis, emerged the use of genetic distances and clustering algorithms which describe the degree of similarity or genetic relatedness among pairs of taxa and summarize this information in a tree. Distance methods differ from cladistics in that they reduce the difference among each pair of taxa to a single number. Some workers argue that distance methods lose information inherent in the character-state matrix. Others argue that distance methods allow the evolution of the sequence to be more easily modeled. This allows accurate correction for unobserved multiple substitutions (homoplasy) in sequence data that is not possible with other methods. Like character-state methods, distance methods can be bootstrapped to evaluate the internal consistency of data. Recent theoretical work has focused on the calculation of standard errors of distances and branch lengths. Most types of distance trees are constructed with branch lengths that are proportional to the amount of divergence, making it possible to estimate relative times of separation.
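
    Bootstrapping, whether for distance or character-state methods, rests on resampling alignment columns with replacement and repeating the analysis on each pseudoreplicate. The sketch below shows only the resampling step, under the assumption that the alignment is held as a dictionary of equal-length strings; the function name and the toy alignment are hypothetical, and the tree-building step applied to each replicate is left out.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudoreplicate of an alignment.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    rng: a random.Random instance, so that results can be reproduced.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    # Draw column indices with replacement; the same columns are used for every taxon.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}

# Hypothetical toy alignment.
toy = {"taxon1": "ACGTACGT", "taxon2": "ACGTACGA", "taxon3": "ACGAACGA"}
replicate = bootstrap_alignment(toy, random.Random(1))
```

    Support for a grouping is then commonly summarized as the proportion of pseudoreplicate trees in which that grouping appears.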

    C. Distance Corrections, Weighting, and Clustering

    Genetic distances may be corrected for the effects of multiple substitutions per site. Methods for correcting these include the Jukes-Cantor equation (Jukes and Cantor, 1969), which uses a Poisson model to calculate the probabilities of multiple substitutions, assuming equal probability of the type of substitution, no nucleotide bias (same proportions of G, A, T, and C), and that all sites along a sequence have an equal probability of change. Because some or all of these assumptions are violated by most DNA sequence data sets, additional correction factors are often used. The Kimura two-parameter method (Kimura, 1980) allows differential weighting of transition and transversion probabilities. Tamura and Nei’s (1993) distance correction is based on the gamma distribution and corrects for nucleotide frequency differences, transition : transversion biases, and variation of substitution rate among different sites. Gamma distances are discussed at length by Kocher and Carleton (Chapter 2). Kumar et al. (1993) suggest that if various distance correction methods give similar results, then the simplest possible model should be used in order to minimize variance of the estimates. They suggest using the Jukes-Cantor or simple pairwise distances in cases when genetic distances are low, as long as substitution rates do not vary among lineages.
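
    The two simplest corrections named above can be written in a few lines. The sketch below computes the uncorrected proportion of differing sites (p), the Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3), and the Kimura two-parameter distance d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions. The function names and the example sequences are hypothetical, sites with gaps or ambiguous bases are simply skipped, and no gamma correction for rate variation among sites is attempted.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_differences(a, b):
    """Return (sites compared, transitions, transversions) for two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    n = ts = tv = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        n += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ts += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            tv += 1
    return n, ts, tv

def p_distance(a, b):
    n, ts, tv = count_differences(a, b)
    return (ts + tv) / n

def jukes_cantor(a, b):
    p = p_distance(a, b)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)

def kimura_2p(a, b):
    n, ts, tv = count_differences(a, b)
    P, Q = ts / n, tv / n
    if 1 - 2 * P - Q <= 0 or 1 - 2 * Q <= 0:
        raise ValueError("sequences too divergent for the Kimura correction")
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)

# Hypothetical pair: 20 sites, two transitions (A/G) and one transversion (C/A).
seq1 = "AAAAAAAAAACCCCCCCCCC"
seq2 = "GGAAAAAAAACCCCCCCCCA"
```

    For this pair the uncorrected distance is 0.15, and both corrections give somewhat larger values because they add an estimate of the substitutions hidden by multiple hits at the same site.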

    Differential weighting of characters has been widely discussed (Wheeler, 1986; Swofford et al., 1996). It is clear that data for different nucleotide positions in coding regions, i.e., first, second, and third codon positions, should be analyzed separately because of their distinct patterns of selective constraint. Weighting is a relatively crude way to correct for the variation in rate among sites in noncoding sequences, especially as the pattern of selective constraint for these sequences is poorly understood. Weighting has also been used to model the relative frequency of different types of nucleotide substitution in parsimony analyses (Fitch and Ye, 1991). The advantage of this approach relative to the use of an appropriate distance method is not clear.

    Clustering algorithms have greatly improved in recent years. Neighbor joining (Saitou and Nei, 1987) is a widely used distance clustering algorithm that allows unequal rates of divergences among lineages. It is no longer necessary (or desirable) to assume that rates of sequence change are constant throughout a phylogeny.
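
    A compact sketch of neighbor joining may help show why the method does not require equal rates. It follows the commonly described formulation in which the pair minimizing the rate-corrected criterion Q is joined at each step; the data structure (distances keyed by unordered pairs), the internal node labels, and the five-taxon distance matrix are choices made for this illustration only.

```python
def neighbor_joining(names, dist):
    """Build an unrooted tree from pairwise distances.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a list of (node, neighbor, branch_length) edges.
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    next_id = 0

    def get(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all the others.
        r = {a: sum(get(a, b) for b in nodes if b != a) for a in nodes}
        # Choose the pair that minimizes Q = (n - 2) * d(a, b) - r(a) - r(b).
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * get(a, b) - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        new = f"node{next_id}"
        next_id += 1
        d_ab = get(a, b)
        # Branch lengths may differ, so the two lineages need not share a rate.
        len_a = 0.5 * d_ab + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d_ab - len_a
        edges.append((a, new, len_a))
        edges.append((b, new, len_b))
        # Distances from the new internal node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[frozenset((new, c))] = 0.5 * (get(a, c) + get(b, c) - d_ab)
        nodes = [c for c in nodes if c not in (a, b)] + [new]

    a, b = nodes
    edges.append((a, b, get(a, b)))
    return edges


# Hypothetical additive distances among five taxa.
raw = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
dist = {frozenset(pair): float(value) for pair, value in raw.items()}
edges = neighbor_joining(["a", "b", "c", "d", "e"], dist)
for node, neighbor, length in edges:
    print(f"{node} -- {neighbor}: {length:.1f}")
```

    In this example taxa a and b are joined first with branch lengths of 2 and 3, so the two sister lineages are allowed to have diverged by different amounts from their common node. A clustering method that assumes a constant rate would instead force those two branches to be equal.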

    D. Molecular Clocks

    Use of molecular characters has also been associated with the assumption of a molecular clock, i.e., that mutations arise at relatively regular, predictable rates (Zuckerkandl and Pauling, 1962, 1965). Today, few if any proponents remain of a universal clock that ticks at a regular rate across all taxa. Still, most workers accept the idea of local clocks, i.e., that rates of evolution within a particular group are relatively similar. Clocks may be calibrated based on comparisons with taxa having known divergences, using well-corroborated geological events (such as the linkage of the Isthmus of Panama as a barrier between the Atlantic and Pacific aquatic fauna; see Vawter et al., 1980; Grant, 1987; Stepien and Rosenblatt, 1996; Chapter 8 by Bermingham et al.), or with the fossil record. Dating divergences to the fossil record is complicated by the fact that the actual divergence usually predates its first fossil appearance by an unknown amount of time. Problems with clock calibration are discussed by Bermingham et al. (Chapter 8) and by Stepien et al. (Chapter 15).
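
    The arithmetic of a local clock calibration can be sketched as follows. Because both lineages accumulate substitutions after they separate, the pairwise distance is divided by twice the divergence time to give a per-lineage rate. All numbers below, including the age assigned to the calibration event, are hypothetical placeholders and are not values taken from the studies cited.

```python
def calibrate_rate(distance, divergence_time_my):
    """Per-lineage rate (substitutions per site per million years)."""
    if divergence_time_my <= 0:
        raise ValueError("divergence time must be positive")
    return distance / (2.0 * divergence_time_my)


def estimate_divergence_time(distance, rate):
    """Divergence time in million years, assuming the calibrated rate applies."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Hypothetical calibration: a corrected distance of 0.06 between two taxa
# assumed to have been separated 3 million years ago by a geological barrier.
rate = calibrate_rate(distance=0.06, divergence_time_my=3.0)

# Apply that rate to another pair in the same group (hypothetical distance).
time_my = estimate_divergence_time(distance=0.10, rate=rate)
print(f"rate = {rate:.4f} substitutions/site/lineage/My")
print(f"estimated divergence = {time_my:.1f} My")
```

    The estimate is only as good as the calibration point and the assumption that the second pair evolves at the same rate as the first. If the calibration date is a minimum age, as with a first fossil appearance, the rate is overestimated and later divergence times are underestimated.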

    E. Combining Data and Testing for Congruence

    There are two primary schools of thought among systematic biologists regarding combining morphological and molecular data. The first is the total evidence approach (Mickevich and Johnson, 1976; Kluge and Wolf, 1993), which states that phylogenetic analysis should be performed on a combined data set using all possible evidence. The null hypothesis for this approach is that there are no significant differences or partitions within the data set, i.e., that there is only one evolutionary history for the clade in question. Huelsenbeck et al. (1996) raise the point that estimates from total evidence have less sampling error, because separate analyses of data partitions are based on fewer characters. It is advocated that total evidence tests should examine whether different sets of data have significantly different signals and that these possible partitions should be tested against the combined data set (de Queiroz, 1993; Bull et al., 1993; Ballard, 1996). The other school of thought states that data sets should be analyzed separately (see Bull et al., 1993; Miyamoto and Fitch, 1995). Relationships among taxa that are congruent in separate analyses are regarded as strongly supported. In other words, the congruence of data from separate sources (such as separate analyses using different genes, or between morphological and molecular data sets) indicates increased support that the relationships are likely to be true. Miyamoto and Fitch (1995) suggest that relationships among taxa that are supported by different independent data sets are particularly robust, equivalent to obtaining independent verification of an experimental hypothesis from a different experimental source. This independent type of verification may be lost in combining data sets.

    An explicit assessment of congruence versus total evidence approaches is discussed in Chapter 11 by Parker. Parker analyzes problems in systematics of the Cyprinodontiformes by combining morphological characters from Parenti (1981, 1984) along with molecular data, including the nuclear tyrosine kinase gene X-src (Meyer and Lydeard, 1993) and mt16S rDNA sequences (Parker and Kornfield, 1995). He evaluates the methodology for combining data sets and comparing trees, including T-PTP (Faith, 1991) and bootstrap tests (Rodrigo et al., 1993). His conclusions argue for the utility of both combination and congruence approaches.

    Many of the authors in this volume compare taxonomic congruence between molecular-based and morphological-based hypotheses (e.g., Chapter 9 by Faber and Stepien, Chapter 10 by Phillips and Oakley, Chapter 12 by Bernardi, and Chapter 17 by Lydeard and Roe). Phillips and Oakley (Chapter 10) compare results from morphological and molecular studies of salmonid relationships and conclude that morphological traits suggesting one clade are unreliable. Bernardi (Chapter 12) discerns considerable concordance between molecular data and the definition of subgenera, but is unable to resolve higher-level relationships within the family. Lydeard and Roe (Chapter 17) also find greatest concordance of the two types of data at the lowest levels of the taxonomic hierarchy.

    IV. Achievements and Failures of Molecular Systematics

    The greatest achievement of molecular systematics is the consistent and large set of characters generated for the analysis of phylogenies. The availability of these data has allowed the resolution of many intrageneric phylogenies that had not been previously addressed. Molecular studies have been spectacularly successful at the lowest taxonomic levels, particularly the analysis of relationships among populations or intraspecific phylogeography (Avise et al., 1987; see Chapters 3 through 9 of this volume). Molecular data offer an abundance of characters for studies at this level.

    Molecular studies have not yet fulfilled their promise for resolving deep relationships. There are two problems holding up progress in this area. First, it can become difficult to identify homology in highly diverged sequences. Alignment of characters becomes more difficult as the sequences diverge, particularly for hypervariable regions of rDNA genes. Hillis and Dixon (1991) have suggested that rDNA sequences beyond about 30% sequence difference should be discarded as unalignable. A better understanding of the relationship between rRNA structure and function would help in the identification of homologous sites. The second problem is saturation: the equilibrium value of sequence difference that is reached when multiple substitutions erase the record of previous substitutions at a site. For DNA sequence data there are only four nucleotide character states, G, A, T, and C; thus base substitutions at single nucleotide sites are often obscured by multiple substitutions at those sites (multiple hits). As with morphological data sets, apparent synapomorphies may be the result of homoplastic convergence rather than shared common ancestry. Saturation is apparent in many molecular systematic studies. Claims that a group of taxa radiated rapidly at some time in the past should be scrutinized. It may be that molecular data are saturated and therefore uninformative as to the timing of particular branching events. This problem may be lessened either by examining more slowly evolving sites or by considering the codon as the character (rather than the individual nucleotides; Goldman and Yang, 1994; see Chapter 13 by Naylor et al. and Chapter 17 by Lydeard and Roe). Further studies of mutational processes, and the selective forces underlying variation in rate among sites, are needed. Alternatively, new kinds of data, such as the analysis of positional data, may be needed. Patterns of SINE insertion (Murata et al., 1993) or the order of homologous loci (Boore et al., 1995) provide another approach for resolving deep relationships.
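
    Saturation can be illustrated numerically. The sketch below assumes the Jukes-Cantor model (equal base frequencies and equal rates for all substitution types), under which the expected observed proportion of differing sites rises toward an equilibrium of 0.75 as the true number of substitutions per site grows. Real sequences with biased composition or rate variation among sites may level off at different values, so the numbers are illustrative only.

```python
import math


def expected_p_distance(true_distance):
    """Expected observed proportion of differing sites under Jukes-Cantor."""
    return 0.75 * (1.0 - math.exp(-4.0 * true_distance / 3.0))


print("true substitutions/site   expected observed difference")
for d in (0.05, 0.25, 0.5, 1.0, 2.0, 5.0):
    print(f"{d:>23.2f}   {expected_p_distance(d):.3f}")
```

    At small distances the observed difference tracks the true distance closely. At large distances very different amounts of true change produce nearly identical observed differences, which is why saturated data carry little information about the order or timing of old branching events.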

    Molecular studies have also failed to resolve the phylogeny of some rapidly speciating groups. Even an accurate phylogeny of a gene may not be informative as to the relationships of the species under study. If the gene pools are isolated more rapidly than polymorphisms can be fixed in a lineage, then the reconstructed gene trees may not parallel the evolution of the species (Moran and Kornfield, 1993; Parker and Kornfield, 1997; Chapter 3 by Kornfield and Parker; Chapter 7 by Sturmbauer et al.). Instead, the polymorphisms may be carried through the speciation event and be randomly fixed in the descendant populations (see discussion by Avise, 1994). The solution of this problem may require brute force; the construction of many independent gene trees may uncover the relationships among populations.
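
    The scale of this problem can be sketched with a standard result from neutral coalescent theory for three species, which is an assumption brought in for illustration and not a calculation from the chapters cited. If the internal branch separating the two speciation events has length T in coalescent units, the probability that a single gene tree matches the species tree is 1 - (2/3)exp(-T).

```python
import math


def prob_gene_tree_matches_species_tree(internal_branch):
    """Three-species case; internal_branch is measured in coalescent units."""
    if internal_branch < 0:
        raise ValueError("branch length must be non-negative")
    return 1.0 - (2.0 / 3.0) * math.exp(-internal_branch)


print("internal branch (coalescent units)   P(gene tree = species tree)")
for t in (0.0, 0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"{t:>34.1f}   {prob_gene_tree_matches_species_tree(t):.3f}")
```

    When speciation events follow one another very closely, the probability approaches one-third, which is no better than choosing among the three possible topologies at random. Under these conditions a single locus cannot be trusted, and the case for sampling many independent gene trees follows directly.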

    V. Eight Promising Directions for Future Research

    Molecular systematists have been working with DNA sequences for most of the last decade. The basic techniques of PCR and DNA sequencing are firmly established, but how will they be applied in the future? The following areas of molecular systematics may prove especially rewarding.

    1. Integration of Intraspecific Biogeographic Patterns with Studies of Speciation

    The study of the phylogenetic histories of populations in relation to biogeography has been termed intraspecific phylogeography (Avise et al., 1987). Several chapters in this volume specifically address testing these types of phylogeographic questions using fishes. Specifically, Wiley and Hagen (Chapter 6) test geographic distribution and likely histories of vicariance in a southeastern United States percid group, the sand darters. Faber and Stepien (Chapter 9) test for geographic relationship among spawning populations of walleye, Stizostedion vitreum, addressing whether gene flow is decreased due to natal homing. The evolution of species flocks, models of adaptive radiation, and biogeographic barriers are tested by Sturmbauer et al. (Chapter 7) for the cichlids of Lake Tanganyika, Africa. In studies of Panamanian freshwater fishes, Bermingham et al. (Chapter 8) describe very high levels of genetic divergence among populations, postulating that very high levels of phylogeographic structuring may be common in species exhibiting distributions that span large distances across physically isolated drainages. These studies are beginning to shed light on the role of geographic processes in speciation.

    2. Reconstruction of Phylogenies among Congeners

    The now standard methodology of sequencing short stretches of the mitochondrial genome will continue to bear fruit in the analysis of relationships within genera. As outlined by Kocher and Carleton in Chapter 2, these efforts will be most successful for divergences within the last 5 million years. The steady accumulation of these sequences will allow the construction of intrageneric phylogenies for many groups of fishes and will lay the groundwork for studies attempting to understand relationships further back in time.

    3. Reconstruction of Higher-Level Relationships Using Longer Sequences

    Continuing advances in DNA sequencing technology suggest that it will be practical to analyze increasingly longer segments of DNA. Up to a point, longer sequences will allow the resolution of more ancient divergences. Hillis (1996) has suggested that sequences only 5000 bp long may be sufficient to accurately reconstruct even complex phylogenies. This seems a good intermediate goal, although additional complete mitochondrial sequences and many more nuclear sequences would be useful for some questions.

    4. Analysis of Developmental Homologies at the Molecular Level

    Developmental biologists are beginning to focus on the analysis of fish development. A recent mutant hunt resulted in the isolation of more than 1500 mutations affecting development of the zebrafish (Haffter et al., 1996; Driever et al., 1996). We suspect that the genetic basis for many morphological differences will be revealed in the near future. Although the impact on the systematics of fishes is difficult to predict, the elucidation of molecular mechanisms generating morphological differences is sure to have an impact on the analysis of such characters. Where it is possible to cross species, it may be possible to identify the number of genes responsible for morphological differences (e.g., Doebley, 1992), quantifying for the first time the number of characters scored in morphological analyses.

    5. Interpretation of Hybridization and Species Boundaries Using Abundant Nuclear Markers

    Habitat disturbance and continued introductions of exotic species will create new opportunities for the hybridization of species. The analysis of introgression in such hybrid swarms will be facilitated by the abundance of new genetic markers now available. Where the taxonomy of natural species has been in debate, these markers will provide new data on the extent of differentiation across the whole genome. The analysis of hybrids may also shed light on selective constraints and the interaction of genes (Kilpatrick and Rand, 1995; Rieseberg et al., 1996).

    6. Analysis of the Evolution of Repetitive DNA Families

    Although most systematic analyses have focused on sequence variation in single-copy genes, there is some indication that repetitive DNA families offer new and useful tools for identifying relationships (Franck et al., 1994; Elder and Turner, 1994). Sequence variation in tandem and dispersed repetitive DNA may provide new insights in some groups.

    7. Studies of the Molecular Clock in Fishes

    The mechanisms governing the speed and regularity of molecular clocks are poorly understood. The great diversity of habitat and life history among fishes, coupled with their excellent fossil record, makes this a particularly suitable group with which to study molecular clocks. New insights will arise as rigorous accountings of substitution rate are made in groups of fishes varying in population size, environment, and life history.

    8. Genomic Organization

    The increasing availability of genome maps, and even complete DNA sequences, is creating opportunities for the analysis of new characters. For example, Boore et al. (1995) used the pattern of gene arrangements in arthropod mtDNA to study arthropod relationships. O’Brien et al. (1993) proposed the use of a standard set of reference loci in the analysis of genomes, which would make it easy to identify such rearrangements in the nuclear genome. These types of characters may offer the best hope for resolving relationships among ancient lineages and need to be comprehensively addressed in fishes.

    VI. A New Age of Synthesis

    Although morphological and molecular traditions have frequently collided in the past, we argue for a more synergistic approach that recognizes the peculiarities and limitations of each kind of data and in which there is an interplay between morphological and molecular studies. All inherited morphological characters have their origin in molecular characters. A record of the history of evolutionary change can be found in both the structure and the genes of
