
© Indian Academy of Sciences

RESEARCH ARTICLE

The effect of functional compensation among duplicate genes can constrain their evolutionary divergence
JOSEPH ESFANDIAR HANNON BOZORGMEHR 39 Princedom Street, Manchester M9 4GQ, United Kingdom

Abstract
Gene duplicates have the inherent property of initially being functionally redundant. This means that they can compensate for the effect of deleterious variation occurring at one or more sister sites. Here, I present data bearing on evolutionary theory that illustrates the manner in which any functional adaptation in duplicate genes is markedly constrained because of the compensatory utility provided by a sustained genetic redundancy. Specifically, a two-locus epistatic model of paralogous genes was simulated to investigate the degree of purifying selection imposed, and whether this would serve to impede any possible biochemical innovation. Three population sizes were considered to see if, as expected, there was a significant difference in any selection for robustness. Interestingly, physical linkage between tandem duplicates was actually found to increase the probability of any neofunctionalization and the efficacy of selection, contrary to what is expected in the case of singleton genes. The results indicate that an evolutionary trade-off often exists between any functional change under either positive or relaxed selection and the need to compensate for failures due to degenerative mutations, thereby guaranteeing the reliability of protein production.
[Bozorgmehr J. E. H. 2012 The effect of functional compensation among duplicate genes can constrain their evolutionary divergence. J. Genet. 91, 1–8]

E-mail: bozorgmehr@tiscali.co.uk.

Keywords. gene duplication; evolutionary divergence; redundancy; functional compensation.

Journal of Genetics, Vol. 91, No. 1, April 2012

Introduction
Gene duplication is believed to be a widespread phenomenon, as common in fact as individual point mutations (Wagner 2001), and has greatly affected the evolution of genomes. It is usually caused by unequal crossing over during recombination that leads to the tandem or segmental duplication of genes on one of a pair of homologous chromosomes (Ohta 2000). The reverse transcription of mRNA followed by its reinsertion into chromosomal DNA is another known process in the creation of new loci. Mistakes occurring in either mitosis or meiosis can also generate copies of an entire chromosome or genome (Zhang 2003), resulting in either aneuploidy or polyploidy respectively. What is of particular interest is how duplicated genes diverge and whether they can acquire genuinely novel functions that contribute to the complexity of the genome and organism (Roth et al. 2006).

It is suggested that a duplicate gene can provide a molecular substrate for evolution to work on under a relaxed regime of selection, and with fewer constraints than is the case for a singleton (Jordan et al. 2004). In such a scenario, one of the paralogues would be free to diverge whilst the other one would cover for it by maintaining the native function (Brookfield 2003). This would, however, come with the risk of a paralogue being affected by a disabling null mutation and its subsequent nonfunctionalization (Maltsev et al. 2005), and genomes are indeed replete with defunct copies of once functional genes. The initial redundancy at paralogous loci, however, may serve to obviate any functional divergence since one or more sites may serve to compensate for deleterious changes occurring at corresponding ones. This would subject them to the preserving power of purifying selection (Wagner 2002). Therefore, so long as any deleterious variation is present among paralogous sites, it should not be assumed that duplicate genes are free to evolve independently, since the need to preserve any pre-existing functionality is of paramount importance in relation to reproductive fitness. It has been found that there even exist back-up circuits whereby a duplicate gene can respond to the status of a sister site, and is upregulated when the latter is damaged or inactivated by mutation (Kafri et al. 2006). Many duplicate genes seem to have acquired a transcriptional reprogramming ability that allows differentially expressed paralogues to provide compensation when needed (He and Zhang 2006). This also appears to be an evolutionarily stable state rather than one providing just a transient solution.

As such, if redundant duplicate genes are retained by selection for their compensatory utility, then this could potentially constrain any neofunctionalization unless additional features, introduced by de novo mutations, do not significantly impair the original/ancestral capability. This may be possible if the gene products are multifunctional or promiscuous in some way, and research conducted in the directed evolution of proteins has provided some experimental evidence for this phenomenon (Tawfik et al. 2005). Indeed, many variations in sequence may be permissible as they do not have a profound effect on biochemical functionality and a large measure of redundancy can be preserved (Guan et al. 2007). In addition, tandemly duplicated genes, situated next to each other on the same chromosome, can be subject to gene conversion events that homogenize their sequences (Saitou et al. 2006). But any attempt to explain novelty resulting from gene duplication and subsequent mutation has to examine both the extent and consequence of any tension that may exist between the contrasting forces of innovation and compensation (Skipper 2003). There may indeed be a simultaneous need for a duplicate gene to adapt to a new role and also to remain as a redundant spare part so as to mask any deleterious changes, especially nullifying ones, at related paralogous sites.

Moreover, given that duplicate genes make up to 80% of eukaryotic genomes (Hannay et al. 2008), an adequate explanation must be provided for why so many old genes are still recognizably paralogous and why their functional divergence has been greatly constrained over a long period of evolutionary history (Vavouri et al. 2008). While a large proportion of duplicate genes are in fact processed retrogenes (Ding et al. 2006), most of which have been inactive since their creation, the majority still produce functional proteins. The precise reasons as to why this degree of conservation exists have not been fully established. Gene duplication is sometimes associated with overexpression (Shastry 1995), mostly leading to harmful phenotypes, and pseudogenization is the likely outcome here. However, a double dosage can actually prove to be beneficial (Hughes et al. 2007), particularly if the organism is exposed to a toxic environment (Edger and Pires 2009). This may itself account for the fixation of many duplicate genes and their continued retention in the genome. Likewise, their role in facilitating alternative metabolic pathways and regulatory interactions (Teichmann and Babu 2004) is another important factor. The preservation of entire gene networks produced by whole genome duplication could also be because selection favours the persistence of all parts, likely because of stoichiometry (Evans et al. 2008).

Genomic studies have revealed that the preservation of redundancy, rather than its loss, is in fact widespread and that paralogues are subject to purifying selection since they support genetic robustness and versatility (Gu et al. 2003). An extensive survey of the yeast genome (Dean et al. 2008) provides compelling evidence for evolutionary conservation among duplicate genes created by both whole genome and segmental duplication; redundancy was observed to be pervasive and persistent. A similar study by Li et al. (2010) offers a more nuanced account, claiming that a large number of duplicate pairs may have lost their initial backup capacity through a gradual process of mutational degradation over time. Even so, the compensatory utility of redundant genes has been observed in humans (Hsiao and Vitkup 2008), mice (Liang and Li 2009) and nematodes (Conant and Wagner 2004). In the case of the flowering plant, Arabidopsis thaliana, knock-out tests have even shown that functional compensation by duplicate genes for a more severe phenotypic effect tends to be preserved by natural selection for a longer time than that having a less severe effect (Hanada et al. 2009).

Computer simulation, used extensively to examine numerical models in population genetics, provides a means by which such compensation may not only be tested, but by which the constraining power of purifying selection at paralogous sites can actually be measured. Using a general two-locus model, consisting of parent and daughter genes, I have simulated the fixation rate of mutations under both adaptive and neutral evolution at the former when deleterious mutations occur at the latter. This represents the first real attempt to determine the quantitative extent of any potential compensatory phenomenon. It has necessary implications both for evolutionary theory and also for the study of engineering applications where duplicated parts and systems may confer robustness (Ziha 2000).

The model
A hypothesis for the general conservation of function among duplicate genes should explain why any evolutionary divergence under both positive and relaxed selection is less effective due to the reproductive benefit provided by a functional redundancy. In this scenario, a paralogous gene effectively masks any damage inflicted at a sister site. This does not mean that the paralogue's evolution is constrained with respect to beneficial mutations that serve only to adapt, improve or retain existing functionality, but rather with respect to those that cause a shift in its evolutionary trajectory. It is therefore expected that the fixation rate for these types of mutation should be significantly less than predicted in the one-locus model because of a buffering effect with respect to deleterious fitness states. In the two-locus model used here, which describes the epistatic relationship between a pair of paralogous loci, functionally divergent mutations occur at the parental site and solely deleterious ones at the daughter site. It could, of course, be the other way round but it is necessary to presume that only one locus can be subject to adaptive evolution while the other is functionally stable. For the purpose of the simulation, the assumption is also made that the daughter locus has already become fixed by neutral means (Clark 1994) and that, with the exception of one test, it is clean and homogeneously wildtype in its default state. A constant mutation rate occurs at this locus and throughout all the runs of the program. No back-mutations are permitted since they would only obscure the results and their significance.

Fitness levels for all possible genotypes involving the parent locus (A) and the daughter locus (B) are given in table 1: semi-dominant alleles (a) that occur at the former represent beneficial or neutral variation with corresponding selection coefficient, s, whereas those for the latter (b) are solely deleterious in nature with selection coefficient, t. For the sake of streamlining the model, this deleterious cost (t) for any functional change is made equal to the degree of negative selection existing at the daughter locus. Hence, maximal fitness occurs when potentially beneficial alleles at the parent locus, a, and wildtypes at the daughter site, B, are both homozygous, the latter serving to maintain the original functionality. As explained above, gain-of-function mutations at the parent locus must come with the resultant loss of native gene function, partial or total, if any important transition is to occur. This is why for the genotype, Aa/bb, there is a decline in fitness because only one allelic copy (hemizygote) encoding the ancestral and pristine functionality is present. Likewise, for aa/bb, the absence of all wildtypes means that, although there is a complete gain in novel functionality, there is also a loss with respect to the ancestral one.
If by a mutation a duplicate gene acquires a new function but with the resultant loss of redundancy, it can then no longer compensate for any subsequent deleterious mutations affecting a paralogous site maintaining the ancestral function. Hence, those individuals that retain a duplicate as a backup may be at an advantage over those for whom any divergence has caused a loss in initial redundancy. Three population sizes (N = 250, 500 and 1000) were considered, as well as both linked and unlinked loci, to determine any possible relationship with the outcome. It has been suggested that the compensation provided by duplicate genes depends not only on stringent mutational conditions but also on population size (Lynch et al. 2001). Specifically, this occurs when the product of the deleterious mutation rate and population size, N, is >1 (Wagner 1999). However, this only refers to the fact that any selection for robustness becomes demonstrably significant when this criterion is met, not that there exists a threshold below which functional compensation does not biologically occur. Although the model does not simulate these typically large population sizes, something that is quite difficult to do, the effect of compensation is nonetheless observable, albeit less pronounced.
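As a rough check on this criterion, the product can be worked out for the three simulated population sizes. The sketch assumes the daughter-locus mutation rate of 0.0001 used for most runs in tables 3 and 4; the symbol \mu for that rate is chosen here for illustration only.

```latex
\[
N\mu = 250 \times 10^{-4} = 0.025, \qquad
N\mu = 500 \times 10^{-4} = 0.05, \qquad
N\mu = 1000 \times 10^{-4} = 0.1
\]
```

All three products fall well below 1, which is consistent with the statement that the simulated populations lie outside the regime in which selection for robustness is expected to be most pronounced.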

Materials and methods


The simulation program, written in GNU C, employs a two-locus Wright-Fisher model of differential viability for a diploid population of fixed size reproducing by random mating. The fixation rate at the parent site was determined by running each iteration for as many generations as were required for the variation to fix or disappear, and taking the mean of 10,000 runs. However, the fixation rate at the daughter locus was not measured because only deleterious mutations were allowed there and their fate was not of interest. Also, mutation at the parent locus was disabled in order to test how the predicted rate of fixation at this site is affected by any paralogous epistasis concerning the fitness relationships involved. To determine the degree to which purifying selection constrained functional divergence, the simulated rate of fixation was then used to calculate the net/resultant selective effect. In the case of semi-dominant variation, as used here, the diffusion approximation for a one-locus model gives the probability of fixation as $(1 - e^{-4Nsp})/(1 - e^{-4Ns})$, where N is the population size, s the selection coefficient, and p the initial frequency (table 2) (Kimura 1962; Patwa and Wahl 2008). The initial allelic frequency (p) at the parent locus was set at N/2N (i.e., 0.5) in all cases and, for neutral evolution, this would mean that the probability of fixation is simply p. Using this value instead of 1/2N makes no difference other than to present the results more clearly and understandably.
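The one-locus expectation described above can be evaluated directly. The following is a minimal Python sketch rather than the author's GNU C program; the function name `kimura_fixation` is hypothetical, and the neutral case is handled separately because the formula is undefined at s = 0.

```python
import math


def kimura_fixation(n, s, p):
    """Diffusion approximation for the fixation probability of a
    semi-dominant allele in a one-locus model (Kimura 1962)."""
    if s == 0:
        return p  # neutral limit: fixation probability equals initial frequency
    return (1 - math.exp(-4 * n * s * p)) / (1 - math.exp(-4 * n * s))


# Expected one-locus values for the population sizes and selection
# coefficients used in the positive-selection runs, with p = 0.5.
for n in (250, 500, 1000):
    row = [round(kimura_fixation(n, s, 0.5), 4)
           for s in (0.0001, 0.0002, 0.0005, 0.0010)]
    print(n, row)
```

To four decimal places these values agree with the expected probabilities listed in table 3, for example 0.5125 for N = 250 and s = 0.0001, and 0.8808 for N = 1000 and s = 0.0010.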

Table 2. Definition of terminology used.

Symbol   Description of parameter
N        Diploid population size
[...]    Recombination rate (0.0 = complete linkage)
[...]    Mutation rate at daughter locus
v        Back mutation rate at daughter locus
s        Positive selection coefficient at parent locus
t        Negative selection coefficient at both loci
p1       Initial frequency of variation at parent locus
p2       Initial frequency of variation at daughter locus
[...]    Expected probability of fixation in a one-locus model
[...]    Simulated probability of fixation at parent locus
[...]    Net/resultant selection coefficient at parent locus

Table 1. Fitness states for each of the nine genotypes.

        AA        Aa            aa
BB      1 - 2s    1 - s         1
Bb      1 - 2s    1 - s         1 - t
bb      1 - 2s    1 - (s + t)   1 - 2t
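The fitness scheme in table 1 and the two-locus Wright-Fisher procedure described in the methods can be sketched as follows. This is an illustrative Python reconstruction, not the author's GNU C program: the function names, the order of events within a generation (mutation, viability selection, recombination, then sampling of 2N gametes) and the starting state in approximate linkage equilibrium are all assumptions of the sketch.

```python
import random

# Gametes in the order AB, Ab, aB, ab, stored as (copies of a, copies of b).
GAMETES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def fitness(n_a, n_b, s, t):
    """Viability of a genotype with n_a copies of a and n_b copies of b (table 1)."""
    if n_a == 0:                       # AA
        return 1 - 2 * s
    if n_a == 1:                       # Aa
        return 1 - (s + t) if n_b == 2 else 1 - s
    return 1 - n_b * t                 # aa


def next_generation(counts, n, r, mu, s, t, rng):
    """Advance gamete counts by one generation."""
    x = [c / (2 * n) for c in counts]
    # One-way mutation B -> b at the daughter locus (no back mutation).
    x = [x[0] * (1 - mu), x[1] + x[0] * mu, x[2] * (1 - mu), x[3] + x[2] * mu]
    # Fitness of every ordered pair of gametes under random mating.
    w = [[fitness(a1 + a2, b1 + b2, s, t) for (a2, b2) in GAMETES]
         for (a1, b1) in GAMETES]
    marginal = [sum(x[j] * w[i][j] for j in range(4)) for i in range(4)]
    mean_w = sum(x[i] * marginal[i] for i in range(4))
    # Recombination in double heterozygotes moves r * w(Aa/Bb) * D between gametes.
    shift = r * w[0][3] * (x[0] * x[3] - x[1] * x[2])
    signs = (-1, 1, 1, -1)
    post = [max((x[i] * marginal[i] + signs[i] * shift) / mean_w, 0.0)
            for i in range(4)]
    # Drift: sample 2N gametes for the next generation.
    draws = rng.choices(range(4), weights=post, k=2 * n)
    return [draws.count(i) for i in range(4)]


def fixation_probability(n, r, mu, s, t, p1=0.5, p2=0.0, runs=1000, seed=1):
    """Fraction of runs in which allele a becomes fixed at the parent locus."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(runs):
        # Integer starting counts, approximately in linkage equilibrium.
        a_total = round(2 * n * p1)
        ab = round(a_total * p2)
        big_ab = round((2 * n - a_total) * p2)
        counts = [2 * n - a_total - big_ab, big_ab, a_total - ab, ab]
        while 0 < counts[2] + counts[3] < 2 * n:
            counts = next_generation(counts, n, r, mu, s, t, rng)
        if counts[2] + counts[3] == 2 * n:
            fixed += 1
    return fixed / runs
```

A call such as `fixation_probability(500, 0.5, 0.0001, 0.0001, 0.01, runs=10000)` mirrors the parameters of one row of table 3, although pure Python is slow at that scale and the estimate will vary with the random seed. Gamete counts are kept as integers so that the test for loss or fixation is exact.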


Results
The simulated data, displayed in table 3, indicate that the power of positive selection to promote beneficial variation is restrained in duplicate genes when there is a consequent loss in redundancy. For 0.0000 ≤ s ≤ 0.0002, although the degree of purifying selection is relaxed compared to that for a singleton, any variation at the parent locus is selected against. This is true for all three population sizes examined. For the values of the parameters chosen where N = 500 and the loci are unlinked, there is a uniform decrease of 0.0003 in the net selection coefficient at the parent locus due to a compensatory effect. As Kimura's diffusion approximation indicates, there is a direct relationship between population size and the efficacy of selection. In larger populations, even the slightest degree of negative selection results in a significantly lower rate of fixation and this can be observed in the data for N = 1000. In smaller ones, such as for N = 250, random drift is the predominant evolutionary force and the effect of selection is found to be weaker. Also, as the mutational load accumulates at the new duplicated locus, this should begin to weigh upon the effect of any selection for redundancy. The final entries in table 3 show that, if an initial frequency of deleterious variation is also introduced, in this instance set to 0.1, this greatly increases the compensatory effect and neutralizes a fitness advantage for s ≤ 0.0005.

In table 4, for the case of relaxed selection, a more extensive range of deleterious fitness levels is presented for the mutations that occur at the daughter locus, while any selective advantage gained at the parent site is nil throughout. Interestingly, the maximal level of purifying selection at the parent locus, when the resultant coefficient is greatest at the daughter site, is only five times larger (0.0005) than for when t is just 0.001. There is thus a greater selection for redundancy with increasing levels of deleteriousness at the respective sites. As expected, increasing the rate of mutation at the daughter locus greatly increases any selection for redundancy at the parental site and the compensatory effect. As both t and the mutation rate increase, there is a corresponding, saturating decrease in the fixation of neutral variation. The correlation is therefore of the order $(1 - e^{-t})$, with the same form in the mutation rate, although this cannot be accurately approximated due to the stochastic processes involved. Also, a range of different but realistic mutation rates was considered.
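The net/resultant selection coefficients reported in tables 3 and 4 can be approximated by inverting the one-locus diffusion formula. The article does not spell out the exact procedure, so the following Python sketch is an assumption: it uses the fact that, for an initial frequency of 0.5, the formula reduces to 1/(1 + e^(-2Ns)), which can be solved for s in closed form. The function name `net_selection` is hypothetical.

```python
import math


def net_selection(n, p_fix):
    """Selection coefficient implied by a fixation probability p_fix
    when the initial allele frequency is 0.5."""
    # With p = 0.5 the diffusion formula becomes 1 / (1 + exp(-2 N s)),
    # so s = ln(P / (1 - P)) / (2 N).
    return math.log(p_fix / (1 - p_fix)) / (2 * n)


# Simulated fixation probability for N = 500, unlinked loci, s = +0.0001.
print(round(net_selection(500, 0.4580), 4))
```

Under this assumption the simulated value of 0.4580 corresponds to a net coefficient of about -0.0002, matching the tabulated entry for that run. The inversion agrees with many of the tabulated values but not every one, so the author's own calculation may have differed in detail.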

Table 3. The effect of compensation on functional divergence under positive selection.

N     Recomb. rate  Mutation rate  v       s        t     p1   p2   Expected  Simulated  Net s
250   0.0           0.0001         0.0000  +0.0001  0.01  0.5  0.0  0.5125    0.4835     -0.0001
250   0.0           0.0001         0.0000  +0.0002  0.01  0.5  0.0  0.5250    0.5006      0.0000
250   0.0           0.0001         0.0000  +0.0005  0.01  0.5  0.0  0.5622    0.5429     +0.0004
250   0.0           0.0001         0.0000  +0.0010  0.01  0.5  0.0  0.6225    0.6038     +0.0008
250   0.5           0.0001         0.0000  +0.0001  0.01  0.5  0.0  0.5125    0.4832     -0.0001
250   0.5           0.0001         0.0000  +0.0002  0.01  0.5  0.0  0.5250    0.4881     -0.0001
250   0.5           0.0001         0.0000  +0.0005  0.01  0.5  0.0  0.5622    0.5316     +0.0003
250   0.5           0.0001         0.0000  +0.0010  0.01  0.5  0.0  0.6225    0.6013     +0.0008
500   0.0           0.0001         0.0000  +0.0001  0.01  0.5  0.0  0.5250    0.4711     -0.0001
500   0.0           0.0001         0.0000  +0.0002  0.01  0.5  0.0  0.5498    0.5084      0.0000
500   0.0           0.0001         0.0000  +0.0005  0.01  0.5  0.0  0.6225    0.5752     +0.0003
500   0.0           0.0001         0.0000  +0.0010  0.01  0.5  0.0  0.7311    0.6867     +0.0008
500   0.5           0.0001         0.0000  +0.0001  0.01  0.5  0.0  0.5250    0.4580     -0.0002
500   0.5           0.0001         0.0000  +0.0002  0.01  0.5  0.0  0.5498    0.4802     -0.0001
500   0.5           0.0001         0.0000  +0.0005  0.01  0.5  0.0  0.6225    0.5528     +0.0002
500   0.5           0.0001         0.0000  +0.0010  0.01  0.5  0.0  0.7311    0.6744     +0.0007
1000  0.0           0.0001         0.0000  +0.0001  0.01  0.5  0.0  0.5498    0.4528     -0.0001
1000  0.0           0.0001         0.0000  +0.0002  0.01  0.5  0.0  0.5987    0.4881      0.0000
1000  0.0           0.0001         0.0000  +0.0005  0.01  0.5  0.0  0.7311    0.6443     +0.0003
1000  0.0           0.0001         0.0000  +0.0010  0.01  0.5  0.0  0.8808    0.8321     +0.0008
1000  0.5           0.0001         0.0000  +0.0001  0.01  0.5  0.0  0.5498    0.3877     -0.0002
1000  0.5           0.0001         0.0000  +0.0002  0.01  0.5  0.0  0.5987    0.4432     -0.0001
1000  0.5           0.0001         0.0000  +0.0005  0.01  0.5  0.0  0.7311    0.5790     +0.0002
1000  0.5           0.0001         0.0000  +0.0010  0.01  0.5  0.0  0.8808    0.7781     +0.0006
500   0.5           0.0001         0.0000  +0.0001  0.01  0.5  0.1  0.5250    0.3922     -0.0004
500   0.5           0.0001         0.0000  +0.0002  0.01  0.5  0.1  0.5498    0.4284     -0.0003
500   0.5           0.0001         0.0000  +0.0005  0.01  0.5  0.1  0.6225    0.4850     -0.0001
500   0.5           0.0001         0.0000  +0.0010  0.01  0.5  0.1  0.7311    0.6160     +0.0005
1000  0.5           0.0001         0.0000  +0.0001  0.01  0.5  0.1  0.5498    0.3242     -0.0004
1000  0.5           0.0001         0.0000  +0.0002  0.01  0.5  0.1  0.5987    0.3680     -0.0003
1000  0.5           0.0001         0.0000  +0.0005  0.01  0.5  0.1  0.7311    0.5082     +0.0000
1000  0.5           0.0001         0.0000  +0.0010  0.01  0.5  0.1  0.8808    0.7126     +0.0005



Table 4. The effect of compensation on functional divergence under relaxed selection.

N     Recomb. rate  Mutation rate  v       s      t      p1   p2   Expected  Simulated  Net s
500   0.5           0.0001         0.0000  0.000  0.000  0.5  0.0  0.5000    0.5002      0.0000
500   0.5           0.0001         0.0000  0.000  0.001  0.5  0.0  0.5000    0.4752     -0.0001
500   0.5           0.0001         0.0000  0.000  0.002  0.5  0.0  0.5000    0.4539     -0.0002
500   0.5           0.0001         0.0000  0.000  0.005  0.5  0.0  0.5000    0.4394     -0.0003
500   0.5           0.0001         0.0000  0.000  0.010  0.5  0.0  0.5000    0.4334     -0.0003
500   0.5           0.0001         0.0000  0.000  0.020  0.5  0.0  0.5000    0.4139     -0.0004
500   0.5           0.0001         0.0000  0.000  0.050  0.5  0.0  0.5000    0.4095     -0.0004
500   0.5           0.0001         0.0000  0.000  0.100  0.5  0.0  0.5000    0.3950     -0.0004
500   0.5           0.0001         0.0000  0.000  0.250  0.5  0.0  0.5000    0.3830     -0.0005
500   0.5           0.0001         0.0000  0.000  0.500  0.5  0.0  0.5000    0.3777     -0.0005
1000  0.5           0.0001         0.0000  0.000  0.000  0.5  0.0  0.5000    0.5001      0.0000
1000  0.5           0.0001         0.0000  0.000  0.001  0.5  0.0  0.5000    0.4342     -0.0001
1000  0.5           0.0001         0.0000  0.000  0.002  0.5  0.0  0.5000    0.4056     -0.0002
1000  0.5           0.0001         0.0000  0.000  0.005  0.5  0.0  0.5000    0.3636     -0.0003
1000  0.5           0.0001         0.0000  0.000  0.010  0.5  0.0  0.5000    0.3423     -0.0003
1000  0.5           0.0001         0.0000  0.000  0.020  0.5  0.0  0.5000    0.3188     -0.0004
1000  0.5           0.0001         0.0000  0.000  0.050  0.5  0.0  0.5000    0.3011     -0.0004
1000  0.5           0.0001         0.0000  0.000  0.100  0.5  0.0  0.5000    0.2796     -0.0005
1000  0.5           0.0001         0.0000  0.000  0.250  0.5  0.0  0.5000    0.2701     -0.0005
1000  0.5           0.0001         0.0000  0.000  0.500  0.5  0.0  0.5000    0.2678     -0.0005
500   0.5           0.0010         0.0000  0.000  0.500  0.5  0.0  0.5000    0.0394     -0.0032
500   0.5           0.00001        0.0000  0.000  0.500  0.5  0.0  0.5000    0.4616     -0.0002
500   0.5           0.000001       0.0000  0.000  0.500  0.5  0.0  0.5000    0.4694     -0.0001
500   0.5           0.0000001      0.0000  0.000  0.500  0.5  0.0  0.5000    0.4703     -0.0001
500   0.5           0.00000001     0.0000  0.000  0.500  0.5  0.0  0.5000    0.4742     -0.0001
1000  0.5           0.0010         0.0000  0.000  0.500  0.5  0.0  0.5000    0.0029     -0.0032
1000  0.5           0.00001        0.0000  0.000  0.500  0.5  0.0  0.5000    0.4192     -0.0001
1000  0.5           0.000001       0.0000  0.000  0.500  0.5  0.0  0.5000    0.4268     -0.0001
1000  0.5           0.0000001      0.0000  0.000  0.500  0.5  0.0  0.5000    0.4283     -0.0001
1000  0.5           0.00000001     0.0000  0.000  0.500  0.5  0.0  0.5000    0.4344     -0.0001

Discussion
The results provide clear insight into the functional and evolutionary constraints imposed on duplicate genes by virtue of their initial and default redundancy. In all the population sizes examined, positive selection was found to be less effective when a variation, however beneficial, causes a loss in redundancy while a paralogous site is subject to deleterious mutations; this is especially so for weakly beneficial alleles that may constitute incremental steps in the development of a new function. This is a very important observation because at some point there must be a loss in the original functionality, typically involving missense mutations or indels, if the duplicate gene is to diverge sufficiently as part of any evolutionary transition. In some cases, there may be sufficient functional overlap that permits flexible changes that do not significantly obviate native capability (Basu et al. 2008). For example, amplification of an innate Ald-hydrolytic activity in a duplicated bacterial esterase occurs with no major cost (Negoro et al. 2005). In the case of jingwei, an alcohol dehydrogenase gene present in species of Drosophila has evolved from the retroposition of Adh into another gene and this region has experienced rapid sequence evolution in terms of an accumulation of nonsynonymous substitutions (Long et al. 2010). The relaxation of selective constraint in this instance is the probable reason since the active site and its reactive chemistry have been highly conserved and only the substrate specificity has been altered. But any variation on the same biochemical theme is itself of a very limited nature.

In order for a more significant functional shift to occur, certain conditions must exist. The mutation rate at the paralogous site(s) should be low, as well as the frequency of any deleterious variation. Further, gain-of-function mutations must be sufficiently beneficial to escape the constraint of the utility that retaining a functional redundancy confers to the organism. In the Dykhuizen-Hartl model (Zhang 2003), relaxed selection (i.e., nearly neutral drift) is the principal means of divergence at least for the initial stages. Suboptimal changes are allowed to sneak in due to the expected respite in the selection regime. This may be possible if they are only slightly deleterious in nature and an accumulation of them could prove significant as far as the overall effect on gene function is concerned. Later, as selective conditions change, the evolved sequence may fortuitously prove to provide some novel adaptive function. Nevertheless, unless a new feature is found quickly, there will soon arrive a point of saturation at which any further harmful changes will begin to adversely affect the tertiary structure and stability of the gene's protein product (Taverna and Goldstein 2000); thereafter, any further mutations are liable to be more strongly selected against. Moreover, evolution is fundamentally a directionless process and, as well as progressing upon a particular path, it can also take a step back. One consequence of relaxed selection is that a reverse, or compensatory, mutation may succeed in undoing any prior change. This would mean that the gene would tend to revert rather than to proceed upon an entirely novel evolutionary trajectory.

An unexpected discovery of the simulation is the fact that genetic linkage slightly favours the loss of redundancy by increasing the probability of fixation for an advantageous mutation that confers a novel function. This would also be true for bi-locus traits since net fitness is dependent on the mutual interaction of two or more loci and is not simply the multiplicative or additive result of their respective and separate fitness levels. Recombination breaks up instances of linkage disequilibrium that contribute to background interference known as the Hill-Robertson effect (Comeron et al. 2008). It has thus been considered to make selection more effective, and is advanced as being a major factor in the evolution of sexual reproduction. However, except in regions with very high rates of recombination, it is unlikely that linked tandem duplicates would become significantly separated from each other. Instead, due to an act of transposition or chromosomal/genomic duplication, they would in fact become distant. It is interesting that many of the paralogous members of the homeobox family of transcription factors, situated on separate chromosomes, are still functionally interchangeable (Greer et al. 2000). The opposite is so for those Hox members clustered together on the same chromosome that have presumably been created through successive tandem duplication.
They are instead observed to be distinctly dissimilar in terms of their function, size and organization. The same is also true for the haemoglobin beta-cluster whose five members regulate the differing stages of oxygen metabolism during the development of the organism (Caterina et al. 1994). The gamma globin genes (HBG1 and HBG2) are, however, functionally redundant and this may be because they serve to compensate for each other. Moreover, any syntenic rearrangement might well affect gene regulation and expression, but analyses suggest that functional redundancy between dispersed paralogues is almost as prevalent as for those that are in close proximity to one another (Goodstadt and Ponting 2006). Since many proteins are multi-domain entities, they can exhibit more than just one function. This is something that has not been included in the model and simulation for the sake of focussing on the central issue. It helps explain why many duplicate genes are not always interchangeable because of the effect of subfunctionalization (Lynch and Conery 2000). In this scenario, complementary degeneration of one of the ancestral functions shared between the paralogues results in a division of labour. This may explain why the deletion of some duplicate genes can and does produce harmful effects since the ancestral functionality has been partitioned, and one or more native subfunctions have become unique to a particular paralogue. Subfunctionalization can thus be seen as also contributing to robustness in the genome, and an important development contrasted against the evolvability of novel functionality (Lenski et al. 2006). Despite the loss of redundancy in this scenario, there is no gain in function although there may well be an advantage in terms of efficiency and flexibility.
Likewise, the escape from adaptive conflict model (Storz 2009) envisages only the improvement or elaboration of ancestral subfunctions through gene duplication since any pleiotropic antagonism or impasse that existed for a singleton is no longer a significant barrier to evolutionary change. It should be acknowledged that the use of a two-locus model may be seen as unrealistic since it does not account for complex multigenic interactions and networking. However, this is only a valid objection if it is applied to the two-locus model itself: something that has been extensively used in population genetics research. It is not apparent that the use of a multilocus model, and one involving a more diverse selection regime, would detract from the basic findings presented here. However, the simulation could be extended with a multilocus model to investigate multiple paralogous sites produced by successive tandem duplications and how this bears on the effect of functional compensation. It would be expected that the more redundant copies there are of a gene, the more likely it is that the original function is preserved between them. This also means that selective pressures will be relaxed still further and that some of these copies will be even freer to diverge along a novel trajectory. Alternatively, any subfunctionalization or indeed nonfunctionalization could become far more pronounced in this situation. The observational evidence suggests that a mixed picture is a more likely outcome. For example, the evolution of the seven members of the alpha karyopherin gene family has been characterized by the differentiation or diversification of their expression patterns as well as protein-protein interactions. However, there nonetheless remains a large degree of functional overlap and redundancy across deep time (Bozorgmehr 2011).
It has also been suggested that duplications of genes with transcriptional regulatory regions can be deleterious because of the metabolic cost on the cell for producing extra proteins (Wagner 2005) and would reduce the probability of both fixation and retention. This was not factored into the two-locus model as it really only becomes significant when the number of duplicates of a gene is very high. The purpose of this research is not to provide a predictive paradigm of gene duplicate evolution, but rather to quantifiably demonstrate the possible effect of functional compensation in gene duplicates.

Conclusion
In recent years, the importance of functional compensation and redundancy in genomes has increasingly become better understood. In classical models, long-term persistence is only likely when paralogues acquire novel functionality. In contrast, the theoretical model described here allows for the possibility of the selective maintenance of both paralogous genes without the acquisition of any novel function. Indeed, the results of the simulation illustrate that a selection for robustness often exists because the evolutionary fates of gene duplicates have become coupled. It therefore provides a proof-of-principle that, in some cases at least, functional divergence may be hindered by the need to compensate in response to deterioration in the original function. While gene duplication is the most natural means of explaining how biological novelty arises, the observed constraints imposed upon a paralogue by stabilizing selection tend towards its evolutionary conservation. This preserving tendency is in fact necessary to prevent the duplicate from degenerating were selection to become completely relaxed. While gene duplicates can become potential incubators for innovation, and so facilitate important adaptations that add to the diversity of the protein repertoire, epistatic fitness relationships work to ensure that biochemical information is preserved throughout the proteome. In this way, nature would appear to pursue a safety-first policy that mitigates any harmful effects of mutation and so offsets the problem of failure in the process of reproduction.
Acknowledgements
I would like to extend my thanks to Professor Joseph Felsenstein of the University of Washington, Seattle, USA, for his help and advice in the formulation and application of the computer simulation that was used here. His assistance on certain aspects of theoretical population genetics was also indispensable.

References
Basu M. K., Carmel L., Rogozin I. B. and Koonin E. 2008 Evolution of protein domain promiscuity in eukaryotes. Genome Res. 18, 449–461.
Bozorgmehr J. E. H. 2011 An ancient frame-shifting event in the highly conserved KPNA gene family has undergone extensive compensation by natural selection in vertebrates. Biosystems 105, 210–215.
Brookfield J. F. 2003 Gene duplications: the gradual evolution of functional divergence. Curr. Biol. 13, 229–230.
Caterina J. J., Ciavatta D. J., Donze D., Behringer R. R. and Townes T. M. 1994 Multiple elements in human beta-globin locus control region 5′ HS 2 are involved in enhancer activity and position-independent, transgene expression. Nucleic Acids Res. 22, 1006–1011.
Clark A. G. 1994 Invasion and maintenance of a gene duplication. Proc. Natl. Acad. Sci. USA 91, 2950–2954.
Comeron J. M., Williford A. and Kliman R. M. 2008 The Hill–Robertson effect: evolutionary consequences of weak selection and linkage in finite populations. Heredity 100, 19–31.
Conant G. C. and Wagner A. 2004 Duplicate genes and robustness to transient gene knock-downs in Caenorhabditis elegans. Proc. Biol. Sci. 271, 89–96.
Dean E. J., Davis J. C., Davis R. W. and Petrov D. A. 2008 Pervasive and persistent redundancy among duplicated genes in yeast. PLoS Genet. 4, e1000113.
Ding W., Lin L., Chen B. and Dai J. 2006 L1 elements, processed pseudogenes and retrogenes in mammalian genomes. IUBMB Life 58, 677–685.
Edger P. P. and Pires J. C. 2009 Gene and genome duplications: the impact of dosage-sensitivity on the fate of nuclear genes. Chromosome Res. 17, 699–717.
Evans B. J., Chain F. J. and Ilieva D. 2008 Duplicate gene evolution and expression in the wake of vertebrate allopolyploidization. BMC Evol. Biol. 8, 43.
Goodstadt L. and Ponting C. P. 2006 Phylogenetic reconstruction of orthology, paralogy, and conserved synteny for dog and human. PLoS Comput. Biol. 2, e133.
Greer J. M., Puetz J., Thomas K. R. and Capecchi M. R. 2000 Maintenance of functional equivalence during paralogous Hox gene evolution. Nature 403, 661–665.
Gu Z., Steinmetz L. M., Gu X., Scharfe C., Davis R. W. and Li W. H. 2003 Role of duplicate genes in genetic robustness against null mutations. Nature 421, 63–66.
Guan Y., Dunham M. J. and Troyanskaya O. G. 2007 Functional analysis of gene duplications in Saccharomyces cerevisiae. Genetics 175, 933–943.
Hanada K., Kuromori T., Myouga F., Toyoda T., Li W. H. and Shinozaki K. 2009 Evolutionary persistence of functional compensation by duplicate genes in Arabidopsis. Genome Biol. Evol. 1, 409–414.
Hannay K., Marcotte E. M. and Vogel C. 2008 Buffering by gene duplicates: an analysis of molecular correlates and evolutionary conservation. BMC Genomics 9, 609.
He X. and Zhang J. 2006 Transcriptional reprogramming and backup between duplicate genes: is it a genomewide phenomenon? Genetics 172, 1363–1367.
Hsiao T. L. and Vitkup D. 2008 Role of duplicate genes in robustness against deleterious human mutations. PLoS Genet. 4, e1000014.
Hughes T., Ekman D., Ardawatia H., Elofsson A. and Liberles D. A. 2007 Evaluating dosage compensation as a cause of duplicate gene retention in Paramecium tetraurelia. Genome Biol. 8, 213.
Jordan I., Wolf Y. and Koonin E. 2004 Duplicated genes evolve slower than singletons despite the initial rate increase. BMC Evol. Biol. 4, 22.
Kafri R., Levy M. and Pilpel Y. 2006 The regulatory utilization of genetic redundancy through responsive backup circuits. Proc. Natl. Acad. Sci. USA 103, 11653–11658.
Kimura M. 1962 On the probability of fixation of mutant genes in a population. Genetics 47, 713–719.
Lenski R. E., Barrick J. E. and Ofria C. 2006 Balancing robustness and evolvability. PLoS Biol. 4, e428.
Li J., Yuan Z. and Zhang Z. 2010 The cellular robustness by genetic redundancy in budding yeast. PLoS Genet. 6, e1001187.
Liang H. and Li W. H. 2009 Functional compensation by duplicated genes in mouse. Trends Genet. 25, 441–442.
Long M., Zhang J., Yang H., Li L. and Dean A. M. 2010 Evolution of enzymatic activities of testis-specific short-chain dehydrogenase/reductase in Drosophila. J. Mol. Evol. 71, 241–249.
Lynch M. and Conery J. 2000 The evolutionary fate and consequences of duplicate genes. Science 290, 1151–1155.
Lynch M., O'Hely M., Walsh B. and Force A. 2001 The probability of preservation of a newly arisen gene duplicate. Genetics 159, 1789–1804.
Maltsev N., Glass E. M., Ovchinnikova G. and Gu Z. 2005 Molecular mechanisms involved in robustness of yeast central metabolism against null mutations. J. Biochem. 137, 177–187.
Negoro S., Ohki T., Shibata N., Mizuno N., Wakitani Y., Tsurukame J. et al. 2005 X-ray crystallographic analysis of 6-aminohexanoate-dimer hydrolase: molecular basis for the birth of a nylon oligomer-degrading enzyme. J. Biol. Chem. 280, 39644–39652.
Ohta T. 2000 Evolution of gene families. Gene 259, 45–52.
Patwa Z. and Wahl L. M. 2008 The fixation probability of beneficial mutations. J. R. Soc. Interface 5, 1279–1289.
Roth C., Rastogi S., Arvestad L., Dittmar K., Light S., Ekman D. and Liberles D. A. 2006 Evolution after gene duplication: models, mechanisms, sequences, systems, and organisms. J. Exp. Zool. B. Mol. Dev. Evol. 308, 58–73.
Saitou N., Ezawa K. and Oota S. 2006 Genome-wide search of gene conversions in duplicated genes of mouse and rat. Mol. Biol. Evol. 23, 927–940.
Shastry B. S. 1995 Overexpression of genes in health and sickness. A bird's eye view. Comp. Biochem. Physiol. B. Biochem. Mol. Biol. 112, 1–13.
Skipper M. 2003 Compensation or innovation? Nat. Rev. Genet. 4, 80.
Storz J. F. 2009 Gene duplication and the resolution of adaptive conflict. Heredity 102, 99–100.
Taverna D. M. and Goldstein R. M. 2000 The evolution of duplicated genes considering protein stability constraints. Pac. Sym. Comput. Biol. 69–80.
Tawfik D. S., Aharoni A., Gaidukov L., Khersonsky O., McQ Gould S. and Roodveldt C. 2005 The evolvability of promiscuous protein functions. Nat. Genet. 37, 73–76.
Teichmann S. A. and Babu M. M. 2004 Gene regulatory network growth by duplication. Nat. Genet. 36, 492–496.
Vavouri T., Semple J. I. and Lehner B. 2008 Widespread conservation of genetic redundancy during a billion years of eukaryotic evolution. Trends Genet. 24, 485–488.
Wagner A. 1999 Redundant gene functions and natural selection. J. Evol. Biol. 12, 1–16.
Wagner A. 2001 Birth and death of duplicated genes in completely sequenced eukaryotes. Trends Genet. 17, 237–239.
Wagner A. 2002 Selection and gene duplication: a view from the genome. Genome Biol. 3, 1012.1–1012.3.
Wagner A. 2005 Energy constraints on the evolution of gene expression. Mol. Biol. Evol. 22, 1365–1374.
Zhang J. 2003 Evolution by gene duplication: an update. Trends Ecol. Evol. 18, 292–298.
Ziha K. 2000 Redundancy and robustness of systems of events. Probabilistic Engineering Mechanics 15, 347–357.

Received 8 June 2011, in final revised form 18 August 2011; accepted 7 October 2011
Published on the Web: 16 February 2012

Journal of Genetics, Vol. 91, No. 1, April 2012
