
Journal of General Virology (1991), 72, 2875-2884.

Printed in Great Britain


Nucleotide sequence of the genomic RNA of pepper mild mottle virus, a resistance-breaking tobamovirus in pepper
E. Alonso, I. Garcia-Luque, A. de la Cruz, B. Wicke, M. J. Avila-Rincón, M. T. Serra, C. Castresana and J. R. Diaz-Ruiz*
U.E.I. Fitopatologia, Centro de Investigaciones Biológicas, CSIC, Velázquez, 144, 28006-Madrid, Spain

The entire genomic RNA of a Spanish isolate of pepper mild mottle virus (PMMV-S), a resistance-breaking virus in pepper, was cloned and sequenced and shown to be similar to other tobamoviruses in its genomic organization. It consisted of 6357 nucleotides (nt) and contained four open reading frames (ORFs) which encode a 126K protein and a readthrough 183K protein (nt 70 to 4908), a 28K protein (nt 4909 to 5682) and a 17.5K coat protein (nt 5685 to 6158). This is the first tobamovirus in which none of the ORFs overlap. Both its nucleic acid and predicted protein sequences were compared with the previously determined sequences of other tobamoviruses. The variations and similarities found and their relationship with the pathogenicity of this virus are discussed.

Introduction
Pepper mild mottle virus (PMMV) is a member of the tobamovirus group of positive-strand RNA viruses. The complete nucleotide sequences of three other tobamoviruses, tobacco mosaic virus (TMV) (Goelet et al., 1982), tomato mosaic virus (ToMV) (Ohno et al., 1984) and tobacco mild green mottle virus (TMGMV) (Solis & Garcia-Arenal, 1990), have already been reported. The tobamoviral RNA encodes four different proteins: 126K, 183K, 30K and 17.5K, in that order. The 126K and 183K proteins are involved in the replication processes (Young et al., 1987; Quadt & Jaspars, 1989), the 30K protein participates in the cell-to-cell spread of the virus (Deom et al., 1987; Meshi et al., 1987) and the 17.5K protein is the coat protein. The 126K and 183K proteins are directly translated from the viral RNA, whereas the 30K protein and coat protein are translated from subgenomic RNAs (Palukaitis & Zaitlin, 1986). PMMV is one of the most destructive pathogens of protected pepper crops. It is found infecting pepper cultivars with genetically incorporated resistance to TMV and ToMV. Infection by this virus causes important economic losses all over the world in crops grown under plastic or glass (Wetter & Conti, 1988). Additionally, PMMV is unable to infect tomato plants and possesses a reduced capability to replicate and/or accumulate in tobacco plants when compared to TMV and ToMV (Wetter et al., 1984; Garcia-Luque et al., 1990). To develop an understanding of the mechanism(s) involved in these biological properties, we have determined the nucleotide sequence of PMMV-S RNA, a Spanish isolate of PMMV (Alonso et al., 1989). The nucleotide sequences of its 5' and 3' non-coding regions have previously been reported (Avila-Rincón et al., 1989). In this paper, we present the cloning and complete nucleotide sequence of PMMV-S and the analysis of its deduced amino acid sequences.

The nucleotide sequence data reported in this paper will appear in the EMBL, GenBank and DDBJ nucleotide sequence databases under the accession number M81413.
0001-0404 © 1991 SGM

Methods
Virus propagation, purification and RNA extraction. The origin of PMMV-S has been reported previously (Alonso et al., 1989). The virus was purified from Nicotiana clevelandii Gray plants as described (Garcia-Luque et al., 1990). Virion RNA was prepared by conventional SDS-phenol extraction after heating of the particles in 20 mM-sodium phosphate buffer pH 7.0, 0.5% SDS for 20 s at 100 °C.

cDNA synthesis and cloning. cDNA was prepared as described by Gubler & Hoffman (1983), using a commercial cDNA synthesis kit (Boehringer-Mannheim). PMMV-S RNA was 3' polyadenylated in vitro with Escherichia coli poly(A) polymerase (0.25 unit/µg of RNA) for 7 min at 37 °C, under the conditions recommended by the manufacturer (Pharmacia LKB Biotechnology), and first-strand cDNA synthesis was primed with oligo(dT). Double-stranded (ds) cDNA was size-fractionated in 0.8% agarose gels, and the cDNA was eluted and ligated into plasmid pUC18 digested with HincII. In another experiment, EcoRI linkers were added to ds cDNA that had previously been treated with EcoRI methylase; the cDNA was then digested with EcoRI and size-fractionated in agarose gels. After elution, the cDNA was cloned into EcoRI-digested pUC18. Plasmids were tested for the presence of viral cDNA inserts by colony hybridization, using randomly primed 32P-labelled cDNA to PMMV-S RNA. Other cDNA clones were prepared using a primer complementary to nucleotides (nt) 5211 to 5228, in which a BamHI restriction site was created by addition of an extra G at its 5' end. Ds cDNA was restricted with BamHI and BclI, size-fractionated in agarose gels, and the 1200 nt fragment was eluted and ligated to pUC18 digested with BamHI. Another set of clones was obtained after priming cDNA synthesis with an oligonucleotide complementary to nt 4021 to 4036. Ds cDNA was restricted with SalI, fractionated in a 6% polyacrylamide gel, and the 600 nt fragment was eluted and cloned into the SalI-SmaI sites of pUC18. All recombinant DNA techniques were as described by Maniatis et al. (1982), using E. coli strains JM83 and DH5α.

Nucleotide sequence determination and analysis. The nucleotide sequence of cDNA clones was determined by the chemical degradation procedure (Maxam & Gilbert, 1980). Subclones used for sequencing were generated by deletions with nuclease Bal 31 or restriction enzyme digestion. Sequences were analysed using the DNASTAR computer programs.

Results and Discussion

Sequence determination and terminal non-coding regions

Fig. 1 shows the strategy used to determine the sequence of the genomic RNA of PMMV-S from a set of overlapping cDNA clones. These clones contain sequences representing all but the first 34 nucleotides located at the 5' end of PMMV-S RNA. Most of the sequence was obtained from at least two independent cDNA clones. The nucleotides of the 5' and 3' non-coding regions of PMMV-S RNA have previously been sequenced directly on the viral RNA (Avila-Rincón et al., 1989). As with other tobamoviruses, PMMV-S possesses a 69 nt leader sequence, devoid of G residues, termed the Ω fragment (Richards et al., 1977; Avila-Rincón et al., 1989). Its 3' non-coding region is 199 nt long. It was previously proposed that some structural features in the tRNA-like conformation of PMMV-S RNA, such as two unpaired nucleotides connecting the aminoacyl and anticodon arms, could be related to its lower replicability observed in tobacco plants (Avila-Rincón et al., 1989), as described for certain chimeric tobamoviruses (Ishikawa et al., 1988). The determined nucleotide sequence of the cDNA clones coincides with that of the RNA except for a single base transition at position 6181, which would change the C/G pair (6197/6181) situated at the beginning of the V stem in the proposed secondary structure (Avila-Rincón et al., 1989) to a C/A pair. This nucleotide substitution was present in three of four sequenced cDNA clones. In other parts of the genome no sequence heterogeneity was found in the clones analysed. The only nucleotide difference was found in clone 4, in which the insertion of a T between nt 5385 and 5386 could lead to a truncated protein.

Nucleotide sequence of PMMV-S and comparison with other tobamoviruses

Fig. 2 shows the sequence of PMMV-S RNA and its deduced amino acid sequences. The genome of PMMV-S is 6357 nt long. It shares an overall sequence identity of 69.4% with the RNA of ToMV (Ohno et al., 1984), 68.5% with that of TMV (Goelet et al., 1982) and 64% with that of TMGMV (Solis & Garcia-Arenal, 1990), the other members of the tobamovirus group whose entire RNA sequences are already known. As shown in Table 1, PMMV-S also shares a higher degree of amino acid sequence identity with ToMV than with TMV and TMGMV.

Organization of the 126K/183K gene sequence

The first open reading frame (ORF) of PMMV-S RNA begins at nt 70, at the first AUG encountered from the 5' end, and extends to nt 3423, encoding a protein of 1117 amino acids (126K), with a calculated Mr of 126 304. The readthrough of the amber codon (UAG), possibly by insertion of tyrosine (Beier et al., 1984), results in a 183K protein which terminates at position 4908. It is composed of 1612 amino acids with a predicted Mr of 183 340. The nucleotide and amino acid sequences in the readthrough part of the 183K protein (nt 3421 to 4908, amino acids 1118 to 1612) (Figs 2 and 3) are the most highly conserved in the whole genome, with only 14 and 15 non-conservative amino acid substitutions with respect to the corresponding proteins of ToMV and TMV, respectively. The 126K and 183K proteins are thought to be involved in viral replication because they have been detected in partially purified preparations of the viral polymerase complex and because they contain several sequence motifs which are conserved in proteins known to act in replicative processes of plant and animal viruses (Young et al., 1987; Goldbach & Wellink, 1988; Strauss & Strauss, 1988; Quadt & Jaspars, 1989). The alignment of the 126K/183K proteins of PMMV-S with those from the more closely related tobamoviruses (ToMV and TMV) shows that the sequence is well conserved along the whole protein (Fig. 3), except for three stretches (amino acids 155 to 191, 623 to 669 and 768 to 791) in which non-conservative substitutions as well as deletions and insertions occur. Other regions of weaker amino acid sequence identity correspond to positions 382 to 388, 537 to 555 and 991 to 1001 (Fig. 3). Based upon the existence of conserved motifs between the tobamoviral 126K protein and those from other RNA viruses, two functional domains have been defined. The first one, in the amino part of the protein, has homology with the nsP1 protein of alphaviruses and with the amino part of other proteins implicated in the

replication of RNA viruses with a monopartite or divided genome (Ahlquist et al., 1985). In this region, Rozanov et al. (1990) have identified two conserved sequence motifs, defined by the presence of an invariant His in the first motif and the sequence Asp-X-X-Arg in the second one, which are located at amino acid positions 76 to 81 and 134 to 138 in the 126K/183K protein of TMV (Fig. 3), respectively. By analogy with the nsP1 protein of Sindbis virus (Mi et al., 1989), this domain may be responsible for the methyltransferase activity necessary for the cap formation of the genomic and subgenomic RNAs. Of the two amino acid sequence motifs described by Rozanov et al. (1990), the predicted 126K/183K protein from PMMV-S possesses both, except for a conservative substitution Ile to Val at position 135 (Fig. 3). This domain is the best conserved with respect to TMGMV. The second functional domain in the 126K/183K protein has been mapped to amino acids 833 to 1086, and is known as the helicase domain (Hodgman, 1988; Gorbalenya & Koonin, 1989; Habili & Symons, 1989). It is implicated in nucleic acid unwinding, and possibly in other processes such as recombination and transcription. Of the six conserved motifs common to the Sindbis-like viruses, among which the tobamovirus group has been classified (Goldbach & Wellink, 1988), the NTP-binding activity has been ascribed to domains I and II (amino acids 833 to 850 and 902 to 913, respectively). The domain I motif is strictly conserved in the 126K protein of PMMV-S with respect to those of ToMV and TMV, but differs from that of TMGMV, in which the consensus Thr at position 842 has been substituted by Tyr, and the following Glu-Ile-Leu-Ser sequence has evolved to Gly-Asp-Phe-Glu, the first substitution being of the semi-conservative type and the other three of the non-conservative type. It is therefore possible that the amino acid exchanges occurring in this domain in TMGMV could result in its lower replicability, in comparison with TMV or ToMV (Wetter, 1986). In the domain II to VI motifs (amino acid positions 902 to 913, 930 to 940, 966 to 974, 1038 to 1055 and 1070 to 1085, respectively), all of the amino acid substitutions that occur in the PMMV-S protein with respect to the ToMV and TMV ones are of the conservative type: Ile to Leu (position 906), Tyr to Phe (position 930), Val to Ile (position 974), Tyr to Phe (position 1048), Ala to Glu (position 1049), Val to Leu (position 1070), Lys to Arg (position 1079) and Leu to Ile (position 1081). The same occurs with the amino acid substitutions in TMGMV (Fig. 3).

Fig. 1. Genomic organization, partial restriction map and sequencing strategy for PMMV-S cDNA. Open boxes drawn approximately to scale represent the coding regions for the 126K, 183K, 30K and coat protein (CP) gene products. Arrows represent the strategy followed to determine the sequence of the overlapping cDNA clones used (EC-8, 75, H-92, U-11, U-17, BKB-8, BKB-3, 85, 4 and 174). Abbreviations for restriction sites are: A, AvaI; B, BglII; Bc, BclI; E, EcoRV; H, HindIII; K, KpnI; N, NsiI; S, SalI; Sa, SacI; Sc, SacII.

Fig. 2. The entire nucleotide sequence of the PMMV-S genome, shown as DNA, and deduced amino acid sequences of the ORFs. Arrows indicate initiation codons. Termination codons and the amber readthrough codon are indicated by an asterisk. The predicted position for the origin of assembly is underlined.

Table 1. Percentages of sequence identity between the PMMV-S genes and those from other tobamoviruses

Gene:            126K            183K†            30K              CP
Virus*         N‡     A‡       N      A        N      A        N      A
ToMV          68.8   74.7     73.7   82.0     65.8   64.5     67.5   73.9
TMV           67.9   73.3     73.0   80.0     63.4   67.4     65.6   72.0
TMGMVa        62.4   62.0     68.4   73.2     63.1   61.5     65.6   70.1
TMGMVb         -      -        -      -       64.7   65.0     65.6   70.1
CGMMV-W        -      -        -      -       46.0   33.5     48.1   36.5
SHMV           -      -        -      -       38.5   26.7     46.4   40.8

* Data are from Ohno et al. (1984), Goelet et al. (1982), Meshi et al. (1981, 1982), Saito et al. (1988) and Meshi et al. (1983), for ToMV, TMV, SHMV (30K and coat protein), CGMMV 30K and CGMMV-W CP, respectively. Data reported for two isolates of TMGMV (TMGMVa and TMGMVb) are from Solis & Garcia-Arenal (1990) and Nejidat et al. (1991), respectively.
† The sequences analysed correspond only to the readthrough part of the 183K protein.
‡ N and A, nucleotide and amino acid sequence homologies, respectively.
-, No sequence data are available for comparison.
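The identity percentages in Table 1 are derived from pairwise sequence alignments. As an illustration only, the following Python sketch computes a percentage identity from two sequences that have already been aligned. The function name `percent_identity`, the decision to skip alignment columns containing a gap, and the toy sequences are all assumptions made for this example; the paper does not state how the DNASTAR programs treated gaps, so the sketch should not be read as the method actually used.

```python
def percent_identity(a, b, gap="-"):
    """Percentage of compared alignment columns holding identical residues.

    Assumption for this sketch: columns in which either sequence has a gap
    are skipped rather than counted as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # gap column: not counted under the stated assumption
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, not taken from any virus).
toy_a = "ACGT-ACGT"
toy_b = "ACGTTACCT"
print(percent_identity(toy_a, toy_b))  # 7 matches in 8 compared columns
```

Counting gap columns as mismatches instead would lower the reported percentage, which is one reason identity values from different programs are not always directly comparable.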
Fig. 3. Alignment of the deduced amino acid sequences of the 126K/183K proteins from different tobamoviruses. Source of amino acid sequences as in Table 1. Only amino acid exchanges are indicated. Gaps are indicated by (-). Numbering corresponds to that from PMMV-S. Numbers in parentheses indicate the total length of each protein. The sequence motifs defined by Gorbalenya & Koonin (1989), Poch et al. (1989), and Rozanov et al. (1990) are boxed.

The carboxyl end of the 183K protein results from readthrough of the UAG stop codon at nt 3421 to 3423. The suppression of a termination codon is a widespread phenomenon among animal and plant RNA viruses, and is related to the regulation of the expression of the different components of the viral RNA polymerase (Ishikawa et al., 1986; Strauss et al., 1988). The fact that the surrounding nucleotide sequences (ATAGCAATTACAG) at positions 3420 to 3432 (Fig. 2) are strictly conserved in all the tobamoviruses reveals the functional importance of this particular region in their genomes, as is also the case in alphaviruses (Strauss et al., 1988). The so-called polymerase module is found in the carboxy portion of the protein, in which four domains (A to D) have been defined (Poch et al., 1989). This module is common to all of the DNA- and RNA-dependent RNA polymerases, and it is known to be involved in the elongation of pre-existing chains (Quadt & Jaspars, 1989). The Gly-Asp-Asp motif first identified by Kamer & Argos (1984) is found in the C domain surrounded by hydrophobic residues. The alignment of the PMMV-S 183K protein with those from other tobamoviruses (Fig. 3) also shows that all the amino acid substitutions in these domains are of the conservative type. The only non-conservative differences between TMGMV and PMMV-S occur in the D domain, Cys to Gly (position 1496) and Asn to Leu (position 1500). The highest sequence variability in this readthrough part of the 183K protein is found in the region located at amino acids 1250 to 1291. There is also lower amino acid sequence identity in both the N- and C-terminal segments (Fig. 3), a feature common to other RNA viruses (Haseloff et al., 1984; Allison et al., 1989).
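The coordinates given for the readthrough context can be checked for internal consistency. The Python sketch below takes the conserved 13 nt context quoted above (shown as DNA), confirms that a context starting at position 3420 ends at 3432, and translates in frame from the amber codon at nt 3421. The constant and function names are hypothetical, and the codon table deliberately contains only the four codons present in this short context (standard genetic code); it is not a general translation routine.

```python
CONTEXT = "ATAGCAATTACAG"  # conserved context quoted in the text, as DNA
CONTEXT_START = 3420       # position of its first nucleotide (1-based)
AMBER_START = 3421         # first nucleotide of the amber (TAG) codon

# Only the codons that occur in CONTEXT; standard genetic code.
CODONS = {"TAG": "*", "CAA": "Q", "TTA": "L", "CAG": "Q"}


def context_end(seq, seq_start):
    """1-based position of the last nucleotide of seq."""
    return seq_start + len(seq) - 1


def translate_from(seq, seq_start, pos):
    """Translate complete codons of seq, starting at 1-based position pos."""
    offset = pos - seq_start
    codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
    return "".join(CODONS[codon] for codon in codons)


print(context_end(CONTEXT, CONTEXT_START))                  # 3432
print(translate_from(CONTEXT, CONTEXT_START, AMBER_START))  # *QLQ
```

The asterisk marks the amber codon itself; when it is suppressed, translation continues in the same frame through the Gln-Leu-Gln codons that follow.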
It remains to be determined whether the attenuated biological behaviour of PMMV-S in tobacco, in comparison with TMV and ToMV, could be ascribed to regions of maximum sequence heterogeneity or to segments of the less highly conserved sequence of the 126K/183K protein. The latter has been described for the attenuated L11A strain of ToMV (Nishiguchi et al., 1985), in which the amino acid substitutions responsible for this characteristic have been mapped to regions of the 126K protein that are not highly conserved (amino acid residues 348, 759 and 894). It is also unknown whether the ability of PMMV-S to break the resistance against tobamoviruses conferred by the L1 and L2 genes in pepper (Boukema et al., 1980; Garcia-Luque et al., 1990) is due to any of the amino acid changes which take place in this protein. Such a mechanism has been described for the Lta1 strain of ToMV, in which two amino acid substitutions (Gln to Glu and Tyr to His... 
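The coordinates quoted for the leaky stop codon and its conserved context can be cross-checked with a few lines of code. The following is only a minimal sketch: it assumes that the 13-nucleotide context is written in the DNA alphabet (so UAG appears as TAG) and that its first nucleotide sits at genome position 3420, as stated in the text; the function name `stop_codon_span` is hypothetical and introduced here for illustration.

```python
# Conserved context around the readthrough stop codon, as quoted in the text
# (DNA alphabet, so the UAG codon appears as TAG).
CONTEXT = "ATAGCAATTACAG"
CONTEXT_START = 3420  # assumed genome position of the first nucleotide of CONTEXT


def stop_codon_span(context, start, codon="TAG"):
    """Return 1-based genome coordinates (first, last) of the first occurrence
    of `codon` in `context`, or None if the codon is absent."""
    offset = context.find(codon)
    if offset < 0:
        return None
    first = start + offset
    return first, first + len(codon) - 1


# Last genome position covered by the conserved context.
context_end = CONTEXT_START + len(CONTEXT) - 1
# Genome coordinates of the stop codon inside that context.
span = stop_codon_span(CONTEXT, CONTEXT_START)
print(context_end, span)
```

Under these assumptions the context ends at nt 3432 and the stop codon occupies nt 3421 to 3423, which agrees with the positions given in the text.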
Organization of the 30K protein gene

The third ORF of PMMV-S encodes the 30K protein (Fig. 2). Translation initiates at nt 4909 and terminates at nt 5682; thus the coding region for this protein overlaps neither with the coding region for the 183K protein nor with that for the coat protein. Its putative translation product is 257 amino acids long with a calculated Mr of 28 347. Therefore, this is the first tobamovirus in which none of the reading frames overlap, since in ToMV, TMV, cucumber green mottle mosaic virus (CGMMV) and TMGMV the 5' end of the gene encoding the 30K protein overlaps with the 3' end of that encoding the 183K protein (Goelet et al., 1982; Ohno et al., 1984; Saito et al., 1988; Solis & Garcia-Arenal, 1990), and in sunn-hemp mosaic virus (SHMV) as well as in CGMMV (Meshi et al., 1982; Saito et al., 1988) the 30K-coding region overlaps at its 3' end with the coat protein gene.

The 30K proteins of tobamoviruses are responsible for cell-to-cell spread of the viral infection (Deom et al., 1987; Meshi et al., 1987), by modifying the plasmodesmata (Wolf et al., 1989). Although the exact mechanism of action is unknown (Hull, 1989), a domain responsible for binding to nucleic acids, which maps between amino acid positions 65 and 87, has been defined (Citovsky et al., 1990). As with the consensus sequences in the 126K/183K proteins, the amino acid substitutions that take place in this region of the PMMV-S 30K protein are of the conservative type with respect to TMV, ToMV and TMGMV, except for the semi-conservative exchange Ala to Val (position 70) in the ToMV 30K protein (Fig. 4).

As in other tobamoviruses (Ohno et al., 1984; Solis & Garcia-Arenal, 1990), the PMMV-S 30K protein is encoded in the least conserved part of the entire genome, at both the nucleotide and amino acid levels (Table 1). Its alignment with those from the most closely related tobamoviruses (Fig. 4) shows that the PMMV-S 30K protein shares a higher degree of amino acid sequence identity with that of TMV than with those of ToMV or TMGMV, in contrast to other proteins encoded by PMMV-S. It contains two well-conserved regions located at amino acid positions 46 to 125 and 151 to 204. In the first one, only one non-conservative amino acid substitution takes place in the PMMV-S 30K protein with respect to ToMV and TMV (Arg to Tyr, both at position 109) and TMGMV (Ala to Cys at position 113). In the second well-conserved region, all of the amino acid changes among the TMV, ToMV and PMMV-S 30K proteins are of the conservative type, but it is less conserved compared to TMGMV, with three non-conservative substitutions (Leu to Lys, Thr to Leu and Ala to Lys at positions 172, 179 and 192, respectively) (Fig. 4).

Several amino acid changes in the central segment of the tobamoviral 30K proteins have been identified in temperature-sensitive mutants defective in cell-to-cell movement (Ohno et al., 1983; Zimmern & Hunter, 1983) as well as in the Ltb1 strain of ToMV, known to overcome the resistance conferred by the Tm-2 gene in tomato plants (Meshi et al., 1989). This capability resides in two amino acid substitutions, at positions 68 and 133 (Cys to Phe and Glu to Lys, respectively). In this region of the PMMV-S 30K protein (Fig. 4), there are three non-conservative amino acid changes with respect to the proteins of TMV (Glu to Met, Ala to Lys and Lys to Ala at positions 133, 147 and 150, respectively) and of ToMV (Ala to Lys, Ala to Lys and Lys to Ala at positions 130, 147 and 150) but only one (Val to Pro at position 136) with respect to TMGMV. Although some of these substitutions may be of a compensatory type, they could be responsible for the ability of PMMV-S to overcome the pepper L1 and L2 resistance genes.

The carboxy regions of the tobamoviral 30K proteins are the most variable in terms of length or amino acid sequence. However, Saito et al. (1988) found that all of these proteins have a particular charge distribution with a basic domain flanked by two acidic domains. In this respect, the content of acidic amino acids (Glu, Asp) in the extreme C terminus of the 30K protein of PMMV-S is lower (four) than that of TMV and ToMV (six and seven, respectively). These changes could be involved in the adaptability of PMMV-S to its pepper host, although they may only represent, as stated above, the high degree of variability in this area of the tobamoviral 30K proteins.

Fig. 4. Alignment of the amino acid sequence of the 30K protein of PMMV-S with those from the most closely related tobamoviruses (rows: PMMV-S, ToMV, TMV, TMGMV a and TMGMV b). Numbering, symbols and source of the amino acid sequence data as in Fig. 3 and Table 1. The amino acid sequence domains defined by Saito et al. (1988) are boxed.

Coat protein gene

The fourth ORF of PMMV-S encodes the coat protein. It ranges from nt 5685 to 6158, with an intergenic region of two nucleotides between the 30K and coat protein ORFs (Fig. 2). The resulting protein consists of 156 amino acids, with a calculated Mr of 17 110. This value differs from the previous report of 158 amino acids for the coat protein of an Italian isolate of PMMV (Wetter et al., 1984), determined by amino acid analysis. Although PMMV-S and PMMV are different isolates, which can be distinguished by their responses in Capsicum spp. with different resistance genes and have therefore been identified as different pathotypes (Garcia-Luque et al., 1990), sequencing of the Italian PMMV coat protein gene has shown that it also consists of 156 amino acids (M. L. Ferrero, I. Garcia-Luque, E. Alonso, A. de la Cruz, J. F. Rodriguez, M. T. Serra & J. R. Diaz-Ruiz, unpublished results).

The alignment of the deduced amino acid sequence of the coat protein of PMMV-S with those of other tobamoviruses (Fig. 5) shows that there is a strict conservation of those amino acid sequence motifs (36 to 41, 88 to 94, 113 to 120) which correspond to the RNA-binding site in the coat protein (Altschuh et al., 1987). Lower amino acid sequence identities are located in the N-terminal, C-terminal and central regions of the proteins (Fig. 5). It remains an open question whether any of these changes also affect the ability of PMMV-S to overcome the resistance conferred by the L1 and L2 genes in pepper plants, as described for several mutant strains of TMV and ToMV in which changes in their coat proteins make these mutants either able to be localized by the N' gene in N. sylvestris or able to escape its action (Knorr & Dawson, 1988; Saito et al., 1988, 1989; Culver & Dawson, 1989).

Based upon nucleotide sequence homology with the TMV origin of assembly (Zimmern, 1977), the predicted position of this region in PMMV-S is located between nt 5458 and 5517, that is, within the 30K ORF and upstream of the coat protein ORF. This is in accordance with the absence of encapsidation of its coat protein mRNA, as shown by electron microscopy observation (Wetter et al., 1984) and electrophoretic analysis of the virion particles (Garcia-Luque et al., 1990).

Fig. 5. Alignment of the coat protein sequence of PMMV-S with those from the more closely related tobamoviruses (rows: PMMV-S, ToMV, TMV and TMGMV). Numbering and source of amino acid sequences as in Fig. 3 and Table 1. The RNA-binding domains are boxed.
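The ORF coordinates given for the 30K and coat protein genes can be checked against the reported protein lengths with simple arithmetic. The sketch below is illustrative only and rests on two assumptions that are not stated explicitly in the text: that the quoted coordinates are 1-based and inclusive, and that the end coordinate includes the termination codon. The helper name `orf_summary` is hypothetical.

```python
def orf_summary(start, end):
    """Summarise an ORF given 1-based inclusive coordinates.

    Assumes (for illustration) that `end` is the last nucleotide of the
    stop codon, so the number of sense codons is the codon count minus one.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length // 3
    return {"nt": length, "codons": codons, "sense_codons": codons - 1}


# Coordinates as quoted in the text.
ORFS = {"30K": (4909, 5682), "coat protein": (5685, 6158)}

summaries = {name: orf_summary(*coords) for name, coords in ORFS.items()}

# Nucleotides lying strictly between the end of the 30K ORF and the start
# of the coat protein ORF.
intergenic = ORFS["coat protein"][0] - ORFS["30K"][1] - 1

for name, summary in summaries.items():
    print(name, summary)
print("intergenic nt:", intergenic)
```

Under these assumptions the 30K ORF spans 774 nt (257 sense codons, matching the 257 amino acids reported) and the two ORFs are separated by two nucleotides, as stated. The coat protein ORF spans 474 nt, which gives 157 sense codons, one more than the 156 residues reported; one possible explanation, not stated in this section, is that the initiator methionine is not counted in the mature protein.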
Functional and evolutionary considerations

The determination of the nucleotide sequence of PMMV-S RNA has allowed us to confirm that the entire genome of this virus has diverged from other related tobamoviruses at a similar rate. The grouping of the tobamoviruses based on the amino acid composition of their coat proteins (Gibbs, 1986) and on the peptide pattern of the 126K proteins (Fraile & Garcia-Arenal, 1990) corresponds well to what is deduced from the entire genome, i.e. PMMV is located in the same cluster as TMV, ToMV and TMGMV, being more closely related to ToMV and TMV. These data also confirm the relationship between ToMV and PMMV-S previously found by serological analysis (Wetter et al., 1984; Alonso et al., 1989).

Since PMMV-S RNA possesses all the conserved sequence motifs necessary for replication and virion stability, its biological properties, such as the diminished capability to replicate and/or accumulate in tobacco plants, its ability to overcome the tobamoviral resistance genes in pepper and its inability to infect tomato plants, should be ascribed to the requirements for the establishment of a functional interaction(s) with the host factor(s) known to be necessary for the efficient multiplication of the viruses in their host plants. Whether these requirements are related to changes at the amino acid level (polymerase, 30K and/or coat proteins) or at the nucleotide level (3' non-coding region) is under current study.

The authors thank M. V. Lafita for typing the manuscript. B.W. and E.A. were supported by fellowships from MEC and Fundación Ramón Areces, respectively. The work was supported by grants from PLANICYT (AGR88-0082) and Fundación Ramón Areces.

References

AHLQUIST, P., STRAUSS, E. G., RICE, C. M., STRAUSS, J. H., HASELOFF, J. & ZIMMERN, D. (1985). Sindbis virus proteins nsP1 and nsP2 contain homology to non-structural proteins from several RNA plant viruses. Journal of Virology 53, 536-542.
ALLISON, R. F., JANDA, M. & AHLQUIST, P. (1989). Sequence of cowpea chlorotic mottle virus RNAs 2 and 3 and evidence of a recombination event during bromoviral evolution. Virology 172, 321-330.
ALONSO, E., GARCÍA-LUQUE, I., AVILA-RINCON, M. J., WICKE, B., SERRA, M. T. & DÍAZ-RUÍZ, J. R. (1989). A tobamovirus causing heavy losses in protected pepper crops in Spain. Journal of Phytopathology 125, 67-76.
ALTSCHUH, D., LESK, A. M., BLOOMER, A. C. & KLUG, A. (1987). Correlation of co-ordinated amino acid substitutions with function in viruses related to TMV. Journal of Molecular Biology 193, 693-707.
AVILA-RINCON, M. J., FERRERO, M. L., ALONSO, E., GARCÍA-LUQUE, I. & DÍAZ-RUÍZ, J. R. (1989). Nucleotide sequences of 5' and 3' non-coding regions of pepper mild mottle virus strain S RNA. Journal of General Virology 70, 3025-3031.
BEIER, H., BARCISZEWSKA, M., KRUPP, G., MITNACHT, R. & GROSS, H. J. (1984). UAG readthrough during TMV RNA translation: isolation and sequence of two tRNAs (Tyr) with suppressor activity from tobacco plants. EMBO Journal 3, 351-356.
BOUKEMA, I. W., JANSEN, K. & HOFMAN, K. (1980). Strains of TMV and genes for resistance in Capsicum. Synopses 4th Meeting Eucarpia Capsicum Working Group (Wageningen), pp. 44-48.
CITOVSKY, V., KNORR, D., SCHUSTER, G. & ZAMBRYSKI, P. (1990). The P30 movement protein of tobacco mosaic virus is a single-strand nucleic acid binding protein. Cell 60, 637-647.
CULVER, J. N. & DAWSON, W. O. (1989). Point mutations in the coat protein gene of tobacco mosaic virus induce hypersensitivity in Nicotiana sylvestris. Molecular Plant-Microbe Interactions 2, 209-213.
DEOM, C. M., OLIVER, M. J. & BEACHY, R. N. (1987). The 30-kilodalton gene product of tobacco mosaic virus potentiates virus movement. Science 237, 389-394.
FRAILE, A. & GARCÍA-ARENAL, F. (1990). A classification of the tobamoviruses based on comparisons among their 126K proteins. Journal of General Virology 71, 2223-2228.
GARCÍA-LUQUE, I., SERRA, M. T., ALONSO, E., WICKE, B., FERRERO, M. L. & DÍAZ-RUÍZ, J. R. (1990). Characterization of a Spanish strain of pepper mild mottle virus (PMMV-S) and its relationship to other tobamoviruses. Journal of Phytopathology 129, 1-8.
GIBBS, A. (1986). Tobamovirus classification. In The Plant Viruses, vol. 2, pp. 168-178. Edited by M. H. V. Van Regenmortel & H. Fraenkel-Conrat. New York: Plenum Press.

GOELET, P., LOMONOSSOFF, G. P., BUTLER, P. J. G., AKAM, M. E., GAIT, M. J. & KARN, J. (1982). Nucleotide sequence of tobacco mosaic virus RNA. Proceedings of the National Academy of Sciences, U.S.A. 79, 5818-5822.
GOLDBACH, R. & WELLINK, J. (1988). Evolution of plus-strand RNA viruses. Intervirology 29, 260-267.
GORBALENYA, A. E. & KOONIN, E. V. (1989). Viral proteins containing the purine NTP-binding sequence pattern. Nucleic Acids Research 17, 7735-7762.
GUBLER, U. & HOFFMAN, B. J. (1983). A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.
HABILI, N. & SYMONS, R. H. (1989). Evolutionary relationship between luteoviruses and other RNA plant viruses based on sequence motifs in their putative RNA polymerases and nucleic acid helicases. Nucleic Acids Research 17, 9543-9555.
HASELOFF, J., GOELET, P., ZIMMERN, D., AHLQUIST, P., DASGUPTA, R. & KAESBERG, P. (1984). Striking similarities in amino acid sequence among non-structural proteins encoded by RNA viruses that have dissimilar genomic organization. Proceedings of the National Academy of Sciences, U.S.A. 81, 4358-4362.
HODGMAN, T. C. (1988). A new superfamily of replicative proteins. Nature, London 333, 22-23.
HULL, R. (1989). The movement of viruses in plants. Annual Review of Phytopathology 27, 213-240.
ISHIKAWA, M., MESHI, T., MOTOYOSHI, F., TAKAMATSU, N. & OKADA, Y. (1986). In vitro mutagenesis of the putative replicase genes of tobacco mosaic virus. Nucleic Acids Research 14, 8291-8305.
ISHIKAWA, M., MESHI, T., WATANABE, Y. & OKADA, Y. (1988). Replication of chimeric tobacco mosaic viruses which carry heterologous combinations of replicase genes and 3' non-coding regions. Virology 164, 290-293.
KAMER, G. & ARGOS, P. (1984). Primary structural comparison of RNA-dependent polymerases from plant, animal and bacterial viruses. Nucleic Acids Research 12, 7269-7282.
KNORR, D. A. & DAWSON, W. O. (1988). A point mutation in the tobacco mosaic virus capsid protein gene induces hypersensitivity in Nicotiana sylvestris. Proceedings of the National Academy of Sciences, U.S.A. 85, 170-174.
MANIATIS, T., FRITSCH, E. F. & SAMBROOK, J. (1982). Molecular Cloning: A Laboratory Manual. New York: Cold Spring Harbor Laboratory.
MAXAM, A. M. & GILBERT, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. Methods in Enzymology 65, 499-560.
MESHI, T., OHNO, T., IBA, H. & OKADA, Y. (1981). Nucleotide sequence of a cloned cDNA copy of TMV (cowpea strain) RNA, including the assembly origin, the coat protein cistron, and the 3' non-coding region. Molecular and General Genetics 184, 20-25.
MESHI, T., OHNO, T. & OKADA, Y. (1982). Nucleotide sequence of the 30K protein cistron of cowpea strain of tobacco mosaic virus. Nucleic Acids Research 10, 6111-6117.
MESHI, T., KIYAMA, R., OHNO, T. & OKADA, Y. (1983). Nucleotide sequence of the coat protein cistron and the 3' non-coding region of cucumber green mottle mosaic virus (watermelon strain) RNA. Virology 127, 54-64.
MESHI, T., WATANABE, Y., SAITO, T., SUGIMOTO, A., MAEDA, T. & OKADA, Y. (1987). Function of the 30-kD protein of tobacco mosaic virus: involvement in cell-to-cell movement and dispensability for replication. EMBO Journal 6, 2557-2567.
MESHI, T., MOTOYOSHI, F., ADACHI, A., WATANABE, Y., TAKAMATSU, N. & OKADA, Y. (1988). Two concomitant base substitutions in the putative replicase genes of tobacco mosaic virus confer the ability to overcome the effects of a tomato resistance gene, Tm-1. EMBO Journal 7, 1575-1581.
MESHI, T., MOTOYOSHI, F., MAEDA, T., YOSHIWOKA, S., WATANABE, H. & OKADA, Y. (1989). Mutations in the tobacco mosaic virus 30-kD protein gene overcome Tm-2 resistance in tomato. Plant Cell 1, 515-522.
MI, S., DURBIN, R., HUANG, H. V., RICE, C. M. & STOLLAR, V. (1989). Association of the Sindbis virus RNA methyltransferase activity with the non-structural protein nsP1. Virology 170, 385-391.
NEJIDAT, A., CELLIER, F., HOLT, C. A., GAFNY, R., EGGENBERGER, A. L. & BEACHY, R. N. (1991). Transfer of the movement protein gene between two tobamoviruses: influence on local lesion development. Virology 180, 318-326.
NISHIGUCHI, M., KIKUCHI, S., KIHO, Y., OHNO, T., MESHI, T. & OKADA, Y. (1985). Molecular basis of plant viral virulence; the complete nucleotide sequence of an attenuated strain of tobacco mosaic virus. Nucleic Acids Research 13, 5585-5590.
OHNO, T., TAKAMATSU, N., MESHI, T., OKADA, Y., NISHIGUCHI, M. & KIHO, Y. (1983). Single amino acid substitution in 30K protein of TMV defective in virus transport function. Virology 131, 255-258.
OHNO, T., AOYAGI, M., YAMANASHI, Y., SAITO, H., IKAWA, S., MESHI, T. & OKADA, Y. (1984). Nucleotide sequence of the tobacco mosaic virus (tomato strain) genome and comparison with the common strain genome. Journal of Biochemistry 96, 1915-1923.
PALUKAITIS, P. & ZAITLIN, M. (1986). Tobacco mosaic virus: infectivity and replication. In The Plant Viruses, vol. 2, pp. 105-131. Edited by M. H. V. Van Regenmortel & H. Fraenkel-Conrat. New York: Plenum Press.
POCH, O., SAUVAGET, I., DELARUE, M. & TORDO, N. (1989). Identification of four conserved motifs among the RNA-dependent polymerase encoding elements. EMBO Journal 8, 3867-3874.
QUADT, R. & JASPARS, E. M. J. (1989). RNA polymerases of plus-strand RNA viruses of plants. Molecular Plant-Microbe Interactions 2, 219-223.
RICHARDS, K. E., GUILLEY, H., JONARD, G. & KEITH, G. (1977). Leader sequence of 71 nucleotides devoid of G in tobacco mosaic virus RNA. Nature, London 267, 548-550.
ROZANOV, M. N., KOONIN, E. V. & GORBALENYA, A. E. (1990). N-terminal domains of large putative NTPases of 'Sindbis-like' plant viruses share amino acid motifs and may be RNA methyltransferases. Abstracts, VIIIth International Congress of Virology (Berlin), p. 377.
SAITO, T., IMAI, Y., MESHI, T. & OKADA, Y. (1988). Interviral homologies of the 30K proteins of tobamoviruses. Virology 167, 653-656.
SAITO, T., YAMANAKA, K., WATANABE, Y., TAKAMATSU, N., MESHI, T. & OKADA, Y. (1989). Mutational analysis of the coat protein gene of tobacco mosaic virus in relation to hypersensitive response in tobacco plants with the N' gene. Virology 173, 11-20.
SOLÍS, I. & GARCÍA-ARENAL, F. (1990). The complete nucleotide sequence of the genomic RNA of the tobamovirus tobacco mild green mosaic virus. Virology 177, 553-558.
STRAUSS, J. H. & STRAUSS, E. G. (1988). Evolution of RNA viruses. Annual Review of Microbiology 42, 657-683.
STRAUSS, E. G., LEVINSON, R., RICE, C. M., DALRYMPLE, J. & STRAUSS, J. H. (1988). Non-structural proteins nsP3 and nsP4 of Ross River and O'Nyong-nyong viruses: sequence and comparison with those of other alphaviruses. Virology 164, 265-274.
WETTER, C. (1986). Tobacco mild green mottle mosaic virus. In The Plant Viruses, vol. 2, pp. 205-219. Edited by M. H. V. Van Regenmortel & H. Fraenkel-Conrat. New York: Plenum Press.
WETTER, C. & CONTI, M. (1988). Pepper mild mottle virus. CMI/AAB Descriptions of Plant Viruses, no. 330.
WETTER, C., CONTI, M., ALTSCHUH, D., TABILLION, R. & VAN REGENMORTEL, M. H. V. (1984). Pepper mild mottle virus, a tobamovirus infecting pepper cultivars in Sicily. Phytopathology 74, 405-410.
WOLF, S., DEOM, C. M., BEACHY, R. N. & LUCAS, W. J. (1989). Movement protein of tobacco mosaic virus modifies plasmodesmatal size exclusion limit. Science 246, 377-379.
YOUNG, N., FORNEY, J. & ZAITLIN, M. (1987). Tobacco mosaic virus replicase and replicative structures. Journal of Cell Science Supplement 7, 277-285.
ZIMMERN, D. (1977). The nucleotide sequence at the origin for assembly on tobacco mosaic virus RNA. Cell 11, 463-482.
ZIMMERN, D. & HUNTER, T. (1983). Point mutation of the 30K open reading frame of TMV implicated in temperature sensitive assembly and local lesion spreading of mutant Ni2519. EMBO Journal 3, 1893-1900.

(Received 3 June 1991; Accepted 20 August 1991)
