
Proc. Natl. Acad. Sci. USA
Vol. 81, pp. 2752-2756, May 1984
Biochemistry

Human transferrin: cDNA characterization and chromosomal localization*


(iron binding protein/intragenic duplication/chromosome mapping)

FUNMEI YANGt, J. B. LUMt, JOHN R. MCGILLt, CHARLEEN M. MOOREt, SUSAN L. NAYLORt, PETER H. VAN BRAGTt, W. DAVID BALDWINt, AND BARBARA H. BOWMANt
tDivision of Genetics, The University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78284; and tDepartment of Human Genetics, Roswell Park Memorial Institute, Buffalo, NY 14263

Communicated by Eloise R. Giblett, January 23, 1984

ABSTRACT Transferrin (Tf) is the major iron binding protein in vertebrate serum. It shares homologous amino acid sequences with four other proteins: lactotransferrin, ovotransferrin, melanoma antigen p97, and HuBlym-1. Antigen p97 and the Tf receptor genes have been mapped on human chromosome 3. The goal of the study described here was to initiate the characterization of the Tf gene by identifying and characterizing its cDNA and mapping its chromosomal location. Recombinant plasmids containing human cDNA encoding Tf have been isolated by screening an adult human liver library with a mixed oligonucleotide probe. Within the 2.3 kilobase pairs of Tf cDNA analyzed, there is a probable leader sequence encoded by 57 nucleotides followed by 2037 nucleotides that encode the homologous amino and carboxyl domains. During evolution, three areas of the homologous amino and carboxyl domains have been strongly conserved, possibly reflecting functional constraints associated with iron binding. Chromosomal mapping by in situ hybridization and somatic cell hybrid analysis indicate that the Tf gene is located at q21-25 on human chromosome 3, consistent with linkage of the Tf, Tf receptor, and melanoma p97 loci.

Transferrin (Tf) carries ferric iron from the intestine, reticuloendothelial system, and liver parenchymal cells to all proliferating cells in the body. The family of Tf-like proteins represents the product of an intragenic duplication followed by a series of independent gene duplications (1-4). Serum Tf (1), hen ovotransferrin (2, 3), lactotransferrin (4), melanoma antigen p97 (5), and a transforming protein from chicken lymphoma ChBlym-1 (6) share strong amino acid sequence homologies. A transforming protein from Burkitt lymphomas recently described (7) may also belong to the Tf family. There is also significant internal homology in the amino-terminal (NH2) and carboxyl-terminal (COOH) domains of Tf, lactotransferrin, and ovotransferrin (1-4). For example, the NH2 and COOH domains of human Tf reveal 40% identity when the NH2 domain (residues 1-336) and the COOH domain (residues 337-678) are compared (1). In the study described here the cDNA encoding human Tf is characterized and the Tf gene is mapped to chromosome 3 as the initiatory study toward analyzing the expression of the genes controlling this family of proteins.

MATERIALS AND METHODS

Synthesis of Oligonucleotides of Mixed Sequence. The sequence of amino acid residues 309-314 (Met-Asn-Ala-Lys-Met-Tyr) from the NH2 terminus of human Tf (1) was chosen for construction of a 17-mer oligonucleotide probe. The probe was synthesized by the solid-phase phosphite triester method (8) by BioLogicals (Ottawa, Canada). Sixteen mixed oligonucleotides were synthesized as shown:

3' TAC-TT(G/A)-CG(G/A/T/C)-TT(C/T)-TAC-AT 5'

Purified oligonucleotides were labeled at the 5' end with [γ-³²P]ATP and polynucleotide kinase (9) to screen the cDNA library.

Screening of cDNA Clones. The cDNA library, kindly provided by Stuart H. Orkin (Harvard Medical School, Boston), was constructed from human liver RNA as described (10). The cDNA library was incubated overnight on L agar plates containing 10 µg of tetracycline per ml and was transferred to nitrocellulose filters. The plasmids were amplified with 250 µg of chloramphenicol per ml and the filters were prepared for hybridization as described by Grunstein and Hogness (11). The filters were hybridized at 37°C with ³²P-labeled oligonucleotide mixed probes as described by Wallace, et al. (12). The hybridization mixture contained 0.9 M NaCl, 0.09 M Tris·HCl (pH 7.5), 0.006 M EDTA, 0.5% NaDodSO4, Denhardt's (13) solution (5× strength), 100 µg of denatured Escherichia coli DNA per ml, and 6.4 ng of 5' end-labeled oligonucleotide mixed probes per ml having a specific activity of ≈7 × 10⁸ cpm/µg.

Preparation of Plasmid DNA. Bacterial clones were grown in M9 medium (14) supplemented with 0.2% Casamino acids (Difco), 0.5% glucose, 0.01 M MgSO4, and 5 µg of tetracycline per ml. When the optical density of the culture reached 0.8 OD600, 100 µg of chloramphenicol per ml was added and the culture was incubated overnight. Plasmid DNA was isolated as described (15) and purified on two consecutive ethidium bromide/CsCl gradients.

Restriction Endonuclease Mapping and DNA Sequence Determination. Restriction endonuclease fragments were first labeled either at the 5' end with [γ-³²P]ATP by using T4 polynucleotide kinase (9), at the 3' end with [α-³²P]dNTPs by using the Klenow fragment of E. coli polymerase I, or at the 3' end with cordycepin [α-³²P]triphosphate by using terminal deoxynucleotidyl transferase (16). The labeled DNA fragments were then cleaved with a second endonuclease, and the uniquely labeled fragments were separated by polyacrylamide or agarose gel electrophoresis. Labeled DNAs were recovered by electroelution and subjected to sequence analysis by the method of Maxam and Gilbert (17). Homology of nucleotide sequences in the NH2 and COOH domains was evaluated by calculation of an "accident probability," Pa, the probability that a homology equal to or greater than that being considered might arise accidentally (18). A homology

Abbreviations: Tf, transferrin; bp, base pair(s).
*A preliminary report of this research has been presented (35).
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. 1734 solely to indicate this fact.
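The design of the mixed probe described under Synthesis of Oligonucleotides of Mixed Sequence can be illustrated with a short sketch. It assumes only the standard genetic code and the peptide Met-Asn-Ala-Lys-Met-Tyr given in the text; the function name `probe_pool` is introduced here for illustration and does not come from the original work. The sketch enumerates every coding sequence for the six residues, truncates each to 17 nucleotides, and writes the complementary strand 3' to 5'.

```python
from itertools import product

# Standard genetic code codons for the residues in the peptide (coding strand, 5'->3').
CODONS = {
    "Met": ["ATG"],
    "Asn": ["AAT", "AAC"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Lys": ["AAA", "AAG"],
    "Tyr": ["TAT", "TAC"],
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def probe_pool(peptide, length=17):
    """Return the set of probes, written 3'->5', complementary to every coding sequence."""
    coding = {"".join(c)[:length] for c in product(*(CODONS[aa] for aa in peptide))}
    # Complementing base by base without reversing gives the antisense strand read 3'->5'.
    return {seq.translate(COMPLEMENT) for seq in coding}


pool = probe_pool(["Met", "Asn", "Ala", "Lys", "Met", "Tyr"])
print(len(pool), "distinct 17-mers")
```

Truncating to 17 nucleotides removes the degenerate third position of the tyrosine codon, which is why the pool contains 16 sequences (2 × 4 × 2) and not 32.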


[Fig. 1 diagram: map of the Tf cDNA showing the leader sequence (beginning at amino acid residue -19), the N-domain (residues 1-336; 336 amino acids), the C-domain (residues 337-679; 343 amino acids), the untranslated region, the poly(A) tail, and the flanking dG·dC tails; scale bar, 100 bp; arrows mark the extent of sequence.]

FIG. 1. Structure of human Tf cDNA and strategy for determining the nucleotide sequence of Tf cDNA. Arrows indicate direction and analysis; solid dots indicate labeling sites. Only the positions of relevant restriction sites are indicated.
[Fig. 2 sequence: nucleotide sequence of the Tf cDNA with the deduced amino acid sequence written beneath it, numbered from residue -19 through residue 679 and ending in a poly(A) tract. The leader reads Met-Arg-Leu-Ala-Val-Gly-Ala-Leu-Leu-Val-Cys-Ala-Val-Leu-Gly-Leu-Cys-Leu-Ala (residues -19 to -1), and the mature protein begins Val-Pro-Asp-Lys-Thr-Val-Arg-Trp-Cys-Ala (residues 1-10).]

FIG. 2. Complete nucleotide sequence of the cDNA and deduced amino acid sequence of human Tf. The amino acid sequence corresponding to the entire human Tf is numbered according to the system described in ref. 1. The sequence that corresponds to the signal peptide appears before the NH2-terminal valine in residue 1.

Table 1. Homology summary


Homology   Location                 DNA             Pa (0.25)      Amino acid
block      (amino acid residue)     homology (%)                   homology (%)
A          91-136 vs. 422-468       99/138 (72)     3.0 × 10⁻¹⁹    30/46 (65)
B          184-217 vs. 513-545      65/102 (64)     4.4 × 10⁻¹¹    19/34 (59)
C          220-254 vs. 556-590      65/105 (62)     1.4 × 10⁻¹⁰    23/35 (66)

Pa (0.25) is the accident probability calculated for random-match probability, P, of 0.25. A homology over 100 nucleotides with a Pa of <7 × 10⁻⁴ is considered significant (18).
over 100 nucleotides with a Pa of <7 × 10⁻⁴ was considered significant.
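To give a feel for the scale of such probabilities, the sketch below computes a plain binomial tail: the chance of seeing at least the observed number of identities in an ungapped alignment when each position matches independently with probability 0.25. This is an assumption made for illustration and is not claimed to be the Pa calculation of ref. 18, so the values it prints are not expected to reproduce the Pa column of Table 1. The identity counts are those listed in Table 1; the function name `binomial_tail` is hypothetical.

```python
from math import comb


def binomial_tail(matches, length, p=0.25):
    """P(X >= matches) for X ~ Binomial(length, p)."""
    return sum(
        comb(length, k) * p**k * (1 - p) ** (length - k)
        for k in range(matches, length + 1)
    )


# (identical nucleotides, aligned nucleotides) for homology blocks A, B, and C of Table 1.
BLOCKS = {"A": (99, 138), "B": (65, 102), "C": (65, 105)}

percent_identity = {name: round(100 * m / n) for name, (m, n) in BLOCKS.items()}
tails = {name: binomial_tail(m, n) for name, (m, n) in BLOCKS.items()}

for name in BLOCKS:
    print(name, percent_identity[name], tails[name])
```

Under this simplified model all three blocks fall far below the 7 × 10⁻⁴ threshold, in line with the conclusion drawn from Table 1.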

Gene Mapping. The in situ procedure of Harper and Saunders (19) was used for mapping the Tf gene. The recombinant plasmid containing a 1500-base-pair (bp) insert of cDNA encoding human Tf was labeled by nick-translation using [³H]dATP (40 Ci/mmol; 1 Ci = 37 GBq), [³H]dCTP (25 Ci/mmol), and [³H]dTTP (40 Ci/mmol). Specific activities of 1-2 × 10⁷ cpm/µg of DNA were achieved. After development the hybridized preparations on slides were stained with 0.25% Wright's stain for G-banding.

Chromosome mapping using human-mouse somatic cell hybrids that segregate human chromosomes was also carried out (20). Specific translocations involving human chromosome 3 were utilized in some hybrids for sublocalization of the gene (21). TSL-2 contains the pter → p21 region of chromosome 3 in a 17/3 translocation chromosome. The XTR series of hybrids segregate X;3 translocation chromosomes: XTR-22 contains the X/3 chromosome (with the q21 → qter region of 3) and XTR-3BSAgB has the 3/X chromosome (the pter → q21 region of 3). DNAs from the cell hybrids were digested with EcoRI and were analyzed by Southern filter hybridization. The cDNA insert encoding Tf was isolated by PstI digestion, agarose electrophoresis, and electroelution of the two human fragments of ≈750 and 700 bp. The nick-translated probe was hybridized to the cell hybrid DNA as described (21).

RESULTS AND DISCUSSION


Twenty cDNA clones isolated from a library containing human liver cDNA contained sequences of the Tf gene as deduced from the amino acid sequence (1) and the sequence corresponding to one of the synthesized oligonucleotide probes constructed according to the sequence beginning at Met-309 (3' ATG-GAT-GCC-AAG-ATG-TA 5'). Although the second amino acid in the sequence was found to be aspartic acid, rather than asparagine (1), this probe was adequate to detect the cDNA encoding Tf.

One cDNA clone contained the entire coding sequence for Tf. It was characterized by cleavage with a series of restriction endonucleases following the strategy outlined in Fig. 1. The complete sequence of 2324 bp was established by analyzing each of the two DNA strands or by determining the sequence of the same strand two or more times. The complete 2324-bp cDNA sequence contains a single reading frame (Fig. 2). The ATG coding for methionine -19 probably serves as the site of translation initiation, as it is the first one encountered. It is followed 2091 nucleotides later by a TAA termination triplet.

The valine designated as amino acid 1 is based on the amino acid sequence of human serum Tf (1). This valine is preceded by 19 amino acids that probably constitute a signal peptide involved in the secretion of Tf. A 19-residue leader sequence was also reported in the nucleotide sequence of chicken ovotransferrin mRNA (2, 22). Nucleotides encoding the leader sequence of ovotransferrin are 56% homologous to those described here in human Tf. Although the ChBlym-1 transforming gene in chicken lymphoma DNA (6) is homologous with the NH2-terminal region of human Tf, its leader sequence bears little, if any, homology with that of Tf.

The human Tf cDNA sequence predicts the amino acid sequence reported by MacGillivray, et al. (1), with exceptions commonly encountered in amino acid sequence analyses. The nucleotide sequence predicted Gln-245, instead of glutamic acid, and predicted asparagine, instead of aspartic acid, at residue 417. Aspartic acid, instead of asparagine, was deduced at residue 310. Residues 361 and 362 were Asn-Ser, instead of Ser-Asp. Residue 539 was proline, instead of threonine, and residue 542 was threonine, instead of proline. Residues 572 and 653 were predicted to be glutamic acid, instead of glutamine.
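Two of the numerical statements above can be checked with simple arithmetic. The sketch below first relates the residue counts to the nucleotide counts, under the assumption that "2091 nucleotides later" counts the nucleotides between the initiator ATG and the TAA triplet. It then counts mismatches between the 16-member probe pool and the complement of the cDNA sequence found at Met-309, which is one way to see why the probe remained adequate despite the Asn/Asp difference. The names `POOL`, `TARGET`, and `mismatches` are introduced here for illustration only.

```python
from itertools import product

LEADER_RESIDUES = 19    # signal peptide, residues -19 to -1
MATURE_RESIDUES = 679   # mature Tf, residues 1 to 679

leader_nt = 3 * LEADER_RESIDUES        # nucleotides encoding the leader
mature_nt = 3 * MATURE_RESIDUES        # nucleotides encoding the two domains
coding_nt = leader_nt + mature_nt      # full reading frame, stop codon excluded
after_atg_nt = coding_nt - 3           # nucleotides between the ATG and the stop codon

# The 16-member probe pool, written 3'->5', built from its three degenerate positions.
POOL = [
    "TACTT" + r + "CG" + n + "TT" + y + "TACAT"
    for r, n, y in product("GA", "GATC", "CT")
]

# Complement (written 3'->5') of the cDNA sequence ATG-GAT-GCC-AAG-ATG-TA at Met-309.
TARGET = "ATGGATGCCAAGATGTA".translate(str.maketrans("ACGT", "TGCA"))


def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


best = min(mismatches(TARGET, probe) for probe in POOL)
print(leader_nt, mature_nt, after_atg_nt, best)
```

The counts agree with the 57 and 2037 nucleotides quoted in the abstract, and the best-matching member of the pool differs from the cDNA at a single position, in the first nucleotide of the second codon.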

Table 2. Segregation of the Tf gene with human chromosomes in human-mouse somatic cell hybrids

Columns: Hybrid; Tf; human chromosomes 1-22 and X; translocation chromosomes.
Hybrids (19): WIL-14, WIL-2, WIL-6, WIL-13, REW-11, WIL-7, WIL-8X, JWR-26C, JWR-22H, ATR-13, NSL-9, JSR-17S, XER-7, XER-9, XER-11, EXR-5CSAz, TSL-2, XTR-22, XTR-3BSAgB.
Translocation chromosomes include 17/9, 7q-, 11/X, X/11, 17/3 (TSL-2), X/3 (XTR-22), and 3/X (XTR-3BSAgB).
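The logic of assigning a gene by segregation in a hybrid panel can be sketched as follows. For each human chromosome, one counts the hybrids in which the presence or absence of the probe signal disagrees with the presence or absence of that chromosome; the chromosome with no discordant hybrids is the candidate location. The hybrid names and presence/absence calls below are invented for illustration and are not the data of Table 2, and the function name `discordance` is hypothetical.

```python
def discordance(probe_calls, chromosome_calls):
    """Count hybrids in which probe and chromosome presence disagree."""
    return sum(
        1 for hybrid in probe_calls
        if probe_calls[hybrid] != chromosome_calls[hybrid]
    )


# Hypothetical presence (True) / absence (False) calls; NOT taken from Table 2.
probe = {"hybA": True, "hybB": False, "hybC": True, "hybD": False}
panel = {
    "3": {"hybA": True, "hybB": False, "hybC": True, "hybD": False},
    "9": {"hybA": True, "hybB": True, "hybC": False, "hybD": False},
    "X": {"hybA": False, "hybB": False, "hybC": True, "hybD": True},
}

scores = {chrom: discordance(probe, calls) for chrom, calls in panel.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

In practice a panel is chosen so that every chromosome other than the true one is discordant in several hybrids; translocation hybrids then narrow the assignment to a chromosomal region.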



FIG. 3. Localization of the human Tf gene on chromosome 3. (a) Distribution of labeled sites in 60 cells by chromosome groups illustrating significant number of grains on chromosome. (b) Representative human metaphase cell (×1200) showing labeling of 3q21-25 (arrows). (c) Distribution of labeled sites on chromosome 3 from 60 cells (74% of grains localized to 3q21-25).
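The amino acid substitutions reported for the three Tf variants taken up next in the text (Asp → Gly at 277, His → Arg at 300, and Gly → Glu at 652) can be checked against the standard genetic code. The sketch below lists every single-nucleotide change that converts a codon for the first residue into a codon for the second. It assumes only the standard code; the function name `single_base_changes` and the one-letter residue symbols are conveniences introduced here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (third position varies fastest).
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = set("AG")


def single_base_changes(aa_from, aa_to):
    """Return {(codon_position, from_base, to_base)} for all one-base routes."""
    changes = set()
    for codon, aa in CODON_TABLE.items():
        if aa != aa_from:
            continue
        for pos in range(3):
            for base in BASES:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if base != codon[pos] and CODON_TABLE[mutant] == aa_to:
                    changes.add((pos + 1, codon[pos], base))
    return changes


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (a in PURINES) == (b in PURINES)


# Asp -> Gly, His -> Arg, Gly -> Glu in one-letter symbols.
for pair in [("D", "G"), ("H", "R"), ("G", "E")]:
    print(pair, single_base_changes(*pair))
```

For each of the three substitutions the only single-nucleotide route is a change at the second codon position, and each change is a transition.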

Three genetic variants of human Tf, TfD1 (23), TfDchi (24), and TfB2 (25), have been characterized and found to differ at the following amino acids: 277 (Asp → Gly), 300 (His → Arg), and 652 (Gly → Glu). All can be explained from the nucleotide sequence (Fig. 2) as mutational transitions, A → G, A → G, and G → A, respectively, in the second nucleotide of each of the three codons.

A comparison of the nucleotide sequences encoding the NH2 and COOH domains of Tf indicates that during evolution selection acted more strongly on some regions of the Tf exon(s) than on others. When the cDNA sequences of the NH2 and COOH domains were compared by statistical analysis to identify regions of extensive internal nucleotide sequence homology (18), three sequences, designated as homology blocks A, B, and C in Table 1, demonstrated 72%, 64%, and 62% identity, respectively. The basis for evolutionary constraints on the codons contained in homology blocks A, B, and C is unknown; however, the areas do contain codons for tyrosine and histidine residues predicted to be functional sites important in iron binding (1). For example, of the eight pairs of tyrosine residues conserved in the NH2 and COOH amino acid domains of both human (1) and hen Tf (3), six pairs, 95-426 and 136-468 in A, 185-514 and 188-517 in B, and 223-558 and 238-574 in C, are encoded by nucleotides in the three homology blocks. Similarly, all three pairs of histidines conserved in human (1) and hen (3) Tf, 119-451 in A, 207-535 in B, and 249-585 in C, are contained in the homology blocks.

Diversity in the nucleotide sequence of the NH2 and COOH domains in Tf (50%) offers sharp contrast to the low rate of divergence (1%) seen in codons encoding the same amino acids in homologous domains of the haptoglobin gene Hpα2 (26). The Hpα2 gene, also a product of intragenic duplication, arose in human populations (27), whereas the Tf gene is the product of an ancient duplication estimated to have occurred in prochordates 500 million years ago (28). The extensive divergence found in the nucleotide sequence of the two domains reflects this.

Chromosomal mapping of the Tf gene was accomplished by somatic cell analysis and in situ hybridization. The Tf gene was mapped to human chromosome 3 by Southern blot analysis of human-mouse somatic cell hybrids after EcoRI digestion. The presence or absence of fragments detected by

the Tf cDNA probe was determined for 19 characterized hybrids (Table 2). The Tf gene segregated with the chromosome 3 markers, aminoacylase-1, DNA segment D3S1, DNA segment D3S2, the somatostatin gene, and a karyotypically normal chromosome 3. Furthermore, one hybrid retaining a region of chromosome 3 as a translocation, XTR-22, localized the Tf locus to a region between 3q21 → qter. For all other chromosomes there were several discordant clones.

In situ hybridization of Tf cDNA with human chromosomal spreads yielded a high percentage of hybridization signals on prometaphase and metaphase mitoses. Fig. 3a is a histogram presenting data from analysis of 60 chromosomal spreads. Fifty-eight percent of cell spreads had labeling on chromosome 3 with the peak band of grain distribution assigned as the gene locus. Fig. 3b is a photograph of a representative metaphase chromosomal spread. The localization of the Tf gene is estimated to be 3q21-25; 74% of all grains on chromosome 3 were localized to this region.

Earlier work of Naylor, et al. (29) predicted that in mice Tf mapped to chromosome 9, a homologue of human chromosome 3. The genes encoding the Tf receptor and melanoma antigen p97 have also been assigned to chromosome 3 (30-32), and the Tf receptor gene has been localized to the 3q22 → ter region (33). A heretofore unmapped linkage group (34) consisting of ceruloplasmin, pseudocholinesterase-1, α2HS-glycoprotein (and Tf) can now be mapped to chromosome 3.

In summary, Tf cDNA has been identified and characterized from a human liver cDNA library. It encodes a 19-residue leader sequence followed by the homologous NH2 and COOH domains. Although the NH2 and COOH domains have diverged during evolution, three regions demonstrate a high percentage of nucleotide identity, possibly reflecting constraints on regions encoding residues important in iron binding. The cDNA encoding Tf has been utilized for chromosomal mapping by in situ hybridization on human mitotic chromosome spreads and somatic cell hybrid analysis. Results indicate that Tf maps to the long arm of human chromosome 3, within the region 3q21-25.

We thank Dr. S. H. Orkin (Harvard Medical School, Boston) for the human liver cDNA library, Dr. Mary Harper (National Institutes of Health, Bethesda) for helpful suggestions and aid in chromosomal mapping, Dr. Chin Lin for computer analysis of homology, Judith Bergeron for technical help, and Mrs. Betty Russell for preparation of the manuscript. This work was supported in part by Grants HD16584 and GM33298 from the National Institutes of Health, an Institutional Grant from the American Cancer Society, a Grant-in-Aid from the American Heart Association with funds contributed in part by Texas Affiliate, and Grant 1-692 from the National Foundation March of Dimes.

1. MacGillivray, R. T. A., Mendes, E., Shewale, J. G., Sinha, S. K., Lineback-Zins, J. & Brew, K. (1983) J. Biol. Chem. 258, 3543-3553.
2. Jeltsch, J. M. & Chambon, P. (1982) Eur. J. Biochem. 122, 291-295.
3. Williams, J., Elleman, T. C., Kingston, I. B., Wilkins, A. G. & Kuhn, K. A. (1982) Eur. J. Biochem. 122, 297-303.
4. Metz-Boutigue, M. H., Mazurier, J., Jolles, J., Spik, G., Montreuil, J. & Jolles, P. (1981) Biochim. Biophys. Acta 670, 243-254.
5. Brown, J. P., Hewick, R. M., Hellstrom, I., Hellstrom, K. E., Doolittle, R. F. & Dreyer, W. J. (1982) Nature (London) 296, 171-173.
6. Goubin, G., Goldman, D. S., Luce, J., Neiman, P. E. & Cooper, G. M. (1983) Nature (London) 302, 114-119.
7. Diamond, A., Cooper, G. M., Ritz, J. & Lane, M. A. (1983) Nature (London) 305, 112-116.
8. Alvarado-Urbina, G., Sathe, G. M., Liu, W.-C., Gillen, M. F., Duck, P. D., Bender, R. & Ogilvie, K. K. (1981) Science 214, 270-274.
9. Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular Cloning, A Laboratory Manual (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY), pp. 122-127.
10. Prochownik, E. V., Markham, A. F. & Orkin, S. H. (1983) J. Biol. Chem. 258, 8389-8394.
11. Grunstein, M. & Hogness, D. S. (1975) Proc. Natl. Acad. Sci. USA 72, 3961-3965.
12. Wallace, R. B., Johnson, M. J., Hirose, T., Miyake, T., Kawashima, E. H. & Itakura, K. (1981) Nucleic Acids Res. 9, 879-894.
13. Denhardt, D. T. (1966) Biochem. Biophys. Res. Commun. 23, 641-646.
14. Clewell, D. B. & Helinski, D. R. (1972) J. Bacteriol. 110, 1135-1146.
15. Blair, D. G., Sherratt, D. J., Clewell, D. B. & Helinski, D. R. (1972) Proc. Natl. Acad. Sci. USA 69, 2518-2522.
16. Tu, C. P. D. & Cohen, S. N. (1980) Gene 10, 177-183.
17. Maxam, A. M. & Gilbert, W. (1980) Methods Enzymol. 65, 498-560.
18. Sargent, T. D., Yang, M. & Bonner, J. (1981) Proc. Natl. Acad. Sci. USA 78, 243-246.
19. Harper, M. E. & Saunders, G. F. (1981) Chromosoma 83, 431-439.
20. Shows, T. B., Naylor, S. L. & Sakaguchi, A. Y. (1982) in Advances in Human Genetics, eds. Hirschhorn, K. & Harris, H. (Plenum, NY), Vol. 12, pp. 341-452.
21. Naylor, S. L., Elliott, R. W., Brown, J. A. & Shows, T. B. (1982) Am. J. Hum. Genet. 34, 235-244.
22. Thibodeau, S. N., Lee, D. C. & Palmiter, R. D. (1978) J. Biol. Chem. 253, 3771-3774.
23. Wang, A.-C. & Sutton, H. E. (1965) Science 149, 435-437.
24. Wang, A.-C., Sutton, H. E. & Howard, P. A. (1967) Biochem. Genet. 1, 55-59.
25. Wang, A.-C., Sutton, H. E. & Riggs, A. (1966) Am. J. Hum. Genet. 18, 454-458.
26. Yang, F., Brune, J. L., Baldwin, D., Barnett, D. R. & Bowman, B. H. (1983) Proc. Natl. Acad. Sci. USA 80, 5875-5879.
27. Smithies, O., Connell, G. E. & Dixon, G. H. (1962) Nature (London) 196, 232-236.
28. Williams, J. (1982) Trends Biochem. Sci. 7, 394-397.
29. Naylor, S. L., Elliott, R. W., Brown, J. A. & Shows, T. B. (1982) Am. J. Hum. Genet. 34, 235-244.
30. Plowman, G. D., Brown, J. P., Enns, C. A., Schroder, J., Nikinmaa, B., Sussman, H. H., Hellstrom, K. E. & Hellstrom, J. (1983) Nature (London) 303, 70-72.
31. Goodfellow, P. N., Banting, G., Sutherland, R., Greaves, M., Solomon, E. & Povey, S. (1982) Somatic Cell Genet. 8, 197.
32. Enns, C. A., Suomaloinen, H. A., Gebhardt, J. E., Schroder, J. & Sussman, H. H. (1982) Proc. Natl. Acad. Sci. USA 79, 3241-3245.
33. Miller, Y. E., Jones, C., Scoggin, C., Morse, H. & Seligman, P. (1983) Am. J. Hum. Genet. 35, 573-583.
34. McKusick, V. A. & Conneally, P. M. (1984) Cytogenet. Cell Genet. 37, 207-209.
35. Yang, F., Lum, J., Baldwin, W. D., Brune, J. L., van Bragt, P. & Bowman, B. H. (1983) Am. J. Hum. Genet. 35, 184A (550) (abstr.).
