ID CO4A1_DROME Reviewed; 1779 AA.
AC P08120; A4V070; Q9VMV4;
DT 01-AUG-1988, integrated into UniProtKB/Swiss-Prot.
DT 04-DEC-2007, sequence version 3.
DT 01-OCT-2014, entry version 138.
DE RecName: Full=Collagen alpha-1(IV) chain;
DE Flags: Precursor;
GN Name=Cg25C; Synonyms=DCg1; ORFNames=CG4145;
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila; Sophophora.
OX NCBI_TaxID=7227;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA].
RC STRAIN=Oregon-R;
RX PubMed=3142875;
RA Blumberg B., Mackrell A.J., Fessler J.H.;
RT "Drosophila basement membrane procollagen alpha 1(IV). II. Complete
RT cDNA sequence, genomic structure, and general implications for
RT supramolecular assemblies.";
RL J. Biol. Chem. 263:18328-18337(1988).
RN [2]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RA Blumberg B.;
RL Thesis (1987), University of California Los Angeles, United States.
RN [3]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RA Mackrell A.J.;
RL Thesis (1992), University of California Los Angeles, United States.
RN [4]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Berkeley;
RX PubMed=10731132; DOI=10.1126/science.287.5461.2185;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C.L., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J.H., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.J.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S.C., Zhu X.,
RA Smith H.O., Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
RN [5]
RP GENOME REANNOTATION.
RC STRAIN=Berkeley;
RX PubMed=12537572; DOI=10.1186/gb-2002-3-12-research0083;
RA Misra S., Crosby M.A., Mungall C.J., Matthews B.B., Campbell K.S.,
RA Hradecky P., Huang Y., Kaminker J.S., Millburn G.H., Prochnik S.E.,
RA Smith C.D., Tupy J.L., Whitfield E.J., Bayraktaroglu L., Berman B.P.,
RA Bettencourt B.R., Celniker S.E., de Grey A.D.N.J., Drysdale R.A.,
RA Harris N.L., Richter J., Russo S., Schroeder A.J., Shu S.Q.,
RA Stapleton M., Yamada C., Ashburner M., Gelbart W.M., Rubin G.M.,
RA Lewis S.E.;
RT "Annotation of the Drosophila melanogaster euchromatic genome: a
RT systematic review.";
RL Genome Biol. 3:RESEARCH0083.1-RESEARCH0083.22(2002).
RN [6]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 762-1230.
RX PubMed=6210912; DOI=10.1073/pnas.79.6.1761;
RA Monson J.M., Natzle J.E., Friedman J., McCarthy B.J.;
RT "Expression and novel structure of a collagen gene in Drosophila.";
RL Proc. Natl. Acad. Sci. U.S.A. 79:1761-1765(1982).
RN [7]
RP NUCLEOTIDE SEQUENCE [MRNA] OF 1065-1779.
RX PubMed=3106346;
RA Blumberg B., Mackrell A.J., Olson P.F., Kurkinen M., Monson J.M.,
RA Natzle J.E., Fessler J.H.;
RT "Basement membrane procollagen IV and its specialized carboxyl domain
RT are conserved in Drosophila, mouse, and human.";
RL J. Biol. Chem. 262:5947-5950(1987).
RN [8]
RP NUCLEOTIDE SEQUENCE [MRNA] OF 1356-1779.
RC TISSUE=Larva;
RX PubMed=3109906; DOI=10.1111/j.1432-1033.1987.tb11480.x;
RA Cecchini J.-P., Knibiehler B., Mirre C., le Parco Y.;
RT "Evidence for a type-IV-related collagen in Drosophila melanogaster.
RT Evolutionary constancy of the carboxyl-terminal noncollagenous
RT domain.";
RL Eur. J. Biochem. 165:587-593(1987).
CC -!- FUNCTION: Collagen type IV is specific for basement membranes.
CC -!- SUBUNIT: Trimers of two alpha 1(IV) and one alpha 2(IV) chain.
CC Type IV collagen forms a mesh-like network linked through
CC intermolecular interactions between 7S domains and between NC1
CC domains.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix, basement membrane.
CC -!- DOMAIN: Alpha chains of type IV collagen have a non-collagenous
CC domain (NC1) at their C-terminus, frequent interruptions of the G-
CC X-Y repeats in the long central triple-helical domain (which may
CC cause flexibility in the triple helix), and a short N-terminal
CC triple-helical 7S domain.
CC -!- PTM: Prolines at the third position of the tripeptide repeating
CC unit (G-X-Y) are hydroxylated in some or all of the chains.
CC -!- PTM: Type IV collagens contain numerous cysteine residues which
CC are involved in inter- and intramolecular disulfide bonding. 12 of
CC these, located in the NC1 domain, are conserved in all known type
CC IV collagens.
CC -!- SIMILARITY: Belongs to the type IV collagen family.
CC {ECO:0000255|PROSITE-ProRule:PRU00736}.
CC -!- SIMILARITY: Contains 1 collagen IV NC1 (C-terminal non-
CC collagenous) domain. {ECO:0000255|PROSITE-ProRule:PRU00736}.
CC -----------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution-NoDerivs License
CC -----------------------------------------------------------------------
DR EMBL; J02727; AAA28423.1; -; mRNA.
DR EMBL; M23704; AAA28404.1; -; mRNA.
DR EMBL; M96575; AAB59184.1; -; Genomic_DNA.
DR EMBL; AE014134; AAF52204.1; -; Genomic_DNA.
DR EMBL; AE014134; AAN10519.1; -; Genomic_DNA.
DR EMBL; AE014134; AAN10520.1; -; Genomic_DNA.
DR EMBL; V00200; CAA23486.2; -; Genomic_DNA.
DR EMBL; M28334; AAA28422.1; -; mRNA.
DR PIR; A31893; A31893.
DR RefSeq; NP_723044.1; NM_164615.2.
DR RefSeq; NP_723045.1; NM_164616.2.
DR RefSeq; NP_723046.1; NM_164617.2.
DR UniGene; Dm.5178; -.
DR ProteinModelPortal; P08120; -.
DR SMR; P08120; 1557-1777.
DR BioGrid; 59908; 9.
DR DIP; DIP-59819N; -.
DR MINT; MINT-1738460; -.
DR STRING; 7227.FBpp0078641; -.
DR PaxDb; P08120; -.
DR PRIDE; P08120; -.
DR EnsemblMetazoa; FBtr0079001; FBpp0078640; FBgn0000299.
DR EnsemblMetazoa; FBtr0079002; FBpp0078641; FBgn0000299.
DR EnsemblMetazoa; FBtr0079003; FBpp0078642; FBgn0000299.
DR GeneID; 33727; -.
DR KEGG; dme:Dmel_CG4145; -.
DR CTD; 33727; -.
DR FlyBase; FBgn0000299; Cg25C.
DR eggNOG; NOG12793; -.
DR GeneTree; ENSGT00740000114967; -.
DR InParanoid; P08120; -.
DR KO; K06237; -.
DR OMA; GVPGQKX; -.
DR OrthoDB; EOG7RZ5P3; -.
DR PhylomeDB; P08120; -.
DR Reactome; REACT_180755; Collagen biosynthesis and modifying enzymes.
DR Reactome; REACT_214037; ECM proteoglycans.
DR Reactome; REACT_215998; Assembly of collagen fibrils and other multimeric structures.
DR Reactome; REACT_224951; Anchoring fibril formation.
DR ChiTaRS; Cg25C; drosophila.
DR GenomeRNAi; 33727; -.
DR NextBio; 784979; -.
DR PRO; PR:P08120; -.
DR Bgee; P08120; -.
DR GO; GO:0005587; C:collagen type IV trimer; NAS:FlyBase.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0007391; P:dorsal closure; TAS:FlyBase.
DR GO; GO:0035848; P:oviduct morphogenesis; IMP:FlyBase.
DR Gene3D; 2.170.240.10; -; 1.
DR InterPro; IPR016187; C-type_lectin_fold.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001442; Collagen_VI_NC.
DR Pfam; PF01413; C4; 2.
DR Pfam; PF01391; Collagen; 18.
DR SMART; SM00111; C4; 2.
DR SUPFAM; SSF56436; SSF56436; 2.
DR PROSITE; PS51403; NC1_IV; 1.
PE 2: Evidence at transcript level;
KW Basement membrane; Collagen; Complete proteome; Disulfide bond;
KW Extracellular matrix; Glycoprotein; Hydroxylation; Reference proteome;
KW Repeat; Secreted; Signal.
FT SIGNAL 1 23
FT PROPEP 24 ? N-terminal propeptide (7S domain).
FT /FTId=PRO_0000005754.
FT CHAIN ? 1779 Collagen alpha-1(IV) chain.
FT /FTId=PRO_0000005755.
FT DOMAIN 1555 1778 Collagen IV NC1. {ECO:0000255|PROSITE-
FT ProRule:PRU00736}.
FT REGION ? 1545 Triple-helical region.
FT CARBOHYD 72 72 N-linked (GlcNAc...). {ECO:0000305}.
FT DISULFID 1570 1659 Or C-1570 with C-1656.
FT {ECO:0000255|PROSITE-ProRule:PRU00736}.
FT DISULFID 1603 1656 Or C-1603 with C-1659.
FT {ECO:0000255|PROSITE-ProRule:PRU00736}.
FT DISULFID 1615 1621 {ECO:0000255|PROSITE-ProRule:PRU00736}.
FT DISULFID 1678 1774 Or C-1678 with C-1771.
FT {ECO:0000255|PROSITE-ProRule:PRU00736}.
FT DISULFID 1712 1771 Or C-1712 with C-1774.
FT {ECO:0000255|PROSITE-ProRule:PRU00736}.
FT DISULFID 1724 1731 {ECO:0000255|PROSITE-ProRule:PRU00736}.
FT CONFLICT 383 383 K -> E (in Ref. 1; AAA28404/AAA28423 and
FT 2; AAB59184). {ECO:0000305}.
FT CONFLICT 437 437 G -> D (in Ref. 1; AAA28404/AAA28423 and
FT 2; AAB59184). {ECO:0000305}.
FT CONFLICT 948 948 L -> S (in Ref. 6; CAA23486).
FT {ECO:0000305}.
FT CONFLICT 997 997 T -> S (in Ref. 1; AAA28404/AAA28423 and
FT 2; AAB59184). {ECO:0000305}.
FT CONFLICT 1329 1333 AGEPG -> PESR (in Ref. 1; AAA28404/
FT AAA28423 and 2; AAB59184). {ECO:0000305}.
FT CONFLICT 1358 1358 Q -> K (in Ref. 8; AAA28422).
FT {ECO:0000305}.
FT CONFLICT 1361 1361 Q -> K (in Ref. 8; AAA28422).
FT {ECO:0000305}.
FT CONFLICT 1374 1374 T -> I (in Ref. 8; AAA28422).
FT {ECO:0000305}.
FT CONFLICT 1424 1424 N -> T (in Ref. 8; AAA28422).
FT {ECO:0000305}.
FT CONFLICT 1497 1497 R -> L (in Ref. 1; AAA28404/AAA28423 and
FT 2; AAB59184). {ECO:0000305}.
FT CONFLICT 1508 1512 ETGNV -> RAGQR (in Ref. 8; AAA28422).
FT {ECO:0000305}.
FT CONFLICT 1530 1530 E -> K (in Ref. 8; AAA28422).
FT {ECO:0000305}.
FT CONFLICT 1600 1602 Missing (in Ref. 1; AAA28404/AAA28423, 2;
FT AAB59184 and 8; AAA28422). {ECO:0000305}.
FT CONFLICT 1737 1737 M -> I (in Ref. 8; AAA28422).
FT {ECO:0000305}.
SQ SEQUENCE 1779 AA; 174300 MW; 6770F18AE40A313E CRC64;
MLPFWKRLLY AAVIAGALVG ADAQFWKTAG TAGSIQDSVK HYNRNEPKFP IDDSYDIVDS
AGVARGDLPP KNCTAGYAGC VPKCIAEKGN RGLPGPLGPT GLKGEMGFPG MEGPSGDKGQ
KGDPGPYGQR GDKGERGSPG LHGQAGVPGV QGPAGNPGAP GINGKDGCDG QDGIPGLEGL
SGMPGPRGYA GQLGSKGEKG EPAKENGDYA KGEKGEPGWR GTAGLAGPQG FPGEKGERGD
SGPYGAKGPR GEHGLKGEKG ASCYGPMKPG APGIKGEKGE PASSFPVKPT HTVMGPRGDM
GQKGEPGLVG RKGEPGPEGD TGLDGQKGEK GLPGGPGDRG RQGNFGPPGS TGQKGDRGEP
GLNGLPGNPG QKGEPGRAGA TGKPGLLGPP GPPGGGRGTP GPPGPKGPRG YVGAPGPQGL
NGVDGLPGPQ GYNGQKGGAG LPGRPGNEGP PGKKGEKGTA GLNGPKGSIG PIGHPGPPGP
EGQKGDAGLP GYGIQGSKGD AGIPGYPGLK GSKGERGFKG NAGAPGDSKL GRPGTPGAAG
APGQKGDAGR PGTPGQKGDM GIKGDVGGKC SSCRAGPKGD KGTSGLPGIP GKDGARGPPG
ERGYPGERGH DGINGQTGPP GEKGEDGRTG LPGATGEPGK PALCDLSLIE PLKGDKGYPG
APGAKGVQGF KGAEGLPGIP GPKGEFGFKG EKGLSGAPGN DGTPGRAGRD GYPGIPGQSI
KGEPGFHGRD GAKGDKGSFG RSGEKGEPGS CALDEIKMPA KGNKGEPGQT GMPGPPGEDG
SPGERGYTGL KGNTGPQGPP GVEGPRGLNG PRGEKGNQGA VGVPGNPGKD GLRGIPGRNG
QPGPRGEPGI SRPGPMGPPG LNGLQGEKGD RGPTGPIGFP GADGSVGYPG DRGDAGLPGV
SGRPGIVGEK GDVGPIGPAG VAGPPGVPGI DGVRGRDGAK GEPGSPGLVG MPGNKGDRGA
PGNDGPKGFA GVTGAPGKRG PAGIPGVSGA KGDKGATGLT GNDGPVGGRG PPGAPGLMGI
KGDQGLAGAP GQQGLDGMPG EKGNQGFPGL DGPPGLPGDA SEKGQKGEPG PSGLRGDTGP
AGTPGWPGEK GLPGLAVHGR AGPPGEKGDQ GRSGIDGRDG INGEKGEQGL QGVWGQPGEK
GSVGAPGIPG APGMDGLPGA AGAPGAVGYP GDRGDKGEPG LSGLPGLKGE TGPVGLQGFT
GAPGPKGERG IRGQPGLPAT VPDIRGDKGS QGERGYTGEK GEQGERGLTG PAGVAGAKGD
RGLQGPPGAS GLNGIPGAKG DIGPRGEIGY PGVTIKGEKG LPGRPGRNGR QGLIGAPGLI
GERGLPGLAG EPGLVGLPGP IGPAGSKGER GLAGSPGQPG QDGFPGAPGL KGDTGPQGFK
GERGLNGFEG QKGDKGDRGL QGPSGLPGLV GQKGDTGYPG LNGNDGPVGA PGERGFTGPK
GRDGRDGTPG LPGQKGEPGM LPPPGPKGEP GQPGRNGPKG EPGRPGERGL IGIQGERGEK
GERGLIGETG NVGRPGPKGD RGEPGERGYE GAIGLIGQKG EPGAPAPAAL DYLTGILITR
HSQSETVPAC SAGHTELWTG YSLLYVDGND YAHNQDLGSP GSCVPRFSTL PVLSCGQNNV
CNYASRNDKT FWLTTNAAIP MMPVENIEIR QYISRCVVCE APANVIAVHS QTIEVPDCPN
GWEGLWIGYS FLMHTAVGNG GGGQALQSPG SCLEDFRATP FIECNGAKGT CHFYETMTSF
WMYNLESSQP FERPQQQTIK AGERQSHVSR CQVCMKNSS
//