
REGULATION OF GENE EXPRESSION

KRISHNA BAKSI
DEPARTMENT OF ANATOMY & CELL BIOLOGY

2015
The regulation of gene expression is a critical component in regulating cellular metabolism,
and in orchestrating and maintaining the structural and functional differences that exist in cells
during development. Moreover, given the high energetic cost of protein synthesis, regulation of
gene expression is essential for the optimal use of the available energy.

There are at least six potential points at which the amount of protein can be regulated (Fig.
27-1):

i. Synthesis of the primary RNA transcript
ii. Posttranscriptional processing of the mRNA
iii. mRNA degradation
iv. Protein synthesis (translation)
v. Posttranslational modification of protein
vi. Protein degradation
GENE REGULATION

The amount of a given gene product varies under different conditions in a cell.
For example, the bacterium E. coli contains genes for about 3,000 different proteins, but it does
not need to synthesize all of these proteins at the same time. Therefore, it regulates the
number of molecules of each protein that are made. A well-studied example is the regulation of the
number of β-galactosidase molecules in the cell. This enzyme converts the disaccharide
lactose to the monosaccharides glucose and galactose. When E. coli is growing in a
medium containing glucose as the carbon source, β-galactosidase is not required, and only
about five molecules of the enzyme are present in the cell. But when lactose is the sole
carbon source, 5,000 or more molecules of β-galactosidase are produced in the cell. If the
lactose is removed from the medium, the synthesis of this enzyme stops rapidly (Fig. 32-2
and Fig. 32-3).

The compound isopropylthiogalactoside (IPTG), which is not
metabolized by β-galactosidase, can serve as an inducer; such
compounds are called gratuitous inducers.

Fig. 27-6 The activities of galactoside permease and β-galactosidase in lactose
metabolism in E. coli. The conversion of lactose to allolactose by transglycosylation
is a minor reaction catalyzed by β-galactosidase.
GENE REGULATION

The type and degree of regulation of a gene's expression vary with the function
of the protein. For example,

1. HOUSEKEEPING GENES: Some gene products are required all the time, and their genes
are expressed at a more or less constant level in all cells. Many of these genes code for
enzymes that catalyze steps in the central metabolic pathways, such as the citric acid cycle.
These genes are called housekeeping genes.

CONSTITUTIVE GENE EXPRESSION: Constant, seemingly unregulated expression of a
gene is called constitutive gene expression.

2. The amounts of other gene products rise or fall in response to molecular signals:

INDUCTION: When transcription of the structural genes of an operon increases in response
to the presence of a specific substrate in the medium, the effect is known as induction. The
increase in transcription of the β-galactosidase gene in the presence of lactose is an example
of induction.

INDUCERS: The small molecules that are responsible for induction are called
inducers. They are usually a substrate or a product of the metabolic pathway. The compound
isopropylthiogalactoside (IPTG), which is not metabolized by β-galactosidase, can also serve as
an inducer; such compounds are called gratuitous inducers. Inducers of this type are valuable
tools for studying the system.

INDUCIBLE: Gene products that increase in concentration under particular molecular
circumstances are called inducible. For example, DNA repair enzymes are inducible: their
expression increases in response to DNA damage.

REPRESSION: This is the process in which nutritional changes quickly turn off the synthesis
of a gene product. For example, if tryptophan is supplied in the medium, bacteria do not need to
make it themselves, and the expression of the enzymes of the tryptophan biosynthetic pathway is stopped.

REPRESSIBLE: Gene products that decrease in concentration in response to a molecular signal
are referred to as repressible. For example, the presence of an ample amount of tryptophan leads
to repression of the genes for the enzymes involved in tryptophan synthesis.

COREPRESSORS: The small molecules that are responsible for bringing about repression are
known as corepressors.
THE UNIT OF TRANSCRIPTION IN BACTERIA: THE OPERON:

The E. coli chromosome is a circular double-stranded DNA molecule of about 4 million base pairs.
Its genes are not distributed randomly throughout the DNA; the genes that code
for the enzymes of a specific metabolic pathway are often clustered in one region of the DNA. In
addition, the genes for associated structural proteins, such as the 70 or more proteins that
comprise the ribosome, are frequently adjacent to one another. Members of a set of
clustered genes are usually co-ordinately controlled; they are transcribed together to form
a polycistronic mRNA species that contains the coding sequences for several proteins. The
term operon is used to describe the complete regulatory unit of a set of clustered genes. Put
another way, an operon is a coordinated unit of gene expression. An operon includes the adjacent
structural genes that code for the related enzymes or associated proteins, a regulator gene
or genes that code for regulator protein(s), and the control elements, which are sites on the
DNA near the structural genes at which the regulatory proteins act. Fig. 30-7 shows the
general organization of the operon model proposed by Jacob and Monod.

Fig. 30-7 The general organization of operons. Operons consist of transcriptional control
regions, and a set of related structural genes, all organized in a contiguous linear array along
the chromosome. The transcriptional control regions are the promoter and the operator,
which lie next to, or overlap, each other, upstream from the structural genes they control.
Operators may lie at various positions relative to the promoter, either upstream or
downstream. Expression of the operon is determined by access of RNA polymerase to the
promoter, and occupancy of the operator by regulatory proteins influences this access.
Induction activates transcription from the promoter; repression prevents it.
THE LACTOSE OPERON OF E. coli IS SUBJECT TO NEGATIVE REGULATION:

The lactose operon contains three adjacent structural genes, as shown in Fig. 8.2. The lacZ gene
codes for the enzyme β-galactosidase; lacY codes for a permease that occurs in the cell
membrane and participates in the transport of sugars, including lactose, across the
membrane. The third gene, lacA, codes for thiogalactoside transacetylase, which transfers
an acetyl group from acetyl-CoA to β-galactosides.

A single mRNA species containing the coding sequences of all three structural genes is
transcribed from a promoter that is present just upstream from the lacZ gene. The induction
of these three genes occurs at the initiation of their transcription. Without the inducer,
transcription of the gene cluster occurs only at a very low level. In the presence of the
inducer, transcription begins at the promoter and continues to a transcription terminator
located slightly beyond the end of lacA. Therefore, the genes are co-ordinately expressed: either
all three or none. The lactose mRNA is very unstable; it degrades with a half-life of about
3 min. Therefore, expression of the operon can be altered very quickly.
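
To see why a half-life of about 3 min lets expression change quickly, the following minimal sketch assumes simple first-order (exponential) decay of the lac mRNA once transcription stops. The function name and the first-order assumption are illustrative and are not taken from these notes.

```python
def fraction_remaining(minutes_elapsed, half_life_min=3.0):
    """Fraction of an mRNA pool left after first-order decay.

    Assumption: decay is purely exponential with the given half-life
    and no new transcripts are made after time zero.
    """
    return 0.5 ** (minutes_elapsed / half_life_min)


if __name__ == "__main__":
    for t in (0, 3, 6, 9, 12):
        print(f"{t:2d} min: {fraction_remaining(t):.4f} of the lac mRNA remains")
```

Under this assumption, only about 6% of the message is left 12 min after transcription is shut off, so the cell stops making the lac proteins soon after the inducer disappears.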

THE REPRESSOR OF THE LACTOSE OPERON IS A DIFFUSIBLE PROTEIN: The regulator
gene of the lactose operon, lacI, is situated outside the lacZYA region, but affects the
transcription of all three structural genes. It codes for a protein whose only function is to
control the transcription initiation of the three lac structural genes. This protein is called the
lac repressor. It is not obligatory that a regulatory gene be physically close to the gene cluster
that it regulates; in some other operons, it is not. Transcription of lacI is not regulated; this
single gene is always transcribed from its own promoter at a low rate that is relatively
independent of the cell's status. Therefore, the affinity of the lacI promoter for RNA
polymerase seems to be the only factor involved in its transcription initiation.

The lac repressor is initially synthesized as a monomer of 360 amino acids; four monomers
associate to form a tetramer, the active form of the repressor. It has a strong affinity for a
specific DNA sequence that lies between lacP and the start of lacZ. This sequence is called the
operator and is designated lacO. The operator overlaps the promoter, so that the
repressor bound to the operator physically prevents RNA polymerase from binding to the
promoter and initiating transcription.

The repressor also has a strong affinity for the inducer molecules of the lac operon. Each
monomer has a binding site for an inducer molecule. The binding of inducer to the
monomers lowers the repressor's affinity for the operator sequence (Fig. 8.3). As a result, the
repressor no longer binds to the operator, and RNA polymerase can begin transcription
from the promoter.
The lac Operon is Subject to Positive Regulation: CATABOLITE REPRESSION:
Other factors besides lactose, such as the availability of glucose, affect the expression of
the lac genes. Glucose, metabolized directly by glycolysis, is E. coli's preferred energy source.
Expressing the genes for proteins that metabolize sugars such as lactose or arabinose is
wasteful if glucose is abundant.

What happens to the expression of the lac operon when both glucose and lactose are
present? Another regulatory mechanism, called catabolite repression, prevents expression
of the genes for catabolism of lactose and other secondary sugars in the presence of glucose,
even when these secondary sugars are also present. The effect of glucose is mediated by cAMP and
a protein called cAMP receptor protein, or CRP (the protein is also called CAP, for catabolite
gene activator protein). This homodimer has binding sites for DNA and cAMP. DNA binding is
mediated by a helix-turn-helix motif within the DNA-binding domain of the protein. When glucose
is absent, CRP (with bound cAMP) binds to a site near the lac promoter and stimulates RNA
transcription 50-fold. CRP is therefore a positive regulatory element responsive to glucose
levels, whereas the Lac repressor is a negative regulatory element responsive to lactose. The two
act in concert: CRP has little effect on the lac operon when the Lac repressor is blocking
transcription, and dissociation of the repressor from the lac operator has little effect on
transcription of the lac operon unless CRP is present to facilitate transcription. When CRP is
not bound, the wild-type lac promoter is a relatively weak promoter. The complex of RNA
polymerase and the promoter does not form readily unless CRP is present.

The promoter region of the lac operon contains recognition sites for RNA polymerase and
for a regulatory protein, CAP (Fig. 27-17).
The effect of glucose on CAP is mediated by cAMP (Fig. 27-18): CAP binding occurs
when cAMP concentrations are high. In the presence of glucose, the concentration of cAMP
declines, preventing CAP binding and thereby decreasing the expression of the lac operon.
Strong induction of the operon therefore requires both the presence of lactose (to inactivate
the repressor) and the absence or low concentration of glucose (to increase the
cAMP concentration and facilitate CAP binding).
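
The combined action of the repressor and CAP can be summarized as a small truth table. The sketch below is a qualitative simplification: the function name and the three output labels ("off", "low", "high") are assumptions chosen for illustration, and "off" stands for the very low basal level described earlier.

```python
def lac_operon_expression(lactose_present, glucose_present):
    """Qualitative expression level of the lac operon.

    Assumptions (simplified from the text):
    - the repressor sits on the operator unless lactose supplies an inducer;
    - cAMP is high only when glucose is low, so CAP binds only without glucose.
    """
    repressor_on_operator = not lactose_present
    cap_bound = not glucose_present

    if repressor_on_operator:
        return "off"          # transcription blocked (only basal level)
    if cap_bound:
        return "high"         # repressor released and CAP stimulates transcription
    return "low"              # repressor released, but the promoter is weak without CAP


if __name__ == "__main__":
    for lactose in (False, True):
        for glucose in (False, True):
            level = lac_operon_expression(lactose, glucose)
            print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Only the combination of lactose present and glucose absent gives strong induction, which matches the two requirements stated above.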
THE TRYPTOPHAN OPERON IN E. coli

Tryptophan is essential for bacterial growth; it is needed for the synthesis of all proteins
containing tryptophan. Therefore, if tryptophan is not supplied in sufficient quantity by the
medium, the cells must synthesize it. In contrast, lactose is not absolutely required for the
cell's growth; many other sugars can substitute for it. Therefore, synthesis of the enzymes
involved in the synthesis of tryptophan is regulated differently from the synthesis of the
proteins coded for in the lac operon.

THE TRYPTOPHAN OPERON IS CONTROLLED BY A REPRESSOR PROTEIN:

Fig. 27-23 shows the five steps involved in the synthesis of tryptophan from chorismic acid.
The tryptophan operon contains the five structural genes that code for the three enzymes
of this pathway (two of which have two different subunits). A promoter, where transcription
begins, and an operator, which binds a repressor protein coded by the unlinked trpR gene, are
present upstream from this gene cluster. Transcription of the lactose operon is generally turned
off unless it is induced by a small molecule, the inducer. The tryptophan operon is always turned
on unless it is repressed by the presence of a small molecule, the corepressor (a term used to
distinguish it from the repressor protein). Hence, the lac operon is inducible, and the trp operon
is repressible. When the tryptophan operon is actively transcribed, it is said to be derepressed;
the trp repressor is not preventing RNA polymerase from binding to the promoter.

The biosynthetic pathway for tryptophan is regulated by mechanisms
that affect both the synthesis and the activity of the enzymes that catalyze the pathway. For
example, anthranilate synthetase, which catalyzes the first step of the pathway, is coded by
the trpE and trpD genes of the trp operon. The number of molecules of this enzyme that are
present in the cell is determined by the transcriptional regulation of the trp operon. However,
the catalytic activity of the enzyme is regulated by feedback inhibition. This is a common
short-term means of regulating the first committed step in a metabolic pathway. Therefore,
as the concentration of tryptophan builds up in the cell, it begins binding to anthranilate
synthetase and immediately decreases its activity on the substrate, chorismic acid. In
addition, tryptophan acts as a corepressor to shut down the synthesis of new enzyme
molecules from the trp operon.
The trp repressor is a homodimer of identical subunits of about 100 amino acids each.
Under normal conditions about 20 molecules are present in the cell. The repressor by itself
does not bind to the trp operator. It must be complexed with tryptophan in order to bind to the
operator, and therefore acts in vivo only in the presence of tryptophan. This is exactly the
opposite of the lac repressor, which binds to its operator only in the absence of its inducer.
Moreover, the trp repressor also regulates transcription of trpR, its own gene. As the trp
repressor accumulates in the cell, the repressor-tryptophan complex binds to a region
upstream of this gene, turning off its transcription and maintaining the equilibrium of about 20
repressors per cell. In addition, the repressor-tryptophan complex represses transcription of
another gene, aroH. This gene is not linked to the trp operon, but
it codes for one of the three enzymes that catalyze the first step in the common pathway of
aromatic amino acid biosynthesis. Therefore, the trp repressor influences the level of other
amino acids.
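
The opposite logic of the two repressors can be written out as a small sketch. The function names are hypothetical and the model is purely Boolean: it ignores binding affinities, attenuation, and feedback inhibition.

```python
def lac_repressor_on_operator(inducer_present):
    # The lac repressor binds its operator only in the ABSENCE of inducer.
    return not inducer_present


def trp_repressor_on_operator(tryptophan_present):
    # The trp repressor binds its operator only when complexed with tryptophan.
    return tryptophan_present


def operon_transcribed(repressor_on_operator):
    # In this simplified model, a bound repressor blocks RNA polymerase.
    return not repressor_on_operator


if __name__ == "__main__":
    for signal in (False, True):
        lac = operon_transcribed(lac_repressor_on_operator(signal))
        trp = operon_transcribed(trp_repressor_on_operator(signal))
        print(f"small molecule present={signal!s:5}  lac on={lac!s:5}  trp on={trp}")
```

The same small-molecule signal therefore switches the inducible lac operon on and the repressible trp operon off.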
TRYPTOPHAN BIOSYNTHESIS IS REGULATED BY TRANSCRIPTION ATTENUATION:
Transcription attenuation describes a process in which transcription is initiated normally but is
abruptly halted before the operon genes are transcribed. The frequency with which
transcription is attenuated depends on the available concentration of tryptophan. The basis
for the mechanism, as worked out by Charles Yanofsky, is the very close coupling between
transcription and translation in bacteria.
The short open reading frame (regulatory sequence 1, which encodes the leader peptide) is the key
element in sensing tryptophan concentrations (Fig. 28-21). It can be thought of as a
tryptophan-sensitive timing mechanism that determines whether sequence 3 pairs with sequence 4
(attenuating transcription) or with sequence 2 (allowing transcription to continue). This open
reading frame includes two codons for tryptophan. When tryptophan concentrations are high,
concentrations of charged tryptophan tRNA (Trp-tRNA^Trp) are also high. Translation then
follows closely behind transcription, proceeding rapidly past the Trp codons and into sequence
2 before sequence 3 is synthesized by RNA polymerase. In this case sequence 2 is covered
by the ribosome and thus is unavailable for pairing with sequence 3 when the latter is
synthesized; the attenuator structure (sequences 3 and 4) forms and transcription is
halted. When tryptophan concentrations are low, however, the ribosome stalls at the two Trp
codons because charged tRNA^Trp is unavailable. Sequence 2 remains free while sequence 3 is
synthesized, these two sequences base-pair, and transcription proceeds (Fig. 28-21).
In this way, the proportion of transcripts that are attenuated increases as tryptophan
concentration increases.
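
The decision made at the leader region can be sketched as a two-step rule. This is a deliberately coarse model: the function name is hypothetical, tryptophan availability is treated as a simple yes/no input, and real attenuation is a matter of proportions rather than an all-or-none switch.

```python
def trp_leader_outcome(charged_trp_trna_abundant):
    """Return (hairpin formed, fate of transcription) for the trp leader.

    Assumption: the ribosome either passes the two Trp codons quickly
    (covering sequence 2) or stalls on them (leaving sequence 2 free).
    """
    ribosome_stalls_at_trp_codons = not charged_trp_trna_abundant
    sequence_2_free = ribosome_stalls_at_trp_codons

    if sequence_2_free:
        # Sequence 2 pairs with sequence 3, so the 3:4 attenuator cannot form.
        return ("2:3", "transcription continues")
    # Sequence 2 is covered by the ribosome, so 3 pairs with 4 instead.
    return ("3:4", "transcription attenuated")


if __name__ == "__main__":
    print("high tryptophan:", trp_leader_outcome(True))
    print("low tryptophan: ", trp_leader_outcome(False))
```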
MOST EUKARYOTIC PROMOTERS ARE POSITIVELY REGULATED: The regulation of
transcription in eukaryotes differs in three important ways from that typically found in bacteria:

First, eukaryotes make use of gene regulatory proteins that can act even when they are
bound to DNA thousands of nucleotide pairs away from the promoter that they influence,
which means that a single promoter can be controlled by an almost unlimited number of
regulatory sequences scattered along the DNA.
Second, eukaryotic RNA polymerase II, which transcribes all protein-coding genes,
cannot initiate transcription on its own. It requires a set of proteins called general transcription
factors, which must be assembled at the promoter before transcription can begin. (The term
general refers to the fact that these proteins assemble on all promoters transcribed by RNA
polymerase II; in this they differ from gene regulatory proteins, which act only at particular
genes.) This assembly process provides, in principle, multiple steps at which the rate of
transcription initiation can be speeded up or slowed down in response to regulatory signals,
and many eukaryotic gene regulatory proteins influence these steps.
Third, the packaging of eukaryotic DNA into chromatin provides opportunities for regulation
not available in bacteria.
The extensive use of positive regulatory mechanisms is probably a consequence of the larger
size of the eukaryotic genome. Negative regulatory elements appear to be less common,
although many eukaryotic regulatory proteins can be either activators or repressors under
some circumstances.

Modular design of eukaryotic transcription factors: Eukaryotic transcription factors have
multiple domains that carry out specific interactions. These domains include DNA recognition
domains that participate in site-specific binding and activation domains that contact general
transcription factors, RNA polymerase II or other regulators of transcription. In addition,
many transcription factors have domains that bind to coactivators. Examples of coactivators
include steroid hormones and cAMP that, when bound, change the ability of the transcription
factor to either bind to DNA or serve as an activator. Although their overall amino acid
sequence and composition uniquely identify each transcription factor, the domains involved in
each activity can be grouped into a few characteristic motifs.
Eukaryotic gene expression can be regulated by intercellular and intracellular signals: The
effects of steroid hormones (and of thyroid and retinoid hormones, which share their mode of
action) provide additional well-studied examples of the modulation of eukaryotic regulatory
proteins by direct interaction with molecular signals. Unlike other types of hormones,
hormones of the steroid type do not bind to plasma membrane receptors. Instead, they
interact with intracellular receptors that are themselves transcriptional trans-activators.
Steroid hormones too hydrophobic to dissolve readily in blood (estrogen, progesterone, and
cortisol, for example) travel on specific carrier proteins from the point of their release to their
target tissues. In the target tissue, the hormone passes through the plasma membrane by
simple diffusion and binds to its specific receptor protein in the nucleus (Fig. 28-33). The
hormone-receptor complex acts by binding to highly specific DNA sequences called hormone
response elements (HREs) and altering gene expression. Hormone binding triggers changes
in the conformation of the receptor proteins so that they become capable of interacting with
additional transcription factors. The bound hormone-receptor complex can either enhance or
suppress the expression of adjacent genes.

The DNA sequences (HREs) to which the hormone-receptor complex binds are similar in length
and arrangement, but different in sequence, for the various steroid hormones. Each receptor
has a consensus HRE sequence (Table 28-4) to which the hormone-receptor complex binds
well, each consensus sequence consisting of two six-nucleotide sequences, either
contiguous or separated by three nucleotides, in tandem or in a palindromic arrangement.
The hormone receptors have a highly conserved DNA-binding domain with two zinc fingers
(Fig. 28-33). The hormone-receptor complex binds to the DNA as a dimer, with the zinc
finger domains of each monomer recognizing one of the six-nucleotide sequences. The
ability of a given hormone to act through the hormone-receptor complex to alter the
expression of a specific gene depends on the exact sequence of the HRE, its position relative
to the gene, and the number of HREs associated with the gene.
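
The arrangement of the two six-nucleotide half-sites can be checked mechanically. The sketch below classifies a site as palindromic (the second half-site is the reverse complement of the first) or tandem (the second half-site repeats the first). The function names are hypothetical, and the half-site AGGTCA with a three-nucleotide spacer is used only as an example input; treat it as an assumption rather than a value quoted from Table 28-4.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hre_arrangement(site, half_len=6, spacer_len=3):
    """Classify a candidate HRE as 'palindromic', 'tandem', or 'neither'.

    Assumption: the site is exactly two half-sites of half_len nucleotides
    separated by spacer_len nucleotides (use spacer_len=0 for contiguous sites).
    """
    site = site.upper()
    if len(site) != 2 * half_len + spacer_len:
        return "neither"
    left = site[:half_len]
    right = site[half_len + spacer_len:]
    if right == reverse_complement(left):
        return "palindromic"
    if right == left:
        return "tandem"
    return "neither"


if __name__ == "__main__":
    print(hre_arrangement("AGGTCACTGTGACCT"))   # half-site, spacer, inverted half-site
    print(hre_arrangement("AGGTCACTGAGGTCA"))   # half-site, spacer, repeated half-site
```

A palindromic site presents the same half-site sequence on each strand, which fits the binding of a receptor dimer with one zinc finger domain per half-site.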
Activation of transcription of the LDL receptor gene illustrates many features found in
eukaryotic gene regulation: The transcriptional control of the gene for the low density
lipoprotein (LDL) receptor is shown in Fig. 3.80. The gene is transcribed in response to a
lack of cellular cholesterol. Increased transcription of the gene leads to an increased amount
of the LDL receptor protein and enhanced uptake of LDLs and their cholesterol from the blood.
DNA-BINDING MOTIFS IN GENE REGULATORY PROTEINS:

How does a cell determine which of its thousands of genes to transcribe? Transcription of
each gene is controlled by a regulatory region of DNA near the site where transcription
begins. Some regulatory regions are simple, and act as switches that are thrown by a single
signal. Other regulatory regions are complex, and act as tiny microprocessors, responding to
a variety of signals that they interpret and integrate to switch the neighboring gene on or off.
Whether simple or complex, these switching devices consist of two fundamental types of
components:

i. Short stretches of DNA of defined sequence, and
ii. Gene regulatory proteins that recognize and bind to them.
The outside of the DNA Helix can be Read by Proteins: Gene regulatory proteins must
recognize specific nucleotide sequences embedded within the double helix. It was originally
thought that these proteins might require direct access to the hydrogen bonds between base
pairs in the interior of the double helix to distinguish between one DNA sequence and
another. It is now clear, however, that the outside of the double helix is studded with DNA
sequence information that gene regulatory proteins can recognize without having to open the
double helix. The edge of each base pair is exposed at the surface of the double helix,
presenting a distinctive pattern of hydrogen bond donors, hydrogen bond acceptors, and
hydrophobic patches for proteins to recognize in both the major and minor groove (Fig. 9-4).
But only in the major groove are the patterns unique for each of the four base-pair
arrangements (Fig. 9-5). For this reason gene regulatory proteins generally bind to the
major groove.
Although the patterns of hydrogen bond donor and acceptor groups are the most important
features recognized by gene regulatory proteins, they are not the only ones: the nucleotide
sequence also determines the overall geometry of the double helix.
Gene Regulatory Proteins Contain Structural Motifs That Can Read DNA Sequence

Molecular recognition in biology generally relies on an exact fit between the surfaces of two
molecules, and the study of gene regulatory proteins has provided some of the clearest examples of this
principle. A gene regulatory protein recognizes a specific DNA sequence because the surface of the
protein is extensively complementary to the special surface features of the double helix in that region.
In most cases the protein makes a large number of contacts with the DNA, involving hydrogen bonds,
ionic bonds, and hydrophobic interactions. Although each individual contact is weak, the 20 or so
contacts that are typically formed at the protein-DNA interface add together to ensure that the interaction
is both highly specific and very strong (Fig. 9-9). In fact, DNA-protein interactions are among the
tightest and most specific molecular interactions known in biology.
The Helix-Turn-Helix Motif is one of the Simplest and Most Common DNA-binding Motifs

The first DNA-binding protein motif to be recognized was the helix-turn-helix. Originally identified
in bacterial proteins, this motif has since been found in hundreds of DNA-binding proteins from both
prokaryotes and eukaryotes. It is constructed from two α helices connected by a short extended chain
of amino acids, which constitutes the turn. The two helices are held at a fixed angle, primarily
through interactions between the two helices.

Fig. 7-14 Some helix-turn-helix DNA-binding proteins. All of the proteins bind DNA as dimers in which the
two copies of the recognition helix (red cylinder) are separated by exactly one turn of the DNA helix (3.4 nm).
The other helix of the helix-turn-helix motif is colored blue, as in Fig. 7-13. The lambda repressor and Cro
protein control bacteriophage lambda gene expression, and the tryptophan repressor and the CAP control the
expression of sets of E. coli genes.
Homeodomain Proteins Constitute a Special Class of Helix-Turn-Helix Proteins

In the fruit fly Drosophila, the homeotic selector genes play a critical part in orchestrating fly
development. The proteins encoded by several homeotic selector genes contain an almost identical
stretch of 60 amino acids that defines this class of proteins and is termed the homeodomain. The
three-dimensional structure of the homeodomain shows that it contains a helix-turn-helix motif.
Homeodomain proteins have been identified in virtually all eukaryotic organisms, from yeast to
humans. Fig. 7-16 shows a homeodomain bound to its specific DNA sequence.

Fig. 7-16 A homeodomain bound to its specific DNA sequence. Two different views of the same structure
are shown. (A) The homeodomain is folded into three helices, which are packed tightly together by
hydrophobic interactions. The part containing helices 2 and 3 closely resembles the helix-turn-helix motif. (B)
The recognition contacts with the major groove of DNA. The asparagine (Asn) of helix 3, for example, contacts
an adenine, as shown in Fig. 7-12. Nucleotide pairs are also contacted in the minor groove by a flexible arm
attached to helix 1. The homeodomain shown here is from yeast, but it resembles homeodomains from many
eukaryotic organisms.
DNA-binding Zinc Finger Motifs:

The helix-turn-helix motif is composed solely of amino acids. A second important group of
DNA-binding motifs adds one or more zinc atoms as structural components. One type of zinc
finger is a simple structure, consisting of an α helix and a β sheet held together by the zinc
(Fig. 7-18B). Fig. 7-18 shows other examples of zinc finger proteins. Like the helix-turn-helix
proteins, zinc finger proteins of a second type usually form dimers that allow one of the two
α helices of each subunit to interact with the major groove of the DNA. The two types of zinc
finger proteins share two important features: both use zinc as a structural element, and both
use an α helix to recognize the major groove of the DNA.

Fig. 27-12 Zinc fingers. (a) A ribbon representation of a single zinc finger derived from the regulatory protein
Zif 268. The zinc atom is in orange and the amino acid residues that coordinate it (two His and two Cys) are
shown in red. (b) Three zinc fingers (light blue and gray) from Zif 268 are shown complexed with DNA. The
zinc atoms are again shown in orange.
The Leucine Zipper Motif Mediates Both DNA Binding and Protein Dimerization

Many gene regulatory proteins recognize DNA as homodimers, probably because this is a
simple way of achieving strong specific binding. Usually, the portion of the protein
responsible for dimerization is distinct from the portion that is responsible for DNA binding
(Fig. 7-14). One motif, however, combines these two functions in an elegant and economical
way. It is called the leucine zipper motif, so named because of the way the two α helices,
one from each monomer, are joined together to form a short coiled-coil. The helices are held
together by interactions between hydrophobic amino acid side chains (often on leucines) that
extend from one side of each helix. Just beyond the dimerization interface the two
helices separate from each other to form a Y-shaped structure, which allows their side chains
to contact the major groove of DNA (Fig. 7-21).

Fig. 7-21 A leucine zipper dimer bound to DNA. Two α-helical
DNA-binding domains (bottom) dimerize through their α-helical leucine
zipper region (top) to form an inverted Y-shaped structure. Each arm of
the Y is formed by a single helix, one from each monomer, that mediates
binding to a specific DNA sequence in the major groove of DNA. Each
helix binds to one-half of a symmetric DNA structure. The structure shown
is that of the yeast Gcn4 protein, which regulates transcription in response
to the availability of amino acids in the environment.

A striking feature of these helices is the frequent appearance of Leu residues; they tend to
occur as every seventh amino acid residue (Fig. 27-15).
Fig. 27-15 Leucine zippers. Comparison of amino acid sequences of several leucine zipper proteins. Note
the Leu (L) residues occurring every seventh residue in the dimerization region, and the number of Lys (K) and
Arg (R) residues in the DNA-binding domain.
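
The every-seventh-residue spacing of leucines can be searched for directly in a protein sequence. The following sketch is only an illustration: the function name, the threshold of four repeats, and the toy sequence are all assumptions, and the toy sequence is made up rather than taken from any of the proteins in Fig. 27-15.

```python
def leucine_heptad_runs(protein, min_repeats=4):
    """Find runs in which Leu (L) occurs at every seventh residue.

    Returns a list of (start_index, number_of_leucines) tuples, using
    0-based indices. min_repeats is an arbitrary illustrative threshold.
    """
    protein = protein.upper()
    runs = []
    for i, residue in enumerate(protein):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run starting 7 residues earlier.
        if i >= 7 and protein[i - 7] == "L":
            continue
        count = 0
        j = i
        while j < len(protein) and protein[j] == "L":
            count += 1
            j += 7
        if count >= min_repeats:
            runs.append((i, count))
    return runs


if __name__ == "__main__":
    # Hypothetical toy sequence: four heptads that each begin with Leu.
    toy = "MK" + "LEDKVEE" * 4 + "GG"
    print(leucine_heptad_runs(toy))
```

A real leucine zipper would also show the cluster of Lys and Arg residues in the adjacent DNA-binding region, which this simple scan does not check.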
Helix-loop-Helix Motif also mediates dimerization and DNA binding:

Another important DNA-binding motif, related to the leucine zipper, is the helix-loop-helix
(HLH) motif, which should not be confused with the helix-turn-helix motif. An HLH motif
consists of a short α helix connected by a loop to a second, longer α helix. The flexibility of
the loop allows one helix to fold back and pack against the other. As shown in Fig. 7-25, this
two-helix structure binds both to DNA and to the HLH motif of a second HLH protein. As with
the leucine zipper proteins, the second HLH protein can be the same (creating a
homodimer) or different (creating a heterodimer). In either case, two helices that extend
from the dimerization interface make specific contacts with the DNA.
REPETITIVE DNA SEQUENCES IN EUKARYOTES:

Another curiosity about mammalian DNA, and the DNA of other higher organisms, is that, in
contrast to bacterial DNA, it contains repetitive sequences in addition to single-copy
sequences. About 25% of the genome consists of repetitive DNA, sequences that are repeated
over and over again in the genome, often thousands of times. There are two major classes
of repetitive DNA: dispersed repetitive DNA and satellite DNA. Satellite repeats are
clustered together in certain chromosome locations, where they occur in tandem (i.e., the
beginning of one repeat occurs immediately adjacent to the end of another). Dispersed
repeats, as the name implies, tend to be scattered singly throughout the genome (they do not
occur in tandem).

Satellite DNA makes up approximately 10% of the genome and can be further subdivided
into several categories. Alpha-satellite DNA occurs as tandem repeats of a 171-bp
sequence that can extend several million bp or more in length. This type of satellite DNA is
found near the centromeres of chromosomes. Minisatellites are blocks of tandem repeats
whose total length is much smaller. These repeats, which are 20-70 bp in length, usually
have a total length of a few thousand base pairs or so. Microsatellites are smaller still: the
repeat units are usually only 2, 3, or 4 bp in length, and the total length of the array is usually
less than a few hundred base pairs. Minisatellites and microsatellites are of special interest
in human genetics because they vary in length among individuals, making them highly useful
for gene mapping.
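
Because microsatellites are simply short units repeated in tandem, they can be located with a back-referencing regular expression. The sketch below is a minimal illustration: the function name and the cutoff of five copies are assumptions, homopolymer runs such as AAAAAA are ignored, and the example sequences are invented.

```python
import re


def find_microsatellites(dna, min_copies=5):
    """Find tandem repeats of a 2-4 bp unit in a DNA string.

    Returns a list of (start_index, repeat_unit, copy_number) tuples with
    0-based start positions. min_copies is an arbitrary illustrative cutoff.
    """
    # Group 1 is the repeat unit; \1{n,} demands n further tandem copies of it.
    pattern = re.compile(r"([ACGT]{2,4}?)\1{" + str(min_copies - 1) + ",}")
    hits = []
    for match in pattern.finditer(dna.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip single-base runs, which are not 2-4 bp repeat units
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence containing six tandem copies of the unit CA.
    print(find_microsatellites("TTG" + "CA" * 6 + "GGT"))
```

Comparing the copy number found at the same locus in different individuals is what makes these repeats useful as mapping markers.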

Dispersed repetitive DNA makes up about 15% of the genome, and these repeats fall into
two major categories, SINEs (short interspersed elements) and LINEs (long interspersed
elements). Individual SINEs range in size from 90 to 500 bp, while individual LINEs can be
as large as 7,000 bp. One of the most important types of SINEs is termed the Alu repeats.
These sequences contain a site for the restriction enzyme AluI. These repeats constitute
about 2% to 3% of human DNA. A remarkable feature of Alu sequences is that they can
generate copies of themselves, which can then insert into other parts of the genome. This
insertion can sometimes interrupt a protein-coding gene, causing genetic disease.
