Introduction to Genomics
Course Instructor:
Dr. Anum Masood
Genome
Genome Size
• The complete DNA sequence defines what we call a genome.
• The genome is therefore the total genetic information that is carried
within the cell.
• That includes the DNA in the nucleus and DNA in any of the
organelles.
• Note that some organelles (mitochondria and, in plants, chloroplasts) carry their own DNA.
• DNA in the nucleus is most often not a single molecule, but rather
broken into pieces and organized within the chromosomes.
• But do not worry about chromosomes at this point (at least not for the
next few lectures).
Genome
Genome Size
• So what is the total length of the DNA sequence?
• It depends on the organism.
• Among cellular organisms, prokaryotes (such as bacteria) have the shortest genomes.
• The length of a DNA sequence is expressed in base pairs (bp);
a base pair is a unit consisting of two nucleobases bound to each other by
hydrogen bonds.
• Put simply, one base pair is one nucleotide on each strand.
• The total number of nucleotides in one of the strands is the size (length)
of the genome.
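A minimal sketch of this definition, assuming the genome is given as a plain string of A, C, G and T for one strand. The toy sequence and the function names are hypothetical and chosen only for illustration.

```python
# Watson-Crick pairing: each base on one strand pairs with a partner on the other.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the base-by-base partner of a strand (not reversed)."""
    return "".join(COMPLEMENT[base] for base in strand)


def genome_size_bp(strand: str) -> int:
    """Genome size in base pairs = number of nucleotides in ONE strand."""
    return len(strand)


strand = "ATGGCCATTGTAATGGGCCGC"  # hypothetical toy sequence
partner = complement_strand(strand)

# Each position contributes exactly one base pair: (base, partner base).
base_pairs = list(zip(strand, partner))

print(genome_size_bp(strand), "bp")
print(base_pairs[:3])
```

Counting both strands would double the number of nucleotides, but not the number of base pairs, which is why only one strand is counted.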
Genome
Genome Size
• Bacteria have genomes of length ranging from 0.5 to 13 Mbp.
• The unit Mbp means mega base pairs, or 10^6 base pairs.
• Can we predict protein and RNA sequences directly from the DNA sequence?
• Yes, and doing so is a very useful activity in bioinformatics.
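As a rough illustration of predicting RNA and protein from DNA, the sketch below transcribes a coding strand into mRNA and translates it with the standard genetic code. The input sequence and the function names are hypothetical, and the sketch assumes the sequence is already a coding region that starts at position 0; real genes need more care.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """The mRNA matches the coding strand, with U in place of T."""
    return coding_strand.replace("T", "U")


def translate(coding_strand: str) -> str:
    """Read codons from position 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(coding_strand) - 2, 3):
        amino_acid = CODON_TABLE[coding_strand[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCATTGTAATGGGCCGCTGA"  # hypothetical toy coding sequence
print(transcribe(dna))
print(translate(dna))
```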
DNA Coding Regions: Pretending
to Work with Protein Sequences
Beginning with the third position (GGA-AGT- . . .) again leads to an entirely different translation.
Turning DNA into proteins: Reading Frames
• Because of the triplet-based genetic code, a given DNA interval, on a given strand,
can theoretically be translated in three different ways
• These three alternatives are known in the field as reading frames.
• Because either strand of the DNA can be read, there are six possible reading
frames in total for translating a DNA sequence into protein.
• With very few exceptions (found in exotic viruses), only one of these six frames is
used for any given DNA coding region.
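The six reading frames can be sketched as follows: three offsets on the given strand and three on its reverse complement. This is a minimal illustration using the standard genetic code; the frame labels ("+1" to "-3"), the function names, and the input sequence are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read in its own 5'-to-3' direction."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_frame(seq: str, offset: int) -> str:
    """Translate every complete codon starting at the given offset (0, 1 or 2)."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def six_frame_translation(seq: str) -> dict:
    """Return the translations of all six frames, keyed '+1'..'+3' and '-1'..'-3'."""
    frames = {}
    for label, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{label}{offset + 1}"] = translate_frame(strand, offset)
    return frames


for name, protein in six_frame_translation("ATGGCCATTGTAATGGGCCGCTGA").items():
    print(name, protein)
```

Shifting the offset by one base changes every codon downstream, which is why the frames produce entirely different translations.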
Open Reading Frame (ORF)
• An interval of DNA sequence that begins at a start codon (ATG, which codes for
methionine, M) and remains free of stop codons (TAA, TGA, or TAG) is called an
open reading frame (ORF).
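A minimal sketch of an ORF scan on a single strand, following the definition above: in each of the three frames, an ORF opens at ATG and closes at the first in-frame stop codon. The function name, the `min_codons` parameter, and the test sequence are hypothetical. ORFs that run off the end of the sequence without reaching a stop codon are ignored here, which is a simplifying assumption.

```python
START = "ATG"
STOPS = {"TAA", "TGA", "TAG"}


def find_orfs(seq: str, min_codons: int = 1) -> list:
    """Return (start, end, frame) tuples for ORFs on the given strand.

    start is the index of the A in ATG; end is the index where the stop
    codon begins, so seq[start:end] is the ORF without its stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i, frame))
                start = None
    return orfs


seq = "CCATGAAATTTTAGCC"  # hypothetical toy sequence
for start, end, frame in find_orfs(seq):
    print(frame, start, end, seq[start:end])
```

To cover all six frames, the same scan can be run a second time on the reverse complement of the sequence.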
[Figures: Six ORFs Example-1; Six ORFs Example-2]
Turning DNA into proteins: The genetic code
• Some DNA sequences do not encode proteins at all, and higher organisms
have large pieces of noncoding DNA (introns) inserted within
their genes.