
A quick exercise in finding open reading frames (ORFs) in DNA sequence

Remember:
1. ORFs start with a start codon (ATG or sometimes GTG) and end with a stop codon (TAA, TAG or TGA).
2. They potentially encode a protein sequence (i.e. they could be part of a real gene), but they could also be a chance consequence of the DNA sequence.
3. ORFs can be found in any of the six possible reading frames (3 from each strand).
4. The number of ORFs you find in a sequence depends on the minimum ORF size that you choose. Any ORF smaller than ~25 codons is unlikely to be a real gene.

To search for ORFs:
1. Go to: tools.neb.com/NEBcutter2/index.php
2. Copy and paste the ~3 kb of DNA sequence given below into the box (numbers and spaces are ignored by the program).
3. Choose a minimum ORF length.
4. Click on Submit.
5. Note the number and sizes of the ORFs (if you click on an ORF you get the predicted amino acid sequence).
6. Try changing the minimum ORF length and re-running the analysis.
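To see what an ORF search does under the hood, here is a minimal Python sketch of the rules listed above. It is not how NEBcutter works internally; it is only an illustration. The names `find_orfs`, `reverse_complement` and `min_codons` are made up for this example. It assumes that ATG is the only start codon unless you pass others, that each ORF begins at the first start codon after the previous stop in that frame, and that any character other than A, C, G or T (numbers, spaces, line breaks) can simply be dropped.

```python
# Illustrative sketch only: a simple six-frame ORF scan.
# All names here are hypothetical and not part of any tool.

START_CODONS = {"ATG"}               # assumption: add "GTG" to allow the rarer start
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=25, starts=START_CODONS):
    """Return a list of (strand, frame, start_index, codon_count) tuples.

    start_index is 0-based on the strand being read. codon_count runs
    from the start codon up to, but not including, the stop codon.
    """
    # Drop position numbers, spaces and line breaks; keep only bases.
    seq = "".join(ch for ch in seq.upper() if ch in "ACGT")
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # index of the currently open start codon, if any
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in starts:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame + 1, start, n_codons))
                    start = None  # look for the next ORF in this frame
    return orfs
```

For example, `find_orfs(test_sequence, min_codons=25)` and `find_orfs(test_sequence, min_codons=100)`, where `test_sequence` is the text pasted from below, should return lists of different lengths, which is the effect described in point 4. The counts may not match the web tool exactly, because tools differ in how they treat alternative start codons and ORFs that run off the end of the sequence.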

Test sequence:
1 aattcaaaaa aatttcgttg tggttttggt tccaaagata ttcaatacca ggtcgaatca 61 aaaaagactt tgaaagaatg ttaataaata atggtacaaa aattaaagta attaaacatc 121 ttacagaaac aatcacttga tatcgagaaa ttctaaattc ttgaataaca agcatttctg 181 cgccaggaag aagttgtttt cgaaatcttt caaaggtgcg taaaattgat cgtgggatta 241 atccagtttg ttctcttgtt cgttttttca tagaataaaa agtttttaaa aatttcttta 301 agtttttcct tttttttctt taagttttcc tttttcattt cgttttcttc cttctttttt 361 ttttttagaa aaaaaaagat caaaaagaaa ggcttctttt ctttttctca aaaaaaagaa 421 aagacttcta ttttttgatt caatgagttt ttcgttcctt aaaaaaaaaa tggaatcaga 481 aaagaaagac atgaaaaaaa gatttttttt cattttaaaa aaacttataa aagacacaaa 541 aacttaaggt tgggactaaa cttttagtat acaattgtgt caccaaactt gtcaagctcg 601 aaaacaactc ttttgcagta aaaaaaaatg cccgcaaata ctcaaaaata ttattgtaat 661 tagtttattt agtattgttt ttcttgctac tatttaggct cgttacatta aaattgtaat 721 ttggtttaat ttgtttttgc tgcagttttg aaaagctttt gaaaatttag gggcgaaata 781 atgtatttta gggtaatctg cttttatagc aagaaataaa aaaagagagg aaaagactaa 841 tttgactttt ttttttatta attgttacac tataaagtgt tgggaaaaaa aacctttttt 901 tttttttaac tctggaaaca attttaaatt aactttattt ctgcattaca aaataaacga 961 aagaaaaaac cactttttaa ttctttttaa aaattgatga aattacaaaa aatggttttt 1021 ggattagata acgtaacttt tctcacagat tgcattgatg atttttttaa tttatatgat 1081 gaatgttttg aaattttttt ctagttgctc ctaaaaaatc tctattcagc agaagagttg 1141 gaacaaaacc catcaaattt attttttaca aaagtgaaaa agcgaagtga aaatgattta 1201 gctaattttt tagactttga gttggaattt tgtgaaaaaa attatctttt ttttacttat 1261 ctttgggctg atttgagtat tcctgctgat caagattttt tggctaaaaa agtaaataaa 1321 gaacgaatct tacttcaaaa aaaattcaag ttctcattgg ttcttgtgca gaaaaaacta 1381 accattttaa agaaaaaaag cgagaaattc tgttagaaat ttataatttt tcgttgattt 1441 ttttgcgtga acagcttttt ttgagacata attttttaat taattttttt gcatcaaaaa

1501 accagaaaga aaaacaggcc ttttattttc aacaattatc cagtttagac tgatctatta 1561 ctctctatac ttgaagtaat cactttaaaa caaattttta aaactatata aaaaaaacca 1621 tgtcatttaa ataatttttt ttaagtttta aagttgatct cgaaaaaaaa atattttttt 1681 tcgatgaaca cccttaacct cttaaaacat caattttaaa ttcaaataaa aattctttaa 1741 aaaaaaacaa gaacctacaa atttctcttg gaaagaatat ttttttttag gaactataat 1801 ttatgtcgga tcttggttta attttacaaa acctcaaaaa aatttaactt attcaggctt 1861 aatgccgccc gcattagctg tccgaagtct gcaattttcc cgaagttatc gtgatagttc 1921 tgtttttttt tgctgttact gaggctcagg cgaagcaaga aagctcgact gcgagtagtg 1981 gtgaaagtca aaaggtggta aaacaaaaat cggcacaagc tggggaatcc caattttcag 2041 cagcacaaaa agcttggagc aaaaacggaa aaaaacaagg ttatcaaaat agacaaaata 2101 ccaaagcccg agaacaggca gatgctcaag ttgttcgagc agctgctcaa ttggggcacc 2161 gtatgatccc cgctttacta aagcgcctga aaaggctgaa ggtgtaaatt taaatgaaca 2221 agaggattct gctgaagctg cagctgccat aaatccaaaa aaggatctgg cagaagccac 2281 tgttgaaaga caacgtgaag agaatgcttc tattgagcaa gagtttttgg ctatcaccca 2341 cagattaggc aatgttacgc tatacaacaa taacgtaaac gcggagcttt taaagtttaa 2401 agctaatcca atttatgtgg atttattatc tgagaacgaa aaagtcttgg aaagtaaaaa 2461 aaagtttgcg cagggatacg aaaagttcgt agaggcggtt aaaagtcaca atacaacgcc 2521 tcttgaaaat acagaagccc gttcacattt tcaagaaaaa atcaatgtgt gctttaaagg 2581 aatgatggaa agccataaag agtttgttga actcaaaagc gagtatcgta agagtatgag 2641 agcttttgat cggttactaa aaatcagttg taaaattcac gacttaaaaa tatttggaaa 2701 tattattaca gttgacgaac tccttccacc gcctgagact tccgcacaat tcattgatga 2761 agcatctacc tcaaatcaaa gcttagcatc gcaagaacca aatcagaaca aacaagattt 2821 cagaaaagcg ataacacaac aaaacgctca caacagtgct ttagaaaaag aaatagctaa 2881 acaaacagct aataagacaa gcaaaaaaaa ttaaacccaa aacccagcgg ttttactgca 2941 cgccaaaaaa agataatagc tgagataaga gtaattatcg aacgccagat cgtaatcaaa 3001 tctctaaggc gtattttgat gctcgaactc cagcaaacac actttctcct gacgaactgc
