
Special Section: Innovative Laboratory Exercises, Focus on Genomic Annotation
Introduction: Sequences and Consequences

Cheryl A. Kerfeld*

From the Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California 94598;
Department of Plant and Microbial Biology, University of California, Berkeley, California 94720;
Synthetic Biology Institute, University of California, Berkeley, California 94720
Copyright 2013 by The International Union of Biochemistry and Molecular
Biology, 41(1):12-15, 2013
Keywords: Active learning; computational biology; computers in research
and teaching; curriculum design development and implementation;
genomics proteomics bioinformatics; inquiry based teaching

Almost a decade ago, Bio2010: Transforming Undergraduate Education for Future Research Biologists [1] called for a reformulation
of undergraduate life sciences majors' instruction, noting that the
typical curriculum no longer reflected modern research in biology. It advocated giving students the experience
of real research to better understand biological principles and
because experiencing the power and beauty of creative inquiry
is the best way to engage students in learning about science.
Likewise, in the last decade, new sequencing technologies
and the development of computational tools to explore genomic
sequence data have transformed life sciences research.
Genomes are being sequenced at an increasingly fast pace
[2,3]. The results of this technological advance are increasingly
informing research. This is reflected in the number of citations
in the PubMed database for articles involving sequence analysis; there were fewer than 1,200 in 1992 and over 200,000 in
2011 (Fig. 1a). The ability to readily interpret genomic data
required concomitant advances in computational methods for
making sense of raw sequence data. The number of articles
published in computational biology has increased over the last
20 years with the greatest growth seen over the last decade
(Fig. 1b). As shown in Fig. 1c, there were at least 45,295
articles published related to bioinformatics in the last five years
(dark gray portion of pie graph). That is nearly double what
was published in the previous five years (22,721 articles as
shown in the light gray portion of the pie graph). Collectively,
these data and the tools to interpret them have enabled
researchers in fields across the continuum of biological organization, from molecule to ecosystem, to situate their experimental results in a genomic context. In a manner analogous to
the way the inventions of the telescope and the microscope
opened up unseen worlds for scientific exploration, genomics
and bioinformatics have created a new way to view life.

*Address for correspondence to: Cheryl A. Kerfeld, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California
94598; E-mail: ckerfeld@berkeley.edu.
This work is supported by the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 and by NSF (MCB0851094 and EF1105897).
Received 29 September 2012
DOI: 10.1002/bmb.20660
Published online in Wiley Online Library
(wileyonlinelibrary.com)

Genomics and bioinformatics have likewise created new
opportunities for the teaching of and learning in the life sciences.
Empowering students to work with real data and state-of-the-art
computational tools can catalyze the reforms in undergraduate
life sciences instruction called for by Bio2010 [1] and Vision and
Change [4]. Methodologically, it requires only a computer and an
internet connection to introduce bioinformatics into a course.
Conceptually, bioinformatics starts with the chemical formula for
an organic molecule that can be used to trace connections from
predicted molecular behavior to organismal fitness. Accordingly,
genomics and bioinformatics can be used to illustrate and interconnect concepts across the life sciences curriculum.
At the same time, they can be used to give students an experience of research in the context of their courses. Working
with bioinformatics tools and genomics data can be scaled to
provide research experience to large numbers of students,
because the work can be done by students in parallel. Each student uses the same set of tools but has a unique
sequence data set. This makes tractable the aim of giving all students a chance to build their understanding of the life sciences
with bioinformatics tools and real data.
To realize the potential for the use of genomics and bioinformatics across the life sciences curriculum, the Genomics
and Bioinformatics Education Program of the Department of
Energy Joint Genome Institute (JGI) created IMG-ACT (Integrated Microbial Genomes Annotation Collaboration Toolkit).
IMG-ACT is a fusion of a flexible rich text editor and web portal
through which students explore microbial genome datasets and
record their findings in an online notebook. IMG-ACT consolidates various databases and tools used in microbial genome
analysis [5]. It is structured as a series of modules which are
essentially guides to different types of bioinformatic analysis.

Biochemistry and Molecular Biology Education

FIG 1
Number of NCBI PubMed articles published per year
indexed according to the National Library of Medicine's Medical Subject Headings (MeSH) terms.
MeSH is a controlled vocabulary thesaurus with hierarchical organization used for describing the content
of a bibliographic reference. (a) Sequence Analysis.
New PubMed articles published per year for the last
20 years indexed with the MeSH term Sequence
Analysis, which falls hierarchically under Genetic
Techniques and encompasses DNA, RNA, and protein sequence analysis and annotation. (b) Computational Biology. New PubMed articles published per
year for the last 20 years indexed with the MeSH
term Computational Biology, which is the MeSH
index term for bioinformatics. Computational Biology
encompasses computational methods and computer-based techniques for solving biological problems. (c)
Total number of PubMed articles published in the
last 10 years using the MeSH index Computational
Biology (68,016). In the last five year period, from
2007 to 2011, the number of articles related to or
using bioinformatics doubled from the previous five
year period (2002-2006).
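
The counts quoted for Fig. 1c can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original analysis; the variable names are illustrative, and the two counts are taken directly from the text (which gives the 2007 to 2011 figure as a lower bound).

```python
# Article counts quoted in the text for the MeSH term "Computational Biology".
recent_articles = 45_295    # 2007-2011 (stated as "at least" in the text)
previous_articles = 22_721  # 2002-2006

# The ten-year total reported in the Fig. 1c caption should equal the sum.
total_articles = recent_articles + previous_articles

# Ratio of the recent five-year period to the previous one.
growth_ratio = recent_articles / previous_articles

print(f"Ten-year total: {total_articles:,}")   # Ten-year total: 68,016
print(f"Growth ratio: {growth_ratio:.2f}x")    # Growth ratio: 1.99x
```

The sum reproduces the 68,016 total given in the caption, and the ratio of about 1.99 is consistent with the description "nearly double".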

This modularity enables faculty to tailor the experience for their particular pedagogical goals. IMG-ACT is useful for
novices and experts alike; beginners may rely on the tutorials
that are only a click away, while experts appreciate the seamless workflow that enables them to compile the results of their
bioinformatic research in an organized form. For the instructor, IMG-ACT enables both assigning and viewing student work
online, making it feasible to involve large numbers of students
in data analysis. The potential uses of IMG-ACT are manifold
(Fig. 2). As of August 2012, 250 instructors at over 126 colleges and universities have used IMG-ACT with over 5,685
students.
In this issue, BAMBED features four articles by faculty
from diverse institutions recounting their use of genomics and
bioinformatics in their courses. They capture some of the multiplicity of ways that genomics and bioinformatics can be used
to realize specific pedagogical aims. Ditty et al. describe how
IMG-ACT serves as the link between students at the University
of St. Thomas in St. Paul, a Primarily Undergraduate Institution,
and UC Davis, a research-intensive institution. This enabled
not only the connection of bioinformatics and wet lab experiments, but also an integration of teaching and research at
both institutions that led to a publication. At UCLA, IMG-ACT
likewise plays a key role in a laboratory-based research project that is driven by peer-to-peer learning. The UCLA project
illustrates how providing bioinformatics research experience
in conjunction with a wet lab course enables students to realize that they may have a predilection for computational or for
experimental work. Those who gravitate toward the computational prove to be excellent mentors for their peers, especially
adept at communicating the big picture and sharing their
enthusiasm for the research. At Austin College in Sherman, Texas,
students from Biochemistry and Microbiology courses are
working together in a model interdisciplinary collaboration to
study amino acid biosynthesis in a bacterium from a remote
branch of the tree of life. They are discovering first-hand that
annotations are hypotheses that need to be tested in the laboratory and that nature has a seemingly endless number of variations on the textbook versions of metabolic pathways. At Salt
Lake Community College, students also work in teams to study
how genes are identified and organized in a halophile, a type
of extremophile with a lifestyle of local interest. In this context,
conceptual understanding is explicitly framed by considering
the scientific method. Here too, communication among students is foregrounded, and those students who take a special
interest in their genomics and bioinformatics coursework have
the opportunity to build on it in an independent research
project.
The underlying themes threading through all of these
articles are collaboration and conceptual integration between
courses, between schools, across levels of students, and
between computational biology and wet lab research. In all
cases students are working together with real data and tools to
become knowledge producers. They are experiencing first-hand
how algorithms translate the principles of biology into mathematical form [6]. Active learning with genomics and
bioinformatics provides an authentic research experience and
includes important lessons only available through working with
real data and all of its ambiguities; for example, it demonstrates the fallibility of gene annotations and the need to think critically
about the evidence that is returned from the internet.

FIG 2
IMG-ACT serves as a hub to network courses, students, and research experience. [Color figure can be viewed in the online
issue, which is available at wileyonlinelibrary.com.]

The modern approach to understanding "what is life" has
been transformed by 'omics technologies from a reductionist focus on single molecules in isolation to an interdisciplinary
integration of experimental and in silico data. The breadth of
examples in these articles showcases the creativity of the faculty using these tools and demonstrates the power of genomics
and bioinformatics for linking theory and modern practice
across the life sciences curriculum. To keep pace with modern
research methods and expand its versatility, IMG-ACT is
under continual development. Opportunities to make explicit
connections between genomic data and evolution and the
studies of microbial communities are coming soon; a metagenome analysis toolkit as well as a module for exploring the evolutionary relationships between plant and cyanobacterial
genes for the photosynthetic apparatus are being built. The
articles in this issue are a snapshot of where we are now in
using genomics and bioinformatics in various courses and
types of institutions. They demonstrate how genomics and bioinformatics can be used in diverse curricular niches to help
students forge interconnections from molecular function to the
Darwinian definition of function, a curricular synthesis of
consequence.


Acknowledgements
The author would like to thank James Bristow and Edwin Rubin
whose visionary leadership at the JGI enabled the creation of
IMG-ACT. Likewise, Daniel Drell (DOE headquarters) and Jonathan Eisen (UC Davis) provided both inspiration and conceptual
support. IMG-ACT was created by the JGI Informatics Team;
David Hays and Rene Perrier, in particular, played key roles. The
author also thanks the Integrated Microbial Genomes Team,
especially Amy Chen, Konstantinos Mavrommatis and Ernest
Szeto for making possible the seamless integration of IMG EDU
with IMG-ACT. The IMG-ACT system was in large part designed
by faculty advisers who first met at the JGI in June 2007: Zhaohui
Xu (Bowling Green State University), Sharyn Freyermuth (University of Missouri-Columbia), Kelynne Reed (Austin College), Jayna
L. Ditty (The University of St. Thomas), Christopher Kvaal (St.
Cloud State University), Cheryl Bailey (University of Nebraska),
Sabine Heinhorst (University of Southern Mississippi), Kathleen
Scott (University of South Florida), Robert Britton (Michigan State
University), Erin Sanders (University of California, Los Angeles),
Rick Johns (Northern Illinois University), A. Malcolm Campbell
(Davidson College), Brad Goodner (Hiram College), and Stuart
Gordon (Hiram College). Their creativity, commitment, and generosity with their time are the key reasons for the success of IMG-ACT.
Seth Axen (JGI) and Jordan Moberg-Parker contributed figures
and statistics used in this article. Finally, the author acknowledges Desiree Stanley, Edwin Kim, and especially Seth Axen,
members of her team who have contributed to the creation and
management of the IMG-ACT system.

References
[1] The National Research Council. (2003) BIO 2010: Transforming Undergraduate Education for Future Research Biologists, The National Academies Press,
Washington, DC.
[2] Medini, D., Serruto, D., Parkhill, J., Relman, D., Donati, C., Moxon, R., Falkow,
S., and Rappuoli, R. (2008) Microbiology in the post-genomic era. Nat. Rev.
Microbiol. 6, 419-430.
[3] Schuster, S. C. (2008) Next-generation sequencing transforms today's biology. Nat. Methods 5, 16-18.
[4] AAAS. (2011) Vision and Change in Undergraduate Biology Education: A Call
to Action, AAAS Press, Washington, DC.
[5] Ditty, J. L., Kvaal, C. A., Goodner, B., Freyermuth, S. K., Bailey, C., Britton, R.
A., Gordon, S. G., Heinhorst, S., Reed, K., Xu, Z., Sanders-Lorenz, E. R.,
Axen, S. D., Kim, E., Johns, M., Scott, K. M., and Kerfeld, C. A. (2010) Incorporating genomics and bioinformatics across the life sciences curriculum:
Development and implementation of the integrated microbial genomes
annotation collaboration toolkit. PLoS Biol. 8, e1000448.
[6] Kerfeld, C. A. and Scott, K. M. (2011) Using BLAST to teach "E-value-tionary"
concepts. PLoS Biol. 9, e1001014.
