BI Unit 1 Part-1
ZHUMUR GHOSH
Scientist
Centre of Excellence in Bioinformatics
Bose Institute
Kolkata
BIBEKANAND MALLICK
Assistant Professor
Department of Life Science
National Institute of Technology
Rourkela, Odisha
© Oxford University Press
INTRODUCTION
What is bioinformatics? Is it biology or informatics or an optimized blend of both?
Ask ten scientists and you will get ten different responses. There would be common
elements (computers and biological databases), but the definition depends on who
defines it. This is the reason bioinformatics is compared to an amoeba.
The word ‘bioinformatics’ is a shortened form of ‘biological informatics’. An
unprecedented wealth of biological data has been generated by the Human Genome
Project (HGP) and sequencing projects in other organisms. The huge demand for the
analysis and interpretation of these data is being managed by the evolving science of
bioinformatics. Bioinformatics is defined as the application of computational and
analytical tools to capture and interpret the biological data.
This emerging field is turning out to be a popular career choice of the twenty-first
century. It is fuelled by the major genome-sequencing projects, which are
creating a demand for experts who understand both biology and computing and can
interpret the vast amount of data generated by this type of research. The HGP, for
example, has yielded a human genome sequence of about three billion DNA base pairs.
Bioinformatics is often focused on obtaining biologically oriented data, such as
nucleic acid (DNA/RNA) and protein sequences, structures, functions, pathways, and
interactions; organizing these data into databases; developing methods to get useful
information from these databases; and devising methods to integrate the related data
from disparate sources. These computer databases and algorithms are developed to
speed up and enhance biological research. Functional genomics, biomolecular
structure, proteome analysis, cell metabolism, biodiversity, downstream processing
in chemical engineering, drug design, and vaccine design are some of the areas in
which bioinformatics is an integral component.
library. This way you do not risk losing or destroying the book. In all eukaryotic cells,
DNA never leaves the nucleus; instead, the genetic recipe (the genes) is copied into
RNA, which in turn is decoded (translated) into proteins in the cytoplasm. The DNA
itself is not translated into proteins directly for several reasons. One is security. The
cytoplasm is a dangerous environment for DNA; the daily transcription of genes to
proteins would be harmful to the DNA, which has to stay intact to maintain life.
Therefore, the RNA works as a sort of throw-away version of the DNA (like the
copies from the reference book), good for limited work.
Another reason is to regulate the rate of protein synthesis. How does something as
seemingly simple as the DNA’s long sequence, composed of only four different letters
(A, T, G, C) get converted into so many different kinds of protein molecules that
perform the daily work in our body? It is important to understand this in detail (see
Figure 1.2).
The path from the DNA sequence to the protein sequence is a complex process
described by the central dogma of molecular biology. It is composed of two major steps,
as shown in Figure 1.2. The first step is transcription, in which the DNA is transcribed
into a mature messenger RNA (mRNA). The second is translation, in which the base
sequence of the mRNA is ‘read’ and converted into an amino acid sequence. The
information contained in the nucleotide sequence of the mRNA is read as three-letter
words (triplets) called codons.
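The two steps can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the input is taken to be the coding strand, splicing and other mRNA processing are ignored, and the codon table lists just a handful of the 64 codons. The function names and the example sequence are hypothetical.

```python
# Minimal sketch of transcription and translation.
# Assumption: only a few codons are listed; a full table has 64 entries.
CODON_TABLE = {
    "AUG": "M",                                      # methionine (start codon)
    "UUU": "F", "UUC": "F",                          # phenylalanine
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "AAA": "K", "AAG": "K",                          # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",              # stop codons
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon is met."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGTTTGGCAAATAA")
print(mrna)             # AUGUUUGGCAAAUAA
print(translate(mrna))  # MFGK
```

Each amino acid is written with its one-letter symbol, so the five codons of the example yield a four-residue peptide followed by a stop.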
In 1953, when the structure of the DNA molecule was published by Watson and
Crick, two questions were yet to be resolved. The first was the mechanism by which
DNA replicated itself and the second, how a sequence of four things (the DNA
bases A, T, G, C) could encode a sequence of 20 things (the amino acids of protein).
After a decade’s search for the genetic code, finally a universal genetic code was
announced in 1966 in Volume 31 of the Cold Spring Harbor Symposia on
Quantitative Biology, contributed by Har Gobind Khorana, Severo Ochoa, Matthew
Meselson, Marshall Nirenberg, and Heinrich Matthaei. They determined that the
genetic code was composed of three-letter ‘words’ called codons and each codon codes
for a specific amino acid.
However, four DNA bases, ‘read’ three at a time, gave 64 possible codons. Since there
were only 20 amino acids, this meant that more than one codon could code for the same
amino acid. This phenomenon is called degeneracy. Amino acids coded by codons are
linked together during translation to form a polypeptide chain that is later folded into
a protein. In brief, we can say DNA makes RNA makes protein.
Figure 1.2 Schematic representation of information flow from DNA to protein through
transcription and translation
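The arithmetic behind the triplet code can be checked directly. The short sketch below enumerates every three-letter word over the four RNA bases and uses glycine, whose four codons all begin with GG, as one example of degeneracy. The variable names are illustrative only.

```python
from itertools import product

BASES = "ACGU"

# Two-letter words would give only 16 combinations, too few for 20 amino acids.
print(len(BASES) ** 2)  # 16

# Three-letter words give 4 * 4 * 4 = 64 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))  # 64

# Degeneracy: several codons for one amino acid, illustrated with glycine.
glycine_codons = [codon for codon in codons if codon.startswith("GG")]
print(glycine_codons)  # ['GGA', 'GGC', 'GGG', 'GGU']
```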
BRANCHES OF BIOINFORMATICS
A living cell is a system where cellular components such as genome, the gene
transcript, and the proteins interact with each other, and these interactions determine
the fate of the cell, e.g., whether a stem cell is going to become a liver cell or a cancer
cell. The characterization of these three types of components and the associated
development of analytical methods lead to the establishment of the three closely
related branches of bioinformatics: genomics,
transcriptomics, and proteomics (see Figure 1.4).
Genomics
Genomics plays a significant role in modern biological research in which the nucleotide
sequences of all the chromosomes of an organism are mapped and the location of
different genes and their sequences are thereby determined. This involves extensive
analysis of the nucleic acids through molecular biology techniques before the data are
ready for processing by computers. It is a science that attempts to describe a living
organism in terms of the sequence of its genome (its constituent genetic material).
Figure 1.4 The three major branches of bioinformatics (genomics: DNA; transcriptomics:
RNA; proteomics: protein; DNA makes RNA makes protein, arranged along an axis of
complexity)
Earlier, it was not reliable to estimate the number of genes in an organism based on the
number of nucleotide base pairs because of the
presence of high numbers of redundant copies of many genes. Genomics has helped to
rectify this problem. For example, it is now known that a human being has about
30,000 genes and not 100,000, as estimated earlier. Genomics uses the techniques of
molecular biology and bioinformatics to identify cellular components such as proteins,
rRNA, tRNA, etc., and analyse the sequences attributed to the structural genes,
regulatory sequences, and even non-coding sequences. Genomics is closely related to,
and sometimes considered a branch of, genetics, the study of genes and heredity.
The first automatic DNA sequencer was developed in 1986 by Leroy Hood. This
paved the way for the official beginning of the HGP in 1990, which gave a boost to
genomics. A large number of bacterial genomes have already been fully sequenced and
put in the public domain. Haemophilus influenzae was the first bacterium to be
sequenced, in 1995. The sequencing of bacterial genomes was followed by the first
sequenced eukaryotic organism, the unicellular genetic model system Saccharomyces
cerevisiae (commonly known as baker’s yeast). In December 1998, the first
multicellular organism was added to the list, the nematode Caenorhabditis elegans,
which is now considered as a model organism to provide us with information about
unique functions in organisms of greater complexity. The sum of all this information
is enormous and its potential in our understanding of life processes can be explored
with the help of genomics, almost synonymous with bioinformatics.
Even if one can identify all the genes on a genome, a gene only indicates that, at
some point in time, it might be transcribed to produce active cellular components. It
contains no time-specific information on when and under what condition the gene will
be expressed. For example, a human genome contains about 30,000 to 60,000
protein-coding genes, but only a subset of them is expressed in a particular cell type
at a particular time. However, the state of the cell at time t depends much more on
those genes expressed at time t than the silent ones. Genomics leads to certain
developments that facilitate the generation of time-specific gene expression data.
Transcriptomics
Transcriptomics is the study of the transcriptome, which includes the whole set of
mRNA molecules (or transcripts) in one or a population of biological cells for a given
set of environmental circumstances. This study helps us to depict the expression level
of genes, often using techniques such as DNA microarrays, which are capable of sampling
tens of thousands of different mRNAs at a time. This kind of new technique has
helped biologists to routinely monitor the gene expression of cells over time or to
compare gene expression between the control cells and treatment cells.
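A comparison between control and treatment cells is often summarized as a log2 fold change per gene. The sketch below shows one simple way to do this, assuming already normalized expression values. The gene names, the numbers, the pseudocount, and the threshold of one log2 unit are all hypothetical choices made for illustration.

```python
import math

# Hypothetical, already normalized expression values for three genes.
control = {"geneA": 120.0, "geneB": 45.0, "geneC": 300.0}
treated = {"geneA": 480.0, "geneB": 44.0, "geneC": 75.0}


def log2_fold_change(treated_value, control_value, pseudocount=1.0):
    """Log2 ratio of treated to control; the pseudocount avoids division by zero."""
    return math.log2((treated_value + pseudocount) / (control_value + pseudocount))


def classify(change, threshold=1.0):
    """Label a gene as up, down, or unchanged (threshold is an assumption)."""
    if change >= threshold:
        return "up"
    if change <= -threshold:
        return "down"
    return "unchanged"


calls = {
    gene: classify(log2_fold_change(treated[gene], control[gene]))
    for gene in control
}
print(calls)  # {'geneA': 'up', 'geneB': 'unchanged', 'geneC': 'down'}
```

A real analysis would also need replicates and a statistical test; the sketch only shows the direction of change.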
Transcriptomics has a few limitations. The relative abundance of transcripts as
characterized by the serial analysis of gene expression (SAGE) or microarray
experiments is not always a good predictor of the relative abundance of proteins. This
is because of the following:
1. Differential adaptation to the translational machinery.
2. Differential usage of amino acids of different abundances.
3. The lack of information on post-translational modification of amino acid
residues, although post-translational modifications such as acetylation,
hydroxylation, glycosylation, phosphorylation, and cleavage are fundamental in
understanding the interactions of cellular components.
Proteomics
Proteomics represents the earliest attempt to identify a major sub-class of cellular
components (the proteins) and their interactions. Proteomics involves the
sequencing of amino acids in a protein, determining its 3D structure and relating it
to the function of the protein. Before computer processing comes into the picture,
extensive data, particularly through crystallography and nuclear magnetic resonance
(NMR), is required for this kind of a study. With such data on known proteins, the
structure and its relationship to the function of newly discovered proteins can soon be
understood. In such areas, bioinformatics has enormous analytical and predictive
potential. Metabolic proteins such as haemoglobin and insulin have been subjected to
intensive proteomic investigation. Proteomics also focuses on identifying when and where the
proteins are expressed in a cell so as to establish their physiological roles in an
organism.
The term ‘proteomics’ was coined to make an analogy with genomics, and while it is
often viewed as the ‘next step’, proteomics is much more complicated than genomics.
Most importantly, while the genome is rather a constant entity, the proteome differs
from cell to cell and is constantly changing through its biochemical interactions with the
genome and the environment. A single organism has radically different protein
expressions in different parts of its body, in different stages of its life cycle and in
different environmental conditions. The complete set of proteins existing in an
organism throughout its life cycle or, on a smaller scale, the set of proteins found in a
particular cell type under a particular type of stimulation, is referred to as the proteome
of the organism or cell type, respectively.
Scientists feel that the bioinformatics of proteins is crucial since characterizing
thousands of proteins and their interactions is a difficult task.
To understand the cellular components and their interactions completely, one needs
integrated analyses of proteomic, genomic, and transcriptomic data, and a one-word
solution for all this is bioinformatics.
Apart from these three main branches, there are a few other accessory branches of
bioinformatics, which we shall now discuss.
Systems Biology
As there has been an explosion of biological data because of the HGP and the
sequencing projects that followed it, the significant job of extracting relevant
information out of this plethora of data is taken up by bioinformatics. This is
important for building up meaningful models that fit well with the biological systems.
Systems biology is the predictive branch of bioinformatics which, with the aid of
mathematical modelling, simulation, and data analysis, generates predictive models
from such experimentally generated biological data.
In simpler words, systems biology deals with the system-level understanding of
biological systems. Unlike molecular biology, which focuses on molecules such as
sequences of nucleic acids and proteins, systems biology focuses on interactions
between the various components of a biological system, and how these interactions give
rise to the function and behaviour of that system (e.g., the enzymes and metabolites in a
metabolic pathway).
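As a toy illustration of such modelling, the sketch below simulates a single enzyme-catalysed step in which a substrate S is converted into a product P. It assumes Michaelis-Menten kinetics and a simple fixed-step (Euler) integration. Neither is prescribed by the text, and the parameter values are hypothetical.

```python
def simulate(s0, vmax, km, dt, steps):
    """Euler simulation of S -> P with rate vmax * S / (km + S).

    Returns a list of (time, substrate, product) tuples.
    """
    substrate, product = s0, 0.0
    trajectory = [(0.0, substrate, product)]
    for step in range(1, steps + 1):
        rate = vmax * substrate / (km + substrate)
        converted = min(rate * dt, substrate)  # never convert more than is left
        substrate -= converted
        product += converted
        trajectory.append((step * dt, substrate, product))
    return trajectory


# Hypothetical parameters: 10 units of substrate, Vmax = 1.0, Km = 2.0.
for time, substrate, product in simulate(10.0, 1.0, 2.0, 0.1, 50)[::10]:
    print(f"t={time:4.1f}  S={substrate:6.3f}  P={product:6.3f}")
```

Even this one-reaction model shows the system-level view: the behaviour over time comes from the interaction between enzyme capacity and substrate level, not from either quantity alone.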
mass inoculation. It is now possible to identify the genes responsible for pathogenesis
in the genomes of parasites and to produce DNA vaccines based on this information.
The area concerned with genes responsible for the production of pharmaceutically
important compounds is sometimes distinguished as pharmacogenomics.
Metabolomics
Metabolomics is the systematic study of the unique chemical fingerprints that specific
cellular processes leave behind; specifically, it is the study of their small-molecule
metabolite profiles. A primary metabolite is directly involved in normal growth,
development, and reproduction. A secondary metabolite is not directly involved
in those processes, but usually has an important ecological function. Examples of
secondary metabolites include antibiotics and pigments. In a biological organism,
the metabolome represents the collection of all metabolites, which are the end
products of its gene expression. Thus, while mRNA gene expression data and
proteomic analysis do not tell the whole story of what might be happening in a cell,
metabolic profiling can give an instantaneous ‘snapshot’ of the physiology of
that cell.
The word metabonomics is also used, particularly in the context of drug toxicity
assessment. There is some disagreement over the exact differences between
‘metabolomics’ and ‘metabonomics’. In general, the term ‘metabolomics’ is more commonly
used.
Structural Genomics
Structural genomics is aimed at determining the 3D structures of gene products in an
efficient and high-throughput mode. Structural biology deals with the thorough
understanding of the structure and function of one, or maybe a few, proteins, whereas
structural genomics focuses on determining the structures of large numbers of proteins
or other macromolecules without prior regard to function. Structural genomics efforts
are producing a wealth of experimental data from NMR studies that are linked to
high-quality 3D structures of proteins. When the focus is on proteins, this effort may
be called structural proteomics.
Nutritional Genomics
A rapidly emerging area, nutritional genomics is the study and manipulation of genes
responsible for the synthesis of nutritionally important enzymes or other molecules,
often involving entire biosynthetic pathways. This will pave the way for inserting these
genes into crop plants to enrich them in special ways. The first example of such a
biofortified plant is Golden Rice, in which the biosynthetic machinery for β-carotene
(pro-vitamin A) is introduced into the rice genome (Oryza sativa) to express a new
feature in the rice grain. The genomes of the gene donors for Golden Rice, the daffodil
(Narcissus pseudonarcissus) and the bacterium Erwinia uredovora, have not been
worked out. Nor was the genome of the rice plant available till the first successful
product was generated.
Cheminformatics
Drug design through bioinformatics is one of the most actively pursued areas of
research. Since the majority of drugs are low molecular weight (LMW) compounds
and many of them are primarily derived from biological sources, there has always been
a great interest in the study of LMW compounds of biological origin.
Cheminformatics (or chemoinformatics) deals with products of secondary metabolism,
which are often referred to as natural products. The physico-chemical properties and chemical
structures for over 100,000 natural products are available in different databases. For
most of them, the biological role in the organisms in which they are synthesized is
not known, but they have some kind of bioactivity against others. This bioactivity
can be turned into an advantage for therapeutic purposes, with the expertise of a
pharmacologist. Cheminformatics involves organizing chemical data in a logical form
to facilitate the process of understanding chemical properties and their relationship to
structures, and making inferences. It also helps us to assess the properties of new
compounds by comparing them with known compounds.
Glycomics
Glycomics deals with the application of bioinformatic procedures to carbohydrate
research. It is regarded as an emerging field of bioinformatics.
Molecular Phylogeny
Phylogeny is the study of the origin and evolution of organisms. It has been estimated
that four million species of organisms exist on earth, but not even a quarter of this number is
currently known to science. So it is necessary to classify and name them properly. This
would be very useful to understand the genetic and evolutionary relationships of
organisms so that they may be used in a profitable manner in biotechnology and
elsewhere. Biologists have constructed elegant systems of classification for the known
organisms, though problems persist. All this commendable work, with over three
centuries of history, was done using externally visible structural, chemical, or
functional attributes of organisms. This constitutes the field of taxonomy, which is
called systematics when the theory of organic evolution is applied to it.
With the advancements in molecular biology, biologists have used data from the
genetic material to characterize organisms and to verify their classification and
relationships, inferred on the basis of other evidence. Since it is impractical to use
entire genomes for this purpose, nucleotide sequences of genes in the genomes from
the mitochondria and chloroplasts are used. These nucleotide sequences are compared
using complex computer software. Extensive work was carried out this way,
comparing a large number of organisms. A number of systematists would benefit if
bioinformatics professionals provided them with computer-based services to analyse
their systematic data.
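The simplest sequence comparison used in molecular phylogeny is the proportion of sites at which two aligned sequences differ (the p-distance). The sketch below computes it for every pair in a small set of aligned sequences. The species labels and sequences are invented for illustration, and real analyses use longer alignments and more elaborate substitution models.

```python
from itertools import combinations

# Hypothetical, pre-aligned sequences of equal length.
aligned = {
    "species_A": "ATGCCGTAAC",
    "species_B": "ATGCCGTTAC",
    "species_C": "ATGACGTTGC",
}


def p_distance(first, second):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(first) != len(second):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for x, y in zip(first, second) if x != y)
    return differences / len(first)


distances = {
    (a, b): p_distance(aligned[a], aligned[b])
    for a, b in combinations(aligned, 2)
}
for pair, distance in distances.items():
    print(pair, distance)

# The pair with the smallest distance is the most similar one.
closest = min(distances, key=distances.get)
print(closest)  # ('species_A', 'species_B')
```

A distance table of this kind is the usual starting point for distance-based tree-building methods.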
AIM OF BIOINFORMATICS
We have seen the various important ways in which bioinformatics can be used. The
aim of bioinformatics is fourfold and includes data acquisition, tool and database
development, data analysis, and data integration.
Data Acquisition
Data acquisition is primarily concerned with accessing and storing data generated
directly from the biological experiments. The data generated by various sequencing
projects have to be retrieved in the appropriate format, and be capable of being
linked to all the information related to the DNA samples, such as the species, tissue
type, and quality parameters used in the experiments. The data are organized in
different databases so that the researchers can access existing information and
submit new entries as and when they are produced. Examples of such databases are the
Entrez Genome of NCBI (for genome data) and the Protein Data Bank (for 3D
macromolecular structures data). The information stored in these databases is useless
until it is analysed. Thus, the purpose of bioinformatics extends much further.
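One way to keep a sequence linked to its sample information is to carry that information in the record itself. The sketch below parses FASTA-style text into small records. The header layout (fields separated by `|` and written as `key=value`) is a hypothetical convention chosen for this example, not a format defined by any particular database.

```python
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    identifier: str
    species: str
    tissue: str
    sequence: str


def build_record(header, chunks):
    """Split a header such as 'id|species=...|tissue=...' into its fields."""
    identifier, *fields = header.split("|")
    metadata = dict(field.split("=", 1) for field in fields)
    return SequenceRecord(
        identifier=identifier.strip(),
        species=metadata.get("species", "unknown"),
        tissue=metadata.get("tissue", "unknown"),
        sequence="".join(chunks),
    )


def parse_fasta(text):
    """Parse FASTA-style text: '>' starts a record, other lines are sequence."""
    records, header, chunks = [], None, []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append(build_record(header, chunks))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append(build_record(header, chunks))
    return records


example = """>demo1|species=Homo sapiens|tissue=lung
ATGGCC
TTAGGA
>demo2|species=Oryza sativa|tissue=leaf
ATGAAA
"""
for record in parse_fasta(example):
    print(record.identifier, record.species, record.tissue, len(record.sequence))
```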
Tool and Database Development
Many laboratories generate large volumes of data such as DNA sequences, gene
expression information, 3D molecular structure, and high-throughput screening.
Consequently, they must develop effective databases for storing and quickly accessing
data. The other aim is to develop tools and resources that aid in the analysis of data.
For example, having sequenced a particular protein, it is of interest to compare it with
previously characterized sequences. Programs such as FASTA and PSI-BLAST must
consider what comprises a biologically significant match. The development of such
resources requires expertise in computational theory along with a thorough
understanding of biology.
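To give a feel for how a match between two sequences can be scored, the sketch below computes a Smith-Waterman local alignment score. This is not the method used by FASTA or PSI-BLAST, which rely on faster heuristics and statistical significance estimates. It is the classical dynamic-programming approach, shown here with arbitrary scoring values (match +2, mismatch -1, gap -2) that are assumptions of this example.

```python
def local_alignment_score(first, second, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between two sequences."""
    rows, cols = len(first) + 1, len(second) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if first[i - 1] == second[j - 1] else mismatch
            score[i][j] = max(
                0,                           # start a new local alignment
                score[i - 1][j - 1] + pair,  # align the two residues
                score[i - 1][j] + gap,       # gap in the second sequence
                score[i][j - 1] + gap,       # gap in the first sequence
            )
            best = max(best, score[i][j])
    return best


print(local_alignment_score("ACGT", "ACGT"))      # 8
print(local_alignment_score("GGACGTGG", "ACGT"))  # 8
print(local_alignment_score("AAAA", "TTTT"))      # 0
```

A raw score alone does not say whether a match is biologically significant; that judgment also depends on the scoring scheme and on how often such a score would arise by chance.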
For each type of data, a different database organization may have to be used.
A database must be designed to allow efficient storage, search, and analysis of the data
it contains. Designing a high-quality database is complicated by the fact that there
are several formats for many types of data and a wide variety of ways in which
the scientists may want to use the data. Many of these databases are best built
using relational database architecture, usually based on Oracle or Sybase. A strong
background in relational databases is a fundamental requirement for working in
database development. Having some background in molecular biology techniques
used to generate the data is also important. Most critical for the bioinformatics
specialist is to have a strong working relationship with the researchers who will be
using the database and the ability to understand and interpret their needs into
functional database capabilities.
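A relational design can be sketched with Python's built-in sqlite3 module. SQLite is used here only because it needs no server; it stands in for systems such as Oracle or Sybase. The table names, column names, and demo accessions are hypothetical.

```python
import sqlite3

# In-memory database for illustration; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE organism (
        organism_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE
    );
    CREATE TABLE sequence (
        sequence_id INTEGER PRIMARY KEY,
        organism_id INTEGER NOT NULL REFERENCES organism(organism_id),
        accession   TEXT NOT NULL UNIQUE,
        residues    TEXT NOT NULL
    );
    -- An index keeps per-organism lookups fast as the table grows.
    CREATE INDEX idx_sequence_organism ON sequence(organism_id);
    """
)

conn.execute(
    "INSERT INTO organism (organism_id, name) VALUES (?, ?)",
    (1, "Saccharomyces cerevisiae"),
)
conn.executemany(
    "INSERT INTO sequence (organism_id, accession, residues) VALUES (?, ?, ?)",
    [(1, "DEMO0001", "ATGGCC"), (1, "DEMO0002", "ATGTTT")],
)

# A join answers a typical query: which sequences belong to which organism?
rows = conn.execute(
    """
    SELECT s.accession, o.name
    FROM sequence AS s
    JOIN organism AS o ON o.organism_id = s.organism_id
    ORDER BY s.accession
    """
).fetchall()
print(rows)
```

Keeping organisms and sequences in separate tables avoids repeating the organism name in every sequence row, which is the kind of design decision the paragraph above refers to.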
Data Analysis
The third aim is to use these tools to analyse the data and interpret the results in a
biologically meaningful manner. Traditionally, biological studies examined individual
systems in detail, and compared those with a few related systems. In bioinformatics,
we can now conduct a global analysis of all the available data with the aim of
unveiling common principles that apply across many systems and highlight novel
features. Efficient analysis requires an efficiently designed database. It must allow
researchers to place their query effectively and provide them with all the information
they need to begin their data analysis. If queries cannot be performed, or if the
performance is too slow, the whole system breaks down since scientists will not be
inclined to use the database.
Once data are obtained from the database, the user must be able to easily transform
them into the format appropriate for the desired analysis tools. This can be challenging,
since researchers often use a combination of publicly available tools, tools developed
in-house, and third-party commercial tools. Each tool may have different input and
output formats. Starting in the late 1990s, there have been both commercial and in-
house efforts at pharmaceutical and biotech companies aimed at reducing the
formatting complexities. Such simplification efforts focus on building analytical
systems with a number of tools integrated within them such that the transfer of data
between tools appears seamless to the end user.
Bioinformatics analysts have a broad range of opportunities. They may write
specific algorithms to analyse data or they may be expert users of analysis tools,
helping scientists to understand how the tools analyse the data and how to interpret
the results. Knowledge of various programming languages such as Java, Perl, C,
C++, and Visual Basic is useful, if not mandatory, for those working in this area.
Data Integration
Once information has been analysed, a researcher must often associate or integrate it
with the related data from other databases. For example, a scientist may run a series of
gene expression analysis experiments and observe that a particular set of 100 genes is
more highly expressed in a cancerous lung tissue than in a normal lung tissue. The
scientist may wonder which of the genes is most likely to be truly related to the disease.
To answer the question, the researcher needs to find out more information about those
100 genes, including any associated gene sequence, protein, enzyme, disease, metabolic
pathways, or signal transduction pathway data. Such information helps the researcher
to narrow down the list to a smaller set of genes. Doing this research requires
connections or links between the different databases and a good way to present and
store the information. An understanding of database architectures and the
relationship between the various biological concepts in the databases is the key to
effective data integration.
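The narrowing-down step in this example can be sketched as a join between a gene list and annotations drawn from separate sources. The gene names, the two small annotation tables, and the shortlist rule are hypothetical; in practice each table would come from a different database.

```python
# Genes found to be more highly expressed in the diseased tissue (hypothetical).
expressed_genes = ["GENE1", "GENE2", "GENE3", "GENE4"]

# Annotations from two separate, hypothetical sources.
pathway_db = {
    "GENE1": ["cell cycle"],
    "GENE3": ["signal transduction", "apoptosis"],
}
disease_db = {
    "GENE3": ["lung cancer"],
    "GENE4": ["asthma"],
}


def integrate(genes, pathways, diseases):
    """Attach pathway and disease annotations to each gene."""
    return [
        {
            "gene": gene,
            "pathways": pathways.get(gene, []),
            "diseases": diseases.get(gene, []),
        }
        for gene in genes
    ]


def shortlist(records, disease_term):
    """Keep genes linked to the disease that also have a known pathway."""
    return [
        record["gene"]
        for record in records
        if disease_term in record["diseases"] and record["pathways"]
    ]


records = integrate(expressed_genes, pathway_db, disease_db)
print(shortlist(records, "lung cancer"))  # ['GENE3']
```

The gene identifier acts as the link between the sources, which is why consistent identifiers across databases matter so much for integration.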
bioinformatics and an important part of the final goal of biomedical science, in general,
the complete molecular understanding of a living organism.
Database Building and Management
Whatever type of information is being generated, analysed, and finally interpreted, the
data have to be presented to the scientific community on the Internet. The
presentation of these data is challenging; the problems that arise extend from
the formalism of data submission to the intelligent and clear ways of presentation.
Database management is thus not only an engineering problem, but also a scientific
challenge.
Clinical Applications
The clinical applications of bioinformatics can be viewed in the immediate, short, and
long-term spans. The HGP for sequencing human chromosomes was completed in
2003, producing a database of all the variations in sequences that distinguish us all.
The project could have considerable impact on people living in 2020, e.g., a complete
list of human gene products providing new drugs and gene therapy for single gene
diseases (http://www.ornl.gov/hgmis/medicine/tnty.html).
Basic bioinformatic tools are already accessed in certain clinical situations to aid in
diagnosis and treatment plans. For example, PubMed
(http://www.ncbi.nlm.nih.gov/pubmed) is accessed freely for biomedical journals cited
in Medline, and Online Mendelian Inheritance in Man (OMIM at
http://www3.ncbi.nlm.nih.gov/Omim/), a search tool for human genes and genetic
disorders, is used by clinicians to obtain
information on genetic disorders in the clinic or hospital setting.
An example of the application of bioinformatics in new therapeutic advances is the
development of new, designer targeted drugs such as Imatinib mesylate (Gleevec),
which interferes with the abnormal protein made in chronic myeloid leukaemia
(Imatinib mesylate was synthesized at Novartis Pharmaceuticals by identifying a lead
in a high throughput screen for tyrosine kinase inhibitors and optimizing its activity
for the specific kinases). The ability to identify and target specific genetic markers by
using bioinformatic tools facilitated the discovery of this drug. In the long term,
integrative bioinformatic analysis of genomic, pathological, and clinical data in
clinical trials will reveal potential adverse drug reactions in individuals by use of
simple genetic tests. Ultimately, pharmacogenomics (using genetic information to
individualize drug treatment) is likely to bring about a new age of personalized
medicine. Patients will carry gene cards with their own unique genetic profile for
certain drugs aimed at individualized therapy and targeted medicine free from side
effects.
network is today recognized as one of the major scientific networks in the world
dedicated to providing state-of-the-art infrastructure, education, manpower, and tools
in bioinformatics.
The principal aim of the bioinformatics programme is to ensure that India emerges
as a key international player in the field of bioinformatics. The following are the major
thrusts of the programme:
1. To undertake advanced research in frontier areas of bioinformatics and
computational biology.
2. To develop world-class human resources.
3. To establish an effective academia-industry interface.
4. To pursue and promote international cooperation with leading institutions,
organizations, and countries in the world.
5. To create world-class platforms for technology development, transfer, and
commercialization.
Training Activities on Bioinformatics
Short-term and long-term training courses in bioinformatics for scientists from
different disciplines in biology, statistics, and computer science are important and,
over the years, have been found highly useful. These activities, therefore, will be
intensified. Experts from other countries will be used as resource persons along with
Indian experts. The knowledge base will be upgraded and the knowledge of experts
converged from different disciplines to bioinformatics. However, training for proper
manpower at research level alone is not sufficient. Considering the importance of the
subject, some institutions and university departments have introduced bioinformatics
courses at different levels.
Research and Development
Different institutes of repute throughout India have set up the infrastructure
to pursue research activities in bioinformatics and biotechnology. Not only are
the biological sectors of the institutes undergoing such activities, this
interdisciplinary field (see Figure 1.5) also harnesses multi-disciplinary talents.
Hence, chemists, physicists, mathematicians, statisticians, and computer scientists
are coming up to work hand in hand with experimental biologists. This is essential for
managing data in modern biology and medicine. Table 1.2 lists some of the well-known
institutes pursuing research in this field.
Figure 1.5 The interdisciplinary fields involved in bioinformatics (biology, physics,
mathematics, chemistry, statistics, and computer science, centred on bioinformatics)
The major funding agencies of the Government of India that support bioinformatics
initiatives in various states are listed in Table 1.3.
Table 1.3 Major funding agencies of the Government of India that support bioinformatics
Funding agency                                             Web interface
Department of Science and Technology (DST)                 http://www.dst.gov.in
Department of Biotechnology (DBT)                          http://www.dbtindia.nic.in
Indian Council of Agricultural Research (ICAR)             http://www.icar.org.in
Indian Council of Medical Research (ICMR)                  http://www.icmr.nic.in/
Council of Scientific and Industrial Research (CSIR)       http://www.csirhrdg.res.in
University Grants Commission (UGC)                         http://www.ugc.ac.in
Department of Scientific and Industrial Research (DSIR)    http://www.dsir.nic.in/
Defence Research and Development Organisation (DRDO)       http://www.drdo.org
A view of the past, present, and future of bioinformatics is presented in Figure 1.6.
This is a critical juncture for bioinformatics in India. The Indian bio-companies are
presently the heart of the industry and growing fast. However, their performance
depends on how well their research and development wings are equipped. The main
shortcoming today is the deficit of trained professionals in this field. Once well-trained
professionals put in sincere effort, the floodgates of international business will open
for India and bioinformatics will really begin to flourish. This milestone is yet to be
crossed.
Figure 1.6 (surviving panel text) Bioinformatics in the future: a complete computer
representation of the cell and the organism; understanding basic principles of the
higher complexity of biological systems
SUMMARY
Bioinformatics has created lots of fervour in various spheres starting from the academic
sectors to the industrial as well as the clinical sectors. Its growing demand has
emphasized the need to sow its seeds among the future students. Hence, it has become
absolutely essential that they gather a proper idea about ‘what is bioinformatics?’ This
section essentially conveys this answer along with an overview of the subject and its
branches. Newcomers in this field may feel the need to get motivated while studying the
subject. Such a purpose is also fulfilled in this part along with an idea of the present
employment scenario and opportunities in India, which shall be beneficial to the budding
professionals.
REVIEW QUESTIONS