Lecture 1: Bioinformatics Technologies
Lokeswari Y Venkataramana
SSN College of Engineering
Dept. of CSE
Overview
• What is Bioinformatics
• DNA, RNA
• Genes
• Amino Acids
• Protein
• Applications of Bioinformatics
• Need for Bioinformatics Technologies
• Databases available for Research
• Overview of Bioinformatics Technologies
Bioinformatics lies at the intersection of computer science, information technology, and biology. Biological experiments generate large amounts of data.
[Figure: computing, mathematics and statistics, IT/physics, and the life sciences converge on bioinformatics, with next-generation sequencing in the middle]
What is Bioinformatics?
[Figure: Biological Information. Labels: DNA, RNA, Protein, Phenotype; Individuals, Populations; Selection, Evolution. © Doug Brutlag 2015]
Central Paradigm of Molecular Biology
[Figure: GENOME (DNA, with replication) -> transcription / reverse transcription -> MESSENGER (RNA), e.g. UACGUUCAGGUGACAUAAGGG -> translation -> PROTEIN (PROTEOME). Side labels: protein-gene interactions, protein-protein interactions]
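[Figure: levels of biological substructure, from largest to smallest: Species, Organism, Cell, Nucleus, Chromosome, DNA strand, Gene, Base (mnemonic, smallest to largest: BGDCN). Arrow labels: a gene prescribes an amino acid sequence, which folds into a protein; the protein affects the function of the cell; the cell affects the behaviour of the organism]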
Cells
⚫ TCGGTGAATCTGTTTGAT
Transcribed to:
⚫ AGCCACUUAGACAAACUA
Translated to:
⚫ SHLDKL
Proteins
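The transcription and translation steps above can be reproduced in a few lines. This is a minimal sketch: the function names are illustrative, the DNA string is assumed to be the template strand (so the mRNA is its base-by-base complement), and the codon table holds only the six codons this example needs rather than all 64.

```python
# Complement rule used when mRNA is built from a DNA template strand.
DNA_TO_MRNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Partial codon table: only the codons that occur in this example.
CODON_TABLE = {
    "AGC": "S", "CAC": "H", "UUA": "L",
    "GAC": "D", "AAA": "K", "CUA": "L",
}


def transcribe(template_dna: str) -> str:
    """Return the mRNA complementary to a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in template_dna)


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, ignoring any trailing partial codon."""
    usable = len(mrna) - len(mrna) % 3
    codons = (mrna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


mrna = transcribe("TCGGTGAATCTGTTTGAT")
protein = translate(mrna)
print(mrna)     # AGCCACUUAGACAAACUA
print(protein)  # SHLDKL
```

A codon missing from the partial table raises a KeyError, which is deliberate here: it makes clear that a real tool would need the complete genetic code.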
⚫ Evolution of species
– Caused by reproduction and survival of the fittest
⚫ But actually, it is the genotype which evolves
– Organism has to live with it (or die before reproduction)
– Three mechanisms: inheritance, mutation and crossover
⚫ Inheritance: properties from parents
– A human embryo's cells have 23 pairs of chromosomes
– Each pair: 1 chromosome from father, 1 from mother
– Most important factor in offspring’s genetic makeup
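Two of the mechanisms listed above, mutation and crossover, can be illustrated on plain strings. This is a toy sketch only: the function names and the two parent sequences are made up for illustration, and real recombination and mutation are far more complex than a single cut point or a single substituted base.

```python
def point_mutation(seq: str, position: int, new_base: str) -> str:
    """Return a copy of seq with the base at `position` replaced."""
    return seq[:position] + new_base + seq[position + 1:]


def crossover(maternal: str, paternal: str, point: int) -> str:
    """Single-point crossover: maternal bases before `point`, paternal after."""
    return maternal[:point] + paternal[point:]


# Made-up parent sequences, chosen so the origin of each base is visible.
maternal = "AAAAAAAA"
paternal = "CCCCCCCC"

child = crossover(maternal, paternal, 3)
mutant = point_mutation(child, 0, "G")
print(child)   # AAACCCCC
print(mutant)  # GAACCCCC
```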
Evolution of Genes: Mutation
Related Field:
Medical Informatics
Related Field:
Cheminformatics
Related Field:
Genomics
Related Field:
Proteomics
Related Field:
Pharmacogenomics
Related Field:
Pharmacogenetics
APPLICATIONS
Pharmaceutical:
• Biobanks
➢ A biobank is a type of biorepository that stores human biological samples (such as tissue, sometimes with related data such as genetic profiles) for use in research.
➢ Since the late 1990s, biobanks have been a key resource for many types of contemporary research, e.g. genomics and personalized medicine.
➢ Drivers: the need to accelerate the discovery and development of drugs. As the pharmaceutical industry shifts to more personalised medicine, the need for high-quality, well-maintained biospecimens intensifies.
Medical Implications:
• Pharmacogenomics
– Not all drugs work on all patients; some otherwise good drugs cause death in some patients
– A gene analysis before treatment makes it possible to avoid drugs that would harm a particular patient
– Drugs that are fatal to most patients can still be used on the minority whose genes suit that drug (volunteers wanted!)
– Customized treatment
• Gene Therapy
– Replace or supply the defective or missing gene
– E.g. the insulin gene (diabetes) and the Factor VIII gene (haemophilia)
Diagnosis of Disease:
❑ Identifying the genes that cause a disease helps detect it at an early stage, e.g. Huntington disease
❑ Symptoms: uncontrollable dance-like movements, mental disturbance, personality changes and intellectual impairment
❑ Death in 10-15 years
❑ The gene responsible for the disease has been identified
❑ It contains an excessive number of repeats of the triplet CAG
❑ Once their genes have been analyzed, a couple can be counseled
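Detecting the repeated CAG sections mentioned above is a pattern-matching task. The sketch below counts the longest uninterrupted run of CAG triplets in a sequence; the function name and the sample string are hypothetical, and no clinical threshold is applied.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Return the number of CAG triplets in the longest uninterrupted run."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Made-up sequence: one run of 4 repeats and one run of 2 repeats.
sample = "TTCAGCAGCAGCAGAACAGCAGTT"
print(longest_cag_run(sample))  # 4
```

A regular expression is used here because the repeat is a fixed short motif; a real analysis would also have to handle sequencing errors and interrupted repeats.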
Drug Design:
Drug Discovery:
Target identification
– Identify the molecule on which the germ relies for its survival
– Then develop another molecule, i.e. a drug, that binds to the target
– The germ can then no longer interact with the target
– Proteins are the most common targets
Drug Discovery:
• For example, HIV produces HIV protease, a protein that in turn cleaves other proteins
• HIV protease has an active site where it binds to other molecules
• An HIV drug therefore binds to that active site and blocks it
– Easier said than done!
Bioinformatics Application #1
Phylogenetic trees
⚫ Understand our evolution
⚫ Genes are homologous
– If they share a common ancestor
⚫ By looking at DNA sequences
– For particular genes
– See which species evolved from which
⚫ Example:
– Is the mammoth more closely related to
⚫ African or Indian elephants?
⚫ LUCA:
– Last Universal Common Ancestor
– Roughly 4 billion years ago
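Comparing DNA sequences to judge relatedness can be sketched with the simplest possible measure: the fraction of positions at which two equal-length, already aligned sequences differ. The species names and sequences below are made up for illustration and say nothing about mammoths or elephants; real phylogenetics uses proper alignments and evolutionary models.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


# Hypothetical sequences for the same gene in three species.
sequences = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTACGTTC",
    "species_C": "ACCTAGGTTC",
}

# Distance for every unordered pair of species.
distances = {
    (m, n): p_distance(sequences[m], sequences[n])
    for m, n in combinations(sequences, 2)
}

closest_pair = min(distances, key=distances.get)
print(closest_pair)  # ('species_A', 'species_B')
```

The pair with the smallest distance would be grouped first when building a tree by a distance-based method.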
Bioinformatics Application #2
Predicting Protein Structure
[Figure: plot labelled "Protein structure"; x-axis: Year (ticks 85, 90, 95, 00); y-axis ticks: 0, 100000, 200000, 300000]
Database Approaches
⚫ Systems biology
– Putting it all together: “E-cell” and “E-organism”
– In-silico modelling of biological entities and processes
Need for Bioinformatics technologies
• In recent years, bioinformatics has attracted a great deal of attention from various disciplines, such as information technology, mathematics, and non-traditional biological sciences.
• This is due to the availability of enormous amounts of public and private biological data and the compelling need to transform that data into useful information and knowledge.
• Bioinformatics can therefore be considered to be the combination of several scientific disciplines that include biology,
biochemistry, mathematics, and computer science.
• It involves the use of computer technologies and statistical methods to manage and analyze a huge volume of biological data
about DNA, RNA, and protein sequences, protein structures, gene expression profiles, and protein interactions.
• The transformation of voluminous biological data into useful information and valuable knowledge is the challenge of
knowledge discovery.
• Identifying and interpreting interesting patterns hidden in trillions of items of genetic and other biological data is a critical goal of bioinformatics.
Bioinformatics technologies (mnemonic: DDMMNPS)
• Data Mining
• Database Technology
• Machine Learning
• Modeling and Visualization
• Network & Tools
• Pattern Matching
• Structure and Process
An Overview of Bioinformatics Technologies
• The existing research in bioinformatics is related to knowledge discovery, sequence analysis,
structure analysis, and expression analysis.
• Sequence analysis is the discovery of functional and structural similarities and differences
between multiple biological sequences.
• This can be done by comparing the new (unknown) sequence with well-studied and annotated
(known) sequences.
• If two similar sequences are from different organisms, they are said to be homologous sequences.
• 1. One proposed method for sequence comparison is sequence alignment.
• It is a procedure for base-by-base comparison of two (pairwise) or more (multiple) sequences
by searching for a series of individual characters or character patterns that are in the same
order in the sequences. To search for an identical character or character patterns, the string
matching technique is widely used.
• 2. Gene prediction is the process of detecting meaningful signals in uncharacterized DNA
sequences.
• Gene prediction uses homology search to acquire knowledge of the interesting information in DNA.
Summary notes:
1. Sequence analysis finds structural and functional similarities and differences between multiple biological sequences.
2. It compares unknown sequences with known ones.
3. If two similar sequences come from different organisms, they are homologous sequences.
4. Methods:
i. Sequence alignment: base-by-base comparison of sequences (character by character, or by patterns of characters in the same order), using string matching.
ii. Gene prediction: finding meaningful signals in uncharacterized DNA sequences, using homology search.
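Pairwise sequence alignment as described above can be sketched with dynamic programming. The example below computes a global alignment score in the style of the Needleman-Wunsch algorithm, which is one common choice and is not named in the slides; the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Best global alignment score of a and b under a simple scoring scheme."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))  # 2  (ACGT over A-GT)
```

Only the score is returned here; recovering the aligned strings themselves would require a traceback through the same table.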
• Structure analysis is the study of proteins and their interactions. Proteins are complex
biological molecules composed of a chain of units, called amino acids, in a specific
order.
• They are large molecules required for the structure, function, and regulation of the body’s
cells, tissues, and organs.
• Each protein has unique functions.
• The understanding of protein structures and their functions leads to new approaches
for diagnosis and treatment of diseases, and the discovery of new drugs.