Qualitative Analysis of Biomolecules: 1. The Human Genome

Uploaded by

aysepolat7000

Qualitative analysis of biomolecules

In this exercise, you will answer questions and do an exercise in R to learn a few examples of
different methods of qualitative analysis of biomolecules.
Objective: The goal of these questions and the exercise is to familiarize you with DNA sequencing
methodology, as well as the basic principles underlying sequence alignment and assembly.

1. The human genome

Answer the following questions:
Question 1.1
What is genomics and how does it relate to genetics?
- Genetics studies individual genes and how traits are passed on, while genomics investigates
the whole genome at once. The two fields are closely related: genomics extends the questions
of genetics to the scale of the entire genome.

Question 1.2
What is coding and non-coding DNA? Which major classes of elements are found in each type of
DNA?
- Coding DNA is DNA sequence that codes for protein and is found in the exons of genes.
Non-coding DNA does not code for protein. It includes introns, which are transcribed to RNA
but not translated, since they are removed by the spliceosome, as well as other elements
such as regulatory sequences and repetitive DNA.

Question 1.3
What are the major types of variation in the genome between individuals?
- Single nucleotide polymorphisms (SNPs) are the most common type of variation between the
genomes of individuals. A SNP is a position where one nucleotide is exchanged for another;
SNPs occur roughly once every 300 nucleotides in the genome.

2. DNA sequencing
Answer the following questions:
Question 2.1
Briefly describe the principles of reversible-terminated sequencing. Include in your answer, what a
Phred score represents and define the following terms: Fragment, read length, single-end reads, and
paired-end reads.
- Reversible-terminator sequencing, the chemistry used in Illumina next-generation sequencing
(NGS), is a method in which DNA is not sequenced as a whole but as many small fragments. In
each cycle, a fluorescently labeled nucleotide carrying a reversible terminator is
incorporated, the signal is read, and the terminator is removed so that the next base can be
added. A fragment is one of the pieces the DNA is broken into; it is typically 400-1000
basepairs long and gives rise to one or two reads. The read length is the number of bases
actually sequenced from a fragment end, typically 50-200 basepairs, so it is shorter than
the fragment itself. Reads are either single-end or paired-end: a single-end read comes from
sequencing only one end of the fragment, whereas paired-end reads come from sequencing both
ends of the same fragment.
A Phred score is a logarithmic measure of base-calling accuracy in Illumina sequencing
(Illumina). The larger the Phred score, the better the quality of the sequence; a Phred
score above 20 is considered acceptable.
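
The relationship between a Phred score and the error probability can be sketched in a few lines of Python. This is only an illustration of the standard definition Q = -10 log10(P), where P is the probability that a base call is wrong; the function names are made up for this example and are not part of the exercise.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred score Q to the probability P that the base call is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert an error probability P back to a Phred score Q = -10 * log10(P)."""
    return -10 * math.log10(p)


# Print the error probability and accuracy for a few common thresholds.
for q in (10, 20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:.3f}, accuracy {100 * (1 - p):.1f}%")
```

A score of 20 therefore corresponds to one expected error in 100 base calls, which is why it is a common lower limit.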

Question 2.2
What is the name of the standard data format for DNA sequences? Provide an example of a
10-basepair long sequence in this format.
- The standard format for DNA sequences is the FASTA format, which is used to describe
nucleotide and peptide sequences. Each record has a header line starting with ">" followed
by the sequence itself.
>example_sequence_of_10_basepairs
ATGGCGTCCT
Read as codons, this sequence encodes Met (M), Ala (A) and Ser (S), followed by a single
thymine nucleotide.
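
As a rough illustration of how such a record can be read, the following Python sketch parses the FASTA example and splits the sequence into codons. The function name parse_fasta is hypothetical, and the parser only handles the simple case of header lines followed by sequence lines.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so append them to the current record.
            records[header] += line
    return records


fasta_text = """>example_sequence_of_10_basepairs
ATGGCGTCCT
"""

records = parse_fasta(fasta_text)
sequence = records["example_sequence_of_10_basepairs"]

# Split into complete codons; whatever remains is not a full codon.
complete = len(sequence) - len(sequence) % 3
codons = [sequence[i:i + 3] for i in range(0, complete, 3)]
leftover = sequence[complete:]

print(len(sequence), codons, leftover)
```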

Perform the exercise called “Sequence alignment and assembly in R” and answer the
following questions:

In the section: Performing automated global alignments using various gap penalty schemes
Question 3.1: Which gap penalty strategy do you think is most suitable, if we assume that the two
DNA sequences are from the open-reading frame of the same gene, but in two evolutionary distant
species?
- The first alignment in R uses a gap opening penalty of 0 and a gap extension penalty of 3.
With these settings the program is more prone to insert gaps than to accept mismatches.
This gives the highest score, -46, but it is not biologically favorable, because the gaps
shift the reading frame.

- The second pairwise alignment uses a gap opening penalty of 0 and a gap extension penalty
of 16, which results in individual small gaps and a score of -148.
- The third alignment uses a gap opening penalty of 8 and a gap extension penalty of 3, i.e.
an affine gap penalty. This makes it cheaper to extend an existing gap than to open a new
one and leads to a score of -94. Evolutionarily, this global alignment is the one that makes
the most sense, since mismatches are more likely than gaps.

- Normally the alignment with the highest score would be seen as the best answer, but here
we must also consider what is evolutionarily plausible. Therefore, the third alignment is
the best fit biologically.
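
The effect of the three penalty schemes can be made concrete with a small Python sketch. It assumes the convention that a gap of length n costs opening + n * extension, which is my understanding of how the R function scores gaps but should be checked against its documentation. The function name gap_cost is made up for this example.

```python
def gap_cost(length, opening, extension):
    """Cost of one gap of the given length under an affine gap penalty.

    Assumed convention: cost = opening + length * extension.
    """
    return opening + length * extension


# (opening, extension) pairs of the three alignments discussed above.
schemes = {
    "alignment 1": (0, 3),
    "alignment 2": (0, 16),
    "alignment 3": (8, 3),
}

for name, (opening, extension) in schemes.items():
    one_long_gap = gap_cost(3, opening, extension)
    three_short_gaps = 3 * gap_cost(1, opening, extension)
    print(f"{name}: one gap of 3 costs {one_long_gap}, "
          f"three gaps of 1 cost {three_short_gaps}")
```

With an opening penalty of 0, three separate single-base gaps cost the same as one gap of three bases, so nothing discourages scattered gaps. With an opening penalty of 8, one longer gap is clearly cheaper than several short ones.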

Question 3.2: Use the pairwiseAlignment function to perform local alignment using affine gaps
with opening penalty of 8 and extension penalty of 3. What is the alignment found by the local
alignment approach?
- The local alignment has a score of 14, which is much better than the score of -94 obtained
for the third (global) alignment with the same gap opening and extension penalties. This is
not surprising: because the alignment is performed locally, the program aligns only the
region where the two sequences have the most in common and is not forced to insert gaps in
order to align the two sequences over their full length.
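
The difference between global and local scoring can be sketched in Python. This is a simplified illustration, not the method used in the exercise: it uses a linear gap penalty instead of affine gaps, and the scoring values and the two test sequences are hypothetical.

```python
def alignment_score(a, b, match=1, mismatch=-3, gap=-3, local=False):
    """Score of the best global (Needleman-Wunsch) or local (Smith-Waterman)
    alignment of a and b, using a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are penalized.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                # Local alignment: a cell never drops below 0, so an
                # alignment can start fresh at any position.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


s_a = "TTTTACGTACGT"
s_b = "ACGTACGTCCCC"
global_score = alignment_score(s_a, s_b)
local_score = alignment_score(s_a, s_b, local=True)
print(global_score, local_score)
```

The local score only counts the shared region ACGTACGT, while the global score also has to pay for the unrelated ends of both sequences.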

In the section: Evaluating the significance of pairwise overlaps through shuffling the subject string
Question 3.3: Based on the histogram, do you think that there is more overlap between s1 and s2
than expected by random chance?
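- The histogram shows the alignment scores obtained after randomly shuffling the subject
string; these lie around -150 to -160. The real alignment score is -94, which is much higher
(less negative) than the scores of the shuffled alignments. There is therefore more overlap
between s1 and s2 than expected by random chance.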
Question 3.4: Do minor changes to the gap scoring strategy affect this conclusion?
- When the gap opening penalty is changed to 6 and the extension penalty is kept at 3, the
score is -86, which is not significantly different from the original alignment score of -94.

Question 3.5: What is the Z-score of the alignment and what does the Z-score indicate?
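- The Z-score of the real alignment, which has a score of -94, is calculated to be 4.93. The
Z-score expresses how many standard deviations the real score lies above the mean score of
the shuffled alignments. Since the Z-score is higher than 3, the similarity between the
sequences is unlikely to be due to chance. A Z-score lower than 3 would be compatible with
a chance similarity.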
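
The Z-score calculation can be sketched in Python as follows. The shuffled scores below are hypothetical placeholder values chosen inside the range seen in the histogram; they are not the values from the exercise, so the resulting Z-score differs from 4.93. The function names are made up for this example.

```python
import random
import statistics


def shuffle_sequence(seq, rng):
    """Return a random permutation of seq with the same base composition."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


def z_score(real_score, shuffled_scores):
    """Number of standard deviations the real score lies above the shuffled mean."""
    mean = statistics.mean(shuffled_scores)
    sd = statistics.stdev(shuffled_scores)
    return (real_score - mean) / sd


# Hypothetical scores of alignments against shuffled subject strings.
shuffled_scores = [-150, -152, -155, -158, -160]
real_score = -94

z = z_score(real_score, shuffled_scores)
print(f"Z-score: {z:.2f}")

# Shuffling keeps the composition of the sequence but destroys the order.
print(shuffle_sequence("ACGTACGT", random.Random(1)))
```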

In the section: Genome assembly by finding the Eulerian path through a De Bruijn graph
Question 3.6: How many possible k-mers are there if you set k = 8 instead of k = 7?
- When k is changed to 8, 41 k-mers are formed instead of 42. This means that we have 41
fragments with a length of 8 each, instead of 42 fragments with a length of 7.
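
A sequence of length L contains L - k + 1 overlapping k-mers, so increasing k by one removes exactly one k-mer. The Python sketch below checks this on a placeholder sequence of 48 bases. That length is only an inference from the counts 42 and 41; the actual sequence from the exercise is not reproduced here.

```python
def kmers(sequence, k):
    """Return all overlapping substrings of length k, in order of position."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Placeholder sequence of 48 bases (not the sequence from the exercise).
sequence = "ACGT" * 12

for k in (7, 8):
    print(f"k = {k}: {len(kmers(sequence, k))} k-mers")
```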

Question 3.7: Find the unique Eulerian path through the graph. What is the sequence of the original
DNA fragment?
- The DNA fragment reads:
CTCAGATCCAATGATTATTCTCCATTGTGCAAGATTTCTTATGGGCTTCCTACTTCCCCTGAAAGAAGATCAGCATTCTTATCATGGTGGAG
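
The idea behind this kind of assembly can be sketched in Python on a much shorter sequence. Each k-mer becomes an edge from its first k-1 bases to its last k-1 bases, and a path that uses every edge exactly once (an Eulerian path) spells out the original sequence. The function names, the choice of k = 4 and the 10-base test sequence are assumptions made for this illustration; the exercise itself may build and traverse the graph differently.

```python
from collections import defaultdict


def de_bruijn_graph(kmer_list):
    """Map each (k-1)-mer prefix to the list of (k-1)-mer suffixes it leads to."""
    graph = defaultdict(list)
    for kmer in kmer_list:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Find a path using every edge exactly once (Hierholzer's algorithm)."""
    in_degree = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (node for node in graph if len(graph[node]) - in_degree.get(node, 0) == 1),
        next(iter(graph)),
    )
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(kmer_list):
    """Rebuild a sequence from its k-mers via the Eulerian path."""
    path = eulerian_path(de_bruijn_graph(kmer_list))
    return path[0] + "".join(node[-1] for node in path[1:])


original = "ATGGCGTCCT"
k = 4
# Sort the k-mers to show that their original order is not needed.
kmer_list = sorted(original[i:i + k] for i in range(len(original) - k + 1))
print(assemble(kmer_list))
```

In this small example every (k-1)-mer occurs only once, so the Eulerian path is unique. Repeats in the sequence can create several valid paths, which is why the choice of k matters.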

Question 3.8: Use BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch) to
identify the species the DNA fragment is derived from
- Using BLAST, the DNA fragment is found to be part of chromosome 1 of the species Homo
sapiens.

References:
Illumina. Quality Scores for Next-Generation Sequencing. Retrieved from
https://www.illumina.com/documents/products/technotes/technote_Q-Scores.pdf
