
Trần Vĩnh Bảo Ngọc - BTBTWE21113 - Lab 2


NAME: Trần Vĩnh Bảo Ngọc – ID: BTBTWE21113 – Lab Bioinformatics – Lab 2 (2003/2024)

Q1.
African swine fever on the SRA server / NCBI:

There are 766 results in total on the SRA server. On the left side there are several filters, each with
its number of records in parentheses, such as Source (DNA (400), RNA (366)), Type, Library
layout, ...
+ Platform: ABI SOLID (4), Illumina (525), Ion Torrent (70), Oxford Nanopore (154),
BGISEQ (9), Capillary (2)
+ Library layout: paired (412), single (354)
+ File type: bam (27), fastq (737)
There are 303 results in total with the Genome filter on the SRA server.
+ Platform: ABI SOLID (1), Illumina (107), Ion Torrent (54), Oxford Nanopore (139)
+ Library layout: paired (102), single (201)
+ File type: bam (13), fastq (289)
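The facet counts above can be checked against the reported totals. The following is a minimal sketch that only re-adds the numbers listed in this answer; the dictionary layout and the function name `facet_gaps` are illustrative assumptions and not part of the SRA interface.

```python
# Facet counts copied from the answer above.
all_results = {  # no Genome filter, 766 results in total
    "Source": {"DNA": 400, "RNA": 366},
    "Platform": {"ABI SOLID": 4, "Illumina": 525, "Ion Torrent": 70,
                 "Oxford Nanopore": 154, "BGISEQ": 9, "Capillary": 2},
    "Library layout": {"paired": 412, "single": 354},
    "File type": {"bam": 27, "fastq": 737},
}
genome_results = {  # Genome filter, 303 results in total
    "Platform": {"ABI SOLID": 1, "Illumina": 107, "Ion Torrent": 54,
                 "Oxford Nanopore": 139},
    "Library layout": {"paired": 102, "single": 201},
    "File type": {"bam": 13, "fastq": 289},
}

def facet_gaps(facets, total):
    """Per facet, the number of results not covered by the listed values."""
    return {name: total - sum(counts.values()) for name, counts in facets.items()}

gaps_all = facet_gaps(all_results, total=766)
gaps_genome = facet_gaps(genome_results, total=303)
for label, gaps in (("All results", gaps_all), ("Genome filter", gaps_genome)):
    for name, gap in gaps.items():
        print(f"{label} / {name}: {gap} result(s) not covered")
```

A gap of zero means the listed values account for every result. A non-zero gap may mean that some records lack that annotation or that a value was left out of the list; the counts alone cannot tell which.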
African swine fever on the SRA server / Run Selector:
- No Genome filter:
- Continental geography:
+ Europe (128) + North America (110)
+ Africa (92) + Asia (63)
+ Uncalculated (9) + Empty (403)
- Organism:
+ Sus scrofa (406) + African swine fever virus (364)
+ Sus scrofa domesticus (25) + Ornithodoros moubata (3)
+ Pig metagenome (2) + Ornithodoros erraticus (3)
+ Asfivirus (1) + Porcine sapelovirus 1 (1)
- Platform:
+ ILLUMINA (564) + OXFORD_NANOPORE (154)
+ ION_TORRENT (70) + BGISEQ (9)
+ ABI_SOLID (4) + CAPILLARY (2)
+ DNBSEQ (2)

- Genome filter:

- Continental geography:
+ Europe (54) + North America (110)
+ Africa (74) + Asia (40)
+ Uncalculated (6) + Empty (5)
- Organism:
+ African swine fever virus (286)
+ Pig metagenome (2) + Asfivirus (1)
- Platform:
+ ILLUMINA (94) + OXFORD_NANOPORE (139)
+ ION_TORRENT (53) + ABI_SOLID (1)
+ DNBSEQ (2)
Q2.

There are 35 results for African swine fever virus sequenced on the Illumina platform in Africa.
ID: SRR10282409
Forward sequence Reverse sequence

Forward sequence:
- Almost all of the first 95 bases have good quality, and the first 36 bases show nearly no variability.
- From position 96 onward the bases start to show variability, but the scores stay good (in the green area).
- Only a few bases at the end have reasonable quality, but they have a long lower whisker (meaning that 25% of the quality values are below a score of 26).
- The further along the read a base is, the lower its quality score and the higher the variability.
Reverse sequence:
- Almost all of the first 103 bases have good quality, with scores ranging from good to reasonable.
- The remaining bases all have a good quality score; however, 25% of the quality values fall in the poor-quality range.
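The quality scores discussed above are Phred scores, where a score Q corresponds to an error probability of 10^(-Q/10). The sketch below converts a score to that probability and maps it to the colour bands of the FastQC per-base quality plot. The cut-offs of 28 and 20 and the function names are assumptions made for illustration.

```python
def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is wrong: p = 10^(-q/10)."""
    return 10 ** (-q / 10)

def fastqc_zone(q):
    """Colour band on the per-base quality plot (assumed cut-offs: 28 and 20)."""
    if q >= 28:
        return "good (green)"
    if q >= 20:
        return "reasonable (orange)"
    return "poor (red)"

# Score 26 is the value mentioned for the lower whisker of the forward read.
for q in (20, 26, 28):
    print(q, round(phred_to_error_probability(q), 5), fastqc_zone(q))
```

Under these assumptions a score of 26 corresponds to roughly one wrong call in 400 bases and falls in the reasonable band.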

Forward and reverse sequences:
- A subset of the sequences has universally low quality values, often because those reads were poorly imaged (for example, at the edge of the field of view).

Forward and reverse sequences:
- The red X mark indicates that something is wrong with this statistic.
- The first 12 base positions show large deviations among the four nucleotides (possibly because of bias in either the library or the sequencing).
- At the remaining positions the four lines overlap, which is good.
- The GC content of the reads (red line) nearly matches the theoretical distribution (blue line) => almost no contamination or biased subset in the genomic dataset.

Forward sequence:
- There is a small proportion of Ns at read positions 115-124 => the sequencer could not make a base call at those positions.
Reverse sequence:
- No Ns appear => the sequencer produced no ambiguous base calls at any position of the read.
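The N statistic described above is the share of reads that carry an N at each position. A minimal sketch of that calculation follows; the function name and the toy reads are hypothetical and are not taken from SRR10282409.

```python
def n_fraction_per_position(reads):
    """Fraction of reads with an 'N' at each position (reads may differ in length)."""
    longest = max(len(r) for r in reads)
    fractions = []
    for pos in range(longest):
        # Only reads long enough to reach this position are counted.
        bases = [r[pos] for r in reads if len(r) > pos]
        fractions.append(bases.count("N") / len(bases))
    return fractions

toy_reads = ["ACGTN", "ACGNN", "ACGTA", "ACGTC"]  # hypothetical reads
print(n_fraction_per_position(toy_reads))
```

In this toy input the Ns sit at the last two positions, mirroring the forward read, where the Ns appear only near the 3' end.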

Forward and reverse sequences:
- The sequence length is 125 bp.
Forward sequence:
- Something is wrong with this dataset (the red X mark).
- Reads that are considered unique (sequence duplication level 1) account for only 25%.
- Reads that appear 2 to 9 times account for 5%.
- 45% of reads appear more than 10 times, and 5% of reads appear more than 50 times.
Reverse sequence:
- Something is wrong with this dataset (the red X mark).
- Reads that are considered unique (sequence duplication level 1) account for only 28%.
- Reads that appear 2 to 9 times account for 5%.
- 45% of reads appear more than 10 times, and 5% of reads appear more than 50 times.
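The duplication levels above group reads by how many times their exact sequence occurs. The sketch below shows one simplified way to compute such percentages. The bin boundaries follow the groupings used in this answer, the toy reads are hypothetical, and FastQC's own module may differ in detail.

```python
from collections import Counter

def duplication_levels(reads):
    """Percentage of reads in each duplication bin, based on exact sequence matches."""
    copies = Counter(reads)  # sequence -> number of times it occurs
    bins = {"1": 0, "2-9": 0, "10-50": 0, ">50": 0}
    for count in copies.values():
        if count == 1:
            key = "1"
        elif count <= 9:
            key = "2-9"
        elif count <= 50:
            key = "10-50"
        else:
            key = ">50"
        bins[key] += count  # every copy of the sequence is one read
    total = len(reads)
    return {key: 100 * n / total for key, n in bins.items()}

# Hypothetical input: one sequence seen 12 times, one seen 3 times, 5 unique reads.
toy_reads = ["AAAA"] * 12 + ["CCCC"] * 3 + ["GGGG", "TTTT", "ACGT", "TGCA", "GATC"]
print(duplication_levels(toy_reads))
```

A low share in the "1" bin, as reported for both reads, means that most reads are copies of a sequence that occurs more than once.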

Forward and reverse sequences:
- Only a small amount of Illumina Universal Adapter (red) and poly-G sequence remains in the dataset.
Q3.
Read 1: forward sequencing

- minimum overlap: 3

Read 2: reverse sequencing
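Assuming the `minimum overlap: 3` setting refers to adapter trimming, it means that a partial adapter at the 3' end of a read is removed only if at least 3 bases match. The sketch below illustrates that rule with exact matching only (no mismatches allowed). The function name and the adapter sequence are assumptions for illustration, and this is not the trimming tool used in the lab.

```python
def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter, requiring at least min_overlap matching bases."""
    # Case 1: the full adapter occurs somewhere inside the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a prefix of the adapter (partial adapter).
    # Try the longest possible overlap first and stop at min_overlap.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]
    return read  # no overlap of sufficient length, so the read is kept as-is

adapter = "AGATCGGAAGAGC"  # example adapter, assumed for illustration
print(trim_3prime_adapter("ACGTACGTAGATCGGAAGAGCTT", adapter))  # full adapter inside the read
print(trim_3prime_adapter("ACGTACGTAGAT", adapter))             # 4-base overlap: trimmed
print(trim_3prime_adapter("ACGTACGTAG", adapter))               # 2-base overlap: kept
```

A small minimum overlap removes more adapter remnants but also trims some reads whose last few bases match the adapter by chance.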
