
ReferenceSequence

The document explains how to align data to a reference sequence, specifically using the Anderson Reference Sequence for human mitochondrial genome analysis in forensic labs. It details the steps to load sequences, mark a reference sequence, and assemble contigs to ensure consistent base numbering for SNP characterization. Additionally, it describes how to compare the reference sequence to the consensus to identify genetic variations.


Using the Reference Sequence

There are many cases where you might want to align your data to a reference sequence. In the Tour Guide we demonstrated
how to use the Reference Sequence to check a clone. In this case, we will use data from the hypervariable regions of the
human mitochondrial genome and align the data to the Anderson Reference Sequence. This is a standard practice in forensic
labs, where sequencing is for human identification. The Reference Sequence will not contribute to the consensus calculation,
but it will determine the base numbering and the orientation of the overall contig.

The Reference Sequence is also particularly useful for characterizing SNPs in aligned sequences. The consistency in the numbering of
the bases allows you to reference a SNP by a given base position without concern for the effect of downstream insertions or deletions.

This tour does not describe all the features for this kind of work, so if you would like more information on SEQUENCHER's capabilities
for forensic or reference-comparison work, please call us.

• Choose New Project from the File menu to open a new, empty project.
• From the File menu, select Import and, from its sub-menu, choose Folder of Sequences.
• Choose the folder called Forensic Sequences.

This will load all six of the raw sequences found in the folder. These forward and reverse samples are all taken from the
same individual. Another way to load data is to just drag it from the desktop into SEQUENCHER's project window. Go to your
hard drive and open the Sample Data folder. Drag the file called "Anderson_HV1.spf" directly into the SEQUENCHER
project window. This is the standard reference sequence that we will compare our samples to. The forensics community has
established reference numbering for the bases in the hypervariable HV1 and HV2 regions of the mitochondrial genome.

• From the Select menu, choose Select None.


• Double click on the file Anderson HV1. You'll notice that rather than starting from base #1, it starts from 16,023.

To maintain the same numbering for the assembly, we will mark the sequence as a Reference Sequence.

• Close the sequence editor window by clicking on its close box.


• The Anderson HV1 file should be the only one that is highlighted. If not, click on it to highlight it.
• From the Sequence menu, select Reference Sequence to "check" it. You should see a small 'R' on the sequence icon to
remind you that this sequence has been marked.

• Select all of the fragments by dragging a box around them or choosing Select All.
• From the Contig menu, select Assemble Contigs/Only to the Reference Sequence. When the assembly is
complete, click OK.

This allows all samples to align to a single reference sequence, regardless of inconsistencies between the individual fragments.

• Open the Contig Editor by double clicking on the Contig [0001] icon.
• Click on the Bases button.

You can see that the Anderson Reference Sequence still has the R in the icon and is surrounded by a gray border. You can move
the reference sequence to the top or bottom of the contig by clicking on the name and dragging. Also notice that the numbering
of the entire contig is consistent with the base numbering of the reference sequence.

• Select the first base in the consensus sequence.


• From the Select menu, choose Next Contig Disagree.

The first disagreement, at base 16,040, shows that this individual has an 'A' where the reference has a 'C'.
Since the Reference Sequence does not contribute to the consensus line, the consensus base is an 'A'.
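To see why the consensus base is an 'A' here, it can help to sketch a consensus calculation that simply leaves the reference out. The snippet below is a minimal illustration, not SEQUENCHER's actual algorithm: it assumes a plain plurality vote per column, equal-length aligned strings, and 'N' for ties. The sequence names and bases are made up for the example.

```python
from collections import Counter


def consensus_without_reference(rows, reference_name):
    """Plurality consensus over every aligned row except the reference.

    rows: dict mapping sequence name -> aligned string (all the same length).
    Ties are reported as 'N' (an assumption made for this sketch).
    """
    samples = [seq for name, seq in rows.items() if name != reference_name]
    if not samples:
        raise ValueError("need at least one non-reference sequence")
    out = []
    for column in zip(*samples):
        counts = Counter(column).most_common()
        top_base, top_n = counts[0]
        tied = len(counts) > 1 and counts[1][1] == top_n
        out.append("N" if tied else top_base)
    return "".join(out)


# Toy data: the reference has 'C' in the third column, both samples have 'A'.
toy_rows = {
    "Anderson HV1": "TTCT",   # reference, ignored by the vote
    "sample_F": "TTAT",       # hypothetical forward read
    "sample_R": "TTAT",       # hypothetical reverse read
}
print(consensus_without_reference(toy_rows, "Anderson HV1"))
```

Because the reference row never enters the vote, the third column is called from the two samples alone.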

• Let's continue. From the Select menu, again choose Next Contig Disagree. In this case, the individual has an inserted
'T' after base 16,155. To keep the base numbering consistent, this base is called #16,155.1.

Note that two bases later, a gap has been introduced because of a 'G' that only appears in one of the sample sequences. Since
this is probably a sequencing artifact, a gap is introduced in the consensus line and no decimal numbering is added. Chances are,
you'll want to delete that base.
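
The numbering behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions, not SEQUENCHER's implementation: we assume the reference and consensus are given as equal-length aligned strings with '-' for gaps, that an inserted base is labelled with the previous reference position plus a decimal suffix, and that a column where both the reference and the consensus are gaps gets no label. The bases in the example are made up.

```python
GAP = "-"


def label_columns(ref_aligned, consensus_aligned, ref_start):
    """Label each alignment column with a reference-based position.

    - Reference has a base: the next integer position.
    - Reference has a gap, consensus has a base: "<previous position>.<n>".
    - Both are gaps: None (no number is assigned).
    """
    if len(ref_aligned) != len(consensus_aligned):
        raise ValueError("aligned strings must have equal length")
    labels = []
    pos = ref_start - 1  # last reference position consumed so far
    ins = 0              # insertions seen since that position
    for r, c in zip(ref_aligned, consensus_aligned):
        if r != GAP:
            pos += 1
            ins = 0
            labels.append(str(pos))
        elif c != GAP:
            ins += 1
            labels.append(f"{pos}.{ins}")
        else:
            labels.append(None)
    return labels


# Toy alignment starting at 16,153: an inserted 'T' after 16,155, and
# two columns later a gap in both the reference and the consensus.
labels = label_columns("ACG-T-A", "ACGTT-A", 16153)
print(labels)
```

Note that the base after the insertion is still numbered 16156, which is what keeps positions stable across samples.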

How can we get a compact listing of the genetic variations that characterize this individual?

• Click on the name of the Anderson HV1 sequence on the left side of the Contig Editor.
• From the Sequence menu, choose Compare to/Consensus.

In effect, we are asking how the selection (the reference sequence) compares to the consensus (of everything except the
reference sequence). The results are generated in a compact report:
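
The report itself is not reproduced here. Purely to illustrate what such a comparison computes, the sketch below lists every position where an aligned reference differs from an aligned consensus, using reference-based numbering. It is not SEQUENCHER's report format; the function name, the tuple layout, the '-' gap character, and the example bases are all assumptions made for this sketch.

```python
def compare_to_consensus(ref_aligned, consensus_aligned, ref_start):
    """Return (position_label, reference_base, consensus_base) for each difference.

    Insertions relative to the reference are labelled "<position>.<n>";
    columns where both strings have a gap are skipped.
    """
    if len(ref_aligned) != len(consensus_aligned):
        raise ValueError("aligned strings must have equal length")
    diffs = []
    pos = ref_start - 1  # last reference position consumed so far
    ins = 0              # insertions seen since that position
    for r, c in zip(ref_aligned, consensus_aligned):
        if r != "-":
            pos += 1
            ins = 0
            label = str(pos)
        elif c != "-":
            ins += 1
            label = f"{pos}.{ins}"
        else:
            continue  # gap in both: nothing to report
        if r != c:
            diffs.append((label, r, c))
    return diffs


# Toy data: a C-to-A substitution at 16,040 and an inserted 'T' after 16,041.
toy_diffs = compare_to_consensus("TTCT-G", "TTATTG", 16038)
for label, ref_base, cons_base in toy_diffs:
    print(label, ref_base, ">", cons_base)
```

A deletion in the sample would appear with '-' as the consensus base at an integer position.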

Tour Guide
©2002 Gene Codes Corporation

775 Technology Drive, Suite 100A • Ann Arbor, MI 48108

[email protected] 3 1.800.497.4939
