The Genetic Code
The structure of DNA encodes all the information every cell needs to function and thrive. In
addition, DNA carries hereditary information in a form that can be copied and passed intact from
generation to generation. A gene is a segment of DNA. The biochemical instructions found within most
genes, known as the genetic code, specify the chemical structure of a particular protein. Proteins are
composed of long chains of amino acids, and the specific sequence of these amino acids dictates the
function of each protein. The DNA structure of a gene determines the arrangement of amino acids in a
protein, ultimately determining the type and function of the protein manufactured.
The deoxyribonucleic acid (DNA) molecule is the genetic blueprint for each cell and ultimately
the blueprint that determines every characteristic of a living organism. In 1953 American biochemist
James Watson and British biophysicist Francis Crick described the structure of the DNA
molecule as a double helix, somewhat like a spiral staircase with many individual steps. Their work was
aided by X-ray diffraction pictures of the DNA molecule taken by British biophysicist Maurice Wilkins
and British physical chemist Rosalind Franklin. In 1962 Crick, Watson, and Wilkins received the Nobel
Prize for their pioneering work on the structure of the DNA molecule.
DNA Structure
DNA molecules form from chains of building blocks called nucleotides. Each nucleotide consists
of a sugar molecule called deoxyribose that bonds to a phosphate molecule and to a nitrogen-containing
compound, known as a base. DNA uses four bases in its structure: adenine (A), cytosine (C), guanine (G),
and thymine (T). The order of the bases in a DNA molecule—the genetic code—determines the amino
acid sequence of a protein.
In the cells of most organisms, two long strands of DNA join in a single molecule that resembles
a spiraling ladder, commonly called a double helix. Alternating phosphate and sugar molecules form each
side of this ladder. Bases from one DNA strand join with bases from another strand to form the rungs of
the ladder, holding the double helix together.
The pairing of bases in the DNA double helix is highly specific—adenine always joins with
thymine, and guanine always links to cytosine. These base combinations, known as complementary base
pairing, play a fundamental role in DNA’s function by aiding in the replication and storage of genetic
information. Complementary base pairing also enables scientists to predict the sequence of bases on one
strand of a DNA molecule if they know the order on the corresponding, or complementary, DNA strand.
Scientists use complementary base pairing to help identify the genes on a particular chromosome and to
develop methods used in genetic engineering.
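The pairing rule described above can be sketched in a few lines of Python. This is only an illustration: the names PAIRS, complement, and reverse_complement are hypothetical. The reverse_complement helper relies on a fact the text does not state, namely that the two strands of a double helix run in opposite directions, so the partner strand is conventionally written in reverse.

```python
# Pairing rules from the text: adenine with thymine, guanine with cytosine.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complement written in reverse order.

    Assumption not stated in the text: the two strands are antiparallel,
    so the partner strand is usually read in the opposite direction.
    """
    return complement(strand)[::-1]


print(complement("ATGCCG"))          # TACGGC
print(reverse_complement("ATGCCG"))  # CGGCAT
```

Taking the complement twice returns the original strand, which is why knowing one strand is enough to predict the other.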
Genes line up in a row along the length of a DNA molecule. In humans a single gene can vary in
length from 100 to over 1,000,000 bases. Genes make up less than 2 percent of the length of a DNA
molecule. The rest of the DNA molecule is made up of long, highly repetitive nucleotide sequences.
Scientists once dismissed these sequences as “junk” DNA but now believe they may play a role in the
survival of cells. Identifying the function of these sequences is a thriving field of genetics research.
A DNA molecule consists of a ladder, formed of sugars and phosphates, and four nucleotide bases: adenine
(A), thymine (T), cytosine (C), and guanine (G). The genetic code is specified by the order of the nucleotide bases,
and each gene possesses a unique sequence of base pairs. Scientists use these base sequences to locate the position
of genes on chromosomes and to construct a map of the entire human genome.
Protein Synthesis
DNA replication ensures that the genetic instructions encoded in DNA can be used continuously
through generations to produce the proteins that build and operate the cells of an organism. The process of
tapping the genetic code to create proteins, known as protein synthesis, has two crucial steps:
transcription and translation.
Transcription
Transcription transfers the genetic code from a molecule of DNA to an intermediary molecule
called ribonucleic acid (RNA). The basic nucleotide structure of RNA resembles that of DNA, but the two
compounds have three critical differences. First, the structure of RNA incorporates the sugar ribose rather
than deoxyribose, the sugar in DNA. Second, RNA uses the base uracil (U) instead of thymine (T). In
RNA uracil binds with adenine just as thymine does in DNA. Third, RNA usually exists as a single
strand, unlike the double-helix structure that normally characterizes DNA.
Transcription involves the production of a special kind of RNA known as messenger RNA
(mRNA). The process begins when the two strands of a DNA molecule separate, a task directed by the
enzyme RNA polymerase. After the double helix splits apart, one of the strands serves as a template, or
pattern, for the formation of a complementary mRNA molecule. Free-floating RNA nucleotides within the
cell bind to the bases on the DNA template using complementary base pairing. The individual nucleotides
then link together to form a strand of mRNA.
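As a rough sketch of this step, the Python below builds an mRNA strand from a DNA template by applying complementary base pairing, with uracil standing in for thymine. The names DNA_TO_RNA and transcribe and the six-base template are illustrative assumptions. The sketch also ignores strand direction and the enzyme machinery.

```python
# DNA template base -> RNA base that pairs with it.
# Adenine on the template pairs with uracil, because RNA has no thymine.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Return the mRNA complementary to a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template.upper())


# Hypothetical six-base template chosen for illustration.
print(transcribe("TACACT"))  # AUGUGA
```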
In eukaryotes (organisms whose cells have a nucleus), the mRNA strand undergoes an additional
step before the next stage of protein synthesis can occur. The mRNA strand consists of coding regions
called exons separated by regions called introns. The introns do not contribute to protein synthesis.
Special enzymes in the nucleus remove the introns from the mRNA strand. The remaining exons then link
together to form an mRNA strand that contains the entire code for making a protein.
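The removal of introns can be pictured as cutting out slices of a string and joining what remains. The sketch below assumes the exon positions are already known. The sequence, the coordinates, and the function name splice are all made up for illustration, and real cells find intron boundaries through signals that are not modeled here.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon slices (start inclusive, end exclusive) in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical unspliced strand: exon, intron, exon.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
exons = [(0, 6), (18, 24)]  # assumed coordinates of the two exons

print(splice(pre_mrna, exons))  # AUGGCCUUUUAA
```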
Translation
Once transcription is complete and the genetic code has been copied onto mRNA, the genetic
code must be converted into the language of proteins. That is, the information coded in the four bases
found in mRNA must be translated into the instructions encoded by the 20 amino acids used in the
formation of proteins. This process, called translation, takes place in cellular organelles called ribosomes.
In eukaryotes, mRNA travels out of the nucleus into the cell body to attach to a ribosome. In prokaryotes
(organisms without a nucleus), the ribosome clasps mRNA and starts translation before these strands have
finished transcription and separated from the DNA. In both eukaryotes and prokaryotes, the ribosome acts
like a workbench and clamp that holds the mRNA strand and coordinates the activity of enzymes and
other molecules essential to translation.
Another form of RNA called transfer RNA (tRNA) is found in the cytoplasm of the cell. There
are many different types of tRNA, and each type binds with one of the 20 amino acids used in protein
formation. One end of a tRNA binds with a specific amino acid. The other end carries three bases, known
as an anticodon. The tRNA with an amino acid attached travels to the ribosome where the mRNA is
stationed. The anticodon of the tRNA undergoes complementary base pairing with a series of three bases
on the mRNA, known as the codon. The mRNA codon codes for the type of amino acid carried by the
tRNA.
A second tRNA bonds with the next codon on the mRNA. The resident tRNA transfers its amino
acid to the amino acid of the incoming tRNA and then leaves the ribosome. This process continues
repeatedly, with new tRNA receiving the growing chain of amino acids, known as a polypeptide chain,
from a resident tRNA. The ribosome moves the mRNA strand one codon at a time, making new codons
available to bind with tRNAs. The process ends when the ribosome reaches a stop codon, a codon that specifies no amino acid and marks the end of the coding sequence.
The polypeptide chain falls away from the ribosome as a newly formed protein, ready to go to work in the
cell.
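Translation can be sketched as reading an mRNA string three bases at a time and looking up each codon in a table. The Python below builds the standard genetic code table, which is not spelled out in the text, from a compact 64-letter string of one-letter amino acid codes in which "*" marks a stop codon. The names CODON_TABLE and translate are hypothetical. The sketch starts reading at the first base and stops at the first stop codon, which simplifies how a ribosome actually chooses where to begin.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, ordered UUU, UUC, UUA, UUG, UCU, ...
# "*" marks the three stop codons (UAA, UAG, UGA).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon or the end of the strand."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        chain.append(amino_acid)
    return "".join(chain)


print(translate("AUGGCCUUUUAA"))  # MAF
```

In the printed result, M, A, and F stand for methionine, alanine, and phenylalanine. The final codon, UAA, is a stop codon and adds nothing to the chain.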
Mutations
Occasionally mistakes occur during DNA replication and protein synthesis. Any
alteration in the structure of a gene is called a mutation. Mutations occur during DNA
replication when the chemical structure of genes undergoes random modifications. Once a
change has occurred, the altered genes continue to replicate in their changed form unless another
mutation occurs. Sometimes mutations occur during transcription or translation, causing protein
synthesis to go awry. Although mutations may occur in any living cell, they are most important
when they occur in gametes because then the change affects the traits of following generations.
Most mutations harm an organism. If a mutation occurs in a gene sequence that codes for
a particular protein, the mutation may result in a change in the amino acid sequence directed by
the gene. This change, in turn, may affect the function of the protein. The implications can be
significant: The amino acid sequence distinguishing normal hemoglobin from the altered form of
hemoglobin responsible for sickle-cell anemia differs by only a single amino acid.
Some mutations may be neutral or silent and do not affect the function of a protein.
Occasionally a mutation benefits an organism. Over the course of evolutionary time,
mutations serve the crucial role of providing organisms with previously nonexistent proteins. In
this way, mutations are a driving force behind genetic diversity and the rise of new or more
competitive species better able to adapt to changes, such as climate variations, depletion of food
sources, or the emergence of new types of disease (see Evolution).
Mutations can produce a change in any region of a DNA molecule. In a point mutation,
for example, a single nucleotide replaces another nucleotide. Although a point mutation produces
a small change to the DNA sequence, it may cause a change in the amino acid sequence, and thus
the function, of a protein.
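A small sketch can show how a one-base substitution may or may not change the amino acid. The three codon assignments below come from the standard genetic code rather than from the text, and the helper names substitute and classify are hypothetical. The change from GAG to GUG, which replaces glutamic acid with valine, is the substitution commonly cited for sickle-cell hemoglobin. The change from GAG to GAA is an example of a silent mutation.

```python
# Three entries of the standard genetic code (mRNA codon -> amino acid).
CODONS = {"GAG": "Glu", "GAA": "Glu", "GUG": "Val"}


def substitute(seq: str, position: int, new_base: str) -> str:
    """Return seq with the base at the zero-based position replaced."""
    return seq[:position] + new_base + seq[position + 1:]


def classify(original: str, mutated: str) -> str:
    """Report whether a codon change alters the encoded amino acid."""
    before, after = CODONS[original], CODONS[mutated]
    return "silent" if before == after else f"{before} -> {after}"


normal = "GAG"
print(classify(normal, substitute(normal, 1, "U")))  # Glu -> Val
print(classify(normal, substitute(normal, 2, "A")))  # silent
```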
Far more serious are mutations that involve the addition or deletion of one or more bases
from a DNA molecule. Adding or subtracting even a single base from a normal sequence during
transcription can disrupt translation by shifting the “reading frame” of every subsequent codon.
For example, an mRNA strand may include two codons in the following sequence: AUG UGA.
The addition of a cytosine base at the beginning of this sequence shifts the “spelling” of these
codons so that they read: CAU GUG. This may result in an incorrect amino acid sequence during
translation, or the protein may be truncated. Known as a frameshift mutation, this type of alteration
could result in the production of a protein with no real function or one with a harmful effect.
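The shift in reading frame from this example can be reproduced with a short Python sketch. The function name codons is hypothetical. Any leftover bases that do not fill a complete codon are simply dropped here.

```python
def codons(mrna: str) -> list[str]:
    """Split mRNA into complete three-base codons, dropping leftover bases."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]


original = "AUGUGA"
shifted = "C" + original  # a cytosine added at the beginning

print(codons(original))  # ['AUG', 'UGA']
print(codons(shifted))   # ['CAU', 'GUG']
```

In the standard genetic code, AUG specifies methionine and UGA is a stop codon, while CAU and GUG specify histidine and valine. The single added base therefore changes both codons and removes the stop signal.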
Sometimes mutations are caused by transposition, in which long stretches of DNA
(containing one or more genes) move from one chromosome to another. These jumping genes,
called transposons, can disrupt transcription and change the type of amino acids inserted into a
protein. Transposons rearrange and interrupt genes in a way that generally increases the genetic
variation of a species.
While mutations can occur spontaneously, some can be caused by exposure to physical or
chemical agents in the environment called mutagens. Common environmental mutagens include
ultraviolet rays from the sun and various chemicals, such as asbestos, cigarette smoke, and
nitrous acid. High-energy radiation, such as medical X rays, can cause DNA strands to break,
leading to the deletion of potentially important genetic information.
Radiation damage can also affect an entire chromosome, disrupting the function of many
genes. In chromosomal translocation, a piece of one chromosome breaks off and merges with
another chromosome. In some cases, large sections of chromosomes may break off and be lost.
The cell has highly effective self-repair mechanisms that can correct the harmful changes
made by mutations and prevent some mutations from being passed on. Some 50 specialized
enzymes locate different types of faulty sequences in the DNA and clip out those flaws. Another
repair mechanism scans DNA after replication and marks mismatched base pairs for repair.
© 1993-2003 Microsoft Corporation. All rights reserved.