L1 Expression Vectors
UNIVERSITY OF NAIROBI
SCHOOL OF BIOLOGICAL SCIENCES
SBT 415 MOLECULAR BIOLOGY II
Expression Vectors
Any piece of DNA capable of autonomous replication within a host cell, into which other DNA
sequences can be inserted and amplified, is referred to as a standard cloning vector.
Unlike a standard cloning vector, an expression vector efficiently expresses an inserted foreign
gene by producing the protein product of the cloned gene, i.e. the recombinant protein.
An expression vector should be propagated in its host cell as a single copy genomic insert to
enhance its stability, i.e. should occur only once in the host cell genome.
It should also respond to induction with a rapid increase in transcription of the foreign DNA.
1. Promoter.
The promoter marks the point at which transcription of the gene should start.
It is recognized by the sigma subunit of the transcribing enzyme RNA polymerase.
A sigma factor is a protein component of prokaryotic RNA polymerases that binds loosely to the
core enzyme and restricts mRNA transcription to one of the two DNA strands and appropriate
promoter region.
The promoter is the most important component of an expression vector because:
o It controls the first stage of gene expression i.e. the attachment of an RNA polymerase
enzyme to the DNA
o It determines the rate at which mRNA is synthesized.
o To some extent it determines the amount of recombinant protein
A small variation in the promoter sequence can have a major effect on the efficiency with which
the promoter directs transcription.
Most E. coli promoters do not differ much from the consensus sequence (TTGACA), e.g.
TTTACA instead of TTGACA.
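The comparison with the consensus can be made concrete by counting mismatched positions. The following is a minimal Python sketch; the function name mismatches is an illustrative choice, not part of any standard tool, and the only sequences used are the two quoted above.

```python
CONSENSUS = "TTGACA"  # consensus hexamer quoted in the notes


def mismatches(site: str, consensus: str = CONSENSUS) -> int:
    """Count the positions at which a promoter hexamer differs from the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a != b for a, b in zip(site.upper(), consensus.upper()))


print(mismatches("TTTACA"))  # 1 (only the third base differs)
print(mismatches("TTGACA"))  # 0 (identical to the consensus)
```

A mismatch count is only a rough guide: it says how far a promoter is from the consensus, not how strongly it will be transcribed.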
Consensus sequences are conserved base sequences common in regulatory regions of DNA
e.g. promoter regions, where each base occurs in a particular position in a large proportion of
genomes of that species.
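One way to picture how a consensus is obtained is to take the most frequent base at each position of a set of aligned sequences. The sketch below assumes the sequences are already aligned and of equal length; the four example hexamers are invented for illustration and are not real promoter data.

```python
from collections import Counter


def consensus(aligned: list[str]) -> str:
    """Return the most common base in each column of equal-length aligned sequences."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("need a non-empty list of equal-length sequences")
    # zip(*aligned) walks through the alignment column by column
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


# Hypothetical aligned hexamers, each differing from the others at one position
examples = ["TTGACA", "TTTACA", "TTGACT", "TTGATA"]
print(consensus(examples))  # TTGACA
```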
Examples of consensus sequences are the TATA box in eukaryotes (about 25-30 base pairs
upstream of the sequence to be transcribed) and the Pribnow box in prokaryotes
(TATAATG, also called the –10 region because of the invariant T residue 10 bases upstream
from the start of the transcribed region).
(a) Strong promoters (e.g. from phage) are those which can sustain a high rate of transcription.
They usually control genes whose translation products are required in large amounts by the cell.
(b) Weak promoters direct transcription of genes whose products are needed in only small
amounts. They are relatively inefficient.
2. Terminator
(a) It marks the point at the end of the gene where transcription should stop.
(b) It is usually a nucleotide sequence that can base-pair with itself to form a stem-loop structure.
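The self-pairing property of a terminator can be checked computationally: a sequence can fold into a stem-loop if one end is the reverse complement of the other. The sketch below is a simplified illustration; the stem length, the minimum loop size of 3 bases and the example sequences are assumptions made for the example, not measured terminator data.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_form_stem_loop(seq: str, stem_len: int, min_loop: int = 3) -> bool:
    """True if the first stem_len bases can base-pair with the last stem_len bases."""
    seq = seq.upper()
    if len(seq) < 2 * stem_len + min_loop:
        return False  # not enough bases left over to form the loop
    return seq[-stem_len:] == reverse_complement(seq[:stem_len])


# Hypothetical sequence: 6-base stem, 4-base loop, 6-base complementary stem
print(can_form_stem_loop("GCCGCCTTTTGGCGGC", stem_len=6))  # True
print(can_form_stem_loop("GCCGCCTTTTAAAAAA", stem_len=6))  # False
```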
There are two major types of gene regulation in E. coli, namely, induction and repression.
(a) Induction is the switching on of expression of a gene or group of genes in response to a chemical
or other stimulus. An inducible gene is one whose transcription is switched on by addition of a
chemical to the growth medium. Often this chemical is one of the substrates for the enzyme coded
by the inducible gene.
(b) Repression is the switching off of expression of a gene or group of genes in response to a chemical
or other stimulus. A repressible gene is switched off by the addition of the regulatory chemical.
Many of the sequences important in induction and repression lie in the region surrounding the
promoter and are therefore also present in an expression vector. It is therefore possible to extend the
regulation to the expression vector, so that the chemical that induces or represses the gene normally
controlled by the promoter is also able to regulate expression of the cloned gene.
The regulation of the cloned gene by the chemical that also induces or represses the gene
normally controlled by the promoter is an advantage in the production of recombinant protein
for two reasons.
1. If the recombinant protein has a harmful effect on the bacterium, then its synthesis must be
carefully monitored to prevent accumulation of toxic levels. This can be achieved by
judicious use of the regulatory chemical to control expression of the cloned gene.
2. Even if the recombinant protein has no harmful effects on the host cell, regulation of the
cloned gene is still desirable, as a continuously high level of transcription may affect the
ability of the recombinant plasmid to replicate, leading to its eventual loss from the culture.
Expression vectors permit the expression of the cloned sequences by fusing them to transcription
and translation start signals.
Upon induction of the lacZ gene by IPTG, a fused mRNA results, containing the inserted coding
region just downstream from that of β-galactosidase.
This mRNA is translated by the host cell to yield a fusion protein.
(A fusion protein is a protein resulting from the expression of a recombinant DNA containing two
open reading frames (ORFs) fused together. An open reading frame is a reading frame that is
uninterrupted by translation stop codons).
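The definition of an open reading frame can be illustrated with a short check that reads a DNA sequence three bases at a time and looks for stop codons. This is a minimal sketch; the function name and the nine-base example sequence are invented for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the three stop codons, written as DNA


def is_open_reading_frame(dna: str, frame: int = 0) -> bool:
    """True if no stop codon occurs when dna is read in the given frame (0, 1 or 2)."""
    dna = dna.upper()
    # Only complete codons are examined; trailing bases are ignored
    codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
    return all(codon not in STOP_CODONS for codon in codons)


seq = "ATGGCTGAA"  # hypothetical sequence
print(is_open_reading_frame(seq, frame=0))  # True: ATG GCT GAA
print(is_open_reading_frame(seq, frame=2))  # False: GGC TGA contains a stop codon
```

The same sequence is open in one frame and closed in another, which is why the reading frame of an insert matters for a fusion protein.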
It does not matter if one has cloned a whole cDNA or not. The antiserum is a mixture of
antibodies that will react with several different parts of our protein, so even a partial gene will do,
as long as its coding region is cloned in the same orientation and reading frame as the leading
β-galactosidase coding region.
For instance, some vectors have restriction sites located just next to the control region for the lac
genes, which has been spliced into the vector. These restriction sites permit foreign DNA to be
spliced into the vector for cloning next to the lac control regions.
This allows expression of the cloned gene by using the lac transcription and translation start
signals and control of expression by the lac repressor.
Some vectors are bifunctional, allowing expression in two different hosts. For example, a certain
plasmid contains the origins of replication of the plasmid pBR322 and of the animal virus SV40.
These origins allow replication in E. coli and some cultured mammalian cell lines,
respectively.
These plasmids are called shuttle vectors, because they can transfer genes back and forth
from one type of cell to another.
The pUC vectors place inserted DNA under the control of the lac promoter, which lies upstream
from the multiple cloning site.
If an inserted DNA happens to be in the same reading frame as the lac gene it interrupts, a
fusion protein will result.
It will have a partial β-galactosidase protein sequence at its amino end and another protein
sequence, encoded in the inserted DNA, at its carboxyl end.
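Whether an insert ends up in the same reading frame as the lac gene comes down to simple arithmetic: the number of bases between the start of the upstream coding sequence and the first codon of the insert must be a multiple of three. The sketch below illustrates this; the parameter names and the base counts in the examples are hypothetical and do not describe any particular pUC vector.

```python
def in_frame(upstream_coding_len: int, insert_offset: int = 0) -> bool:
    """Check whether an insert is read in the same frame as the upstream gene.

    upstream_coding_len: bases of upstream coding sequence before the junction
    insert_offset: bases in the insert that precede the first codon of its
                   coding region (for example, linker bases)
    """
    return (upstream_coding_len + insert_offset) % 3 == 0


print(in_frame(15))     # True: 15 bases is exactly 5 codons
print(in_frame(16))     # False: the insert is shifted by one base
print(in_frame(16, 2))  # True: two extra bases restore the frame
```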
The foreign polypeptide can be recovered from the fusion protein by cleaving at the junction between
the two components with cyanogen bromide, which cuts polypeptides specifically at methionine
residues. The methionine residue at the fusion junction must be the only one present in the
entire polypeptide. If others are present then cyanogen bromide will cleave the fusion protein into
more than two fragments.
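The effect of extra methionine residues can be seen by simulating the cleavage on a one-letter amino acid sequence. Cyanogen bromide cuts on the carboxyl side of each methionine, so the sketch below ends a fragment after every M. The sequences are invented for illustration and are not real β-galactosidase or insert sequences.

```python
def cnbr_fragments(protein: str) -> list[str]:
    """Split a one-letter protein sequence after every methionine (M)."""
    fragments, current = [], ""
    for residue in protein.upper():
        current += residue
        if residue == "M":  # cleavage on the carboxyl side of Met
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Hypothetical fusion with a single Met at the junction: two clean fragments
print(cnbr_fragments("TITDSLAVMGIVEQCC"))   # ['TITDSLAVM', 'GIVEQCC']
# A second Met inside the foreign polypeptide: three fragments
print(cnbr_fragments("TITDSLAVMGIVMEQCC"))  # ['TITDSLAVM', 'GIVM', 'EQCC']
```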
The lac promoter is induced by IPTG, so addition of this chemical into the growth medium
switches on transcription of a gene inserted downstream of the lac promoter carried by an
expression vector.
It is usually advantageous to keep a cloned gene turned off until one is ready to express it.
There are three reasons for this:
(a) Eukaryotic proteins produced in large quantities in bacteria can be toxic.
(b) Even if the eukaryotic proteins are not toxic, they can build up to such great levels that they
interfere with bacterial growth.
(c) If the cloned gene were allowed to remain turned on constantly, the bacteria bearing the gene
would never grow to a great enough concentration to produce meaningful quantities of protein
product.
The solution is to keep the cloned gene turned off by placing it behind an inducible promoter that
can be turned off. One strategy is to use a very tightly controlled promoter such as the phage λ
promoter PL.
An example of a vector with such promoter is the pKC30.
(a) The gene to be expressed is inserted into the unique HpaI site of the pKC30 vector, downstream
from the OLPL operator/promoter region.
(b) The host cell used is a lysogen bearing a temperature-sensitive repressor gene (cI857).
(c) When the temperature of these cells is kept relatively low (32 °C), the repressor functions, and
no expression takes place.
(d) When the temperature is raised to the nonpermissive level (42 °C), the temperature-sensitive
repressor is inactivated and removed from OL, allowing transcription of the cloned gene to
occur.
and are usually exported from the cell. The nascent polypeptide is extruded across the lipid
bilayer, where a signal peptidase cleaves the signal from the protein).
(c) Purification of the protein is easy. The yeast cells can be removed by centrifugation, leaving
relatively pure secreted gene product behind in the medium.
(b) The foreign gene might contain sequences that act as termination signals in E. coli. These
sequences are perfectly innocuous in the normal host cell but in the bacterium result in
premature termination and loss of gene expression.
(c) The codon usage of the gene may not be ideal for translation in E. coli. Although virtually all
organisms use the same genetic code, each organism has a bias towards preferred codons.
This bias reflects the efficiency with which the tRNA molecules in the organisms are able to
recognize the different codons. If a cloned gene contains a high proportion of unfavoured
codons, then the host cell’s tRNA may encounter difficulties in translating the gene, reducing
the amount of protein that is synthesized.
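A rough way to screen a cloned gene for this problem is to count how many of its codons belong to a set that the host rarely uses. The sketch below is illustrative only: the RARE_CODONS set lists a few codons often cited as rare in E. coli and is an assumption for the example, and the coding sequence is invented. Real work should rely on a full codon usage table for the host strain.

```python
# Assumed, illustrative set of codons often described as rare in E. coli
RARE_CODONS = {"AGG", "AGA", "CGA", "CTA", "ATA", "CCC"}


def rare_codon_fraction(cds: str, rare: set[str] = RARE_CODONS) -> float:
    """Return the fraction of codons in a coding sequence that are in the rare set."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if not codons:
        return 0.0
    return sum(codon in rare for codon in codons) / len(codons)


# Hypothetical sequence: ATG AGG AGA CTG CCC TAA (3 of 6 codons in the rare set)
print(rare_codon_fraction("ATGAGGAGACTGCCCTAA"))  # 0.5
```

A high fraction suggests that yields might improve if the gene is resynthesized with codons the host prefers, although the fraction alone does not predict expression level.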
Yields of recombinant protein are relatively high, but S. cerevisiae is unable to glycosylate animal
proteins correctly and lacks an efficient system for secreting proteins into the growth medium. Codon
bias can also be a problem. Despite these drawbacks, S. cerevisiae remains the most frequently used
microbial eukaryote for the reasons given above.
Another eukaryote used in recombinant protein synthesis is Pichia pastoris. It is able to synthesize
large amounts of recombinant protein (up to 30% of the total cell protein) and its glycosylation
abilities are very similar to those of animal cells. The sugar structures that it synthesizes are not
exactly the same as the animal versions, but the differences are relatively trivial and would probably
not have a significant effect on the activity of a recombinant protein. In addition, the glycosylated
proteins made by P. pastoris are unlikely to induce an antigenic reaction if injected into the
bloodstream, a problem that is frequently encountered with the over-glycosylated proteins synthesized
by S. cerevisiae. Expression vectors for P. pastoris make use of the alcohol oxidase (AOX)
promoter, which is induced by methanol.
Filamentous fungi
The two most popular filamentous fungi are Aspergillus nidulans and Trichoderma reesei (wood-rot
fungus). The advantages of these organisms are their good glycosylation properties and their
ability to secrete proteins into the growth medium.
In its natural habitat Trichoderma reesei secretes cellulolytic enzymes that degrade the wood that it
lives on. The secretion characteristics mean that these fungi are able to produce recombinant protein
in a form that aids purification.
Expression vectors for Aspergillus nidulans usually carry the glucoamylase promoter, induced by
starch and repressed by xylose.
Expression vectors for Trichoderma reesei make use of the cellobiohydrolase promoter, which is
induced by cellulose.
Although gene cloning may not be necessary in order to obtain animal protein, expression vectors and
cloned genes are used to maximize yield. This is achieved by placing the gene under control of a
promoter that is stronger than the one it is normally attached to.
A promoter that has been used in mammalian cells is the heat-shock promoter of the human hsp-70
gene, which is induced at temperatures above 40 °C.
Another promoter used in mammalian cells is the mouse metallothionein gene promoter, which is
switched on by addition of zinc salts to the culture medium.