Transcription and RNA Processing
The Central Dogma
For the phenotypic expression of any gene, the information contained in DNA is first transcribed into RNA, which is then translated into the amino-acid sequence of the corresponding protein. Phenotypes are the manifestation of the activity or function of proteins and catalytic RNAs.
DNA → Transcription (RNA Pol + factors) → RNA → Translation (ribosomes, aminoacyl-tRNAs + other factors) → Protein → Function (enzyme, structural element, hormone, regulatory, etc.) → Phenotype
The assumptions of the central dogma are sound, but there are exceptions!
The Revised Central Dogma
[Diagram labels: DNA, reverse transcription, ribozymes, interactions, function]
Exceptions to the Central Dogma
1. Information contained in RNA can be copied into DNA by reverse transcriptases and telomerase. EXAMPLES: retroviruses, mobile introns.
2. After transcription, nucleotides can be added to, deleted from, or modified in some mRNAs by processes known as RNA editing. EXAMPLES: addition and deletion of Us in the mitochondria of trypanosomes, conversion of Cs to Us in plant mitochondria by deamination.
3. Not all genes are expressed phenotypically through proteins. EXAMPLES: the RNAs of mobile introns are nucleases, the LSU ribosomal RNA has peptidyl transferase activity. Such RNAs are classified as ribozymes.
Ribozymes favor the concept that a more primitive RNA-based ancestral form of life may have preceded the DNA-, RNA-, protein-based life as we know it now.
Transcription
The first step in the expression of a gene (DNA) is to transcribe the information contained in the nucleotide sequence of one strand (the template or antisense strand) into an RNA. The product of this process is a transcript. If a gene carries information for the assembly of a protein, the transcript is a messenger RNA (mRNA).
Translation
The information in mRNAs is decoded by ribosomes and translated by alignment of three-nucleotide codons with corresponding tRNAs that deliver the corresponding amino acids.
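The decoding step can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code, an mRNA written 5′→3′ with the letters A, C, G, U, and translation starting at the first AUG; the names `CODON_TABLE` and `translate` are made up for this example, and real initiation in bacteria also depends on signals not modeled here.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in the order UUU, UUC, UUA, UUG, UCU, ...
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5'->3') from the first AUG to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    # Step through the message one three-nucleotide codon at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCUUAAUU"))  # AUG GCU UAA -> "MA"
```

Each dictionary lookup plays the role of a tRNA: it pairs one codon with one amino acid.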
The messenger: a general view of transcription and translation
[Figure labels: coding strand of DNA, template strand of DNA, mRNA]
Transcription and RNA Processing
Topics
1. RNA structure. 2. RNA synthesis. 3. Transcription: initiation, elongation and termination. 4. RNA processing: cleavage of precursor transcripts and modification of bases. 5. RNA and protein splicing. 6. RNA editing.
RNA precursors
[Fig. 2.1: a ribonucleoside triphosphate (base labeled) and an RNA polynucleotide chain]
RNA Structure
Primary structure: order of the nucleotides, read 5′→3′
5′ ACUCAUCGGCACGUCAUGCUGAUAUCCGGCUUGACACU 3′
Secondary structure: regions of base pairing (double-stranded helical regions)
Tertiary structure: folding of the entire RNA chain
Only non-covalent hydrogen bonding is involved in secondary and tertiary structure formation
Stem-loop and pseudoknot structures (Fig 2.2)
Tertiary structure of a transfer RNA (5′ and 3′ ends marked)
Structure of E. coli 16S rRNA
1. ~1500 nucleotides long
2. Compact 3-D folding
3. Highly conserved
Kinds of RNA found in a typical bacterium (E. coli )
RNA | Function | Size | Stability (T½) | No. of different kinds
Messenger (mRNA) | Contains information for assembly of the amino-acid sequence of a protein by ribosomes and tRNA | 6-20S | Few minutes | ~4000
Transfer (tRNA) | Codon recognition in partnership with the ribosome and transfer of amino acid into polypeptide | 4S | Longer than one generation | variable <27
Ribosomal (rRNA) | Component of ribosome, binding of mRNA and tRNA, and formation of peptide bond between amino acids | 23S, 16S, 5S | Longer than one generation | 3
Small RNAs | Diverse roles: primers, gene regulation, guide RNAs, RNA processing and degradation, cofactor in protein complexes | <5S | Variable | Unknown and growing
S = Sedimentation coefficient (Svedberg units), related to the size and shape of the molecule.
RNAs and DNAs can be separated by size in sucrose gradients by centrifugation and by gel-electrophoresis
[Figure: rRNAs separated in a sucrose gradient (16S, 23S, 30S) and by electrophoresis from the start position (30S precursor, 23S, 16S, 5S)]
Transcription: RNA polymerase (RNAP) catalyzed synthesis of RNA on a DNA template
RNA polymerases do NOT require a primer for the initiation of RNA synthesis.
Direction of RNA synthesis: 5′→3′
Direction of reading of template DNA strand: 3′→5′
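The two directions can be made concrete with a small Python sketch. It assumes the template strand is supplied in the order the polymerase reads it (3′→5′) and contains only A, C, G, T; the function names are illustrative, not from any library.

```python
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Read the template 3'->5' and return the RNA written 5'->3'."""
    # The first template base read pairs with the 5' end of the RNA.
    return "".join(RNA_PARTNER[base] for base in template_3to5)


def template_from_coding(coding_5to3: str) -> str:
    """Return the template strand, written 3'->5', for a coding strand."""
    return "".join(DNA_PARTNER[base] for base in coding_5to3)


coding = "ATGGCT"                       # coding strand, 5'->3'
template = template_from_coding(coding)  # "TACCGA", written 3'->5'
rna = transcribe(template)               # "AUGGCU", written 5'->3'
print(template, rna)
```

The RNA comes out identical to the coding strand with U in place of T, which is why promoter and gene sequences are conventionally written as the coding strand.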
Do you remember the principles of nucleic-acid synthesis? Ch. 1!
Subunits of the E. coli holo RNAP and their Functions
Subunit | Size (aa) | Size (Da) | Gene | Function
alpha (α) | 329 | 36,511 | rpoA | Assembly of RNAP, binding of some regulatory proteins, catalysis.
beta (β) | 1342 | 150,616 | rpoB | Catalysis of chain initiation and elongation.
beta′ (β′) | 1407 | 155,159 | rpoC | Binding to DNA template.
omega (ω) | 91 | 10,237 | rpoZ | Restores denatured enzyme to fully functional form.
sigma (σ70) | 613 | 70,263 | rpoD | Directs enzyme to corresponding promoters.
Important Features of Transcription in Prokaryotes
Transcription starts at promoters, located in the DNA upstream of the coding region of one gene or a group of functionally related genes. Promoters are relatively short nucleotide sequences in the DNA (genome) that are recognized by sigma (σ) proteins (factors). Sigma factors bound to promoters recruit (interact with, bind) the RNA polymerase core enzyme for the initiation of transcription at specific transcription start sites that are part of the promoter sequence in the DNA.
What do promoters look like?
Typical structure of Bacterial Housekeeping-Gene σ70 Promoters with -10 and -35 Regions
[Figure labels: coding-strand consensus sequences; coding strand (5′) and template strand (3′) of the DNA; RNA starting with (A/G)NNNNN]
Question: What do the negative numbers mean?
Number of bases upstream (5′) of a designated place, which, in this case, is the first base of the RNA (transcription start point).
Fig 2.6
Active site of RNA polymerase and binding of σ70
σ1-σ4 are domains of σ70 proteins
[Figure labels: promoter contacts at -35 and -10; active-site channel; β and β′ pincers; RNA exit channel; secondary channel for nucleoside triphosphate entry; core RNA polymerase (crab-claw structure); holoenzyme]
Transcription starts at promoters and ends at terminators
Proteins called σ factors bind to core RNAP and direct it to DNA nucleotide sequences called promoters. The two strands of the DNA are separated by σ. A transcription bubble is generated, and transcription starts at a T or C in the template strand. No primer is required for transcript initiation and no helicase is required for the unwinding of the DNA.
RNA synthesis proceeds for ~8-9 nt along the template, then σ is released. Transcription continues until the polymerase with its transcription bubble reaches a transcription termination (t) site. The RNA and the polymerase dissociate and are released from the DNA.
Fig. 2.7
Transcription Cycle
[Diagram: σ binding → promoter recognition, closed complex (RPC) → DNA isomerization, open complex (RPO) → initiation (with abortive attempts) → escape, σ release and elongation, transcription elongation complex (TEC) → termination]
Not all σ70 promoters are created equal
[Figure labels: UP element, extended -10]
Evolutionarily Conserved Regions and Functional Domains of Bacterial σ70 Factors
σ70 factors initiate the transcription of housekeeping genes (named after the 70 kDa housekeeping σ factor of E. coli). Shown below is the linear arrangement of the σ70-like factor of Thermus aquaticus (~43 kDa).
Inhibition of σ-DNA binding
Down arrows indicate contact points with core RNAP proteins: electrostatic interactions, direct hydrogen bonds, indirect hydrogen bonds, and potential hydrophobic (van der Waals) contacts.
From S. Borukhov and E. Nudler 2003. Curr. Opinion. Microbiol. 6: 93-100
RNAP Holoenzyme Open Initiation Complex
Part of has been removed to expose the interior of the holoenzyme initiation complex.
Modified from K.S. Murakami and S. A. Darst 2003. Curr. Opinion Structural Biol. 13: 31-39.
RNA channel Nascent RNA
Coding-strand
Template-strand
Transcription Elongation Complex (TEC)
The RNA polymerase adds from 30 to 100 nucleotides per second to the growing RNA molecule (6000 maximum per min).
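As a rough worked example of these rates, the sketch below estimates how long one polymerase would need to transcribe a ~1500-nt RNA such as the 16S rRNA. It assumes a constant elongation rate with no pausing, which the later slides show is a simplification; the function name is illustrative.

```python
def transcription_time_s(length_nt: float, rate_nt_per_s: float) -> float:
    """Seconds needed to synthesize an RNA at a constant elongation rate."""
    return length_nt / rate_nt_per_s


LENGTH_16S = 1500  # approximate length of E. coli 16S rRNA, in nucleotides

for rate in (30, 100):  # slow and fast ends of the quoted range
    seconds = transcription_time_s(LENGTH_16S, rate)
    print(f"{rate} nt/s: {seconds:.0f} s for {LENGTH_16S} nt, "
          f"{rate * 60} nt per minute")
```

At 30 nt/s the transcript takes 50 s; at 100 nt/s it takes 15 s, and 100 nt/s corresponds to the 6000 nt per minute maximum.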
RNA Polymerase Backtracking
The formation of a secondary structure (stem-loop or hairpin) at the 3′ end of a growing transcript can cause backtracking of the RNA polymerase. The 3′ end of the RNA is pushed into the secondary channel. GreA/GreB insert their N-termini, which have exonuclease activity, into the secondary channel and degrade the RNA until it is rearranged (properly paired with the DNA template?) in the active site channel so that elongation can be resumed. The true function of RNAP pausing and backtracking is somewhat enigmatic.
The number of different σ factors varies greatly among organisms
Mycoplasma genitalium has only ONE σ factor
E. coli has SEVEN different σ factors
Streptomyces coelicolor has MORE THAN SIXTY different σ factors
Each type of σ factor recognizes its own unique class of promoter sequences
The Seven Sigma Factors of E. coli and their Functions
Sigma factor | Gene | Function
σ70 (σD) | rpoD | Principal sigma factor (housekeeping gene transcription).
σN (σ54) | rpoN (ntrA, glnF) | Nitrogen-regulated gene transcription.
σH (σ32) | rpoH | Heat-shock gene transcription.
σS (σ38) | rpoS | Gene expression in stationary phase cells.
σE (σ24) | rpoE | Periplasmic stress response proteins.
σF (σ28) | rpoF | Expression of flagellar operons.
σFecI | fecI | Regulates the fec genes for iron dicitrate transport.
Consensus Sequences of E. coli Promoters
CONSENSUS SEQUENCE
Sigma | -35 region | Spacer (b) | -10 region
σ70 (σD) | TTGACA | 16-18 | TATAAT
σH (σ32) | CTTGAA | 13-15 | CCCCATNT
σE (σ24) | GAACTT | 15-17 | TCTGA
σF (σ28) | CTAAA | 15 | GCCCATAA
σFecI (σ18) | GAAAAT | 15 | TGTCCT
σS (σ38) | TTGACA | 14-18 | CTAYACTT

Sigma | -25 region | Spacer (b) | -12 region
σN (σ54) | CTGGCAC | 6 | TTGCA
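A consensus of this kind can be turned into a simple sequence search. The sketch below is only an illustration for the σ70 entry (TTGACA, a 16-18 b spacer, TATAAT) and assumes exact matches on the coding strand; real promoters usually deviate from the consensus at one or more positions, so a practical search would need a scoring scheme rather than an exact pattern. The names `SIGMA70` and `find_sigma70_promoters` are made up for this example.

```python
import re

# -35 hexamer, 16-18 b spacer, -10 hexamer, searched on the coding strand.
# The lookahead lets overlapping candidate sites be reported.
SIGMA70 = re.compile(r"(?=(TTGACA)([ACGT]{16,18})(TATAAT))")


def find_sigma70_promoters(coding_strand: str) -> list[dict]:
    """Return exact-consensus sigma-70 promoter candidates (0-based positions)."""
    hits = []
    for match in SIGMA70.finditer(coding_strand.upper()):
        hits.append({
            "minus35_start": match.start(1),
            "spacer_len": len(match.group(2)),
            "minus10_start": match.start(3),
        })
    return hits


example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
print(find_sigma70_promoters(example))
```

Other rows of the consensus list could be searched the same way by swapping in their hexamers and spacer lengths, with N and Y written as the character classes [ACGT] and [CT].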
Watch for signals along the track!
Transcription is not a smoothly flowing process: pausing occurs frequently at sites that allow the formation of hairpin structures in the RNA. When hairpin structures form in the portion of the RNA that is located in the exit channel of the polymerase, they cause temporary displacement of the 3′-OH from the polymerization active site of the RNA polymerase. When this happens, the polymerase backtracks along the template, and several nucleotides at the 3′-OH end are pushed into the secondary channel, where they are progressively removed backward by GreA and GreB until a proper pairing of the RNA with the DNA template is restored. Pausing is an important feature involved in the regulation of gene expression through coordination of the synthesis of mRNA with its simultaneous translation, termination and antitermination of transcription, and attenuation of transcription.
Transcription Termination
There are two kinds of transcription termination: 1. Factor-independent termination and 2. Factor-dependent termination.
Factor-independent transcription termination occurs at sites in the DNA that include a region of two-fold symmetry followed by a stretch of at least four As in the template strand.
[Fig. 2.18: GC-rich inverted repeat followed by a run of at least 4 As in the template strand]
Factor-independent transcription termination
Folding of the transcript (RNA) in the active site channel breaks hydrogen bonding to the template DNA strand and causes the release of the RNA and the core RNA polymerase.
[Fig. 2.18: dissociation of nascent RNA (3′ end) and RNA polymerase from the template strand]
Factor-independent transcription termination
How did scientists figure out this mechanism of termination? Mutations that disrupt base pairing in the hairpin loop structure of RNA (two-fold symmetry in DNA) or shorten the run of adjacent Us (As in the DNA template) cause continuation of transcription beyond terminator sites.
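The two sequence features of a factor-independent terminator can be expressed as a simple scan over the RNA. The following is a minimal sketch, not a real terminator predictor: the stem length (5), loop sizes (3-8), and the minimum of 3 G/C bases in the stem are arbitrary assumptions chosen for illustration, only perfect Watson-Crick stems are accepted, and the function names are made up.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_terminators(rna, stem=5, loops=range(3, 9), min_u=4, min_gc=3):
    """Return (stem_start, loop_length) for hairpins followed by a U run.

    stem, loops and min_gc are illustrative assumptions, not measured values.
    """
    hits = []
    for i in range(len(rna) - stem + 1):
        left = rna[i:i + stem]
        if sum(base in "GC" for base in left) < min_gc:
            continue  # stem is not GC-rich enough
        for loop in loops:
            j = i + stem + loop                    # start of the right arm
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_u]  # bases just after the stem
            if right == reverse_complement(left) and tail == "U" * min_u:
                hits.append((i, loop))
    return hits


wild_type = "CC" + "GCCGC" + "UUAA" + "GCGGC" + "UUUUUU" + "AG"
stem_mutant = "CC" + "GCCGC" + "UUAA" + "GCAGC" + "UUUUUU" + "AG"
short_u_run = "CC" + "GCCGC" + "UUAA" + "GCGGC" + "UUU" + "AG"
for name, seq in [("wild type", wild_type), ("stem mutant", stem_mutant),
                  ("short U run", short_u_run)]:
    print(name, find_terminators(seq))
```

The two altered sequences mimic the mutations described above: breaking one base pair in the stem or shortening the U run removes the predicted terminator.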
Factor-dependent Transcription Termination
Features of factor-dependent transcription-termination sites in DNA: a site specifying a sequence in the RNA (for example, rut) that is recognized and bound by a protein (for example, Rho (ρ)). The bound protein chases the RNA polymerase and releases it and the transcript from the DNA template at a transcription-pause site (usually a G:C-rich sequence).
[Figure labels: DNA region that specifies the factor-binding sequence in the RNA (EXAMPLE: rut); any transcription-pause site in the DNA; polysomes]
In prokaryotes, translation is coupled to transcription.
IMPORTANT: Factor-dependent transcription termination occurs preferentially when the translation of a nascent mRNA is stalled.
Factor-dependent transcription termination
How does it work?
The Rho hexamer binds to rut in the nascent RNA and then chases the RNA polymerase until it reaches it at a transcription-pause site on the DNA. Rho then unwinds the RNA/DNA duplex, releasing the transcript and the RNA polymerase from the DNA.
[Fig 2.19: ρ binds rut on the nascent RNA behind a stalled ribosome; RNA polymerase reaches a pause site; ρ unwinds the RNA/DNA hybrid; RNA and RNAP are released]
E. coli has at least three different transcription-termination factors
Rho (ρ) binds at rut sequences in nascent RNA and moves along the RNA (movement requires ATP) until it reaches the RNA polymerase at a transcription-pause site. HOWEVER, if a ribosome has passed a rut site BEFORE ρ is bound, then the ribosome prevents ρ from catching up with the RNA polymerase, and transcription continues past the pause site.
Rho appears to be an RNA/DNA helicase. However, it is not clear whether or not ρ can directly access the RNA/DNA duplex within the active-site channel of a paused RNA polymerase core enzyme.
The other two proteins that have termination-factor characteristics similar to those of ρ are Tau (τ) and NusA. In comparison to ρ, the RNA-binding sites and interactions of Tau and NusA with the RNA polymerase at transcription-pause sites are not well characterized yet.
Antibiotics that Inhibit Transcription
Rifampin is a member of the rifamycins, macrocyclic lactone antibiotics that inhibit transcription at the initiation stage, but do not block elongation once initiation is complete. Rifamycins bind to the β-subunit in the wall of the active-site channel of the RNAP of bacteria, mitochondria and chloroplasts. Two or three nucleotides are polymerized.
Rifampin
Action of Rifampin
pppApN and pppGpN are the most common products. The Streptovaricins are related compounds that have the same action as rifampin, except that they also can block transcript elongation.
Antibiotics that Inhibit Transcription
Binds to the major groove of DNA in G/C rich regions. Inhibits transcription and replication
Actinomycin D
Think about it!
How does the process of RNA synthesis differ from DNA synthesis with respect to: 1. Substrates? 2. Initiation? 3. Template? 4. Priming? 5. Ancillary enzymes? 6. Termination? 7. Editing?
RNA Processing, RNA Modification, and RNA Editing
RNA Processing: rRNA and tRNA (rRNA operon)
Transcript (unprocessed precursor RNA): 16S, tRNA, 23S, 5S, separated by spacers
Endonucleolytic cleavages by RNases III, P, etc.
Processed RNA products
Further processing and modification of bases (maturation)
MATURE rRNAs and tRNAs
Fig. 2.20
rRNA processing in E. coli and B. subtilis
[Figure labels: RNase cleavage sites, including RNase P]
RNase M5 is similar to type II DNA topoisomerases.
tRNA processing
RNase P consists of an RNA dimer (same sequence, catalytic subunit) complexed with a dimer of a small protein. It is a ribozyme.
RNA modification
Structure of a mature tRNA
[Fig. 2.21: labels include CCA (added by CCA transferase), determining base, dihydrouridine, and regions II, III, and IV]
The most common RNA modification is U
Degradation of mRNA
Some of the enzymes involved also participate in rRNA and tRNA processing
Box 2.5
16S rRNA
1. ~1500 nucleotides
2. Many modified bases
3. Compact 3-D folding
4. Complexed with 21 proteins
5. Highly conserved
RNA processing
Introns and splicing
Splicing: Removal of parasitic DNA information from RNA
Group II introns
Group I introns
Box 2.6
Introns in eukaryotic mRNAs
Removal of parasitic DNA information from proteins
Removal of inteins
GyrA of many prokaryotes contains an intein
[Figure: VMA1 protein of yeast, showing the N-extein, intein, and C-extein; numbers on the figure: 284, 454, 738, 1071]
Box 2.6
RNA Editing: Edited ND1 mRNA of T. brucei
Key: * = deleted U; u = added U
uGAUACAAAAAAACAUGACUACAUGAUAAGUAuCAuuuuAuGuuAuuuuuGGuAGuuuuuuuACAuu uGuAuCGuuuuACAuuuG*GUCCACAGCAuCCCG***CAGCACAuG**GuGuuuuAuGuuGuuuAuuGuA uuuuuGuGGuGA*AuuuAuuGuuuA**UAUUGAuUGuAuuAuA***G*GuuAUUUGCAUCGUGGUACAG AAAAGUUAUGUGAAUAUAAAAGUGUAGAACAAUGUCUUCCGuAUUUCGACAGGUUAGAuuAuG uuA*GuGuuuGuuGuAAuGAGCAuuuGuuGuCuuuA***UGuuuuGAGuAuAuGuuGCGAuGuuGuuuGu CGuuACGuuGuGCAuuuAuGCGuuuAuuAAuuGuA****GAAuuuAC***CCGuAGuuuuAAuGGuuuGuu GuGuAuAuCAuGuAuGGuuuuGG*AuuuAGGuuGuuuGuCUCCGuuG*UUAuGAuCAuuuGAGGAA*** CG*UGACAAAuuGAuGACAuuuuuuGAuuuAuG**UUGuGGuuGuCGuAuGCAuuuGGCUUUCAuGGu uuuAuuA*GGuAUUCUUGAUGAuuuuGuuuuuGGuuuuGuuGAuuuuuuGuuGuuGuuGA***UAAuAuC AuGuuuGuuuGuuAuGGAuuGuuAuGAuuuGuuAuuuGuGGGuAAUCGuuuAuuuUAuuuGCGuuuGC** *GuGGuuuGuCAuuuuuuGAuuuAuAuGAuuuA**GuuuuuA**A**UAGuuuAAGuGGuGuuuuGuCuCGu uCGuuAGGuAuGGuGuGAGAuuGUCGuuuAuuuAGuuGuuA****UGA*****GuUGuAuuuuAuGuuuuG uuAuGAuuAuuGuuuuuGuuuuAuAGGuGAuGCAuuuGA*UCGuuuAuuuuuACGuuuGuuuGAUAuGC GuAuGAGuuuGuuGAuuuGuAAGCAAuGuuuuuuuGuuGGuuuuuuuGuuuuuG*****GuuuuGuuuGuuu GuuuG**AuuAuuuAuAuuGuGAuAuuACCAuuG****AGACCAuuAuuAuGuuAuuuuAuAGuuuGuGGu GuuGuuGuuuGCCGGGuAuA*UCAuuuGC*UUGUGuuGAACACCCCAAAGGuGA***GuAuuGuuuGu uAuuA****UGuuuuuGuGuuGGuuuAuGuuCUCGuuuACGuuuGCGuuGuGCGGAuuuuuuGCA*UAUU UGuuuAuuGGAuGuuuGuuuGCGuGGuuuuuuAuuGCAuGAuuuAGuuGC***C*GuuuuAGGuAAuAuu GAuGuuGuuuuuGGAuCCGUAGAUCGuuA*GuuuuAuAuGuG**A******GGUUAUUGuAGGAUUGUU UAAAAUUGAAUAAAAA Courtesy of Dr. Donna Koslowsky
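The notation used in this sequence (lowercase u for an added U, * for a deleted U) is enough to recover both the pre-edited transcript and the mature edited mRNA. The sketch below shows one way to do that in Python; the function name is made up, and the fragment used is just the opening stretch of the sequence above.

```python
def decode_editing(annotated: str):
    """Split an annotated edited sequence into its two underlying forms.

    Lowercase 'u' marks an added U; '*' marks a position where a U was deleted.
    Returns (pre_edited, edited, n_added, n_deleted).
    """
    # Before editing: added Us were absent, deleted Us were still present.
    pre_edited = annotated.replace("u", "").replace("*", "U")
    # After editing: added Us are ordinary Us, deleted Us are gone.
    edited = annotated.replace("*", "").upper()
    return pre_edited, edited, annotated.count("u"), annotated.count("*")


fragment = "uGAUACAAAAAAACAUGACUACAUGAUAAGUAuCAuuuuAuGuuAuuuuuGGuAGuuuuuuuACAuu"
pre_edited, edited, added, deleted = decode_editing(fragment)
print("pre-edited:", pre_edited)
print("edited:    ", edited)
print("added Us:", added, "deleted Us:", deleted)
```

Running the same function over the whole annotated sequence would give the total numbers of inserted and deleted Us for this mRNA.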
Mitochondrial RNA Editing in Trypanosomes
Editing a substrate RNA by an editosome using a guide RNA
Substrate mRNAs are transcribed from mtDNA maxicircles (5-6)
Guide RNAs are transcribed from mtDNA minicircles (~1000)
[Figure: editing of a transcript by the editosome]
DNA: 5′ ATATAAAAGCGGGAGTTA
Transcript with edited segment: 5′ AUAUAAAAGCGGGAGUUAUUUUUAUUAUUUUUU 3′
Guide RNA (tether, guide, and anchor regions): 3′ UUUUUUUUU ... UAAAAGUAAUAAA 5′
Modified from Catteneo (1990)
RNA Editing: Base Modifications
Mitochondrial and Chloroplast RNAs
Untranslated regions (UTRs) and secondary structure modifications of nuclear RNAs in eukaryotes
C→U Editing in the cox2 mRNA of maize mitochondria