Watson Molecular Biology of the Gene 5th Ed Ing

Brief Contents

The Genetic Code [table not recoverable from the extracted text]
* Chain-terminating or "nonsense" codons.
† Also used in bacteria to specify the initiator formyl-Met-tRNAfMet.

[Figure: base pairs (labels: adenine, 11.1 Å, 10.8 Å). The position and length of the bonds between the base pairs.]

Molecular Biology of the Gene
Fifth Edition

James D. Watson, Cold Spring Harbor Laboratory
Tania A. Baker, Massachusetts Institute of Technology
Stephen P. Bell, Massachusetts Institute of Technology
Alexander Gann, Cold Spring Harbor Laboratory Press
Michael Levine, University of California, Berkeley
Richard Losick, Harvard University

Pearson Benjamin Cummings
CSHL Press

Benjamin Cummings
Publisher: Jim Smith
Associate Project Editors: Alexandra Fellowes, Jeanne Z...

Nucleosomes Introduce Negative Supercoiling in Eukaryotes
Topoisomerases Can Relax Supercoiled DNA
Prokaryotes Have a Special Topoisomerase that Introduces Supercoils into DNA
Topoisomerases also Unknot and Disentangle DNA Molecules
Topoisomerases Use a Covalent Protein-DNA Linkage to Cleave and Rejoin DNA Strands
Topoisomerases Form an Enzyme Bridge and Pass DNA Segments through Each Other
DNA Topoisomers Can Be Separated by Electrophoresis
... of DNA Rings

RNA STRUCTURE
RNA Contains Ribose and Uracil and Is Usually Single-Stranded
RNA Chains Fold Back on Themselves to Form Local Regions of Double Helix Similar to A-Form DNA
RNA Can Fold Up into Complex Tertiary Structures
Some RNAs Are Enzymes
The Hammerhead Ribozyme Cleaves RNA by the Formation of a 2',3' Cyclic Phosphate
Did Life Evolve from an RNA World?
Summary
Bibliography

CHAPTER 7
Chromosomes, Chromatin, and the Nucleosome

CHROMOSOME SEQUENCE AND DIVERSITY
Chromosomes Can Be Circular or Linear
Every Cell Maintains a Characteristic Number of Chromosomes
Genome Size Is Related to the Complexity of the Organism
The E. coli Genome Is Composed almost Entirely of Genes
More Complex Organisms Have Decreased Gene Density
Genes Make Up Only a Small Proportion of the Eukaryotic Chromosomal DNA
The Majority of Human Intergenic Sequences Are Composed of Repetitive DNA

CHROMOSOME DUPLICATION AND SEGREGATION
Eukaryotic Chromosomes Require Centromeres, Telomeres, and Origins of Replication to Be Maintained during Cell Division
Eukaryotic Chromosome Duplication and Segregation Occur in Separate Phases of the Cell Cycle
Chromosome Structure Changes as Eukaryotic Cells Divide
Sister Chromatid Cohesion and Chromosome Condensation Are Mediated by SMC Proteins
Mitosis Maintains the Parental Chromosome Number
The Gap Phases of the Cell Cycle Allow Time to Prepare for the Next Cell Cycle Stage while also Checking that the Previous Stage Is Finished Correctly
Meiosis Reduces the Parental Chromosome Number
Different Levels of Chromosome Structure Can Be Observed by Microscopy

THE NUCLEOSOME
Nucleosomes Are the Building Blocks of Chromosomes
Box 7-1 Micrococcal Nuclease and the DNA Associated with the Nucleosome
Histones Are Small, Positively...

BOX 1-2 FIGURE The inheritance of a sex-linked gene in Drosophila. Genes located on sex chromosomes can express themselves differently in male and female progeny, because if there is only one X chromosome present, recessive genes on this chromosome are always expressed. Here are two crosses, both involving a recessive gene (w, for white eye) located on the X chromosome. (a) The male parent is a white-eyed (wY) fly, and the female is homozygous for red eyes (WW). (b) The male has red eyes (WY) and the female white eyes (ww). The letter Y stands here not for an allele, but for the Y chromosome, present in male Drosophila in place of a homologous X chromosome. There is no gene on the Y chromosome corresponding to the w or W gene on the X chromosome. [Diagram labels: phenotype, genotype, gametes, F1 generation, F2 generation.]
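The two crosses in this caption can be worked through mechanically. The following Python sketch is only an illustration under simple assumptions: every egg and sperm type is taken to be equally likely, W is fully dominant over w, and the helper names `cross` and `eye_color` are hypothetical names introduced here, not anything from the book.

```python
from collections import Counter
from itertools import product


def cross(mother, father):
    """Count offspring classes for one X-linked gene.

    mother: her two X-linked alleles, e.g. "WW" or "ww".
    father: his single X-linked allele followed by "Y", e.g. "wY".
    Returns a Counter keyed by (genotype, sex).
    """
    offspring = Counter()
    for egg, sperm in product(mother, father):
        if sperm == "Y":
            # A Y-bearing sperm gives a son with only the maternal allele.
            offspring[(egg + "Y", "male")] += 1
        else:
            # An X-bearing sperm gives a daughter; write the dominant
            # (uppercase) allele first, so "wW" becomes "Ww".
            offspring[("".join(sorted(egg + sperm)), "female")] += 1
    return offspring


def eye_color(genotype):
    """Red if at least one dominant W allele is present, otherwise white."""
    return "red" if "W" in genotype else "white"


# Cross (a): white-eyed male (wY) x homozygous red-eyed female (WW).
cross_a = cross("WW", "wY")
# Cross (b): red-eyed male (WY) x white-eyed female (ww).
cross_b = cross("ww", "WY")

for name, result in (("a", cross_a), ("b", cross_b)):
    for (genotype, sex), count in sorted(result.items()):
        print(name, genotype, sex, eye_color(genotype), count)
```

In cross (a) every F1 fly carries a W allele and is red-eyed, whereas in cross (b) the sons receive only the maternal w allele and are white-eyed, which is the asymmetry the caption describes.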

Gene Linkage and Crossing Over

... examples of nonrandom assortment were found as soon as a large number of mutant genes became available for breeding analysis. In every well-studied case, the number of linked groups was identical with the haploid chromosome number. For example, there are four groups of linked genes in Drosophila and four morphologically distinct chromosomes in a haploid cell.

Linkage, however, is in effect never complete. The probability that two genes on the same chromosome will remain together during meiosis ranges from just less than 100% to nearly 50%. This variation in linkage suggests that there must be a mechanism for exchanging genes on homologous chromosomes. This mechanism is called crossing over. Its cytological basis was first described by Belgian cytologist F. A. Janssens. At the start of meiosis, through the process of synapsis, the homologous chromosomes form pairs with their long axes parallel. At this stage, each chromosome has duplicated to form two chromatids. Thus, synapsis brings together four chromatids (a tetrad), which coil about one another. Janssens postulated that, possibly because of tension resulting from this coiling, two of the chromatids might sometimes break at a corresponding place on each. These events could create four broken ends, which might rejoin crossways, so that a section of each of the two chromatids would be joined to a section of the other (Figure 1-4). In this manner, recombinant chromatids might be produced that contain a segment derived from each of the original homologous chromosomes.

Formal proof of Janssens's hypothesis that chromosomes physically interchange material during synapsis came more than 20 years later, when in 1931, Barbara McClintock and Harriet B. Creighton, working at Cornell University with the corn plant Zea mays, devised an elegant cytological demonstration of chromosome breakage and rejoining (Figure 1-5).

FIGURE 1-4 Janssens's hypothesis of crossing over.

[Figure caption fragment; diagram labels: parental genotypes, knob, crossover progeny] ... c, wx progeny had to arise by crossing over between the C and wx loci. When such c, wx offspring were cytologically examined, ...

• ... Synthesis
• The Era of Genomics

Nucleic Acids Convey Genetic Information

AVERY'S BOMBSHELL: DNA CAN CARRY GENETIC SPECIFICITY

That DNA might be the key genetic molecule emerged most unexpectedly from studies on pneumonia-causing bacteria. In 1928 English microbiologist Frederick Griffith made the startling observation that nonvirulent strains of the bacteria became virulent when mixed with their heat-killed pathogenic counterparts. That such transformations from nonvirulence to virulence represented hereditary changes was shown by using descendants of the newly pathogenic strains to transform still other nonpathogenic bacteria. This raised the possibility that when pathogenic cells are killed by heat, their genetic components remain undamaged. Moreover, once liberated from the heat-killed cells, these components can pass through the cell wall of the living recipient cells and undergo subsequent genetic recombination with the recipient's genetic apparatus (Figure 2-1).

Subsequent research has confirmed this genetic interpretation. Pathogenicity reflects the action of the capsule gene, which codes for a key enzyme involved in the synthesis of the carbohydrate-containing capsule that surrounds most pneumonia-causing bacteria. When the S (smooth) allele of the capsule gene is present, a capsule that is necessary for pathogenesis is formed around the cell (the formation of a capsule also gives a smooth appearance to the colonies formed from these cells). When the R (rough) allele of this gene is present, no capsule is formed and the respective cells are not pathogenic.

Within several years after Griffith's original observation, extra...

[Figure: nucleotide structures. Labels: thymine (DNA); cytosine (DNA); sugar: deoxyribose; nucleotide: deoxycytidine-5'-phosphate, deoxythymidine-5'-phosphate; nucleoside: deoxyguanosine.]

WEAK BONDS IN BIOLOGICAL SYSTEMS

The main types of weak bonds important in biological systems are the van der Waals bonds, hydrophobic bonds, hydrogen bonds, and ionic bonds. Sometimes, as we shall soon see, the distinction between a hydrogen bond and an ionic bond is arbitrary.

Weak Bonds Have Energies between 1 and 7 kcal/mol

The weakest bonds are the van der Waals bonds. These have energies (1 to 2 kcal/mol) only slightly greater than the kinetic energy of heat motion. The energies of hydrogen and ionic bonds range between 3 and 7 kcal/mol. In liquid solutions, almost all molecules form a number of weak bonds to nearby atoms. All molecules are able to form van der Waals bonds, whereas hydrogen and ionic bonds can form only between molecules that have a net charge (ions) or in which the charge is unequally distributed. Some molecules thus have the capacity to form several types of weak bonds. Energy considerations, however, tell us that molecules always have a greater tendency to form the stronger bond.

Weak Bonds Are Constantly Made and Broken at Physiological Temperatures

The energy of the strongest weak bond is only about ten times larger than the average energy of kinetic motion (heat) at 25°C (0.6 kcal/mol). As there is a significant spread in the energies of kinetic motion, many molecules with sufficient kinetic energy to break the strongest weak bond always exist at physiological temperatures.
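To get a feel for how a "significant spread" in kinetic energies translates into bond breaking, the following Python sketch uses the Boltzmann factor exp(-E/RT) as a rough estimate of the fraction of molecules with at least energy E. This estimate is an assumption introduced here for illustration; the text itself gives no formula. The function name `boltzmann_factor` and the gas-constant value are likewise not from the book.

```python
import math

# Gas constant in kcal/(mol*K); RT at 25 degrees C is then about 0.6 kcal/mol,
# matching the average thermal energy quoted in the text.
R_KCAL = 1.987e-3


def boltzmann_factor(energy_kcal_per_mol, temp_celsius=25.0):
    """Rough fraction of molecules with at least the given energy.

    Assumption: the fraction is approximated by exp(-E / RT).
    """
    rt = R_KCAL * (temp_celsius + 273.15)
    return math.exp(-energy_kcal_per_mol / rt)


# Compare a van der Waals bond (about 1 kcal/mol) with the range quoted
# for hydrogen and ionic bonds (3 to 7 kcal/mol).
for energy in (1.0, 3.0, 7.0):
    print(f"{energy:.0f} kcal/mol: fraction ~ {boltzmann_factor(energy):.2e}")
```

Under this assumption the fraction for a 7 kcal/mol bond is small but far from zero once the enormous number of molecules in a cell is taken into account, which is consistent with the statement that such bonds are constantly made and broken.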

The Distinction between Polar and Nonpolar Molecules

All forms of weak interactions are based on attractions between electric charges. The separation of electric charges can be permanent or temporary, depending on the atoms involved. For example, the oxygen molecule (O:O) has a symmetric distribution of electrons between its two oxygen atoms, so each of its two atoms is uncharged. In contrast, there is a nonuniform distribution of charge in water (H:O:H), in which the bond electrons are unevenly shared (Figure 3-3). They are held more strongly by the oxygen atom, which thus carries a considerable negative charge, whereas the two hydrogen atoms together have an equal amount of positive charge. The center of the positive charge is on one side of the center of the negative charge. A combination of separated positive and negative charges is called an electric dipole moment. Unequal electron sharing reflects dissimilar affinities of the bonding atoms for electrons. Atoms that have a tendency to gain electrons are called electronegative atoms. Electropositive atoms have a tendency to give up electrons.

Molecules (such as H2O) that have a dipole moment are called polar molecules. Nonpolar molecules are those with no effective dipole moments. In methane (CH4), for example, the carbon and hydrogen atoms have similar affinities for their shared electron pairs, so neither the carbon nor the hydrogen atom is noticeably charged.

The distribution of charge in a molecule can also be affected by the presence of nearby molecules, particularly if the affected molecule is polar. The effect may cause a nonpolar molecule to acquire a slightly polar character. If the second molecule is not polar, its presence will still alter the nonpolar molecule, establishing a fluctuating charge distribution. Such induced effects, however, give rise to a much smaller separation of charge than is found in polar molecules, resulting in smaller interaction energies and correspondingly weaker chemical bonds.

The Importance of Weak Chemical Interactions

FIGURE 3-3 The structure of a water molecule. [Labels: van der Waals radius of hydrogen; covalent bond length; van der Waals radius of oxygen; direction of dipole moment.]

FIGURE 3-4 Variation of van der Waals forces with distance. The atoms shown in this diagram are atoms of the inert rare gas argon. [Labels: 10 Å; about 4 Å: van der Waals attraction just balanced by repulsive forces, owing to interpenetration of outer electron shells.] (Source: Adapted from Pauling L. 1953. General chemistry, 2nd edition, p. 322. Copyright 1953 by W.H. Freeman. Used with ...)

FIGURE 3-5 Drawings of several molecules with the van der Waals radii of the atoms shown in purple, blue, and orange. [Labels: acetate, glycine, guanine.]

Van der Waals Forces

Van der Waals bonding arises from a nonspecific attractive force originating when two atoms come close to each other. It is based not on the existence of permanent charge separations, but rather on the induced fluctuating charges caused by the nearness of molecules. It therefore operates between all types of molecules, nonpolar as well as polar. It depends heavily on the distance between the interacting groups, since the bond energy is inversely proportional to the sixth power of distance (Figure 3-4).

There also exists a more powerful van der Waals repulsive force, which comes into play at even shorter distances. This repulsion is caused by the overlapping of the outer electron shells of the atoms involved. The van der Waals attractive and repulsive forces balance at a certain distance specific for each type of atom. This distance is the so-called van der Waals radius (Table 3-2 and Figure 3-5). The van der Waals bonding energy between two atoms separated by the sum of their van der Waals radii increases with the size of the respective atoms. For two average atoms, it is only about 1 kcal/mol, which is just slightly more than the average thermal energy of molecules at room temperature (0.6 kcal/mol).

This means that van der Waals forces are an effective binding force at physiological temperatures only when several atoms in a given molecule are bound to several atoms in another molecule. Then the energy of interaction is much greater than the dissociating tendency resulting from random thermal movements. For several atoms to interact effectively, the molecular fit must be precise, since the distance separating any two interacting atoms must not be much greater than the sum of their van der Waals radii (Figure 3-6). The strength of interaction rapidly approaches zero when this distance is only slightly exceeded. Thus, the strongest type of van der Waals contact arises when a molecule contains a cavity exactly complementary in shape to a protruding group of another molecule, as is the case with an antigen and its specific antibody (Figure 3-7). In this instance, the binding energies sometimes can be as large as 20 to 30 kcal/mol, so that antigen-antibody complexes seldom fall apart. The bonding pattern of polar molecules is rarely dominated by van der Waals interactions, since such molecules can acquire a lower energy state (lose more free energy) by forming other types of bonds.
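The distance dependence described in this section can be sketched numerically. The Python example below uses the Lennard-Jones 12-6 potential, a common textbook model that combines an attraction proportional to 1/r^6 with a steeper short-range repulsion. The choice of this particular model, the function name `lennard_jones`, and the default parameters (a well depth of 1 kcal/mol and an optimal separation of 4 Å, loosely based on the figures quoted above) are assumptions made for illustration only.

```python
def lennard_jones(r, well_depth=1.0, r_min=4.0):
    """Interaction energy (kcal/mol) of two atoms separated by r (angstroms).

    Assumed model: 12-6 potential with its minimum, -well_depth, at r_min.
    The r**-6 term is the attraction described in the text; the r**-12 term
    stands in for the repulsion of overlapping electron shells.
    """
    ratio = r_min / r
    return well_depth * (ratio ** 12 - 2.0 * ratio ** 6)


# Energy at the optimal contact distance, slightly closer, and farther away.
for r in (3.2, 4.0, 5.0, 8.0):
    print(f"r = {r:.1f} A: E = {lennard_jones(r):+.3f} kcal/mol")
```

With these assumed parameters the energy is strongly repulsive at 3.2 Å, most favorable at 4 Å, and only about 3% of its maximum attraction at 8 Å, which illustrates why the molecular fit must be precise for several atoms to interact effectively.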

TABLE 3-2 Van der Waals Radii of the Atoms in Biological Molecules

Atom                                    van der Waals radius (Å)
H                                       1.2
N                                       1.5
O                                       1.4
P                                       1.9
S                                       1.85
CH3 group                               2.0
Half thickness of aromatic molecule     1.7

FIGURE 3-6 The arrangement of molecules in a layer of a crystal formed by the amino acid glycine. The packing of the molecules is determined by the van der Waals radii of the groups, except for the N-H···O contacts, which are shortened by the formation of hydrogen bonds. (Source: Adapted from Pauling L. 1960. The nature of the chemical bond and the structure of molecules and crystals: An introduction to modern structural chemistry, 3rd edition, p. 262. Copyright © 1960 Cornell University. Used by permission of the publisher.)

Hydrogen Bonds

A hydrogen bond is formed between a covalently bound donor hydrogen atom with some positive charge and a negatively charged, covalently bound acceptor atom (Figure 3-8). For example, the hydrogen atoms of the amino (-NH2) group are attracted by the negatively charged keto (-C=O) oxygen atoms. Sometimes, the hydrogen-bonded atoms belong to groups with a unit of charge (such as NH3+ or COO-). In other cases, both the donor hydrogen atoms and the negative acceptor atoms have less than a unit of charge.

The biologically most important hydrogen bonds involve hydrogen atoms covalently bound to oxygen atoms (O-H) or nitrogen atoms (N-H). Likewise, the negative acceptor atoms are usually nitrogen or oxygen. Table 3-3 lists some of the most important hydrogen bonds. In the absence of surrounding water molecules, bond energies range between 3 and 7 kcal/mol, the stronger bonds involving the greater charge differences between donor and acceptor atoms. Hydrogen bonds are thus weaker than covalent bonds, yet considerably stronger than van der Waals bonds. A hydrogen bond, therefore, will hold two atoms closer together than the sum of their van der Waals radii, but not so close together as a covalent bond would hold them.

Hydrogen bonds, unlike van der Waals bonds, are highly directional. In the strongest hydrogen bonds, the hydrogen atom points directly at the acceptor atom (Figure 3-9). If it points more than 30° away, the bond energy is much less. Hydrogen bonds are also much more specific than van der Waals bonds, since they demand the existence of molecules with complementary donor hydrogen and acceptor groups.

Some Ionic Bonds Are Hydrogen Bonds

Many organic molecules possess ionic groups that contain one or more units of net positive or negative charge. Mononucleotides, for example, contain negatively charged phosphate groups, whereas each amino acid (except proline) has a negative carboxyl group (COO-) and a positive amino group (NH3+), both of which carry a unit of charge. These charged groups are usually neutralized by nearby, oppositely charged groups.

TABLE 3-3 Approximate Bond Lengths of Biologically Important Hydrogen Bonds

Bond            Approximate H bond length (Å)
O-H···O         2.70 ± 0.10
O-H···O-        2.63 ± 0.10
O-H···N         2.88 ± 0.13
N-H···O         3.04 ± 0.13
N+-H···O        2.93 ± 0.10
N-H···N         3.10 ± 0.13



FIGURE 3-7 Antibody-antigen interaction. The structure shows the complex between Fab D1.3 and lysozyme. (Fischmann T.O., Bentley G.A., Bhat T.N., Boulot G., Mariuzza R.A., Phillips S.E., Tello D., and Poljak R.J. 1991. J. Biol. Chem. 266: 12915.)

[Figure label: hydrogen bond between two hydroxyl groups]

[Figure caption fragment] ... steps (A → B and C → D) do not favor the A → E direction of the reaction, since they have small positive ΔG values. However, they are insignificant owing to the very large negative ΔG values provided in steps B → C and D → E. Therefore, the overall reaction favors the A → E conversion.

The construction of a large molecule from smaller building blocks often requires the input of free energy. Yet a biosynthetic pathway, like a degradative pathway, would not exist if it were not characterized by a net decrease in free energy. This means that many biosynthetic pathways demand an external source of free energy. These free-energy sources are the high-energy compounds. The making of many biosynthetic bonds is coupled with the breakdown of a high-energy bond, so that the net change of free energy is always negative. Thus, high-energy bonds in cells generally have a very short life. Almost as soon as they are formed during a degradative reaction, they are enzymatically broken down to yield the energy needed to drive another reaction to completion.

Not all the steps in a biosynthetic pathway require the breakdown of a high-energy bond. Often, only one or two steps involve such a bond. Sometimes this is because the ΔG, even in the absence of an externally added high-energy bond, favors the biosynthetic direction. In other cases, ΔG is effectively zero or may even be slightly positive. These small positive ΔG values, however, are not significant so long as they are followed by a reaction characterized by the hydrolysis of a high-energy bond. Rather, it is the sum of all the free-energy changes in a pathway that is significant, as shown in Figure 4-3. It does not really matter that the K_eq of a specific biosynthetic step is slightly (80:20) in favor of degradation if the K_eq of the succeeding step is 100:1 in favor of the forward biosynthetic direction.

Likewise, not all the steps in a degradative pathway generate high-energy bonds. For example, only two steps in the lengthy glycolytic (Embden-Meyerhof) breakdown of glucose generate ATP. Moreover, there are many degradative pathways that have one or more steps requiring the breakdown of a high-energy bond. The glycolytic breakdown of glucose is again an example. It uses up two molecules of ATP for every four that it generates. Here, of course, as in every energy-yielding degradative process, more high-energy bonds must be made than consumed.
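The 80:20 and 100:1 example can be checked with a few lines of arithmetic. The Python sketch below converts each equilibrium ratio to a ΔG with the standard relation ΔG = -RT ln K_eq and then sums the steps. The temperature of 25°C, the gas-constant value, and the names `delta_g_from_ratio`, `steps`, and `overall_keq` are assumptions introduced here for illustration.

```python
import math

# RT in kcal/mol at 25 degrees C (assumed temperature).
RT = 1.987e-3 * 298.15


def delta_g_from_ratio(forward, reverse):
    """Free-energy change (kcal/mol) of a step with equilibrium ratio forward:reverse."""
    return -RT * math.log(forward / reverse)


# Step 1 lies 80:20 in favor of degradation, i.e. 20:80 in the biosynthetic
# direction; step 2 lies 100:1 in favor of the biosynthetic direction.
steps = [(20, 80), (100, 1)]
step_dgs = [delta_g_from_ratio(fwd, rev) for fwd, rev in steps]

# Only the sum over the pathway decides the overall direction.
total_dg = sum(step_dgs)
overall_keq = math.exp(-total_dg / RT)

print("step ΔG values (kcal/mol):", [round(dg, 2) for dg in step_dgs])
print("pathway ΔG (kcal/mol):", round(total_dg, 2))
print("overall equilibrium ratio:", round(overall_keq, 1))
```

The first step has a small positive ΔG and the second a larger negative one, so the summed ΔG is negative and the two-step sequence as a whole still favors biosynthesis.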

Peptide Bonds Hydrolyze Spontaneously

The formation of a dipeptide and a water molecule from two amino acids requires a ΔG of 1 to 4 kcal/mol, depending on which amino acids are being joined. These positive ΔG values by themselves tell us that polypeptide chains cannot form from free amino acids. In addition, we must take into account the fact that water molecules have a much, much higher concentration than any other cellular molecules (generally more than 100 times higher). All equilibrium reactions in which water participates are thus strongly pushed in the direction that consumes water molecules. This is easily seen in the definition of equilibrium constants. For example, the reaction forming a dipeptide,

\[ \text{amino acid (A)} + \text{amino acid (B)} \longrightarrow \text{dipeptide (A-B)} + \mathrm{H_2O} \qquad \text{(Equation 4-3)} \]

has the following equilibrium constant:

\[ K_{eq} = \frac{\mathrm{conc}_{A\text{-}B} \times \mathrm{conc}_{\mathrm{H_2O}}}{\mathrm{conc}_{A} \times \mathrm{conc}_{B}} \qquad \text{(Equation 4-4)} \]

where concentrations are given in moles per liter. Thus, for a given K_eq value (related to ΔG by the formula ΔG = -RT ln K_eq), a much greater concentration of water means a correspondingly smaller concentration of the dipeptide. The relative concentrations are, therefore, very important. In fact, a simple calculation shows that hydrolysis may often proceed spontaneously even when the ΔG for the nonhydrolytic reaction is -3 kcal/mol. Thus, in theory, proteins are unstable and, given sufficient time, will spontaneously degrade to free amino acids. On the other hand, in the absence of specific enzymes, these spontaneous rates are too slow to have a significant effect on cellular metabolism. That is, once a protein is made, it remains stable unless its degradation is catalyzed by a specific enzyme.
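The effect of the water concentration in Equation 4-4 can be made concrete. The Python sketch below rearranges that equation to give the equilibrium dipeptide concentration for fixed amino acid and water concentrations. The amino acid concentrations (1 mM each) are made-up illustrative values, the temperature is assumed to be 25°C, pure water is taken as roughly 55.5 M, and the function names are hypothetical.

```python
import math

RT = 1.987e-3 * 298.15   # kcal/mol at 25 degrees C (assumed temperature)
WATER_MOLAR = 55.5       # approximate molar concentration of pure water


def keq_from_delta_g(delta_g):
    """Equilibrium constant from ΔG (kcal/mol) using ΔG = -RT ln K_eq."""
    return math.exp(-delta_g / RT)


def dipeptide_concentration(delta_g, conc_a, conc_b, conc_water=WATER_MOLAR):
    """Equilibrium dipeptide concentration (mol/L) from Equation 4-4.

    Rearranged: conc_AB = K_eq * conc_A * conc_B / conc_H2O,
    with the reactant and water concentrations treated as fixed.
    """
    return keq_from_delta_g(delta_g) * conc_a * conc_b / conc_water


# Illustrative (made-up) values: ΔG = +3 kcal/mol, each amino acid at 1 mM.
in_cell_water = dipeptide_concentration(3.0, 1e-3, 1e-3)
at_one_molar_water = dipeptide_concentration(3.0, 1e-3, 1e-3, conc_water=1.0)

print(f"dipeptide at 55.5 M water: {in_cell_water:.2e} M")
print(f"dipeptide at 1 M water:    {at_one_molar_water:.2e} M")
```

Under these assumptions the high water concentration lowers the equilibrium dipeptide concentration by a further factor of about 55 on top of the already unfavorable K_eq.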

Coupling of Negative with Positive ΔG

Free energy must be added to amino acids before they can be united to form proteins. How this happens became clear with the discovery of the fundamental role of ATP as an energy donor. ATP contains three phosphate groups attached to an adenosine molecule (adenosine-O-Ⓟ~Ⓟ~Ⓟ). When one or two of the terminal ~Ⓟ groups are broken off by hydrolysis, there is a significant decrease of free energy.

\[ \text{Adenosine-O-Ⓟ}{\sim}\text{Ⓟ}{\sim}\text{Ⓟ} + \mathrm{H_2O} \longrightarrow \text{Adenosine-O-Ⓟ}{\sim}\text{Ⓟ} + \text{Ⓟ} \quad (\Delta G = -7\ \text{kcal/mol}) \qquad \text{(Equation 4-5)} \]

\[ \text{Adenosine-O-Ⓟ}{\sim}\text{Ⓟ}{\sim}\text{Ⓟ} + \mathrm{H_2O} \longrightarrow \text{Adenosine-O-Ⓟ} + \text{Ⓟ}{\sim}\text{Ⓟ} \quad (\Delta G = -8\ \text{kcal/mol}) \qquad \text{(Equation 4-6)} \]

\[ \text{Adenosine-O-Ⓟ}{\sim}\text{Ⓟ} + \mathrm{H_2O} \longrightarrow \text{Adenosine-O-Ⓟ} + \text{Ⓟ} \quad (\Delta G = -6\ \text{kcal/mol}) \qquad \text{(Equation 4-7)} \]

All these breakdown reactions have negative ΔG values considerably greater in absolute value (numerical value without regard to sign) than the positive ΔG values accompanying the formation of polymeric molecules from their monomeric building blocks. The essential trick underlying these biosynthetic reactions, which by themselves have a positive ΔG, is that they are coupled with the breakage of high-energy bonds, characterized by negative ΔG of greater absolute value. Thus, during protein synthesis, the formation of each peptide bond (ΔG = +0.5 kcal/mol) is coupled with the breakdown of ATP to AMP and pyrophosphate, which has a ΔG of -8 kcal/mol (see Equation 4-6). This results in a net ΔG of -7.5 kcal/mol, more than sufficient to ensure that the equilibrium favors protein synthesis rather than breakdown.
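The bookkeeping behind the net ΔG of -7.5 kcal/mol, and what it implies for the position of equilibrium, can be sketched in Python. The ΔG values are the ones quoted in the text; the temperature of 25°C, the relation ΔG = -RT ln K_eq applied to the coupled reaction as a whole, and all names in the code are assumptions made for illustration.

```python
import math

RT = 1.987e-3 * 298.15   # kcal/mol at 25 degrees C (assumed temperature)

# Values quoted in the text (kcal/mol).
PEPTIDE_BOND_DG = +0.5           # formation of one peptide bond
ATP_TO_AMP_PPI_DG = -8.0         # ATP -> AMP + pyrophosphate (Equation 4-6)

# Free-energy changes of coupled reactions add.
net_dg = PEPTIDE_BOND_DG + ATP_TO_AMP_PPI_DG


def keq_from_delta_g(delta_g):
    """Equilibrium constant from ΔG (kcal/mol) using ΔG = -RT ln K_eq."""
    return math.exp(-delta_g / RT)


print("net ΔG (kcal/mol):", net_dg)
print(f"K_eq uncoupled: {keq_from_delta_g(PEPTIDE_BOND_DG):.2f}")
print(f"K_eq coupled:   {keq_from_delta_g(net_dg):.2e}")
```

The uncoupled equilibrium constant is below 1, whereas the coupled one is several orders of magnitude above 1, which is the quantitative sense in which coupling ensures that equilibrium favors synthesis.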

ACTIVATION OF PRECURSORS IN GROUP TRANSFER REACTIONS

When ATP is hydrolyzed to ADP and phosphate, most of the free energy is liberated as heat. Because heat energy cannot be used to make covalent bonds, a coupled reaction cannot be the result of two completely separate reactions, one with a positive ΔG, the other with a negative ΔG. Instead, a coupled reaction is achieved by two or more successive reactions. These are always group-transfer reactions: reactions, not involving oxidations or reductions, in which molecules exchange functional groups. The enzymes that catalyze these reactions are called transferases. Consider the reaction

\[ (\text{A-X}) + (\text{B-Y}) \longrightarrow (\text{A-B}) + (\text{X-Y}) \qquad \text{(Equation 4-8)} \]

In this example, group X is exchanged with component B. Group-transfer reactions are arbitrarily defined to exclude water as a participant. When water is involved,

\[ (\text{A-B}) + (\text{H-OH}) \longrightarrow (\text{A-OH}) + (\text{B-H}) \qquad \text{(Equation 4-9)} \]

this reaction is called a hydrolysis, and the enzymes involved are called hydrolases.

The group-transfer reactions that interest us here are those involving groups attached by high-energy bonds. When such a high-energy group is transferred to an appropriate acceptor molecule, it becomes attached to the acceptor by a high-energy bond. Group transfer thus allows the transfer of high-energy bonds from one molecule to another. For example, Equations 4-10 and 4-11 show how energy present in ATP is transferred to form GTP, one of the precursors used in RNA synthesis:

\[ \text{Adenosine-Ⓟ}{\sim}\text{Ⓟ}{\sim}\text{Ⓟ} + \text{Guanosine-Ⓟ} \longrightarrow \text{Adenosine-Ⓟ}{\sim}\text{Ⓟ} + \text{Guanosine-Ⓟ}{\sim}\text{Ⓟ} \qquad \text{(Equation 4-10)} \]

\[ \text{Adenosine-Ⓟ}{\sim}\text{Ⓟ}{\sim}\text{Ⓟ} + \text{Guanosine-Ⓟ}{\sim}\text{Ⓟ} \longrightarrow \text{Adenosine-Ⓟ}{\sim}\text{Ⓟ} + \text{Guanosine-Ⓟ}{\sim}\text{Ⓟ}{\sim}\text{Ⓟ} \qquad \text{(Equation 4-11)} \]

The high-energy Ⓟ~Ⓟ group on GTP allows it to unite spontaneously with another molecule. GTP is thus an example of what is called an activated molecule; correspondingly, the process of transferring a high-energy group is called group activation.
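The two transfers in Equations 4-10 and 4-11 can be mirrored as simple bookkeeping on phosphate counts. The Python sketch below is a toy model only: the `Nucleotide` class and the `transfer_phosphate` function are hypothetical names introduced here, and the model tracks nothing but how many phosphate groups each nucleoside carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nucleotide:
    nucleoside: str   # e.g. "Adenosine" or "Guanosine"
    phosphates: int   # 1 = monophosphate, 2 = diphosphate, 3 = triphosphate


def transfer_phosphate(donor, acceptor):
    """Move one terminal phosphate group from donor to acceptor.

    The total number of phosphate groups is conserved, as in a
    group-transfer reaction that excludes water as a participant.
    """
    if donor.phosphates < 2:
        raise ValueError("donor has no high-energy phosphate to give")
    if acceptor.phosphates >= 3:
        raise ValueError("acceptor already carries three phosphates")
    return (
        Nucleotide(donor.nucleoside, donor.phosphates - 1),
        Nucleotide(acceptor.nucleoside, acceptor.phosphates + 1),
    )


atp = Nucleotide("Adenosine", 3)
gmp = Nucleotide("Guanosine", 1)

# Equation 4-10: ATP + GMP -> ADP + GDP
adp, gdp = transfer_phosphate(atp, gmp)
# Equation 4-11: ATP + GDP -> ADP + GTP
adp_again, gtp = transfer_phosphate(atp, gdp)

print(adp, gdp, gtp)
```

Two successive transfers from two ATP molecules thus convert one GMP into GTP, which is the sequence the equations describe.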

ATP Versatility in Group Transfer

ATP synthesis has a key role in the controlled trapping of the energy of molecules that serve as energy donors. In both oxidative and photosynthetic phosphorylations, energy is used to synthesize ATP from ADP and phosphate:

\[ \text{Adenosine-Ⓟ}{\sim}\text{Ⓟ} + \text{Ⓟ} + \text{energy} \longrightarrow \text{Adenosine-Ⓟ}{\sim}\text{Ⓟ}{\sim}\text{Ⓟ} \qquad \text{(Equation 4-12)} \]

Because ATP is the original biological recipient of high-energy groups, it must be the starting point of a variety of reactions in which high-energy groups are transferred to low-energy molecules to give them the potential to react spontaneously. ATP's central role utilizes the fact that it contains two high-energy bonds whose splitting releases specific groups. This is seen in Figure 4-4, which shows three important groups arising from ATP: Ⓟ~Ⓟ, a pyrophosphate group; ~AMP, an adenosyl monophosphate group; and ~Ⓟ, a phosphate group. It is important to notice that these high-energy groups retain their high-energy quality only when transferred to an appropriate acceptor molecule. For example, although the transfer of a ~Ⓟ group to ...


CHAPTER

Weak and Strong Bonds Determine Macromolecular Structure

DNA, RNA, and protein are all polymers of simple building blocks. As we learned in Chapter 4, synthesis of these polymers depends on the controlled, catalyzed linkage of activated building blocks. For DNA and RNA, these building blocks are nucleotides (see Figure 2-11). For proteins, the building blocks are the 20 amino acids donated from their activated intermediates, the donor tRNAs. Assembly of these chains requires breakage of multiple high-energy bonds for the addition of each building block. For all these molecules, the order of the constituent building blocks determines their genetic and biochemical function.

Weak bonds play a critical role in determining the structure and function of these polymers. The primary information of RNA, DNA, and proteins is the order of their covalently-linked building blocks. Nevertheless, it is only after they have formed extensive additional weak bonds between their different parts that these polymers adopt characteristic shapes that allow them to carry out their functions. The hydrogen bonds and ionic, hydrophobic, and van der Waals interactions described in Chapter 3 direct proteins to form critical binding sites and DNA to assume its double helical structure. Indeed, the disruption of these interactions (by heat or detergent, for example) without disruption of covalent bonds completely destroys the activity of all but a few biological polymers.

In this chapter we briefly describe the structure of biological macromolecules and the forces that control their shape. DNA and RNA are discussed briefly here and more thoroughly in Chapter 6. We then focus on the diverse structures of proteins. The final sections of the chapter focus on the interactions between proteins and nucleic acids, an activity central to many of the processes we will encounter in this book, and the control of protein function by allostery.

OUTLINE
• Higher-Order Structures Are Determined by Intra- and Intermolecular Interactions
• The Specific Conformation of a Protein Results from Its Pattern of Hydrogen Bonds
• Most Proteins Are Modular, Containing Two or Three Domains
• Weak Bonds Correctly Position Proteins along DNA and RNA Molecules
• Allostery: Regulation of a Protein's Function by Changing Its Shape

HIGHER-ORDER STRUCTURES ARE DETERMINED BY INTRA- AND INTERMOLECULAR INTERACTIONS

DNA Can Form a Regular Helix

DNA molecules usually have regular helical configurations. This is because most DNA molecules contain two antiparallel polynucleotide strands that have complementary structures (see Chapter 6 for more details). Both internal and external noncovalent bonds stabilize the structure. The two strands are held together by hydrogen bonds between pairs of complementary purines and pyrimidines (Figure 5-1). Adenine is always hydrogen-bonded to thymine, whereas guanine is hydrogen-bonded to cytosine. In addition, virtually all the surface atoms in the sugar and phosphate groups form bonds to water molecules.

FIGURE 5-1 The hydrogen-bonded base pairs of DNA. The figure shows the position and length of the hydrogen bonds between the base pairs. The covalent bonds between the atoms within each base are shown, but double and single bonds are not distinguished (see Figure 6-6 in the next chapter). [Panel labels: thymine and adenine, 11.1 Å; cytosine and guanine, 10.8 Å.]

The purine-pyrimidine base pairs are found in the center of the DNA molecule. This arrangement allows their flat surfaces to stack on top of each other, creating shared (π-π) electrons between the bases and limiting their contact with water. This arrangement, known as base stacking, would be much less satisfactory if only one polynucleotide chain were present. Because pyrimidines are smaller than the purines, single-stranded DNA would result in the unfavorable exposure of hydrophobic surface between adjacent bases. The presence of complementary base pairs in double-helical DNA makes a regular structure possible, since each base pair is of the same size.

The double-helical DNA molecule is very stable for two reasons. First, disruption of the double helix would bring the hydrophobic purines and pyrimidines into greater contact with water, which is very unfavorable. Second, double-stranded DNA molecules contain a very large number of weak bonds, arranged so that most of them cannot break without simultaneously breaking many others. Thus, for example, even though thermal motion is constantly breaking apart the purine-pyrimidine pairs at the ends of each molecule, the two chains do not usually fall apart because other hydrogen bonds in the molecule are still intact (Figure 5-2). Once a given bond is broken, the most likely next event is the reforming of the same hydrogen bonds to restore the original molecular configuration, rather than the breaking of additional bonds. Sometimes, of course, the first breakage is followed by a second, and so forth. Such multiple breaks, however, are quite rare, so that double helices held together by more than ten base pairs are very stable at room temperature.

FIGURE 5-2 The breaking of terminal base pairs in DNA by random thermal motion. The figure shows that once some bonds have broken at the termini, they can reform (lower left) or additional bonds can break.

When DNA strands do come apart without reforming, this typically starts at one end of the molecule and proceeds inward. This is because the interactions between the bases at the end of the DNA are the least supported by adjacent interactions. That is, they have only one neighboring base pair to help secure the interaction. As described in more detail below, the same principle (the use of multiple weak bonds) governs the stability of proteins.

Ordered collections of secondary bonds become less and less stable as their temperature is raised above physiological temperatures. At elevated temperatures, the simultaneous breakage of several weak bonds is more frequent. After a significant number have broken, a molecule usually loses its original form (the process of denaturation) and assumes an inactive, or denatured, configuration. Thus, as the temperature rises, more interactions are required to maintain the double-stranded nature of DNA.

RNA Forms a Wide Variety of Structures

In contrast to the highly regular structure of the DNA double helix, RNA is usually found as a single-stranded molecule. Some RNA molecules (such as messenger RNAs) function as transient carriers of genetic information and are constantly associated with proteins, and thus do not have an independent, stable, tertiary fold. Other RNA molecules fold into unique tertiary structures. For these RNAs, intramolecular interactions between distinct regions lead to the formation of specific elements of secondary structure. These interactions are principally between the bases of the RNA and include traditional Watson-Crick base pairing, unusual base pairing found only in RNA, and hydrophobic base stacking.

RNA differs from DNA in that the ribose sugar of the backbone carries a 2'-hydroxyl group. In the folded structure of RNA molecules, these 2'-hydroxyl groups often participate in interactions that stabilize the structure. The binding of divalent metal ions (such as Mg2+, Mn2+, and Ca2+) to the RNA is often critical to the formation of a stable, folded conformation because these ions can shield the negative charge of the RNA backbone, allowing regions of the molecule to pack more closely together.

The precisely folded, compact nature of RNA tertiary structure is illustrated by the high-resolution structures of some important RNA molecules, for example tRNA, a molecule that participates in protein synthesis (see Figure 14-16). These structures reveal that base stacking plays a major role in RNA conformation: for example, 72 out of the 76 bases in tRNA are involved in stacking interactions. As in the DNA double helix structure, stacking of RNA bases on top of one another is energetically favorable. For this reason, short base-paired, helical regions of RNA stack on top of one another to form longer, discontinuous helical regions. These regions of stacked helices then pack against each other via additional tertiary interactions.

We have only briefly discussed the features of DNA and RNA structure here. In Chapter 6, we will describe in much more detail the interactions that govern the structures of these critical cellular molecules. For the remainder of this chapter we focus on the forces influencing the structure of proteins.

Chemical Features of Protein Building Blocks

In contrast to the four nucleotide building blocks used for RNA or DNA, the 20 amino acid building blocks used for protein synthesis are highly diverse. The common structural features of the amino acids are the central carbon (Cα) linked to a hydrogen, a primary amino group, and a carboxylic acid group (Figure 5-3). The fourth linkage is to a variable side chain called the R group.

FIGURE 5-3 The common structural features of amino acids. [Labels: amino group, side chain (R), carboxyl group.]

The R groups of the 20 amino acids can be categorized by their size, shape, and chemical composition (Figure 5-4). The R groups fall into four categories: neutral-nonpolar, neutral-polar, acidic, and basic. The neutral-nonpolar side chains are composed of simple carbon chains or aromatic rings and make principally hydrophobic contacts. The neutral-polar side chains include hydroxyl, sulfhydryl, amide, and imidazole ...

[Figure 5-4: structural formulas of amino acids. Labels include: neutral-polar amino acids; valine (Val, V); isoleucine (Ile, I); serine (Ser, S); threonine (Thr, T); asparagine (Asn, N); tyrosine (Tyr, Y); glutamine (Gln, Q); tryptophan (Trp, W).]

... φ and ψ angles of rotation about the Cα-N and Cα-C bonds. The shaded areas represent the planes of the peptide bonds. (Source: Illustration, Irving Geis. Rights owned by Howard Hughes Medical Institute. Not to be reproduced without permission.)



FIGURE 5-7 Four levels of protein structure. [Panel labels: primary, secondary, tertiary.] (Source: Adapted from ...)

... parallel. If the α helices remained perfectly rigid, they could stay in contact for only a few residues. But by supercoiling in a left-handed direction, neatly packed, highly stable coiled-coils are created (Figure 5-15).

One example of a coiled-coil is found in the leucine zipper family of DNA-binding proteins. These DNA-binding factors have two subunits that come together to form a dimer through the use of a coiled-coil region. This coiled-coil region is called a leucine zipper due to the repeating appearance of leucine or other amino acids with an aliphatic side group, such as valine or isoleucine. These leucines appear in a regular pattern, as follows. Two turns of an α helix represent a segment of approximately seven amino acids. The aliphatic amino acids are located within each seven-amino-acid stretch at the first and fourth positions. This positioning ensures that one side of the α helix is aliphatic, since the first and fourth positions will be on the same face of the helix. These faces in two adjacent helices are packed against each other, burying their hydrophobic side chains away from the aqueous environment.
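The heptad pattern described above can be checked mechanically. The following is a minimal sketch, not a validated coiled-coil predictor: it assumes the heptad frame begins at the first residue of the sequence, it treats only leucine, valine, and isoleucine as aliphatic (the residues named in the text), and both test sequences are hypothetical strings made up for illustration.

```python
ALIPHATIC = {"L", "V", "I"}  # leucine, valine, isoleucine, as named in the text


def heptad_core_fraction(sequence: str) -> float:
    """Fraction of first and fourth heptad positions holding an aliphatic residue.

    Assumption: the heptad frame starts at the first residue of `sequence`.
    Trailing residues that do not fill a complete heptad are ignored.
    """
    sequence = sequence.upper()
    core = []
    for start in range(0, len(sequence) - 6, 7):
        core.append(sequence[start])      # first position of this heptad
        core.append(sequence[start + 3])  # fourth position of this heptad
    if not core:
        raise ValueError("sequence is shorter than one heptad")
    return sum(residue in ALIPHATIC for residue in core) / len(core)


# Hypothetical sequences, invented for this example only.
zipper_like = "LAEVKRG" * 3  # L at position 1 and V at position 4 of each heptad
control = "GSEGKRG" * 3      # no aliphatic residue at position 1 or 4

print(heptad_core_fraction(zipper_like))
print(heptad_core_fraction(control))
```

A fraction near 1 means the first and fourth positions form a continuous aliphatic stripe along one face of the helix, which is the arrangement the text describes for the packed faces of a leucine zipper.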


MOST PROTEINS ARE MODULAR, CONTAINING TWO OR THREE DOMAINS

The subunits of soluble proteins vary in size from fewer than 100 to more than 2,000 amino acid residues. The smallest polypeptides that form folded proteins have molecular weights of about 11,000 daltons (approximately 100 residues), but most are between 20,000 and 70,000 daltons for a single subunit. Proteins larger than about 20,000 daltons are often formed from two or more domains (Figure 5-16; see also Box 5-2, Large Proteins Are Often Constructed of Several Smaller Polypeptide Chains). The term domain is used to describe a part of the structure that appears separate from the rest, as if it would be stable in solution on its own, which is often the case. Typically, a single domain is formed from a continuous amino acid sequence and not portions of sequence scattered throughout the polypeptide. This is an important point when considering how multidomain proteins have evolved.

Proteins Are Composed of a Surprisingly Small Number of Structural Motifs

Determination of the first half-dozen protein structures showed a bewildering variety of protein folding motifs, implying the existence of an infinite number of protein structures. Now that we know the three-dimensional structures of thousands of proteins, however, it appears that a relatively small number of different domains account for most of the large variety of protein structures. Although an accurate estimate is not possible, the number of truly unique domain motifs will be orders of magnitude smaller than the number of unique proteins. Specific kinds of domain motifs are often associated with particular kinds of activities. One frequently observed motif has been termed the dinucleotide fold because it is frequently found in enzymes that bind ...

FIGURE 5-16 Pyruvate kinase is composed of distinct domains. The predominant domains of the enzyme are shown in turquoise, purple, and red. (Allen S.C. and Muirhead H. 1996. Acta Crystallogr. D Biol. Crystallogr. 52: 499.) Image prepared with MolScript, BobScript, and Raster 3D.
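The size figures quoted at the start of this section imply an average residue mass of roughly 110 daltons (11,000 daltons for approximately 100 residues). The sketch below uses that implied average to convert a subunit mass into an approximate residue count. The constant and the function name are assumptions introduced for illustration; real proteins deviate from the average depending on their amino acid composition.

```python
# Assumed average, implied by "11,000 daltons (approximately 100 residues)".
AVERAGE_RESIDUE_MASS_DA = 110.0


def approximate_residue_count(mass_da: float) -> int:
    """Estimate the number of amino acid residues from a polypeptide mass in daltons."""
    if mass_da <= 0:
        raise ValueError("mass must be positive")
    return round(mass_da / AVERAGE_RESIDUE_MASS_DA)


# Masses taken from the text: smallest folded polypeptides and the typical subunit range.
for mass in (11_000, 20_000, 70_000):
    print(mass, "Da is roughly", approximate_residue_count(mass), "residues")
```

By this estimate, the 20,000-dalton threshold above which proteins are often built from two or more domains corresponds to a chain of somewhat under 200 residues.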

Box 5-2 Large Proteins Are Often Constructed of Several Smaller Polypeptide Chains

Most large proteins are regular aggregates of several smaller polypeptide chains. The relationship among the polypeptide chains making up such a protein is termed its quaternary structure. For example, the macromolecular complexes responsible for the synthesis of RNA (RNA polymerase) and protein (ribosome) are each assemblies of multiple subunits. The complexes are about 500,000 and 2,500,000 daltons, respectively, but do not include any individual subunits greater than 200,000 daltons. The ribosome is composed of both protein and RNA subunits. This type of factor is called a ribonuclear protein (RNP).

Why are large protein complexes composed of multiple subunits rather than a single large subunit? The use of multiple subunits to build large protein complexes reflects a building principle applicable to all complex structures, nonliving as well as living. This principle states that it is much easier to reduce the impact of construction mistakes if faulty subunits can be discarded before they are incorporated into the final product. For example, let us consider two alternative ...

... shows the dimer of the inducer-lac repressor complex. Binding of inducer causes a change in the structure that reduces affinity of repressor for the operator. (b) The right side of the figure shows the dimer in the absence of inducer. In this case, the hinge helices form and the N-terminal domain ...

..., in the lower part of the figure (b), shows the components of the DNA molecule and their relative positions in the helical structure. The backbone of each strand of the helix is composed of alternating sugar and phosphate residues; the bases project inward but are accessible through the major and minor grooves.

Let us begin by considering the nature of the nucleotide, the fundamental building block of DNA. The nucleotide consists of a phosphate joined to a sugar, known as 2'-deoxyribose, to which a base is attached. The phosphate and the sugar have the structures shown in Figure 6-2. The sugar is called 2'-deoxyribose because there is no hydroxyl at position 2' (just two hydrogens). Note that the positions on the ribose are designated with primes to distinguish them from positions on the bases (see the discussion below).

We can think of how the base is joined to 2'-deoxyribose by imagining the removal of a molecule of water between the hydroxyl on the 1' carbon of the sugar and the base to form a glycosidic bond (Figure 6-2). The sugar and base alone are called a nucleoside. Likewise, we can imagine linking the phosphate to 2'-deoxyribose by removing a water molecule from between the phosphate and the hydroxyl on the 5' carbon to make a 5' phosphomonoester. Adding a phosphate (or more than one phosphate) to a nucleoside creates a nucleotide. Thus, by making a glycosidic bond between the base and the sugar, and by making a phosphoester bond between the sugar and the phosphoric acid, we have created a nucleotide (Table 6-1).

FIGURE 6-1 The helical structure of DNA. (a) Schematic model of the double helix. One turn of the helix (34 Å or 3.4 nm) spans approximately 10.5 base pairs. (b) Space-filling model of the double helix. The sugar and phosphate residues in each strand form the backbone, which are traced by the yellow, gray, and red circles, showing the helical twist of the overall molecule. The bases project inward but are accessible through major and minor grooves.

FIGURE 6-2 Formation of nucleotide by removal of water. The numbers of the carbon atoms in 2'- ...

Chromosomes, Chromatin, and the Nucleosome

... bacteria, single-cell eukaryotes, and multicellular eukaryotes; see Chapter 19), it is not surprising that genome size is roughly correlated with an organism's apparent complexity. Thus, prokaryotic cells typically have genomes smaller than 10 megabases (Mb). The genomes of single-cell eukaryotes are typically less than 50 Mb, although the more complex protozoans can have genomes greater than 200 Mb. Multicellular organisms have even larger genomes that can reach sizes greater than 100,000 Mb.

Although there is a correlation between genome size and organism complexity, it is far from perfect. Many organisms of apparently similar complexity have very different genome sizes: a fruit fly has a genome approximately 25 times smaller than a locust, and the rice genome is about 40 times smaller than wheat (see Table 7-2). In these examples, the number of genes rather than the expansion in genome size appears to be more closely related to organism complexity. This becomes clear when we examine the relative gene densities of different genomes.

The E. coli Genome Is Composed almost Entirely of Genes

The great majority of the single chromosome of the bacterium E. coli encodes proteins or structural RNAs (Figure 7-2). The majority of the noncoding sequences are dedicated to regulating gene transcription (as we shall see in Chapter 16). Because a single site of transcription initiation is often used to control the expression of several genes, even these regions are kept to a minimum in the genome. One critical element of the E. coli genome is not part of a gene: the E. coli origin of replication. This short chromosomal region is dedicated to directing the assembly of the replication machinery (as we shall discuss in Chapter 8). Despite its important role, this region is still very small, occupying only a few hundred base pairs of the 4.6 Mb E. coli genome.

More Complex Organisms Have Decreased Gene Density

What explains the dramatically different genome sizes of organisms of apparently similar complexity (such as the fruit fly and locust)? The differences are largely related to gene density. One simple measure of gene density is the average number of genes per Mb of genomic DNA. Thus, if an organism has 5,000 genes and a genome size of 50 Mb, then the gene density for that organism is 100 genes/Mb. When the gene densities of different organisms are compared, it becomes clear that different organisms use the gene-encoding potential of DNA with varying efficiencies.

FIGURE 7-2 Comparison of the chromosomal gene density for different organisms. A representative 65 kb region of DNA is illustrated for each organism. The region that encodes the largest subunit of RNA polymerase (RNA Pol II for the eukaryotic cells) is indicated in red. Note how the number of genes encoded within the same length of DNA decreases as organism complexity increases. [Key: genes, repeated sequences, introns, intergenic sequences, RNA polymerase gene. Panels: Escherichia coli; Saccharomyces cerevisiae (31 genes); Drosophila melanogaster (9 genes); Human (2 genes). Horizontal axis: position, 0 to 60,000.]

There is a rough inverse correlation between organism complexity and gene density: the less complex the organism, the higher the gene density. For example, the highest gene densities are found for viruses that in some instances use both strands of the DNA to encode overlapping genes. Although overlapping genes are rare, bacterial gene density is consistently near 1,000 genes/Mb.

Gene density in eukaryotic organisms is consistently lower and more variable than in their prokaryotic counterparts (see Table 7-2). Among eukaryotes, there is still a general trend for gene density to decrease with increasing organism complexity. The simple unicellular eukaryote S. cerevisiae has a gene density very close to prokaryotes (~500 genes/Mb). In contrast, the human genome is estimated to have a 50-fold lower gene density. In Figure 7-2 the amount of DNA sequence devoted to the expression of a related gene conserved across all organisms (the large subunit of RNA polymerase) is compared, illustrating the vast differences in gene density. Organisms with much larger genomes than humans are likely to have much lower gene densities. What is responsible for this reduction in gene density?
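The gene-density measure defined above is a simple ratio, and the worked example from the text (5,000 genes in a 50 Mb genome) can be reproduced in a few lines. This is an illustrative sketch; the function name is not taken from the source.

```python
def gene_density(gene_count: int, genome_size_mb: float) -> float:
    """Return the average number of genes per megabase (Mb) of genomic DNA."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return gene_count / genome_size_mb


# Worked example from the text: 5,000 genes in a 50 Mb genome.
print(f"{gene_density(5000, 50):.0f} genes/Mb")
```

The same function can be applied to any genome for which a gene count and a size in Mb are available, which makes the inverse trend between complexity and gene density easy to tabulate.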

Genes Make Up Only a Small Proportion of the Eukaryotic Chromosomal DNA

Two factors contribute to the decreased gene density observed in eukaryotic cells: increases in gene size and increases in the DNA between genes, called intergenic sequences. Individual genes are longer for two reasons. First, as organisms become increasingly complex, there is a significant increase in regions of DNA required to direct and regulate transcription, called regulatory sequences. Second, protein-encoding genes in eukaryotes frequently have discontinuous protein-coding regions. These interspersed non-protein-encoding regions, called introns, are removed from the RNA after transcription in a process called RNA splicing (Figure 7-3); we shall consider RNA splicing in detail in Chapter 13. The presence of introns can increase dramatically ...

FIGURE 7-3 Schematic of RNA splicing. Transcription of pre-mRNA is initiated at the arrow shown above exon 1. This primary transcript is then processed (by splicing) to remove noncoding introns to produce messenger RNA. [Labels: DNA, introns, primary RNA transcript, spliced mRNA.]


TABLE 7-3 Contribution of Introns and Repeated Sequences to Different Genomes

Species | Gene density (genes/Mb) | Average number of introns per gene | Percentage of DNA that is repetitive
PROKARYOTES (bacteria)
Escherichia coli K-12 | 950 | 0 | ...
[Remaining rows are not recoverable; surviving values: 480, 200, 80, 3.]

[Figure caption, beginning missing] ... nucleosome requires more than 147 bp of DNA to form, if two such factors bind to the DNA less than this distance apart, the intervening DNA cannot assemble into a nucleosome. (b) A subset of DNA-binding proteins have the ability to bind to nucleosomes. Once bound to DNA, such proteins will facilitate the assembly of nucleosomes immediately adjacent to the protein's DNA-binding site. [Panel labels: nucleosome-free; nucleosome assembly; positioned nucleosome.]

Some Nucleosomes Are Found in Specific Positions in vivo: Nucleosome Positioning

Because of their dynamic interactions with DNA, most nucleosomes are not fixed in their locations. But there are occasions when restricting nucleosome location, or positioning nucleosomes as it is called, is beneficial. Typically, positioning a nucleosome allows the DNA-binding site for a regulatory protein to remain in the accessible linker DNA region. In many instances, such nucleosome-free regions are larger to allow extensive regulatory regions to remain accessible.

Nucleosome positioning can be directed by DNA-binding proteins or particular DNA sequences. In the cell, the most frequent method involves competition between nucleosomes and DNA-binding proteins. Just as many proteins cannot bind to DNA within a nucleosome, prior binding of a protein to a site on DNA can prevent association of the core histones with that stretch of DNA. If two such DNA-binding proteins are bound to sites positioned closer than the minimal region of DNA required to assemble a nucleosome (~150 bp), the DNA between the proteins will remain nucleosome-free (Figure 7-36a). Binding of additional proteins to adjacent DNA can further increase the size of a nucleosome-free region. In addition to this inhibitory mechanism of protein-dependent nucleosome positioning, some DNA-binding proteins

FIGURE 7-37 Nucleosomes prefer to bind bent DNA. Specific DNA sequences can position nucleosomes. Because the DNA is bent severely during association with the nucleosome, DNA sequences that position nucleosomes are intrinsically bent. A:T base pairs have an intrinsic tendency to bend toward the minor groove and G:C base pairs have the opposite tendency. Sequences that alternate between A:T- and G:C-rich sequences with a periodicity of ~5 bp will act as preferred nucleosome binding sites. [Labels: G:C-rich; histone octamer.] (Source: Adapted from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 211, f4-28. Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc.)

interact tightly with adjacent nucleosomes, leading to nucleosomes preferentially assembling immediately adjacent to these proteins (Figure 7-36b).

A second method of nucleosome positioning involves particular DNA sequences that have a high affinity for the nucleosome. Because DNA bound in a nucleosome is bent, nucleosomes preferentially form on DNA that bends easily. A:T-rich DNA has an intrinsic tendency to bend toward the minor groove. Thus, A:T-rich DNA is favored in positions in which the minor groove faces the histone octamer. G:C-rich DNA has the opposite tendency and, therefore, is favored when the minor groove is facing away from the histone octamer (Figure 7-37). Each nucleosome will try to maximize this arrangement of A:T-rich and G:C-rich sequences. It is important to note that such alternating stretches of A:T-rich and G:C-rich DNA are rare. More importantly, despite being favored, such unusual sequences are not required for nucleosome assembly.

These mechanisms of nucleosome positioning influence the organization of nucleosomes in the genome. Despite this, the majority of nucleosomes are not tightly positioned. As you will learn in the chapters on eukaryotic transcription (Chapters 12 and 17), tightly positioned nucleosomes are most often found at sites directing the initiation of transcription. Although we have discussed positioning primarily as a method to ensure that a regulatory DNA sequence is accessible, a positioned nucleosome can just as easily prevent access to specific DNA sites by being positioned in a manner that overlaps the same sequence. Thus, positioned nucleosomes can have both positive and negative effects on the accessibility of nearby DNA sequences. An approach to mapping nucleosome locations is described in Box 7-2, Determining Nucleosome Position in the Cell.
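The competition mechanism described above (two bound proteins spaced too closely for a nucleosome to fit between them) can be expressed as a simple interval check. The sketch below is a toy model under stated assumptions: binding sites are given as hypothetical (start, end) base-pair coordinates with the end exclusive, the 147 bp threshold is the figure quoted in the caption for the DNA needed to form a nucleosome (the text rounds it to about 150 bp), and nothing else about chromatin is modeled.

```python
MIN_NUCLEOSOME_DNA_BP = 147  # DNA needed to form a nucleosome, per the figure caption


def nucleosome_free_gaps(bound_sites, min_bp=MIN_NUCLEOSOME_DNA_BP):
    """Return the gaps between bound proteins that are too short to hold a nucleosome.

    bound_sites: iterable of (start, end) coordinates in bp, end exclusive.
    Returns a list of (gap_start, gap_end) tuples, ordered along the DNA.
    """
    gaps = []
    covered_end = None  # right-most coordinate occupied by any protein seen so far
    for start, end in sorted(bound_sites):
        if covered_end is not None and 0 < start - covered_end < min_bp:
            gaps.append((covered_end, start))
        covered_end = end if covered_end is None else max(covered_end, end)
    return gaps


# Hypothetical binding sites, invented for this example only.
sites = [(100, 120), (200, 220), (500, 520)]
print(nucleosome_free_gaps(sites))
```

In this example the 80 bp gap between the first two proteins is reported as nucleosome-free, while the 280 bp gap before the third protein is long enough to accommodate a nucleosome and is not reported.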

Modification of the N-Terminal Tails of the Histones Alters Chromatin Accessibility

When histones are isolated from cells, their N-terminal tails are typically modified with a variety of small molecules (Figure 7-38). Lysines


in the tails are frequently modified with acetyl groups or methyl groups, and serines are subject to modification with phosphate. Typically, acetylated nucleosomes are associated with regions of the chromosomes that are transcriptionally active, and deacetylated nucleosomes are associated with transcriptionally repressed chromatin. Unlike acetylation, methylation of different parts of the N-terminal tails is associated with both repressed and active chromatin, depending on the particular amino acid that is modified in the histone tail. Phosphorylation of the N-terminal tail of histone H3 is commonly observed in the highly condensed chromatin of mitotic chromosomes. It has been proposed that these modifications result in a "code" that can be read by the proteins involved in gene expression and other DNA transactions (Figure 7-38).

How does histone modification alter nucleosome function? One obvious change is that acetylation and phosphorylation each act to reduce the overall positive charge of the histone tails: acetylation of lysine neutralizes its positive charge (Figure 7-39). This loss of positive charge reduces the affinity of the tails for the negatively charged backbone of the DNA. Equally important, modification of the histone tails affects the ability of nucleosome arrays to form more repressive higher-order chromatin structure. As we described above, histone N-terminal tails are required to form the 30-nm fiber, and modification of the tails modulates this function. For example, consistent with the association of acetylated histones with expressed regions of the genome, nucleosomes with this modification are significantly less likely to participate in the formation of the repressive 30-nm fiber.
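The charge argument above can be made concrete with a simple count. The sketch below is a deliberately crude model: it assumes each unmodified lysine (K) or arginine (R) contributes +1, that an acetylated lysine contributes 0, and it ignores acidic residues, phosphorylation, and any pH dependence. The tail sequence is a made-up string used only for illustration, not a real histone sequence.

```python
POSITIVE_RESIDUES = {"K", "R"}  # lysine and arginine, counted as +1 each


def tail_positive_charge(tail: str, acetylated_lysines=()) -> int:
    """Count positive charges on a tail, treating acetylated lysines as neutral.

    tail: one-letter amino acid sequence.
    acetylated_lysines: 1-based positions of acetylated lysines.
    """
    tail = tail.upper()
    acetylated = set(acetylated_lysines)
    for position in acetylated:
        if not 1 <= position <= len(tail) or tail[position - 1] != "K":
            raise ValueError(f"position {position} is not a lysine")
    charge = 0
    for position, residue in enumerate(tail, start=1):
        if residue in POSITIVE_RESIDUES and position not in acetylated:
            charge += 1
    return charge


toy_tail = "GKGRSKAGKR"  # hypothetical sequence, invented for this example
print(tail_positive_charge(toy_tail))          # no modifications
print(tail_positive_charge(toy_tail, {2, 6}))  # lysines 2 and 6 acetylated
```

Each acetylated lysine lowers the count by one, which is the sense in which acetylation weakens the attraction between the tails and the negatively charged DNA backbone.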

Box 7-2 Determining Nucleosome Position in the Cell

The significance of the location of nucleosomes adjacent to important regulatory sequences has led to the development of methods to monitor the location of nucleosomes in cells. Many of these methods exploit the ability of nucleosomes to protect DNA from digestion by micrococcal nuclease. As described in Box 7-1, micrococcal nuclease has a strong preference to cleave DNA between nucleosomes rather than DNA tightly associated with nucleosomes. This property can be used to map nucleosomes that are associated with the same position throughout a cell population (Box 7-2 Figure 1).

To map nucleosome location accurately, it is important to isolate the cellular chromatin and treat it with the appropriate amount of micrococcal nuclease with minimal disruption of the overall chromatin structure. This is typically achieved by gently lysing cells while leaving the nuclei intact. The nuclei are then briefly treated (typically for 1 minute) with several different concentrations of micrococcal nuclease, a protein small enough to rapidly diffuse into the nucleus. The goal of the titration is for micrococcal nuclease to cleave the region of interest only once in each cell. Once the DNA has been digested, the nuclei can be lysed and all the protein removed from the DNA. The sites of cleavage (and, more importantly, the sites not ...

[Figure: modifications of the histone N-terminal tails. Labels: histone-fold domain; histones H2A, H2B, H3, and H4; modified lysine (K), serine (S), and arginine (R) residues along the N-terminal tails; associated functions: gene silencing, gene expression, gene silencing/heterochromatin, chromosome condensation, histone deposition, transcription elongation.]

In addition to direct effects on nucleosomal function, modification of histone tails also generates binding sites for proteins (Figure 7-39b). Specific protein domains called bromodomains and chromodomains mediate these interactions. Bromodomain-containing proteins interact with acetylated histone tails and chromodomain-containing proteins interact with methylated histone tails. Many of the proteins that contain bromodomains are themselves associated with histone tail-specific acetyl transferases (Table 7-7). Such complexes can facilitate the maintenance of acetylated chromatin by further modifying regions that are already acetylated (as we shall discuss below). The association of chromodomain-containing proteins with histone tail-specific methyl...




FIGURE 7-39 Effects of histone tail modifications. (a) The effect on the association with nucleosome-bound DNA. Unmodified and methylated histone tails are thought to associ... [Panel labels: acetylated, unmodified, methylated; bromodomain protein, chromodomain protein.]

... nucleosome assembly. The distributive inheritance of old histones during chromosome duplication provides a mechanism for the accurate propagation of the parental pattern of histone modification. By this mechanism, old histones, no matter on which daughter chromosome they end up, tend to be found close to their position on the parental chromosome (Figure 7-42). This localized inheritance of modified histones provides a limited number of modifications in similar positions on each daughter chromosome. The ability of these modifications to recruit enzymes that add similar modifications to adjacent nucleosomes (see the discussion of bromodomains and chromodomains above) provides a simple mechanism to maintain similar states of modification after DNA replication has occurred. Such mechanisms are likely to play a critical role in the inheritance of chromatin states from one generation to another.

Assembly of Nucleosomes Requires Histone "Chaperones"

The assembly of nucleosomes is not a spontaneous process. Early studies found that the simple addition of purified histones to DNA

FIGURE 7-42 Inheritance of parental H3·H4 tetramers facilitates the inheritance of chromatin states. As a chromosome is replicated, the distribution of the parental H3·H4 tetramers results in the daughter chromosomes receiving the same modifications as the parent. The ability of these modifications to recruit enzymes that perform the same modifications facilitates the propagation of the same state of modification to the two daughter chromosomes. Acetylation is shown on the core regions of the histones for simplicity. In reality, this modification is generally on the N-terminal tails.


resulted in little or no nucleosome formation. Instead, the majority of the histones aggregate in a nonproductive form. For correct nucleosome assembly, it was necessary to raise salt concentrations to very high levels (>1 M NaCl) and then slowly reduce the concentration over many hours. Although useful for assembling nucleosomes for in vitro studies (such as for the structural studies of the nucleosome described earlier), elevated salt concentrations are not involved in nucleosome assembly in vivo. Studies of nucleosome assembly under physiological salt concentrations identified factors required to direct the assembly of histones onto the DNA. These factors are negatively charged proteins that form complexes with either H3·H4 tetramers or H2A·H2B dimers (see Table 7-8) and escort them to sites of nucleosome assembly. Because they act to keep histones from interacting with the DNA nonproductively, these factors have been referred to as histone chaperones (see Figure 7-43).

How do the histone chaperones direct nucleosome assembly to sites of new DNA synthesis? Studies of the histone H3·H4 tetramer chaperone CAF-I reveal a likely answer. Nucleosome assembly directed by CAF-I requires that the target DNA is replicating. Thus, replicating DNA is marked in some way for nucleosome assembly. Interestingly, this mark is gradually lost after replication is completed. Studies of CAF-I-dependent assembly have determined that the mark is a ring-shaped sliding clamp protein called PCNA. As we will discuss in detail in Chapter 8, this factor forms a ring around the DNA duplex and is responsible for holding DNA polymerase on the DNA during DNA synthesis. After the polymerase is finished, PCNA is released from the DNA polymerase but is still linked to the DNA. In this condition, PCNA is available to interact with other proteins. CAF-I associates with the released PCNA and assembles H3·H4 tetramers preferentially on the PCNA-bound DNA. Thus, by associating with a component of the DNA replication machinery, CAF-I is directed to assemble nucleosomes at sites of recent DNA replication.

[...] nucleosomes/chromatin as the chromosomes are duplicated. Nucleosomes are assembled immediately after the DNA is replicated, leaving little time during which the DNA is unpackaged. This involves the function of specialized histone chaperones that escort the H3·H4 tetramers and H2A·H2B dimers to the replication fork. During the replication of the DNA, nucleosomes are transiently disassembled. Histone H3·H4 tetramers and H2A·H2B dimers are randomly distributed to one or the other daughter molecules. On average, each new DNA molecule receives half old and half new histones. Thus, both chromosomes inherit modified histones which can then act as "seed[...]

The Replication of DNA


[...]tions with the palm region. This altered geometry reduces the rate of nucleotide addition in much the same way that addition of an incorrectly paired dNTP reduces catalysis. Thus, when a mismatched nucleotide is added, it both decreases the rate of new nucleotide addition and increases the rate of proofreading exonuclease activity.

As with DNA synthesis, proofreading can occur without releasing the DNA from the polymerase (Figure 8-10). When a mismatched base pair is detected by the polymerase, the primer:template junction slides away from the DNA polymerase active site and into the exonuclease site. (This is because the mismatched DNA has a reduced affinity for the palm region.) After the incorrect base pair is removed, the correctly paired primer:template junction slides back into the DNA polymerase active site and DNA synthesis can continue. In essence, proofreading exonucleases work like a "delete key" on a keyboard, removing only the most recent errors.

The addition of a proofreading exonuclease greatly increases the accuracy of DNA synthesis. On average, DNA polymerase inserts one incorrect nucleotide for every 10^5 nucleotides added. Proofreading exonucleases decrease the appearance of an incorrect paired base to one in every 10^7 nucleotides added. This error rate is still significantly short of the actual rate of mutation observed in a typical cell (approximately one mistake in every 10^10 nucleotides added). This additional level of accuracy is provided by the post-replication mismatch repair process that is described in Chapter 9.
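To make these error frequencies concrete, the short Python sketch below converts them into an expected number of mistakes for a stretch of newly synthesized DNA. The three rates are the ones quoted above; the length of 4.6 million nucleotides is an assumption chosen only for illustration, and the function names are hypothetical.

```python
# Error frequencies quoted in the text (mistakes per nucleotide added).
POLYMERASE_ALONE = 1e-5       # base selection by DNA polymerase
WITH_PROOFREADING = 1e-7      # after the proofreading exonuclease
WITH_MISMATCH_REPAIR = 1e-10  # rate observed in a typical cell

# Assumed length of newly synthesized DNA, for illustration only.
NUCLEOTIDES_ADDED = 4_600_000

def expected_errors(error_rate, nucleotides):
    """Expected number of misincorporated nucleotides."""
    return error_rate * nucleotides

def fold_improvement(rate_before, rate_after):
    """How many times more accurate the later stage is."""
    return rate_before / rate_after

for label, rate in [
    ("polymerase alone", POLYMERASE_ALONE),
    ("with proofreading", WITH_PROOFREADING),
    ("with mismatch repair", WITH_MISMATCH_REPAIR),
]:
    print(f"{label}: {expected_errors(rate, NUCLEOTIDES_ADDED):.5f} expected errors")
```

Under these assumptions the polymerase alone would leave roughly 46 errors, proofreading would reduce this to about one error in every two such stretches, and mismatch repair would bring it well below one in a thousand. Proofreading therefore contributes about a 100-fold gain in accuracy and mismatch repair about a further 1,000-fold.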

FIGURE 8-10 Proofreading exonucleases remove bases from the 3' end of mismatched DNA. (a) When an incorrect nucleotide is incorporated into the DNA by a polymerase, the rate of DNA synthesis is reduced and the affinity of the 3' end of the primer for the DNA polymerase active site is diminished. (b) When mismatched, the 3' end of the DNA has increased affinity for the proofreading exonuclease active site. Once bound at this active site, the mismatched nucleotide is removed. (c) Once the mismatched nucleotide is removed, the affinity of the properly base-paired DNA for the DNA polymerase active site is restored and DNA synthesis continues. (Source: Adapted from Baker T.A. and Bell S.P. 1998. Polymerases and the replisome: Machines within machines. Cell 92: 296, fig. 1b. Copyright © 1998 with permission from Elsevier.)

THE REPLICATION FORK

Both Strands of DNA Are Synthesized Together at the Replication Fork

Thus far we have discussed DNA synthesis in a relatively artificial context. That is, at a primer:template junction that is producing only one new strand of DNA. In the cell, both strands of the DNA duplex are replicated at the same time. This requires separation of the two strands of the double helix to create two template DNAs. The junction between the newly separated template strands and the unreplicated duplex DNA is known as the replication fork (Figure 8-11). The replication fork moves continuously toward the duplex region of unreplicated DNA, leaving in its wake two ssDNA templates that direct the formation of two daughter DNA duplexes.

The antiparallel nature of DNA creates a complication for the simultaneous replication of the two exposed templates at the replication fork. Because DNA is only synthesized by elongating a 3' end, only one of the two exposed templates can be replicated continuously as the replication fork moves. On this template strand, the polymerase simply "chases" the replication fork. The newly synthesized DNA strand directed by this template is known as the leading strand. Synthesis of the new DNA strand directed by the other ssDNA template is more problematic. This template directs the DNA polymerase to move in the opposite direction of the replication fork. The new DNA strand directed by this template is known as the lagging strand. As shown in Figure 8-11, this strand of DNA must be synthesized in a discontinuous fashion.

FIGURE 8-11 The replication fork. Newly synthesized DNA is indicated in red and RNA primers are indicated in green. The Okazaki fragments shown are artificially short for illustrative purposes. In the cell, Okazaki fragments can vary between 100 to greater than 1,000 bases.

Although the leading strand DNA polymerase can replicate its template as soon as it is exposed, synthesis of the lagging strand must wait for movement of the replication fork to expose a substantial length of template before it can be replicated. Each time a substantial length of new lagging strand template is exposed, DNA synthesis is initiated and continues until it reaches the 5' end of the previous newly synthesized stretch of lagging strand DNA. The resulting short fragments of new DNA formed on the lagging strand are called Okazaki fragments and can vary in length from 1,000 to 2,000 nucleotides in bacteria and 100 to 400 nucleotides in eukaryotes. Shortly after being synthesized, Okazaki fragments are covalently joined together to generate a continuous, intact strand of new DNA. Okazaki fragments are, therefore, transient intermediates in DNA replication.
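The fragment lengths given above imply that lagging strand synthesis must be restarted many times. As a rough illustration, the Python sketch below estimates how many Okazaki fragments are needed to cover a lagging strand template of a given length. The length ranges are those quoted in the text; the template length of one million nucleotides and the function name `fragment_count_range` are assumptions made for the example.

```python
import math

# Okazaki fragment length ranges quoted in the text (nucleotides).
OKAZAKI_RANGE_NT = {
    "bacteria": (1000, 2000),
    "eukaryotes": (100, 400),
}

def fragment_count_range(template_length_nt, organism):
    """Return (fewest, most) fragments needed to cover the template."""
    shortest, longest = OKAZAKI_RANGE_NT[organism]
    fewest = math.ceil(template_length_nt / longest)
    most = math.ceil(template_length_nt / shortest)
    return fewest, most

# Hypothetical lagging strand template of one million nucleotides.
for organism in OKAZAKI_RANGE_NT:
    fewest, most = fragment_count_range(1_000_000, organism)
    print(f"{organism}: between {fewest} and {most} fragments")
```

For the assumed template this gives 500 to 1,000 fragments in bacteria and 2,500 to 10,000 in eukaryotes, each of which must later be joined to its neighbor.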

The Initiation of a New Strand of DNA Requires an RNA Primer

As described above, all DNA polymerases require a primer with a free 3'OH. They cannot initiate a new DNA strand de novo. How are new strands of DNA synthesis started? To accomplish this, the cell takes advantage of the ability of RNA polymerases to do what DNA polymerases cannot: start new RNA chains de novo. Primase is a specialized RNA polymerase dedicated to making short RNA primers (5-10 nucleotides long) on an ssDNA template. These primers are subsequently extended by DNA polymerase. Although DNA polymerases incorporate only deoxyribonucleotides into DNA, they can initiate synthesis using either an RNA primer or a DNA primer annealed to the DNA template.

Although both the leading and lagging strands require primase to initiate DNA synthesis, the frequency of primase function on the two strands is dramatically different (see Figure 8-11). Each leading

FIGURE 8-12 Removal of RNA primers from newly synthesized DNA. The sequenti[...]

[...] to initiate synthesis of a new RNA primer. Instead, primase is activated only when it associates with other DNA replication proteins, such as DNA helicase. These proteins are considered in more detail below. Once activated, primase synthesizes an RNA primer using the most recently exposed lagging strand template, regardless of sequence.

RNA Primers Must Be Removed to Complete DNA Replication

To complete DNA replication, the RNA primers used for the initiation must be removed and replaced with DNA (Figure 8-12). Removal of the RNA primers can be thought of as a DNA repair event and this process shares many of the properties of excision DNA repair, a process covered in detail in Chapter 9.

To replace the RNA primers with DNA, an enzyme called RNAse H recognizes and removes most of each RNA primer. This enzyme specifically degrades RNA that is base-paired with DNA (hence the "H" in its name, which stands for hybrid in RNA:DNA hybrid). RNAse H removes all of the RNA primer except the ribonucleotide directly linked to the DNA end. This is because RNAse H can only cleave bonds between two ribonucleotides. The final ribonucleotide is removed by an exonuclease that degrades RNA or DNA from their 5' end.

Removal of the RNA primer leaves a gap in the double-stranded DNA that is an ideal substrate for DNA polymerase: a primer:template junction (see Figure 8-12). DNA polymerase fills this gap until every nucleotide is base-paired, leaving a DNA molecule that is complete except for a break in the backbone between the 3'OH and 5' phosphate of the repaired strand. This "nick" in the DNA can be repaired by an enzyme called DNA ligase. DNA ligase uses a high-energy co-factor (such as ATP) to create a phosphodiester bond between an adjacent 5' phosphate and 3'OH. Only after all RNA primers are replaced and the associated nicks are sealed is DNA synthesis complete.
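The order of these steps can be followed with a small bookkeeping sketch in Python. It is not a model of the enzymes themselves. A fragment is written 5' to 3' as a string in which lowercase letters are RNA, uppercase letters are DNA, and "-" marks a gap; the template is written so that each character pairs with the fragment character at the same index. All function names and sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def remove_primer(fragment):
    """Return the fragment after RNAse H and after the 5' exonuclease.

    RNAse H removes every ribonucleotide except the one joined to DNA;
    the exonuclease then removes that last ribonucleotide.
    """
    rna_len = len(fragment) - len(fragment.lstrip("acgu"))
    kept = max(rna_len - 1, 0)
    after_rnase_h = "-" * kept + fragment[kept:]
    after_exonuclease = "-" * rna_len + fragment[rna_len:]
    return after_rnase_h, after_exonuclease

def fill_gap(fragment, template_segment):
    """Fill gap positions with DNA complementary to the template.

    In the cell this is done by DNA polymerase extending the 3'OH of the
    neighboring fragment; here only the resulting sequence is tracked.
    """
    return "".join(
        COMPLEMENT[t] if base == "-" else base
        for base, t in zip(fragment, template_segment)
    )

def ligate(fragments):
    """Seal the nicks between adjacent, fully DNA fragments."""
    return "".join(fragments)

okazaki = "gcauGGTACC"        # 4-nucleotide RNA primer followed by DNA
template = "CGTACCATGG"       # pairs position by position with okazaki
after_rnase_h, after_exo = remove_primer(okazaki)
repaired = fill_gap(after_exo, template)
new_strand = ligate(["TTAGC", repaired])
```

Here `after_rnase_h` still carries the single ribonucleotide "u" next to the DNA, `after_exo` has a four-nucleotide gap, and `repaired` is all DNA and ready to be joined to the neighboring fragment.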

DNA Helicases Unwind the Double Helix in Advance of the Replication Fork

DNA polymerases are generally poor at separating the two base-paired strands of duplex DNA. Therefore, at the replication fork, a second class of enzymes, called DNA helicases, catalyze the separation of the two strands of duplex DNA. These enzymes bind to and move directionally along ssDNA using the energy of nucleoside triphosphate (usually ATP) hydrolysis to displace any DNA strand that is annealed to the bound ssDNA. Typically, DNA helicases that act at replication forks are hexameric proteins that assume the shape of a ring (Figure 8-13). These ring-shaped protein complexes encircle one of the two single strands at the replication fork near the single-stranded:double-stranded junction.

FIGURE 8-13 DNA helicases separate the two strands of the double helix. When ATP is added to a DNA helicase bound to ssDNA, the helicase moves with a defined polarity on the ssDNA. In the instance illustrated, the DNA helicase has a 5'→3' polarity. This polarity means that the DNA helicase would be bound to the lagging strand template at the replication fork.
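The link between polarity and template strand follows from the antiparallel arrangement of the two templates. The leading strand polymerase reads its template 3'→5' while following the fork, so movement toward the unreplicated duplex runs 3'→5' along the leading strand template and 5'→3' along the lagging strand template. The minimal Python sketch below encodes only this reasoning; the function name is hypothetical.

```python
# Direction, along each template strand, that points toward the
# unreplicated duplex at the replication fork.
FORK_DIRECTION_ON_TEMPLATE = {
    "leading strand template": "3'->5'",
    "lagging strand template": "5'->3'",
}

def fork_helicase_template(polarity):
    """Return the template a helicase must bind to move with the fork."""
    for template, direction in FORK_DIRECTION_ON_TEMPLATE.items():
        if direction == polarity:
            return template
    raise ValueError(f"unknown polarity: {polarity!r}")

print(fork_helicase_template("5'->3'"))
```

A helicase with 5'→3' polarity is therefore placed on the lagging strand template, as in the figure, whereas one with 3'→5' polarity would have to encircle the leading strand template to travel in the same direction.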



Like DNA polymerases, DNA helicases act processively. Each time they associate with substrate, they unwind multiple base pairs of DNA. The ring-shaped hexameric DNA helicases found at replication forks exhibit high processivity because they encircle the DNA. Release of the helicase from its DNA substrate therefore requires the opening of the hexameric protein ring, which is a rare event. Alternatively, the helicase can dissociate when it reaches the end of the DNA strand that it has encircled.

Of course, this arrangement of enzyme and DNA poses problems for the binding of the DNA helicase to the DNA substrate in the first place. Thus, there are specialized mechanisms that assemble DNA helicases around the DNA in cells (see "Initiation of Replication" below). This topological linkage between proteins involved in DNA replication and their DNA substrates is a common mechanism to increase processivity.

Each DNA helicase moves along ssDNA in a defined direction. This property is a characteristic of each DNA helicase called its polarity (see Box 8-1, Determining the Polarity of a DNA Helicase). DNA helicases can have a polarity of either 5'→3' or 3'→5'. This direction is always defined according to the strand of DNA bound (or encircled for a ring-shaped helicase) rather than the strand that is displaced. In the case of a DNA helicase that functions on the lagging strand template of the replication fork, the polarity is 5'→3' to allow the DNA helicase to proceed toward the duplex region of the replication fork (see Figure 8-13). As is true for all enzymes that move along DNA in a directional manner, movement of the helicase along ssDNA requires the input of chemical energy. For helicases, this energy is provided by ATP hydrolysis.

Single-Stranded Binding Proteins Stabilize Single-Stranded DNA Prior to Replication

After the DNA helicase has passed, the newly generated single-stranded DNA must remain free of base-pairing until it can be used as a template for DNA synthesis. To stabilize the separated strands, single-stranded DNA binding proteins (designated SSBs) rapidly bind to the separated


strands. Binding of one SSB promotes the binding of another SSB to the immediately adjacent ssDNA (Figure 8-14). This is called cooperative binding and occurs because SSB molecules bound to immediately adjacent regions of ssDNA can also bind to each other. This strongly stabilizes the interaction of the SSB with ssDNA, making sites already occupied by one or more SSB molecules preferred over other sites. Cooperative binding ensures that ssDNA is rapidly coated by SSB as it emerges from the DNA helicase. (Cooperative binding is a prop-

Box 8-1 Determining the Polarity of a DNA Helicase

The activity of a DNA helicase can be detected by its ability to displace one strand of a DNA duplex from another. In a typical DNA helicase assay, the substrate is composed of one short, labeled ssDNA annealed to one long, unlabeled ssDNA (typically the label is radioactive 32P incorporated into the short ssDNA). Consider a large circular ssDNA (for example, 5,000 bases) hybridized to a short (200 bases), labeled linear ssDNA molecule (Box 8-1 Figure 1). A DNA helicase will displace the short linear ssDNA from the large ssDNA circle. Separation of the strands can be detected by a change in electrophoretic mobility of the short, labeled ssDNA in a nondenaturing agarose gel (see Chapter 20). After the gel is exposed to X-ray film to detect only the radiolabeled DNA, the position in the gel that the short DNA occupies can be determined. When it is hybridized to the ssDNA circle, the short ssDNA will co-migrate with the large ssDNA circle. In contrast, once the short ssDNA has been displaced from the ssDNA circle by DNA helicase, it will migrate according to its actual size, 200 bases.

A modification of this simple experiment can be used to determine the polarity of a DNA helicase. Suppose there is a restriction enzyme cleavage site located asymmetrically within the base-paired region (Box 8-1 Figure 2). When this site is cleaved it will generate a largely single-stranded, linear DNA with two regions of dsDNA of different lengths at each end. Remember that DNA helicases bind to ssDNA, not dsDNA. Thus, the only place that a DNA helicase can bind this new linear substrate is between the two dsDNA regions. Because of the polarity of DNA helicases, any given DNA helicase can displace only one of the two short ssDNAs. Because the two short ssDNA regions are of different lengths, the size of the released fragment will reveal which direction the DNA helicase moved along the ssDNA region of the linear substrate.

BOX 8-1 FIGURE 1 A biochemical assay for DNA helicase activity. (a) DNA substrate to detect helicase activity. A 5,000-base unlabeled circular ssDNA is annealed to a 200-base radiolabeled DNA. For convenience the two molecules are not drawn to scale. (b) To detect DNA helicase activity, the DNA substrate is exposed to the DNA helicase (in this case with and without ATP). After the reaction, the resulting DNA molecules are separated by agarose gel electrophoresis (nondenaturing). When the short radiolabeled DNA is base-paired with the large ssDNA circle, both mole[...]

[...] within the DNA fragment, the DNA will start out as a bubble shape then convert to a Y-shape (Box 8-3 Figure 2, red DNA fragments). These unusually shaped DNAs can be distinguished from the majority of linear DNA using two-dimensional agarose gel electrophoresis and, when they are seen, can provide clear evidence of an origin of replication (Box 8-3 Figure 3).

To identify DNA that is in the process of replicating, DNA derived from dividing cells is first cut with a restriction enzyme and separated on a two-dimensional agarose gel. In the first dimension, the DNA is separated by size and shape and in the second dimension, the DNA is separated primarily by size. This is accomplished by using different gel density and electrophoresis rates for each dimension. To separate by size and shape, the agarose gel pores are small and the rate of electrophoresis is fast. In contrast, to separate primarily by size, the agarose gel pores are larger and the rate of electrophoresis is slower. Once electrophoresis is complete, the DNA molecules are transferred to nitrocellulose and detected by Southern blotting (see Chapter 20). The choice of the restriction enzyme and DNA probe used can dramatically affect the outcome of the analysis. In general this method requires that the investigator already have significant information about the location of a potential origin of replication.

How can the two-dimensional gels identify the DNA intermediates associated with a replication origin? The particular pattern of DNA migration can lead to unequivocal evidence of an origin of replication. The most unusual structures migr[...]

Site-Specific Recombination and Transposition of DNA

Serine Recombinases Introduce Double-Stranded Breaks in DNA and then Swap Strands to Promote Recombination

CSSR always occurs between two recombination sites. As we saw above, these sites may be on the same DNA molecule (for inversion or deletion) or on two different molecules (for integration). Each recombination site is made up of double-stranded DNA. Therefore, during recombination, four single strands of DNA (two from each duplex) must be cleaved and then rejoined, now with a different partner strand, to generate the rearranged DNA.

The serine recombinases cleave all four strands prior to strand exchange (Figure 11-6). One molecule of the recombinase protein promotes each of these cleavage reactions; therefore a minimum of four subunits (that is, a tetramer) of the recombinase is required. These double-stranded DNA breaks in the parental DNA molecules generate four double-stranded DNA segments (marked by the proteins bound to them as R1, R2, R3, and R4 in Figure 11-6). For recombination to occur, the R2 segment of the top DNA molecule

FIGURE 11-6 Recombination by a serine recombinase. Each of the four DNA strands is cleaved within the crossover region by one subunit of the protein. These subunits are labeled R1, R2, R3, and R4. Cleavage of the two individual strands of one duplex is staggered by two bases. This two-base region forms a hybrid duplex in the recombinant products. The recombination sites are similar to those shown in Figure 11-4.

must recombine with the R3 segment of the bottom DNA molecule. Likewise, the R1 segment of the top molecule must recombine with the R4 segment of the bottom DNA molecule. Once this DNA "swap" has occurred, the 3'OH ends of each of the cleaved DNA strands can attack the recombinase-DNA bond in their new partner segment. As discussed above, this reaction liberates the recombinase and covalently seals the DNA strands to generate the rearranged DNA product.
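The segment swap can be written out as a small Python sketch. This is only an illustration: each duplex is reduced to a single string, the staggered two-base cut is ignored so that each double-stranded break is treated as a cut at one position, and the sequences and function names are hypothetical.

```python
def cut(duplex, position):
    """Double-stranded break: split a duplex into left and right segments."""
    return duplex[:position], duplex[position:]

def serine_recombine(top, bottom, top_site, bottom_site):
    """Cleave both duplexes completely, swap segments, then rejoin."""
    r1, r2 = cut(top, top_site)        # top molecule gives R1 and R2
    r3, r4 = cut(bottom, bottom_site)  # bottom molecule gives R3 and R4
    # R1 is joined to R4 and R3 is joined to R2, as described in the text.
    return r1 + r4, r3 + r2

top_product, bottom_product = serine_recombine("AAAACCCC", "GGGGTTTT", 4, 4)
print(top_product, bottom_product)
```

The point of the sketch is the order of events: all four strands are broken before any exchange occurs, and no nucleotides are gained or lost when the segments are rejoined.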

Tyrosine Recombinases Break and Rejoin One Pair of DNA Strands at a Time

In contrast to the serine recombinases, the tyrosine recombinases cleave and rejoin two DNA strands first, and only then cleave and rejoin the other two strands (Figure 11-7). Consider two DNA molecules with their recombination sites aligned. Here also, four molecules of the recombinase are needed, one to cleave each of the four

FIGURE 11-7 Recombination by a tyrosine recombinase. Here the R1 and R3 subunits cleave the DNA in the first step (a); in the example shown, the protein becomes linked to the cut DNA by a 3' P-tyrosine bond. Exchange of the first pair of strands occurs when the two 5' OH groups at the break sites each attack the protein-DNA bond [...]

individual DNA strands. To start recombination, the subunits of recombinase bound to the left recombinase binding sites (marked as R1 and R3 in Figure 11-7a) each cleave the top strand of the DNA molecule to which they are bound. This cleavage occurs at the first nucleotide of the crossover region. Next the right top strand from the top (gray) DNA molecule and the right top strand from the bottom (red) DNA molecule "swap" partners. These two DNA strands are then joined, now in the recombined configurations. This "first strand" exchange reaction generates a branched DNA intermediate known as a Holliday junction (see Chapter 10) (Figure 11-7b).

Once the first strand exchange is complete, two more recombinase subunits (those marked R2 and R4) cleave the bottom strands of each DNA molecule (Figure 11-7c). These strands again switch partners, and then are joined by the reversal of the cleavage reaction. This "second strand" exchange reaction "undoes" the Holliday junction, to yield the rearranged DNA products. In the next section we discuss how these chemical steps occur in the context of the recombinase protein-DNA complex.
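The two-step pathway can be traced with a minimal Python sketch. It assumes a very reduced representation: each duplex is a dictionary holding its top and bottom strands, and each strand is a pair of labels for the portions to the left and right of the crossover region. The labels and the function names `exchange` and `is_branched` are hypothetical.

```python
def exchange(mol_a, mol_b, strand):
    """Swap the right-hand portions of one strand between two duplexes."""
    a, b = dict(mol_a), dict(mol_b)
    a_left, a_right = mol_a[strand]
    b_left, b_right = mol_b[strand]
    a[strand] = (a_left, b_right)
    b[strand] = (b_left, a_right)
    return a, b

def is_branched(mol):
    """True when the two strands of a duplex have different right partners."""
    return mol["top"] != mol["bottom"]

gray = {"top": ("gray-left", "gray-right"), "bottom": ("gray-left", "gray-right")}
red = {"top": ("red-left", "red-right"), "bottom": ("red-left", "red-right")}

# First strand exchange: only the top strands are cleaved and rejoined.
junction_gray, junction_red = exchange(gray, red, "top")
# Second strand exchange: the bottom strands follow.
product_gray, product_red = exchange(junction_gray, junction_red, "bottom")
```

After the first exchange each molecule is branched, with its top strand joined to the other molecule and its bottom strand still in the parental arrangement, which corresponds to the Holliday junction intermediate. After the second exchange both strands agree again, now in the recombinant configuration.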

Structures of Tyrosine Recombinases Bound to DNA Reveal the Mechanism of DNA Exchange

The mechanism of site-specific recombination is best understood for the tyrosine recombinases. Several structures of members of this protein class have been solved, and these structures reveal the recombinases "caught in the act" of recombination. One beautiful example is the structure of the Cre recombinase bound to two different configurations of the recombining DNA. Insights into the mechanisms derived from these structures are explained below.

Cre is an enzyme encoded by phage P1, which functions to circularize the linear phage genome during infection. The recombination sites on the DNA, where Cre acts, are called lox sites. Cre-lox is a simple example of recombination by the tyrosine recombinase family; only Cre protein and the lox sites are needed for complete recombination. Cre is also widely used as a tool in genetic engineering (see Box 11-1, Application of Site-Specific Recombination to Genetic Engineering).

The Cre-lox structures reveal that recombination requires four subunits of Cre, with each molecule bound to one binding site on the substrate DNA molecules (Figure 11-8). The conformation of the DNA is generally a square planar four-way junction (see the discussion of Holliday junctions in Chapter 10) with each "arm" of this junction bound by one subunit of Cre. Although at first glance the structures appear to have fourfold symmetry, this is not really the case. Cre exists in two distinct conformations with one pair of subunits in conformation 1, shown in green, and the other pair in conformation 2, shown in purple (Figure 11-8b). Only in one of these conformations (the green subunits in the figure) can Cre cleave and rejoin DNA. Thus, only one pair of subunits is in the active conformation at a time. The pair of subunits in this active conformation switches as the reaction progresses. This switching is critical for controlling the progress of recombination and ensuring the sequential "one strand at a time" exchange mechanism.

FIGURE 11-8 Mechanism of site-specific recombination by the Cre recombinase. (a) The left p[...]

[...]ed to develop in the absence of the recombinase, but then after birth, Cre expression can be "turned on." The presence of the recombinase causes deletion of the gene of interest. In this case, the propensity of the Cre-treated mice (in which the gene is deleted) for lung cancer can now be compared with their "normal" litter mates, in which the gene of interest is still intact. Thus, recombination using Cre allows the potential functions of the genes to be uncovered in different stages of development.

BIOLOGICAL ROLES OF SITE-SPECIFIC RECOMBINATION

Cells and viruses use conservative site-specific recombination for a wide variety of biological functions. Some of these functions are discussed in the following sections. Many phage insert their DNA into the host chromosome during infection using this recombination mechanism. In other cases, site-specific recombination is used to alter gene expression. For example, inversion of a DNA segment can allow two alternative genes to be expressed. Site-specific recombination is also widely used to help maintain the structural integrity of circular DNA molecules during cycles of DNA replication, homologous recombination, and cell division.

A comparison of site-specific recombination systems reveals some general themes. All reactions depend critically on the assembly of the recombinase protein on the DNA, and the bringing together of the two recombination sites. For some recombination reactions this assembly is very simple, requiring only the recombinase and its DNA recognition sequences as just described for Cre. In contrast, other reactions require accessory proteins. These accessory proteins include so-called architectural proteins that bind specific DNA sequences and bend the DNA. They organize DNA into a specific shape and thereby stimulate the recombination. Architectural proteins can also control the direction of a recombination reaction, for example, to ensure that integration of a DNA segment occurs while preventing the reverse reaction, DNA excision. Clearly, this type of regulation is essential for a logical biological outcome. Finally, we will also see that recombinases can be regulated by other proteins to control when a particular DNA rearrangement takes place and coordinate it with other cellular events.


λ Integrase Promotes the Integration and Excision of a Viral Genome into the Host Cell Chromosome

When bacteriophage λ infects a host bacterium, a series of regulatory events result either in establishment of the quiescent lysogenic state or in phage multiplication, a process called lytic growth (see Chapters 16 and 21). Establishment of a lysogen requires the integration of the phage DNA into the host chromosome. Likewise, when the phage leaves the lysogenic state to replicate and make new phage particles, it must excise its DNA from the host chromosome. The analysis of this integration/excision reaction provided the first molecular insights into site-specific recombination.

To integrate, the λ integrase protein (λInt) catalyzes recombination between two specific sites, known as the att, or attachment, sites. The attP site is on the phage DNA (P for phage) and the attB site is in the bacterial chromosome (B for bacteria; see Figure 11-2). λInt is a tyrosine recombinase, and the mechanism of strand exchange follows the pathway described above for the Cre protein. Unlike Cre recombination, however, λ integration requires accessory proteins to help the required protein-DNA complex to assemble. These proteins control the reaction to ensure that DNA integration and DNA excision occur at the right time in the phage life cycle. We will first consider the integration pathway and then look at how excision is triggered.

Important to the regulation of λ integration is the highly asymmetric organization of the attP and attB sites (Figure 11-9). Both sites carry

FIGURE 11-9 Recombination sites involved in λ integration and excision showing the important sequence elements. C, C', B, and B' are the core λInt binding sites. The additional protein binding sites are on attP and flank the C and C' sites. These regions are called the "arms"; the sequences on the left are called the P arm and those on the right are called the P' arm. The small purple boxes labeled P1, P2, and P'1 are the arm λInt binding sites. Sites marked H are the IHF binding sites, and sites marked X are the sites which bind Xis. F is the site bound by Fis, another architectural protein not discussed further here. The gray regions are the crossover regions. For clarity, λInt is not shown bound to the core sites. Note that not all protein binding sites are filled during either integrative or excisive recombination. After recombination, the P arm is part of attL, whereas the P' arm becomes part of attR.


Site-Specific Recombination and Transposition of DNA

FIGURE 11-10 Model for IHF bending DNA to bring DNA-binding sites together. The λInt and IHF binding sites from the P' arm of attP are shown. IHF binding to the H' site bends the DNA to allow one molecule of λInt to bind both the P'1 and C' sites. The break in the DNA within the H' site reflects a nick that was present in the DNA used for structural analysis of the IHF-DNA complex. (Source: From Rice P. et al. 1996. Crystal structure of an IHF-DNA complex. Cell 87: 130.5. Copyright © 1996, with permission from Elsevier.)

a central core segment (approximately 30 bp). These core recombination sites each consist of two λInt binding sites and a crossover region where strand exchange occurs (as described above). Whereas attB consists only of this central core region, attP is much longer (240 bp) and carries numerous additional protein binding sites. Flanking each side of the core region of attP are DNA regions known as the "arms." These arms carry a variety of protein binding sites, including additional sites bound by λInt (labeled as P1, P2, and P'1 in Figure 11-9). λInt is an unusual protein because it has two domains involved in sequence-specific DNA binding: one domain binds to the arm recombinase recognition sites and the other binds to the core recognition sites. In addition, the arms of attP carry sites bound by several architectural proteins. Binding of these proteins governs the directionality and efficiency of recombination.

Integration requires attB, attP, λInt, and an architectural protein called integration host factor (IHF). IHF is a sequence-dependent DNA-binding protein that introduces large bends (>160° ...

... excise? An additional architectural protein, this one phage-encoded, is essential for excisive recombination. This protein, called Xis (for excise), binds to specific DNA sequences and introduces bends in the DNA. In this manner, Xis is similar in function to IHF. Xis recognizes two sequence motifs present in one arm of attR (and also present in attP, marked X1 and X2 in Figure 11-9). Binding these sites introduces a large bend (>140°), and together Xis, λInt, and IHF stimulate excision by assembling an active protein-DNA complex at attR. This complex then interacts productively with proteins assembled at attL, and recombination occurs.

In addition to stimulating excision (recombination between attL and attR), DNA binding by Xis also inhibits integration (recombination between attP and attB).
The DNA structure created upon Xis binding to attP is incompatible with proper assembly of λInt and IHF at this site. Xis is a phage-encoded protein and is only made when the phage is triggered to enter lytic growth. Xis expression is described in detail in Chapter 16. Its dual action as a stimulatory cofactor for excision and an inhibitor of integration ensures ...

... If the replicated chromosome forms monomers, segregation will break the synaptic complex and the dif sites will move away from the midcell location before division. In contrast, if the chromosome forms a dimer (right panel), the synaptic complex remains trapped at midcell and allows access to FtsK, which is localized to the cell division site. FtsK then activates XerD. XerD-mediated recombination, followed by XerC-mediated recombination, then allows resolution of the dimers into monomers for cell division. (Source: Barre et al. 2001. Proc. Natl. Acad. Sci. U.S.A. 98: 8189, Fig. 5, p. 8194.)

[Figure panel labels: division site closure; activation of XerD by FtsK (XerC also active); cell division.]

FtsK is an ATPase that tracks along DNA. It functions as a "DNA-pumping protein" similar to the RuvB protein that promotes DNA branch migration during homologous recombination (discussed in Chapter 10). FtsK is also a membrane-bound protein that is localized in the cell at the site where cell division occurs. It functions to move DNA away from the center of the cell prior to division so that the cell can divide at this site (Figure 11-16). This localization of FtsK to the division site is key to how the cells ensure that XerD is activated specifically when ...

BOX 11-3 FIGURE 1a Example of corn (maize) cob showing color variegation due to transposition. (Source: Photograph taken by Barbara McClintock; image courtesy Cold Spring Harbor Laboratory Archives.)

BOX 11-3 FIGURE 1b Example of color variegation in snapdragon flowers due to Tam3 transposition. The size of white patches is related to the frequency of transposition. (Source: Chatterjee M. and Martin C. 1997. The Plant Journal 11: 759-771, Figure 2a, page 762.)

... be in different chromosomal locations in the descendents of an individual plant. This observation provided the first insight that genetic elements could move, that is, "transpose," within chromosomes.

Ds, in fact, is a nonautonomous DNA transposon that moves by cut-and-paste transposition. Ds movement requires the Ac (activator) element, also discovered by McClintock, to be present in the same cell and provide the transposase protein. Ac is now recognized to be part of a large family of DNA transposons called the hAT family, named for the hobo elements from flies, the Ac elements from maize, and the Tam elements from snapdragon.

Tn10 transposes via the cut-and-paste mechanism (described above), using the DNA hairpin strategy to cleave the nontransferred strands (Figures 11-19 and 11-21). The Tn10 sequence also has a site for IHF binding. IHF helps in the assembly of the proper transpososome complex needed for recombination, as it does during phage λ integration (see above).

Tn10 is organized into three functional modules. This organization is relatively common, and elements that have it are called composite transposons. The two outermost modules, called IS10L (left) and IS10R (right), are actually mini transposons. "IS" stands for insertion sequence. IS10R encodes the gene for the transposase that recognizes the terminal inverted repeat sequences of IS10R, IS10L, and Tn10. IS10L, although very similar in sequence to IS10R, does not encode a functional transposase. Thus, both IS10R and Tn10 are autonomous, whereas IS10L is a

Examples of Transposable Elements and Their Regulation


TABLE 11-2 Major Types of Transposable Elements

DNA-MEDIATED TRANSPOSITION

Type: Bacterial replicative transposons
Structural Features: Terminal inverted repeats that flank antibiotic-resistance and transposase genes
Mechanism of Movement: Copying of element DNA accompanying each round of insertion into a new target site
Examples: Tn3, γδ, phage Mu

Type: Bacterial cut-and-paste transposons
Structural Features: Terminal inverted repeats that flank antibiotic-resistance and transposase genes
Mechanism of Movement: Excision of DNA from old target site and insertion into new site
Examples: Tn5, Tn10, Tn7, IS911, Tn917

Type: Eukaryotic transposons
Structural Features: Inverted repeats that flank coding region with introns
Mechanism of Movement: Excision of DNA from old target site and insertion into new site
Examples: P elements (Drosophila), hAT family elements, Tc1/mariner elements

RNA-MEDIATED TRANSPOSITION

Type: Viral-like retrotransposons
Structural Features: ~250 to 600 bp direct terminal repeats (LTRs) flanking genes for reverse transcriptase, integrase, and retroviral-like Gag protein
Mechanism of Movement: Transcription into RNA from promoter in left LTR by RNA polymerase II followed by reverse transcription and insertion at target site

Type: Poly-A retrotransposons
Structural Features: 3' A-T-rich sequence and 5' UTR flank genes encoding an RNA-binding protein and reverse transcriptase
Mechanism of Movement: Transcription into RNA from internal promoter; target-primed reverse transcription initiated by endonuclease cleavage

nonautonomous transposon. Both types of IS10 elements are found, as expected, unassociated with Tn10.

Tn10 limits its copy number in any given cell by strategies that restrict its transposition frequency. One mechanism is the use of an antisense RNA to control the expression of the transposase gene (Figure 11-28) (see the discussion of antisense RNA regulation in Chapter 17). Near the end of IS10R are two promoters that direct the synthesis of RNA by the host cell's RNA polymerase. The promoter that directs RNA synthesis inward (called PIN) is responsible for the expression of the transposase gene. The promoter that directs transcription outward (POUT), in contrast, serves to regulate transposase expression by making an antisense RNA, as follows. The RNAs synthesized from PIN and POUT overlap (by 36 base pairs) and therefore can pair by hydrogen bonding between these overlapping (complementary) regions. This pairing prevents binding of ribosomes to the PIN transcript, and thus synthesis of the transposase protein.

By this mechanism, cells that carry more copies of Tn10 will transcribe more of the antisense RNA, which in turn will limit expression of the transposase gene (Figure 11-28, see legend for more details). The transposition frequency will, therefore, be very low in such a strain. In contrast, if there is only one copy of Tn10 in the cell, the level of antisense RNA will be low, synthesis of the transposase protein will be efficient, and transposition will occur at a higher frequency.
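The pairing between the PIN and POUT transcripts follows from simple geometry: two RNAs transcribed in opposite directions across the same stretch of DNA are complementary over the shared region. The sketch below illustrates this with a made-up sequence. The sequence, the start positions, and all function names are assumptions chosen for illustration; they are not the real IS10 sequence or coordinates.

```python
# Toy illustration: RNAs made in opposite directions across the same DNA
# segment are complementary to each other over that segment.

# Maps a DNA template base to the RNA base that pairs with it.
TEMPLATE_TO_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def transcribe_rightward(top, start, length):
    """RNA from a polymerase moving left to right; it reads like the top strand."""
    return top[start:start + length].replace("T", "U")


def transcribe_leftward(top, start, length):
    """RNA from a polymerase moving right to left, beginning at index `start`.

    Here the top strand serves as template, so the RNA is its reverse complement.
    """
    region = top[start - length + 1:start + 1]
    return "".join(TEMPLATE_TO_RNA[base] for base in reversed(region))


def can_pair(rna_a, rna_b):
    """True if the two RNAs are fully complementary when aligned antiparallel."""
    if len(rna_a) != len(rna_b):
        return False
    return all((x, y) in RNA_PAIRS for x, y in zip(rna_a, reversed(rna_b)))


# Hypothetical top-strand sequence, written 5' to 3' (NOT the real IS10 sequence).
top_strand = "GATTACAGGCTTAACCGGATATCGCATGCAAGTCCTGAAGGTCATTGCAGCTTAGGCCAT"
OVERLAP = 36
out_start = 10                      # assumed start of the rightward (POUT-like) transcript
in_start = out_start + OVERLAP - 1  # leftward (PIN-like) start, 36 bases to the right

antisense_5prime = transcribe_rightward(top_strand, out_start, OVERLAP)
mrna_5prime = transcribe_leftward(top_strand, in_start, OVERLAP)
print(can_pair(antisense_5prime, mrna_5prime))
```

The check succeeds for any sequence, because the complementarity comes from the arrangement of the two start sites rather than from the particular bases.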

Tn10 Transposition Is Coupled to Cellular DNA Replication

Tn10 also couples transposition to cellular DNA replication. Recall that bacteria such as E. coli (a common host for Tn10) methylate their

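TABLE 11-2 (continued) Examples of RNA-mediated transposition
Viral-like retrotransposons: Ty elements (yeast), Copia elements (Drosophila)
Poly-A retrotransposons: F and G elements (Drosophila), LINE and SINE elements (mammals), Alu sequences (humans)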


FIGURE 11-28 Antisense regulation of Tn10 expression. (a) A map of the overlapping promoter regions is shown. The leftward promoter (PIN) promotes expression of the transposase gene; the rightward promoter (POUT), which lies 36 bases to the left of PIN, promotes expression of an antisense RNA. The first 36 bases of each transcript are complementary to one another. Note that in cells the antisense transcript initiated at POUT is longer lived than is the mRNA initiated at PIN. (b) In cells having a high copy number of Tn10, the RNA:RNA pairing occurs frequently and blocks translation of the transposase mRNA (thereby eventually reducing the copy number of the element). (c) In cells having a low copy number of the transposon, RNA:RNA pairing is rare; the translation of transposase mRNA is efficient and the copy number in the cell is increased.

[Figure labels: (a) IS10 right, POUT, PIN, antisense RNA, transposase gene, mRNA, 36 base pair overlap; (b) high Tn10 copy number: RNA:RNA pairing is frequent, translation of transposase mRNA is blocked; (c) low Tn10 copy number: RNA:RNA pairing is rare, translation of transposase mRNA is efficient.]

DNA at GATC sites (see Chapter 8, Box 8-4). This methylation occurs after DNA replication, such that GATC sites are hemimethylated for the few minutes between passage of the replication fork and recognition of these sequences by the methylase enzyme. It is during this brief period, when the Tn10 DNA is hemimethylated, that transposition is most likely to occur.

This coupling of transposition to the methylation state is due to the presence of two critical GATC sites in the transposon sequence. One of these sites is in the promoter for the transposase gene; the second is in the binding site for the transposase within one of the inverted terminal repeats. Both RNA polymerase and transposase bind more tightly to the hemimethylated sequences than to their fully methylated versions. As a result, when the DNA is hemimethylated, the transposase gene is most efficiently expressed, and the transposase protein binds most efficiently to the DNA. Therefore, transposition of Tn10 occurs at its highest frequency during this brief phase of the cell cycle just after its DNA has been replicated (Figure 11-29).

Regulation of Tn10 transposition by DNA methylation serves to limit the overall frequency of transposition. It also restricts transposition specifically to actively dividing cells. This timing ensures that there are two copies of the chromosome present to "heal" the double-stranded DNA break left in the old target site as a result of transposon excision. These "empty target sites" are repaired via homologous recombination by the double-strand break repair pathway. This recombination reaction requires that two copies of the chromosomal region be present (see Chapter 10).
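The three methylation states discussed above can be made concrete with a small bookkeeping sketch. It assumes a deliberately simplified model in which each GATC site is identified by its position on the top strand and each strand is either methylated or not at that site; the function names and the example sequence are hypothetical.

```python
def gatc_sites(seq):
    """Return the 0-based start index of every GATC in the top strand."""
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]


def methylation_state(site, top_methylated, bottom_methylated):
    """Classify one GATC site by which strands carry a methyl group there."""
    on_top = site in top_methylated
    on_bottom = site in bottom_methylated
    if on_top and on_bottom:
        return "fully methylated"
    if on_top or on_bottom:
        return "hemimethylated"
    return "unmethylated"


# Hypothetical sequence with two GATC sites. GATC reads the same on both
# strands, so one index can stand for the site on either strand.
seq = "TTGATCAAGGGATCTT"
sites = gatc_sites(seq)

# Before replication: both strands methylated at every site.
before = [methylation_state(s, set(sites), set(sites)) for s in sites]

# Just after the fork passes: the parental strand keeps its methyl groups,
# while the newly made strand has none yet.
after_fork = [methylation_state(s, set(sites), set()) for s in sites]

print(sites, before, after_fork)
```

In this picture, the window in which Tn10 transposition is favored corresponds to the interval during which the relevant sites report "hemimethylated".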


FIGURE 11-29 Transposition of Tn10 after passage of a replication fork. Transposition is activated by the hemimethylated DNA that exists just after DNA replication ...

... transposition-like mechanism can be used for other types of DNA rearrangement reactions. The prime example of this is the V(D)J recombination reaction, responsible for assembly of gene fragments during development of the vertebrate immune system.


... passed into protein it cannot get out again. The transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein, may be possible, but transfer from protein to protein, or from protein to nucleic acid, is impossible. Information means here the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein.

Part 3 Expression of the Genome

Chapters 12 through 15 trace the flow of information from the copying of the gene into an RNA replica known as the messenger RNA to the decoding of the messenger RNA into a polypeptide chain.

The process by which nucleotide sequence information is transferred from DNA to RNA is known as transcription, and this is the subject of Chapter 12. A multi-subunit molecular machine known as RNA polymerase creates a moving "bubble" in the double helix in which DNA is unwound at the leading edge of the bubble and rewound into a helix at the trailing edge. The RNA polymerase uses one of the two transiently separated DNA strands within the bubble as a template upon which it progressively builds a complementary RNA copy by base-pairing. The messenger RNA is created in a similar manner in all cells. But, while the basic enzyme that makes the RNA is very similar, the rest of the machinery involved in transcription in eukaryotes is more complex than its counterpart in prokaryotes. Sequences in the DNA that determine where transcription starts (promoter) and where it stops (terminator) are also described.

In prokaryotes, once the messenger RNA is synthesized, it is ready for the next stage of information flow in which RNA is used as a template for protein synthesis. But not in eukaryotes: there the RNA product of transcription must undergo a series of maturation events before it is competent to serve as a messenger RNA. Two of these, the addition of the so-called "cap" structure to the 5' end and of a poly-A tail to the 3' end, are described in Chapter 12. The most dramatic processing event is called mRNA splicing, and is described in Chapter 13. Genes in eukaryotic cells are frequently interrupted by one, or sometimes many, nonprotein-coding segments known as introns. When the gene is transcribed into an RNA copy, these introns must be removed so that the protein-coding segments, known as exons, can be joined to each other to create a contiguous protein-coding sequence. Chapter 13 describes the elaborate molecular machine responsible for removing introns with great precision.

Part 3 culminates, in Chapters 14 and 15, with the process known as translation. This is the process whereby genetic information, in the form of the sequence of nucleotides in messenger RNA, is used to direct the ordered incorporation of amino acids into the polypeptide chain of a protein. Chapter 14 describes the four principal participants in translation: the coding sequence in messenger RNA; adaptor molecules known as tRNAs; enzymes known as aminoacyl tRNA synthetases that load amino acids onto the tRNA adaptors; and the protein-synthesizing factory itself, the ribosome, which is composed of RNA and protein. The remainder of the chapter describes how these four components, with help from a number of key auxiliary factors, manage the remarkable process of converting the nucleotide code of a given mRNA into the correct order of amino acids in its protein product.

Finally, Chapter 15 describes the classic experiments that led to the elucidation of the genetic code, and lays out the rules by which the code is translated. The nucleotide sequence information is based on a three-letter code, while the protein sequence information is based on twenty different amino acids. The code is degenerate, with two or more codons (in most cases) specifying the same amino acid. There are also specific codons that indicate where translation should start and where it should stop.

PHOTOS FROM THE COLD SPRING HARBOR LABORATORY ARCHIVES

Richard Roberts, 1977 Symposium on Chromatin. Much of Roberts' research ...

... however, was for establishing the worm, C. elegans, as a model system for the study of developmental biology (Chapter 21).

Francis Crick, 1963 Symposium on Synthesis and Structure of Macromolecules. In addition to his role in solving the structure of DNA, Crick was an intellectual driving force in the development of molecular biology during the field's critical early years. His "adaptor hypothesis" (published in the RNA Tie Club newsletter) predicted the existence of molecules required to translate the genetic code of RNA into the amino acid sequence of proteins. Only later were tRNAs found to do just that (Chapter 14).

Phillip Sharp, 1974 Symposium on Tumor Viruses. Sharp and Richard Roberts shared the 1993 Nobel Prize in Medicine for showing that many eukaryotic genes are "split"; that is, their coding regions are interrupted by stretches of non-coding DNA. The non-coding regions are removed from the RNA copy by "splicing" (Chapter 13). Sharp is shown here with his wife Anne.

Paul Zamecnik, 1969 Symposium on The Mechanism of Protein Synthesis. Zamecnik developed in vitro systems of protein synthesis that proved critical to understanding ...

... The interaction between TBP and DNA involves only a limited number of hydrogen bonds between the protein and the edges of the base pairs in the minor groove. Instead, much of the specificity is imposed by two pairs of phenylalanine side chains that intercalate between the base pairs at either end of the recognition sequence and drive the strong bend in the DNA. Thus, A:T base pairs are favored because they are more readily distorted to allow the initial opening of the minor groove. There are also extensive interactions between the phosphate backbone and basic residues in the β sheet, adding to the overall binding energy of the interaction.

The Other General Transcription Factors also Have Specific Roles in Initiation

We do not know in detail the functions of all the other general transcription factors. As we have noted, some of these factors are in fact complexes made up of two or more subunits (shown in Table 12-2). Below we comment on a few structural and functional characteristics.

TAFs. TBP is associated with about ten TAFs. Two of the TAFs bind DNA elements at the promoter; for example, the initiator element (Inr) and the downstream promoter element (DPE) (see Figure 12-12). Several of the TAFs have structural homology to histone proteins, and it has been proposed that they might bind DNA in a similar manner, although evidence for such a form of DNA binding has not been obtained. For example, TAF42 and TAF62 from Drosophila have been shown to form a structure similar to that of the H3·H4 tetramer (see Chapter 7). These histone-like TAFs are found not only in the TFIID complex but are also associated with some histone modification enzymes, such as the yeast SAGA complex (see Table 7-7). Another TAF appears to regulate the binding of TBP to DNA. It does this using an inhibitory flap that binds to the DNA-binding surface of TBP, another example of molecular mimicry. This flap must be displaced for TBP to bind TATA.

TFIIB. This protein, a single polypeptide chain, enters the pre-initiation complex after TBP (Figure 12-13). The crystal structure of the ternary complex of TFIIB-TBP-DNA shows specific TFIIB-TBP and TFIIB-DNA contacts (Figure 12-15). These include base-specific interactions with the major groove upstream (to the BRE; see Figure 12-12) and the minor groove downstream of the TATA element. The asymmetric binding of TFIIB to the TBP-TATA complex accounts for the

TABLE 12-2 The General Transcription Factors of RNA Polymerase II

GTF      Number of Subunits
TBP      1
TFIIA    2
TFIIB    1
TFIIE    2
TFIIF    3
TFIIH    9
TAFs     11

The numbers shown are for yeast but are similar for other eukaryotes, including humans.

Mechanisms of Transcription

FIGURE 12-15 TFIIB-TBP-promoter complex. This structure shows the TBP protein bound to the TATA sequence, just as we saw in the previous figure. Here, the general transcription factor TFIIB (shown in turquoise) has been added. This tripartite complex forms the platform to which other general transcription factors, and Pol II itself, are recruited during pre-initiation complex assembly. (Nikolov D.B., Chen H., Halay E.D., Usheva A.A., Hisatake K., Lee D.K., Roeder R.G., and Burley S.K. 1995. Nature 377: 119.) Image prepared with MolScript, BobScript, ...

... each: bacteria have just one. The three eukaryotic enzymes are called RNA Pol I, II, and III. In this chapter we focused primarily on Pol II, as this is the enzyme that transcribes the vast majority of genes in the cell and all the protein-coding genes.

The basic enzyme from E. coli, called the core enzyme, has one copy of each of three subunits (β, β', and ω) and two copies of α. All these subunits have homologues in the eukaryotic enzymes. The structures of the bacterial and yeast Pol II enzymes are also similar. Both resemble a crab claw in shape, the pincers being made up of the largest subunits, β and β' in the case of the bacterial enzyme. The active site is at the base of the pincers, and access to and from the active site is afforded through five channels: one allows double-stranded DNA to enter between the pincers at the front of the enzyme; two others allow the two single strands (the template and non-template strands) to leave the enzyme behind the active site; another channel provides the route by which NTPs enter the active site; and the RNA product, which peels off the DNA template a short distance behind the site of polymerization, exits the enzyme through the fifth channel.

Pol II differs from the bacterial enzyme in one important way. The former has a so-called "tail" at the C-terminal end of the large subunit, and this is absent from the bacterial enzyme. This tail is made up of multiple repeats of a heptapeptide sequence.

A round of transcription proceeds through three phases called initiation, elongation, and termination. Though RNA polymerases can synthesize RNA unaided, other proteins, called initiation factors, are required for accurate and efficient initiation. These factors ensure that the enzyme initiates transcription only from appropriate sites on the DNA, called promoters. In bacteria there is only one initiation factor, σ, whereas in eukaryotes there are several, collectively called the general transcription factors. In eukaryotes, the DNA is wrapped within nucleosomes and, in vivo, efficient initiation very often requires additional proteins, including the Mediator Complex and nucleosome-modifying enzymes. Transcriptional activator proteins are also needed (see Chapter 17).

During initiation, RNA polymerase (together with the initiation factors) binds to the promoter in a closed complex. In that state the DNA remains in a double-stranded form. This closed complex then undergoes isomerization to the open complex. In that form, the DNA around the transcription start site is unwound, disrupting the base pairs and forming a bubble of single-stranded DNA. This transition allows access to the template strand, which determines the order of bases in the new RNA strand. This phase of initiation is followed by promoter escape: once the enzyme has synthesized a series of short RNAs, a process called abortive initiation, it manages to make a transcript that grows beyond 10 nucleotides. At this point the enzyme leaves the promoter and enters the elongation phase.

During this phase, polymerase moves along the gene while the enzyme performs several functions: it opens the DNA downstream and reseals it upstream (behind) the active site; it adds ribonucleotides to the 3' end of the growing transcript; it peels the newly formed RNA off the template some 8 or 9 base pairs behind the point of polymerization; and it also proofreads the transcript, checking for (and replacing) incorrectly inserted nucleotides.

Transcription in both bacteria and eukaryotes follows these same steps. There are differences in the two cases, however. For example, in bacteria, isomerization to the open complex occurs spontaneously and does not require ATP hydrolysis. In eukaryotes this step does require ATP hydrolysis. More strikingly, in eukaryotes, promoter escape is regulated by the phosphorylation state of the CTD tail. Thus, the form of Pol II that binds the promoter in the pre-initiation complex has an unphosphorylated CTD. This domain becomes phosphorylated by one or more kinases, including one that is part of one of the general transcription factors, TFIIH.

Termination also works differently in bacteria and eukaryotes. Thus, in bacteria there are two kinds of terminators: intrinsic (Rho-independent) and Rho-dependent. Intrinsic terminators consist of two sequence elements that operate once transcribed into RNA. One element is an inverted repeat that forms a stem loop in the RNA, disrupting the elongating polymerase. In combination with a string of U nucleotides (which bond only weakly with the template strand), this leads to release of the transcript.

Rho-dependent terminators require the ATPase Rho, a protein that hops on elongating transcripts and "pulls" them from the enzyme. In eukaryotes, termination is closely linked to an RNA processing event called 3' polyadenylation. Once phosphorylated, the CTD tail of Pol II frees itself from the other proteins at the promoter, releasing polymerase into the elongation phase. The CTD then binds factors involved in transcriptional elongation and RNA processing. Thus, there is an exchange of initiation factors for elongation and processing factors as the polymerase moves away from the promoter and starts transcribing the gene. There are also interactions between the elongation factors and those involved in processing, ensuring proper coordination of these events. In this chapter we considered capping of the 5' end of the RNA transcripts, polyadenylation of the 3' end, and the link between the last of these and transcriptional termination. Splicing is described in the next chapter.
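The two sequence elements of an intrinsic terminator described in this summary, an inverted repeat that can form a stem loop followed by a run of U nucleotides, lend themselves to a simple pattern search. The sketch below is a deliberately crude scan, not a real terminator-prediction method: it assumes a perfectly paired stem of fixed length, ignores G-U wobble pairs and folding energetics, and uses a made-up RNA sequence. All names and parameter values are hypothetical.

```python
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def arms_pair(left, right):
    """True if the two stem arms are fully complementary when antiparallel."""
    return len(left) == len(right) and all(
        (a, b) in RNA_PAIRS for a, b in zip(left, reversed(right))
    )


def find_intrinsic_terminator(rna, stem=6, loops=range(3, 9), min_u=4):
    """Return (stem_start, loop_length) of the first hairpin + U-run, or None.

    Searches for: left arm (stem nt), loop (3-8 nt), right arm (stem nt),
    immediately followed by at least `min_u` U residues.
    """
    for start in range(len(rna)):
        for loop in loops:
            right_start = start + stem + loop
            u_start = right_start + stem
            if u_start + min_u > len(rna):
                continue  # not enough sequence left for the U run
            left = rna[start:start + stem]
            right = rna[right_start:u_start]
            has_u_run = rna[u_start:u_start + min_u] == "U" * min_u
            if arms_pair(left, right) and has_u_run:
                return start, loop
    return None


# Hypothetical transcript: leader, stem arm, 4-nt loop, matching arm, U tract.
rna = "AUGGCUAAC" + "GCCGGG" + "CUUA" + "CCCGGC" + "UUUUUU" + "GA"
print(find_intrinsic_terminator(rna))
```

A transcript lacking either element, such as one with no U run after the hairpin, returns `None` under these assumptions.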


e

H A P TER

RNA Splicing

he coding sequence of a gene is a series of Ihree-nuclootide codoos that specity the linear sequence of amino acids in its polypcptidc product. Thus far we have tadlly assumed tha! the coding sequence is contiguous: the codon for one amino acid is imme-diately adjacent lo the codon for the next amino acid in the polypeptide cham. This is true in the vas! majority of cases in bacteria and their phage. Bul it is nol ahvays su fOI eukaryotic genes. [o those cases, t.he coding sequence is periodically interrupted by stretches of noncoding soqueoce. Thus m.my cukHryotic genes are mosaies , consisting of blocks of coding sequences separated from each other by blocks of ooncoding sequences. The coding sequences are called exons and the intervening sequcnccs are callcd ¡nloollS. As a consequcnce of this alternating pat tem of exons and ¡ntrons. genes hearmg noncoding interruptions are often said to be "in pieces" or "split. " Figure 13-1 shows a typical eukaryotic gene in which the coding region is interrupted by thrce ¡ntrons. splitting il into four exons. Tho number of introns found within a gene varies enormous ly-from one in the case of mosl intron-contain ing yeast genes (and a few human gllnesl, to 50 io the case of the chicken proa2 collagen gene, to as maDy as 363 in the case of the 1itin gene of humaos. Also, the sizes of the exoos and introns vary. Indeed iotrons are very often much longer tban the exons they separate. Thus, for exam ple, exons are typica lly on the arder of 150 nudeotides, whereas introns-though they too can be short-can be as loog as 800,000 nuc1eotides (BOa kbl_ As aoother cxample, the mammaliao gene for the enzyme dihydrofolate reductase is more Ihan 31 kb long, and witbin it are dispersed six exons that corrcspond to 2 kb of mRNA . Thus, io this case, Ihe coding portian of the gene is Icss than 10% of jts totallength.

OUTLINE

• The Chemistry of RNA Splicing
• The Spliceosome Machinery
• Splicing Pathways
• Alternative Splicing
• Exon Shuffling
• RNA Editing
• mRNA Transport

FIGURE 13-1 Typical eukaryotic gene. The depicted gene contains four exons separated by three introns. Transcription from the promoter generates a pre-mRNA, shown in the middle line, that contains all the exons and introns. Splicing removes the introns and fuses the exons to generate the mature mRNA that, once processed further (see polyadenylation, Chapter 12) and exported from the nucleus, can be translated to give a protein product.

Like the uninterrupted genes of prokaryotes, the split genes of eukaryotes are transcribed into a single RNA copy of the entire gene. Thus, the primary transcript for a typical eukaryotic gene contains introns as well as exons. This is shown in the middle part of Figure 13-1. Because of the length and number of introns, the primary transcript (or pre-mRNA) can be very long indeed. In the extreme case of the human dystrophin gene, RNA polymerase must traverse 2,400 kb of DNA to copy the entire gene into RNA. (Given that transcription proceeds at a rate of 40 nucleotides per second, it can readily be seen that it takes a staggering 17 hours to make a single transcript of this gene!)

Despite this seemingly odd gene organization, the protein-synthesizing machinery of the cell (Chapter 14) is equipped only to translate messenger RNAs containing a contiguous stretch of codons; it has no way of identifying and skipping over a block of noncoding sequence. And so the primary transcripts of split genes must have their introns removed before they can be translated into protein.

Introns are removed from the pre-mRNA by a process called RNA splicing. This process converts the pre-mRNA into mature messenger RNA and must occur with great precision to avoid the loss, or addition, of even a single nucleotide at the sites at which the exons are joined. As we shall see in Chapters 14 and 15, the triplet-nucleotide codons of mRNA are translated in a fixed reading frame that is set by the first codon in the protein-coding sequence. Lack of precision in splicing (if, for example, a base were lost or gained at the boundary between two exons) would throw the reading frames of exons out of register: downstream codons would be incorrectly selected and the wrong amino acids incorporated into proteins.

Some pre-mRNAs can be spliced in more than one way, generating alternative mRNAs. So, for example, different combinations of introns might be removed. This is called alternative splicing, and, by this strategy, a gene can give rise to more than one polypeptide product. It is estimated that 60% of the genes in the human genome are spliced in alternative ways to generate more than one protein per gene. The number of different variants a given gene can encode in this way varies from two to hundreds or even thousands. For example, the Slo gene from rat, which encodes a potassium channel expressed in neurons, has the potential to encode 500 alternative versions of that product. And, as we shall see, there is a Drosophila gene that can encode as many as 38,000 possible products as a result of alternative splicing!

In this chapter we discuss not only the mechanisms and regulation of RNA splicing, but also ideas about why eukaryotic genes have interrupted coding regions. We also describe RNA editing, another way initial transcripts can be altered to change what they encode.
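Two of the quantitative points above lend themselves to a short worked example: the transcription time for the dystrophin gene, and the effect of losing a single base at an exon junction. The sketch below is a minimal illustration, not a model of real sequences; the exon strings and function names are hypothetical, and a constant elongation rate is assumed.

```python
def transcription_hours(gene_length_nt, rate_nt_per_s):
    """Hours needed to transcribe a gene at a constant elongation rate."""
    return gene_length_nt / rate_nt_per_s / 3600

def codons(rna):
    """Split an RNA string into consecutive, non-overlapping triplets."""
    usable = len(rna) - len(rna) % 3          # ignore a trailing partial codon
    return [rna[i:i + 3] for i in range(0, usable, 3)]

hours = transcription_hours(2_400_000, 40)    # 2,400 kb at 40 nt per second
print(f"{hours:.1f} hours")                   # → 16.7 hours

exon1, exon2 = "AUGGCU", "GAAUUCUGG"          # hypothetical exon sequences
precise = codons(exon1 + exon2)               # exons joined exactly
sloppy = codons(exon1 + exon2[1:])            # one base lost at the junction
print(precise)   # → ['AUG', 'GCU', 'GAA', 'UUC', 'UGG']
print(sloppy)    # → ['AUG', 'GCU', 'AAU', 'UCU']
```

The codons upstream of the junction are unchanged, but every codon downstream of the lost base is read in a different register, which is why splicing has to be accurate to the nucleotide.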

THE CHEMISTRY OF RNA SPLICING

Sequences within the RNA Determine Where Splicing Occurs

We now consider the molecular mechanisms of the splicing reaction. How are the introns and exons distinguished from each other? How are introns removed? How are exons joined with high precision? The borders between introns and exons are marked by specific nucleotide sequences within the pre-mRNAs. These sequences delineate where splicing will occur. Thus, as shown in Figure 13-2, the exon-intron boundary (that is, the boundary at the 5' end of the intron) is marked by a sequence called the 5' splice site. The intron-exon boundary at the 3' end of the intron is marked by the 3' splice site. (The 5' and 3' splice sites were sometimes referred to as the donor and acceptor sites, respectively, but this nomenclature is rarely used today.) The figure shows a third sequence necessary for splicing. This is called the branch point site (or branch point sequence). It is found entirely within the intron, usually close to its 3' end, and is followed by a polypyrimidine tract (Py tract), as shown.

The consensus sequence for each of these elements is shown in Figure 13-2. The most highly conserved sequences are the GU in the 5' splice site, the AG in the 3' splice site, and the A at the branch site. These highly conserved nucleotides are all found within the intron itself, perhaps not surprisingly, as the sequence of the exons, in contrast to the introns, is constrained by the need to encode the specific amino acids of the protein product.
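The conserved elements just listed (GU at the 5' end of the intron, a branch A followed by a pyrimidine-rich tract, and AG at the 3' end) can be expressed as a simple check on a candidate intron. This is a rough sketch under stated assumptions: the window size and the minimum pyrimidine run are arbitrary illustrative thresholds, the example sequences are invented, and real splice-site recognition depends on the fuller consensus sequences and on the proteins and snRNPs described later.

```python
def has_core_splice_signals(intron, search_window=40, min_py_run=6):
    """True if an intron (RNA, written 5'->3') shows GU ... A ... Py tract ... AG."""
    if not (intron.startswith("GU") and intron.endswith("AG")):
        return False
    # Search near the 3' end, excluding the terminal AG itself.
    region = intron[-search_window:-2]
    for i, base in enumerate(region):
        if base != "A":                      # candidate branch-site A
            continue
        run = 0
        for downstream_base in region[i + 1:]:
            # count consecutive pyrimidines (C or U) after the candidate A
            run = run + 1 if downstream_base in "CU" else 0
            if run >= min_py_run:
                return True
    return False

# Invented test sequences (not taken from any real gene).
good = "GUAAGUCGCGCGUACUAACUUUCUUUCCUCAG"
no_py_tract = "GUAAGUCGCGCGUACUAACGAGAGAGAGCAG"
wrong_5_end = "AUAAGUCGCGCGUACUAACUUUCUUUCCUCAG"

print(has_core_splice_signals(good))         # → True
print(has_core_splice_signals(no_py_tract))  # → False
print(has_core_splice_signals(wrong_5_end))  # → False
```

The check only looks inside the intron, mirroring the observation that the most highly conserved nucleotides all lie within the intron rather than in the flanking exons.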

The Intron Is Removed in a Form Called a Lariat as the Flanking Exons Are Joined

Let us begin by considering the chemistry of splicing, which is achieved by two successive transesterification reactions in which phosphodiester linkages within the pre-mRNA are broken and new ones are formed (Figure 13-3). The first reaction is triggered by the 2'OH of the conserved A at the branch site. This group acts as a nucleophile to attack the phosphoryl group of the conserved G in the 5' splice site. (This is an SN2 reaction that proceeds through a pentavalent phosphorus intermediate.) As a consequence, the phosphodiester bond between the sugar and the phosphate at the junction between the intron and the exon is cleaved and the freed 5' end of the intron is joined to the A within the branch site. Thus, in addition to the 5' and 3' backbone linkages, a third phosphodiester extends from the 2'OH of that A to create a three-way junction (hence its description as a branch point). The structure of the three-way junction is shown in Figure 13-4.

Notice that the 5' exon is a leaving group in the first transesterification reaction. In the second reaction, the 5' exon (more precisely, the newly liberated 3'OH of the 5' exon) reverses its role and becomes a nucleophile that attacks the phosphoryl group at the 3' splice site (Figure 13-3). This second reaction has two consequences. First, and most importantly, it joins the 5' and 3' exons; second, it releases the intron. Because the 5' end of the intron was joined to the branch site A in the first transesterification reaction, the newly liberated intron has the shape of a lariat.

In the two reaction steps, there is no net gain in the number of chemical bonds: two phosphodiester bonds are broken, and two new ones made. As it is just a question of shuffling bonds, no energy input is demanded by the chemistry of this process. But, as we shall see below, a large amount of ATP is consumed during the splicing reaction. This energy is required, not for the chemistry, but to properly assemble and operate the splicing machinery.

FIGURE 13-4 The structure of the three-way junction formed during the splicing reaction.

Another point about the splicing reaction is direction: what ensures that splicing only goes forward, that is, toward the products shown in Figure 13-3? Two features that could contribute to this are as follows. First, the forward reaction involves an increase in entropy: a single pre-mRNA molecule is split into two molecules, the mRNA and the liberated lariat. Second, the excised intron is rapidly degraded after its removal and so is not available to partake in the reverse reaction.
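The bookkeeping of the two transesterification steps can be followed on a string model of a pre-mRNA. The sketch below is a toy under stated assumptions: the sequences are invented, the function and dictionary keys are hypothetical names, and the "chemistry" is reduced to slicing a string at the three positions the text identifies (the G of the 5' splice site, the branch A, and the first base after the 3' splice site AG).

```python
def splice(pre_mrna, five_ss, branch, three_ss):
    """Model the two transesterifications on a pre-mRNA string.

    five_ss  : index of the first intron base (the conserved G of GU)
    branch   : index of the branch-site A
    three_ss : index of the first base of the 3' exon (just after AG)
    """
    assert pre_mrna[five_ss:five_ss + 2] == "GU"
    assert pre_mrna[branch] == "A"
    assert pre_mrna[three_ss - 2:three_ss] == "AG"

    bonds_broken = bonds_formed = 0

    # Step 1: the branch A 2'OH attacks the 5' splice site.
    # The exon-intron bond is broken; a 2'-5' bond to the branch A is formed.
    exon1 = pre_mrna[:five_ss]
    bonds_broken += 1
    bonds_formed += 1

    # Step 2: the 3'OH of the 5' exon attacks the 3' splice site.
    # The intron-exon bond is broken; the exon-exon bond is formed.
    intron = pre_mrna[five_ss:three_ss]
    exon2 = pre_mrna[three_ss:]
    bonds_broken += 1
    bonds_formed += 1

    offset = branch - five_ss
    lariat = {"loop": intron[:offset + 1],   # 5' G through the branch A
              "tail": intron[offset + 1:]}   # remainder, ending in AG
    return exon1 + exon2, lariat, bonds_formed - bonds_broken

# Invented sequences for illustration only.
exon1_seq = "AUGGCAG"
intron_seq = "GUAAGUUACUAACUUUCUUUCAG"
exon2_seq = "GUCUAA"
pre = exon1_seq + intron_seq + exon2_seq

five_ss = len(exon1_seq)
# take the last A before the terminal AG as the branch site in this toy intron
branch = five_ss + intron_seq.rindex("A", 0, len(intron_seq) - 2)
three_ss = five_ss + len(intron_seq)

mrna, lariat, net_bonds = splice(pre, five_ss, branch, three_ss)
print(mrna)        # → AUGGCAGGUCUAA
print(lariat)      # → {'loop': 'GUAAGUUACUAA', 'tail': 'CUUUCUUUCAG'}
print(net_bonds)   # → 0
```

The net bond count of zero is the point made in the text: the chemistry itself needs no energy input, and one input molecule becomes two products.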

Exons from Different RNA Molecules Can Be Fused by Trans-Splicing

In our description of splicing above, we assumed that the 5' splice site of one exon is joined to the 3' splice site of the exon that immediately follows it. This is not always the case. In alternative splicing, exons can be skipped, and a given exon is joined to one further downstream (as we see later in the text). In some cases, two exons carried on different RNA molecules can be spliced together in a process called trans-splicing. Although generally rare, trans-splicing occurs in almost all the mRNAs of trypanosomes. In the nematode worm (C. elegans), all mRNAs undergo trans-splicing (to attach a 5' leader sequence), and many of them undergo cis-splicing as well. Figure 13-5 shows how the basic splicing reaction just described is adapted to carry out trans-splicing.

THE SPLICEOSOME MACHINERY

RNA Splicing Is Carried Out by a Large Complex Called the Spliceosome

The transesterification reactions just described are mediated by a huge molecular "machine" called the spliceosome. This complex comprises about 150 proteins and 5 RNAs and is similar in size to a ribosome (Chapter 14). In carrying out even a single splicing reaction, the spliceosome hydrolyzes several molecules of ATP. Strikingly, it is believed that many of the functions of the spliceosome are carried out by its RNA components rather than the proteins, again reminiscent of the ribosome. Thus, RNAs locate the sequence elements at the intron-exon borders and likely participate in catalysis of the splicing reaction itself.

The five RNAs (U1, U2, U4, U5, and U6) are collectively called small nuclear RNAs (snRNAs). Each of these RNAs is between 100 and 300 nucleotides long and is complexed with several proteins. These RNA-protein complexes are called small nuclear ribonucleoproteins (snRNPs, pronounced "snurps"). The spliceosome is the large complex made up of these snRNPs, but the exact makeup differs at different stages of the splicing reaction: different snRNPs come and go at different times, each carrying out particular functions in the reaction. There are also many proteins within the spliceosome that are not part of the snRNPs, and others besides that are only loosely bound to the spliceosome.

FIGURE 13-5 Trans-splicing. In trans-splicing, two exons, initially found in two separate RNA molecules, are spliced together into a single mRNA. The chemistry of this reaction is the same as that of the standard splicing reaction described previously, and the spliced product is indistinguishable. The only difference is that the other product (the lariat in the standard reaction) is, in trans-splicing, a Y-shaped branch structure instead. This is because the initial reaction brings together two RNA molecules rather than forming a loop within a single molecule.

The snRNPs have three roles in splicing. They recognize the 5' splice site and the branch site; they bring those sites together as required; and they catalyze (or help to catalyze) the RNA cleavage and joining reactions. To perform these functions, RNA-RNA, RNA-protein, and protein-protein interactions are all important. We start by considering some of the RNA-RNA interactions. These operate within individual snRNPs, between different snRNPs, and between snRNPs and the pre-mRNA.

Thus, for example, Figure 13-6a shows the interaction, through complementary base-pairing, of the U1 snRNA and the 5' splice site in the pre-mRNA. Later in the reaction, that splice site is recognized by the U6 snRNA. In another example, shown in Figure 13-6b, the branch site is recognized by the U2 snRNA. A third example, in Figure 13-6c, shows an interaction between U2 and U6 snRNAs. This brings the 5' splice site and the branch site together. It is these and other similar interactions, and the rearrangements they lead to, that drive the splicing reaction and contribute to its precision, as we will see a little later.

Some RNA-free proteins are involved in splicing, as mentioned above. One example, U2AF (U2 auxiliary factor), recognizes the polypyrimidine (Py) tract/3' splice site and, in the initial step of the splicing reaction, helps another protein, branch-point binding protein (BBP), bind to the branch site. BBP is then displaced by the U2 snRNP, as shown in Figure 13-6d. Other proteins involved in the splicing reaction include RNA-annealing factors, which help load snRNPs onto the mRNA, and DEAD-box helicase proteins. The latter use their ATPase activity to dissociate given RNA-RNA interactions, allowing alternative pairs to form and thereby driving the rearrangements that occur through the splicing reaction.

Finally, before turning to the spliceosome-mediated splicing pathway itself, we look at one further interaction. Figure 13-7 shows the crystal structure of a section of the U1 snRNA bound to one of the proteins of the U1 snRNP.
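Recognition of a splice site by an snRNA through complementary base-pairing can be illustrated by counting Watson-Crick pairs between two antiparallel RNA strands. The sketch below rests on assumptions that are not taken from the text: the 5' end of U1 snRNA is written as 5'-AUACUUACCUG-3' with modified bases ignored, the splice-site strings are idealized examples, and G-U wobble pairs are not counted.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def watson_crick_pairs(guide, target):
    """Count base pairs when guide and target (both 5'->3') align antiparallel."""
    return sum(PAIR[g] == t for g, t in zip(reversed(guide), target))

# Assumed sequence for the 5' end of U1 snRNA (modified bases ignored).
u1_5prime_end = "AUACUUACCUG"

# Idealized 5' splice site region: last exon bases CAG, then intron GUAAGUAU.
consensus_site = "CAGGUAAGUAU"
mutant_site = "CAGGUCAGUAU"      # one hypothetical substitution in the intron

print(watson_crick_pairs(u1_5prime_end, consensus_site))  # → 11
print(watson_crick_pairs(u1_5prime_end, mutant_site))     # → 10
```

A real 5' splice site rarely matches at every position; the count is only meant to show how the strength of an RNA-RNA interaction can vary with the sequence of the site.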

FIGURE 13-6 Some RNA-RNA hybrids formed during the splicing reaction. In some cases, (a) different snRNPs recognize the same (or overlapping) sequences in the pre-mRNA at different stages of the splicing reaction, as shown here for U1 and U6 recognizing the 5' splice site. In (b), snRNP U2 is shown recognizing the branch site. In (c), the RNA:RNA pairing between the snRNPs U2 and U6 is shown. Finally, in (d), the same sequence within the pre-mRNA is recognized by a protein (not part of an snRNP) at one stage and displaced by an snRNP at another. Each of these changes accompanies the arrival or departure of components of the spliceosome and a structural rearrangement that is required for the splicing reaction to proceed.

FIGURE 13-7 Structure of spliceosomal protein-RNA complex: U1A binds hairpin II of U1 snRNA. (Oubridge C., Ito N., Evans P.R., Teo C.H., and Nagai K. 1994. Nature 372: 432.) Image prepared with MolScript, BobScript, and Raster3D.

SPLICING PATHWAYS

Assembly, Rearrangements, and Catalysis Within the Spliceosome: the Splicing Pathway

The steps of the splicing pathway are shown in Figure 13-8. Initially, the 5' splice site is recognized by the U1 snRNP (using base-pairing between its snRNA and the pre-mRNA, shown in Figure 13-6). One subunit of U2AF binds to the Py tract and the other to the 3' splice site. The former subunit interacts with BBP and helps that protein bind to the branch site. This arrangement of proteins and RNA is called the early (E) complex.

U2 snRNP then binds to the branch site, aided by U2AF and displacing BBP. This arrangement is called the A complex. The base-pairing between the U2 snRNA and the branch site is such that the branch site A residue is extruded from the resulting stretch of double-helical RNA as a single nucleotide bulge, as shown in Figure 13-6b. This A residue is thus unpaired and available to react with the 5' splice site.

The next step is a rearrangement of the A complex to bring together all three splice sites. This is achieved as follows: the U4 and U6 snRNPs, along with the U5 snRNP, join the complex. Together these three snRNPs are called the tri-snRNP particle, within which the U4 and U6 snRNPs are held together by complementary base-pairing between their RNA components, and the U5 snRNP is more loosely associated through protein:protein interactions. With the entry of the tri-snRNP, the A complex is converted into the B complex.
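The order of assembly described above (E complex, then A, then B) can be kept straight with a small state-tracking sketch. This is only bookkeeping, not a model of the biochemistry: the class and method names are hypothetical, and it covers only the three complexes named in the passage.

```python
from dataclasses import dataclass, field

@dataclass
class SpliceosomeModel:
    """Toy record of which factors are bound at each named complex."""
    bound: set = field(default_factory=set)
    stage: str = "free pre-mRNA"

    def form_e_complex(self):
        # U1 at the 5' splice site, U2AF at the Py tract / 3' splice site,
        # BBP at the branch site.
        self.bound |= {"U1", "U2AF", "BBP"}
        self.stage = "E"

    def form_a_complex(self):
        if self.stage != "E":
            raise ValueError("the A complex forms from the E complex")
        self.bound.discard("BBP")          # BBP is displaced by U2
        self.bound.add("U2")
        self.stage = "A"

    def form_b_complex(self):
        if self.stage != "A":
            raise ValueError("the B complex forms from the A complex")
        self.bound |= {"U4", "U5", "U6"}   # the tri-snRNP joins as a unit
        self.stage = "B"

model = SpliceosomeModel()
model.form_e_complex()
model.form_a_complex()
model.form_b_complex()
print(model.stage, sorted(model.bound))
# → B ['U1', 'U2', 'U2AF', 'U4', 'U5', 'U6']
```

Raising an error on out-of-order steps reflects the point that each complex is defined by a rearrangement of the one before it.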


FIGURE 13-8 Steps of the spliceosome-mediated splicing reaction. The assembly and action of the spliceosome are shown, and the details of each step are described in the text.

... with free G and a substrate that includes a sequence complementary to the internal guide sequence, it will repeatedly catalyze cleavage of substrate molecules. We will have converted a group I intron into a ribozyme, similar to the way that the self-cleaving hammerhead could be converted to a ribozyme by separating the active site from the substrate (Chapter 6). We can go a step further by changing the sequence of the internal guide sequence and thereby generate tailor-made ribonucleases that cleave RNA molecules of our choice.

... not enzymes because they

BOX 13-1 FIGURE 1 Group I introns can be converted into true ribozymes.

FIGURE 13-9 Group I and group II introns. Panels: (a) pre-mRNA spliceosome; (b) group II self-splicing; (c) group I self-splicing. This figure comp...

... DNA-binding domain of a protein is encoded by one exon, while the dimerization domain of that same protein is encoded by a separate exon. Protein domains fold independently of the rest of the protein in which they are found, and often carry out a single function (as we discussed in Chapter 5). Thus, exons can often be exchanged between proteins productively.

• First, the borders between exons and introns within a given gene often coincide with the boundaries between domains (see Chapter 5) within the protein encoded by that gene. That is, it seems that each exon very often encodes an independently folding unit of protein (often corresponding to an independent function as well). For example, consider the DNA-binding protein depicted in Figure 13-21. Like most DNA-binding proteins, this one has two domains: the DNA recognition domain and the dimerization domain. As shown in the figure, these domains (D1 and D2) are encoded by separate exons (E1 and E2) within the gene.
• Second, many genes, and the proteins they encode, have apparently arisen during evolution in part via exon duplication and divergence. Proteins made up of repeating units (such as immunoglobulins) have probably arisen this way (see Chapter 11, Figure 11-35). The presence of introns between each exon makes the duplication more likely.
• Third, related exons are sometimes found in otherwise unrelated genes. That is, there is evidence that exons really have been reused in genes encoding different proteins. As an example, consider the LDL receptor gene (Figure 13-22). This gene contains some exons that are clearly evolutionarily related to exons found in the gene encoding the EGF precursor. At the same time, it has other exons that are clearly related to exons from the C9 complement gene (Figure 13-22).

More extensive examples of exon accretion are apparent from the complete sequences of genomes, for example, the human genome. As shown in Figure 13-23, there are numerous examples of proteins made up of highly related domains used in various combinations, encoded by genes made up of shuffled exons.

As we have seen, exons tend to be rather short (some 150 nucleotides or so) while introns vary in length and can be very long indeed (up to several hundred kb). The size ratio ensures that, for the average gene in a higher eukaryote, recombination is more likely to occur within the introns than within the exons. Thus, exons are more likely to be reshuffled than disrupted. The mechanism of splicing, the ...

FIGURE 13-22 Genes made up of parts of other genes. (Genes shown: LDL receptor gene, C9 complement gene, EGF precursor gene.) The LDL receptor (the plasma low density lipoprotein receptor) gene contains a stretch of six exons closely relate...

... region. As shown in Figure 13-26c, the gRNA and mRNA form an RNA-RNA duplex with looped-out single-stranded regions opposite where Us will be inserted. An endonuclease recognizes and cuts the mRNA opposite these loops. Editing involves the transfer of Us into the gap in the message. This process is catalyzed by the enzyme 3' terminal uridylyl transferase (TUTase). After the addition of Us, the two halves of the mRNA are joined by an RNA ligase, and the "editing" region of the gRNA continues its action along the mRNA in a 3' to 5' direction. A single gRNA can be

FIGURE 13-25 The deamination of the base cytosine to produce uracil. (Enzyme shown: cytidine deaminase.)


CHAPTER

Translation

OUTLINE

• Messenger RNA
• Transfer RNA
• Attachment of Amino Acids to tRNA
• The Ribosome
• Initiation of Translation
• Translation Elongation
• Termination of Translation
• Translation-Dependent Regulation of mRNA and Protein Stability

The central question addressed in this chapter and the next is how genetic information contained within the order of nucleotides in messenger RNA (mRNA) is used to generate the linear sequences of amino acids in proteins. This process is known as translation. Of the events we have discussed, translation is among the most highly conserved across all organisms and among the most energetically costly for the cell. In rapidly growing bacterial cells, up to 80% of the cell's energy and 50% of the cell's dry weight are dedicated to protein synthesis. Indeed, the synthesis of a single protein requires the coordinated action of well over 100 proteins and RNAs. Consistent with the more complex nature of the translation process, we have divided our discussion into two chapters. In this first chapter we describe the events that allow decoding of the mRNA, and in Chapter 15 we describe the nature of the genetic code and its recognition by transfer RNAs.

Translation is a much more formidable challenge in information transfer than the transcription of DNA into RNA. Unlike the complementarity between the DNA template and the ribonucleotides of the messenger RNA, the side chains of amino acids have little or no specific affinity for the purine and pyrimidine bases found in RNA. For example, the hydrophobic side chains of the amino acids alanine, valine, leucine, and isoleucine cannot form hydrogen bonds with the amino and keto groups of the nucleotide bases. Likewise, it is hard to imagine that several different combinations of three bases of RNA could form surfaces with unique affinities for the aromatic amino acids phenylalanine, tyrosine, and tryptophan. Thus, it seemed unlikely that direct interactions between the mRNA template and the amino acids could be responsible for the specific and accurate ordering of amino acids in a polypeptide.

With these considerations in mind, in 1955 Francis H. Crick proposed that, prior to their incorporation into polypeptides, amino acids must attach to a special adaptor molecule that is capable of directly interacting with and recognizing the three-nucleotide-long coding units of the messenger RNA. Crick imagined that the adaptor would be an RNA molecule because it would need to recognize the code by Watson-Crick base-pairing rules. Just two years later, Paul C. Zamecnik and Mahlon B. Hoagland demonstrated that, prior to their incorporation into proteins, amino acids are attached to a class of RNA molecules (representing 15% of all cellular RNA). These RNAs are called transfer RNAs (or tRNAs) because the amino acid is subsequently transferred to the growing polypeptide chain.

The machinery responsible for translating the language of messenger RNAs into the language of proteins is composed of four primary components: mRNAs, tRNAs, aminoacyl tRNA synthetases, and the ribosome. Together, these components accomplish the extraordinary task of translating a code written in a four-base alphabet into a second code written in the language of the 20 amino acids. The mRNA provides the information that must be interpreted by the translation machinery, and is the template for translation. The protein-coding region of the mRNA consists of an ordered series of three-nucleotide-long units called codons that specify the order of amino acids. The tRNAs provide the physical interface between the amino acids being added to the growing polypeptide chain and the codons in the mRNA. Enzymes called aminoacyl tRNA synthetases couple amino acids to specific tRNAs that recognize the appropriate codon. The final central player in translation is the ribosome, a remarkable, multi-megadalton machine composed of both RNA and protein. The ribosome coordinates the correct recognition of the mRNA by each tRNA and catalyzes peptide bond formation between the growing polypeptide chain and the amino acids attached to the selected tRNA.

We will first consider the key attributes of each of these four components. We then describe how these components work together to accomplish translation. Recent progress in elucidating the structure of the components of the translational machinery makes this an exciting area, one that is rich in mechanistic insights. Among the questions we will ask are the following: What is the organization of nucleotide sequence information in mRNA? What is the structure of tRNAs, and how do aminoacyl tRNA synthetases recognize and attach the correct amino acids to each tRNA? Finally, how does the ribosome orchestrate the decoding of nucleotide sequence information and the addition of amino acids to the growing polypeptide chain?

MESSENGER RNA

Polypeptide Chains Are Specified by Open-Reading Frames

The translation machinery decodes only a portion of each mRNA. As we saw in Chapter 2, and will consider in detail in Chapter 15, the information for protein synthesis is in the form of three-nucleotide codons, which each specify one amino acid. The protein-coding region(s) of each mRNA is composed of a contiguous, non-overlapping string of codons called an open-reading frame (commonly known as an ORF). Each ORF specifies a single protein and starts and ends at internal sites within the mRNA. That is, the ends of an ORF are distinct from the ends of the mRNA.

Translation starts at the 5' end of the open-reading frame and proceeds one codon at a time to the 3' end. The first and last codons of an ORF are known as the start and stop codons. In bacteria, the start codon is usually 5'-AUG-3', but 5'-GUG-3' and sometimes even 5'-UUG-3' are also used. Eukaryotic cells always use 5'-AUG-3' as the start codon. This codon has two important functions. First, it specifies the first amino acid to be incorporated into the growing polypeptide chain. Second, it defines the reading frame for all subsequent codons. Because codons are immediately adjacent to each other and because codons are three nucleotides long, any stretch of mRNA could be translated in three different reading frames (Figure 14-1). However, once translation starts, each subsequent codon is always immediately adjacent to (but not overlapping) the previous three-base codon. Thus, by setting the location of the first codon, the start codon determines the location of all following codons.
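The idea that any stretch of mRNA can be read in three different frames is easy to see by splitting a sequence into triplets from each of the three possible starting offsets. The sketch below uses a short invented sequence and a hypothetical function name.

```python
def reading_frames(mrna):
    """Return the codons read from offsets 0, 1, and 2 of an mRNA (5'->3')."""
    return [
        [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
        for offset in range(3)
    ]

frames = reading_frames("AUGGCUUAA")   # invented example sequence
for offset, frame in enumerate(frames):
    print(offset, frame)
# → 0 ['AUG', 'GCU', 'UAA']
# → 1 ['UGG', 'CUU']
# → 2 ['GGC', 'UUA']
```

Only the frame at offset 0 begins with AUG here, which illustrates how the position of the start codon fixes which of the three sets of codons is actually translated.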

FIGURE 14-1 Three possible reading frames of the E. coli trp leader sequence. Start codons are shaded in green and stop codons are shaded in red. The amino acid sequence of the encoded sequence is indicated in the single letter code below each codon.

Stop codons, of which there are three (5'-UAG-3', 5'-UGA-3', and 5'-UAA-3'), define the end of the open-reading frame and signal termination of polypeptide synthesis. We can now fully appreciate the origin of the term open-reading frame. It is a contiguous stretch of codons "read" in a particular frame (as set by the first codon) that is "open" to translation because it lacks a stop codon (that is, until the last codon in the ORF).

Messenger RNAs contain at least one open-reading frame. The number of ORFs per mRNA is different between eukaryotes and prokaryotes. Eukaryotic mRNAs almost always contain a single ORF. In contrast, prokaryotic mRNAs frequently contain two or more ORFs and hence can encode multiple polypeptide chains. Messenger RNAs containing multiple ORFs are known as polycistronic mRNAs, and those encoding a single ORF are known as monocistronic mRNAs. As you learned in Chapter 12, polycistronic mRNAs often encode proteins that perform related functions, such as different steps in the biosynthesis of an amino acid or nucleotide. The structures of a typical prokaryotic and eukaryotic mRNA are shown in Figure 14-2.
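The definition of an open-reading frame given above (a start codon followed, in the same frame, by codons up to the first stop codon) translates directly into a short search routine. This is a minimal sketch with invented sequences: it accepts only AUG as a start by default, reports every in-frame AUG (including internal ones), and ignores everything else that determines whether an ORF is actually translated.

```python
STOP_CODONS = {"UAG", "UGA", "UAA"}

def find_orfs(mrna, start_codons=("AUG",)):
    """Return (start, end) index pairs; mrna[start:end] runs from start to stop codon."""
    orfs = []
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] not in start_codons:
            continue
        # Walk codon by codon in the frame set by this start codon.
        for j in range(i, len(mrna) - 2, 3):
            if mrna[j:j + 3] in STOP_CODONS:
                orfs.append((i, j + 3))
                break
    return orfs

# Invented polycistronic-style message: two short ORFs separated by "CC".
message = "AUGGCUUAA" + "CC" + "AUGAAAUAG"
orfs = find_orfs(message)
print(orfs)                                   # → [(0, 9), (11, 20)]
print([message[s:e] for s, e in orfs])        # → ['AUGGCUUAA', 'AUGAAAUAG']
```

Passing `start_codons=("AUG", "GUG", "UUG")` would extend the search to the alternative bacterial start codons mentioned earlier in this section.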

Prokaryotic mRNAs Have a Ribosome Binding Site that Recruits the Translational Machinery

For translation to occur, the ribosome must be recruited to the mRNA. To facilitate binding by a ribosome, many prokaryotic open-reading frames contain a short sequence upstream (on the 5' side) of the start codon called the ribosome binding site (RBS). This element is also referred to as a Shine-Dalgarno sequence after the scientists who discovered it on the basis of comparing the sequences of multiple mRNAs. The ribosome binding site, typically located three to nine base pairs on the 5' side of the start codon, is complementary to a sequence located near the 3' end of one of the RNA components, the 16S ribosomal RNA (rRNA) (see Figure 14-2a). The ribosome binding site base-pairs with this RNA component, thereby aligning the ribosome with the beginning of the open-reading frame. The core of this region of the 16S rRNA has the sequence 5'-CCUCCU-3'. Not surprisingly, prokaryotic ribosome binding sites are most often a subset of the sequence 5'-AGGAGG-3'. The extent of complementarity and the spacing between the ribosome binding site and the start codon have a strong influence on how actively a particular open-reading frame is translated: high complementarity and proper spacing promote active translation, whereas limited complementarity and/or poor spacing generally support lower levels of translation.

FIGURE 14-2 Structure of messenger RNA. (a) A polycistronic prokaryotic message. The ribosome binding site is indicated by RBS. (b) A monocistronic eukaryotic message. The 5' cap is indicated by a "ball" at the end of the mRNA.

Some prokaryotic ORFs internal to a polycistronic message lack a strong ribosome binding site but are nonetheless actively translated. In these cases the start codon often overlaps the 3' end of the adjacent open-reading frame (most often as the sequence 5'-AUGA-3', which contains a start and a stop codon). Thus, a ribosome that has just completed translating the upstream open-reading frame is appropriately positioned to begin translating from the start codon for the downstream open-reading frame, circumventing the need for a ribosome binding site to recruit the ribosome. This phenomenon of linked translation between overlapping open-reading frames is known as translational coupling.
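The relationship between the 16S rRNA core 5'-CCUCCU-3' and the ribosome binding site 5'-AGGAGG-3' is simply reverse complementarity, and both that and the overlapping 5'-AUGA-3' arrangement can be checked in a few lines. The sketch below makes several simplifying assumptions: the upstream sequences are invented, the function names are hypothetical, only exact substrings of the core are scored, and the spacing between the site and the start codon is not modeled.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Sequence (5'->3') of the strand that pairs antiparallel with rna."""
    return "".join(PAIR[base] for base in reversed(rna))

def longest_rbs_match(upstream, core):
    """Longest contiguous piece of `core` found anywhere in `upstream`."""
    best = ""
    for i in range(len(core)):
        for j in range(i + 1, len(core) + 1):
            piece = core[i:j]
            if len(piece) > len(best) and piece in upstream:
                best = piece
    return best

def is_coupled_start(mrna, start):
    """True if the AUG at `start` overlaps a UGA stop, as in 5'-AUGA-3'."""
    return mrna[start:start + 4] == "AUGA"

RRNA_CORE = "CCUCCU"                          # from the text (16S rRNA)
SD_CORE = reverse_complement(RRNA_CORE)
print(SD_CORE)                                # → AGGAGG

strong = longest_rbs_match("UUAAGGAGGUAUACU", SD_CORE)   # invented upstream region
weak = longest_rbs_match("UUACGGAGUUACC", SD_CORE)       # invented upstream region
print(strong, weak)                           # → AGGAGG GGAG

print(is_coupled_start("GCUAUGAAAGCU", 3))    # → True
```

A longer match stands in for "high complementarity" in the text; a fuller treatment would also score the three-to-nine-nucleotide spacing to the start codon.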

Eukaryotic mRNAs Are Modified at Their 5' and 3' Ends to Facilitate Translation

Unlike their prokaryotic counterparts, eukaryotic mRNAs recruit ribosomes using a specific chemical modification called the 5' cap, which is located at the extreme 5' end of the message (see Chapter 12 and Figure 14-2b). The 5' cap is a methylated guanine nucleotide that is joined to the 5' end of the mRNA via an unusual 5' to 5' linkage. Created in three steps (see Chapter 12), the guanine nucleotide of the 5' cap is connected to the 5' end of the mRNA through three phosphate groups. The resulting structure recruits the ribosome to the mRNA. Once bound to the mRNA, the ribosome moves in a 5'→3' direction until it encounters a 5'-AUG-3' start codon, a process called scanning.

Two other features of eukaryotic mammalian mRNAs stimulate translation. One feature is the presence, in some mRNAs, of a purine


three bases upstream of the start codon and a guanine immediately downstream (5'-G/ANNAUGG-3'). This sequence was originally identified by Marilyn Kozak and is referred to as the Kozak sequence. Many eukaryotic mRNAs lack these bases, but their presence increases the efficiency of translation. In contrast to the situation in prokaryotes, these bases are thought to interact with initiator tRNA, not with the small rRNA.

A second feature that contributes to efficient translation is the presence of a poly-A tail at the extreme 3' end of the mRNA. As we saw in Chapter 12, this tail is added enzymatically by the enzyme poly-A polymerase. Despite its location at the 3' end of the mRNA, the poly-A tail enhances the level of translation of the mRNA by promoting efficient recycling of ribosomes (as we shall discuss later).
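As a rough illustration of scanning and of the Kozak context, the following Python sketch finds the first AUG from the 5' end and reports whether the two flanking positions named in the text are present. The function names and the example sequences are hypothetical, and the sketch ignores everything else that influences start-site choice in a real cell.

```python
PURINES = {"A", "G"}


def scan_for_start(mrna):
    """Return the index of the first 5'-AUG-3' met when reading 5' to 3'.

    Returns -1 if the message contains no AUG.
    """
    return mrna.find("AUG")


def kozak_context(mrna, start):
    """Check the two positions of the pattern 5'-G/ANNAUGG-3'.

    Returns (purine_at_minus_3, guanine_after_aug) for the AUG whose
    A sits at index `start`.
    """
    if mrna[start:start + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    minus3 = mrna[start - 3] if start >= 3 else None
    plus4 = mrna[start + 3] if start + 3 < len(mrna) else None
    return minus3 in PURINES, plus4 == "G"


if __name__ == "__main__":
    for message in ("GCCACCAUGGCG", "CCCUCCAUGCAA"):   # synthetic examples
        start = scan_for_start(message)
        print(message, start, kozak_context(message, start))
```

An mRNA for which both values are False can still be translated; according to the text, the flanking bases only increase efficiency.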

TRANSFER RNA

tRNAs Are Adaptors between Codons and Amino Acids

At the heart of protein synthesis is the "translation" of nucleotide sequence information (in the form of codons) into amino acids. This is accomplished by tRNA molecules, which act as adaptors between codons and the amino acids they specify. There are many types of tRNA molecules, but each is attached to a specific amino acid and each recognizes a particular codon, or codons, in the mRNA (most tRNAs recognize more than one codon). tRNA molecules are between 75 and 95 ribonucleotides in length. Although the exact sequence varies, all tRNAs have certain features in common. First, all tRNAs end at the 3' terminus with the sequence 5'-CCA-3'. This is the site that is attached to the cognate amino acid by the enzyme aminoacyl tRNA synthetase, as we will consider below.

A second striking aspect of tRNAs is the presence of several unusual bases in their primary structure. These unusual features are created post-transcriptionally by enzymatic modification of normal bases in the polynucleotide chain. For example, pseudouridine (ΨU) is derived from uridine by an isomerization in which the site of attachment of the uracil base to the ribose is switched from the nitrogen at ring position 1 to the carbon at ring position 5 (Figure 14-3). Likewise, dihydrouridine (D) is derived from uridine by enzymatic reduction of the double bond between the carbons at positions 5 and 6. Other unusual bases found in tRNA include hypoxanthine, thymine, and methylguanine. These modified bases are not essential for tRNA function, but cells lacking these modified bases show reduced rates of growth. This suggests that the modified bases lead to improved tRNA function. For example, as we will see in Chapter 15, hypoxanthine

FIGURE 14-3 A subset of modified nucleosides found in tRNA. The structures shown are uridine, pseudouridine, and dihydrouridine.

plays an important role in the process of codon recognition by certain tRNAs.
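The shared features listed above lend themselves to a simple check. The Python sketch below tests only the two sequence-level properties stated in the text (length and the 3' CCA end) and records how the two modified nucleosides of Figure 14-3 derive from uridine. The names `looks_like_trna` and `MODIFIED_NUCLEOSIDES` are made up for this example, and the test sequence is a synthetic placeholder, not a real tRNA.

```python
# How two modified nucleosides are derived from uridine, as described
# in the text. Keys and wording are illustrative.
MODIFIED_NUCLEOSIDES = {
    "pseudouridine": {
        "parent": "uridine",
        "change": "isomerization: base attachment moves from N1 to C5",
    },
    "dihydrouridine": {
        "parent": "uridine",
        "change": "reduction of the C5-C6 double bond",
    },
}


def looks_like_trna(seq):
    """Check the two sequence-level features stated in the text.

    A tRNA is 75 to 95 ribonucleotides long and ends in 5'-CCA-3'.
    Passing this test is necessary, not sufficient.
    """
    return 75 <= len(seq) <= 95 and seq.endswith("CCA")


if __name__ == "__main__":
    placeholder = "G" * 73 + "CCA"      # synthetic 76-nucleotide string
    print(looks_like_trna(placeholder))
    for name, info in MODIFIED_NUCLEOSIDES.items():
        print(name, "<-", info["parent"], ":", info["change"])
```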

tRNAs Share a Common Secondary Structure that Resembles a Cloverleaf

As we saw in Chapter 6, RNA molecules typically contain regions of self-complementarity that enable them to form limited stretches of double helix that are held together by base pairing. Other regions of RNA molecules have no complement and, hence, are single-stranded. tRNA molecules exhibit a characteristic pattern of single-stranded and double-stranded regions (secondary structure) that can be illustrated as a cloverleaf (Figure 14-4). The principal features of the tRNA cloverleaf are an acceptor stem; three stem-loops, which are referred to as the ΨU loop, the D loop, and the anticodon loop; and a fourth variable loop. Descriptions of each of these features follow:

• The acceptor stem, so-named because it is the site of attachment of the amino acid, is formed by pairing between the 5' and 3' ends of the tRNA molecule. The 5'-CCA-3' sequence at the extreme 3' end of the molecule protrudes from this double-stranded stem.
• The ΨU loop is so-named because of the characteristic presence of the unusual base ΨU in the loop. The modified base is often found within the sequence 5'-TΨUCG-3'.
• The D loop takes its name from the characteristic presence of dihydrouridines in the loop.
• The anticodon loop, as its name implies, contains the anticodon, a three-nucleotide-long decoding element that is responsible for recognizing the codon by base-pairing with the mRNA. The anticodon is bracketed on the 3' end by a purine and on its 5' end by uracil.
• The variable loop sits between the anticodon loop and the ΨU loop and, as its name implies, varies in size from 3 to 21 bases.
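The decoding role of the anticodon can be made concrete with a short Python sketch. It assumes strict Watson-Crick pairing between antiparallel strands, which is a simplification: as noted earlier, most tRNAs recognize more than one codon, and modified bases in the anticodon are not modeled here. The function name is hypothetical.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon):
    """Return the codon (5' to 3') that pairs with an anticodon (5' to 3').

    The two strands pair antiparallel, so the anticodon is read
    backward while each base is replaced by its partner. Only strict
    Watson-Crick pairs are considered in this sketch.
    """
    if len(anticodon) != 3:
        raise ValueError("an anticodon is three nucleotides long")
    return "".join(WATSON_CRICK[base] for base in reversed(anticodon))


if __name__ == "__main__":
    for anticodon in ("GAA", "CAU"):
        print(anticodon, "pairs with codon", codon_for_anticodon(anticodon))
```

Note the reversal: the first base of the anticodon pairs with the third base of the codon.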

FIGURE 14-4 Cloverleaf representation of the secondary structure of tRNA. In this representation of a tRNA, the base-pairing between different parts of the tRNA is indicated by dotted red lines.

FIGURE 14-5 Conversion between the cloverleaf and the actual three-dimensional structure of a tRNA. (a) Cloverleaf representation. (b) L-shaped representation showing the location of the base-paired regions of the final folded tRNA. (c) Ribbon representation of the actual folded structure of a tRNA. Note that although this diagram illustrates how the actual tRNA structure is related to the cloverleaf representation, a tRNA does not attain its final structure by first base-pairing and then folding into an L-shape.

tRNAs Have an L-Shaped Three-Dimensional Structure

The cloverleaf reveals regions of self-complementarity within tRNAs. What is the actual three-dimensional configuration of this adaptor molecule? X-ray crystallography reveals an L-shaped tertiary structure in which the terminus of the acceptor stem is at one end of the molecule and the anticodon loop is about 70 Å away at the other end. To understand the relationship of this L-shaped structure (depicted as an upside-down L in Figure 14-5) to the cloverleaf, consider the following: the acceptor stem and the stem of the ΨU loop form an extended helix in the final tRNA structure. Similarly, the anticodon stem and the stem of the D loop form a second extended helix. These two extended helices align at a right angle to each other, with the D loop and the ΨU loop coming together. In the final image, the two extended helices adopt their proper helical configuration.

Three kinds of interactions stabilize this L-shaped structure. The first is hydrogen bonding between bases in different helical regions that are brought near each other in three-dimensional space by the tertiary structure. These bonds are generally unconventional (non-Watson-Crick). The second is interactions between the bases and the sugar-phosphate backbone. The third kind of stabilizing interaction is the additional base stacking gained from formation of the two extended regions of base pairing.

ATTACHMENT OF AMINO ACIDS TO tRNA

tRNAs Are Charged by the Attachment of an Amino Acid to the 3' Terminal Adenosine Nucleotide via a High-Energy Acyl Linkage

tRNA molecules to which an amino acid is attached are said to be charged, and tRNAs that lack an amino acid are said to be uncharged. Charging requires an acyl linkage between the carboxyl group of the


amino acid and the 2'- or 3'-hydroxyl group (see below) of the adenosine nucleotide that protrudes from the acceptor stem. This acyl linkage is considered to be a high-energy bond in that its hydrolysis results in a large change in free energy. This is significant for protein synthesis: the energy released when the bond is broken helps drive the formation of the peptide bonds that link amino acids to each other in polypeptide chains, as we will see below.

Aminoacyl tRNA Synthetases Charge tRNAs in Two Steps

All aminoacyl tRNA synthetases attach an amino acid to a tRNA in two enzymatic steps (Figure 14-6). Step one is adenylylation, in which the amino acid reacts with ATP to become adenylylated with the concomitant release of pyrophosphate. Adenylylation refers to transfer of AMP, as opposed to adenylation, which would indicate the transfer of adenine. As we have seen in the case of polynucleotide synthesis (see Chapter 8), the principal driving force for the adenylylation reaction is the subsequent hydrolysis of pyrophosphate by pyrophosphatase. As a result of adenylylation, the amino acid is attached to adenylic acid via a high-energy ester bond in which the carbonyl group of the amino acid is joined to the phosphoryl group of AMP. Step two is tRNA charging, in which the adenylylated amino acid, which remains tightly bound to the synthetase, reacts with tRNA. This reaction results in the transfer of the amino acid to the 3' end of the tRNA via the 2'- or 3'-hydroxyl and the concomitant release of AMP.

There are two classes of tRNA synthetases (Table 14-1). Class I enzymes attach the amino acid to the 2'OH of the tRNA and are generally monomeric. Class II enzymes attach the amino acid to the 3'OH of the tRNA and are typically dimeric or tetrameric. Although the initial coupling between the tRNA and the amino acid differs between the two classes, once released from the synthetase, the amino acid rapidly equilibrates between attachment at the 3'OH and the 2'OH.

FIGURE 14-6 The two steps of aminoacyl-tRNA charging. (a) Adenylylation of amino acid. The amino acid and ATP yield the adenylylated amino acid and pyrophosphate. (b) Transfer of the adenylylated amino acid to tRNA. The process shown is for a class II tRNA synthetase.

TABLE 14-1 Classes of Aminoacyl tRNA Synthetases*

Class II   Quaternary Structure   Class I   Quaternary Structure
Gly        (α2β2)                 Glu       (α)
Ala        (α4)                   Gln       (α)
Pro        (α2)                   Arg       (α)
Ser        (α2)                   Cys       (α2)
Thr        (α2)                   Met       (α2)
His        (α2)                   Val       (α)
Asp        (α2)                   Ile       (α)
Asn        (α2)                   Leu       (α)
Lys        (α2)                   Tyr       (α)
Phe        (α2β2)                 Trp       (α)

Source: Data from Delarue M. 1995. Aminoacyl-tRNA synthetases. Current Opinion in Structural Biology 5: 48-55, adapted from Table 1.
*Class I enzymes are generally monomeric, whereas class II enzymes are dimeric or tetrameric, with residues from two subunits contributing to the binding site for a single tRNA. α and β refer to subunits of the tRNA synthetases and the subscripts indicate their stoichiometry.
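The two-step reaction and the class rule can be followed with a small bookkeeping sketch in Python. It tracks only which molecules are consumed and produced; it says nothing about kinetics or energetics. The species labels (such as "Gln-AMP" and "tRNA-Gln"), the function names, and the use of a counter as a molecule pool are conventions invented for this example. The class membership is taken from Table 14-1.

```python
from collections import Counter

# Class membership as listed in Table 14-1.
CLASS_I = {"Glu", "Gln", "Arg", "Cys", "Met", "Val", "Ile", "Leu", "Tyr", "Trp"}
CLASS_II = {"Gly", "Ala", "Pro", "Ser", "Thr", "His", "Asp", "Asn", "Lys", "Phe"}

# Hydroxyl of the terminal adenosine that first receives the amino acid.
INITIAL_SITE = {"I": "2'-OH", "II": "3'-OH"}


def synthetase_class(amino_acid):
    """Return "I" or "II" for a three-letter amino acid code."""
    if amino_acid in CLASS_I:
        return "I"
    if amino_acid in CLASS_II:
        return "II"
    raise KeyError(amino_acid)


def charge(pool, amino_acid):
    """Carry out both steps once, updating the molecule counts in `pool`.

    Returns the hydroxyl to which the amino acid is initially attached.
    """
    trna = f"tRNA-{amino_acid}"
    for species in (amino_acid, "ATP", trna):
        if pool[species] < 1:
            raise ValueError(f"missing substrate: {species}")
    adenylate = f"{amino_acid}-AMP"

    # Step 1, adenylylation: amino acid + ATP -> aminoacyl-AMP + PPi
    pool[amino_acid] -= 1
    pool["ATP"] -= 1
    pool[adenylate] += 1
    pool["PPi"] += 1

    # Step 2, tRNA charging: aminoacyl-AMP + tRNA -> aminoacyl-tRNA + AMP
    pool[adenylate] -= 1
    pool[trna] -= 1
    pool[f"{amino_acid}-{trna}"] += 1
    pool["AMP"] += 1

    return INITIAL_SITE[synthetase_class(amino_acid)]


if __name__ == "__main__":
    pool = Counter({"Gln": 1, "ATP": 1, "tRNA-Gln": 1})
    print(charge(pool, "Gln"), +pool)   # unary + drops zero counts
```

Each charging event consumes one ATP and releases one AMP and one pyrophosphate, which is the stoichiometry described in the text.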

Each Aminoacyl tRNA Synthetase Attaches a Single Amino Acid to One or More tRNAs

Each of the 20 amino acids is attached to the appropriate tRNA by a single, dedicated tRNA synthetase. Because most amino acids are specified by more than one codon (see Chapter 15), it is not uncommon for one synthetase to recognize and charge more than one tRNA (known as isoaccepting tRNAs). Nevertheless, the same tRNA synthetase is responsible for charging all tRNAs for a particular amino acid. Thus, one and only one tRNA synthetase attaches each amino acid to all of the appropriate tRNAs.

Most organisms have 20 different tRNA synthetases, but this is not always the case. For example, some bacteria lack a synthetase for charging the tRNA for glutamine (tRNAGln) with its cognate


amino acid. Instead, a single species of aminoacyl tRNA synthetase charges tRNAGln as well as tRNAGlu with glutamate. A second enzyme then converts (by amination) the glutamate moiety of the charged tRNAGln molecules to glutamine. That is, Glu-tRNAGln is aminated to Gln-tRNAGln (the prefix identifies the amino acid and the superscript identifies the nature of the tRNA). The presence of this second enzyme removes the need for a glutamine tRNA synthetase. Nevertheless, an aminoacyl tRNA synthetase can never attach more than one kind of amino acid to a given tRNA.
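The direct and indirect routes to a charged tRNAGln can be contrasted in a few lines of Python. This is a sketch of the logic only: the function name, the boolean flag standing in for "this organism has a glutamine tRNA synthetase", and the string labels are all invented for illustration.

```python
def aminoacylation_route(trna, has_gln_synthetase=True):
    """List the aminoacyl-tRNA species formed when charging a tRNA.

    `trna` is the three-letter identity of the tRNA (for example "Gln").
    Labels follow the convention in the text: the prefix names the
    attached amino acid and the suffix names the tRNA.
    """
    if trna == "Gln" and not has_gln_synthetase:
        # Indirect route: tRNAGln is first charged with glutamate,
        # then a second enzyme aminates the glutamate to glutamine.
        return ["Glu-tRNAGln", "Gln-tRNAGln"]
    # Direct route: the dedicated synthetase attaches the cognate amino acid.
    return [f"{trna}-tRNA{trna}"]


if __name__ == "__main__":
    print(aminoacylation_route("Gln", has_gln_synthetase=True))
    print(aminoacylation_route("Gln", has_gln_synthetase=False))
    print(aminoacylation_route("Glu", has_gln_synthetase=False))
```

In both routes the final product is the same, which is why the second enzyme removes the need for a dedicated glutamine tRNA synthetase.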

tRNA Synthetases Recognize Unique Structural Features of Cognate tRNAs

As we can see from the above considerations, aminoacyl tRNA synthetases face two important challenges: they must recognize the correct set of tRNAs for a particular amino acid, and they must charge all of these isoaccepting tRNAs with the correct amino acid. Both processes must be carried out with high fidelity.

Let us first consider the specificity of tRNA recognition: what features of the tRNA molecule enable a synthetase to discriminate cognate, isoaccepting tRNAs from the tRNAs for the other 19 amino acids? Genetic, biochemical, and X-ray crystallographic evidence indicates that the specificity determinants are clustered at two distant sites on the molecule: the acceptor stem and the anticodon loop (Figure 14-7). The acceptor stem is an especially important determinant for the specificity of tRNA synthetase recognition. In some cases changing a single base pair in the acceptor stem (a particular base pair known as the discriminator) is sufficient to convert the recognition specificity of a tRNA from one synthetase to another. Nonetheless, the anticodon loop frequently contributes to discrimination as well. The synthetase for glutamine, for example, makes numerous contacts in both the acceptor stem and across the anticodon loop, including the anticodon itself (Figure 14-8).

FIGURE 14-7 Structure of tRNA: elements required for aminoacyl synthetase recognition. Labels in the figure: anticodon, anticodon loop, discriminator, 3' acceptor.

FIGURE 14-8 Co-crystal structure of glutaminyl aminoacyl tRNA synthetase with tRNAGln. The enzyme is shown in gray and tRNAGln is shown in purple. The yellow, red, and green molecule is glutaminyl-AMP. Note the proximity of this molecule to the 3' end of the tRNA and the points of contact between the tRNA and the synthetase. (Rath V.L., Silvian L.F., Beijer B., Sproat B.S., and Steitz T.A. 1998. Structure 6: 439-449.) Image prepared with BobScript, MolScript, and [...]

[...] in the future P site of the small subunit, resulting in the formation of the 43S pre-initiation complex,

1F2

• +