B. Alberts, A. Johnson, J. Lewis et al. - Molecular Biology of The Cell (6th edition) (1120996), page 47
Erickson, Cell 84:155–164, 1996. With permission from Elsevier.)

A second feature of these protein domains that explains their utility is the ease with which they can be integrated into other proteins. Two of the three domains illustrated in Figure 3–15 have their N- and C-terminal ends at opposite poles of the domain. When the DNA encoding such a domain undergoes tandem duplication, which is not unusual in the evolution of genomes (discussed in Chapter 4), the duplicated domains with this “in-line” arrangement can be readily linked in series to form extended structures, either with themselves or with other in-line domains (Figure 3–16).
Stiff extended structures composed of a series of domains are especially common in extracellular matrix molecules and in the extracellular portions of cell-surface receptor proteins. Other frequently used domains, including the kringle domain illustrated in Figure 3–15 and the SH2 domain, are of a “plug-in” type, with their N- and C-termini close together. After genomic rearrangements, such domains are usually accommodated as an insertion into a loop region of a second protein.

A comparison of the relative frequency of domain utilization in different eukaryotes reveals that, for many common domains, such as protein kinases, this frequency is similar in organisms as diverse as yeast, plants, worms, flies, and humans. But there are some notable exceptions, such as the Major Histocompatibility Complex (MHC) antigen-recognition domain (see Figure 24–36) that is present in 57 copies in humans, but absent in the other four organisms just mentioned.
Domains such as these have specialized functions that are not shared with the other eukaryotes; they are assumed to have been strongly selected for during recent evolution to produce the multiple copies observed. Similarly, the SH2 domain shows an unusual increase in its numbers in higher eukaryotes; such domains might be assumed to be especially useful for multicellularity.

Certain Pairs of Domains Are Found Together in Many Proteins

We can construct a large table displaying domain usage for each organism whose genome sequence is known.
For example, the human genome contains the DNA sequences for about 1000 immunoglobulin domains, 500 protein kinase domains, 250 DNA-binding homeodomains, 300 SH3 domains, and 120 SH2 domains. In addition, we find that more than two-thirds of all proteins consist of two or more domains, and that the same pairs of domains occur repeatedly in the same relative arrangement in a protein. Although half of all domain families are common to archaea, bacteria, and eukaryotes, only about 5% of the two-domain combinations are similarly shared.
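As a minimal sketch of how such a comparison could be made, the Python snippet below contrasts the fraction of shared domain families with the fraction of shared adjacent domain pairs. The two toy proteomes, the function names `adjacent_domain_pairs` and `shared_fraction`, and the resulting numbers are illustrative assumptions, not data from the genome analyses described here.

```python
def adjacent_domain_pairs(proteins):
    """Collect ordered pairs of neighboring domains (N- to C-terminal order)."""
    pairs = set()
    for domains in proteins:
        pairs.update(zip(domains, domains[1:]))
    return pairs


def domain_families(proteins):
    """Collect every domain family that appears in at least one protein."""
    return {domain for domains in proteins for domain in domains}


def shared_fraction(item_sets):
    """Fraction of all observed items that are present in every set."""
    union = set().union(*item_sets)
    common = set.intersection(*item_sets)
    return len(common) / len(union) if union else 0.0


# Hypothetical proteomes: each protein is a list of domain names in chain order.
organism_a = [["SH3", "SH2", "kinase"], ["PHD", "Br"]]
organism_b = [["SH2", "kinase"], ["Br", "PHD"], ["SH3"]]

family_fraction = shared_fraction(
    [domain_families(organism_a), domain_families(organism_b)]
)
pair_fraction = shared_fraction(
    [adjacent_domain_pairs(organism_a), adjacent_domain_pairs(organism_b)]
)

print(family_fraction)  # 1.0: every domain family occurs in both toy proteomes
print(pair_fraction)    # 0.25: only the (SH2, kinase) arrangement is shared
```

In this toy case the two organisms use identical domain families but share only one of four ordered pairs, the same qualitative gap as between the two percentages quoted above.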
This pattern suggests that most proteins containing especially useful two-domain combinations arose through domain shuffling relatively late in evolution.

The Human Genome Encodes a Complex Set of Proteins, Revealing That Much Remains Unknown

The result of sequencing the human genome has been surprising, because it reveals that our chromosomes contain only about 21,000 protein-coding genes. Based on this number alone, we would appear to be no more complex than the tiny mustard weed, Arabidopsis, and only about 1.3-fold more complex than a nematode worm.
The genome sequences also reveal that vertebrates have inherited nearly all of their protein domains from invertebrates, with only 7% of identified human domains being vertebrate-specific.

Figure 3–17 Domain structure of a group of evolutionarily related proteins that are thought to have a similar function. In general, there is a tendency for the proteins in more complex organisms, such as humans, to contain additional domains, as is the case for the DNA-binding protein compared here.

Figure 3–18 Two identical protein subunits binding together to form a symmetric protein dimer. The Cro repressor protein from bacteriophage lambda binds to DNA to turn off a specific subset of viral genes. Its two identical subunits bind head-to-head, held together by a combination of hydrophobic forces (blue) and a set of hydrogen bonds (yellow region). (Adapted from D.H. Ohlendorf, D.E. Tronrud and B.W. Matthews, J. Mol. Biol. 280:129–136, 1998. With permission from Academic Press.)

Each of our proteins is on average more complicated, however (Figure 3–17). Domain shuffling during vertebrate evolution has given rise to many novel combinations of protein domains, with the result that there are nearly twice as many combinations of domains found in human proteins as in a worm or a fly. Thus, for example, the trypsinlike serine protease domain is linked to at least 18 other types of protein domains in human proteins, whereas it is found covalently joined to only 5 different domains in the worm.
This extra variety in our proteins greatly increases the range of protein–protein interactions possible (see Figure 3–79), but how it contributes to making us human is not known.

The complexity of living organisms is staggering, and it is quite sobering to note that we currently lack even the tiniest hint of what the function might be for more than 10,000 of the proteins that have thus far been identified through examining the human genome. There are certainly enormous challenges ahead for the next generation of cell biologists, with no shortage of fascinating mysteries to solve.

Larger Protein Molecules Often Contain More Than One Polypeptide Chain

The same weak noncovalent bonds that enable a protein chain to fold into a specific conformation also allow proteins to bind to each other to produce larger structures in the cell. Any region of a protein’s surface that can interact with another molecule through sets of noncovalent bonds is called a binding site.
A protein can contain binding sites for various large and small molecules. If a binding site recognizes the surface of a second protein, the tight binding of two folded polypeptide chains at this site creates a larger protein molecule with a precisely defined geometry.
Each polypeptide chain in such a protein is called a protein subunit.

In the simplest case, two identical folded polypeptide chains bind to each other in a “head-to-head” arrangement, forming a symmetric complex of two protein subunits (a dimer) held together by interactions between two identical binding sites. The Cro repressor protein, a viral gene regulatory protein that binds to DNA to turn specific viral genes off in an infected bacterial cell, provides an example (Figure 3–18).
Cells contain many other types of symmetric protein complexes, formed from multiple copies of a single polypeptide chain (for example, see Figure 3–20 below).

Many of the proteins in cells contain two or more types of polypeptide chains. Hemoglobin, the protein that carries oxygen in red blood cells, contains two identical α-globin subunits and two identical β-globin subunits, symmetrically arranged (Figure 3–19).
Such multisubunit proteins are very common in cells, and they can be very large (Movie 3.6).

Some Globular Proteins Form Long Helical Filaments

Most of the proteins that we have discussed so far are globular proteins, in which the polypeptide chain folds up into a compact shape like a ball with an irregular surface. Some of these protein molecules can nevertheless assemble to form filaments that may span the entire length of a cell. Most simply, a long chain of identical protein molecules can be constructed if each molecule has a binding

Figure 3–19 A protein formed as a symmetric assembly using two each of two different subunits. Hemoglobin is an abundant protein in red blood cells that contains two copies of α-globin (green) and two copies of β-globin (blue). Each of these four polypeptide chains contains a heme molecule (red), which is the site that binds oxygen (O2). Thus, each molecule of hemoglobin in the blood carries four molecules of oxygen. (PDB code: 2DHB.)

Figure 3–20 Protein assemblies. (A) A protein with just one binding site can form a dimer with another identical protein. (B) Identical proteins with two different binding sites often form a long helical filament.