David Eisenberg » Publications
2007
- Anderson DH, Kickhoefer VA, Sievers SA, Rome LH, Eisenberg D. (2007).
Draft crystal structure of the vault shell at 9-Å resolution.
PLoS Biol. Nov 2007. 5(11):e318.
[Abstract]
Vaults are the largest known cytoplasmic ribonucleoprotein structures and may function in innate immunity. The vault shell self-assembles from 96 copies of major vault protein and encapsulates two other proteins and a small RNA. We crystallized rat liver vaults and several recombinant vaults, all among the largest non-icosahedral particles to have been crystallized. The best crystals thus far were formed from empty vaults built from a cysteine-tag construct of major vault protein (termed cpMVP vaults), diffracting to about 9-Å resolution. The asymmetric unit contains a half vault of molecular mass 4.65 MDa. X-ray phasing was initiated by molecular replacement, using density from cryo-electron microscopy (cryo-EM). Phases were improved by density modification, including concentric 24- and 48-fold rotational symmetry averaging. From this, the continuous cryo-EM electron density separated into domain-like blocks. A draft atomic model of cpMVP was fit to this improved density from 15 domain models. Three domains were adapted from a nuclear magnetic resonance substructure. Nine domain models originated in ab initio tertiary structure prediction. Three C-terminal domains were built by fitting poly-alanine to the electron density. Locations of loops in this model provide sites to test vault functions and to exploit vaults as nanocapsules.
- Kim SM, Bowers PM, Pal D, Strong M, Terwilliger TC, Kaufmann M, Eisenberg D. (2007).
Functional linkages can reveal protein complexes for structure determination.
Structure. Sep 2007. 15(9):1079-89.
[Abstract]
In the study of protein complexes, is there a computational method for inferring which combinations of proteins in an organism are likely to form a crystallizable complex? Here we attempt to answer this question, using the Protein Data Bank (PDB) to assess the usefulness of inferred functional protein linkages from the Prolinks database. We find that of the 242 nonredundant prokaryotic protein complexes shared between the current PDB and Prolinks, 44% (107/242) contain proteins linked at high confidence by one or more methods of computed functional linkages. Similarly, high-confidence linkages detect 47% of known Escherichia coli protein complexes, with 45% accuracy. Together these findings suggest that functional linkages will be useful in defining protein complexes for structural studies, including for structural genomics. We offer a database of inferred linkages corresponding to likely protein complexes for some 629,952 pairs of proteins in 154 prokaryotes and archaea.
- Goldschmidt L, Cooper DR, Derewenda ZS, Eisenberg D. (2007).
Toward rational protein crystallization: A Web server for the design of crystallizable protein variants.
Protein Sci. Aug 2007. 16(8):1569-76.
[Abstract]
Growing well-diffracting crystals constitutes a serious bottleneck in structural biology. A recently proposed crystallization methodology for "stubborn crystallizers" is to engineer surface sequence variants designed to form intermolecular contacts that could support a crystal lattice. This approach relies on the concept of surface entropy reduction (SER), i.e., the replacement of clusters of flexible, solvent-exposed residues with residues with lower conformational entropy. This strategy minimizes the loss of conformational entropy upon crystallization and renders crystallization thermodynamically favorable. The method has been successfully used to crystallize more than 15 novel proteins, all stubborn crystallizers. But the choice of suitable sites for mutagenesis is not trivial. Herein, we announce a Web server, the surface entropy reduction prediction server (SERp server), designed to identify mutations that may facilitate crystallization. Suggested mutations are predicted based on an algorithm incorporating a conformational entropy profile, a secondary structure prediction, and sequence conservation. Minor considerations include the nature of flanking residues and gaps between mutation candidates. While designed to be used with default values, the server has many user-controlled parameters allowing for considerable flexibility. Within, we discuss (1) the methodology of the server, (2) how to interpret the results, and (3) factors that must be considered when selecting mutations. We also attempt to benchmark the server by comparing the server's predictions with successful SER structures. In most cases, the structure-yielding mutations were easily identified by the SERp server. The server can be accessed at http://www.doe-mbi.ucla.edu/Services/SER.
- Salwinski L, Eisenberg D. (2007).
The MiSink Plugin: Cytoscape as a graphical interface to the Database of Interacting Proteins.
Bioinformatics. Aug 2007. 23(16):2193-5.
[Abstract]
The MiSink Plugin converts Cytoscape, an open-source bioinformatics platform for network visualization, to a graphical interface for the database of interacting proteins (DIP: http://dip.doe-mbi.ucla.edu). Seamless integration is possible by providing bi-directional communication between Cytoscape and any Web site supplying data in XML or tab-delimited format. Availability: MiSink is freely available for download at http://dip.doe-mbi.ucla.edu/Software.cgi.
- Sawaya MR, Sambashivan S, Nelson R, Ivanova MI, Sievers SA, Apostol MI, Thompson MJ, Balbirnie M, Wiltzius JJ, McFarlane HT, Madsen AØ, Riekel C, Eisenberg D. (2007).
Atomic structures of amyloid cross-beta spines reveal varied steric zippers.
Nature. May 2007. 447(7143):453-7.
[Abstract]
Amyloid fibrils formed from different proteins, each associated with a particular disease, contain a common cross-beta spine. The atomic architecture of a spine, from the fibril-forming segment GNNQQNY of the yeast prion protein Sup35, was recently revealed by X-ray microcrystallography. It is a pair of beta-sheets, with the facing side chains of the two sheets interdigitated in a dry 'steric zipper'. Here we report some 30 other segments from fibril-forming proteins that form amyloid-like fibrils, microcrystals, or usually both. These include segments from the Alzheimer's amyloid-beta and tau proteins, the PrP prion protein, insulin, islet amyloid polypeptide (IAPP), lysozyme, myoglobin, alpha-synuclein and beta(2)-microglobulin, suggesting that common structural features are shared by amyloid diseases at the molecular level. Structures of 13 of these microcrystals all reveal steric zippers, but with variations that expand the range of atomic architectures for amyloid-like fibrils and offer an atomic-level hypothesis for the basis of prion strains.
- Guo Z, Eisenberg D. (2007).
The mechanism of the amyloidogenic conversion of T7 endonuclease I.
J. Biol. Chem. May 2007. 282(20):14968-74.
[Abstract]
Amyloid fibrils are associated with a range of human disorders. Understanding the conversion of amyloidogenic proteins from their soluble forms to amyloid fibrils is critical for developing effective therapeutics. Previously we showed that T7 endonuclease I forms amyloid-like fibrils. Here we study the mechanism of the amyloidogenic conversion of T7 endonuclease I. We show that T7 endonuclease I forms fibrils at pH 6.8, but not at pH 6.0 or 8.0. The amyloidogenicity at pH 6.8 is not correlated with thermodynamic stability, unfolding cooperativity, or solubility. Thermal melting experiments at various pH values show that the protein has a distinctive thermal transition at pH 6.8. The transition at pH 6.8 has a lower transition temperature than the unfolding transitions observed at pH 6.0 and 8.0 and leads to a beta-rich conformation instead of an unfolded state. Electron microscopy shows that the thermal transition at pH 6.8 results in fibril formation. The thermal transition at pH 6.8 leads to a protein state that is not accessible at pH 6.0 or 8.0, showing that the existence of the amyloidogenic conformation of T7 endonuclease I depends sensitively on solution conditions. Therefore, we propose that fibrillizing proteins need to be "prepared" for fibrillization. Preparation may consist of amino acid replacements or changing solution conditions and may require retention of some aspects of native structure. In this model, some amyloid-enhancing mutations decrease protein stability, whereas others have little effect.
- Strong M, Eisenberg D. (2007).
The protein network as a tool for finding novel drug targets.
Progress in drug research. Fortschritte der Arzneimittelforschung. Progrès des recherches pharmaceutiques. 2007. 64:191, 193-215.
[Abstract]
Proteins are often referred to as the molecular workhorses of the cell since they are responsible for the majority of functions within a living cell. From the generation of energy, to the replication of DNA, proteins play a central role in most cellular functions. Because of their importance to cellular viability, proteins are commonly the target of therapeutic drugs, ranging from antimicrobial to anticancer drugs. With the rise of drug resistant and multi-drug resistant forms of many diseases, it has become increasingly important to develop new strategies to identify alternative drug targets. One such strategy arises from the analysis of protein networks. Protein networks help define individual proteins within the context of all other cellular proteins. In this chapter we discuss methods for the identification and analysis of genome-wide protein networks, and discuss how protein networks can be used to aid the identification of novel drug targets.
- Goulding CW, Bowers PM, Segelke B, Lekin T, Kim CY, Terwilliger TC, Eisenberg D. (2007).
The structure and computational analysis of Mycobacterium tuberculosis protein CitE suggest a novel enzymatic function.
J. Mol. Biol.. Jan 2007. 365(2):275-83.
[Abstract]
Fatty acid biosynthesis is essential for the survival of Mycobacterium tuberculosis and acetyl-coenzyme A (acetyl-CoA) is an essential precursor in this pathway. We have determined the 3-D crystal structure of M. tuberculosis citrate lyase beta-subunit (CitE), which as annotated should cleave protein bound citryl-CoA to oxaloacetate and a protein-bound CoA derivative. The CitE structure has the (beta/alpha)(8) TIM barrel fold with an additional alpha-helix, and is trimeric. We have determined the ternary complex bound with oxaloacetate and magnesium, revealing some of the conserved residues involved in catalysis. While the bacterial citrate lyase is a complex with three subunits, the M. tuberculosis genome does not contain the alpha and gamma subunits of this complex, implying that M. tuberculosis CitE acts differently from other bacterial CitE proteins. The analysis of gene clusters containing the CitE protein from 168 fully sequenced organisms has led us to identify a grouping of functionally related genes preserved in M. tuberculosis, Rattus norvegicus, Homo sapiens, and Mus musculus. We propose a novel enzymatic function for M. tuberculosis CitE in fatty acid biosynthesis that is analogous to bacterial citrate lyase but producing acetyl-CoA rather than a protein-bound CoA derivative.
2006
- Nelson R, Eisenberg D. (2006).
Structural models of amyloid-like fibrils.
Adv. Protein Chem. 2006. 73:235-82.
[Abstract]
Amyloid fibrils are elongated, insoluble protein aggregates deposited in vivo in amyloid diseases, and amyloid-like fibrils are formed in vitro from soluble proteins. Both of these groups of fibrils, despite differences in the sequence and native structure of their component proteins, share common properties, including their core structure. Multiple models have been proposed for the common core structure, but in most cases, atomic-level structural details have yet to be determined. Here we review several structural models proposed for amyloid and amyloid-like fibrils and relate features of these models to the common fibril properties. We divide models into three classes: Refolding, Gain-of-Interaction, and Natively Disordered. The Refolding models propose structurally distinct native and fibrillar states and suggest that backbone interactions drive fibril formation. In contrast, the Gain-of-Interaction models propose a largely native-like structure for the protein in the fibril and highlight the importance of specific sequences in fibril formation. The Natively Disordered models have aspects in common with both Refolding and Gain-of-Interaction models. While each class of model suggests explanations for some of the common fibril properties, and some models, such as Gain-of-Interaction models with a cross-beta spine, fit a wider range of properties than others, no one class provides a complete explanation for all amyloid fibril behavior.
- Eisenberg D, Nelson R, Sawaya MR, Balbirnie M, Sambashivan S, Ivanova MI, Madsen AØ, Riekel C. (2006).
The structural biology of protein aggregation diseases: Fundamental questions and some answers.
Acc. Chem. Res. Sep 2006. 39(9):568-75.
[Abstract]
Amyloid fibrils are found in association with at least two dozen fatal diseases. The tendency of numerous proteins to convert into amyloid-like fibrils poses fundamental questions for structural biology and for protein science in general. Among these are the following: What is the structure of the cross-beta spine, common to amyloid-like fibrils? Is there a sequence signature for proteins that form amyloid-like fibrils? What is the nature of the structural conversion from native to amyloid states, and do fibril-forming proteins have two distinct stable states, the native state and the amyloid state? What is the basis of protein complementarity, in which a protein chain can bind to itself? We offer tentative answers here, based on our own recent structural studies.
- Guo Z, Eisenberg D. (2006).
Runaway domain swapping in amyloid-like fibrils of T7 endonuclease I.
Proc. Natl. Acad. Sci. U.S.A. May 2006. 103(21):8042-7.
[Abstract]
Amyloid fibrils are associated with >20 fatal human disorders, including Alzheimer's, Parkinson's, and prion diseases. Knowledge of how soluble proteins assemble into amyloid fibrils remains elusive despite its potential usefulness for developing diagnostics and therapeutics. In at least some fibrils, runaway domain swapping has been proposed as a possible mechanism for fibril formation. In runaway domain swapping, each protein molecule swaps a domain into the complementary domain of the adjacent molecule along the fibril. Here we show that T7 endonuclease I, a naturally domain-swapped dimeric protein, can form amyloid-like fibrils. Using protein engineering, we designed a double-cysteine mutant that forms amyloid-like fibrils in which molecules of T7 endonuclease I are linked by intermolecular disulfide bonds. Because the disulfide bonds are designed to form only at the domain-swapped dimer interface, the resulting covalently linked fibrils show that T7 endonuclease I forms fibrils by a runaway domain swap. In addition, we show that the disulfide mutant exists in two conformations, only one of which is able to form fibrils. We also find that domain-swapped dimers, if locked in a close-ended dimeric form, are unable to form fibrils. Our study provides strong evidence for runaway domain swapping in the formation of an amyloid-like fibril and, consequently, a molecular explanation for specificity and stability of fibrils. In addition, our results suggest that inhibition of fibril formation for domain-swapped proteins may be achieved by stabilizing domain-swapped dimers.
- Bennett MJ, Sawaya MR, Eisenberg D. (2006).
Deposition diseases and 3D domain swapping.
Structure. May 2006. 14(5):811-24.
[Abstract]
Protein aggregation is a feature of both normal cellular assemblies and pathological protein depositions. Although the limited order of aggregates has often impeded their structural characterization, 3D domain swapping has been implicated in the formation of several protein aggregates. Here, we review known structures displaying 3D domain swapping in the context of amyloid and related fibrils, prion proteins, and macroscopic aggregates, and we discuss the possible involvement of domain swapping in protein deposition diseases.
- Strong M, Sawaya MR, Wang S, Phillips M, Cascio D, Eisenberg D. (2006).
Toward the structural genomics of complexes: crystal structure of a PE/PPE protein complex from Mycobacterium tuberculosis.
Proc. Natl. Acad. Sci. U.S.A. May 2006. 103(21):8060-5.
[Abstract]
The developing science called structural genomics has focused to date mainly on high-throughput expression of individual proteins, followed by their purification and structure determination. In contrast, the term structural biology is used to denote the determination of structures, often complexes of several macromolecules, that illuminate aspects of biological function. Here we bridge structural genomics to structural biology with a procedure for determining protein complexes of previously unknown function from any organism with a sequenced genome. From computational genomic analysis, we identify functionally linked proteins and verify their interaction in vitro by coexpression/copurification. We illustrate this procedure by the structural determination of a previously unknown complex between a PE and PPE protein from the Mycobacterium tuberculosis genome, members of protein families that constitute approximately 10% of the coding capacity of this genome. The predicted complex was readily expressed, purified, and crystallized, although we had previously failed in expressing individual PE and PPE proteins on their own. The reason for the failure is clear from the structure, which shows that the PE and PPE proteins mate along an extended apolar interface to form a four-alpha-helical bundle, where two of the alpha-helices are contributed by the PE protein and two by the PPE protein. Our entire procedure for the identification, characterization, and structural determination of protein complexes can be scaled to a genome-wide level.
- Nelson R, Eisenberg D. (2006).
Recent atomic models of amyloid fibril structure.
Curr. Opin. Struct. Biol. Apr 2006. 16(2):260-5.
[Abstract]
Despite the difficulties associated with determining atomic-level structures for materials that are fibrous, structural biologists are making headway in understanding the architecture of amyloid-like fibrils. It has long been recognized that these fibrils contain a cross-beta spine, with beta-strands perpendicular to the fibril axis. Recently, atomic structures have been determined for some of these cross-beta spines, revealing a pair of beta-sheets mated closely together by intermeshing sidechains in what has been termed a steric zipper. To explain the conversion of proteins from soluble to fibrous forms, several types of models have been proposed: refolding, natively disordered and gain of interaction. The gain-of-interaction models may additionally be subdivided into direct stacking, cross-beta spine, three-dimensional domain swapping and three-dimensional domain swapping with a cross-beta spine.
- Ivanova MI, Thompson MJ, Eisenberg D. (2006).
A systematic screen of beta(2)-microglobulin and insulin for amyloid-like segments.
Proc. Natl. Acad. Sci. U.S.A. Mar 2006. 103(11):4079-82.
[Abstract]
Identifying sequence determinants of fibril-forming proteins is crucial for understanding the processes causing >20 proteins to form pathological amyloid depositions. Our approach to identifying which sequences form amyloid-like fibrils is to screen the amyloid-forming proteins human insulin and beta(2)-microglobulin for segments that form fibrils. Our screen is of 60 sequentially overlapping peptides, 59 being six residues in length and 1 being five residues, covering every noncysteine-containing segment in these two proteins. Each peptide was characterized as amyloid-like or nonfibril-forming. Amyloid-like peptides formed fibrils visible in electron micrographs or needle-like microcrystals showing a cross-beta diffraction pattern. Eight of the 60 peptides (three from insulin and five from beta(2)-microglobulin) were identified as amyloid-like. The results of the screen were used to assess the computational method, and good agreement between prediction and experiments was found. This agreement suggests that the pair-of-sheets, zipper spine model on which the computational method is based is at least approximately correct for the structure of the fibrils and suggests the nature of the sequence signal for formation of amyloid-like fibrils.
- Thompson MJ, Sievers SA, Karanicolas J, Ivanova MI, Baker D, Eisenberg D. (2006).
The 3D profile method for identifying fibril-forming segments of proteins.
Proc. Natl. Acad. Sci. U.S.A. Mar 2006. 103(11):4074-8.
[Abstract]
Based on the crystal structure of the cross-beta spine formed by the peptide NNQQNY, we have developed a computational approach for identifying those segments of amyloidogenic proteins that themselves can form amyloid-like fibrils. The approach builds on experiments showing that hexapeptides are sufficient for forming amyloid-like fibrils. Each six-residue peptide of a protein of interest is mapped onto an ensemble of templates, or 3D profile, generated from the crystal structure of the peptide NNQQNY by small displacements of one of the two intermeshed beta-sheets relative to the other. The energy of each mapping of a sequence to the profile is evaluated by using ROSETTADESIGN, and the lowest energy match for a given peptide to the template library is taken as the putative prediction. If the energy of the putative prediction is lower than a threshold value, a prediction of fibril formation is made. This method can reach an accuracy of approximately 80% with a P value of approximately 10^-12 when a conservative energy threshold is used to separate peptides that form fibrils from those that do not. We see enrichment for positive predictions in a set of fibril-forming segments of amyloid proteins, and we illustrate the method with applications to proteins of interest in amyloid research.
- Eisenberg D, Marcotte E, McLachlan AD, Pellegrini M. (2006).
Bioinformatic challenges for the next decade(s).
Philos. Trans. R. Soc. Lond., B, Biol. Sci. Mar 2006. 361(1467):525-7.
[Abstract]
The science of bioinformatics has developed in the wake of methods to determine the sequences of the informational macromolecules--DNAs, RNAs and proteins. But in a wider sense, the biological world depends in its every process on the transmission of information, and hence bioinformatics is the fundamental core of biology. We here give a consideration of some of the key problems of bioinformatics in the coming decade, and perhaps longer.
- Liu PT, Stenger S, Li H, Wenzel L, Tan BH, Krutzik SR, Ochoa MT, Schauber J, Wu K, Meinken C, Kamen DL, Wagner M, Bals R, Steinmeyer A, Zügel U, Gallo RL, Eisenberg D, Hewison M, Hollis BW, Adams JS, Bloom BR, Modlin RL. (2006).
Toll-like receptor triggering of a vitamin D-mediated human antimicrobial response.
Science. Mar 2006. 311(5768):1770-3.
[Abstract]
In innate immune responses, activation of Toll-like receptors (TLRs) triggers direct antimicrobial activity against intracellular bacteria, which in murine, but not human, monocytes and macrophages is mediated principally by nitric oxide. We report here that TLR activation of human macrophages up-regulated expression of the vitamin D receptor and the vitamin D-1-hydroxylase genes, leading to induction of the antimicrobial peptide cathelicidin and killing of intracellular Mycobacterium tuberculosis. We also observed that sera from African-American individuals, known to have increased susceptibility to tuberculosis, had low 25-hydroxyvitamin D and were inefficient in supporting cathelicidin messenger RNA induction. These data support a link between TLRs and vitamin D-mediated innate immunity and suggest that differences in ability of human populations to produce vitamin D may contribute to susceptibility to microbial infection.
2005
- Bowers PM, O'Connor BD, Cokus SJ, Sprinzak E, Yeates TO, Eisenberg D. (2005).
Utilizing logical relationships in genomic data to decipher cellular processes.
FEBS J. Oct 2005. 272(20):5110-8.
[Abstract]
The wealth of available genomic data has spawned a corresponding interest in computational methods that can impart biological meaning and context to these experiments. Traditional computational methods have drawn relationships between pairs of proteins or genes based on notions of equality or similarity between their patterns of occurrence or behavior. For example, two genes displaying similar variation in expression, over a number of experiments, may be predicted to be functionally related. We have introduced a natural extension of these approaches, instead identifying logical relationships involving triplets of proteins. Triplets provide for various discrete kinds of logic relationships, leading to detailed inferences about biological associations. For instance, a protein C might be encoded within an organism if, and only if, two other proteins A and B are also both encoded within the organism, thus suggesting that gene C is functionally related to genes A and B. The method has been applied fruitfully to both phylogenetic and microarray expression data, and has been used to associate logical combinations of protein activity with disease state phenotypes, revealing previously unknown ternary relationships among proteins, and illustrating the inherent complexities that arise in biological data.
- Riley R, Lee C, Sabatti C, Eisenberg D. (2005).
Inferring protein domain interactions from databases of interacting proteins.
Genome Biol. 2005. 6(10):R89.
[Abstract]
We describe domain pair exclusion analysis (DPEA), a method for inferring domain interactions from databases of interacting proteins. DPEA features a log odds score, Eij, reflecting confidence that domains i and j interact. We analyzed 177,233 potential domain interactions underlying 26,032 protein interactions. In total, 3,005 high-confidence domain interactions were inferred, and were evaluated using known domain interactions in the Protein Data Bank. DPEA may prove useful in guiding experiment-based discovery of previously unrecognized domain interactions.
- Sambashivan S, Liu Y, Sawaya MR, Gingery M, Eisenberg D. (2005).
Amyloid-like fibrils of ribonuclease A with three-dimensional domain-swapped and native-like structure.
Nature. Sep 2005. 437(7056):266-9.
[Abstract]
Amyloid or amyloid-like fibrils are elongated, insoluble protein aggregates, formed in vivo in association with neurodegenerative diseases or in vitro from soluble native proteins, respectively. The underlying structure of the fibrillar or 'cross-beta' state has presented long-standing, fundamental puzzles of protein structure. These include whether fibril-forming proteins have two structurally distinct stable states, native and fibrillar, and whether all or only part of the native protein refolds as it converts to the fibrillar state. Here we show that a designed amyloid-like fibril of the well-characterized enzyme RNase A contains native-like molecules capable of enzymatic activity. In addition, these functional molecular units are formed from a core RNase A domain and a swapped complementary domain. These findings are consistent with the zipper-spine model in which a cross-beta spine is decorated with three-dimensional domain-swapped functional units, retaining native-like structure.
- Nelson R, Sawaya MR, Balbirnie M, Madsen AØ, Riekel C, Grothe R, Eisenberg D. (2005).
Structure of the cross-beta spine of amyloid-like fibrils.
Nature. Jun 2005. 435(7043):773-8.
[Abstract]
Numerous soluble proteins convert to insoluble amyloid-like fibrils that have common properties. Amyloid fibrils are associated with fatal diseases such as Alzheimer's, and amyloid-like fibrils can be formed in vitro. For the yeast protein Sup35, conversion to amyloid-like fibrils is associated with a transmissible infection akin to that caused by mammalian prions. A seven-residue peptide segment from Sup35 forms amyloid-like fibrils and closely related microcrystals, from which we have determined the atomic structure of the cross-beta spine. It is a double beta-sheet, with each sheet formed from parallel segments stacked in register. Side chains protruding from the two sheets form a dry, tightly self-complementing steric zipper, bonding the sheets. Within each sheet, every segment is bound to its two neighbouring segments through stacks of both backbone and side-chain hydrogen bonds. The structure illuminates the stability of amyloid fibrils, their self-seeding characteristic and their tendency to form polymorphic structures.
- Li H, Sawaya MR, Tabita FR, Eisenberg D. (2005).
Crystal structure of a RuBisCO-like protein from the green sulfur bacterium Chlorobium tepidum.
Structure. May 2005. 13(5):779-89.
[Abstract]
Ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCO) catalyzes the incorporation of atmospheric CO2 into ribulose 1,5-bisphosphate (RuBP). RuBisCOs are classified into four forms based on sequence similarity: forms I, II and III are bona fide RuBisCOs; form IV, also called the RuBisCO-like protein (RLP), lacks several of the substrate binding and catalytic residues and does not catalyze RuBP-dependent CO2 fixation in vitro. To contribute to understanding the function of RLPs, we determined the crystal structure of the RLP from Chlorobium tepidum. The overall structure of the RLP is similar to the structures of the three other forms of RuBisCO; however, the active site is distinct from those of bona fide RuBisCOs and suggests that the RLP is possibly capable of catalyzing enolization but not carboxylation. Bioinformatic analysis of the protein functional linkages suggests that this RLP coevolved with enzymes of the bacteriochlorophyll biosynthesis pathway and may be involved in processes related to photosynthesis.
- Goulding CW, Apostol MI, Sawaya MR, Phillips M, Parseghian A, Eisenberg D. (2005).
Regulation by oligomerization in a mycobacterial folate biosynthetic enzyme.
J. Mol. Biol. May 2005. 349(1):61-72.
[Abstract]
Folate derivatives are essential cofactors in the biosynthesis of purines, pyrimidines and amino acids across all forms of life. Mammals take up folate from their diets, whereas most bacteria must synthesize folate de novo. Therefore, the enzymes in the folate biosynthetic pathway are attractive drug targets against bacterial pathogens such as Mycobacterium tuberculosis, the cause of the world's most deadly infectious disease, tuberculosis (TB). M. tuberculosis 7,8-dihydroneopterin aldolase (Mtb FolB, DHNA) is the second enzyme in the folate biosynthetic pathway, which catalyzes the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin and glycolaldehyde. The 1.6 Å X-ray crystal structure of Mtb FolB complexed with its product, 6-hydroxymethyl-7,8-dihydropterin, reveals an octameric assembly similar to that seen in crystal structures of other FolB homologs. However, the 2.5 Å crystal structure of unliganded Mtb FolB reveals a novel tetrameric oligomerization state, with only partially formed active sites. A substrate-induced conformational change appears to be necessary to convert the inactive tetramer to the active octamer. Ultracentrifugation confirmed that in solution unliganded Mtb FolB is mainly tetrameric and upon addition of substrate FolB is predominantly octameric. Kinetic analysis of substrate binding gives a Hill coefficient of 2.0, indicating positive cooperativity. We hypothesize that Mtb FolB displays cooperativity in substrate binding to regulate the cellular concentration of 7,8-dihydroneopterin, so that it may function not only as a precursor to folate but also as an antioxidant for the survival of M. tuberculosis against host defenses.
- Müller P, Sawaya MR, Pashkov I, Chan S, Nguyen C, Wu Y, Perry LJ, Eisenberg D. (2005).
The 1.70 angstroms X-ray crystal structure of Mycobacterium tuberculosis phosphoglycerate mutase.
Acta Crystallogr. D Biol. Crystallogr.. Mar 2005. 61(Pt 3):309-15.
[Abstract]
The single-crystal X-ray structure of phosphoglycerate mutase from Mycobacterium tuberculosis has been determined at a resolution of 1.70 angstroms. The C-terminal tail of each of the subunits is flexible and disordered; however, for one of the four chains (chain A) all but five residues of the chain could be modeled. Noteworthy features of the structure include the active site and a proline-rich segment in each monomer forming a short left-handed polyprolyl helix. These segments lie on the enzyme surface and could conceivably participate in protein-protein interactions.
- Pal D, Eisenberg D. (2005).
Inference of protein function from protein structure.
Structure. Jan 2005. 13(1):121-30.
[Abstract]
Structural genomics has brought us three-dimensional structures of proteins with unknown functions. To shed light on such structures, we have developed ProKnow (http://www.doe-mbi.ucla.edu/Services/ProKnow/), which annotates proteins with Gene Ontology functional terms. The method extracts features from the protein such as 3D fold, sequence, motif, and functional linkages and relates them to function via the ProKnow knowledgebase of features, which links features to annotated functions via annotation profiles. Bayes' theorem is used to compute weights of the functions assigned, using likelihoods based on the extracted features. The description level of the assigned function is quantified by the ontology depth (from 1 = general to 9 = specific). Jackknife tests show approximately 89% correct assignments at ontology depth 1 and 40% at depth 9, with 93% coverage of 1507 distinct folded proteins. Overall, about 70% of the assignments were inferred correctly. This level of performance suggests that ProKnow is a useful resource in functional assessments of novel proteins.
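The Bayesian weighting step can be sketched roughly as follows. This is not ProKnow's implementation: it assumes, for simplicity, that features are conditionally independent given the function (a naive Bayes simplification that the abstract does not state), and every prior, likelihood, feature label, and function name below is made up for illustration.

```python
def posterior_weights(priors, likelihoods, observed_features, floor=1e-6):
    """Weight candidate functions with Bayes' theorem.

    priors: {function: P(function)}
    likelihoods: {function: {feature: P(feature | function)}}
    observed_features: features extracted from the query protein
    floor: small likelihood used for unseen feature/function pairs (assumption)
    """
    scores = {}
    for function, prior in priors.items():
        score = prior
        for feature in observed_features:
            score *= likelihoods[function].get(feature, floor)
        scores[function] = score
    total = sum(scores.values())
    # Normalize so the weights sum to 1 (the denominator of Bayes' theorem).
    return {function: score / total for function, score in scores.items()}


# Hypothetical numbers, for illustration only.
priors = {"hydrolase": 0.5, "transferase": 0.5}
likelihoods = {
    "hydrolase": {"fold:TIM-barrel": 0.4, "motif:GxSxG": 0.6},
    "transferase": {"fold:TIM-barrel": 0.2, "motif:GxSxG": 0.1},
}
weights = posterior_weights(priors, likelihoods, ["fold:TIM-barrel", "motif:GxSxG"])
print(weights)
```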
2004
- Bowers PM, Cokus SJ, Eisenberg D, Yeates TO. (2004).
Use of logic relationships to decipher protein network organization.
Science. Dec 2004. 306(5705):2246-9.
[Abstract]
A major focus of genome research is to decipher the networks of molecular interactions that underlie cellular function. We describe a computational approach for identifying detailed relationships between proteins on the basis of genomic data. Logic analysis of phylogenetic profiles identifies triplets of proteins whose presence or absence obey certain logic relationships. For example, protein C may be present in a genome only if proteins A and B are both present. The method reveals many previously unidentified higher order relationships. These relationships illustrate the complexities that arise in cellular networks because of branching and alternate pathways, and they also facilitate assignment of cellular functions to uncharacterized proteins.
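The example in the abstract (protein C present only if A and B are both present) can be illustrated with a toy check on binary phylogenetic profiles. This sketch tests for exact agreement across genomes, which is a deliberate simplification; it is not the statistical scoring used in the paper, and the profiles and names below are invented.

```python
# Each profile lists presence (1) or absence (0) of a protein across genomes.
LOGIC_RELATIONS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def matching_relations(profile_a, profile_b, profile_c):
    """Return the logic relations f for which C == f(A, B) in every genome."""
    if not (len(profile_a) == len(profile_b) == len(profile_c)):
        raise ValueError("profiles must cover the same genomes")
    return [
        name
        for name, relation in LOGIC_RELATIONS.items()
        if all(relation(a, b) == c
               for a, b, c in zip(profile_a, profile_b, profile_c))
    ]


# Hypothetical profiles over six genomes.
protein_a = [1, 1, 0, 0, 1, 0]
protein_b = [1, 0, 1, 0, 1, 1]
protein_c = [1, 0, 0, 0, 1, 0]  # present only where both A and B are present
print(matching_relations(protein_a, protein_b, protein_c))
```

Real profiles are noisy, so a practical version would score how well each relation explains C rather than require a perfect match.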
- Ivanova MI, Sawaya MR, Gingery M, Attinger A, Eisenberg D. (2004).
An amyloid-forming segment of beta2-microglobulin suggests a molecular model for the fibril.
Proc. Natl. Acad. Sci. U.S.A.. Jul 2004. 101(29):10584-9.
[Abstract]
In humans suffering from dialysis-related amyloidosis, the protein beta2-microglobulin (beta2M) is deposited as an amyloid; however, an amyloid of beta2M is unknown in mice. beta2M sequences from human and mouse are 70% identical, but there is a seven-residue peptide in which six residues differ. This peptide from human beta2M forms amyloid in vitro, whereas the mouse peptide does not. Substitution of the human peptide for its counterpart in the mouse sequence results in the formation of amyloid in vitro. These results show that a seven-residue segment of human beta2M is sufficient to convert beta2M to the amyloid state, and that specific residue interactions are crucial to the conversion. These observations are consistent with a proposed Zipper-spine model for beta2M amyloid, in which the spine of the fibril consists of an anhydrous beta-sheet.
- Salwinski L, Eisenberg D. (2004).
In silico simulation of biological network dynamics.
Nat. Biotechnol.. Aug 2004. 22(8):1017-9.
[Abstract]
Realistic simulation of biological networks requires stochastic simulation approaches because of the small numbers of molecules per cell. The high computational cost of stochastic simulation on conventional microprocessor-based computers arises from the intrinsic disparity between the sequential steps executed by a microprocessor program and the highly parallel nature of information flow within biochemical networks. This disparity is reduced with the Field Programmable Gate Array (FPGA)-based approach presented here. The parallel architecture of FPGAs, which can simulate the basic reaction steps of biological networks, attains simulation rates at least an order of magnitude greater than currently available microprocessors.
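To show why stochastic simulation is inherently sequential on a conventional processor, here is a minimal software sketch of Gillespie's direct method for a birth-death process (constant production, first-order degradation). It is not the FPGA approach described in the abstract; the reaction scheme, rate constants, and function name are assumptions chosen for illustration.

```python
import random


def simulate_birth_death(k_prod, k_deg, n0, t_end, rng):
    """Gillespie direct method for 0 -> A (rate k_prod) and A -> 0 (rate k_deg * n)."""
    t, n = 0.0, n0
    trajectory = [(t, n)]
    while True:
        a_prod = k_prod       # propensity of production
        a_deg = k_deg * n     # propensity of degradation
        a_total = a_prod + a_deg
        if a_total == 0:
            break
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        trajectory.append((t, n))
    return trajectory


trajectory = simulate_birth_death(5.0, 0.5, 0, 10.0, random.Random(0))
print(len(trajectory), "events; final copy number:", trajectory[-1][1])
```

Each loop iteration depends on the state left by the previous one, so the events are processed one at a time; this is the sequential bottleneck the abstract contrasts with parallel hardware.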
- Reinhardt A, Eisenberg D. (2004).
DPANN: improved sequence to structure alignments following fold recognition.
Proteins. Aug 2004. 56(3):528-38.
[Abstract]
In fold recognition (FR), a protein sequence of unknown structure is assigned to the closest known three-dimensional (3D) fold. Although FR programs can often identify among all possible folds the one a sequence adopts, they frequently fail to align the sequence to the equivalent residue positions in that fold. Such failures frustrate the next step in structure prediction, protein model building. Hence it is desirable to improve the quality of the alignments between the sequence and the identified structure. We have used artificial neural networks (ANN) to derive a substitution matrix to create alignments between a protein sequence and a protein structure through dynamic programming (DPANN: Dynamic Programming meets Artificial Neural Networks). The matrix is based on the amino acid type and the secondary structure state of each residue. In a database of protein pairs that have the same fold but lack sequence similarity, DPANN aligns over 30% of all sequences to the paired structure, resembling closely the structural superposition of the pair. In over half of these cases the DPANN alignment is close to the structural superposition, although the initial alignment from the step of fold recognition is not close. Conversely, the alignment created during fold recognition outperforms DPANN in only 10% of all cases. Thus application of DPANN after fold recognition leads to substantial improvements in alignment accuracy, which in turn provides more useful templates for the modeling of protein structures. In the artificial case of using actual instead of predicted secondary structures for the probe protein, over 50% of the alignments are successful.
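The dynamic programming step can be sketched as a global (Needleman-Wunsch) alignment of a probe sequence against template positions, each described by a residue type and a secondary structure state. The scoring function here is a hand-written toy standing in for the ANN-derived matrix, which the abstract does not specify; the residue sets, gap penalty, and all names are assumptions.

```python
# Toy propensity sets (illustrative only, not taken from the paper).
HELIX_FORMERS = set("AELMQKR")
STRAND_FORMERS = set("VIYFWT")


def toy_score(residue, template_pos):
    """Score a probe residue against a (residue, secondary_structure) position."""
    t_res, t_ss = template_pos
    score = 2.0 if residue == t_res else -1.0
    if (t_ss == "H" and residue in HELIX_FORMERS) or \
       (t_ss == "E" and residue in STRAND_FORMERS):
        score += 1.0
    return score


def align(seq, template, score=toy_score, gap=-2.0):
    """Return (best score, list of (seq index, template index) aligned pairs)."""
    n, m = len(seq), len(template)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + score(seq[i - 1], template[j - 1]),
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    # Trace back from the corner to recover which positions were paired.
    i, j, pairs = n, m, []
    while i > 0 and j > 0:
        if dp[i][j] == dp[i - 1][j - 1] + score(seq[i - 1], template[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


template = [("A", "H"), ("E", "H"), ("L", "H"), ("V", "E")]
best, pairs = align("AEV", template)
print(best, pairs)
```

In this toy case the probe skips the third template position so that its valine lands on the strand position.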
- Bowers PM, Pellegrini M, Thompson MJ, Fierro J, Yeates TO, Eisenberg D. (2004).
Prolinks: a database of protein functional linkages derived from coevolution.
Genome Biol.. 2004. 5(5):R35.
[Abstract]
The advent of whole-genome sequencing has led to methods that infer protein function and linkages. We have combined four such algorithms (phylogenetic profile, Rosetta Stone, gene neighbor and gene cluster) in a single database--Prolinks--that spans 83 organisms and includes 10 million high-confidence links. The Proteome Navigator tool allows users to browse predicted linkage networks interactively, providing accompanying annotation from public databases. The Prolinks database and the Proteome Navigator tool are available for use online at http://dip.doe-mbi.ucla.edu/pronav.
- Kleiger G, Panina EM, Mallick P, Eisenberg D. (2004).
PFIT and PFRIT: bioinformatic algorithms for detecting glycosidase function from structure and sequence.
Protein Sci.. Jan 2004. 13(1):221-9.
[Abstract]
The identification of the enzymes involved in the metabolism of simple and complex carbohydrates presents one bioinformatic challenge in the post-genomic era. Here, we present the PFIT and PFRIT algorithms for identifying those proteins adopting the alpha/beta barrel fold that function as glycosidases. These algorithms are based on the observation that proteins adopting the alpha/beta barrel fold share positions in their tertiary structures having equivalent sets of atomic interactions. These are conserved tertiary interaction positions, which have been implicated in both structure and function. Glycosidases adopting the alpha/beta barrel fold share more conserved tertiary interactions than alpha/beta barrel proteins having other functions. The enrichment pattern of conserved tertiary interactions in the glycosidases is the information that PFIT and PFRIT use to predict whether any given alpha/beta barrel will function as a glycosidase or not. Using as a test set a database of 19 glycosidase and 45 nonglycosidase alpha/beta barrel proteins with low sequence similarity, PFIT and PFRIT can correctly predict glycosidase function for 84% of the proteins known to function as glycosidases. PFIT and PFRIT incorrectly predict glycosidase function for 25% of the nonglycosidases. The program PSI-BLAST can also correctly identify 84% of the 19 glycosidases; however, it incorrectly predicts glycosidase function for 50% of the nonglycosidases (twofold greater than PFIT and PFRIT). Overall, we demonstrate that the structure-based PFIT and PFRIT algorithms are both more selective and sensitive for predicting glycosidase function than the sequence-based PSI-BLAST algorithm.
- Goulding CW, Apostol MI, Gleiter S, Parseghian A, Bardwell J, Gennaro M, Eisenberg D. (2004).
Gram-positive DsbE proteins function differently from Gram-negative DsbE homologs. A structure to function analysis of DsbE from Mycobacterium tuberculosis.
J. Biol. Chem.. Jan 2004. 279(5):3516-24.
[Abstract]
Mycobacterium tuberculosis, a Gram-positive bacterium, encodes a secreted Dsb-like protein annotated as Mtb DsbE (Rv2878c, also known as MPT53). Because Dsb proteins in Escherichia coli and other bacteria seem to catalyze proper folding during protein secretion and because folding of secreted proteins is thought to be coupled to disulfide oxidoreduction, the function of Mtb DsbE may be to ensure that secreted proteins are in their correctly folded states. We have determined the crystal structure of Mtb DsbE to 1.1 A resolution, which reveals a thioredoxin-like domain with a typical CXXC active site. These cysteines are in their reduced state. Biochemical characterization of Mtb DsbE reveals that this disulfide oxidoreductase is an oxidant, unlike Gram-negative bacteria DsbE proteins, which have been shown to be weak reductants. In addition, the pK(a) value of the active site, solvent-exposed cysteine is approximately 2 pH units lower than that of Gram-negative DsbE homologs. Finally, the reduced form of Mtb DsbE is more stable than the oxidized form, and Mtb DsbE is able to oxidatively fold hirudin. Structural and biochemical analysis implies that Mtb DsbE functions differently from Gram-negative DsbE homologs, and we discuss its possible functional role in the bacterium.
2003
- Strong M, Graeber TG, Beeby M, Pellegrini M, Thompson MJ, Yeates TO, Eisenberg D. (2003).
Visualization and interpretation of protein networks in Mycobacterium tuberculosis based on hierarchical clustering of genome-wide functional linkage maps.
Nucleic Acids Res.. Dec 2003. 31(24):7099-109.
[Abstract]
Genome-wide functional linkages among proteins in cellular complexes and metabolic pathways can be inferred from high throughput experimentation, such as DNA microarrays, or from bioinformatic analyses. Here we describe a method for the visualization and interpretation of genome-wide functional linkages inferred by the Rosetta Stone, Phylogenetic Profile, Operon and Conserved Gene Neighbor computational methods. This method involves the construction of a genome-wide functional linkage map, where each significant functional linkage between a pair of proteins is displayed on a two-dimensional scatter-plot, organized according to the order of genes along the chromosome. Subsequent hierarchical clustering of the map reveals clusters of genes with similar functional linkage profiles and facilitates the inference of protein function and the discovery of functionally linked gene clusters throughout the genome. We illustrate this method by applying it to the genome of the pathogenic bacterium Mycobacterium tuberculosis, assigning cellular functions to previously uncharacterized proteins involved in cell wall biosynthesis, signal transduction, chaperone activity, energy metabolism and polysaccharide biosynthesis.
- Lee S, Sawaya MR, Eisenberg D. (2003).
Structure of superoxide dismutase from Pyrobaculum aerophilum presents a challenging case in molecular replacement with multiple molecules, pseudo-symmetry and twinning.
Acta Crystallogr. D Biol. Crystallogr.. Dec 2003. 59(Pt 12):2191-9.
[Abstract]
The crystal structure of superoxide dismutase from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum was determined by molecular replacement at 1.8 A resolution. The structure determination was made especially challenging by the large number of molecules (24) in the asymmetric unit, the presence of a pseudo-crystallographic twofold operator close to a twinning operator and the inability to detect twinning by conventional means. Molecular replacement proceeded at low resolution in pseudo (apparent) space group P3(2)12 and was facilitated by examination of the self-rotation function and native Patterson map. Refinement, however, stalled at an R factor of 40% when high-resolution data were included. Expanding to the lower symmetry space group P3(2) decreased R (to 22%) and R(free) (to 26%), but not by as much as expected for the quality of data. Finally, despite the apparent lack of evidence from conventional twinning tests [i.e. plots of the second moment of I and N(Z) distributions], a twinning operator was included in the refinement, lowering R and R(free) to 16.2 and 21.7%, respectively. The early detection of twinning appears to have been masked by a deviation in the expected intensity distribution caused by the presence of non-crystallographic translational symmetry. These findings suggest the importance of testing twinning operators in cases where pseudo-translational symmetry can explain negative results from conventional twinning tests. The structure reveals a tetrameric assembly with 222 symmetry, similar to superoxide dismutase structures from other organisms. The current structural model represents the metal-free state of the enzyme.
- Goulding CW, Perry LJ, Anderson D, Sawaya MR, Cascio D, Apostol MI, Chan S, Parseghian A, Wang SS, Wu Y, Cassano V, Gill HS, Eisenberg D. (2003).
Structural genomics of Mycobacterium tuberculosis: a preliminary report of progress at UCLA.
Biophys. Chem.. Sep 2003. 105(2-3):361-70.
[Abstract]
The growing list of fully sequenced genomes, combined with innovations in the fields of structural biology and bioinformatics, provides a synergy for the discovery of new drug targets. With this background, the TB Structural Genomics Consortium has been formed. This international consortium is comprised of laboratories from 31 universities and institutes in 13 countries. The goal of the consortium is to determine the structures of over 400 potential drug targets from the genome of Mycobacterium tuberculosis and analyze their structures in the context of functional information. We summarize the efforts of the UCLA consortium members. Potential drug targets were selected using a variety of bioinformatics methods and screened for certain physical and species-specific properties to yield a starting group of protein targets for structure determination. Target determination methods include protein phylogenetic profiles and Rosetta Stone methods, and the use of related biochemical pathways to select genes linked to essential prokaryotic genes. Criteria imposed on target selection included potential protein solubility, protein or domain size, and targets that lack homologs in eukaryotic organisms. In addition, some protein targets were chosen that are specific to M. tuberculosis, such as PE and PPE domains. Thus far, the UCLA group has cloned 263 targets, expressed 171 proteins and purified 40 proteins, which are currently in crystallization trials. Our efforts have yielded 13 crystals and eight structures. Seven structures are summarized here. Four of the structures are secreted proteins: antigen 85B; MPT 63, which is one of the three major secreted proteins of M. tuberculosis; a thioredoxin derivative Rv2878c; and potentially secreted glutamate synthetase. We also report the structures of three proteins that are potentially essential to the survival of M. tuberculosis: a protein involved in the folate biosynthetic pathway (Rv3607c); a protein involved in the biosynthesis of vitamin B5 (Rv3602c); and a pyrophosphatase, Rv2697c. Our approach to the M. tuberculosis structural genomics project will yield information for drug design and vaccine production against tuberculosis. In addition, this study will provide further insights into the mechanisms of mycobacterial pathogenesis.
- Bleharski JR, Li H, Meinken C, Graeber TG, Ochoa MT, Yamamura M, Burdick A, Sarno EN, Wagner M, Röllinghoff M, Rea TH, Colonna M, Stenger S, Bloom BR, Eisenberg D, Modlin RL. (2003).
Use of genetic profiling in leprosy to discriminate clinical forms of the disease.
Science. Sep 2003. 301(5639):1527-30.
[Abstract]
Leprosy presents as a clinical and immunological spectrum of disease. With the use of gene expression profiling, we observed that a distinction in gene expression correlates with and accurately classifies the clinical form of the disease. Genes belonging to the leukocyte immunoglobulin-like receptor (LIR) family were significantly up-regulated in lesions of lepromatous patients suffering from the disseminated form of the infection. In functional studies, LIR-7 suppressed innate host defense mechanisms by shifting monocyte production from interleukin-12 toward interleukin-10 and by blocking antimicrobial activity triggered by Toll-like receptors. Gene expression profiles may be useful in defining clinical forms of disease and providing insights into the regulation of immune responses to pathogens.
- Eisenberg D. (2003).
The discovery of the alpha-helix and beta-sheet, the principal structural features of proteins.
Proc. Natl. Acad. Sci. U.S.A.. Sep 2003. 100(20):11207-10.
[Abstract]
PNAS papers by Linus Pauling, Robert Corey, and Herman Branson in the spring of 1951 proposed the alpha-helix and the beta-sheet, now known to form the backbones of tens of thousands of proteins. They deduced these fundamental building blocks from properties of small molecules, known both from crystal structures and from Pauling's resonance theory of chemical bonding that predicted planar peptide groups. Earlier attempts by others to build models for protein helices had failed both by including nonplanar peptides and by insisting on helices with an integral number of units per turn. In major respects, the Pauling-Corey-Branson models were astoundingly correct, including bond lengths that were not surpassed in accuracy for >40 years. However, they did not consider the hand of the helix or the possibility of bent sheets. They also proposed structures and functions that have not been found, including the gamma-helix.
- Strong M, Mallick P, Pellegrini M, Thompson MJ, Eisenberg D. (2003).
Inference of protein function and protein linkages in Mycobacterium tuberculosis based on prokaryotic genome organization: a combined computational approach.
Genome Biol.. 2003. 4(9):R59.
[Abstract]
The genome of Mycobacterium tuberculosis was analyzed using recently developed computational approaches to infer protein function and protein linkages. We evaluated and employed a method to infer genes likely to belong to the same operon, as judged by the nucleotide distance between genes in the same genomic orientation, and combined this method with those of the Rosetta Stone, Phylogenetic Profile and conserved Gene Neighbor computational methods for the inference of protein function.
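The operon-inference idea (adjacent genes in the same orientation, separated by a short intergenic distance) might look like the sketch below. The 50-nucleotide cutoff, the gene names, and the coordinates are hypothetical; the abstract does not give the threshold the authors used.

```python
def predict_operons(genes, max_gap=50):
    """Group adjacent same-strand genes whose intergenic gap is <= max_gap.

    genes: iterable of (name, start, end, strand) tuples, strand "+" or "-".
    max_gap: assumed distance cutoff in nucleotides (illustrative value).
    """
    ordered = sorted(genes, key=lambda gene: gene[1])
    operons = []
    for gene in ordered:
        _, start, _, strand = gene
        if operons:
            _, _, prev_end, prev_strand = operons[-1][-1]
            if prev_strand == strand and start - prev_end <= max_gap:
                operons[-1].append(gene)
                continue
        operons.append([gene])
    return [[gene[0] for gene in operon] for operon in operons]


# Hypothetical genes and coordinates.
genes = [
    ("geneA", 100, 400, "+"),
    ("geneB", 420, 900, "+"),    # 20 nt after geneA, same strand
    ("geneC", 1300, 1800, "+"),  # 400 nt after geneB
    ("geneD", 1820, 2200, "-"),  # close to geneC but on the opposite strand
]
print(predict_operons(genes))
```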
- Lee S, Eisenberg D. (2003).
Seeded conversion of recombinant prion protein to a disulfide-bonded oligomer by a reduction-oxidation process.
Nat. Struct. Biol.. Sep 2003. 10(9):725-30.
[Abstract]
The infectious form of prion protein, PrP(Sc), self-propagates by its conversion of the normal, cellular prion protein molecule PrP(C) to another PrP(Sc) molecule. It has not yet been demonstrated that recombinant prion protein can convert prion protein molecules from PrP(C) to PrP(Sc). Here we show that recombinant hamster prion protein is converted to a second form, PrP(RDX), by a redox process in vitro and that this PrP(RDX) form seeds the conversion of other PrP(C) molecules to the PrP(RDX) form. The converted form shows properties of oligomerization and seeded conversion that are characteristic of PrP(Sc). We also find that the oligomerization can be reversed in vitro. X-ray fiber diffraction suggests an amyloid-like structure for the oligomerized prion protein. A domain-swapping model involving intermolecular disulfide bonds can account for the stability and coexistence of two molecular forms of prion protein and the capacity of the second form for self-propagation.
- Salwinski L, Eisenberg D. (2003).
Computational methods of analysis of protein-protein interactions.
Curr. Opin. Struct. Biol.. Jun 2003. 13(3):377-82.
[Abstract]
Computational methods play an important role at all stages of the process of determining protein-protein interactions. They are used to predict potential interactions, to validate the results of high-throughput interaction screens and to analyze the protein networks inferred from interaction databases.
- Wang S, Eisenberg D. (2003).
Crystal structures of a pantothenate synthetase from M. tuberculosis and its complexes with substrates and a reaction intermediate.
Protein Sci.. May 2003. 12(5):1097-108.
[Abstract]
Pantothenate biosynthesis is essential for the virulence of Mycobacterium tuberculosis, and this pathway thus presents potential drug targets against tuberculosis. We determined the crystal structure of pantothenate synthetase (PS) from M. tuberculosis, and its complexes with AMPCPP, pantoate, and a reaction intermediate, pantoyl adenylate, with resolutions from 1.6 to 2 A. PS catalyzes the ATP-dependent condensation of pantoate and beta-alanine to form pantothenate. Its structure reveals a dimer, and each subunit has two domains with tight association between domains. The active-site cavity is on the N-terminal domain, partially covered by the C-terminal domain. One wall of the active site cavity is flexible, which allows the bulky AMPCPP to diffuse into the active site to nearly full occupancy when crystals are soaked in solutions containing AMPCPP. Crystal structures of the complexes with AMPCPP and pantoate indicate that the enzyme binds ATP and pantoate tightly in the active site, and brings the carboxyl oxygen of pantoate near the alpha-phosphorus atom of ATP for an in-line nucleophilic attack. When crystals were soaked with, or grown in the presence of, both ATP and pantoate, a reaction intermediate, pantoyl adenylate, is found in the active site. The flexible wall of the active site cavity becomes ordered when the intermediate is in the active site, thus protecting it from being hydrolyzed. Binding of beta-alanine can occur only after pantoyl adenylate is formed inside the active site cavity. The tight binding of the intermediate pantoyl adenylate suggests that nonreactive analogs of pantoyl adenylate may be inhibitors of the PS enzyme with high affinity and specificity.
- Freedland SJ, Pantuck AJ, Paik SH, Zisman A, Graeber TG, Eisenberg D, McBride WH, Nguyen D, Tso CL, Belldegrun AS. (2003).
Heterogeneity of molecular targets on clonal cancer lines derived from a novel hormone-refractory prostate cancer tumor system.
Prostate. Jun 2003. 55(4):299-307.
[Abstract]
OBJECTIVE: We recently described a new hormone refractory prostate cancer cell line, CL1, derived from LNCaP via in vitro androgen deprivation. To study gene expression during prostate cancer progression and to identify molecular targets for therapy, a pure clonal tumor system was generated. METHODS: Limiting dilution of CL1 stably transfected with a green fluorescent protein generated 35 single-cell clones, which were expanded into stable cell lines. In vitro responses to various therapeutic modalities were assessed in each clone. Gene expression was determined using reverse transcriptase-polymerase chain reaction and oligonucleotide microarrays. In vivo biology was assessed following orthotopic injection into intact and castrated severe combined immunodeficient mice. RESULTS: In vitro, all clones demonstrated similar resistance to traditional therapeutic efforts including chemotherapy and radiation therapy, but differential sensitivity to cell-mediated cytotoxicity. The clones demonstrated differential gene expression relative to each other and to the parental CL1 and LNCaP cell lines. Following orthotopic injection into mice, three distinct growth patterns were observed: fast growth with widespread metastasis; slower growth with widespread metastasis; and no tumor formation. Using oligonucleotide microarrays, several genes were identified as differentially expressed between the most aggressive and the nontumorigenic clone. CONCLUSIONS: We have described a novel fluorescent-labeled clonal hormone refractory prostate cancer tumor system that exhibited marked heterogeneity in its response to various therapeutic modalities, gene expression, and in vivo biology. Our data suggest that given the marked clonal heterogeneity, multi-modality approaches directed against multiple molecular targets rather than single agent therapy will be necessary to adequately eradicate the entire malignant cell population. Clonal tumor lines may allow more accurate examination of molecular pathways involved in tumor progression and resistance to treatment.
- Mura C, Phillips M, Kozhukhovsky A, Eisenberg D. (2003).
Structure and assembly of an augmented Sm-like archaeal protein 14-mer.
Proc. Natl. Acad. Sci. U.S.A.. Apr 2003. 100(8):4539-44.
[Abstract]
To better understand the roles of Sm proteins in forming the cores of many RNA-processing ribonucleoproteins, we determined the crystal structure of an atypical Sm-like archaeal protein (SmAP3) in which the conserved Sm domain is augmented by a previously uncharacterized, mixed alpha/beta C-terminal domain. The structure reveals an unexpected SmAP3 14-mer that is perforated by a cylindrical pore and is bound to 14 cadmium (Cd(2+)) ions. Individual heptamers adopt either "apical" or "equatorial" conformations that chelate Cd(2+) differently. SmAP3 forms supraheptameric oligomers (SmAP3)(n = 7,14,28) in solution, and assembly of the asymmetric 14-mer is modulated by differential divalent cation-binding in apical and equatorial subunits. Phylogenetic and sequence analyses substantiate SmAP3s as a unique subset of SmAPs. These results distinguish SmAP3s from other Sm proteins and provide a model for the structure and properties of Sm proteins >100 residues in length, e.g., several human Sm proteins.
- Mura C, Kozhukhovsky A, Gingery M, Phillips M, Eisenberg D. (2003).
The oligomerization and ligand-binding properties of Sm-like archaeal proteins (SmAPs).
Protein Sci.. Apr 2003. 12(4):832-47.
[Abstract]
Intron splicing is a prime example of the many types of RNA processing catalyzed by small nuclear ribonucleoprotein (snRNP) complexes. Sm proteins form the cores of most snRNPs, and thus to learn principles of snRNP assembly we characterized the oligomerization and ligand-binding properties of Sm-like archaeal proteins (SmAPs) from Pyrobaculum aerophilum (Pae) and Methanobacterium thermautotrophicum (Mth). Ultracentrifugation shows that Mth SmAP1 is exclusively heptameric in solution, whereas Pae SmAP1 forms either disulfide-bonded 14-mers or sub-heptameric states (depending on the redox potential). By electron microscopy, we show that Pae and Mth SmAP1 polymerize into bundles of well ordered fibers that probably form by head-to-tail stacking of heptamers. The crystallographic results reported here corroborate these findings by showing heptamers and 14-mers of both Mth and Pae SmAP1 in four new crystal forms. The 1.9 A-resolution structure of Mth SmAP1 bound to uridine-5'-monophosphate (UMP) reveals conserved ligand-binding sites. The likely RNA binding site in Mth agrees with that determined for Archaeoglobus fulgidus (Afu) SmAP1. Finally, we found that both Pae and Mth SmAP1 gel-shift negatively supercoiled DNA. These results distinguish SmAPs from eukaryotic Sm proteins and suggest that SmAPs have a generic single-stranded nucleic acid-binding activity.
- Eisenberg D. (2003).
John T. Edsall as tutor and teacher.
Biophys. Chem.. 2003. 100(1-3):91-3.
- McCarty AS, Kleiger G, Eisenberg D, Smale ST. (2003).
Selective dimerization of a C2H2 zinc finger subfamily.
Mol. Cell. Feb 2003. 11(2):459-70.
[Abstract]
The C2H2 zinc finger is the most prevalent protein motif in the mammalian proteome. Two C2H2 fingers in Ikaros are dedicated to homotypic interactions between family members. We show here that these fingers comprise a bona fide dimerization domain. Dimerization is highly selective, however, as homologous domains from the TRPS-1 and Drosophila Hunchback proteins support homodimerization, but not heterodimerization with Ikaros. Ikaros-Hunchback selectivity is determined by 11 residues concentrated within the alpha-helical regions typically involved in base recognition. Preferential homodimerization of one chimeric protein predicts a parallel dimer interface and establishes the feasibility of creating novel dimer specificities. These results demonstrate that the C2H2 motif provides a versatile platform for both sequence-specific protein-nucleic acid interactions and highly specific dimerization.
- Mura C, Katz JE, Clarke SG, Eisenberg D. (2003).
Structure and function of an archaeal homolog of survival protein E (SurEalpha): an acid phosphatase with purine nucleotide specificity.
J. Mol. Biol.. Mar 2003. 326(5):1559-75.
[Abstract]
The survival protein E (SurE) family was discovered by its correlation to stationary phase survival of Escherichia coli and various repair proteins involved in sustaining this and other stress-response phenotypes. In order to better understand this ancient and well-conserved protein family, we have determined the 2.0A resolution crystal structure of SurEalpha from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum (Pae). This first structure of an archaeal SurE reveals significant similarities to and differences from the only other known SurE structure, that from the eubacterium Thermotoga maritima (Tma). Both SurE monomers adopt similar folds; however, unlike the Tma SurE dimer, crystalline Pae SurEalpha is predominantly non-domain swapped. Comparative structural analyses of Tma and Pae SurE suggest conformationally variant regions, such as a hinge loop that may be involved in domain swapping. The putative SurE active site is highly conserved, and implies a model for SurE bound to a potential substrate, guanosine-5'-monophosphate (GMP). Pae SurEalpha has optimal acid phosphatase activity at temperatures above 90 degrees C, and is less specific than Tma SurE in terms of metal ion requirements. Substrate specificity also differs between Pae and Tma SurE, with a more specific recognition of purine nucleotides by the archaeal enzyme. Analyses of the sequences, phylogenetic distribution, and genomic organization of the SurE family reveal examples of genomes encoding multiple surE genes, and suggest that SurE homologs constitute a broad family of enzymes with phosphatase-like activities.
- Anderson DH, Sawaya MR, Cascio D, Ernst W, Modlin R, Krensky A, Eisenberg D. (2003).
Granulysin crystal structure and a structure-derived lytic mechanism.
J. Mol. Biol.. Jan 2003. 325(2):355-65.
[Abstract]
Our crystal structure of granulysin suggests a mechanism for lysis of bacterial membranes by granulysin, a 74-residue basic protein from human cytolytic T lymphocyte and natural killer cells. We determined the initial crystal structure of selenomethionyl granulysin by MAD phasing at 2A resolution. We present the structure model refined using native diffraction data to 0.96A resolution. The five-helical bundle of granulysin resembles other "saposin folds" (such as NK-lysin). Positive charges distribute in a ring around the granulysin molecule, and one face has net positive charge. Sulfate ions bind near the segment of the molecule identified as most membrane-lytic and of highest hydrophobic moment. The ion locations may indicate granulysin's orientation of initial approach towards the membrane. The crystal packing reveals one way to pack a sheet of granulysin molecules at the cell surface for a concerted lysis effort. The energy of binding granulysin charges to the bacterial membrane could drive the subsequent lytic processes. The loosely packed core facilitates a hinge or scissors motion towards exposure of hydrophobic surface that we propose tunnels the granulysin into the fracturing target membrane.
2002
- Mallick P, Weiss R, Eisenberg D. (2002).
The directional atomic solvation energy: an atom-based potential for the assignment of protein sequences to known folds.
Proc. Natl. Acad. Sci. U.S.A.. Dec 2002. 99(25):16041-6.
[Abstract]
The Directional Atomic Solvation EnergY (DASEY) is an atom-based description of the environment of an amino acid position within a known 3D protein structure. The DASEY has been developed to align and score a probe amino acid sequence to a library of template protein structures for fold assignment. DASEY is computed by summing the atomic solvation parameters of atoms falling within a tetrahedral sector, or petal, extending 16 A along each of the four bond axes of each alpha-carbon atom of the protein. The DASEY discriminates between pairs of structurally equivalent positions and random pairs in protein structures sharing a fold but belonging to different superfamilies, unlike some previous descriptors of protein environments, such as buried area. Furthermore, the DASEY values have characteristic patterns of residue replacement, an essential feature of a successful fold assignment method. Benchmarking fold assignment with DASEY achieves coverage of 56% of sequences with 90% accuracy when probe sequences are matched to protein structural templates belonging to the same fold but to a different superfamily, an improvement of greater than 200% over a previous method.
- Goulding CW, Parseghian A, Sawaya MR, Cascio D, Apostol MI, Gennaro ML, Eisenberg D. (2002).
Crystal structure of a major secreted protein of Mycobacterium tuberculosis-MPT63 at 1.5-A resolution.
Protein Sci.. Dec 2002. 11(12):2887-93.
[Abstract]
MPT63 is a small, major secreted protein of unknown function from Mycobacterium tuberculosis that has been shown to have immunogenic properties and has been implicated in virulence. A BLAST search identified that MPT63 has homologs only in other mycobacteria, and is therefore mycobacteria specific. As MPT63 is a secreted protein, mycobacteria specific, and implicated in virulence, MPT63 is an attractive drug target against the deadliest infectious disease, tuberculosis (TB). As part of the TB Structural Genomics Consortium, the X-ray crystal structure of MPT63 was determined to 1.5-Angstrom resolution with the hope of yielding functional information about MPT63. The structure of MPT63 is an antiparallel beta-sandwich immunoglobulin-like fold, with the unusual feature of the first beta-strand of the protein forming a parallel addition to the small antiparallel beta-sheet. MPT63 has weak structural similarity to many proteins with immunoglobulin folds, in particular, Homo sapiens beta2-adaptin, bovine arrestin, and Yersinia pseudotuberculosis invasin. Although the structure of MPT63 gives no conclusive evidence to its function, structural similarity suggests that MPT63 could be involved in cell-host interactions to facilitate endocytosis/phagocytosis.
- Kleiger G, Eisenberg D. (2002).
GXXXG and GXXXA motifs stabilize FAD and NAD(P)-binding Rossmann folds through C(alpha)-H...O hydrogen bonds and van der Waals interactions.
J. Mol. Biol.. Oct 2002. 323(1):69-76.
[Abstract]
Here we present evidence that domains in soluble proteins containing either the GXXXG or GXXXA motif are stabilized by the interaction of a beta-strand with the following alpha-helix. As an example, we characterized a beta-strand-helix interaction from the FAD or NAD(P)-binding Rossmann fold. The Rossmann fold is one of the three most highly represented folds in the Protein Data Bank (PDB). A subset of the proteins that adopt the Rossmann fold also bind to nucleotide cofactors such as FAD and NAD(P) and function as oxidoreductases. These Rossmann folds can often be identified by the short amino acid sequence motif, GX(1-2)GXXG. Here, we present evidence that in addition to this sequence motif, Rossmann folds that bind FAD and NAD(P) also typically contain either GXXXG or GXXXA motifs, where the first glycyl residue of these motifs and the third glycyl residue of the GX(1-2)GXXG motif are the same residue. These two motifs appear to stabilize the Rossmann fold: the first glycyl residue of either the GXXXG or GXXXA motif contacts the carbonyl oxygen atom from the first glycyl residue of the GX(1-2)GXXG motif, consistent with the formation of a C(alpha)-H...O hydrogen bond. In addition, both the glycyl and alanyl residues of the GXXXG or GXXXA motifs form van der Waals interactions with either a valine or isoleucine residue located either seven or eight residues further back along the polypeptide chain from the first glycine of the GXXXG or GXXXA motifs. Therefore, we combine both the GX(1-2)GXXG and GXXXG/A motifs into an extended motif, V/IXGX(1-2)GXXGXXXG/A, that is more strongly indicative than previously described motifs of Rossmann folds that bind FAD or NAD(P). The V/IXGX(1-2)GXXGXXXG/A motif can be used to search genomic sequence data and to annotate the function of proteins containing the motif as oxidoreductases, including proteins of previously unknown function.
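The extended motif V/IXGX(1-2)GXXGXXXG/A lends itself to a regular-expression search. The following is a minimal sketch, not the authors' search tool: it assumes that X stands for any single residue and that V/I and G/A denote alternatives at one position. The function name and the example sequence are hypothetical.

```python
import re

# Extended motif V/IXGX(1-2)GXXGXXXG/A, assuming X = any single residue.
EXTENDED_MOTIF = re.compile(r"[VI].G.{1,2}G..G...[GA]")

def find_extended_motif(sequence):
    """Return (start, matched_text) for each non-overlapping motif match."""
    return [(m.start(), m.group())
            for m in EXTENDED_MOTIF.finditer(sequence.upper())]

# Hypothetical sequence used only for illustration.
hits = find_extended_motif("MKKVAGAGPSGLAAAYNN")
```

Because `finditer` reports non-overlapping matches only, a production scan of genomic data would likely need a lookahead-based pattern to catch overlapping occurrences.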
- Gill HS, Pfluegl GM, Eisenberg D. (2002).
Multicopy crystallographic refinement of a relaxed glutamine synthetase from Mycobacterium tuberculosis highlights flexible loops in the enzymatic mechanism and its regulation.
Biochemistry. Aug 2002. 41(31):9863-72.
[Abstract]
The crystal structure of glutamine synthetase (GS) from Mycobacterium tuberculosis determined at 2.4 A resolution reveals citrate and AMP bound in the active site. The structure was refined with strict 24-fold noncrystallographic symmetry (NCS) constraints and has an R-factor of 22.7% and an R-free of 25.5%. Multicopy refinement using 10 atomic models and strict 24-fold NCS constraints further reduced the R-factor to 20.4% and the R-free to 23.2%. The multicopy model demonstrates the range of atomic displacements of catalytic and regulatory loops in glutamine synthesis, simulating loop motions. A comparison with loop positions in substrate complexes of GS from Salmonella typhimurium shows that the Asp50 and Glu327 loops close over the active site during catalysis. These loop closures are preceded by a conformational change of the Glu209 beta-strand upon metal ion or ATP binding that converts the enzyme from a relaxed to a taut state. We propose a model of the GS regulatory mechanism based on the loop motions in which adenylylation of the Tyr397 loop reverses the effect of metal ion binding, and regulates intermediate formation by preventing closure of the Glu327 loop.
- Deane CM, Salwiński Ł, Xenarios I, Eisenberg D. (2002).
Protein interactions: two methods for assessment of the reliability of high throughput observations.
Mol. Cell Proteomics. May 2002. 1(5):349-56.
[Abstract]
High throughput methods for detecting protein interactions require assessment of their accuracy. We present two forms of computational assessment. The first method is the expression profile reliability (EPR) index. The EPR index estimates the biologically relevant fraction of protein interactions detected in a high throughput screen. It does so by comparing the RNA expression profiles for the proteins whose interactions are found in the screen with expression profiles for known interacting and non-interacting pairs of proteins. The second form of assessment is the paralogous verification method (PVM). This method judges an interaction likely if the putatively interacting pair has paralogs that also interact. In contrast to the EPR index, which evaluates datasets of interactions, PVM scores individual interactions. On a test set, PVM identifies correctly 40% of true interactions with a false positive rate of approximately 1%. EPR and PVM were applied to the Database of Interacting Proteins (DIP), a large and diverse collection of protein-protein interactions that contains over 8000 Saccharomyces cerevisiae pairwise protein interactions. Using these two methods, we estimate that approximately 50% of them are reliable, and with the aid of PVM we identify confidently 3003 of them. Web servers for both the PVM and EPR methods are available on the DIP website (dip.doe-mbi.ucla.edu/Services.cgi).
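The idea behind the paralogous verification method can be sketched in a few lines. This is an illustration under stated assumptions, not the DIP implementation: the paralog table, the set of known interactions, and the function name are all hypothetical, and the sketch simply counts how many paralog pairs of a candidate interaction are already known to interact.

```python
def paralogous_support(pair, known_interactions, paralogs):
    """Count known interactions between paralogs of the two proteins in `pair`.

    `paralogs` maps a protein to its paralogs (assumed to exclude the
    protein itself); `known_interactions` is an iterable of 2-tuples.
    """
    a, b = pair
    # Store interactions as unordered pairs so (x, y) and (y, x) are equal.
    known = {frozenset(p) for p in known_interactions}
    count = 0
    for pa in paralogs.get(a, ()):
        for pb in paralogs.get(b, ()):
            if frozenset((pa, pb)) in known:
                count += 1
    return count

# Hypothetical data used only for illustration.
paralogs = {"A": ["A2"], "B": ["B2", "B3"]}
known = [("B2", "A2"), ("A", "C")]
support = paralogous_support(("A", "B"), known, paralogs)
```

A candidate pair with a support count above zero would be judged more likely to be a true interaction; where to set the cutoff is a choice the abstract does not specify.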
- Mallick P, Boutz DR, Eisenberg D, Yeates TO. (2002).
Genomic evidence that the intracellular proteins of archaeal microbes contain disulfide bonds.
Proc. Natl. Acad. Sci. U.S.A.. Jul 2002. 99(15):9679-84.
[Abstract]
Disulfide bonds have only rarely been found in intracellular proteins. That pattern is consistent with the chemically reducing environment inside the cells of well-studied organisms. However, recent experiments and new calculations based on genomic data of archaea provide striking contradictions to this pattern. Our results indicate that the intracellular proteins of certain hyperthermophilic archaea, especially the crenarchaea Pyrobaculum aerophilum and Aeropyrum pernix, are rich in disulfide bonds. This finding implicates disulfide bonding in stabilizing many thermostable proteins and points to novel chemical environments inside these microbes. These unexpected results illustrate the wealth of biochemical insights available from the growing reservoir of genomic data.
- Duan XJ, Xenarios I, Eisenberg D. (2002).
Describing biological protein interactions in terms of protein states and state transitions: the LiveDIP database.
Mol. Cell Proteomics. Feb 2002. 1(2):104-16.
[Abstract]
Biological protein-protein interactions differ from the more general class of physical interactions; in a biological interaction, both proteins must be in their proper states (e.g. covalently modified state, conformational state, cellular location state, etc.). Also in every biological interaction, one or both interacting molecules undergo a transition to a new state. This regulation of protein states through protein-protein interactions underlies many dynamic biological processes inside cells. Therefore, understanding biological interactions requires information on protein states. Toward this goal, DIP (the Database of Interacting Proteins) has been expanded to LiveDIP, which describes protein interactions by protein states and state transitions. This additional level of characterization permits a more complete picture of the protein-protein interaction networks and is crucial to an integrated understanding of genome-scale biology. The search tools provided by LiveDIP, Pathfinder, and Batch Search allow users to assemble biological pathways from all the protein-protein interactions collated from the scientific literature in LiveDIP. Tools have also been developed to integrate the protein-protein interaction networks of LiveDIP with large scale genomic data such as microarray data. An example of these tools applied to analyzing the pheromone response pathway in yeast suggests that the pathway functions in the context of a complex protein-protein interaction network. Seven of the eleven proteins involved in signal transduction are under negative or positive regulation of up to five other proteins through biological protein-protein interactions. During pheromone response, the mRNA expression levels of these signaling proteins exhibit different time course profiles. There is no simple correlation between changes in transcription levels and the signal intensity. This points to the importance of proteomic studies to understand how cells modulate and integrate signals. 
Integrating large scale, yeast two-hybrid data with mRNA expression data suggests biological interactions that may participate in pheromone response. These examples illustrate how LiveDIP provides data and tools for biological pathway discovery and pathway analysis.
- Goulding CW, Sawaya MR, Parseghian A, Lim V, Eisenberg D, Missiakas D. (2002).
Thiol-disulfide exchange in an immunoglobulin-like fold: structure of the N-terminal domain of DsbD.
Biochemistry. Jun 2002. 41(22):6920-7.
[Abstract]
Escherichia coli DsbD transports electrons across the plasma membrane, a pathway that leads to the reduction of protein disulfide bonds. Three secreted thioredoxin-like factors, DsbC, DsbE, and DsbG, reduce protein disulfide bonds whereby an active site C-X-X-C motif is oxidized to generate a disulfide bond. DsbD catalyzes the reduction of the disulfide of DsbC, DsbE, and DsbG but not of the thioredoxin-like oxidant DsbA. The reduction of DsbC, DsbE, and DsbG occurs by transport of electrons from cytoplasmic thioredoxin to the C-terminal thioredoxin-like domain of DsbD (DsbD(C)). The N-terminal domain of DsbD, DsbD(N), acts as a versatile adaptor in electron transport and is capable of forming disulfides with oxidized DsbC, DsbE, or DsbG as well as with reduced DsbD(C). Isolated DsbD(N) is functional in electron transport in vitro. Crystallized DsbD(N) assumes an immunoglobulin-like fold that encompasses two active site cysteines, C103 and C109, forming a disulfide bond between beta-strands. The disulfide of DsbD(N) is shielded from the environment and capped by a phenylalanine (F70). A model is discussed whereby the immunoglobulin fold of DsbD(N) may provide for the discriminating interaction with thioredoxin-like factors, thereby triggering movement of the phenylalanine cap followed by disulfide rearrangement.
- Liu Y, Eisenberg D. (2002).
3D domain swapping: as domains continue to swap.
Protein Sci.. Jun 2002. 11(6):1285-99.
[Abstract]
Three-dimensional (3D) domain swapping creates a bond between two or more protein molecules as they exchange their identical domains. Since the term '3D domain swapping' was first used to describe the dimeric structure of diphtheria toxin, the database of domain-swapped proteins has greatly expanded. Analyses of the now about 40 structurally characterized cases of domain-swapped proteins reveal that most swapped domains are at either the N or C terminus and that the swapped domains are diverse in their primary and secondary structures. In addition to tabulating domain-swapped proteins, we describe in detail several examples of 3D domain swapping which show the swapping of more than one domain in a protein, the structural evidence for 3D domain swapping in amyloid proteins, and the flexibility of hinge loops. We also discuss the physiological relevance of 3D domain swapping and a possible mechanism for 3D domain swapping. The present state of knowledge leads us to suggest that 3D domain swapping can occur under appropriate conditions in any protein with an unconstrained terminus. As domains continue to swap, this review attempts not only a summary of the known domain-swapped proteins, but also a framework for understanding future findings of 3D domain swapping.
- Kleiger G, Grothe R, Mallick P, Eisenberg D. (2002).
GXXXG and AXXXA: common alpha-helical interaction motifs in proteins, particularly in extremophiles.
Biochemistry. May 2002. 41(19):5990-7.
[Abstract]
The GXXXG motif is a frequently occurring sequence of residues that is known to favor helix-helix interactions in membrane proteins. Here we show that the GXXXG motif is also prevalent in soluble proteins whose structures have been determined. Some 152 proteins from a non-redundant PDB set contain at least one alpha-helix with the GXXXG motif, 41 +/- 9% more than expected if glycine residues were uniformly distributed in those alpha-helices. More than 50% of the GXXXG-containing alpha-helices participate in helix-helix interactions. In fact, 26 of those helix-helix interactions are structurally similar to the helix-helix interaction of the glycophorin A dimer, where two transmembrane helices associate to form a dimer stabilized by the GXXXG motif. As for the glycophorin A structure, we find backbone-to-backbone atomic contacts of the C alpha-H...O type in each of these 26 helix-helix interactions that display the stereochemical hallmarks of hydrogen bond formation. These glycophorin A-like helix-helix interactions are enriched in the general set of helix-helix interactions containing the GXXXG motif, suggesting that the inferred C alpha-H...O hydrogen bonds stabilize the helix-helix interactions. In addition to the GXXXG motif, some 808 proteins from the non-redundant PDB set contain at least one alpha-helix with the AXXXA motif (30 +/- 3% greater than expected). Both the GXXXG and AXXXA motifs occur frequently in predicted alpha-helices from 24 fully sequenced genomes. Occurrence of the AXXXA motif is enhanced to a greater extent in thermophiles than in mesophiles, suggesting that helical interaction based on the AXXXA motif may be a common mechanism of thermostability in protein structures. We conclude that the GXXXG sequence motif stabilizes helix-helix interactions in proteins, and that the AXXXA sequence motif also stabilizes the folded state of proteins.
- Wang S, Mura C, Sawaya MR, Cascio D, Eisenberg D. (2002).
Structure of a Nudix protein from Pyrobaculum aerophilum reveals a dimer with two intersubunit beta-sheets.
Acta Crystallogr. D Biol. Crystallogr.. Apr 2002. 58(Pt 4):571-8.
[Abstract]
Nudix proteins, formerly called MutT homolog proteins, are a large family of proteins that play an important role in reducing the accumulation of potentially toxic compounds inside the cell. They hydrolyze a wide variety of substrates that are mainly composed of a nucleoside diphosphate linked to some other moiety X and thus are called Nudix hydrolases. Here, the crystal structure of a Nudix hydrolase from the hyperthermophilic archaeon Pyrobaculum aerophilum is reported. The structure was determined by the single-wavelength anomalous scattering method with data collected at the peak anomalous wavelength of an iridium-derivatized crystal. It reveals an extensive dimer interface, with each subunit contributing two strands to the beta-sheet of the other subunit. Individual subunits consist of a mixed highly twisted and curved beta-sheet of 11 beta-strands and two alpha-helices, forming an alpha-beta-alpha sandwich. The conserved Nudix box signature motif, which contains the essential catalytic residues, is located at the first alpha-helix and the beta-strand and loop preceding it. The unusually short connections between secondary-structural elements, together with the dimer form of the structure, are likely to contribute to the thermostability of the P. aerophilum Nudix protein.
- Liu Y, Gotte G, Libonati M, Eisenberg D. (2002).
Structures of the two 3D domain-swapped RNase A trimers.
Protein Sci.. Feb 2002. 11(2):371-80.
[Abstract]
When concentrated in mildly acidic solutions, bovine pancreatic ribonuclease (RNase A) forms long-lived oligomers including two types of dimer, two types of trimer, and higher oligomers. In previous crystallographic work, we found that the major dimeric component forms by a swapping of the C-terminal beta-strands between the monomers, and that the minor dimeric component forms by swapping the N-terminal alpha-helices of the monomers. On the basis of these structures, we proposed that a linear RNase A trimer can form from a central molecule that simultaneously swaps its N-terminal helix with a second RNase A molecule and its C-terminal strand with a third molecule. Studies by dissociation are consistent with this model for the major trimeric component: the major trimer dissociates into both the major and the minor dimers, as well as monomers. In contrast, the minor trimer component dissociates into the monomer and the major dimer. This suggests that the minor trimer is cyclic, formed from three monomers that swap their C-terminal beta-strands into identical molecules. These conclusions are supported by cross-linking of lysyl residues, showing that the major trimer swaps its N-terminal helix, and the minor trimer does not. We verified by X-ray crystallography the proposed cyclic structure for the minor trimer, with swapping of the C-terminal beta-strands. This study thus expands the variety of domain-swapped oligomers by revealing the first example of a protein that can form both a linear and a cyclic domain-swapped oligomer. These structures permit interpretation of the enzymatic activities of the RNase A oligomers on double-stranded RNA.
- Xenarios I, Salwiński Ł, Duan XJ, Higney P, Kim SM, Eisenberg D. (2002).
DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions.
Nucleic Acids Res.. Jan 2002. 30(1):303-5.
[Abstract]
The Database of Interacting Proteins (DIP: http://dip.doe-mbi.ucla.edu) is a database that documents experimentally determined protein-protein interactions. It provides the scientific community with an integrated set of tools for browsing and extracting information about protein interaction networks. As of September 2001, the DIP catalogs approximately 11 000 unique interactions among 5900 proteins from >80 organisms; the vast majority from yeast, Helicobacter pylori and human. Tools have been developed that allow users to analyze, visualize and integrate their own experimental data with the information about protein-protein interactions available in the DIP database.
2001
- Kleiger G, Perry J, Eisenberg D. (2001).
3D structure and significance of the GPhiXXG helix packing motif in tetramers of the E1beta subunit of pyruvate dehydrogenase from the archeon Pyrobaculum aerophilum.
Biochemistry. Dec 2001. 40(48):14484-92.
[Abstract]
As part of a structural genomics project, we have determined the 2.0 A structure of the E1beta subunit of pyruvate dehydrogenase from Pyrobaculum aerophilum (PA), a thermophilic archaeon. The overall fold of E1beta from PA is closely similar to the previously determined E1beta structures from humans (HU) and P. putida (PP). However, unlike the HU and PP structures, the PA structure was determined in the absence of its partner subunit, E1alpha. Significant structural rearrangements occur in E1beta when its E1alpha partner is absent, including rearrangement of several secondary structure elements such as helix C. Helix C is buried by E1alpha in the HU and PP structures, but makes crystal contacts in the PA structure that lead to an apparent beta(4) tetramer. Static light scattering and sedimentation velocity data are consistent with the formation of PA E1beta tetramers in solution. The interaction of helix C with its symmetry-related counterpart stabilizes the tetrameric interface, where two glycine residues on the same face of one helix create a packing surface for the other helix. This GPhiXXG helix-helix interaction motif has previously been found in interacting transmembrane helices, and is found here at the E1alpha-E1beta interface for both the HU and PP alpha(2)beta(2) tetramers. As a case study in structural genomics, this work illustrates that comparative analysis of protein structures can identify the structural significance of a sequence motif.
- Salwinski L, Eisenberg D. (2001).
Motif-based fold assignment.
Protein Sci.. Dec 2001. 10(12):2460-9.
[Abstract]
Conventional fold recognition techniques rely mainly on the analysis of the entire sequence of a protein. We present a motif-based fold assignment (MBA) method to improve the performance of any conventional sequence-based fold assignment. The method uses sequence motifs, such as those defined in the Prosite database, and the SwissProt annotation of the fold library. When combined with a simple SDP method, the coverage of MBA is comparable to the results obtained with PSI-BLAST. However, the set of MBA predictions is significantly different from that of PSI-BLAST, leading to a 40% increase in coverage for the combined MBA/PSI-BLAST method. The MBA approach can be easily adapted to include the results of sequence-independent function prediction methods and alternative motif and annotation databases. The method is available through the web server located at http://www.doe-mbi.ucla.edu/mba.
- Graeber TG, Eisenberg D. (2001).
Bioinformatic identification of potential autocrine signaling loops in cancers from gene expression profiles.
Nat. Genet.. Nov 2001. 29(3):295-300.
[Abstract]
Many biological signaling pathways involve autocrine ligand-receptor loops; misregulation of these signaling loops can contribute to cancer phenotypes. Here we present an algorithm for detecting such loops from gene expression profiles. Our method is based on the hypothesis that for some autocrine pathways, the ligand and receptor are regulated by coupled mechanisms at the level of transcription, and thus ligand-receptor pairs comprising such a loop should have correlated mRNA expression. Using our database of experimentally known ligand-receptor signaling partners, we found examples of ligand-receptor pairs with significantly correlated expression in five cancer-based gene expression datasets. The correlated ligand-receptor pairs we identified are consistent with known autocrine signaling events in cancer cells. In addition, our algorithm predicts new autocrine signaling loops that can be verified experimentally. Chemokines were commonly members of these potential autocrine pathways. Our analysis also revealed ligand-receptor pairs with expression patterns that may indicate cellular mechanisms for preventing autocrine signaling.
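The core test described here, whether a ligand and its receptor have correlated mRNA expression across samples, can be sketched as follows. This is a minimal illustration, not the published algorithm: it uses a plain Pearson correlation with a fixed cutoff in place of the significance assessment the abstract mentions, and the gene names, expression values, and threshold are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sx * sy)

def correlated_pairs(expression, pairs, threshold):
    """Return (ligand, receptor, r) for pairs whose r meets the threshold."""
    results = []
    for ligand, receptor in pairs:
        r = pearson(expression[ligand], expression[receptor])
        if r >= threshold:
            results.append((ligand, receptor, r))
    return results

# Hypothetical expression profiles across four samples.
expression = {"L1": [1, 2, 3, 4], "R1": [2, 4, 6, 8], "R2": [4, 1, 3, 2]}
candidates = correlated_pairs(expression, [("L1", "R1"), ("L1", "R2")], 0.9)
```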
- Singer E, Landgraf R, Horan T, Slamon D, Eisenberg D. (2001).
Identification of a heregulin binding site in HER3 extracellular domain.
J. Biol. Chem.. Nov 2001. 276(47):44266-74.
[Abstract]
HER3 (also known as c-Erb-b3) is a type I receptor tyrosine kinase similar in sequence to the epidermal growth factor (EGF) receptor. The extracellular segment of this transmembrane receptor contains four domains. Domains I and II are similar in sequence to domains III and IV, respectively, and domains II and IV are cysteine-rich. We show that the EGF-like domain of heregulin (hrg) binds to domains I and II of HER3, in contrast to the EGF receptor, for which prior studies have shown that a construct consisting of domains III and portions of domain IV binds EGF. Next, we identified a putative hrg binding site by limited proteolysis of the recombinant extracellular domains of HER3 (HER3-ECD(I-IV)) in both the presence and absence of hrg. In the absence of hrg, HER3-ECD(I-IV) is cleaved after position Tyr(50), near the beginning of domain I. Binding of hrg to HER3-ECD(I-IV) fully protects position Tyr(50) from proteolysis. To confirm that domain I contains a hrg binding site, we expressed domains I and II (HER3-ECD(I-II)) and found that it binds hrg with 68 nM affinity. These data suggest that domains I and II of HER3-ECD(I-IV) act as a functional unit in folding and binding of hrg. Thus, our biochemical findings reinforce the structural hypothesis of others that HER3-ECD(I-IV) is similar to the insulin-like growth factor-1 receptor (IGF-1R), as follows: 1) The protected cleavage site in HER3-ECD(I-IV) corresponds to a binding footprint in domain I of IGF-1R; 2) HER3-ECD(I-II) binds hrg with a 68 nM dissociation constant, supporting the hypothesis that domain I is involved in ligand binding; and 3) the large accessible surface area (1749 A^2) of domain L1 of IGF-1R that is buried by domain S1, as well as the presence of conserved contacts in this interface of type 1 RTKs, suggests that domains L1 and S1 of IGF-1R function as a unit as observed for HER3-ECD(I-II).
Our results are consistent with the proposal that HER3 has a structure similar to IGF-1R and binds ligand at a site in corresponding domains.
- Xenarios I, Eisenberg D. (2001).
Protein interaction databases.
Curr. Opin. Biotechnol.. Aug 2001. 12(4):334-9.
[Abstract]
Life depends on the interaction of proteins. The availability of the complete human genome sequence has highlighted the need for a tool to analyse protein interactions and several databases have been compiled for this purpose. These databases document, categorize, and analyze interacting proteins and the cellular functions of the interactions.
- Dym O, Eisenberg D. (2001).
Sequence-structure analysis of FAD-containing proteins.
Protein Sci.. Sep 2001. 10(9):1712-28.
[Abstract]
We have analyzed structure-sequence relationships in 32 families of flavin adenine dinucleotide (FAD)-binding proteins, to prepare for genomic-scale analyses of this family. Four different FAD-family folds were identified, each containing two or more protein families. Three of these families, exemplified by glutathione reductase (GR), ferredoxin reductase (FR), and p-cresol methylhydroxylase (PCMH) were previously defined, and a family represented by pyruvate oxidase (PO) is newly defined. For each of the families, several conserved sequence motifs have been characterized. Several newly recognized sequence motifs are reported here for the PO, GR, and PCMH families. Each FAD fold can be uniquely identified by the presence of distinctive conserved sequence motifs. We also analyzed cofactor properties, some of which are conserved within a family fold while others display variability. Among the conserved properties is cofactor directionality: in some FAD-structural families, the adenine ring of the FAD points toward the FAD-binding domain, whereas in others the isoalloxazine ring points toward this domain. In contrast, the FAD conformation and orientation are conserved in some families while in others they display some variability. Nevertheless, there are clear correlations among the FAD-family fold, the shape of the pocket, and the FAD conformation. Our general findings are as follows: (a) no single protein 'pharmacophore' exists for binding FAD; (b) in every FAD-binding family, the pyrophosphate moiety binds to the most strongly conserved sequence motif, suggesting that pyrophosphate binding is a significant component of molecular recognition; and (c) sequence motifs can identify proteins that bind phosphate-containing ligands.
- Mura C, Cascio D, Sawaya MR, Eisenberg DS. (2001).
The crystal structure of a heptameric archaeal Sm protein: Implications for the eukaryotic snRNP core.
Proc. Natl. Acad. Sci. U.S.A.. May 2001. 98(10):5532-7.
[Abstract]
Sm proteins form the core of small nuclear ribonucleoprotein particles (snRNPs), making them key components of several mRNA-processing assemblies, including the spliceosome. We report the 1.75-A crystal structure of SmAP, an Sm-like archaeal protein that forms a heptameric ring perforated by a cationic pore. In addition to providing direct evidence for such an assembly in eukaryotic snRNPs, this structure (i) shows that SmAP homodimers are structurally similar to human Sm heterodimers, (ii) supports a gene duplication model of Sm protein evolution, and (iii) offers a model of SmAP bound to single-stranded RNA (ssRNA) that explains Sm binding-site specificity. The pronounced electrostatic asymmetry of the SmAP surface imparts directionality to putative SmAP-RNA interactions.
- Gill HS, Eisenberg D. (2001).
The crystal structure of phosphinothricin in the active site of glutamine synthetase illuminates the mechanism of enzymatic inhibition.
Biochemistry. Feb 2001. 40(7):1903-12.
[Abstract]
Phosphinothricin is a potent inhibitor of the enzyme glutamine synthetase (GS). The resolution of the native structure of GS from Salmonella typhimurium has been extended to 2.5 Å, and the improved model is used to determine the structure of phosphinothricin complexed to GS by difference Fourier methods. The structure suggests a noncovalent, dead-end mechanism of inhibition. Phosphinothricin occupies the glutamate substrate pocket and stabilizes the Glu327 flap in a position which blocks the glutamate entrance to the active site, trapping the inhibitor on the enzyme. One oxygen of the phosphinyl group of phosphinothricin appears to be protonated, because of its proximity to the carboxylate group of Glu327. The other phosphinyl oxygen protrudes into the negatively charged binding pocket for the substrate ammonium, disrupting that pocket. The distribution of charges in the glutamate binding pocket is complementary to those of phosphinothricin. The presence of a second ammonium binding site within the active site is confirmed by its analogue thallous ion, marking the ammonium site and its protein ligands. The inhibition of GS by methionine sulfoximine can be explained by the same mechanism. These models of inhibited GS further illuminate its catalytic mechanism.
- Marcotte EM, Xenarios I, Eisenberg D. (2001).
Mining literature for protein-protein interactions.
Bioinformatics. Apr 2001. 17(4):359-63.
[Abstract]
MOTIVATION: A central problem in bioinformatics is how to capture information from the vast current scientific literature in a form suitable for analysis by computer. We address the special case of information on protein-protein interactions, and show that the frequencies of words in Medline abstracts can be used to determine whether or not a given paper discusses protein-protein interactions. For those papers determined to discuss this topic, the relevant information can be captured for the Database of Interacting Proteins. Furthermore, suitable gene annotations can also be captured. RESULTS: Our Bayesian approach scores Medline abstracts for probability of discussing the topic of interest according to the frequencies of discriminating words found in the abstract. More than 80 discriminating words (e.g. complex, interaction, two-hybrid) were determined from a training set of 260 Medline abstracts corresponding to previously validated entries in the Database of Interacting Proteins. Using these words and a log likelihood scoring function, approximately 2000 Medline abstracts were identified as describing interactions between yeast proteins. This approach now forms the basis for the rapid expansion of the Database of Interacting Proteins.
- Landgraf R, Xenarios I, Eisenberg D. (2001).
Three-dimensional cluster analysis identifies interfaces and functional residue clusters in proteins.
J. Mol. Biol.. Apr 2001. 307(5):1487-502.
[Abstract]
Three-dimensional cluster analysis offers a method for the prediction of functional residue clusters in proteins. This method requires a representative structure and a multiple sequence alignment as input data. Individual residues are represented in terms of regional alignments that reflect both their structural environment and their evolutionary variation, as defined by the alignment of homologous sequences. From the overall (global) and the residue-specific (regional) alignments, we calculate the global and regional similarity matrices, containing scores for all pairwise sequence comparisons in the respective alignments. Comparing the matrices yields two scores for each residue. The regional conservation score (C(R)(x)) defines the conservation of each residue x and its neighbors in 3D space relative to the protein as a whole. The similarity deviation score (S(x)) detects residue clusters with sequence similarities that deviate from the similarities suggested by the full-length sequences. We evaluated 3D cluster analysis on a set of 35 families of proteins with available cocrystal structures, showing small ligand interfaces, nucleic acid interfaces and two types of protein-protein interfaces (transient and stable). We present two examples in detail: fructose-1,6-bisphosphate aldolase and the mitogen-activated protein kinase ERK2. We found that the regional conservation score (C(R)(x)) identifies functional residue clusters better than a scoring scheme that does not take 3D information into account. C(R)(x) is particularly useful for the prediction of poorly conserved, transient protein-protein interfaces. Many of the proteins studied contained residue clusters with elevated similarity deviation scores. These residue clusters correlate with specificity-conferring regions: 3D cluster analysis therefore represents an easily applied method for the prediction of functionally relevant spatial clusters of residues in proteins.
- Anderson DH, Harth G, Horwitz MA, Eisenberg D. (2001).
An interfacial mechanism and a class of inhibitors inferred from two crystal structures of the Mycobacterium tuberculosis 30 kDa major secretory protein (Antigen 85B), a mycolyl transferase.
J. Mol. Biol.. Mar 2001. 307(2):671-81.
[Abstract]
The Mycobacterium tuberculosis 30 kDa major secretory protein (antigen 85B) is the most abundant protein exported by M. tuberculosis, as well as a potent immunoprotective antigen and a leading drug target. A mycolyl transferase of 285 residues, it is closely related to two other mycolyl transferases, each of molecular mass 32 kDa: antigen 85A and antigen 85C. All three catalyze transfer of the fatty acid mycolate from one trehalose monomycolate to another, resulting in trehalose dimycolate and free trehalose, thus helping to build the bacterial cell wall. We have determined two crystal structures of M. tuberculosis antigen 85B (ag85B), initially by molecular replacement using antigen 85C as a probe. The apo ag85B model is refined against 1.8 Å data, to an R-factor of 0.196 (R(free) is 0.276), and includes all residues except the N-terminal Phe. The active site immobilizes a molecule of the cryoprotectant 2-methyl-2,4-pentanediol. Crystal growth with addition of trehalose resulted in a second ag85B crystal structure (1.9 Å resolution; R-factor is 0.195; R(free) is 0.285). Trehalose binds in two sites at opposite ends of the active-site cleft. In our proposed mechanism model, the trehalose at the active site Ser126 represents the trehalose liberated by temporary esterification of Ser126, while the other trehalose represents the incoming trehalose monomycolate just prior to swinging over to the first trehalose site to displace the mycolate from its serine ester. Our proposed interfacial mechanism minimizes aqueous exposure of the apolar mycolates. Based on the trehalose-bound structure, we suggest a new class of antituberculous drugs, made by connecting two trehalose molecules by an amphipathic linker.
- Balbirnie M, Grothe R, Eisenberg DS. (2001).
An amyloid-forming peptide from the yeast prion Sup35 reveals a dehydrated beta-sheet structure for amyloid.
Proc. Natl. Acad. Sci. U.S.A.. Feb 2001. 98(5):2375-80.
[Abstract]
X-ray diffraction and other biophysical tools reveal features of the atomic structure of an amyloid-like crystal. Sup35, a prion-like protein in yeast, forms fibrillar amyloid assemblies intrinsic to its prion function. We have identified a polar peptide from the N-terminal prion-determining domain of Sup35 that exhibits the amyloid properties of full-length Sup35, including cooperative kinetics of aggregation, fibril formation, binding of the dye Congo red, and the characteristic cross-beta x-ray diffraction pattern. Microcrystals of this peptide also share the principal properties of the fibrillar amyloid, including a highly stable, beta-sheet-rich structure and the binding of Congo red. The x-ray powder pattern of the microcrystals, extending to 0.9-Å resolution, yields the unit cell dimensions of the well-ordered structure. These dimensions restrict possible atomic models of this amyloid-like structure and demonstrate that it forms packed, parallel-stranded beta-sheets. The unusually high density of the crystals shows that the packed beta-sheets are dehydrated, despite the polar character of the side chains. These results suggest that amyloid is a highly intermolecularly bonded, dehydrated array of densely packed beta-sheets. This dry beta-sheet could form as Sup35 partially unfolds to expose the peptide, permitting it to hydrogen-bond to the same peptide of other Sup35 molecules. The implication is that amyloid-forming units may be short segments of proteins, exposed for interactions by partial unfolding.
- Liu Y, Gotte G, Libonati M, Eisenberg D. (2001).
A domain-swapped RNase A dimer with implications for amyloid formation.
Nat. Struct. Biol.. Mar 2001. 8(3):211-4.
[Abstract]
Bovine pancreatic ribonuclease (RNase A) forms two types of dimers (a major and a minor component) upon concentration in mild acid. These two dimers exhibit different biophysical and biochemical properties. Earlier we reported that the minor dimer forms by swapping its N-terminal alpha-helix with that of an identical molecule. Here we find that the major dimer forms by swapping its C-terminal beta-strand, thus revealing the first example of three-dimensional (3D) domain swapping taking place in different parts of the same protein. This feature permits RNase A to form tightly bonded higher oligomers. The hinge loop of the major dimer, connecting the swapped beta-strand to the protein core, resembles a short segment of the polar zipper proposed by Perutz and suggests a model for aggregate formation by 3D domain swapping with a polar zipper.
- Ogihara NL, Ghirlanda G, Bryson JW, Gingery M, DeGrado WF, Eisenberg D. (2001).
Design of three-dimensional domain-swapped dimers and fibrous oligomers.
Proc. Natl. Acad. Sci. U.S.A.. Feb 2001. 98(4):1404-9.
[Abstract]
Three-dimensional (3D) domain-swapped proteins are intermolecularly folded analogs of monomeric proteins; both are stabilized by the identical interactions, but the individual domains interact intramolecularly in monomeric proteins, whereas they form intermolecular interactions in 3D domain-swapped structures. The structures and conditions of formation of several domain-swapped dimers and trimers are known, but the formation of higher order 3D domain-swapped oligomers has been less thoroughly studied. Here we contrast the structural consequences of domain swapping from two designed three-helix bundles: one with an up-down-up topology, and the other with an up-down-down topology. The up-down-up topology gives rise to a domain-swapped dimer whose structure has been determined to 1.5 Å resolution by x-ray crystallography. In contrast, the domain-swapped protein with an up-down-down topology forms fibrils as shown by electron microscopy and dynamic light scattering. This demonstrates that design principles can predict the oligomeric state of 3D domain-swapped molecules, which should aid in the design of domain-swapped proteins and biomaterials.
- Xenarios I, Fernandez E, Salwinski L, Duan XJ, Thompson MJ, Marcotte EM, Eisenberg D. (2001).
DIP: The Database of Interacting Proteins: 2001 update.
Nucleic Acids Res.. Jan 2001. 29(1):239-41.
[Abstract]
The Database of Interacting Proteins (DIP; http://dip.doe-mbi.ucla.edu) is a database that documents experimentally determined protein-protein interactions. Since January 2000 the number of protein-protein interactions in DIP has nearly tripled to 3472 and the number of proteins to 2659. New interactive tools have been developed to aid in the visualization, navigation and study of networks of protein interactions.
2000
- Steere B, Eisenberg D. (2000).