Instructions for running RosettaDesign (rDesign) (last modified 12/02/03)
rDesign can be run in several different modes. They include:
1) repacking side chains on a fixed backbone
2) redesigning on a fixed backbone
3) redesigning with a flexible backbone
See below for a description of the files and commands you will need to run
in these different modes.
The RosettaDesign distribution comes with directories that contain
sample runs for the different modes.
------------------------------------------------------------------------------
files you will want in your running directory:
1) starting pdb structure
2) paths.txt (this file specifies the location of input/output
for rDesign, a default paths.txt file is supplied with the
rosetta source code)
3) resfile (this file is optional, it is used to specify a subset
of residues to redesign. The format for this file is
described below.)
additional files for designing with a flexible backbone
4) fragment files (see below)
5) a pdb file with a name of the form xxxx.pdb that agrees
with the fragment names (see below)
other files/directories you will need
rosetta_database directory (this directory is supplied with rosetta
and contains a number of databases needed by rosetta. This
directory can be anywhere in your file system, it is just
necessary to specify the location in the paths.txt file.)
------------------------------------------------------------------------------
editing the paths.txt file
most lines can be left unchanged
make sure the path for 'data files' points to your rosetta_database
if running with a flexible backbone, make sure that the names
for the fragment files are correctly defined.
------------------------------------------------------------------------------
obtaining fragment files for designing with a flexible backbone
fragment files can be obtained at:
http://robetta.bakerlab.org/fragmentsubmit.jsp
------------------------------------------------------------------------------
command line flags:
flags common to all modes of rosettadesign
-design :tells rosetta to run in design mode
-s [filename] :starting structure
-l [filename] :list of starting structures
-resfile [filename] :specifies which residues to redesign
-ex1 :extra chi 1 rotamers
-ex2 :extra chi 2 rotamers
-extrachi_cutoff [integer] :residues with more neighbors than
the specified number get extra
chi 1 and chi 2 rotamers if
ex1 and ex2 are true
-ala_ref [real] :this energy will be subtracted
from all ala
-cys_ref [real] :ditto for cys
flags for repacking sidechains
-onlypack :just repack sidechains
flags for redesigning a fixed backbone
-fixbb :just redesign sequence
-ndruns [int] :number of design runs, used when fixbb set true
-pdbout [char] :specify prefix for pdb files that will be output,
only used when fixbb set true
flags for designing with a flexible backbone
-mvbb :move the backbone
------------------------------------------------------------------------------
example command lines:
redesign using a resfile, output 3 structures each with a name that
begins with test1
% pFOLD.absoft -s 1ubq.pdb -design -fixbb -resfile restest
-ndruns 3 -pdbout test1 >& log
just repack a protein with extra chi1 rotamers
% pFOLD.absoft -s 1ubq.pdb -design -onlypack -ex1
move the backbone and design:
-in this case the 3 standard arguments to rosetta must be used. these are:
(1) a two letter id code, for example 'ab'
(2) 4 character code name for the protein, this must agree with
the name of the fragment files and a pdb file named
xxxx.pdb that has the same number of residues as the
fragments
(3) chain id
-your paths.txt file must point to the appropriate fragment files
-the starting structure must be idealized, see below for a description
on how to idealize a structure
% pFOLD.absoft aa 1hz5 _ -s 1hz5_idl.pdb -design -mvbb >& log
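The fixed-backbone example above can also be assembled from a script. The following is a minimal Python sketch and is not part of the rDesign distribution: the helper name build_fixbb_command is hypothetical, and it uses only flags listed in this guide.

```python
import shlex


def build_fixbb_command(executable, pdb, resfile=None, ndruns=1, prefix=None,
                        ex1=False, ex2=False):
    """Assemble an argument list for a fixed-backbone design run.

    Hypothetical helper, not shipped with rosetta. It only emits flags
    that are documented in this guide.
    """
    args = [executable, "-s", pdb, "-design", "-fixbb"]
    if resfile is not None:
        args += ["-resfile", resfile]
    if ex1:
        args.append("-ex1")
    if ex2:
        args.append("-ex2")
    args += ["-ndruns", str(ndruns)]
    if prefix is not None:
        args += ["-pdbout", prefix]
    return args


# Same run as the first example: 3 designs, output names begin with test1.
cmd = build_fixbb_command("pFOLD.absoft", "1ubq.pdb",
                          resfile="restest", ndruns=3, prefix="test1")
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT), with log an open file, to capture both output streams the way ">& log" does.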
------------------------------------------------------------------------------
how to create a resfile: an input file that specifies which residues you
would like to redesign and with which amino acids
to make initial file use the program makeresfile provided with rosetta
command line arguments:
-p [protein name]
-nchains [integer] : default is 1
-chain [protein chain] (default _)
-chain2 [protein chain2]
-chain3 [protein chain3]
-chain4 [protein chain4]
-resfile [output file name]
-default [id] (default setting for each residue, default is NATRO)
example:
%> makeresfile -p 1ubq.pdb -default NATAA -resfile restest
Once the initial resfile is made, you can edit it by hand to select the
residues that will be redesigned. Here is a sample resfile. All lines
before the start are just header explaining the format of the file.
---- sample resfile ----------------------------------
Column 2: Chain
Column 4-7: sequential residue number
Column 9-12: pdb residue number
Column 14-18: id (described below)
Column 20-40: amino acids to be used
NATAA => use native amino acid
ALLAA => all amino acids
NATRO => native amino acid and rotamer
PIKAA => select individual amino acids (the program is dumb,
leave two spaces before inserting one letter codes)
POLAR => polar amino acids
APOLA => apolar amino acids
The following demo lines are in the proper format
A 1 3 NATAA
A 2 4 ALLAA
A 3 6 NATRO
A 4 7 NATAA
B 5 1 PIKAA DFLM (two spaces before D important)
B 6 2 PIKAA HIL
B 7 3 POLAR
-------------------------------------------------
start
_ 1 1 NATAA
_ 2 2 NATAA
_ 3 3 NATAA
_ 4 4 PIKAA LKDIG
_ 5 5 NATAA
_ 6 6 POLAR
_ 7 7 NATAA
_ 8 8 NATRO
_ 9 9 NATAA
_ 10 10 ALLAA
_ 11 11 NATAA
_ 12 12 NATAA
_ 13 13 NATAA
_ 14 14 NATAA
.
.
.
_ 73 73 NATAA
_ 74 74 NATAA
_ 75 75 NATAA
_ 76 76 NATAA
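Because the resfile is column based, hand edits can easily shift a field. The following Python sketch formats body lines from the column layout given in the resfile header. It rests on several assumptions that are not confirmed by this guide: numbers are right-justified in their fields, the sequential and pdb residue numbers are equal, and the one-letter codes start two spaces after the id (the header lists columns 20-40 for the amino acids but also asks for two spaces; the sketch follows the two-space note). The function names are hypothetical. Compare the result with a file written by makeresfile before relying on it.

```python
VALID_IDS = {"NATAA", "ALLAA", "NATRO", "PIKAA", "POLAR", "APOLA"}


def resfile_line(chain, seq_num, pdb_num, res_id, amino_acids=""):
    """Format one resfile body line.

    Assumed layout (not checked against the rosetta source): chain in
    column 2, numbers right-justified in columns 4-7 and 9-12, id in
    columns 14-18, then two spaces before any one-letter codes.
    """
    if res_id not in VALID_IDS:
        raise ValueError("unknown id: %s" % res_id)
    if res_id == "PIKAA" and not amino_acids:
        raise ValueError("PIKAA needs a list of one-letter codes")
    line = " %1s %4d %4d %-5s" % (chain, seq_num, pdb_num, res_id)
    if amino_acids:
        line += "  " + amino_acids
    return line


def resfile_body(chain, n_residues, overrides):
    """Return lines for residues 1..n_residues, defaulting to NATAA.

    overrides maps a residue number to an (id, amino_acids) pair.
    """
    lines = []
    for i in range(1, n_residues + 1):
        res_id, aas = overrides.get(i, ("NATAA", ""))
        lines.append(resfile_line(chain, i, i, res_id, aas))
    return lines


# First ten residues of the sample above.
body = resfile_body("_", 10, {4: ("PIKAA", "LKDIG"), 6: ("POLAR", ""),
                              8: ("NATRO", ""), 10: ("ALLAA", "")})
print("\n".join(body))
```

The lines are meant to be placed after the "start" line of an existing resfile; the header and the start line themselves are best left as makeresfile wrote them.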
------------------------------------------------------------------------------
how to idealize a structure: creates a structure that resembles the starting
structure but has ideal bond lengths and angles.
command line:
pFOLD.gnu -s 1ubq.pdb -idealize -fa_input >& log_idealize &
------------------------------------------------------------------------------
Troubleshooting
1) The program seg faults immediately. Check if your box has enough memory. You can use the command size -A [executable name] to see how much memory the program wants.
2) You get the error message that max_res is exceeded. Increase max_res in param.h and recompile. If the program now requires too much memory try lowering maxrot.
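Before recompiling, it can help to check how many residues the starting structure actually contains and compare that number with max_res in param.h. The following Python sketch counts residues from the ATOM records using the standard PDB fixed columns. The function name is hypothetical, and this guide does not state the value of max_res, so the comparison is left to the reader.

```python
def count_residues(pdb_lines):
    """Count distinct residues among the ATOM records of a PDB file.

    Uses the standard PDB fixed columns: chain id in column 22, residue
    number plus insertion code in columns 23-27. HETATM records (waters,
    ligands) are ignored.
    """
    seen = set()
    for line in pdb_lines:
        if line.startswith("ATOM"):
            seen.add((line[21], line[22:27]))
    return len(seen)


# Small illustrative input: two residues of chain A plus one water.
sample = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
    "ATOM      9  N   GLN A   2      26.335  27.770   3.258  1.00  9.07           N",
    "HETATM  603  O   HOH A  77      45.747  30.081  19.708  1.00 12.43           O",
]
print(count_residues(sample))
```

For a real file, pass the open file object instead of the list, for example count_residues(open("1ubq.pdb")).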
RosettaDesign runs on Linux machines. The executable is installed at:
/joule2/programs/rosetta/RosettaDesign/rosetta/pFOLD.gnu
The documentation is available from:
/joule2/programs/rosetta/RosettaDesign/README_user_guide
Examples are available from:
/joule2/programs/rosetta/RosettaDesign/example_fixbb_design
/joule2/programs/rosetta/RosettaDesign/example_mvbb_design
The program is available from
http://depts.washington.edu/ventures/UW_Technology/Express_Licenses/Rosetta/
but UCLA's technology transfer office has indicated that they will not
accept the terms of the RosettaDesign academic license for the source code.
So, they sent us an executable version of RosettaDesign.
Reference:
Simons et al. J Mol Biol 1997;268:209-225. Assembly of protein tertiary
structures from fragments with similar local sequences using simulated
annealing and Bayesian scoring functions.