Loop Modelling 1.0
An option in WHAT IF scans the entire PDB (or PDB_REDO, or any
subset of either of those two) for loops that could fit nicely in your protein.
A loop fits when it has the appropriate length and when the
N first and N last residues of the loop found in the PDB match the N
residues before and the N residues after the point of insertion in your
protein with an RMSd after optimal superposition below your desired
cut-off. N is typically 2, 3, or 4.
The algorithm is conceptually explained in the figure to the left.
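The fitting criterion can be illustrated with a minimal sketch. It is not the WHAT IF or LoopFinder code. It assumes one coordinate per anchor residue (for example the C-alpha atom), which the text above does not specify, and it computes the RMSd after optimal superposition with Horn's quaternion method, using a small Jacobi eigenvalue routine so that only the standard library is needed. The names superposed_rmsd and anchors_match are hypothetical.

```python
import math

def _largest_eigenvalue(matrix, max_sweeps=50):
    """Largest eigenvalue of a small symmetric matrix (cyclic Jacobi rotations)."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-22:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q], kept within +/- 45 degrees.
                num, den = 2.0 * a[p][q], a[q][q] - a[p][p]
                if den < 0.0:
                    num, den = -num, -den
                theta = 0.5 * math.atan2(num, den)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return max(a[i][i] for i in range(n))

def superposed_rmsd(mobile, target):
    """RMSd between two equally long lists of (x, y, z) after optimal superposition."""
    n = len(mobile)
    if n == 0 or n != len(target):
        raise ValueError("both anchors need the same, non-zero number of points")
    cm = [sum(p[k] for p in mobile) / n for k in range(3)]
    ct = [sum(p[k] for p in target) / n for k in range(3)]
    x = [[p[k] - cm[k] for k in range(3)] for p in mobile]
    y = [[p[k] - ct[k] for k in range(3)] for p in target]
    # Correlation matrix S[a][b] = sum_i x[i][a] * y[i][b]
    s = [[sum(x[i][a] * y[i][b] for i in range(n)) for b in range(3)]
         for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    key = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    norm = sum(v * v for p in x + y for v in p)
    msd = max(0.0, (norm - 2.0 * _largest_eigenvalue(key)) / n)
    return math.sqrt(msd)

def anchors_match(loop_anchor, protein_anchor, cutoff):
    """True when the 2N anchor points superpose with an RMSd below the cut-off."""
    return superposed_rmsd(loop_anchor, protein_anchor) < cutoff
```

For N = 2, each list would hold four points: the two residues before and the two residues after the loop. The cut-off value is the user's choice; the sketch does not suggest one.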
Unfortunately, this WHAT IF option easily takes 6-12 hours of CPU time to scan 120K
PDB entries for just one loop. We therefore wrote dedicated filter software,
called LoopFinder. LoopFinder typically scans the whole PDB in minutes rather
than hours. LoopFinder runs over all PDB files and produces
several output files that hold information about loops in the PDB that
fit in your protein at some location where a loop is missing or should
be replaced by protein engineering. One of the output files is a list with
names of PDB files that WHAT IF should look at to find all loops that
were already found by LoopFinder. So, LoopFinder is a fast screening filter
for WHAT IF; but it can, of course, equally well be a fast filter for your
The output files
LoopFinder produces the following output files (and a log file; see below):
- NEW_PDB.LIS holds really minimal loop
- PDB.LIS: This file is meant as input to
the WHAT IF loop matching option (that will, by the way, do many
- LoopFinder.HIT holds the same information
as NEW_PDB.LIS, but additionally provides for each loop the
transformation matrix needed to fit the PDB loop in the user protein.
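As an illustration of what such a transformation does, the sketch below applies a rotation and a translation to loop coordinates. The layout of the matrix in LoopFinder.HIT is not described here, so the sketch assumes a 3x3 rotation matrix R and a translation vector t applied as x' = R x + t. The function name apply_transform is hypothetical, and the convention must be checked against the actual file before use.

```python
def apply_transform(coords, rotation, translation):
    """Map each (x, y, z) to R x + t (assumed convention, not read from LoopFinder.HIT)."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            rotation[r][0] * x + rotation[r][1] * y + rotation[r][2] * z
            + translation[r]
            for r in range(3)
        ))
    return moved

# Example: rotate 90 degrees about the z axis, then shift 1 unit along x.
rot_z_90 = [[0.0, -1.0, 0.0],
            [1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0]]
shift = (1.0, 0.0, 0.0)
fitted = apply_transform([(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)], rot_z_90, shift)
```

With these inputs the first point moves to (1, 1, 0) and the second to (-1, 0, 0).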
The input file
LoopFinder needs some information from you about what it must do. You
have to provide that information in the file
The log file
LoopFinder writes a log file called LoopFinder.LOG.
This file is only useful for troubleshooting: send it to G Vriend when
there is something you do not understand, or something you believe is going wrong.
To make its predictions, LoopFinder scans three of the files from its data
directory. These files are big. LoopFinder starts by estimating
how many blocks of 100,000 lines need to be read from those files, and
during the run it tells you, after every hit and after each 100,000 lines, how many blocks
it has already done. The database is explained
Installing software and databases
LoopFinder was designed to facilitate research by others.
We therefore did everything possible to keep the set-up simple and
flexible. Consequently, there are no installers, no environment
variables, etcetera, just one source code and a set of files that we
call the database.
To get your own LoopFinder, you must first obtain the file
LoopFinder.f and compile it with the Linux command:
gfortran -O2 -o LoopFinder LoopFinder.f
Second, you need the database(s). I suggest you start by getting just the
database for anchor length 2 and obtain the other database(s) at some later
stage. You can obtain the database files by executing 26 times the command:
in which xxx should take all values from 105 through 130.
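The 26 commands can be generated mechanically. The sketch below is only an illustration: the actual download command is not reproduced in this text, so the template is a placeholder that you must replace with the real command, keeping xxx where the number belongs. The names expand_commands and run_sequentially are hypothetical.

```python
import subprocess

def expand_commands(template, first=105, last=130, placeholder="xxx"):
    """Return one command per value of xxx, from first through last inclusive."""
    return [template.replace(placeholder, str(value))
            for value in range(first, last + 1)]

def run_sequentially(commands):
    """Run the commands one at a time, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, shell=True, check=True)

# Placeholder template; substitute the real download command here.
commands = expand_commands("echo fetching part xxx")
```

Calling run_sequentially(commands) would then execute the 26 commands in order, one after the other.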
Be aware that most of these files are several gigabytes in size, so do not sit and
wait for the files to arrive. (You can try to use this
script to obtain the database. Make sure
you give the script execute permission with chmod 777 Get_LoopFinder_Database
and that you run it in the directory where the database files should finally
reside. Use ./Get_LoopFinder_Database to run the script. Running multiple wget
commands in parallel will slow down the download rather than speed it up.)
Results and discussion
We tested the software extensively on a series of published studies in which
loops were designed or transplanted.