Bindres
Introduction
The Binding Response, first described in ref. [1], is a method for selecting binding sites on a protein for virtual database screening. As outlined in Scheme 1 of ref. [1], it involves several steps, performed by different computational chemistry programs. Shijun Zhong provided a script named br.csh that calls all these programs in order and performs the necessary format conversions. The present script, bindres, was written by Kenno Vanommeslaeghe. It essentially replicates the behavior of br.csh, but is more flexible.
Summary of new features:
- lets the user control some parameters of the Binding Response procedure by setting command-line parameters
- allows the different binaries to be located in different directories
- can be called from any location on the system, and the names of the input files are not fixed
- can restart a calculation without having to redo every step
- can be used to perform virtual screening on some or all of the freshly identified binding sites
- produces shell scripts for running dock on a different machine
Installation
In principle, it suffices to untar the "Binding Response" distribution and to set an environment variable "brdir" that points to the installation directory. For example:
cd
gunzip -c binding_response.tar.gz | tar xf -
echo "brdir=$HOME/binding_response" >> .profile
echo "export brdir" >> .profile
. .profile
All bindres-related files are in the subdirectory bindres. Bindres calls the following binaries:
- dms (UCSF; see the note on dms below)
- sphgen, showbox, grid, dock (Dock 4.0; UCSF; see the note on Dock versions below)
- charmm (Harvard University)
- br.exe (Shijun Zhong; University of Maryland, School of Pharmacy)
It searches for these binaries first in $brdir/bin (or $brbin, if it is set) and then in $PATH.
Note on dms: dms generally won't run unless it is properly installed; i.e., just copying the binary to another machine doesn't work. Also, different versions of DMS often give slightly different results. During some test runs, even the same version of DMS gave different results on different architectures.
Note on Dock versions: only limited testing was performed with versions of Dock other than Dock 4.0. In general, different versions produce different results. Also, the current grid.in and dock.in are not compatible with Dock 6 and must be edited before they are used with that version.
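The search order described above can be checked before a run with a small shell sketch. It is not taken from bindres itself: the helper name find_binary is hypothetical, and the only assumption is the documented order ($brbin if set, otherwise $brdir/bin, then $PATH).

```sh
#!/bin/sh
# Sketch only: mimics the documented search order, not bindres' actual code.
# find_binary is a hypothetical helper name.
find_binary() {
    name=$1
    # $brbin overrides $brdir/bin when it is set and non-empty
    bindir=${brbin:-${brdir:+$brdir/bin}}
    if [ -n "$bindir" ] && [ -x "$bindir/$name" ]; then
        printf '%s\n' "$bindir/$name"
        return 0
    fi
    # fall back to $PATH; returns non-zero if the name is not found
    command -v "$name"
}

# Report where each binary listed above would be picked up from.
for prog in dms sphgen showbox grid dock charmm br.exe; do
    if path=$(find_binary "$prog"); then
        echo "$prog: $path"
    else
        echo "$prog: not found" >&2
    fi
done
```

Running this in the shell that will later start bindres shows which copy of each program would be used, which helps when several Dock or DMS versions are installed side by side.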
Usage
Invoking bindres without any arguments prints the following instructions:
Generates sphere clusters on a protein surface and performs docking on them.
Usage: bindres [-t <top>] [-c <col>] [-f] [-e]
[-b '<binding sites>'] [-l <ligand[%]>] [-p <nproc>] [-n]
[-o <orderly>] [-s] <pdb file> [<mol2 file> [<library>]]
Where: <pdb file> does not have hydrogens
(it is recommended to add the suffix "_noh")
<mol2 file> does have hydrogens and Gasteiger charges. If this file is
omitted, the program will gracefully exit after saving the top <top> sphere
clusters.
<library> is a mol2 library of compounds to dock. If omitted, the
program will gracefully exit after generating the docking grids. If <library>
is not found on the specified location (relative to the current working
directory), the program will search for it in the installation directory.
For example, specifying "ligand1000.mol2" reproduces the behavior of the
original binding response script.
-t <top> is the number of sphere clusters (ie. binding sites) to save.
Default=10.
-c <col> is the column in the .cst file on which to prioritize
the sphere clusters (ie. binding sites).
Possible values: 2 = the number of spheres (default)
3 = the standard deviation
4 = the maximum distance
-f force redoing the whole procedure. If this switch is not present, the
program will try to re-use any files already present on the system.
-e request extra output, ie. an ascii file to make a histogram of the
clusters and a pdb file containing only the centers of the clusters.
-b '<binding sites>' selects the binding sites (ie. sphere clusters)
used for docking by means of a space-separated list of 3-digit numbers
enclosed within single quotes. If -b is omitted, the program will prompt for
a selection, allowing the user to visually examine the sphere clusters first.
Wildcards are allowed.
Examples: -b '???' selects all binding sites, reproducing the behavior of the
original binding response script.
-b '0[01][0-9] 020' selects binding sites 000-020. Note that the
highest ranking binding site is designated 001, so 000 is silently ignored.
-l <ligand[%]> is the number of ligands used to calculate the final
binding response for each binding site. If followed by a % sign, it designates
a percentage of <library>. Default: 90% if <library> contains 100 compounds
or more, otherwise 100% .
-p <nproc> is the number of sites to dock simultaneously. Default=1.
Setting this to the number of processors in your machine is not always a good
idea; if you have multiple processors, you may still want to use "-p 1" and
specify "parallel_jobs yes" in your dock.in file instead.
-n run dock now. Docking is started on the background. If this switch is
not present, a shell script for running dock will be created but not started.
-o <orderly> controls to which extent the output files of the different
programs are cleaned up. Possible values:
0 = leave everything
1 = leave the files needed to restart the procedure from any point
(default)
2 = reproduce the behavior of the original binding response script
(not optimal for bindres!)
3 = delete everything except the final report (not recommended!!!)
-s create a separate script for calculating the binding response.
If this switch is not present, the docking script will take care of this.
Note that it *generally* doesn't make sense to combine this switch with -n.
Example: bindres -feno 2 -b '*' protein.pdb protein.mol2 ligand1000.mol2
Will reproduce the behavior of the original binding response script, except
that it works in any directory, that the input files can have a different
name than "protein", and that a different directory structure is created.
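As an alternative to wildcards, the -b argument can be given as an explicit space-separated list of 3-digit numbers. The sketch below builds such a list for a numeric range. The helper name site_list and the file name protein_noh.pdb are illustrative assumptions; the command is only printed, not executed.

```sh
#!/bin/sh
# site_list FIRST LAST: print binding-site labels FIRST..LAST as zero-padded
# 3-digit numbers separated by spaces. Hypothetical helper; pass plain
# decimal integers (no leading zeros).
site_list() {
    i=$1
    out=
    while [ "$i" -le "$2" ]; do
        out="$out$(printf '%03d' "$i") "
        i=$((i + 1))
    done
    # drop the trailing space
    printf '%s\n' "${out% }"
}

# Build the selection for sites 001-020 and show the resulting command line.
sites=$(site_list 1 20)
echo "bindres -b '$sites' protein_noh.pdb protein.mol2 ligand1000.mol2"
```

An explicit list avoids the silently ignored 000 entry that the wildcard form '0[01][0-9] 020' produces.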
Details
Methodology
This section is a step-by-step description of the procedure, including the effect of the different command-line parameters.
Bindres will:
Remarks:
- Throughout the procedure, bindres will try to re-use intermediary files already present on the system. This is useful to resume a crashed job or to redo the docking on another set of binding sites or with another library of ligands. However, under some circumstances, you may want to restart the calculation cleanly from scratch. In that case, use the -f switch.
- Throughout the procedure, unless -o 0 is specified, bindres will delete files generated by the different subprograms to avoid cluttering the directory structure and filling the hard disk. Note that -o 2 is only provided for users who want rigorous compatibility with br.csh. The behavior of -o 2 is arbitrary and it may break the "re-runnability" of bindres at certain stages. Similarly, -o 3 deletes as much as it can at every stage and will completely break the "re-runnability".
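The re-use behavior in the first remark follows a common shell pattern: skip a step when its output already exists, unless a force flag is set. The sketch below illustrates that pattern only. It is not bindres' implementation; the names run_step, force and surface.out are hypothetical.

```sh
#!/bin/sh
# run_step OUTPUT COMMAND...: run COMMAND only if OUTPUT is missing or empty,
# or if force=1 (analogous to the -f switch). Hypothetical helper.
force=${force:-0}
run_step() {
    output=$1
    shift
    if [ "$force" != 1 ] && [ -s "$output" ]; then
        echo "re-using $output"
        return 0
    fi
    echo "generating $output"
    "$@"
}

# Demonstration in a throwaway directory.
workdir=$(mktemp -d)
run_step "$workdir/surface.out" sh -c "echo surface > '$workdir/surface.out'"  # generates
run_step "$workdir/surface.out" sh -c "echo surface > '$workdir/surface.out'"  # re-uses
force=1
run_step "$workdir/surface.out" sh -c "echo surface > '$workdir/surface.out'"  # regenerates
rm -rf "$workdir"
```

The pattern also shows why aggressive cleanup (-o 2 or -o 3) interferes with restarting: once the intermediate files are deleted, there is nothing left for a later run to re-use.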
breport
breport is a self-contained script that can be called from the working directory as follows:
breport <subdir>
where <subdir> is the subdirectory that holds the bindres output, including the draft report.
breport copies the draft report generated by bindres from the subdirectory to the working directory and appends a table of binding responses, optionally recalculating the binding response using only a subset of the ligands. Specifically, it uses the <ligand[%]> ligands with the best binding response, where <ligand[%]> is determined by bindres' -l option (<ligand[%]> is passed from bindres to breport via the .par file). Recalculating the binding response with a different subset can be accomplished either by editing the .par file (not recommended because the table will not be consistent with the part that was copied from the draft report) or rerunning bindres with a different -l value and without the -f switch.
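To make the -l rule concrete, the sketch below counts the compounds in a mol2 library and works out how many ligands a given <ligand[%]> value selects. The function names are hypothetical, and rounding percentages down is an assumption; the text does not say how bindres or breport rounds, or how they treat a count larger than the library.

```sh
#!/bin/sh
# count_molecules FILE: number of entries in a multi-molecule mol2 file,
# counted via the @<TRIPOS>MOLECULE record that starts each entry.
count_molecules() {
    grep -c '^@<TRIPOS>MOLECULE' "$1"
}

# ligands_used TOTAL [SPEC]: SPEC is a count ("500") or a percentage ("90%").
# With no SPEC, apply the documented default: 90% for libraries of 100
# compounds or more, otherwise 100%.
ligands_used() {
    total=$1
    spec=$2
    if [ -z "$spec" ]; then
        if [ "$total" -ge 100 ]; then spec=90%; else spec=100%; fi
    fi
    case $spec in
        *%)
            pct=${spec%\%}
            # assumption: integer division, i.e. fractions are rounded down
            echo $(( total * pct / 100 ))
            ;;
        *)
            echo "$spec"
            ;;
    esac
}

ligands_used 1000        # prints 900 (default for a 1000-compound library)
ligands_used 1000 25%    # prints 250
ligands_used 50          # prints 50 (small library, default 100%)
```

For a real library, the total would come from something like `ligands_used "$(count_molecules ligand1000.mol2)"`.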
Environment variables and files
Bindres searches for files at different locations. This process is controlled by environment variables. Bindres needs $brdir to point to the binding_response installation directory. It accesses the following subdirectories:
- bin (or $brbin, if it is set): bindres searches for binaries at this location before resorting to $PATH. The following binaries are called:
  - dms (UCSF; see the dms note under Installation): generates the protein surface
  - sphgen (Dock 4.0; UCSF; see the Dock note under Installation): generates spheres on the surface
  - showbox (Dock 4.0; UCSF; see the Dock note under Installation): generates a box around a sphere set
  - grid (Dock 4.0; UCSF; see the Dock note under Installation): generates a docking grid
  - dock (Dock 4.0; UCSF; see the Dock note under Installation): performs docking
  - charmm (Harvard University): general-purpose computational chemistry program; invoked to re-cluster the spheres
  - br.exe (Shijun Zhong; University of Maryland, School of Pharmacy): calculates the binding response
- bindres (or $brscript, if it is set): here, bindres searches for the following files:
  - sph_cut, a slightly modified version of sph_cut.pl by Shijun Zhong that allows for arbitrary file names. sph_cut.pl takes a sphgen-generated .sph file and creates a separate .sph file containing only the clustered spheres.
  - clustst5.inp, a CHARMM script for reclustering the sphgen-generated spheres
  - predockbr, a file that gets prepended to the shell scripts bindres generates for running dock and br.exe
  - grid.in and dock.in, which bindres uses as input files for grid and dock, respectively. Using a version of Dock other than Dock 4 (in particular Dock 6) requires editing these files.
  - breport, a separate script that generates a report after br.exe is finished
- parameter (or $brparam, if it is set):
  - grid.in and dock.in refer to the parameter files vdw_AMBER_parm99.defn, flex.defn, flex_drive.tbl and chem.defn, which are expected to be present in this directory. This behavior can be altered by editing grid.in and dock.in.
  - bindres searches for <library> (i.e., the library of compounds to be docked) in this directory before resorting to the current working directory (i.e., the directory from which bindres was invoked).
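The directory layout above can be verified with a short shell sketch. It assumes only what the list states: $brscript and $brparam override the bindres and parameter subdirectories of $brdir, and the named files are expected there. The function name check_install is hypothetical and not part of the distribution.

```sh
#!/bin/sh
# check_install: report files missing from the script and parameter
# directories. Returns 0 if everything is present, 1 if files are missing,
# 2 if $brdir is not set. Hypothetical helper, not part of bindres.
check_install() {
    if [ -z "$brdir" ]; then
        echo "brdir is not set" >&2
        return 2
    fi
    # an override variable wins; otherwise use the subdirectory of $brdir
    scriptdir=${brscript:-$brdir/bindres}
    paramdir=${brparam:-$brdir/parameter}
    missing=0
    for f in sph_cut clustst5.inp predockbr grid.in dock.in breport; do
        [ -e "$scriptdir/$f" ] || { echo "missing: $scriptdir/$f" >&2; missing=1; }
    done
    for f in vdw_AMBER_parm99.defn flex.defn flex_drive.tbl chem.defn; do
        [ -e "$paramdir/$f" ] || { echo "missing: $paramdir/$f" >&2; missing=1; }
    done
    return "$missing"
}

if check_install; then
    echo "installation looks complete"
fi
```

If grid.in and dock.in have been edited to point at other parameter files, the second file list would need to be adjusted accordingly.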
References
[1] S. Zhong, A. D. MacKerell Jr., J. Chem. Inf. Model. 2007, 00, 0000-0000.