
GPU-SC: GPU-accelerated calculation of shape complementarity

Overview

The gpu-sc program is a port of CCP4's sc program, which calculates the Lawrence and Colman shape complementarity statistic of two interacting molecular surfaces, such as protein-protein and protein-ligand interfaces. It takes advantage of CUDA-capable GPUs (such as NVIDIA GeForce, Quadro or Tesla video cards) to achieve up to a 200x speed-up in run time without sacrificing accuracy. Without a GPU, the calculation is performed on the CPU.
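
For orientation, the per-point score behind the statistic can be sketched in a few lines of Python. This is an illustration based on the published definition of the statistic, not code taken from gpu-sc or CCP4 sc: the function name and argument layout are hypothetical, and the default weight of 0.5 is simply the default of the -w option.

```python
import math

def local_complementarity(normal_a, inward_normal_b, distance, weight=0.5):
    """Score one surface point on molecule A against its nearest point on B.

    normal_a        -- outward unit normal at the point on A
    inward_normal_b -- inward unit normal at the nearest point on B
    distance        -- separation between the two points, in Angstrom
    weight          -- distance weighting (assumed to match the -w default)
    """
    dot = sum(a * b for a, b in zip(normal_a, inward_normal_b))
    # Aligned normals score high; the exponential down-weights distant pairs.
    return dot * math.exp(-weight * distance ** 2)

# Perfectly matched normals in direct contact give the maximum score of 1.
print(local_complementarity((0, 0, 1), (0, 0, 1), 0.0))
```

The score falls off both when the normals disagree and when the two surfaces are far apart, which is why a tight, well-fitted interface yields values close to 1.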

Care has been taken to port the original Fortran code to C++ with minimal changes to the algorithm. The Grasp input/output code has not been ported and is not available in this version. The interface has been changed from STDIN-driven commands to command-line flags for ease of use. Two output formats are available: short and detailed. The short format reports just the sc statistic, the mean interface separation and the contact area. The detailed format contains essentially all the information reported by the original sc code, but formatted differently.

The code is freely available for download below. It has been tested on 32-bit and 64-bit Linux (various flavors) on our computational cluster. At MBI, the gpu-sc binaries are available in /software/tools on cassini.

Please direct feedback and comments to Luki Goldschmidt, luki@mbi.ucla.edu.

Obtaining and Compiling the code

The code is available for download here. Simply extract the archive and build the binary with make. To compile with GPU support, you will need NVIDIA's nvcc compiler, available here. For execution, only the sc binary and the sc_radii.lib file are required (gpu-sc also links against CUDA's runtime shared library, which needs to be available at runtime). Pre-compiled binaries for 64-bit Linux (CentOS 5.4) are available here (compiled against CUDA Toolkit v4.0.17).

Usage

Run the program with no command line arguments to obtain a brief usage note:

This program computes the Lawrence & Coleman shape complementarity
between two molecules (based on the sc code from CCP4).
Luki Goldschmidt, March 2011

Usage: [options]

Options:
-1: Molecule 1 chain IDs (default: first chain)
-2: Molecule 2 chain IDs (default: second chain)
-q: Quick mode settings (lower accuracy)
-v: Show verbose output
-l: Filename of the atom radius library (default: sc_radii.lib)
-r: Probe radius (default: 1.7A)
-t: Trimming distance for the peripheral shell (default: 1.5A)
-w: Weight factor used in sc calculation (default: 0.5A)
-d: Dot density for molecular surface (default: 15 dots/A^2)
-s: Interface separation distance (default: 8A)
-g: Enable CUDA GPU acceleration (default: use if available)
-T: GPU thread limit (default: max supported by hardware, max 1024)

The simplest way to use the program is to provide one or more PDB files as arguments. By default, the first two chains in the PDB file will be used as molecule 1 and 2, respectively. To obtain detailed output, use the -v switch. See the example at the end.
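
For batch use, the program can be driven from a script. The Python sketch below is one possible wrapper, not part of the distribution. It assumes that the binary is at ./gpu-sc, that chain IDs are given as the value following -1 and -2, and that the short format prints one line per PDB file with the fields in the order file name, sc statistic, mean interface separation, buried area (inferred from the example output). All function and type names are hypothetical.

```python
import subprocess
from typing import List, NamedTuple, Optional, Sequence

class ScResult(NamedTuple):
    pdb_file: str
    sc: float          # shape complementarity statistic
    separation: float  # mean interface separation, in Angstrom
    area: float        # area buried in the interface, in Angstrom^2

def build_command(pdb_files: Sequence[str],
                  chains1: Optional[str] = None,
                  chains2: Optional[str] = None,
                  binary: str = "./gpu-sc") -> List[str]:
    """Assemble the argument list; chain options are only added when given."""
    cmd = [binary]
    if chains1:
        cmd += ["-1", chains1]  # assumption: chain IDs follow the flag
    if chains2:
        cmd += ["-2", chains2]
    return cmd + list(pdb_files)

def parse_short_line(line: str) -> ScResult:
    """Split from the right so a file name containing spaces stays intact."""
    name, sc, separation, area = line.rsplit(None, 3)
    return ScResult(name, float(sc), float(separation), float(area))

def run_gpu_sc(pdb_files: Sequence[str], **options) -> List[ScResult]:
    """Run the program in short-output mode and parse every non-empty line."""
    completed = subprocess.run(build_command(pdb_files, **options),
                               check=True, capture_output=True, text=True)
    return [parse_short_line(line)
            for line in completed.stdout.splitlines() if line.strip()]
```

A call such as run_gpu_sc(["/pdb/pdb2g38.ent"], chains1="A", chains2="B") would then return a list of ScResult records, provided the assumptions above hold for your build.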

Performance

We have benchmarked the GPU acceleration on three complexes: a small, a medium and a large one. The CPU on the host system was an AMD Phenom II 9950 (2.6 GHz) in all cases, except for the Tesla C2050, whose host used an Intel Xeon E5620. Run times, GPU-processing times and speed-up over the C++ CPU-only version are reported below. For reference, run times of the original sc program from CCP4 (v6.1.13) are also provided.

                     NNQQNY Peptide    2G38 M. Tb. PE/PPE Complex    1AI5 Penicillin Acylase Complex
                     30+30 residues    98+198 residues               209+557 residues
C++ sc (CPU only)    2.60 sec          28.8 sec                      418 sec
GeForce GTX580       0.15 sec (17x)    0.61 sec (47x)                2.10 sec (200x)
Tesla C2050          0.31 sec (8x)     1.10 sec (26x)                3.18 sec (131x)
GeForce 6800GT       0.35 sec (7x)     1.325 sec (22x)               9.48 sec (44x)
QuadroFX 380         0.43 sec (6x)     2.40 sec (12x)                21.4 sec (20x)
CCP4 sc (CPU only)   21.1 sec          143 sec                       1579 sec
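
The speed-up factors in parentheses are the ratio of the C++ CPU-only run time to the GPU run time. As a small worked example, the Python sketch below recomputes them for the GeForce GTX580 row from the rounded timings in the table; the function and variable names are illustrative only.

```python
def speedup(cpu_seconds, gpu_seconds):
    """Return how many times faster the GPU run is than the CPU-only run."""
    if gpu_seconds <= 0:
        raise ValueError("gpu_seconds must be positive")
    return cpu_seconds / gpu_seconds

# Timings in seconds, copied from the table (C++ CPU-only vs. GeForce GTX580).
cpu_only = {"NNQQNY": 2.60, "2G38": 28.8, "1AI5": 418.0}
gtx580 = {"NNQQNY": 0.15, "2G38": 0.61, "1AI5": 2.10}

factors = {name: speedup(cpu_only[name], gtx580[name]) for name in cpu_only}
for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x")
```

Because the table lists rounded timings, the recomputed factors can differ from the reported ones in the last digit; the largest complex comes out near 199x rather than the reported 200x.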

The NNQQNY peptide calculation was performed after removing hydrogens from the PDB file with egrep -v '^.{13}H'.
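
Where egrep is not at hand, the same filter can be written in Python. The sketch below applies the identical pattern, dropping every line whose 14th character is H, so it shares whatever limitations that pattern has. The function name and the file names in the usage note are hypothetical.

```python
import re

# Same pattern as the egrep command: a line whose 14th character is "H".
HYDROGEN_LINE = re.compile(r"^.{13}H")

def strip_hydrogens(lines):
    """Return the lines that the pattern does not match (like egrep -v)."""
    return [line for line in lines if not HYDROGEN_LINE.match(line)]
```

To filter a file, read it with open("input.pdb").readlines(), pass the list to strip_hydrogens, and write the result to a new file with writelines.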

The performance on an NVIDIA GeForce GTX580 video card (1594 MHz, 16 processors, 512 cores, 1024 threads) is graphed below.

 

Example

# ./gpu-sc /pdb/pdb2g38.ent
/pdb/pdb2g38.ent 0.6860 0.5558 2789
# ./gpu-sc -v /pdb/pdb2g38.ent
Atom radii read: 78
GPU support enabled: GeForce GTX 580
[1594 MHz, capability 2.0] with 16 processors, 1024 threads.
Generating molecular surface, 15 dots/A^2
Convex dots: 32883
Toroidal dots: 39581
Concave dots: 37741
Total surface dots (1): 49055
Total surface dots (2): 61150
Total surface dots: 110205
Trimming peripheral band, 1.5A range
Peripheral trimming GPU processing time: 50 ms
Peripheral trimming GPU processing time: 30 ms
Computing surface separation and vectors
Find Neighbors GPU processing time: 10 ms
Find Neighbors GPU processing time: 20 ms

[...]

Molecule 1:
Total Atoms: 579
Buried Atoms: 451
Blocked Atoms: 128
Total Dots: 49055
Trimmed Surface Dots: 20516
Trimmed Area: 1368.2

Molecule 2:
Total Atoms: 1381
Buried Atoms: 634
Blocked Atoms: 747
Total Dots: 61150
Trimmed Surface Dots: 21493
Trimmed Area: 1420.94

Total/Average for both molecules:
Total Atoms: 1960
Buried Atoms: 1198
Blocked Atoms: 1085
Total Dots: 110205
Trimmed Surface Dots: 42009
Trimmed Area: 2789.14


Molecule 1->2:
Mean Separation: 0.730461
Median Separation: 0.5492
Mean Shape Compl.: 0.603063
Median Shape Compl.: 0.684586

Molecule 2->1:
Mean Separation: 0.748373
Median Separation: 0.562416
Mean Shape Compl.: 0.604624
Median Shape Compl.: 0.687483

Average for both molecules:
Mean Separation: 0.739417
Median Separation: 0.555808
Mean Shape Compl.: 0.603843
Median Shape Compl.: 0.686035

==================================================
Shape Complementarity: 0.686035
Mean interface separation (A): 0.555808
Area buried in interface (A^2): 2789.14
==================================================
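
The summary values can be cross-checked against the per-molecule numbers in the verbose output. In this example they are consistent with the shape complementarity and the interface separation each being the average of the two directional medians, and the buried area being the sum of the two trimmed areas. The Python sketch below only re-derives the summary from values copied out of the output above; it is an observation about this run, not a description of the program's internals.

```python
# Values copied from the verbose output above.
median_sc = {"1->2": 0.684586, "2->1": 0.687483}
median_sep = {"1->2": 0.5492, "2->1": 0.562416}
trimmed_area = {"1": 1368.2, "2": 1420.94}

def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

shape_complementarity = mean(median_sc.values())
interface_separation = mean(median_sep.values())
buried_area = sum(trimmed_area.values())

print(f"Shape complementarity:     {shape_complementarity:.4f}")
print(f"Mean interface separation: {interface_separation:.4f}")
print(f"Area buried in interface:  {buried_area:.0f}")
```

The three printed numbers should agree with the short-format line for the same file.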

See Also

The shape complementarity calculation code has been added to the Rosetta molecular modeling suite.