PockDrug: Pocket Druggability Prediction

Introduction Methodology User guide References

Predicting the ability of a protein pocket to bind drug-like molecules with high affinity, i.e., its druggability, is of major interest in the target identification phase of drug discovery (9). Pocket druggability investigations therefore represent a key step in projects that progress compounds toward the clinic. Current computational druggability prediction models are tied to a single pocket estimation method, despite the uncertainties inherent in pocket estimation. On this site we propose PockDrug-Server, which predicts pocket druggability and is efficient on both (i) pockets estimated with guidance from ligand proximity (extracted from a holo protein structure by proximity to a ligand, using several distance thresholds) and (ii) pockets estimated without guidance from ligand proximity (based on the amino acid atoms that form the surface of potential binding cavities). PockDrug-Server provides consistent druggability results across different pocket estimation methods. It is robust with respect to pocket boundary and estimation uncertainties, and is thus efficient on apo pockets, which are challenging to estimate. It clearly distinguishes druggable from less druggable pockets across estimation methods and outperformed recent druggability models on apo pockets. Predictions can be run on one protein or a set of apo/holo proteins, using the pocket estimation methods proposed by the web server, or on any pocket previously estimated by the user.

*Fig. 1: A druggable pocket corresponds to a protein region capable of binding a drug-like molecule.*

You can submit your job in the druggability computation tab.

The following flowchart describes the general workflow of PockDrug-Server:

*Fig. 2: PockDrug-Server workflow, in three main phases: (a) identifying protein pocket(s) (input and pocket estimation methods); (b) PockDrug: fast computation of physico-chemical and geometric descriptors to characterize the pocket(s); (c) output: pocket(s) druggability prediction.*

The PockDrug-Server workflow is divided into three phases: (a) identifying the protein pocket(s) (input and pocket estimation methods); (b) PockDrug: fast computation of physico-chemical and geometric descriptors to characterize the pocket(s); (c) output: pocket(s) druggability prediction. In the first phase, two types of query can be submitted:

Druggability prediction using pocket estimated by the user
Druggability prediction using protein structure

For more details about the PockDrug statistical model, please see the reference below (1):
Borrel,A., Regad,L., Xhaard,H.G., Petitjean,M. and Camproux,A.-C. (2015) PockDrug: a model for predicting pocket druggability that overcomes pocket estimation uncertainties. J. Chem. Inf. Model., 10.1021/ci5006004.

The following sections detail the input, options and output, to help the user understand each step of the workflow and interpret the results it produces.

Input

The main function of PockDrug-Server is to predict pocket druggability through the Druggability section. To do so, two types of query can be submitted:

Pocket structure
Protein structure

A. Druggability prediction for protein pocket structure estimated by the user:

PockDrug-Server allows users to predict the druggability probability of a protein pocket estimated using a pocket estimation approach or software of their choice. In this case, the required input is the pocket structure, which corresponds to a PDB format file listing the coordinates of the pocket atoms. The corresponding protein PDB file is also required to compute the pocket descriptors and the predicted druggability probability.

*Fig. 3: Druggability prediction for protein pocket estimated by the user; PDB structure files of the pocket and its corresponding protein should be submitted.*

B. Druggability prediction using protein(s):

The protein structure corresponds to:

a PDB code;
a PDB protein file;
or a file containing a list of PDB codes.

For this type of query, the PockDrug-Server protocol is divided into two main steps:

Step 1: pocket(s) estimation using one or both of the pocket estimation methods proposed by our web server;
Step 2: pocket druggability probability prediction. For each pocket estimated in step 1, the druggability probability and its standard deviation are provided by the PockDrug model.

The protein query can be an apo or a holo protein. For a holo protein, the ligand can be included in the protein file or uploaded separately, so an optional field for ligand information is also provided. The user can also submit a single job for a set of apo or holo proteins in the form of a list of PDB codes. In this case the same protocol is applied: for each PDB code in the list, the pocket(s) are estimated and the corresponding druggability probability is then predicted.

For these three types of protein structure information, the prox and/or fpocket estimation methods can be chosen.

prox: estimation method of holo protein pockets

This method is based on ligand proximity information. The user chooses a distance threshold between 4 Å and 12 Å (in steps of 0.5 Å), and the protein atoms located within that distance of the ligand are extracted. The choice of threshold was recently shown to have a strong influence on pocket descriptors (2), so it is left to the user. Two commonly used thresholds are recommended: 4 Å, as used by Krasowski et al. (3), which extracts a well-defined pocket limited to short-range ligand interactions (such as hydrogen bonds or ionic interactions), and 5.5 Å, which captures all significant contact points and a more complete environment of the binding site. This method is suitable for holo proteins; the default threshold is 4 Å.
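The proximity-based extraction described above can be illustrated with a minimal Python sketch. It is not the server's implementation: the function names (`pdb_line`, `parse_atoms`, `extract_pocket`), the inclusive distance test and the toy coordinates are all assumptions made for illustration. All HETATM records are treated as ligand atoms, in line with the default ligand option of the server.

```python
import math

def pdb_line(record, serial, name, resname, x, y, z):
    """Format one fixed-column PDB coordinate record (toy helper: chain A, residue 1)."""
    return (f"{record:<6}{serial:>5} {name:<4} {resname:>3} A{1:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

def parse_atoms(pdb_text):
    """Split coordinate records into protein (ATOM) and ligand (HETATM) atoms."""
    protein, ligand = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        atom = {
            "serial": int(line[6:11]),
            "name": line[12:16].strip(),
            "resname": line[17:20].strip(),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        }
        (protein if record == "ATOM" else ligand).append(atom)
    return protein, ligand

# Thresholds offered by the form: 4.0, 4.5, ..., 12.0 Å
ALLOWED_THRESHOLDS = [4.0 + 0.5 * i for i in range(17)]

def extract_pocket(protein, ligand, threshold=4.0):
    """Keep protein atoms within `threshold` Å of any ligand atom (inclusive: an assumption)."""
    if threshold not in ALLOWED_THRESHOLDS:
        raise ValueError("threshold must be between 4 and 12 Å in steps of 0.5 Å")
    return [
        atom for atom in protein
        if any(math.dist(atom["xyz"], lig["xyz"]) <= threshold for lig in ligand)
    ]

# Toy structure (hypothetical coordinates): one ligand atom at the origin,
# three protein atoms at 3, 5 and 10 Å from it.
demo = "\n".join([
    pdb_line("HETATM", 1, "C1", "LIG", 0.0, 0.0, 0.0),
    pdb_line("ATOM", 2, "CA", "ALA", 3.0, 0.0, 0.0),
    pdb_line("ATOM", 3, "CB", "ALA", 5.0, 0.0, 0.0),
    pdb_line("ATOM", 4, "CG", "LEU", 10.0, 0.0, 0.0),
])
protein_atoms, ligand_atoms = parse_atoms(demo)
pocket_4 = extract_pocket(protein_atoms, ligand_atoms, 4.0)
pocket_5_5 = extract_pocket(protein_atoms, ligand_atoms, 5.5)
print(len(pocket_4), len(pocket_5_5))
```

In this toy case the 5.5 Å threshold picks up one more atom than the 4 Å threshold, which is the kind of boundary change that affects the pocket descriptors.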

fpocket: estimation method for holo or apo protein pockets

fpocket is not guided by ligand information. It is an automated geometry-based method based on the decomposition of the 3D protein structure into Voronoi polyhedra. It extracts all pockets from the surface of the apo or holo protein using spheres of varying diameters. Its advantages include calculation speed and satisfactory performance in terms of overlap between known binding sites and predicted sites (4). This method is selected by default since it is suitable for both apo and holo proteins.

Optional ligand information

The field for submitting ligand information is visible only in single entry mode (submission of a PDB ID or a PDB structure) and when the prox estimation method is checked. In multiple entry mode, where a list of PDB ID codes is provided, this optional ligand field is not accessible.
Three types of ligand information input are possible:

Default option: all ligands in the holo protein structure with the "HETATM" label are considered
Ligand HET code
Ligand structure, in PDB, MOL or SDF format

Figures 3 and 6 show the help text provided for each input field, so that users can easily submit their jobs.

*Fig. 6: Form section of query type 2 for druggability prediction using protein(s).*

After this first stage of pocket estimation, the geometric and physico-chemical pocket descriptors and the protein descriptors involved in the PockDrug model are computed in the background in order to predict pocket druggability. The output is detailed in the following part.

Output: Pocket descriptors and druggability prediction

The output page consists of one or two tabs, depending on whether one or two estimation methods were chosen: one result tab per selected estimation method. Depending on the input type, two result displays are possible:

If the submitted query corresponds to a single pocket structure or a single protein entry (PDB code/file):
As shown in Figure 7, each tab is structured as follows:

A sortable table providing for each protein pocket:

Six of the eighteen pocket descriptors involved in the PockDrug model. Because the pocket estimation method directly affects the descriptor values, the descriptor averages and associated standard deviations computed on the NRDLD set using three different estimations (prox4, prox5.5 and fpocket) are given as a reference (Analysis help section) to help the user analyse the pocket descriptor values.

The average druggability probability and its associated standard deviation, computed over the seven best models included in PockDrug; the standard deviation indicates the confidence of the druggability probability. Pockets with a probability greater than 0.5 are considered druggable. When several pockets are considered, the table can be sorted in ascending or descending order of druggability probability to facilitate the identification of druggable pockets.

Pocket visualization: pocket(s) and protein structures can be visualized and manipulated on the server through the Jmol web browser applet (5). All computed results (pocket structures, the eighteen descriptors and druggability scores) can be downloaded.

A compressed file containing all the results can be downloaded using the download button. Only when pockets are estimated using both fpocket and prox (at any distance threshold) are overlap scores between the two pocket estimations also computed; they are provided in the compressed result file to allow comparison of, and correspondence between, the two estimation methods. See the pocket comparison section of the supplementary data for the definition of the overlap scores. A bookmark button saves the link at which users can follow the progress of their job, and access and download its results for 7 days.

*Fig. 7: Output example for the catalytic domain of human ADAM33 (PDB code 1R55).*
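The averaging and the 0.5 cut-off described above can be sketched as follows. This is only an illustration: the seven probabilities are hypothetical values, `summarize_pocket` is a name introduced here, and the use of the sample standard deviation is an assumption (the page does not state which estimator the server uses).

```python
from statistics import mean, stdev

def summarize_pocket(model_probabilities, cutoff=0.5):
    """Average the per-model probabilities and flag the pocket as druggable above the cut-off."""
    avg = mean(model_probabilities)
    sd = stdev(model_probabilities)  # sample standard deviation (assumption)
    return {"probability": avg, "sd": sd, "druggable": avg > cutoff}

# Hypothetical outputs of the seven models for two pockets
pockets = {
    "pocket0": [0.81, 0.77, 0.85, 0.79, 0.90, 0.74, 0.88],
    "pocket1": [0.35, 0.41, 0.28, 0.44, 0.39, 0.30, 0.42],
}
summary = {name: summarize_pocket(probs) for name, probs in pockets.items()}

# Sort in descending order of druggability probability, as the result table allows
ranking = sorted(summary, key=lambda name: summary[name]["probability"], reverse=True)
for name in ranking:
    row = summary[name]
    print(f"{name}: {row['probability']:.2f} ± {row['sd']:.2f} druggable={row['druggable']}")
```

A small standard deviation means the seven models agree, so the averaged probability can be read with more confidence.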

If the submitted query corresponds to a list of PDB codes, each tab corresponds to a sortable table showing:

Protein PDB code: gives access to the detailed result page described above for a single entry
Number of estimated pockets (for each method)
Number of druggable pockets (druggability probability greater than 0.5)
The highest druggability probability and its standard deviation
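As a rough sketch of how the columns of this summary table relate to the per-pocket results, the snippet below aggregates pocket-level predictions for each protein. The identifiers `PDB1` and `PDB2`, the probability values and the function name `summarize_protein` are placeholders introduced for illustration, not server output.

```python
def summarize_protein(pockets, cutoff=0.5):
    """Summarize one protein from a list of (probability, sd) pairs, one pair per pocket."""
    druggable = [p for p in pockets if p[0] > cutoff]
    best = max(pockets, key=lambda p: p[0]) if pockets else None
    return {
        "n_pockets": len(pockets),
        "n_druggable": len(druggable),
        "best_probability": best[0] if best else None,
        "best_sd": best[1] if best else None,
    }

# Hypothetical per-pocket results for one estimation method
results = {
    "PDB1": [(0.91, 0.03), (0.42, 0.08), (0.63, 0.05)],
    "PDB2": [(0.18, 0.04)],
}
table = {code: summarize_protein(pockets) for code, pockets in results.items()}
for code, row in table.items():
    print(code, row)
```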

Analysis help section

Overlap scores between pockets estimated by different methods (prox vs fpocket):

If different pocket estimation methods are applied, the pockets can be compared through the overlap scores. Figure 8 shows a pocket of 1R55 binding the ligand "097", estimated by prox (in magenta) and by fpocket (in blue).

*Fig. 8: Pocket comparison and overlapping scores.*

These overlap scores are computed only when both default pocket estimation methods (prox and fpocket) are selected, and they are included only in the downloadable compressed result file, to allow comparison of the pocket estimates. The overlap between two estimated pockets is quantified using the following scores:

Score of Overlap (SO) indicates the overlap between the two pocket estimates, i.e., pocket1 and pocket2, as follows:

SO = \frac{N_{\mathrm{common}}}{N_{\mathrm{pocket1}} + N_{\mathrm{pocket2}} - N_{\mathrm{common}}}

where Npocket1 and Npocket2 are the number of atoms in pocket1 and pocket2, respectively, and Ncommon is the number of atoms common to pocket1 and pocket2. SO yields values between 0 and 100%. An SO value of 100% indicates maximum overlap between the pair of estimated pockets used.
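The SO formula can be checked on a small example. The sketch below assumes each pocket is represented as a set of atom identifiers (here, arbitrary integers) and expresses the score as a percentage; `score_of_overlap` is a name chosen for this illustration.

```python
def score_of_overlap(pocket1, pocket2):
    """SO = Ncommon / (Npocket1 + Npocket2 - Ncommon), returned as a percentage."""
    atoms1, atoms2 = set(pocket1), set(pocket2)
    n_common = len(atoms1 & atoms2)
    denominator = len(atoms1) + len(atoms2) - n_common
    if denominator == 0:
        raise ValueError("both pockets are empty")
    return 100.0 * n_common / denominator

# Hypothetical atom identifiers: 4 atoms, 6 atoms, 2 in common
pocket_prox = {1, 2, 3, 4}
pocket_fpocket = {3, 4, 5, 6, 7, 8}
so = score_of_overlap(pocket_prox, pocket_fpocket)
print(f"SO = {so:.1f}%")  # 2 / (4 + 6 - 2) = 25.0%
```

The denominator is the size of the union of the two atom sets, so SO only reaches 100% when the two estimates contain exactly the same atoms.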

Relative Overlap (RO) was defined by Schmidtke et al. (6) and indicates the overlap in terms of exposed atoms between the two estimated pockets for the same binding site:

RO = \frac{SA_{\mathrm{pocket1}} \cap SA_{\mathrm{pocket2}}}{SA_{\mathrm{pocket1}}}

where SApocket1 and SApocket2 are the solvent-accessible areas of pocket1 and pocket2, respectively, computed using the NACCESS software. An RO value close to 100% indicates that nearly all of the exposed area of pocket1 is included in pocket2.

Mutual Overlap (MO), also defined by Schmidtke et al. (6), is complementary to RO: it relates the same shared exposed area to the exposed area of pocket2.

MO = \frac{SA_{\mathrm{pocket1}} \cap SA_{\mathrm{pocket2}}}{SA_{\mathrm{pocket2}}}
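A minimal sketch of RO and MO follows, under a stated assumption: the shared area is taken as the sum of the per-atom solvent-accessible areas of the atoms common to both pockets. The per-atom areas would come from a tool such as NACCESS; the values and the function name `relative_and_mutual_overlap` below are hypothetical.

```python
def relative_and_mutual_overlap(atom_area, pocket1, pocket2):
    """Return (RO, MO) in percent from per-atom accessible areas and two atom sets."""
    sa1 = sum(atom_area[a] for a in pocket1)
    sa2 = sum(atom_area[a] for a in pocket2)
    if sa1 == 0 or sa2 == 0:
        raise ValueError("a pocket has no solvent-accessible area")
    # Assumption: shared area = accessible area of the atoms present in both pockets
    shared = sum(atom_area[a] for a in set(pocket1) & set(pocket2))
    return 100.0 * shared / sa1, 100.0 * shared / sa2

# Hypothetical per-atom solvent-accessible areas (Å²)
atom_area = {1: 10.0, 2: 20.0, 3: 30.0, 4: 40.0, 5: 50.0}
pocket1 = {1, 2, 3}        # total area 60 Å²
pocket2 = {2, 3, 4, 5}     # total area 140 Å²
ro, mo = relative_and_mutual_overlap(atom_area, pocket1, pocket2)
print(f"RO = {ro:.1f}%, MO = {mo:.1f}%")
```

Reading the two scores together shows the asymmetry: a small pocket nested in a large one gives a high RO and a low MO.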

Descriptor definitions and reference means:

Table 1 defines each pocket descriptor provided by PockDrug-Server.

Table 1: Definition of the eighteen descriptors available in the PockDrug-Server
Descriptors	Description	References
Hydrophobicity descriptors
Hydrophobic kyte	Hydrophobicity based properties of residues	Kyte et al. 1982 (7)
Hydrophobic residues	Proportion of hydrophobic residues in pocket (C, G, A, T, V, L, I, M, F, W, Y, H, K)
Polarity descriptors
Polar residues	Frequency of polar residues in pocket (C, D, E, H, K, N, Q, R, S, T, W, Y)
Aromatic descriptors
Aromatic residues	Frequency of aromatic residues in pocket (F, Y, H, W)
Physicochemical descriptors
Aliphatic residues	Frequency of aliphatic residues in pocket (I, L, V)
Otyr atom	Frequency of Otyr atoms in pocket	Milletti et al. 2010 (8)
Ne2 atom	Frequency of NE2 atoms in pocket	Milletti et al. 2010 (8)
Nlys atom	Frequency of Nlys atoms in pocket	Milletti et al. 2010 (8)
Ntrp atom	Frequency of Ntrp atoms in pocket	Milletti et al. 2010 (8)
Ooh atom	Frequency of Ooh atoms in pocket	Milletti et al. 2010 (8)
Nd1 atom	Frequency of ND1 atoms in pocket	Milletti et al. 2010 (8)
Geometric descriptors
Surface hull	Surface of convex hull (Å²)	RADI software
Diameter hull	Longest distance in the convex hull (Å)	RADI software
Volume hull	Volume of convex hull (Å³)	RADI software
Smallest size	Distance separating the two closest slabs enclosing the hull (Å)	RADI software
Radius cylinder	Radius of the smallest height cylinder enclosing the hull (Å)	RADI software
Nb RES	Number of pocket residues
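The residue-based descriptors of Table 1 can be illustrated with a short sketch. The residue groups are copied from the table; taking each frequency as the count of matching residues divided by the number of pocket residues is an assumption, and the pocket composition and the function name `residue_descriptors` are hypothetical.

```python
# Residue groups as listed in Table 1 (one-letter codes)
RESIDUE_GROUPS = {
    "hydrophobic_residues": set("CGATVLIMFWYHK"),
    "polar_residues": set("CDEHKNQRSTWY"),
    "aromatic_residues": set("FYHW"),
    "aliphatic_residues": set("ILV"),
}

def residue_descriptors(pocket_residues):
    """Fraction of pocket residues in each group, plus the residue count (Nb RES)."""
    n = len(pocket_residues)
    if n == 0:
        raise ValueError("empty pocket")
    descriptors = {
        name: sum(res in group for res in pocket_residues) / n
        for name, group in RESIDUE_GROUPS.items()
    }
    descriptors["nb_res"] = n
    return descriptors

# Hypothetical pocket made of ten residues
pocket = ["F", "Y", "L", "V", "D", "S", "G", "H", "I", "K"]
desc = residue_descriptors(pocket)
for name, value in desc.items():
    print(name, value)
```

The groups overlap (for example, H is listed as hydrophobic, polar and aromatic), so the frequencies are not expected to sum to 1.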

Because the pocket estimation method directly affects the descriptor values, the descriptor averages and associated standard deviations computed on the NRDLD set using three different estimations (prox4, prox5.5 and fpocket) are given as a reference (Table 2) to help the user analyse the pocket descriptor values.

Table 2: Ten PockDrug model descriptor averages with associated standard deviations (sd) computed on the NRDLD set using three different estimations (prox4, prox5.5 and fpocket), to be used as a reference.
Descriptors	prox4	prox5.5	fpocket
Hydrophobic kyte	-0.43 ± 1.17	-0.33 ± 1.11	-0.31 ± 0.99
Otyr atom	0.015 ± 0.02	0.009 ± 0.01	0.012 ± 0.015
Aromatic residues	0.216 ± 0.16	0.19 ± 0.13	0.17 ± 0.1
Radius hull (Å)	8.52 ± 1.78	10.19 ± 1.76	11.71 ± 3.84
Surface hull (Å²)	448.71 ± 162.64	724.571 ± 204.46	939.20 ± 582.82
Diameter hull (Å)	16.82 ± 3.64	20.16 ± 3.57	23.09 ± 7.65
Volume hull (Å³)	752.0 ± 408.45	1614.52 ± 686.23	2532.92 ± 2472.48
Smallest size (Å)	8.38 ± 1.68	11.36 ± 1.667	12.044 ± 3.44
Radius cylinder (Å)	8.307 ± 1.79	9.98 ± 1.77	11.45 ± 3.82
Nb RES	14.67 ± 5.53	21.88 ± 7.26	30.90 ± 18.22
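One possible way to use these reference values is to express a descriptor value as a number of standard deviations from the reference mean for the same estimation method. This standardized deviation is an illustration, not a procedure documented for the server. The reference numbers below are copied from Table 2; the query value and the function name `deviation_from_reference` are hypothetical.

```python
# Reference (mean, sd) pairs copied from Table 2, keyed by descriptor then estimation method
REFERENCE = {
    "Volume hull": {
        "prox4": (752.0, 408.45),
        "prox5.5": (1614.52, 686.23),
        "fpocket": (2532.92, 2472.48),
    },
    "Nb RES": {
        "prox4": (14.67, 5.53),
        "prox5.5": (21.88, 7.26),
        "fpocket": (30.90, 18.22),
    },
}

def deviation_from_reference(descriptor, method, value):
    """Number of reference standard deviations between `value` and the reference mean."""
    ref_mean, ref_sd = REFERENCE[descriptor][method]
    return (value - ref_mean) / ref_sd

# Hypothetical pocket estimated with fpocket, convex hull volume 1296.68 Å³
z = deviation_from_reference("Volume hull", "fpocket", 1296.68)
print(f"Volume hull is {z:+.2f} sd from the fpocket reference mean")
```

The same volume compared against the prox4 reference would fall above the mean, which shows why a value should be compared with the reference for the estimation method that produced the pocket.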

References

1. Borrel,A., Regad,L., Xhaard,H.G., Petitjean,M. and Camproux,A.-C. (2015) PockDrug: a model for predicting pocket druggability that overcomes pocket estimation uncertainties. J. Chem. Inf. Model., 10.1021/ci5006004.
2. Krotzky,T., Rickmeyer,T.T., Fober,T. and Klebe,G. (2014) Extraction of Protein Binding Pockets in Close Neighborhood of Bound Ligands Makes Comparisons Simple due to Inherent Shape Similarity. J. Chem. Inf. Model., 54, 3229–3237.
3. Krasowski,A., Muthas,D., Sarkar,A., Schmitt,S. and Brenk,R. (2011) DrugPred: a structure-based approach to predict protein druggability developed using an extensive nonredundant data set. J. Chem. Inf. Model., 51, 2829–42.
4. Le Guilloux,V., Schmidtke,P. and Tuffery,P. (2009) Fpocket: an open source platform for ligand pocket detection. BMC Bioinformatics, 10, 168.
5. Herráez,A. (2006) Biomolecules in the computer: Jmol to the rescue. Biochem. Mol. Biol. Educ., 34, 255–261.
6. Schmidtke,P. and Barril,X. (2010) Understanding and predicting druggability. A high-throughput method for detection of drug binding sites. J. Med. Chem., 53, 5858–67.
7. Kyte,J. and Doolittle,R.F. (1982) A simple method for displaying the hydropathic character of a protein. J. Mol. Biol., 157, 105–32.
8. Milletti,F. and Vulpetti,A. (2010) Predicting polypharmacology by binding site similarity: from kinases to the protein universe. J. Chem. Inf. Model., 50, 1418–31.
9. Abi Hussein,H., Geneix,C., Petitjean,M., Borrel,A., Flatters,D. and Camproux,A.-C. (2017) Global vision of druggability issues: applications and perspectives. Drug Discov. Today, 22, 404–415.
10. Hussein,H.A., Borrel,A., Geneix,C., Petitjean,M., Regad,L. and Camproux,A.-C. (2015) PockDrug-Server: a new web server for predicting pocket druggability on holo and apo proteins. Nucleic Acids Res., 43, W436–W442.
