FLAP: GRID molecular interaction fields in virtual screening. Validation using the DUD data set

Cross, Simon; Baroni, Massimo; Carosati, Emanuele; Benedetti, Paolo; Clementi, Sergio

doi:10.1021/ci100221g

The performance of FLAP (Fingerprints for Ligands and Proteins) in virtual screening is assessed using a subset of the DUD (Directory of Useful Decoys) benchmarking data set containing 13 targets, each with more than 15 different chemotype classes. A variety of ligand- and receptor-based virtual screening approaches are examined, using combinations of individual templates: 2D structures of known actives, a cocrystallized ligand, a receptor structure, or a cocrystallized ligand-biased receptor structure. We examine several data fusion approaches to combine the results of the individual virtual screens. In doing so, we show that excellent chemotype enrichment is achieved in both single-target ligand-based and receptor-based approaches, approximately 17-fold over random on average at a false positive rate of 1%. We also show that using as much starting knowledge as possible improves chemotype enrichment, and that data fusion using Pareto ranking is an effective way to do this, giving up to 50% improvement in enrichment over the single methods. Finally, we show that if inactivity or decoy data are incorporated, automatically training the scoring function in FLAP improves recovery still further, with an almost 2-fold improvement over the enrichments shown by the single methods. The results clearly demonstrate the utility of FLAP for virtual screening when either a limited or a wide range of prior knowledge is available.
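
As an illustration of the two quantities named in the abstract, the following minimal Python sketch shows one common form of Pareto ranking for fusing several screens, and an enrichment calculation at a fixed false positive rate. It is not FLAP's implementation: the function names `pareto_ranks` and `roc_enrichment` are hypothetical, higher scores are assumed to be better, the Pareto rank is taken to be the number of compounds that dominate a given compound, and enrichment is computed over individual actives rather than over chemotype classes as in the paper.

```python
from typing import List, Sequence


def pareto_ranks(scores: Sequence[Sequence[float]]) -> List[int]:
    """Pareto rank of each compound: the number of other compounds that
    dominate it (0 = non-dominated). Each inner sequence holds one score
    per individual screen; higher is assumed better (an assumption)."""

    def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
        # a dominates b if it is at least as good everywhere and
        # strictly better in at least one screen.
        return all(x >= y for x, y in zip(a, b)) and any(
            x > y for x, y in zip(a, b)
        )

    return [sum(dominates(other, s) for other in scores) for s in scores]


def roc_enrichment(
    ranks: Sequence[float], is_active: Sequence[bool], fpr: float = 0.01
) -> float:
    """Enrichment over random at a given false positive rate: the fraction
    of actives recovered before more than `fpr` of the decoys have been
    retrieved, divided by `fpr`. Lower rank is better; ties keep input
    order because Python's sort is stable."""
    order = sorted(range(len(ranks)), key=lambda i: ranks[i])
    n_actives = sum(is_active)
    n_decoys = len(is_active) - n_actives
    max_decoys = fpr * n_decoys

    found_actives = 0
    found_decoys = 0
    for i in order:
        if is_active[i]:
            found_actives += 1
        else:
            if found_decoys + 1 > max_decoys:
                break  # the next decoy would exceed the allowed FPR
            found_decoys += 1

    return (found_actives / n_actives) / fpr


# Toy data (invented for illustration): two screens, four compounds.
toy_scores = [(0.9, 0.8), (0.5, 0.9), (0.4, 0.3), (0.1, 0.2)]
toy_labels = [True, False, False, False]
toy_ranks = pareto_ranks(toy_scores)
toy_enrichment = roc_enrichment(toy_ranks, toy_labels, fpr=0.5)
```

On this reading, a 17-fold enrichment at a 1% false positive rate corresponds to recovering about 17% of the actives (or chemotypes) by the time 1% of the decoys have been retrieved.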