Description
Computational prediction of small-molecule binding to protein targets has been central to drug discovery for decades, yet conventional physics-based docking methods have largely reached a performance plateau. Machine learning (ML)-based alternatives offer a promising shift, though concerns about prediction quality, chemical plausibility of generated poses, and the lack of rigorous evaluation standards have slowed their practical adoption.
This work presents an ML-based docking method and evaluates it in two practically relevant scenarios: binding sites that contain non-protein entities such as metal ions, cofactors, and ordered water molecules; and docking into ligand-free (apo) protein conformations. To limit bias in the assessment, the model was trained and tested on a large, structurally diverse benchmark derived from experimentally determined three-dimensional structures, with cluster-based data partitioning to minimize information leakage between training and test sets. Performance was compared against several widely used conventional docking programs.
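The idea behind cluster-based partitioning can be illustrated with a minimal sketch. It assumes each complex has already been assigned a cluster label (for example by sequence or structural similarity); the function name `cluster_split`, the `test_fraction` parameter, and the example identifiers are hypothetical and are not taken from the work itself.

```python
import random
from collections import defaultdict


def cluster_split(cluster_of, test_fraction=0.2, seed=0):
    """Split complexes so that no cluster contributes to both train and test.

    cluster_of: dict mapping a complex ID to its (precomputed) cluster ID.
    Hypothetical helper for illustration; not the authors' procedure.
    """
    # Group complex IDs by cluster.
    clusters = defaultdict(list)
    for complex_id, cluster_id in cluster_of.items():
        clusters[cluster_id].append(complex_id)

    # Shuffle clusters (not individual complexes) reproducibly.
    cluster_ids = sorted(clusters)
    random.Random(seed).shuffle(cluster_ids)

    n_total = len(cluster_of)
    train, test = [], []
    for cid in cluster_ids:
        # Fill the test set with whole clusters until it reaches the target size.
        target = test if len(test) < test_fraction * n_total else train
        target.extend(clusters[cid])
    return train, test


# Example with made-up IDs: ten complexes in five clusters.
example = {f"cplx{i}": f"clu{i // 2}" for i in range(10)}
train_ids, test_ids = cluster_split(example, test_fraction=0.2, seed=0)
```

Because whole clusters are assigned to one side, close homologues of a test complex cannot appear in the training set, which is the leakage that a random per-complex split would allow.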
In the general evaluation setting, the proposed method reduced positional error by 29–38% relative to conventional approaches while maintaining low computational cost. In binding pockets containing ions and conserved water molecules, median pose deviation decreased by 0.7–0.8 Å relative to the best classical alternative. In the most challenging setting, docking into apo protein structures, the method achieved a median improvement of approximately 2.1 Å over the leading conventional tool.
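For readers unfamiliar with how such figures are typically obtained, the following sketch shows one common way to compute them. It assumes the pose deviation is a root-mean-square deviation (RMSD) over matched ligand atoms and that the improvement is a difference of medians; the abstract does not state either choice, and the function names are hypothetical. Symmetry correction, which real evaluations usually apply, is omitted.

```python
import math
from statistics import median


def pose_rmsd(pred, ref):
    """RMSD in the input units (e.g. Å) between two poses.

    pred, ref: sequences of (x, y, z) tuples in the same atom order.
    Assumes atoms are already matched; no symmetry correction is applied.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(pred, ref)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(pred))


def median_improvement(baseline_rmsds, method_rmsds):
    """Difference of median RMSDs (baseline minus method).

    A positive value means the method has the lower median deviation.
    This is one possible definition; a median of paired differences is another.
    """
    return median(baseline_rmsds) - median(method_rmsds)


# Toy example: a two-atom pose shifted by 1 unit along z.
shifted = pose_rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
```

A relative reduction such as the percentages quoted above would follow from the same per-complex deviations by dividing the difference by the baseline value.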
These results demonstrate that the ML-based approach consistently outperforms established techniques in accuracy, generalizes to complex binding environments and conformational uncertainty, and produces chemically and geometrically valid poses suitable for integration into modern drug discovery pipelines.