Pub. online: 14 May 2024
Type: Research Article
Open Access
Journal: Informatica
Volume 35, Issue 3 (2024), pp. 617–648
Abstract
This work introduces ALMERIA, a decision-support tool for drug discovery. It estimates compound similarities and predicts activity while accounting for conformational variability. The methodology spans data preparation through model selection and optimization. Implemented with scalable software, it handles large data volumes quickly. Experiments were conducted on a distributed computer cluster using the DUD-E database. Models were evaluated on different data partitions to assess how well they generalize to new compounds. The tool performs strongly in molecular activity prediction (ROC AUC: 0.99, 0.96, 0.87), indicating that the chosen data representation and modelling generalize well. Sensitivity to molecular conformation is also evaluated.
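For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a ROC AUC value can be computed from predicted activity scores. It is not taken from ALMERIA: the function name `roc_auc`, the label convention (1 = active, 0 = inactive), and the toy data are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC (illustrative helper, not part of ALMERIA).

    labels: iterable of 1 (active) or 0 (inactive); assumed convention.
    scores: predicted activity scores, higher meaning "more likely active".
    Returns the fraction of (active, inactive) pairs in which the active
    compound receives the higher score, counting ties as half a pair.
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")

    correctly_ordered = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                correctly_ordered += 1.0
            elif a == i:
                correctly_ordered += 0.5
    return correctly_ordered / (len(actives) * len(inactives))


# Toy data (hypothetical): two actives and two inactives.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

A value of 1.0 means every active compound is scored above every inactive one, while 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of compounds, so a sort-based implementation would be preferable at database scale.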
Pub. online: 7 Nov 2023
Type: Research Article
Open Access
Journal: Informatica
Volume 34, Issue 4 (2023), pp. 743–769
Abstract
Ligand-Based Virtual Screening makes the design of new drugs faster and cheaper. However, it needs efficient optimizers because of the size of compound databases. This work proposes a new method called Tangram CW, which also includes a knowledge-based compound filter. Without filtering, Tangram CW achieves results comparable to the state-of-the-art tools OptiPharm and 2L-GO-Pharm using about a tenth of their computational budget. Activating the filter discards more than two thirds of the database while keeping the desired compounds. Thus, molecular flexibility can be considered even though it increases the number of options. The implemented software package is public.