Description: BindGPT is a new framework for building drug discovery models that leverages compute-efficient pretraining, supervised finetuning, prompting, reinforcement learning, and tool use of LMs. This allows BindGPT to build a single pretrained model that exhibits state-of-the-art performance in 3D Molecule Generation, 3D Conformer Generation, and Pocket-Conditioned 3D Molecule Generation, posing them as downstream tasks for one pretrained model, whereas previous methods build task-specialized models without task transfer.
Generating novel active molecules for a given protein is an extremely challenging task for generative models that requires an understanding of the complex physical interactions between the molecule and its environment. In this paper, we present a novel generative model, BindGPT, which uses a conceptually simple but powerful approach to create 3D molecules within the protein's binding site. Our model produces molecular graphs and conformations jointly, eliminating the need for an extra graph reconstruction st
The key idea of our method is to use an autoregressive token generation model, influenced by GPT-based models, to solve several 3D small molecule generation tasks in one simple yet flexible paradigm. The main principle of our approach is to formulate several 3D molecular design tasks as prompted generation of text. To achieve that, we lay out the tokens of a condition before the tokens of the object to generate. For instance, a prompt can be the protein pocket for the pocket-conditioned generation task or
Figure 1: Data layout during pretraining. Arrows show the token sequence order. Nodes such as <POCKET> show special tokens. Training is done on a mixture of pocket and ligand datasets.
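
The condition-before-object layout can be illustrated with a minimal sketch. It is not the tokenizer used by BindGPT: the `<LIGAND>` token, the atom-then-coordinates ordering, the one-decimal rounding, and all function names are assumptions made for illustration. Only the `<POCKET>` special token and the idea of placing the condition tokens before the tokens of the generated object come from the text above.

```python
from typing import List, Sequence, Tuple

# (element symbol, x, y, z); a simplified, assumed atom representation
Atom = Tuple[str, float, float, float]

POCKET = "<POCKET>"  # special token named in Figure 1
LIGAND = "<LIGAND>"  # hypothetical special token, assumed for this sketch


def atoms_to_tokens(atoms: Sequence[Atom], decimals: int = 1) -> List[str]:
    """Flatten atoms into text tokens: element symbol, then rounded x, y, z."""
    tokens: List[str] = []
    for element, x, y, z in atoms:
        tokens.append(element)
        tokens.extend(f"{coord:.{decimals}f}" for coord in (x, y, z))
    return tokens


def build_prompt(pocket: Sequence[Atom]) -> List[str]:
    """Condition tokens only: the pocket, followed by the marker that opens the ligand."""
    return [POCKET, *atoms_to_tokens(pocket), LIGAND]


def build_sequence(pocket: Sequence[Atom], ligand: Sequence[Atom]) -> List[str]:
    """Full training sequence: condition (pocket) first, object (ligand) second."""
    return [*build_prompt(pocket), *atoms_to_tokens(ligand)]


if __name__ == "__main__":
    pocket = [("C", 0.0, 1.5, -2.0)]
    ligand = [("N", 1.0, 2.0, 3.0)]
    print(" ".join(build_sequence(pocket, ligand)))
```

Under these assumptions, the prompt is a strict prefix of the full sequence, so an autoregressive model trained on full sequences can be conditioned at inference time by feeding it the prompt and sampling the remaining tokens.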