Developing new therapeutics is an expensive and time-consuming process that often takes 10-15 years and costs up to $2 billion. Most drug candidates fail during clinical trials, making drug development a high-risk venture. However, recent advances in artificial intelligence (AI) hold promise for improving the process.
Large language models (LLMs), particularly transformer-based models, have shown exceptional performance in natural language processing tasks through self-supervised learning on large datasets. Researchers from Google Research and Google DeepMind introduced Tx-LLM, a generalist LLM fine-tuned from PaLM-2 to handle diverse therapeutic tasks.
Tx-LLM achieved competitive performance on 43 tasks and surpassed state-of-the-art results on 22 tasks. It excels in tasks that combine molecular representations with text, and it demonstrates positive transfer between tasks involving different drug types. These capabilities position it as a potential tool for end-to-end drug development, from gene identification to clinical trials.
The researchers compiled a dataset collection called TxT, containing 709 drug discovery datasets drawn from the Therapeutics Data Commons (TDC) repository. They fine-tuned Tx-LLM on this collection and evaluated it with metrics such as AUROC for classification tasks and Spearman correlation for regression tasks. The model performed near or above state-of-the-art on 43 of the 66 evaluated tasks.
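To make the two evaluation metrics concrete, here is a minimal sketch of how AUROC and Spearman correlation are typically computed with scikit-learn and SciPy. The labels and scores below are made-up placeholders for illustration, not Tx-LLM outputs or TDC data.

```python
from sklearn.metrics import roc_auc_score
from scipy.stats import spearmanr

# Binary classification task (e.g. "is this molecule toxic?"):
# AUROC measures how well predicted scores rank positives above negatives.
y_true = [0, 0, 1, 1, 1, 0]            # hypothetical ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]  # hypothetical model scores
auroc = roc_auc_score(y_true, y_score)

# Regression task (e.g. binding affinity): Spearman correlation
# measures rank agreement between measured and predicted values.
y_meas = [2.1, 3.4, 1.0, 5.6, 4.2]     # hypothetical measured values
y_pred = [2.0, 3.0, 1.2, 4.5, 5.0]     # hypothetical predictions
rho, pval = spearmanr(y_meas, y_pred)

print(f"AUROC={auroc:.3f}, Spearman rho={rho:.3f}")
```

Both metrics depend only on ranking, not on the absolute scale of the scores, which makes them robust choices for comparing models across heterogeneous TDC tasks.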
While Tx-LLM shows promise for therapeutic development, it still has limitations in natural language instruction following and prediction accuracy, and it requires further improvement and validation. Nevertheless, this work highlights the potential of fine-tuned LLMs for tasks involving drugs and text-based features.
Source: https://www.marktechpost.com/2024/10/10/tx-llm-a-large-language-model-llm-fine-tuned-from-palm-2-to-predict-properties-of-many-entities-that-are-relevant-to-therapeutic-development/