Neural Rankers and Large Language Models to Enhance Clinical Trial Search
3 main points
✔️ The team proposes new ways to make medically relevant information, specifically clinical trials, easier to retrieve.
✔️ Large language models help the retrieval system better interpret patient and trial information.
✔️ The ability to efficiently locate medical information is expected to facilitate medical research and trials, resulting in more effective treatments and medical advances.
Team IELAB at TREC Clinical Trial Track 2023: Enhancing Clinical Trial Retrieval with Neural Rankers and Large Language Models
written by Shengyao Zhuang, Bevan Koopman, Guido Zuccon
(Submitted on 3 Jan 2024)
Comments: TREC Notebook
Subjects: Information Retrieval (cs.IR)
The images used in this article are from the paper, the introductory slides, or were created based on them.
Summary
The ielab team from CSIRO and the University of Queensland aims to improve how search supports clinical trials and medical research. Specifically, they present a new way to make clinical trial information easier to search, developed for the TREC 2023 Clinical Trials Track.
The approach uses large language models (LLMs) to help the system organize and interpret healthcare information.
For example, an LLM generates brief descriptions of a patient's condition and of the nature of a trial. This generated text lets the system search for relevant medical information more effectively.
A second technique orders the results in finer detail: the retrieved trials are re-ranked using relevance judgments of the kind medical experts would provide.
Combined, these methods enable efficient search for medical information. This is expected to facilitate medical research and trials, resulting in more effective treatments and medical advances.
Introduction
The research team participated in the TREC Clinical Trials Track to explore new ways to retrieve medical information effectively. In this track, the task is to take a patient description as the query and retrieve matching clinical trials from ClinicalTrials.gov.
The research team built a multi-stage retrieval and re-ranking pipeline based on pre-trained language models, drawing on methods that had succeeded in previous information retrieval tasks. Specifically, the pipeline combines PubmedBERT-based retrievers and re-rankers with GPT-4, which is used to assess the relevance of clinical trials.
However, the study faced several challenges. First, the amount of data available for training was limited. Second, patient descriptions in previous years were written in free natural language, whereas this time they were provided as semi-structured XML data. This change in format can create a mismatch between training and inference data, which can affect system performance.
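To make the format mismatch concrete, the sketch below flattens a semi-structured XML topic into a single free-text query, so that inference-time input resembles the natural-language descriptions used for training. This is only an illustration: the tag and attribute names (`topic`, `field`, `name`) and the output wording are hypothetical and do not reproduce the official TREC 2023 topic schema or the team's actual conversion method.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic: tag and attribute names are illustrative only,
# not the official TREC 2023 topic schema.
TOPIC_XML = """
<topic number="1">
  <field name="age">54</field>
  <field name="sex">female</field>
  <field name="diagnosis">type 2 diabetes</field>
  <field name="current medication"></field>
</topic>
"""


def flatten_topic(xml_text: str) -> str:
    """Turn a semi-structured topic into one free-text query string."""
    root = ET.fromstring(xml_text.strip())
    parts = []
    for field in root.iter("field"):
        value = (field.text or "").strip()
        if value:  # skip fields that were left empty
            parts.append(f"{field.get('name')}: {value}")
    return "Patient with " + "; ".join(parts) + "."


query = flatten_topic(TOPIC_XML)
print(query)
```

Empty fields are dropped so that the query contains only information that was actually provided.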
Proposed Method
The study proposes a new approach to making clinical trial information easier to find. Specifically, it addresses the shortage of labeled data needed to train retrieval models. To this end, the team devised a method that uses a large language model to generate descriptive text about patient conditions for clinical trials, which then serves as additional training data.
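The data-generation idea can be sketched as follows: for each trial, a prompt asks an LLM for a matching patient description, and the result is stored as a (query, relevant trial) training pair. The prompt wording, the function names, the trial ID, and the `fake_llm` stand-in are all assumptions made for illustration; the article does not give the team's actual prompt or API calls.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical prompt wording; the team's actual prompt is not given here.
PROMPT_TEMPLATE = (
    "Below is a clinical trial description.\n"
    "Write a short description of a patient who would be eligible for it.\n\n"
    "Trial: {trial_text}\n"
    "Patient description:"
)


def build_prompt(trial_text: str) -> str:
    """Fill the template with the text of one clinical trial."""
    return PROMPT_TEMPLATE.format(trial_text=trial_text)


def make_training_pairs(
    trials: Dict[str, str],
    generate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Return (synthetic patient description, trial id) pairs.

    `generate` is any function that maps a prompt to generated text,
    for example a thin wrapper around an LLM API.
    """
    pairs = []
    for trial_id, trial_text in trials.items():
        patient = generate(build_prompt(trial_text)).strip()
        if patient:  # keep only non-empty generations
            pairs.append((patient, trial_id))
    return pairs


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return "A 60-year-old adult with the condition named in the trial."


trials = {"NCT-EXAMPLE-1": "Study of drug X in adults with hypertension."}
pairs = make_training_pairs(trials, fake_llm)
print(pairs)
```

Passing the generator in as a callable keeps the pairing logic independent of any particular LLM provider.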
The team also built a retrieval stage, consisting of a dense retriever and a sparse retriever (SPLADEv2) combined into a hybrid, which searches the trial collection and produces an initial ranking of candidates. A re-ranking stage then reorders the candidates returned by this first stage.
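A retrieve-then-re-rank pipeline of this kind can be sketched in a few lines. The fusion rule used here (min-max normalization followed by linear interpolation with weight `alpha`) is an assumption, since the article does not say how the hybrid scores are combined. The `scorer` callable stands in for a cross-encoder or any other re-ranker, and the scores and trial IDs are made up.

```python
from typing import Callable, Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so different retrievers are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_fuse(
    dense: Dict[str, float], sparse: Dict[str, float], alpha: float = 0.5
) -> Dict[str, float]:
    """Interpolate normalized dense and sparse scores (assumed fusion rule)."""
    d, s = min_max(dense), min_max(sparse)
    return {
        doc: alpha * d.get(doc, 0.0) + (1 - alpha) * s.get(doc, 0.0)
        for doc in set(d) | set(s)
    }


def rerank(
    query: str,
    candidates: Dict[str, float],
    scorer: Callable[[str, str], float],
    top_k: int,
) -> List[str]:
    """Re-score only the top_k first-stage candidates; keep the rest in order."""
    first_stage = sorted(candidates, key=candidates.get, reverse=True)
    head, tail = first_stage[:top_k], first_stage[top_k:]
    head = sorted(head, key=lambda doc: scorer(query, doc), reverse=True)
    return head + tail


# Made-up scores for four trials.
dense = {"t1": 0.9, "t2": 0.4, "t3": 0.1}
sparse = {"t2": 12.0, "t3": 7.0, "t4": 2.0}
fused = hybrid_fuse(dense, sparse)

# Toy re-ranker: looks up a fixed score per trial id.
toy_scores = {"t1": 0.9, "t2": 0.2, "t3": 0.95, "t4": 0.1}
ranked = rerank("patient query", fused, lambda q, doc: toy_scores[doc], top_k=2)
print(ranked)
```

Because only the top candidates are re-scored, a relevant trial that the first stage misses or ranks too low cannot be recovered by the re-ranker.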
In addition, the research team used GPT-4, a large language model, to assess the relevance of the retrieved clinical trials. These judgments were then used to re-rank the results produced by the trained models.
However, this approach also faced some challenges. For example, the model's ethical safeguards did not always behave as intended, and the data was not always in a suitable format. The team explored various methods to address these issues.
Results
The team evaluated the clinical trial retrieval system in two settings: they first measured results on TREC CT 2022 and then tested the system in TREC CT 2023. Different ranking methods were compared for effectiveness, and five runs, each using a different ranking method, were submitted.
The 2023 results generally follow the previous year's trend: the hybrid model scores lower on NDCG@10 (ranking quality of the top 10 documents, weighted by graded relevance and position) and P@10 (the fraction of the top 10 documents that are relevant). However, it achieves the highest Recall@1000 (the fraction of all relevant documents that appear in the top 1000), which matters because the re-ranking step can only promote documents that the retriever has already found.
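For readers unfamiliar with these metrics, the following minimal sketch computes them for a single query. It assumes linear gain in NDCG (some implementations use an exponential gain instead), and the relevance labels and trial IDs are made up for illustration. Official TREC scores are computed with the track's own evaluation tooling, not with this code.

```python
import math
from typing import Dict, List


def precision_at_k(ranking: List[str], qrels: Dict[str, int], k: int) -> float:
    """Fraction of the top-k documents that are relevant (grade > 0)."""
    return sum(1 for doc in ranking[:k] if qrels.get(doc, 0) > 0) / k


def recall_at_k(ranking: List[str], qrels: Dict[str, int], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    relevant = {doc for doc, grade in qrels.items() if grade > 0}
    if not relevant:
        return 0.0
    return len(relevant & set(ranking[:k])) / len(relevant)


def dcg(gains: List[int]) -> float:
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranking: List[str], qrels: Dict[str, int], k: int) -> float:
    """DCG of the ranking divided by the DCG of an ideal ranking."""
    gains = [qrels.get(doc, 0) for doc in ranking[:k]]
    ideal = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Made-up graded judgments: higher means more relevant, 0 means not relevant.
qrels = {"t1": 2, "t2": 1, "t3": 0, "t5": 2}
ranking = ["t2", "t1", "t3", "t4"]

print("P@4      =", precision_at_k(ranking, qrels, 4))
print("Recall@4 =", recall_at_k(ranking, qrels, 4))
print("NDCG@4   =", ndcg_at_k(ranking, qrels, 4))
```

In this toy example the ranking misses the relevant trial `t5` entirely, which lowers recall and NDCG even though half of the returned documents are relevant.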
This figure shows the results for TREC CT 2022 (top) and 2023 (bottom). The overall performance of each model is shown as a bar graph, with the best results in bold. Letters above the results mark the models from which each model differs with statistical significance.
In addition, a diagram shows how different parts of the system change effectiveness on a query-by-query basis. The comparisons at different stages of the pipeline are as follows:
(a) The hybrid retriever compared with the Dense Retriever (DR).
(b) The hybrid retriever compared with SPLADEv2.
(c) The Cross-Encoder (CE) compared with the hybrid retriever.
(d) GPT-4 compared with the Cross-Encoder.
These comparisons identify which stages contribute most to the overall performance of the search system.
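A query-by-query comparison of this kind boils down to subtracting one system's per-query score from another's. The sketch below shows the idea with made-up scores; the query IDs, the values, and the win/tie/loss summary are illustrative assumptions, not figures from the paper.

```python
from typing import Dict, Tuple


def per_query_delta(
    system: Dict[str, float], baseline: Dict[str, float]
) -> Dict[str, float]:
    """Score difference (system minus baseline) for queries both runs cover."""
    shared = sorted(set(system) & set(baseline))
    return {q: system[q] - baseline[q] for q in shared}


def win_tie_loss(deltas: Dict[str, float], eps: float = 1e-9) -> Tuple[int, int, int]:
    """Count queries where the system is better, equal, or worse."""
    wins = sum(1 for d in deltas.values() if d > eps)
    losses = sum(1 for d in deltas.values() if d < -eps)
    return wins, len(deltas) - wins - losses, losses


# Made-up per-query scores for two pipeline stages.
hybrid = {"q1": 0.50, "q2": 0.30, "q3": 0.70}
dense = {"q1": 0.40, "q2": 0.30, "q3": 0.80}

deltas = per_query_delta(hybrid, dense)
summary = win_tie_loss(deltas)
print(deltas)
print("wins, ties, losses:", summary)
```

Looking at per-query differences rather than averages reveals whether a stage helps most queries a little or a few queries a lot.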
In short, this study provides valuable insight into identifying the most effective methods of retrieving medical information and ensuring that patients and healthcare professionals have rapid access to the information they need.
Conclusion
This study developed a new clinical trial search system that combines PLMs (pre-trained language models) and LLMs (large language models). The approach used LLMs to generate training data without relying on human labeling, and this data was used to train strong retrievers and re-rankers. In addition, the team took advantage of the few-shot capability of LLMs to improve the system's ranking. The results in the TREC Clinical Trials Track demonstrate the competitiveness of this multi-stage clinical trial search pipeline.
Looking ahead, the system needs further refinement to increase the accuracy of clinical trial search. Introducing new data and algorithms could improve its performance, and greater flexibility and scalability would help it keep pace with rapid changes in medical information. Together, these improvements would lead to a more efficient and reliable clinical trial retrieval system.