GPT-Lab, A Fully Automated System Of Experimental Processes With LLM And Robotics

Large Language Models

3 main points
✔️ GPT-Lab, an autonomous laboratory system in which a large language model lets robots automatically design, conduct, and optimize experiments
✔️ The large language model extracts the necessary information from the literature, greatly improving the accuracy of experimental protocol design
✔️ A new relative humidity (RH) dye sensor developed in GPT-Lab predicts RH with high accuracy, validating the effectiveness of the system

GPT-Lab: Next Generation Of Optimal Chemistry Discovery By GPT Driven Robotic Lab
written by Xiaokai Qin, Mingda Song, Yangguan Chen, Zhehong Ai, Jing Jiang
(Submitted on 15 Sep 2023)
Comments: Published on arxiv.
Subjects: Artificial Intelligence (cs.AI); Robotics (cs.RO)


The images used in this article are from the paper, the introductory slides, or were created based on them.


Autonomous, self-driving laboratories (SDLs) are now attracting attention: at the forefront of science, the combination of robotics and advanced algorithms is opening up new possibilities in fields such as materials science, chemical synthesis, biology, and medicine. SDLs accelerate research and development by letting robots automatically design, conduct, and even optimize experiments, generating high-quality data at scale. While this technology has achieved excellent results, especially in the development of new materials and pharmaceuticals, there are still areas that require a high level of expertise and experience from researchers.

This is where text mining is gaining attention. Natural language processing (NLP) is being used to extract the information researchers need from the literature in an attempt to improve the efficiency of their research. Among others, the emergence of large-scale language models such as GPT-4 has greatly improved the accuracy of literature mining and experimental protocol design, achieving remarkable results with only a small amount of training data.

Researchers at Carnegie Mellon University have demonstrated how GPT can be used to support scientific research and successfully automate the design of experiments using the Opentrons API. This achievement is a major step toward the further development of autonomous laboratories (SDLs). However, there is still room for improvement in automating the extensive literature search required to discover new reagents and materials. If this challenge can be overcome, the research and development process is expected to be further accelerated.

Against this background, this paper develops "GPT-Lab", a GPT-enhanced autonomous laboratory (SDL) built around a pipeline called "ARMFE" (Analysis - Retrieval - Mining - Feedback - Execution). This pipeline uses GPT-4-based agents to drive the research and development process quickly and accurately. The paper uses it to successfully develop a new type of dye sensor that detects relative humidity (RH). The sensor predicts relative humidity (RH) with high accuracy, demonstrating the effectiveness of ARMFE.

This achievement is a major step toward the realization of robots that can perform independent research and development with minimal human intervention. The evolution of autonomous laboratories has only just begun, and we can expect many more discoveries and innovations in the future.

GPT-Lab Overview

GPT-Lab consists of two components: one is an automated experiment design agent based on the GPT framework. The other is an algorithm-driven robotics experiment platform. Together, they form a system that automatically links the entire process from experiment preparation to results.

This series of steps is called "ARMFE" (Analysis - Retrieval - Mining - Feedback - Execution). The figure below gives an overview of the workflow. The agent proceeds in five steps: Analysis (requirements analysis), Retrieval (literature acquisition), Mining (text mining), Feedback (feedback from the researcher), and Execution (experiment execution).

In the requirements analysis (Analysis), the researcher presents specific experimental requirements to the agent, which uses the ChatGPT API to extract the five keywords needed for a literature search from those requirements. If the request is unclear, the agent asks the researcher questions to clarify the methods and information needed.
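The keyword-extraction step can be sketched as follows. The paper only says that the ChatGPT API extracts five keywords from the researcher's request; the prompt wording, the JSON reply format, and the five-keyword check below are our illustration, not the paper's implementation.

```python
import json

def build_keyword_prompt(requirement: str) -> str:
    """Build a prompt asking the model for the five literature-search
    keywords the Analysis step needs (wording is our assumption)."""
    return (
        "Extract exactly five keywords for a literature search from the "
        f"following experimental requirement, as a JSON list:\n{requirement}"
    )

def parse_keywords(model_reply: str) -> list:
    """Parse the model's JSON reply and enforce the five-keyword contract."""
    keywords = json.loads(model_reply)
    if len(keywords) != 5:
        raise ValueError("expected exactly five keywords")
    return keywords

# A canned reply stands in for the actual ChatGPT API call:
reply = '["humidity sensor", "colorimetric dye", "cobalt chloride", "thin film", "relative humidity"]'
print(parse_keywords(reply))
```

In practice the parsed keywords would be handed to the Retrieval step; validating and retrying on malformed replies is exactly the kind of robustness work the paper mentions in its discussion of limitations.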

In literature retrieval (Retrieval), the agent uses the extracted keywords to search online, gathering relevant articles and their abstracts. It then reuses the ChatGPT API to sort out the most relevant documents from this information and obtains the full articles for analysis.

Text mining (Mining) uses GPT to understand the content of an article and extract information about the substances used in the experiment and their roles. This information is organized in JSON format and stored for later processing.
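A mined reagent record might look like the sketch below. The paper states only that substances and their roles are stored as JSON; the field names and example values here are our hypothetical illustration of such a schema.

```python
import json

# Hypothetical schema for one mined reagent record; the field names
# are our illustration, not the paper's exact format.
record = {
    "substance": "CoCl2",          # cobalt chloride, a colorant in the paper
    "role": "colorant",            # role extracted from the article
    "intended_use": "humidity-responsive dye",
    "relevance_score": 0.92,       # agent-assigned relevance (assumed scale 0-1)
}
print(json.dumps(record, indent=2))
```

Storing each record as plain JSON makes the later Feedback step simple: the agent can present the records directly to the researcher and re-serialize the selected ones as experimental parameters.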

In the feedback from the researcher (Feedback), GPT-Lab presents the information extracted into JSON to the researcher. Based on this information, the researcher selects the experimental materials to be used and notifies the agent. From this feedback, the agent constructs the experimental parameters in JSON format and sends them to the robotic experiment platform.

In Execution, the robotic experimental platform performs the liquid formulation and subsequent experiments based on the parameters received from the agent and the material design space proposed by the GPT-designed research agent. A file containing the CAS codes and concentration values of the substances required for the experiment is sent to the robotics experiment platform, where the actual experiment is performed.
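The hand-off file could be sketched as a simple CSV, one row per substance. The paper says only that CAS codes and concentration values are sent; the column layout and the concentration unit below are our assumptions (the CAS numbers shown are the real registry numbers for cobalt chloride and isopropanol).

```python
import csv, io

# Hypothetical layout of the file handed to the robotic platform:
# one row per substance with its CAS number and target concentration.
rows = [
    {"cas": "7646-79-9", "substance": "CoCl2", "conc_mg_per_ml": 5.0},
    {"cas": "67-63-0", "substance": "IPA", "conc_mg_per_ml": 0.0},
]
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["cas", "substance", "conc_mg_per_ml"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

Keying rows on CAS numbers rather than free-text names avoids ambiguity when the platform maps reagents to physical stock positions.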

Experiment: Article Mining by Agents

GPT-Lab is transforming research methodology: its agents can process an average of 100 research articles per hour, and multi-threading can increase this speed by a factor of 3 to 5, saving over 100 times the time of traditional manual literature extraction. The system also provides a comprehensive analysis of potential reagents relevant to the research topic and can effortlessly summarize ultra-high-dimensional variable spaces that human researchers struggle with.

From the 500 articles analyzed, 50 potential reagents were identified, from which 18 were selected with a relevance score of 80% or higher. These include eight key material candidates, and for each the system identifies its experimental role, intended use, source, and rationale for relevance. This information is provided to researchers to assist them in making choices based on their expertise and experimental needs. The figure below shows an example of a conversation with an agent.

Compared to GPT alone, this system shows greater accuracy and feasibility: while many of the materials suggested by GPT alone fail to meet the requirements of the subsequent robotic experiments, the materials suggested by GPT-Lab have been shown to be suitable and feasible for the actual robotic experimental setting. The figure below lists the substances used in GPT-Lab.

Further demonstrating the versatility of this approach, we are exploring applications beyond the discovery of humidity sensor materials. From the search for key materials for perovskite solar cells to the discovery of methods to detect alkaloid content in mulberry leaves, a wide range of applications have been shown to be possible. Through these explorations, we have confirmed the applicability of our research beyond a single application domain to the discovery of a wide variety of materials and methods.

Experiments: Robot experiments conducted

The selected reagents are divided into three groups: colorants, additives, and solvents. Cobalt chloride (CoCl2), nickel iodide (NiI2), and nickel bromide (NiBr2) are selected as colorants; calcium chloride (CaCl2), tetramethylammonium iodide (TMAI), polyethylene glycol (PEG), and ethyl cellulose (EC) as additives; and isopropanol (IPA) as the solvent. In the experiment, the amount of each reagent is treated as a variable, giving eight variables in total. Since the overall amount is constant, fixing the amounts of the first seven reagents automatically determines the amount of the last, resulting in a seven-dimensional variable space.
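The fixed-total constraint can be sketched as a recipe sampler: draw the seven free reagent amounts, then let the solvent fill the remainder. The total volume figure and the per-reagent cap below are our assumptions for illustration; the paper does not give these numbers.

```python
import random

REAGENTS = ["CoCl2", "NiI2", "NiBr2", "CaCl2", "TMAI", "PEG", "EC"]
TOTAL_UL = 200.0  # total liquid volume per well; this figure is our assumption

def random_recipe(rng):
    """Draw amounts for the seven free reagents, then let the solvent
    (IPA) make up the remainder, so the space is 7-dimensional."""
    # Cap each reagent so the seven together never exceed the total,
    # guaranteeing a non-negative solvent amount.
    amounts = {r: rng.uniform(0.0, TOTAL_UL / len(REAGENTS)) for r in REAGENTS}
    amounts["IPA"] = TOTAL_UL - sum(amounts.values())
    return amounts

rng = random.Random(0)
recipe = random_recipe(rng)
print(recipe)
```

This is how the first round of 96 random recipes described below could be generated; later rounds replace the uniform draw with suggestions from the optimizer.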

The conduct of the experiments closely follows the previously reported DBTM process, an efficient algorithm-driven procedure implemented on the robotics experiment platform. An overview is shown in the figure below.

(a) shows a schematic diagram of the liquid-handling workstation, which provides a state-of-the-art research area. (b) shows the functional modules of the workstation, including the stock solution area, pipette-tip area, recipe-setting area, and sensing-unit manufacturing area, which together enhance research efficiency and accuracy. (c) is an image of the sensing unit: each colored dot represents a gas sensing unit, and its color is identified by a computer vision algorithm, enabling advanced analysis of gas sensing performance. (d) is a schematic of the gas pathway, where the nitrogen (N2) flow is split into two paths through a desiccator and a humidifier, controlled by two mass flow controllers (MFCs); this provides a range of relative humidity (RH) for testing the sensing unit. (e) is the gas test setup, which includes a darkroom, light source, camera, and gas chamber. The gas sensing unit is placed in a transparent chamber inside the darkroom, where the light source provides uniform lighting and the camera records in detail the color changes under different conditions, allowing precise data collection.

The optimal recipe is found quickly by adjusting the parameters according to the user's requirements. Specifically, the cycle is "recipe generation - preparation by robot - testing by robot - data processing - generation of the next recipe".

The robotic system consists of a liquid processor and a self-constructed darkroom. The preparation process takes place in the liquid processor, while the testing is performed in the darkroom. During testing, nitrogen gas of different humidity levels is passed through the sample, which is fixed in the gas chamber. A camera continuously records the color change under consistent lighting conditions and produces a curve showing the relationship between color and time. From this curve, indicators such as color change range, reaction time, reversibility, and sensitivity are calculated. These indices are evaluated comprehensively to derive a final score. This iterative process is guided by a Bayesian optimization algorithm, which guides the next sample selection in the direction of greater uncertainty or greater potential for score improvement. In the actual experiment, 96 samples are collected in one batch. After the initial 96 recipes are randomly generated, subsequent rounds of recipes are created using a Bayesian strategy. Each round includes exploration and utilization trends.
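The explore/exploit loop above can be sketched as a batch-suggestion function. The paper uses Bayesian optimization; the crude nearest-neighbour surrogate below (predicted score plus a distance-based uncertainty bonus, an upper-confidence-bound style acquisition) is only a stand-in to illustrate the loop structure, and every name and parameter here is our assumption.

```python
import random, math

def suggest_batch(history, n_candidates=1000, batch=96, kappa=1.0, rng=None):
    """Suggest one round of recipes from scored history.
    history: list of (recipe_vector, score) pairs.
    kappa trades off exploitation (predicted score) against
    exploration (uncertainty, proxied by distance to known samples)."""
    rng = rng or random.Random()
    dim = len(history[0][0])

    def predict(x):
        # nearest-neighbour score as the mean; distance as uncertainty proxy
        dist, score = min((math.dist(x, h), s) for h, s in history)
        return score, dist

    candidates = [[rng.random() for _ in range(dim)] for _ in range(n_candidates)]
    # UCB-style acquisition: mean + kappa * uncertainty, keep the top batch
    scored = sorted(candidates,
                    key=lambda x: -(predict(x)[0] + kappa * predict(x)[1]))
    return scored[:batch]

rng = random.Random(1)
# toy history standing in for one measured round of 96 recipes
history = [([rng.random() for _ in range(7)], rng.random()) for _ in range(96)]
next_round = suggest_batch(history, rng=rng)
```

A larger kappa pushes suggestions toward unexplored regions, which is the "deliberate tendency toward exploration" the score-distribution discussion below attributes to the later rounds.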

The distribution of sample scores for the different experimental batches is shown in Figure (a) below. As the number of rounds increases, the highest score per round gradually rises. From the third round onward, many samples begin to concentrate near a score of zero. This may result from a deliberate bias toward exploration to avoid falling into a locally optimal solution; recipes chosen under this exploratory tendency are more uncertain and more prone to extreme values, and thus score lower. After 5 rounds of experimentation and the accumulation of 480 samples, the highest scores no longer increase significantly. The score distribution in the fifth round is also more spread out: more samples achieve high scores, but more samples also fall close to zero than in the previous round. This suggests that finding a good recipe under high uncertainty is difficult and that the current optimal recipe is approaching a quasi-global optimum.

Figure (b) above shows the total amount of each substance used in the 96 recipes for each of the five iterative rounds. Because the recipes were randomly generated in the first round, the proportions of each substance within a particular recipe may vary significantly, but the total proportions are similar. As the iterations progressed, there was an overall increasing trend in the use of CoCl2 and an overall decreasing trend in the use of CaCl2, NiBr2, and TMAI, which were gradually eliminated. This trend suggests that recipes with more CoCl2 may yield better results, while CaCl2, NiBr2, and TMAI have limited or opposite effects.

Two selected recipes, both excluding NiBr2 and TMAI, are shown in Figure (c) above. Recipe 1 contains a small amount of NiI2 and Recipe 2 a small amount of CaCl2, to increase sensitivity under low- and high-humidity conditions, respectively. As shown in Figure (d) above, an array consisting of these two recipes accurately quantifies relative humidity (RH) from 5% to 95% at room temperature with a root mean square error (RMSE) of 2.68%.
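The RMSE metric used to score RH prediction is standard; for clarity, here is the computation on toy values (the numbers below are illustrative, not the paper's data).

```python
import math

def rmse(predicted, actual):
    """Root mean square error: sqrt of the mean squared difference."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

# Toy RH values in %; the paper reports RMSE = 2.68% RH over 5-95% RH.
print(rmse([10.0, 50.0, 90.0], [12.0, 49.0, 93.0]))
```

An RMSE of 2.68% RH means the array's predictions deviate from the true humidity by under three percentage points on average across the tested range.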


GPT-Lab delivers three main contributions. First, it demonstrates the superior performance of GPT in experimental design. Second, it shows the potential of an automated process from experiment proposal to tangible results. Third, chemists without computer science expertise can effectively use the robotic experimental platform, dramatically improving experimental efficiency. Indeed, the dye humidity sensor, built within a week with little human intervention, predicts relative humidity from 5-95% at room temperature with an error of 2.68%.

However, this experimental process has also revealed several challenges. The intelligence of GPT is limited, and inaccuracies in its output can be problematic. When a response is incorrect, programmatic verification and retries are needed to ensure the robustness of the agent, which increases the cost of using GPT. In addition, while GPT-Lab saves researchers time by eliminating literature review and experimental work, it limits their ability to acquire domain-specific knowledge beyond the surfaced literature, meaning researchers must still manually filter experimental parameters.

Solutions may include training larger models with richer chemical knowledge, or expanding GPT's knowledge range through fine-tuning on knowledge graphs and large data sets. As larger models are developed, chemical research is expected to become more efficient and streamlined.
