
[ChemReasoner] Catalyst Discovery Framework Utilizing Quantum Chemistry And LLM



3 main points
✔️ Identify chemical descriptors relevant to a reaction and use large language models to discover optimal catalysts
✔️ Enhance natural-language reasoning with quantum-chemistry feedback to predict complex catalytic processes
✔️ ChemReasoner integrates linguistic reasoning with quantum-chemistry feedback to improve the efficiency of catalyst discovery

ChemReasoner: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback
written by Henry W. Sprueill, Carl Edwards, Khushbu Agarwal, Mariefel V. Olarte, Udishnu Sanyal, Conrad Johnston, Hongbin Liu, Heng Ji, Sutanay Choudhury
(Submitted on 15 Feb 2024 (v1))
Comments: 
9 pages, accepted by ICML 2024, final version
Subjects: Chemical Physics (physics.chem-ph); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG)

code:  

The images used in this article are from the paper, the introductory slides, or were created based on them.

Summary

Discovering a new catalyst requires finding the best combination of chemical descriptors (properties), yet these choices are often based on rules of thumb: chemists reason about combinations of reactants, catalysts, and operating conditions to achieve more energy-efficient chemical transformations. According to Nørskov et al. (2011), using chemical descriptors to link microscopic surface properties with macroscopic catalytic performance is the key to rapidly generating new hypotheses.

Large language models can enable such data-driven autonomous exploration and accelerate scientific discovery. In this paper, we aim to enhance natural language reasoning capabilities with quantum chemistry feedback in order to discover the optimal catalyst for a target reaction.

Reasoning about complex catalytic processes requires capabilities in multiple modalities beyond those of existing language models, including predicting scientific concepts from the literature and reasoning over 3D atomic structures. Determining the optimal catalyst requires reasoning about multiple macroscopic properties. The first step is to identify the best combination of chemical descriptors (e.g., "toxicity resistance," "porosity") relevant to the reaction. This gives rise to a myriad of possible combinations, which calls for an external reasoner such as a large-scale language model: such models can leverage knowledge of scientific concepts to propose key properties and select the catalysts that best exhibit them. Narrowing the vast space of candidate catalysts then requires reasoning about the complex interactions between atomic structures in 3D space. Furthermore, while simple reactions can be evaluated with the adsorption energies of 3D chemical structures, complex reactions require consideration of multi-step reaction pathways and selectivity.

To address this challenge, this paper proposes a framework that combines large-scale language model-driven heuristic search with structure-based scoring from graph neural networks (GNNs) trained on quantum-chemical simulations of atomic structures.

The framework formulates catalyst discovery as an uncertain environment in which an agent (the LLM) pursues energetically favorable catalysts based on computational-chemistry feedback.

At each step of the search, the agent (1) automatically identifies the optimal set of characteristics to consider, (2) generates new search prompts based on the identified characteristics, and (3) executes the prompts according to advanced instructions.

The candidate catalysts identified in each step are converted into 3D atomic representations of the catalyst-adsorbate structure, which are evaluated for spatial orientation, energy barriers along the reaction pathway, and stability, yielding a reward for catalyst suitability. This reward drives the large-scale language model toward catalysts that enable the reaction with minimal external energy, an important step toward environmentally friendly industrial processes.

This paper introduces ChemReasoner, a novel hypothesis generation and testing framework that integrates the knowledge space of large-scale language models with quantum-chemistry-based feedback. This framework enables natural language-based reasoning with strong domain guarantees derived from computational chemistry methods. In addition, ChemReasoner-Planner, a variant whose search is planned by the large-scale language model itself, is shown to outperform searches based on expert-selected chemical descriptors in two of the three evaluation benchmark categories. The paper also goes beyond screening catalysts solely on adsorption energies, proposing a new method of reasoning about reaction pathways and energy barriers.

Systems and Methods

ChemReasoner's algorithm consists of two major components: (1) a heuristic search in chemical space, planned and guided by a large-scale language model, and (2) quantum-chemistry feedback from a graph neural network (GNN) model trained on density functional theory (DFT) simulations.

The goal of the heuristic search is to systematically explore candidates from different regions of the chemical space for a user-specified natural language query. This is accomplished by taking the original query (or prompt) and the corresponding large-scale language model answer, then applying different screening criteria to incrementally contextualize subsequent prompts and answers into narrower regions of chemical space. This process is illustrated in the figure below. In this paper, the beam search method is used for the heuristic search.

The goal is to explore chemical descriptors in order to design optimal prompts, so that the large-scale language model returns the best candidate catalysts for a catalyst query. Starting with a general prompt P0, the prompt is modified using a set of actions to improve the output of the large-scale language model with respect to a reward function R. Notably, ChemReasoner-Planner generates its own action space A.

We define the search tree as a hierarchical tree consisting of (prompt, answer, reward) nodes. Each node of this tree represents a state in the search space (hereafter referred to as the search or query state). Nodes are linked if an action a ∈ A modifies one prompt to another. The path from the root to the leaf node is called the Reasoning Pathway.
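The (prompt, answer, reward) nodes and the root-to-leaf Reasoning Pathway described above can be sketched as a small data structure. This is an illustrative reconstruction, not the paper's actual code; all names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class QueryState:
    """One node of the search tree: a prompt, the LLM's answer
    (a set of candidate catalysts), and the node's reward."""
    prompt: str
    answer: list[str] = field(default_factory=list)  # candidate catalysts
    reward: float = float("-inf")
    parent: "QueryState | None" = None               # linked by an action a ∈ A

    def reasoning_pathway(self) -> list["QueryState"]:
        """Trace the path from the root down to this node."""
        path, node = [], self
        while node is not None:
            path.append(node)
            node = node.parent
        return list(reversed(path))
```

A leaf's `reasoning_pathway()` then reads off the whole chain of prompt refinements that produced it.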

Following Sprueill et al. (2023), each large-scale language model prompt consists of three structured internal representations: (1) a natural language question, (2) lists of chemical descriptors to include or exclude for the target catalyst, and (3) structured relational operators that describe how to shift the search from the previous query's candidate catalysts to different regions of chemical space.

Three relational operators are used: (1) similarity, (2) subclassification, and (3) dissimilarity. Each large-scale language model answer represents a set of candidate catalysts, and each candidate is scored using a reward function. The search begins at the root node, and each node is expanded into a set of child nodes by an action a. Each layer of the search tree is pruned based on node rewards. Finally, when the maximum search depth is reached, the node with the highest reward is selected as the overall response to the initial prompt.

The planner is responsible for systematically extending the search by determining contextually appropriate actions. Action selection is based on the complete sequence of previous queries and catalyst discoveries tracked from the root of the search tree. This contextual foundation allows automatic constraint of the next search direction in a scientifically consistent manner.

Given any node in the search tree, the planner issues the next query (orange box in the upper left of the figure below); the large-scale language model then executes the query to obtain a set of candidate catalysts (e.g., Cu, Pd). Each of these candidates is converted to a 3D atomic representation and evaluated with a reward function. At any given depth of the search tree, all candidates are collected and only a subset is selected for expansion in the next iteration. This process continues iteratively until the maximum tree depth is reached.

Overall, by leveraging language models to contextually extend the search, ChemReasoner-Planner balances exploration while generating interpretable and scientifically grounded inference paths. Each reward function returns a real value indicating the catalyst's goodness of fit for the input question (the higher the better), and this paper implements two reward functions of different complexity.

The adsorption-energy-based reward scores a catalyst by the adsorption energy of its most stable bonding structure. The calculation begins by converting the symbolic representations of the catalyst (e.g., "platinum") and adsorbate (e.g., "CO") into a 3D atomic structure (right side of the figure below).

The stability and energy of a catalyst's atomic structure directly affect its catalytic activity and selectivity. Therefore, the most stable configuration of each catalyst-adsorbate pair is computed and its adsorption energy is used as the reward. The optimization process, also known as relaxation, iteratively adjusts the atomic positions of the 3D structure until an energy minimum is found. A GNN is then used to calculate the adsorption energy of this state.
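The relax-then-score pipeline above can be sketched as follows. All four callables are hypothetical placeholders: `to_structures` enumerates initial catalyst-adsorbate placements, `relax` stands for the GNN-driven relaxation, and `predict_energy` for the adsorption-energy prediction (GemNet-dT in the paper).

```python
def adsorption_energy_reward(catalyst, adsorbate,
                             to_structures, relax, predict_energy):
    """Score a candidate catalyst by the adsorption energy of its most
    stable binding configuration: relax each initial placement to an
    energy minimum, take the lowest (most stable) adsorption energy,
    and negate it so that stronger binding yields a higher reward."""
    energies = [predict_energy(relax(s))
                for s in to_structures(catalyst, adsorbate)]
    return -min(energies)
```

The sign convention (negating the minimum energy) is one plausible reading of "lower adsorption energy, higher reward"; the paper's exact normalization may differ.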

The reaction-pathway-based reward measures catalyst quality by considering multiple reaction pathways and their intermediate stages. It first retrieves reaction pathways from a large-scale language model and then computes an energy function for each intermediate stage of each pathway. The figure below shows the same reaction pathway for two different catalysts; as it illustrates, the energy required to proceed from one reaction step to the next varies between catalysts.

Intuitively, the transition from a low-energy state to a high-energy state can be viewed as a "hill climb" in the energy landscape (red and blue arrows in the figure), so the reward is formulated as a function that assigns the highest reward to the pathway that climbs the smallest hill. Here ads_t is the intermediate at stage t of the reaction, and E_{ads_t} is the adsorption energy of ads_t on the catalyst.
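One plausible formulation of this "smallest hill" reward, sketched here as an assumption since the article does not reproduce the paper's exact equation: the climb of a pathway is the largest energy increase between consecutive intermediates, and the reward is its negation.

```python
def pathway_reward(adsorption_energies):
    """Reaction-pathway reward sketch: given the adsorption energies
    E_{ads_t} of the intermediates along one pathway, the 'hill climb'
    is the largest energy increase between consecutive steps; the
    pathway with the smallest climb receives the highest reward."""
    climbs = [e_next - e_prev
              for e_prev, e_next in zip(adsorption_energies,
                                        adsorption_energies[1:])]
    return -max(climbs) if climbs else 0.0
```

For example, a pathway whose energies dip and then rise sharply is penalized more than one with a gentle, monotonic profile.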

Experiment

The experiments evaluate whether a system combining heuristic search guided by a large-scale language model with quantum-chemistry feedback can discover novel and effective catalysts better than a state-of-the-art large-scale language model alone. They focus on three main research questions:

  • RQ1. Quantifying performance improvement: does heuristic search with quantum-chemistry feedback produce better candidate catalysts than state-of-the-art large-scale language model queries?
  • RQ2. Characterizing key components: what are the key parameters that control the trade-off between computational complexity and system performance?
  • RQ3. Hypothesis testing of large language models: how can the hypotheses generated by ChemReasoner be tested against domain knowledge, and which areas need further attention to make ChemReasoner's computational screening more accurate and interpretable?

The experiments use an extended version of a chemistry-focused reasoning query benchmark comprising 145 queries, divided into three general categories: OpenCatalyst, BioFuels, and CO2-Fuel. For the first two categories, the queries of Sprueill et al. (2023) are adopted, and the CO2-Fuel subset is newly added.

OpenCatalyst consists of a set of adsorbates taken from the Open Catalyst Project 2020 dataset; the task is to propose catalysts that exhibit strong adsorption for each adsorbate (86 queries). BioFuels targets catalyst discovery for biofuel development (39 queries); these queries are modified to target metal catalysts for the reward calculation. Finally, CO2-Fuel specifically targets the conversion of CO2 into methanol and ethanol (platform molecules), which are used in the production of fuels and chemicals and are raw materials for achieving net-zero goals.

The large-scale language models used in the experiments are OpenAI's GPT-3.5 and GPT-4. LLama2 was initially benchmarked as well, but proved to have limited instruction-following capability in this domain, making it difficult to evaluate. As the GNN reward model, the GemNet-dT model from the Open Catalyst Project is used. The runtime configuration and inference scaling performance for each model are described in Section C.6 of the paper. Inference for the OpenAI models is performed in parallel using asynchronous execution; GNN inference is performed on a single GPU on DGX2/V100 and A100 systems. Under these setups, the effectiveness and efficiency of ChemReasoner are evaluated experimentally.

In the experiments, two variants of ChemReasoner are evaluated. ChemReasoner-Expert is an implementation whose action space is defined by a catalysis expert. These actions (relational operators and descriptors) are as follows:

  1. Inclusion criteria: high activity, high selectivity, low cost, novelty, low toxicity, high binding energy, high conversion efficiency, high availability.
  2. Exclusion criteria: low activity, low stability, low selectivity, low binding energy, high cost, high toxicity, low dispersibility, low porosity, high scarcity, low conversion efficiency.
  3. Types of catalysts: metal catalysts, monometallic catalysts, bimetallic catalysts, and trimetallic catalysts.
  4. Relationship to previous candidate sets: include different elements, include similar elements, introduce new elements, include elements from previous candidates.

These actions are sampled with equal probability, without using the same criterion twice. ChemReasoner-Planner, on the other hand, extends the search space using actions suggested by the large-scale language model, without requiring expert specification.
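The uniform, no-repeat sampling that ChemReasoner-Expert applies to its expert-defined criteria amounts to sampling without replacement. A minimal sketch using a few of the inclusion criteria listed above:

```python
import random

# A subset of ChemReasoner-Expert's expert-defined inclusion criteria
inclusion = ["high activity", "high selectivity", "low cost", "novelty",
             "low toxicity", "high binding energy",
             "high conversion efficiency", "high availability"]

def sample_actions(pool, k, rng=random):
    """Draw k distinct actions with equal probability (no criterion
    is used twice, mirroring sampling without replacement)."""
    return rng.sample(pool, k)
```

ChemReasoner-Planner replaces this fixed pool with actions proposed on the fly by the large-scale language model.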

As the table below shows, both ChemReasoner implementations significantly outperform the GPT-4 baseline. In particular, ChemReasoner-Planner combined with GPT-4 performs best on the OpenCatalyst and BioFuels query categories, while ChemReasoner-Expert performs best on the CO2-Conversion queries.

In addition, as shown in the table below, ChemReasoner-Expert's top-1 prediction is highly similar to current commercial catalysts for methanol synthesis. When calculating the average depth of the nodes containing the best solutions for both ChemReasoner variants, the average search depth is reduced by 11.28% when using GPT-4. This reduction is more pronounced for ChemReasoner-Expert than for ChemReasoner-Planner, indicating that the Planner's performance improvement is achieved through the algorithmic contribution of planning.


The strong performance of ChemReasoner-Expert on the CO2-Conversion queries is remarkable, particularly given that it is based on GPT-3.5-turbo; this performance is likely related to the complexity of the reward function. For queries using the adsorption-energy-based reward (OpenCatalyst and BioFuels), even when a query does not explicitly mention adsorption energy, the concept of a good catalyst held by large language models (generally low cost, high selectivity, etc.) is usually associated with low-adsorption-energy (high-reward) profiles. Thus, the planner effectively uses the large-scale language model as an optimization function to search for energetically favorable catalysts. However, the LLM's concept of a good catalyst may not always match the complex reaction-pathway-based reward functions associated with CO2 conversion. The authors suggest that fine-tuning the large-scale language model with a methodology similar to RLHF (Ouyang et al., 2022) is a promising approach for downstream tasks with complex reward functions.

Summary

In this paper, the authors propose ChemReasoner, a multimodal framework that integrates linguistic reasoning by large-scale language models with atomic-structure-based rewards grounded in the principles of catalysis and quantum chemistry. It enables the development of energy-efficient chemical conversion processes and the proposal of new catalytic structures to combat climate change.

According to the authors, the method is versatile and can be applied to other scientific fields such as biology and chemistry. They also note that strong quantitative and qualitative validation using domain-based methods, and support for proposing new structures not included in the original training data, are important directions for future work.

This research is expected to open new avenues for the development of catalysts to combat climate change and to contribute significantly to the advancement of science and technology.

Takumu
I have worked as a Project Manager/Product Manager and Researcher at internet advertising companies (DSP, DMP, etc.) and machine learning startups. Currently, I am a Product Manager for new business at an IT company. I also plan services utilizing data and machine learning, and conduct seminars related to machine learning and mathematics.

If you have any suggestions for improvement of the content of the article,
please contact the AI-SCHOLAR editorial team through the contact form.
