A BERT-based Model For Predicting The Fate Of mRNAs, The Carriers Of Genetic Information, Is Now Available!
3 main points
✔️ m6A-BERT-Deg is proposed as a model that predicts, from its modification state, whether an mRNA (the molecule that carries genetic information) will be degraded
✔️ Prediction accuracy improves over both a version without pre-training and previous models
✔️ Analysis of BERT token contributions also points to a possible new biological mechanism
Understanding YTHDF2-mediated mRNA Degradation By m6A-BERT-Deg
The images used in this article are from the paper, the introductory slides, or were created based on them.
Introduction
Prerequisite Knowledge 1 (about mRNA)
The genetic information of humans and other living organisms is stored in DNA. To actually control body functions, that information must first be copied into a molecule called mRNA, and the mRNA is then used to synthesize proteins.
In other words, the information in DNA must pass through an intermediate medium, mRNA, before it is converted into the proteins that control body functions. As an analogy, DNA is a recipe book, mRNA is a copy of a recipe from that book, and the protein is the finished dish.
Prerequisite Knowledge 2 (Regulation of mRNA function)
mRNAs are sometimes decorated with chemical marks called modifications. A typical example is the m6A modification. An mRNA strand is built from four units, written A, G, C, and U, and m6A (N6-methyladenosine) denotes an A that has a methyl group attached to the nitrogen atom at its position 6.
Such modifications are known to attract proteins that regulate the degradation of the mRNA itself. However, m6A modification does not always lead to mRNA degradation, and the detailed mechanism has not yet been fully elucidated.
The regulation of mRNA stability is closely tied to various cellular and biological processes, including cancer stem cells in acute myeloid leukemia, so a clearer picture of the mechanism has long been awaited.
Research Background
To address this, the study developed a model called m6A-BERT to predict whether mRNAs with m6A modifications are degraded. The authors then fine-tuned this model on mRNA lifetime (half-life) data and proposed the resulting improved model as m6A-BERT-Deg.
The half-life of an mRNA is related to the rate at which the mRNA is degraded and is an important parameter in understanding the mechanism of degradation.
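To make the link between half-life and degradation rate concrete, here is a minimal sketch that assumes simple first-order (exponential) decay. This is a common textbook model and an assumption on my part, not something taken from the paper. The function names and the rate constant of 0.1 per hour are hypothetical.

```python
import math

def half_life_from_rate(k: float) -> float:
    """Half-life under first-order decay N(t) = N0 * exp(-k * t).

    Assumption: degradation follows a single exponential with rate constant k.
    """
    if k <= 0:
        raise ValueError("decay rate must be positive")
    return math.log(2) / k

def fraction_remaining(k: float, t: float) -> float:
    """Fraction of the initial mRNA still present after time t."""
    return math.exp(-k * t)

k = 0.1  # hypothetical degradation rate constant, per hour
t_half = half_life_from_rate(k)
print(round(t_half, 2))  # 6.93 (hours)
```

Under this assumption a faster degradation rate means a shorter half-life, which is why half-life data can act as a label for whether an mRNA is degraded quickly.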
The effectiveness of m6A-BERT-Deg was confirmed by its higher accuracy compared with other state-of-the-art deep learning methods.
Model Structure
Overall Model
Summary
In this study, the BERT-based m6A-BERT-Deg was proposed as a model to predict mRNA degradation associated with m6A modification.
The model is trained in three steps: mRNA sequences are tokenized as strings, the model is pre-trained to predict masked tokens, and it is then fine-tuned with an added binary classification layer that predicts degradation.
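The first two steps, tokenization and masking, can be sketched roughly as follows. This article does not describe the exact tokenizer, so the overlapping 3-mer scheme, the 15% masking probability, and the function names are all assumptions made for illustration. Real BERT pre-training also sometimes keeps or randomly replaces the selected tokens, which this simplified version omits.

```python
import random

MASK = "[MASK]"

def kmer_tokenize(seq: str, k: int = 3) -> list[str]:
    """Split an RNA sequence into overlapping k-mers (stride 1).

    Assumption: k-mer tokens; the paper's actual tokenizer may differ.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as RNA
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def mask_tokens(tokens: list[str], mask_prob: float = 0.15, seed: int = 0):
    """Replace a random subset of tokens with [MASK].

    Returns the masked token list and a dict {position: original token}
    that a masked-token prediction objective would try to recover.
    """
    rng = random.Random(seed)
    masked, labels = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels[i] = tok
        else:
            masked.append(tok)
    return masked, labels

tokens = kmer_tokenize("GGACUAGGACU", k=3)
masked, labels = mask_tokens(tokens, mask_prob=0.15)
print(tokens)
print(masked)
```

During pre-training the model sees `masked` and is trained to recover the entries in `labels`. Fine-tuning then reuses the same token representations for the degradation classifier.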
The model outperformed both a version without pre-training and previously proposed models. Its predictions were also validated in experiments with real cells.
Further analysis using an attribution score based on token contribution revealed high scores upstream of the m6A modification site, suggesting that this region is important for regulating mRNA degradation.
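To give a feel for what a per-token attribution score is, the sketch below uses occlusion: each token is masked in turn, and the drop in the model's output is recorded. This is a deliberately simple stand-in and not necessarily the attribution method used in the paper. The scoring function `toy_score` is a hypothetical placeholder for a trained classifier.

```python
from typing import Callable, Sequence

def occlusion_attribution(
    tokens: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    mask: str = "[MASK]",
) -> list[float]:
    """Attribution of each token = score(original) - score(token masked)."""
    base = score_fn(tokens)
    scores = []
    for i in range(len(tokens)):
        occluded = list(tokens)
        occluded[i] = mask
        scores.append(base - score_fn(occluded))
    return scores

def toy_score(tokens: Sequence[str]) -> float:
    """Hypothetical classifier output: rewards 'GGA' (1.0) and 'ACU' (0.5)."""
    weights = {"GGA": 1.0, "ACU": 0.5}
    return sum(weights.get(tok, 0.0) for tok in tokens)

example = ["GGA", "GAC", "ACU", "CUA"]
attributions = occlusion_attribution(example, toy_score)
print(attributions)  # [1.0, 0.0, 0.5, 0.0]
```

Plotting such scores against position relative to the m6A site is the kind of view that would reveal a region with consistently high contributions.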
Personally, I think the strength of a BERT-based model is that examining the contributions of its embedding layers lets us bring biological background knowledge into the interpretation. It is remarkable that this kind of analysis can now help elucidate new mechanisms.