Redefining long non-coding RNAs (lncRNAs) to study transposons in plants
Long non-coding RNAs (lncRNAs) constitute the new frontier of investigation for molecular biologists. However, the term lncRNA is inconsistently defined, which fails the research community in several ways. In a scientific review [2], Professor Andrzej Wierzbicki from the University of Michigan, Department of Molecular, Cellular and Developmental Biology, and collaborators challenge the contemporary ways of understanding lncRNAs and propose a definition that is based on function and biogenesis. “Here we propose a definition that is clear and specific,” said Wierzbicki, “and we hope that this operational definition will be widely adopted.”
The authors propose the following definition: “a lncRNA is an RNA that has a function independent of its protein-coding potential and that is produced by a mechanism other than molecular ruler-based dicing or trimming.”
Photo: Arabidopsis thaliana (juvenile, flowering and fruiting). For geneticists, Arabidopsis thaliana, a common plant from the mustard (Brassicaceae) family, is considered the equivalent of the fruit fly (Drosophila melanogaster). This plant can easily be used as a model system for identifying genes and determining their functions. Arabidopsis thaliana was the first plant to have its complete genome sequenced, in 2000. [1]
With this new definition, these scientists address two major issues with the traditional term “long non-coding RNAs.” First, the common definition of lncRNAs requires an RNA to be at least 200 nucleotides long to be classified as a long RNA. “200 is an arbitrary number of nucleotides, and the role of lncRNAs in molecular biology is often not about their size,” explained Wierzbicki. With this criterion, RNAs that are under 200 nucleotides but that could have similar functions are left out, skewing the understanding of lncRNA roles. Second, and perhaps even more problematic, is the negative definition of “non-coding.” For Wierzbicki, there is no way to prove that so-called non-coding RNAs (ncRNAs) indeed do not encode any proteins.
To make the lncRNA definition more useful and consistent with the original use of its elements, the team proposes different criteria. It starts by focusing on the function of the RNA within the biology of the cell and distinguishes lncRNAs by the presence of a function that does not rely on their coding potential. This is a much more practical test than the daunting task of proving the lack of any encoded proteins. Wierzbicki’s definition also dismisses the arbitrary threshold of 200 nucleotides to include all RNAs regardless of their size, except for the specific group of small RNAs produced by Dicer, an RNA-cutting enzyme. Instead, the new definition proposes to classify lncRNAs based on how they are produced, which is clearly distinct from small RNA biogenesis. This approach clarifies and strengthens the lncRNA definition and its application.
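The contrast between the two definitions can be sketched as a pair of simple tests. This is only an illustration, not a tool from the review: the record fields, the biogenesis labels and both function names are hypothetical, and the example RNAs are invented to show where the two definitions disagree.

```python
from dataclasses import dataclass

# Hypothetical labels for molecular ruler-based biogenesis (dicing or trimming).
RULER_BASED = {"dicing", "trimming"}


@dataclass
class RNA:
    name: str
    length_nt: int                            # length in nucleotides
    encodes_protein: bool                     # hard to rule out in practice
    functions_independently_of_coding: bool   # has a function not tied to coding potential
    biogenesis: str                           # hypothetical label for how the RNA is produced


def is_lncrna_traditional(rna: RNA, min_length: int = 200) -> bool:
    """Traditional test: long enough and assumed not to encode a protein."""
    return rna.length_nt >= min_length and not rna.encodes_protein


def is_lncrna_proposed(rna: RNA) -> bool:
    """Proposed test: function independent of coding potential, and
    not produced by ruler-based dicing or trimming."""
    return (rna.functions_independently_of_coding
            and rna.biogenesis not in RULER_BASED)


# Invented examples for illustration only.
short_regulator = RNA("short_regulator", 150, False, True, "transcription")
small_rna = RNA("small_rna", 24, False, True, "dicing")

for rna in (short_regulator, small_rna):
    print(rna.name, is_lncrna_traditional(rna), is_lncrna_proposed(rna))
```

In this sketch, the 150-nucleotide regulator is excluded by the length threshold but counts as a lncRNA under the proposed definition, while the diced small RNA is excluded by both.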
lncRNAs to regulate transposons in plants
For Wierzbicki and his collaborators, the traditional definition of lncRNAs proved unhelpful for their research on transposons in plants. A transposon, also known as a jumping gene, is a DNA sequence that can copy itself and/or change its position within a genome, sometimes creating mutations and possibly altering the cell’s genetic identity. Transposable elements constitute a large fraction of the genome. For example, they make up roughly 50% of the human genome. In plants, the proportion of transposons varies and can be up to 80%. While transposons could play an important role in species evolution and adaptation, they are also considered genomic parasites and need to be repressed by the cell so as not to wreak havoc. lncRNAs play an essential role in keeping transposons under control.
RNA is produced by protein enzymes known as RNA polymerases. Most eukaryotic organisms, including humans, have three RNA polymerases in their cells. Plants have two additional RNA polymerases, which are dedicated to producing lncRNAs that recognize and repress transposons. This particularly elaborate machinery demonstrates a special role of lncRNAs in plants. It also offers opportunities to use plants as model organisms to study mechanisms of transposon repression, which are conserved throughout the tree of life.
“These two additional polymerases are essential for recognizing and silencing transposable elements and contribute to making plants so successful. To properly study the role of lncRNAs in the regulation of transposons, we needed to first clearly define lncRNAs,” concluded Wierzbicki.
[1] The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000). https://doi.org/10.1038/35048692
[2] Andrzej T. Wierzbicki, Todd Blevins, and Szymon Swiezewski. Long Noncoding RNAs in Plants. Annual Review of Plant Biology 72, 245–271 (volume publication date June 2021). First published as a Review in Advance on March 22, 2021. https://doi.org/10.1146/annurev-arplant-093020-035446