Leng Han and Chunjiang He are the creators of the tissue-specific circular RNA (circRNA) database TSCD. They performed the first global analysis of tissue-specific circRNAs and collected these data in a comprehensive database. Here, they talk about their work and how the TSCD database can help researchers explore their RNA sequencing data.
A repository of more than 300,000 tissue-specific circRNAs
With circRNAs attracting more attention in transcriptome research, we explored the global features of tissue-specific circRNAs in embryo development and organ differentiation. To identify tissue-specific circRNAs, we applied three algorithms (CIRI, circRNA_finder and find_circ) to RNA-seq data collected from the ENCODE project and the NCBI GEO database.
Based on the major types of circRNA, we identified more than 300,000 tissue-specific circRNAs in different tissues. Our analyses indicated that tissue-specific circRNAs were mainly derived from exons, although they can also be derived from introns or intergenic regions. The majority are generated from protein-coding genes, which suggests that these circRNAs may be associated with mRNA translation or may act as an mRNA backup.
Among all circRNAs, 10.4% of human circRNAs and 34.3% of mouse circRNAs are tissue-specific, which suggests a link with tissue development. We also observed that tissue-specific circRNAs are unevenly distributed across tissues: more of them are expressed in the brain than in other tissues (89,137 were identified in fetal brain), which may be due to the complexity of neuronal activity in the brain.
Abundance of tissue-specific (TS) circRNAs across different tissues: (A) 16 adult human tissue types; (B) 15 fetal human tissues; and (C) 9 mouse tissues. Values are in log2 of SRPTM: number of circular reads / number of mapped reads (units in trillion) / read length.
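The normalization given in the caption can be written out as a short calculation. The Python sketch below is illustrative only: it assumes that "units in trillion" means the mapped-read count is divided by 10^12 before use, and the function names are hypothetical rather than part of TSCD.

```python
import math


def srptm(circular_reads: int, mapped_reads: int, read_length: int) -> float:
    """Circular reads / mapped reads (in trillions) / read length.

    Assumption: "units in trillion" means mapped_reads is scaled by 1e12.
    """
    if mapped_reads <= 0 or read_length <= 0:
        raise ValueError("mapped_reads and read_length must be positive")
    mapped_in_trillions = mapped_reads / 1e12
    return circular_reads / mapped_in_trillions / read_length


def log2_srptm(circular_reads: int, mapped_reads: int, read_length: int) -> float:
    """log2 of SRPTM, the scale used in the figure."""
    value = srptm(circular_reads, mapped_reads, read_length)
    if value <= 0:
        # log2 is undefined when no circular reads were observed.
        raise ValueError("at least one circular read is required")
    return math.log2(value)
```

Under this reading, 10 circular reads in a library of 100 million mapped reads of length 100 give an SRPTM of about 1,000, or roughly 9.97 on the log2 scale.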
Functional enrichment analysis revealed that tissue-specific circRNAs are largely associated with tissue development and differentiation. To understand the potential functions of tissue-specific circRNAs, we identified a significant number of miRNA binding elements (MREs) and RNA-binding protein (RBP) binding sites.
Finding a tissue-specific circRNA in TSCD
Users can easily browse TSCD content via a browser page and can view tissue-specific circRNAs by selecting:
- Human adult tissue, human fetal tissue, or mouse tissue
- And one of the 26 individual tissue types including adipose, adrenal, blood vessel, brain, esophagogastric, esophagus, eye, female gonad, heart, intestine, kidney, liver, lung, mammary gland, pancreas, skeletal muscle, skin, spleen, stomach, testis, thymus, thyroid gland, tibial nerve, tongue, umbilical cord, and uterus.
Data organization and visualization on the TSCD web interface
All data have been organized into a set of relational MySQL tables. Customized Java and PHP scripts were used to construct the database interface. The visualization page displays the coordinates of each circRNA.
The index page lets users query information on TS circRNAs by chromosome, start and end site, junction reads, conservation, genomic location, etc.
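To make the idea of a coordinate-based query concrete, here is a minimal Python sketch. It is not TSCD's implementation: the table name, column names and rows are hypothetical, and SQLite stands in for MySQL only so that the example runs without a database server.

```python
import sqlite3

# Hypothetical schema: TSCD's real MySQL tables and column names are not
# described in this article.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE circrna (
        circ_id        TEXT PRIMARY KEY,
        tissue         TEXT,
        chrom          TEXT,
        start_pos      INTEGER,
        end_pos        INTEGER,
        strand         TEXT,
        junction_reads INTEGER
    )
    """
)
# Made-up rows for illustration only, not real TSCD records.
conn.executemany(
    "INSERT INTO circrna VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("demo_1", "brain", "chr1", 1000, 5000, "+", 12),
        ("demo_2", "liver", "chr1", 4000, 9000, "-", 3),
        ("demo_3", "brain", "chr2", 1000, 5000, "+", 7),
    ],
)


def query_region(chrom, start, end, min_junction_reads=0):
    """Return circRNAs lying entirely within chrom:start-end."""
    cursor = conn.execute(
        """
        SELECT circ_id, tissue, start_pos, end_pos, junction_reads
        FROM circrna
        WHERE chrom = ? AND start_pos >= ? AND end_pos <= ?
          AND junction_reads >= ?
        ORDER BY start_pos
        """,
        (chrom, start, end, min_junction_reads),
    )
    return cursor.fetchall()


hits = query_region("chr1", 500, 6000, min_junction_reads=5)
```

Parameterized placeholders (`?`) keep user-supplied coordinates out of the SQL string, which is the usual choice when a query is driven by a web form.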
Web interface of TSCD.
- Users can view comprehensive information such as tissue category, circRNA ID, coordinates of backsplice sites, genomic locations, junction reads, strand information, genomic spanning length, gene annotation and MRE/RBP sites.
- More importantly, users can visualize the details of tissue-specific circRNA through the gene symbol link. Backsplices of circRNA are represented by arcs: a black arc for non-specific circRNAs, a red arc for tissue-specific circRNAs.
- Annotated exons and introns of reference transcripts are also shown. If the reference genes have multiple transcripts, all transcripts are displayed. If the circRNA is generated from multiple genes, the exon structures of all related genes are displayed to better illustrate the biogenesis of circRNAs. TSCD also provides tables listing the precise coordinates of each circRNA backsplice across different tissues.
Exploring tissue-specific circRNAs with TSCD
TSCD offers several pages that are of benefit to the research community:
- The Browser-hg38|mm10 page, which displays coordinates for each circRNA based on the latest genome versions, GRCh38 and mm10.
- The comparison page, which allows users to compare circRNAs among different tissues.
- The download page, which allows users to batch download tissue-specific circRNAs from all tissues, along with a customized Perl script for identifying tissue-specific circRNAs in their own RNA-seq data.
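As a rough illustration of what comparing circRNAs among tissues involves, the Python sketch below keys each circRNA by its backsplice coordinates and uses set operations. The data are invented, and the rule "detected in exactly one tissue" is a simplifying assumption for the example; it is not the criterion implemented in the TSCD Perl script, which is not described here.

```python
from collections import defaultdict

# A circRNA is keyed by its backsplice coordinates: (chrom, start, end, strand).
# The records below are invented for illustration.
detected = {
    "brain": {("chr1", 1000, 5000, "+"), ("chr2", 200, 900, "-")},
    "liver": {("chr1", 1000, 5000, "+"), ("chr3", 50, 700, "+")},
    "heart": {("chr1", 1000, 5000, "+")},
}


def tissue_specific(detected_by_tissue):
    """Map each tissue to the circRNAs detected in that tissue only.

    Assumption: "tissue-specific" means present in exactly one tissue. This
    is a simplified stand-in, not the criterion used by TSCD.
    """
    tissues_per_circ = defaultdict(set)
    for tissue, circs in detected_by_tissue.items():
        for circ in circs:
            tissues_per_circ[circ].add(tissue)

    result = {tissue: set() for tissue in detected_by_tissue}
    for circ, tissues in tissues_per_circ.items():
        if len(tissues) == 1:
            (only_tissue,) = tissues
            result[only_tissue].add(circ)
    return result


specific = tissue_specific(detected)
# Pairwise comparison: circRNAs found in both brain and liver.
shared_brain_liver = detected["brain"] & detected["liver"]
```

Keying on coordinates plus strand lets calls from different samples be matched without relying on database-specific circRNA IDs.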
(Xia et al., 2016) Comprehensive characterization of tissue-specific circular RNAs in the human and mouse genomes. Brief Bioinform.