Your top 3 Venn diagram tools

Venn diagrams are simple yet very useful tools for showing all possible logical relations between a finite collection of data sets. The sets are usually drawn as overlapping circles: data shared between two sets sit in the intersection, while data unique to one set remain outside it.

Venn diagrams in biology

In biology and omics, Venn diagrams can be used for a variety of purposes, such as comparing different lists of genes or proteins (generally 2 or 3) to identify similarities and represent them in two dimensions. Most software tools allow easy extraction of the data and let you customize the diagrams.

Continuing our series on data visualization tools, OMICtools members voted for their favorite Venn diagram software and websites. Here are the results from 54 voters.

Your number 1 Venn diagram generation tool: Venny

Half of you (50%) chose Venny as your favorite tool to generate Venn diagrams.

Venny is a web tool created by Juan Carlos Oliveros of the BioinfoGP service at the Spanish National Biotechnology Centre. It can be used online or offline to generate Venn diagrams from up to four lists.

Its straightforward interface lets you create diagrams and extract data in 3 basic steps:

  1. Paste your lists of data (one element per row) and rename the lists
  2. Click on the numbers to get exclusive and common data between lists
  3. Right-click the figure to view and save the diagram
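
The exclusive and common data that step 2 returns correspond to plain set differences and intersections. Below is a minimal Python sketch of that idea for two lists. It is not Venny's own code, and the function name `compare_lists` and the gene lists are hypothetical.

```python
def compare_lists(list_a, list_b):
    """Return the three regions of a two-set Venn diagram."""
    a, b = set(list_a), set(list_b)  # duplicates within a list are ignored
    return {
        "only_a": sorted(a - b),   # exclusive to the first list
        "common": sorted(a & b),   # shared by both lists
        "only_b": sorted(b - a),   # exclusive to the second list
    }


# Hypothetical gene lists, one identifier per entry
treated = ["TP53", "BRCA1", "MYC", "EGFR"]
control = ["MYC", "EGFR", "KRAS"]

regions = compare_lists(treated, control)
for name, genes in regions.items():
    print(f"{name}: {len(genes)} {genes}")
```

The counts printed for each region are the numbers that would appear inside the corresponding area of the diagram.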

Venny allows basic customization of your diagrams (line weight, font size and style).

Example Venn diagram generated with Venny

Shared second place for BioVenn and Venn diagram

BioVenn and Venn diagram tied for second place, each chosen by 43% of the OMICtools community as a favorite tool!

BioVenn

BioVenn is a web application developed at the Centre for Molecular and Biomolecular Informatics that creates Venn diagrams from up to 3 sets of data. Unlike Venny's diagrams, those drawn by BioVenn are area-proportional, which means that the sizes of the circles and of the overlaps correspond to the sizes of the data sets. BioVenn also comes with interesting features, such as direct upload of a data set from a tab file and support for a wide range of identifiers, which can be linked to biological databases.
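
To make "area-proportional" concrete, here is a small Python sketch for the two-set case. It assumes each circle's area equals its set size and finds, by bisection, the distance between centres at which the overlap area equals the number of shared elements. This is only one common way to lay out such a diagram, not necessarily how BioVenn does it, and the function names and set sizes are hypothetical.

```python
import math


def radius_for(size):
    """Radius of a circle whose area equals the set size."""
    return math.sqrt(size / math.pi)


def _clamp(x, low, high):
    """Guard against tiny floating-point overshoots."""
    return max(low, min(high, x))


def lens_area(r1, r2, d):
    """Overlap area of two circles with radii r1 and r2, centres d apart."""
    if d >= r1 + r2:
        return 0.0  # circles do not touch
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2  # smaller circle fully inside
    a1 = r1**2 * math.acos(_clamp((d**2 + r1**2 - r2**2) / (2 * d * r1), -1, 1))
    a2 = r2**2 * math.acos(_clamp((d**2 + r2**2 - r1**2) / (2 * d * r2), -1, 1))
    tri = 0.5 * math.sqrt(
        max(0.0, (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
    )
    return a1 + a2 - tri


def centre_distance(size_a, size_b, size_common, tol=1e-9):
    """Return (r1, r2, d) so that the overlap area equals size_common."""
    r1, r2 = radius_for(size_a), radius_for(size_b)
    lo, hi = abs(r1 - r2), r1 + r2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lens_area(r1, r2, mid) > size_common:
            lo = mid  # too much overlap: move the circles apart
        else:
            hi = mid  # too little overlap: move them closer
    return r1, r2, (lo + hi) / 2


# Hypothetical sizes: 100 and 50 elements, 20 of them shared
r1, r2, d = centre_distance(100, 50, 20)
print(round(r1, 3), round(r2, 3), round(d, 3))
```

With three sets, an exact area-proportional layout with circles is not always possible, so tools generally have to approximate.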

Venn diagram generated with BioVenn

Venn diagram

Venn diagram was developed by the VIB-UGent Center for Plant Systems Biology at Ghent University. This web application allows users to draw Venn diagrams from up to 6 data lists, in a symmetric or non-symmetric fashion. The diagrams can then be downloaded in SVG or PNG format. Moreover, Venn diagram can calculate the intersections of up to 30 different lists, making it a useful tool to identify common values between multiple data sets.
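
When many lists are involved, a diagram becomes hard to read, but the intersections themselves are still easy to compute. The sketch below shows one possible way to do it in Python: each element is grouped by the exact combination of lists it appears in. It illustrates the idea only and is not the tool's implementation; the function name `intersections` and the example lists are hypothetical.

```python
from collections import defaultdict


def intersections(named_lists):
    """Group elements by the exact combination of lists they appear in."""
    membership = defaultdict(set)
    for name, items in named_lists.items():
        for item in set(items):
            membership[item].add(name)

    regions = defaultdict(list)
    for item, names in membership.items():
        regions[tuple(sorted(names))].append(item)
    return {combo: sorted(items) for combo, items in regions.items()}


# Hypothetical lists; the same code works for any number of lists
lists = {
    "A": ["g1", "g2", "g3"],
    "B": ["g2", "g3", "g4"],
    "C": ["g3", "g5"],
    "D": ["g3", "g4"],
}

result = intersections(lists)
for combo, items in sorted(result.items()):
    print(" & ".join(combo), "->", items)
```

Only combinations that actually contain elements are reported, which keeps the output manageable even though 30 lists allow more than a billion possible combinations.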

Venn diagram generated with the Venn diagram software

Bronze medal for InteractiVenn

A very close third place goes to InteractiVenn, chosen by 41% of voters as their favorite tool.

InteractiVenn is a sophisticated and flexible web-based tool that allows creation of Venn diagrams from up to 6 lists and analysis of set unions, while preserving the shape of the diagram. By displaying partial unions, the user is able to locate regions that combine unions of sets and their intersections, thus providing additional observations on the interactions between joined sets.

With InteractiVenn, the user can choose text size, color and opacity, and can export the diagram in a vector format. Datasets can also be saved locally for later use on the website.

Venn diagram generated with InteractiVenn

References

(Oliveros, J.C., 2007-2015) Venny. An interactive tool for comparing lists with Venn’s diagrams. http://bioinfogp.cnb.csic.es/tools/venny/index.html

(Hulsen et al., 2008) BioVenn – a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics.

Venn diagram: http://bioinformatics.psb.ugent.be/webtools/Venn/

(Heberle et al., 2015) InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams. BMC Bioinformatics.

Single-cell RNA sequencing in immunology

Single-cell RNA sequencing (scRNA-seq) has revolutionized the study of the immune system and now has a wide range of applications in immunology. This technology spans the whole genome and provides an unbiased gene expression profile of individual cells.

Bulk vs single-cell RNA-seq

Traditional bulk RNA-seq is often performed on well-identified groups of cells thought to be homogeneous. However, a bulk measurement averages the signal over millions of cells and therefore ignores cell-to-cell heterogeneity, which is a hallmark of adaptive immune cell subsets such as B and T lymphocytes.

The need to identify new and discrete immune cell populations and to understand molecular changes that occur at the single-cell level has favored the development of low-input RNA-seq protocols, which now have many different applications and come with a range of new analysis tools.
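
A toy calculation shows why averaging hides heterogeneity. The Python sketch below uses made-up expression values for a single gene in ten cells: the mean that a bulk measurement would report does not match any individual cell.

```python
from statistics import mean

# Hypothetical expression values (arbitrary units) for one gene in ten cells:
# five cells do not express the gene, five express it strongly.
cells = [0.0, 0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0, 10.0]

bulk_estimate = mean(cells)  # what an averaged, bulk-style measurement reports

expressing = [x for x in cells if x > 0]
fraction_expressing = len(expressing) / len(cells)
mean_in_expressing = mean(expressing)

print("bulk-style average:", bulk_estimate)
print("fraction of expressing cells:", fraction_expressing)
print("average within expressing cells:", mean_in_expressing)
```

The bulk-style average sits halfway between the two subpopulations, whereas the per-cell values reveal that only half of the cells express the gene at all.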

Single-Cell RNA-Sequencing Analysis Outline. From Neu et al.

Main applications of scRNA-seq in immunology

Identification of new cell types and functions

By spanning the whole genome in search of unknown molecular markers, scRNA-seq can be used to identify new cell types and functions. While traditional qPCR approaches are sensitive and easy to perform, they require prior knowledge and are based on the measurement of a preselected pool of genes, which introduces bias. Using scRNA-seq technology in the context of immune response to a stimulus (infection, vaccination, autoimmunity) can lead to the identification of new activities and functions. Read alignment and expression quantification tools originally designed for bulk RNA-seq, including STAR, RSEM and kallisto, have now been successfully adopted for scRNA-seq data.

Characterization of heterogeneous populations

Adaptive immune cells such as B and T lymphocytes use V(D)J recombination to generate a highly diverse repertoire of receptors to recognize antigens. By combining single-cell identification of clonotypes with cell phenotype (e.g., responsive, autoreactive or anergic), researchers can find strategies to augment or lower specific immune responses. Several tools, such as TraCeR, BASIC, and ImReP, have now been developed to help you reconstruct full-length T and B cell receptors from scRNA-seq data.

TCR sequences assembled from scRNA-seq reads during Salmonella infection in mice. From Stubbington et al.

Mapping transition states and cell fate decisions

Immune cell populations arise from precursor cells and go through a succession of checkpoints and states before becoming fully mature and functional. Mapping transition states and cell lineages with scRNA-seq can provide insights into developmental aspects of the immune system in health and disease. Dedicated tools such as Monocle and TSCAN let you order individual cells in pseudotime along bifurcating developmental trajectories.

Bifurcating pseudotime trajectory. From Stubbington et al.

Personalized medicine

In the near future, scRNA-seq could revolutionize the field of personalized medicine in cancer by enabling researchers to identify individual clones and biomarkers in a tumor, and select precision drugs for each of them. Because one particular tumor cell can drive drug resistance or metastasis, scRNA-seq can provide critical information for rapid and personalized treatment. Of particular interest, the ESTIMATE algorithm can be applied to scRNA-seq data to identify the tumor phenotype and the proportion of tumor, immune, or stromal cells.

scRNA-seq applications in cancer medicine. From Shalek and Benson.

Future directions

From flow cytometry to microscopy, the study of the immune system has often relied on technologies that operate at a single-cell resolution. With next-generation sequencing (NGS) technologies becoming cheaper, scRNA-seq will probably be routinely used by researchers in the near future.

Upcoming challenges will include data management and development of integrated multiplex tools to combine transcriptomics with other genomic data.

Based on recent papers:

(Neu et al., 2016) Single-Cell Genomics: Approaches and Utility in Immunology. Trends in Immunology

(Papalexi and Satija, 2017) Single-cell RNA sequencing to explore immune cell heterogeneity. Nature Reviews Immunology

(Shalek and Benson, 2017) Single-cell analyses to tailor treatments. Science Translational Medicine.

(Stubbington et al., 2017) Single-cell transcriptomics to explore the immune system in health and disease. Science.

Selecting and preparing proteins for virtual screening

Dr. Leandro Radusky and his team have come up with a new web service, LigQ, that allows fast and efficient identification of potential binders to a desired target before starting a virtual screening procedure. Here, they describe the functionality of the LigQ tool and discuss how you can use it to select and prepare your proteins for virtual screening.

The need for compound selection before virtual screening

A major aspect of drug discovery involves the identification of new compounds that are able to bind a protein and control its activity. In silico virtual screening is one of the most powerful and widely used techniques to search for lead compounds that bind to a protein of interest with moderate to high affinity.

Since the main cost of a virtual screening project is directly related to the number of compounds to be tested experimentally, and given that a relatively low number of compounds is typically selected, it is crucial that this set contains as many true binders as possible.

The LigQ workflow and pipeline

Flow chart of the LigQ webserver pipeline

LigQ is organized into four independent modules that can be used sequentially to perform all virtual screening preparation steps:

1. Pocket Detection Module

This module allows users to find the optimal ligand binding pocket for a given protein target.

2. Ligand Detection Module

With this module, users can search a database to find a group of potential binders to the desired protein based on similarity to known binders. Potential ligands are then retrieved and shown on the website as figures or in 3D with the JSmol visualizer.

3. Extend Ligand Set Module

With this module, the user can extend the list of compounds found in the previous steps by searching the LigQ database for chemically similar compounds, with similarity measured by the Tanimoto index.
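
For readers unfamiliar with the Tanimoto index, the Python sketch below shows how it is computed for two binary fingerprints, each represented as the set of bit positions that are switched on: the number of shared bits divided by the number of bits present in either fingerprint. The fingerprints, compound names and the 0.5 cutoff are hypothetical, chosen only for illustration, and say nothing about the fingerprints or thresholds LigQ actually uses.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto index of two fingerprints given as sets of 'on' bit positions."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


# Hypothetical fingerprints (sets of bit positions that are set)
query = {1, 4, 7, 9}
library = {
    "cmpd_1": {1, 4, 7, 8},
    "cmpd_2": {2, 3, 5},
    "cmpd_3": {1, 4, 7, 9, 12},
}

threshold = 0.5  # illustrative cutoff, not a LigQ parameter
scores = {name: tanimoto(query, fp) for name, fp in library.items()}
hits = {name: score for name, score in scores.items() if score >= threshold}

for name, score in sorted(hits.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 2))
```

The index ranges from 0 (no bits in common) to 1 (identical fingerprints), so extending a ligand set amounts to keeping the database compounds that score above a chosen cutoff against a known binder.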

4. Ligand Structure Generation Module

This module generates 3D structures of the enantiomers and tautomers of the desired ligands for use in molecular docking experiments.

An effective and time-saving tool

Run as a pipeline, LigQ needs only a UniProt protein accession to produce both the docking grid of the most probable binding site of the target protein and the candidate compounds in a three-dimensional format, ready for in silico virtual screening computations.

Since each module can also be executed separately, the user can obtain the most druggable pockets of a protein, a list of compounds with known binding affinity for a protein, an extended set of candidates based on similarity, or the most favorable geometries for a list of compounds.

The LigQ pipeline was also demonstrated to be very effective in retrieving a list of compounds enriched in true binders over commonly used benchmarking sets of proteins.

Reference

(Radusky et al., 2017) LigQ: A Webserver to Select and Prepare Ligands for Virtual Screening. Journal of Chemical Information and Modeling.