MolEvolvR: a web-app for molecular evolution and phylogeny


Molecular evolution and phylogeny can provide key insights into protein families. Studying how these proteins evolve across lineages can help identify lineage-specific and conserved signatures and variants and, consequently, their functions. We have developed a streamlined computational approach for the molecular evolution and phylogeny of target proteins that is widely applicable across protein families and phyla. The approach starts with one or more query proteins and identifies their homologs (using sequence alignment and clustering algorithms for domain detection), their domain architectures (using protein domain/orthology databases and prediction algorithms for signal peptides, transmembrane regions, cellular localization, and secondary/tertiary structures), and their phyletic spreads, in order to trace the conservation and evolution of protein families. To demonstrate the versatility of the approach, we have applied it to various operons in zoonotic pathogens, such as nutrient acquisition in Staphylococcus aureus and the pan-bacterial phage shock protein (Psp) stress response system. Further, we have implemented our approach as an interactive webapp, MolEvolvR, which enables biologists to run the entire molecular evolution and phylogeny approach on their data by simply uploading a list of query proteins. The webapp accepts inputs in multiple formats: protein/domain sequences, multi-protein operons/homologous proteins (e.g., prior BLAST searches), or motif/domain scans (e.g., InterProScan). Depending on the input, MolEvolvR returns the complete set of homologs, domain architectures, common partner domains, and their phyletic spreads. Users can obtain graphical summaries that include multiple sequence alignments and phylogenetic trees, domain architectures, domain proximity networks, phyletic spreads, co-occurrence patterns, and relative occurrences across lineages.
Thus, the webapp provides an easy-to-use interface for a wide range of analyses, from homology searches and phylogeny to domain architectures. In addition to these analyses, researchers can use the app for data summarization and dynamic visualization. MolEvolvR will be a powerful, easy-to-use tool that accelerates the characterization of proteins. The webapp can be accessed here: An instance of the webapp, applied to study a large number of Psp stress response proteins (present across the tree of life), can be found here: Soon, MolEvolvR will be available as an R package for use by computational biologists.

Mar 26, 2021 8:00 PM — 9:00 PM