
Patterns of coevolving amino acids unveil structural and dynamical domains

  1. Vincenzo Carnevale (a, 2)
  1. (a) Institute for Computational Molecular Science, College of Science and Technology, Temple University, Philadelphia, PA 19122;
  2. (b) Molecular and Statistical Biophysics, Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
  1. Edited by Richard W. Aldrich, The University of Texas at Austin, Austin, TX, and approved November 7, 2017 (received for review July 6, 2017)


Patterns of pairwise correlations in sequence alignments can be used to reconstruct the network of residue-residue contacts and thus the three-dimensional structure of proteins. Less explored, and yet extremely intriguing, is the functional relevance of such coevolving networks: Do they encode the collective motions occurring in proteins at thermal equilibrium? Here, by combining coevolutionary coupling analysis with a state-of-the-art dimensionality reduction approach, we show that the network of pairwise evolutionary couplings can be analyzed to reveal communities of amino acids, which we term “evolutionary domains,” that are in striking agreement with the quasi-rigid protein domains obtained from elastic network models and molecular dynamics simulations.


Patterns of interacting amino acids are so well preserved within protein families that the analysis of evolutionary comutations alone can identify pairs of contacting residues. It is also known that evolution conserves functional dynamics, i.e., the concerted motion or displacement of large protein regions or domains. Is it, therefore, possible to use a purely sequence-based analysis to identify these dynamical domains? To address this question, we introduce here a general coevolutionary coupling analysis strategy and apply it to a curated sequence database of hundreds of protein families. For most families, the sequence-based method partitions amino acids into a few clusters. When viewed in the context of the native structure, these clusters have the signature characteristics of viable protein domains: They are spatially separated but individually compact. They also have a direct functional bearing, as shown for various reference cases. We conclude that even large-scale structural and functionally related properties can be recovered from inference methods applied to evolutionarily related sequences. The method introduced here is available as a software package and web server (spectrus.sissa.it/spectrus-evo_webserver).
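
As a rough illustration of how a matrix of pairwise couplings can be partitioned into clusters of residues, the sketch below applies a generic spectral bipartition (the sign of the Fiedler vector of the graph Laplacian) to a toy coupling matrix. It is not the pipeline used in this work: the function names `spectral_bipartition` and `toy_coupling_matrix`, the toy coupling values, and the choice of a two-way split are all assumptions made for illustration only.

```python
import numpy as np


def spectral_bipartition(couplings):
    """Split residues into two groups from a symmetric coupling matrix.

    Generic technique (assumption, not the authors' method): build the
    unnormalized graph Laplacian and split on the sign of the Fiedler
    vector, i.e. the eigenvector of the second-smallest eigenvalue.
    """
    w = np.abs(np.asarray(couplings, dtype=float))
    w = (w + w.T) / 2.0          # enforce symmetry
    np.fill_diagonal(w, 0.0)     # no self-couplings
    degree = w.sum(axis=1)
    laplacian = np.diag(degree) - w
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]
    return (fiedler > 0).astype(int)


def toy_coupling_matrix(n_per_block=5, strong=1.0, weak=0.05):
    """Hypothetical couplings: two groups of residues, strongly coupled
    within each group and weakly coupled between groups."""
    n = 2 * n_per_block
    w = np.full((n, n), weak)
    w[:n_per_block, :n_per_block] = strong
    w[n_per_block:, n_per_block:] = strong
    np.fill_diagonal(w, 0.0)
    return w


if __name__ == "__main__":
    labels = spectral_bipartition(toy_coupling_matrix())
    print(labels)  # residues 0-4 share one label, residues 5-9 the other
```

Which group receives label 0 and which receives label 1 depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful. Real coupling matrices are noisier and typically call for more than two clusters, which a single sign-based split does not handle.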


  • (1) D.G. and L.P. contributed equally to this work.

  • (2) To whom correspondence may be addressed. Email: vincenzo.carnevale@temple.edu, daniele.granata@gmail.com, ponzoniluca@gmail.com, or michelet@sissa.it.

Published under the PNAS license.
