# Frequent nonallelic gene conversion on the human lineage and its effect on the divergence of gene duplicates

1. (a) Department of Biology, Stanford University, Stanford, CA 94305
2. (b) Department of Genetics, Stanford University, Stanford, CA 94305
3. (c) Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305

Edited by Adam Siepel, Cold Spring Harbor Laboratory, and accepted by Editorial Board Member Daniel L. Hartl October 12, 2017 (received for review May 17, 2017)

## Significance

Nonallelic gene conversion (NAGC) is a driver of more than 20 diseases. It is also thought to drive the “concerted evolution” of gene duplicates because it acts to eliminate any differences that accumulate between them. Despite its importance, the parameters that govern NAGC are not well characterized. We developed statistical tools to study NAGC and its consequences for human gene duplicates. We find that the baseline rate of NAGC in humans is 20 times faster than the point mutation rate. Despite this high rate, NAGC has a surprisingly small effect on the average sequence divergence of human duplicates—and concerted evolution is not as pervasive as previously thought.

## Abstract

Gene conversion is the copying of a genetic sequence from a “donor” region to an “acceptor.” In nonallelic gene conversion (NAGC), the donor and the acceptor are at distinct genetic loci. Despite the role NAGC plays in various genetic diseases and the concerted evolution of gene families, the parameters that govern NAGC are not well characterized. Here, we survey duplicate gene families and identify converted tracts in 46% of them. These conversions reflect a large GC bias of NAGC. We develop a sequence evolution model that leverages substantially more information in duplicate sequences than used by previous methods and use it to estimate the parameters that govern NAGC in humans: a mean converted tract length of 250 bp and a probability of $2.5 \times 10^{-7}$ per generation for a nucleotide to be converted (an order of magnitude higher than the point mutation rate). Despite this high baseline rate, we show that NAGC slows down as duplicate sequences diverge, until an eventual “escape” of the sequences from its influence. As a result, NAGC has a small average effect on the sequence divergence of duplicates. This work improves our understanding of the NAGC mechanism and the role that it plays in the evolution of gene duplicates.
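As a rough illustration of how the two estimated parameters relate to the point mutation rate, the arithmetic can be sketched as follows. This is a back-of-envelope sketch, not the sequence evolution model described above. The point mutation rate of 1.2 × 10⁻⁸ per site per generation is an assumed, commonly cited value that does not appear in this text, and the function names are hypothetical. The implied initiation rate further assumes that the per-nucleotide conversion probability is approximately the tract initiation rate per site multiplied by the mean tract length.

```python
# Values quoted in the abstract.
CONVERSION_PROB_PER_SITE = 2.5e-7  # probability a nucleotide is converted, per generation
MEAN_TRACT_LENGTH_BP = 250         # mean converted tract length, in base pairs

# ASSUMPTION (not from the source): a commonly cited human point mutation rate,
# per site per generation.
ASSUMED_POINT_MUTATION_RATE = 1.2e-8


def conversion_to_mutation_ratio(conversion_prob=CONVERSION_PROB_PER_SITE,
                                 mutation_rate=ASSUMED_POINT_MUTATION_RATE):
    """How many times larger the per-site conversion probability is than
    the (assumed) per-site point mutation rate."""
    return conversion_prob / mutation_rate


def implied_initiation_rate(conversion_prob=CONVERSION_PROB_PER_SITE,
                            mean_tract_length=MEAN_TRACT_LENGTH_BP):
    """Approximate per-site, per-generation rate at which tracts would need to
    start, assuming coverage probability = initiation rate * mean tract length
    (edge effects and dependence on sequence divergence are ignored)."""
    return conversion_prob / mean_tract_length


if __name__ == "__main__":
    print(f"conversion / mutation ratio: {conversion_to_mutation_ratio():.1f}")
    print(f"implied initiation rate per site: {implied_initiation_rate():.2e}")
```

Under the assumed mutation rate, the ratio comes out at roughly 20, in line with the "order of magnitude" comparison in the abstract. A different assumed mutation rate would shift it proportionally.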

## Footnotes

• ¹A.H. and X.L. contributed equally to this work.

• ²To whom correspondence may be addressed. Email: arbelh{at}stanford.edu or pritch{at}stanford.edu.
• Author contributions: A.H., X.L., Z.G., and J.K.P. designed research; A.H., X.L., Z.G., and J.K.P. performed research; A.H., X.L., Z.G., and J.K.P. contributed new analytic tools; A.H., X.L., Z.G., and J.K.P. analyzed data; and A.H. wrote the paper.

• The authors declare no conflict of interest.
