Detailed analysis of inversions predicted between two human genomes

Errors, real polymorphisms, and their origin and population distribution

David Vicente-Salvador, Marta Puig, Magdalena Gayà-Vidal, Sarai Pacheco, Carla Giner-Delgado, Isaac Noguera, David Izquierdo, Alexander Martínez-Fundichely, Aurora Ruiz-Herrera, Xavier P. Estivill, Cristina Aguado, José Ignacio Lucas-Lledó, Mario Cáceres

Research output: Contribution to journal, Article

4 Citations (Scopus)

Abstract

The growing catalogue of structural variants in humans often overlooks inversions as one of the most difficult types of variation to study, even though they affect phenotypic traits in diverse organisms. Here, we have analysed in detail 90 inversions predicted from the comparison of two independently assembled human genomes: the reference genome (NCBI36/HG18) and HuRef. Surprisingly, we found that two thirds of these predictions (62) represent errors either in assembly comparison or in one of the assemblies, including 27 misassembled regions in HG18. Next, we validated 22 of the remaining 28 potential polymorphic inversions using different PCR techniques and characterized their breakpoints and ancestral state. In addition, we determined experimentally the derived allele frequency in Europeans for 17 inversions (DAF=0.01-0.80), as well as the distribution in 14 worldwide populations for 12 of them based on the 1000 Genomes Project data. Among the validated inversions, nine have inverted repeats (IRs) at their breakpoints, and two show nucleotide variation patterns consistent with a recurrent origin. Conversely, inversions without IRs have a unique origin and almost all of them show deletions or insertions at the breakpoints in the derived allele mediated by microhomology sequences, which highlights the importance of mechanisms like FoSTeS/MMBIR in the generation of complex rearrangements in the human genome. Finally, we found several inversions located within genes and at least one candidate to be positively selected in Africa. Thus, our study emphasizes the importance of careful analysis and validation of large-scale genomic predictions to extract reliable biological conclusions.

Original language: English
Pages (from-to): 567-581
Number of pages: 15
Journal: Human Molecular Genetics
Volume: 26
Issue number: 3
DOI: https://doi.org/10.1093/hmg/ddw415
Publication status: Published - 1 Jan 2017

Fingerprint

Human Genome
Demography
Genome
Gene Frequency
Nucleotides
Alleles
Polymerase Chain Reaction
Population
Genes

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
  • Genetics (clinical)

Cite this

Vicente-Salvador, D., Puig, M., Gayà-Vidal, M., Pacheco, S., Giner-Delgado, C., Noguera, I., ... Cáceres, M. (2017). Detailed analysis of inversions predicted between two human genomes: Errors, real polymorphisms, and their origin and population distribution. Human Molecular Genetics, 26(3), 567-581. https://doi.org/10.1093/hmg/ddw415

Detailed analysis of inversions predicted between two human genomes: Errors, real polymorphisms, and their origin and population distribution. / Vicente-Salvador, David; Puig, Marta; Gayà-Vidal, Magdalena; Pacheco, Sarai; Giner-Delgado, Carla; Noguera, Isaac; Izquierdo, David; Martínez-Fundichely, Alexander; Ruiz-Herrera, Aurora; Estivill, Xavier P.; Aguado, Cristina; Lucas-Lledó, José Ignacio; Cáceres, Mario.

In: Human Molecular Genetics, Vol. 26, No. 3, 01.01.2017, p. 567-581.

Vicente-Salvador, D, Puig, M, Gayà-Vidal, M, Pacheco, S, Giner-Delgado, C, Noguera, I, Izquierdo, D, Martínez-Fundichely, A, Ruiz-Herrera, A, Estivill, XP, Aguado, C, Lucas-Lledó, JI & Cáceres, M 2017, 'Detailed analysis of inversions predicted between two human genomes: Errors, real polymorphisms, and their origin and population distribution', Human Molecular Genetics, vol. 26, no. 3, pp. 567-581. https://doi.org/10.1093/hmg/ddw415
Vicente-Salvador, David ; Puig, Marta ; Gayà-Vidal, Magdalena ; Pacheco, Sarai ; Giner-Delgado, Carla ; Noguera, Isaac ; Izquierdo, David ; Martínez-Fundichely, Alexander ; Ruiz-Herrera, Aurora ; Estivill, Xavier P. ; Aguado, Cristina ; Lucas-Lledó, José Ignacio ; Cáceres, Mario. / Detailed analysis of inversions predicted between two human genomes: Errors, real polymorphisms, and their origin and population distribution. In: Human Molecular Genetics. 2017 ; Vol. 26, No. 3. pp. 567-581.
@article{79845a91d9aa42aba3b8b863cfcfda6d,
title = "Detailed analysis of inversions predicted between two human genomes: Errors, real polymorphisms, and their origin and population distribution",
author = "David Vicente-Salvador and Marta Puig and Magdalena Gay{\`a}-Vidal and Sarai Pacheco and Carla Giner-Delgado and Isaac Noguera and David Izquierdo and Alexander Mart{\'i}nez-Fundichely and Aurora Ruiz-Herrera and Estivill, {Xavier P.} and Cristina Aguado and Lucas-Lled{\'o}, {Jos{\'e} Ignacio} and Mario C{\'a}ceres",
year = "2017",
month = "1",
day = "1",
doi = "10.1093/hmg/ddw415",
language = "English",
volume = "26",
pages = "567--581",
journal = "Human Molecular Genetics",
issn = "0964-6906",
publisher = "Oxford University Press",
number = "3",

}

TY - JOUR

T1 - Detailed analysis of inversions predicted between two human genomes

T2 - Errors, real polymorphisms, and their origin and population distribution

AU - Vicente-Salvador, David

AU - Puig, Marta

AU - Gayà-Vidal, Magdalena

AU - Pacheco, Sarai

AU - Giner-Delgado, Carla

AU - Noguera, Isaac

AU - Izquierdo, David

AU - Martínez-Fundichely, Alexander

AU - Ruiz-Herrera, Aurora

AU - Estivill, Xavier P.

AU - Aguado, Cristina

AU - Lucas-Lledó, José Ignacio

AU - Cáceres, Mario

PY - 2017/1/1

Y1 - 2017/1/1

AB - The growing catalogue of structural variants in humans often overlooks inversions as one of the most difficult types of variation to study, even though they affect phenotypic traits in diverse organisms. Here, we have analysed in detail 90 inversions predicted from the comparison of two independently assembled human genomes: the reference genome (NCBI36/HG18) and HuRef. Surprisingly, we found that two thirds of these predictions (62) represent errors either in assembly comparison or in one of the assemblies, including 27 misassembled regions in HG18. Next, we validated 22 of the remaining 28 potential polymorphic inversions using different PCR techniques and characterized their breakpoints and ancestral state. In addition, we determined experimentally the derived allele frequency in Europeans for 17 inversions (DAF=0.01-0.80), as well as the distribution in 14 worldwide populations for 12 of them based on the 1000 Genomes Project data. Among the validated inversions, nine have inverted repeats (IRs) at their breakpoints, and two show nucleotide variation patterns consistent with a recurrent origin. Conversely, inversions without IRs have a unique origin and almost all of them show deletions or insertions at the breakpoints in the derived allele mediated by microhomology sequences, which highlights the importance of mechanisms like FoSTeS/MMBIR in the generation of complex rearrangements in the human genome. Finally, we found several inversions located within genes and at least one candidate to be positively selected in Africa. Thus, our study emphasizes the importance of careful analysis and validation of large-scale genomic predictions to extract reliable biological conclusions.

UR - http://www.scopus.com/inward/record.url?scp=85018244621&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85018244621&partnerID=8YFLogxK

U2 - 10.1093/hmg/ddw415

DO - 10.1093/hmg/ddw415

M3 - Article

VL - 26

SP - 567

EP - 581

JO - Human Molecular Genetics

JF - Human Molecular Genetics

SN - 0964-6906

IS - 3

ER -