Haplotype inferring via galled-tree networks is NP-complete

Arvind Gupta, Mohammad M. Karimi, Ján Maňuch, Ladislav Stacho, Xiaohong Zhao

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

The problem of determining haplotypes from genotypes has gained considerable prominence in the research community since the beginning of the HapMap project. Here the focus is on determining the sets of SNP values of individual chromosomes (haplotypes), since such information better captures the genetic causes of diseases. One of the main algorithmic tools for haplotyping is based on the assumption that the evolutionary history of the original haplotypes satisfies perfect phylogeny. This tool can be applied only to individual blocks of chromosomes, in which it is assumed that recombinations do not happen. However, exact determination of blocks is usually not possible. It would be desirable to develop a method for haplotyping that can account for recombinations, and thus can be applied to multiblock sections of chromosomes. A natural candidate for such a method is haplotyping via phylogenetic networks (which model recombinations) or their simplified version, galled-tree networks. However, even haplotyping via galled-tree networks appears hard, as efficient algorithms exist only for very special cases: the galled-tree network has either a single gall or only small galls with two mutations each. Building on our previous results, we show that, in general, haplotyping via galled-tree networks is NP-complete, and that it remains NP-complete when galls are allowed to have at most k mutations, for any k ≥ 3.
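To make the terms in the abstract concrete, the following is a minimal sketch and not the authors' method. It assumes the common encoding in which each SNP site of a genotype is 0 or 1 when both chromosomes agree and 2 when they differ. It also uses the classical four-gamete condition as the test for whether a set of binary haplotypes is compatible with a perfect phylogeny. The function names `resolves` and `passes_four_gamete_test` are hypothetical and chosen for illustration only.

```python
from itertools import combinations

# Assumed encoding (not taken from the paper's text):
#   haplotype site: 0 or 1
#   genotype site:  0 or 1 if both chromosomes agree, 2 if they differ


def resolves(h1, h2, genotype):
    """Return True if the haplotype pair (h1, h2) explains the genotype."""
    if not (len(h1) == len(h2) == len(genotype)):
        return False
    for a, b, g in zip(h1, h2, genotype):
        if g == 2:
            # Heterozygous site: the two haplotypes must differ here.
            if a == b:
                return False
        elif a != g or b != g:
            # Homozygous site: both haplotypes must carry the genotype value.
            return False
    return True


def passes_four_gamete_test(haplotypes):
    """Return True if no pair of sites exhibits all of 00, 01, 10 and 11."""
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True


# A genotype with two heterozygous sites has more than one resolution,
# which is why haplotyping needs an extra model such as perfect phylogeny.
genotype = (2, 2, 0)
print(resolves((0, 1, 0), (1, 0, 0), genotype))
print(resolves((0, 0, 0), (1, 1, 0), genotype))

# Four haplotypes showing every combination at two sites fail the test,
# which signals that a recombination (or repeated mutation) is needed.
print(passes_four_gamete_test([(0, 0), (0, 1), (1, 0), (1, 1)]))
```

Both candidate pairs resolve the same genotype, so the genotype alone does not determine the haplotypes. The final call reports a violation of the four-gamete condition, the situation that network models with recombination are meant to accommodate.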

Original language: English
Pages (from-to): 1317-1331
Number of pages: 15
Journal: Journal of Computational Biology
Volume: 17
Issue number: 10
DOI: 10.1089/cmb.2009.0117
Publication status: Published - 1 Oct 2010
Externally published: Yes

Fingerprint

Tree Networks
Haplotype
Chromosomes
Haplotypes
Genetic Recombination
NP-complete problem
Recombination
Chromosome
HapMap Project
Mutation
Inborn Genetic Diseases
Phylogenetic Network
Phylogeny
Multiblock
Single Nucleotide Polymorphism
History
Genotype
Network Model
Efficient Algorithms
Research

Keywords

  • algorithms
  • combinatorial optimization
  • combinatorics
  • RNA

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
  • Computational Mathematics
  • Modelling and Simulation
  • Computational Theory and Mathematics

Cite this

Haplotype inferring via galled-tree networks is NP-complete. / Gupta, Arvind; Karimi, Mohammad M.; Maňuch, Ján; Stacho, Ladislav; Zhao, Xiaohong.

In: Journal of Computational Biology, Vol. 17, No. 10, 01.10.2010, p. 1317-1331.

@article{ded4ce26a46a4d7e8ef4a6f423126132,
title = "Haplotype inferring via galled-tree networks is NP-complete",
keywords = "algorithms, combinatorial optimization, combinatorics, RNA",
author = "Gupta, Arvind and Karimi, {Mohammad M.} and Ma{\v{n}}uch, J{\'a}n and Stacho, Ladislav and Zhao, Xiaohong",
year = "2010",
month = "10",
day = "1",
doi = "10.1089/cmb.2009.0117",
language = "English",
volume = "17",
pages = "1317--1331",
journal = "Journal of Computational Biology",
issn = "1066-5277",
publisher = "Mary Ann Liebert Inc.",
number = "10",
}

TY - JOUR

T1 - Haplotype inferring via galled-tree networks is NP-complete

AU - Gupta, Arvind

AU - Karimi, Mohammad M.

AU - Maňuch, Ján

AU - Stacho, Ladislav

AU - Zhao, Xiaohong

PY - 2010/10/1

Y1 - 2010/10/1

KW - algorithms

KW - combinatorial optimization

KW - combinatorics

KW - RNA

UR - http://www.scopus.com/inward/record.url?scp=77958065841&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77958065841&partnerID=8YFLogxK

U2 - 10.1089/cmb.2009.0117

DO - 10.1089/cmb.2009.0117

M3 - Article

VL - 17

SP - 1317

EP - 1331

JO - Journal of Computational Biology

JF - Journal of Computational Biology

SN - 1066-5277

IS - 10

ER -