A fast exact sequential algorithm for the partial digest problem

Mostafa Abbas, Hazem M. Bahig

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Background: Restriction site analysis determines the locations of restriction sites after digestion by reconstructing their positions from the lengths of the cut DNA fragments. Cutting DNA with a single enzyme at different reaction times is a technique known as partial digestion. Determining the exact locations of restriction sites after a partial digestion is challenging because of the computational time required, even with the best known practical algorithm. Results: In this paper, we introduce an efficient algorithm that finds the exact solution to the partial digest problem. The algorithm finds all possible solutions for the input; it traverses the solution tree with a breadth-first search in two stages and deletes all repeated subproblems. Two types of simulated data, random and Zhang, are used to measure the efficiency of the algorithm. We also apply the algorithm to real data for the Luciferase gene and the E. coli K12 genome. Conclusion: Our algorithm is a fast tool for finding the exact solution to the partial digest problem. In the worst case, it improves on the best known practical algorithm by more than 75%. For large inputs, our algorithm solves the problem in a reasonable time, whereas the best known practical algorithm cannot.
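To make the idea of a breadth-first traversal with removal of repeated subproblems concrete, here is a minimal Python sketch. It is not the authors' two-stage algorithm, whose details the abstract does not give. It assumes the classic placement rule for the partial digest problem (the largest remaining distance must be measured from one of the two end points) and explores the resulting solution tree level by level. The function name `partial_digest_bfs` and the choice to key each level's states by their placed points are illustrative assumptions.

```python
from collections import Counter


def partial_digest_bfs(distances):
    """Return every point set X (sorted tuple starting at 0) whose
    pairwise distances equal the multiset `distances`.

    Illustrative sketch only: level-by-level search using the classic
    "place the largest remaining distance" rule, not the paper's method.
    """
    all_distances = Counter(distances)
    if not all_distances:
        return [(0,)]
    width = max(all_distances)
    # Counter subtraction keeps only positive counts.
    left = all_distances - Counter([width])
    # One search level: placed points -> distances still unexplained.
    # Keying by the placed points merges repeated subproblems.
    frontier = {(0, width): left}
    solutions = []
    while frontier:
        next_frontier = {}
        for points, left in frontier.items():
            if not left:
                solutions.append(points)
                continue
            y = max(left)
            # The largest remaining distance is measured from 0 or from width.
            for candidate in {y, width - y}:
                needed = Counter(abs(candidate - p) for p in points)
                if all(left[d] >= c for d, c in needed.items()):
                    child = tuple(sorted(points + (candidate,)))
                    next_frontier.setdefault(child, left - needed)
        frontier = next_frontier
    return sorted(solutions)


print(partial_digest_bfs([2, 2, 3, 3, 4, 5, 6, 7, 8, 10]))
```

Because the remaining distances are fully determined by the points already placed, the placed points alone serve as the deduplication key in this sketch. A solution and its mirror image are both reported.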

Original language: English
Article number: 510
Journal: BMC Bioinformatics
Volume: 17
DOI: 10.1186/s12859-016-1365-2
Publication status: Published - 22 Dec 2016

Fingerprint

Sequential Algorithm
Exact Algorithms
Partial
Digestion
Restriction
Exact Solution
DNA
Genes
Breadth-first Search
Reaction Time
Escherichia coli K12
Luciferases
Percentage
Escherichia coli
Enzymes
Genome
Efficient Algorithms
Gene

Keywords

  • Bioinformatics algorithm
  • Breadth first search
  • Digestion process
  • DNA
  • Partial digest problem
  • Restriction site analysis

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

A fast exact sequential algorithm for the partial digest problem. / Abbas, Mostafa; Bahig, Hazem M.

In: BMC Bioinformatics, Vol. 17, 510, 22.12.2016.


@article{5e051f8689084f899587b33b015657ff,
title = "A fast exact sequential algorithm for the partial digest problem",
abstract = "Background: Restriction site analysis involves determining the locations of restriction sites after the process of digestion by reconstructing their positions based on the lengths of the cut DNA. Using different reaction times with a single enzyme to cut DNA is a technique known as a partial digestion. Determining the exact locations of restriction sites following a partial digestion is challenging due to the computational time required even with the best known practical algorithm. Results: In this paper, we introduce an efficient algorithm to find the exact solution for the partial digest problem. The algorithm is able to find all possible solutions for the input and works by traversing the solution tree with a breadth-first search in two stages and deleting all repeated subproblems. Two types of simulated data, random and Zhang, are used to measure the efficiency of the algorithm. We also apply the algorithm to real data for the Luciferase gene and the E. coli K12 genome. Conclusion: Our algorithm is a fast tool to find the exact solution for the partial digest problem. The percentage of improvement is more than 75{\%} over the best known practical algorithm for the worst case. For large numbers of inputs, our algorithm is able to solve the problem in a suitable time, while the best known practical algorithm is unable.",
keywords = "Bioinformatics algorithm, Breadth first search, Digestion process, DNA, Partial digest problem, Restriction site analysis",
author = "Abbas, Mostafa and Bahig, {Hazem M.}",
year = "2016",
month = "12",
day = "22",
doi = "10.1186/s12859-016-1365-2",
language = "English",
volume = "17",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - A fast exact sequential algorithm for the partial digest problem

AU - Abbas, Mostafa

AU - Bahig, Hazem M.

PY - 2016/12/22

Y1 - 2016/12/22


AB - Background: Restriction site analysis involves determining the locations of restriction sites after the process of digestion by reconstructing their positions based on the lengths of the cut DNA. Using different reaction times with a single enzyme to cut DNA is a technique known as a partial digestion. Determining the exact locations of restriction sites following a partial digestion is challenging due to the computational time required even with the best known practical algorithm. Results: In this paper, we introduce an efficient algorithm to find the exact solution for the partial digest problem. The algorithm is able to find all possible solutions for the input and works by traversing the solution tree with a breadth-first search in two stages and deleting all repeated subproblems. Two types of simulated data, random and Zhang, are used to measure the efficiency of the algorithm. We also apply the algorithm to real data for the Luciferase gene and the E. coli K12 genome. Conclusion: Our algorithm is a fast tool to find the exact solution for the partial digest problem. The percentage of improvement is more than 75% over the best known practical algorithm for the worst case. For large numbers of inputs, our algorithm is able to solve the problem in a suitable time, while the best known practical algorithm is unable.

KW - Bioinformatics algorithm

KW - Breadth first search

KW - Digestion process

KW - DNA

KW - Partial digest problem

KW - Restriction site analysis

UR - http://www.scopus.com/inward/record.url?scp=85006856701&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85006856701&partnerID=8YFLogxK

U2 - 10.1186/s12859-016-1365-2

DO - 10.1186/s12859-016-1365-2

M3 - Article

C2 - 28155644

AN - SCOPUS:85006856701

VL - 17

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 510

ER -