### Abstract

Background: Restriction site analysis involves determining the locations of restriction sites after digestion by reconstructing their positions from the lengths of the cut DNA fragments. Cutting DNA with a single enzyme at different reaction times is a technique known as partial digestion. Determining the exact locations of restriction sites following a partial digestion is challenging because of the computational time required, even with the best known practical algorithm.

Results: In this paper, we introduce an efficient algorithm to find the exact solution for the partial digest problem. The algorithm finds all possible solutions for the input. It works by traversing the solution tree with a breadth-first search in two stages and deleting all repeated subproblems. Two types of simulated data, random and Zhang, are used to measure the efficiency of the algorithm. We also apply the algorithm to real data for the Luciferase gene and the E. coli K12 genome.

Conclusion: Our algorithm is a fast tool to find the exact solution for the partial digest problem. In the worst case, the improvement over the best known practical algorithm is more than 75%. For large inputs, our algorithm solves the problem in a reasonable time, while the best known practical algorithm cannot.
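To make the idea of a breadth-first traversal with removal of repeated subproblems concrete, the following is a minimal Python sketch. It is not the paper's two-stage algorithm, whose details are not given in the abstract. It assumes the standard formulation of the partial digest problem (recover a point set containing 0 from the multiset of all pairwise distances, which are positive integers) and uses the classic "place the largest remaining distance on the left or on the right" expansion. The names `place` and `partial_digest_bfs` are hypothetical.

```python
from collections import Counter
from math import isqrt


def place(points, remaining, p):
    """Try to add point p; return the reduced distance multiset, or None."""
    reduced = Counter(remaining)
    for q in points:
        d = abs(p - q)
        if reduced[d] == 0:  # required distance is not available
            return None
        reduced[d] -= 1
    return +reduced  # unary plus drops entries whose count reached zero


def partial_digest_bfs(distances):
    """Return every point set (with 0 as leftmost point) matching `distances`."""
    m = len(distances)
    n = (1 + isqrt(1 + 8 * m)) // 2  # solve n * (n - 1) / 2 == m
    if m == 0 or n * (n - 1) // 2 != m:
        return []
    width = max(distances)
    start = Counter(distances)
    start[width] -= 1
    # A state is (placed points, remaining distances); both parts are hashable
    # so that identical subproblems collapse into a single set entry.
    frontier = {(frozenset({0, width}), frozenset((+start).items()))}
    for _ in range(n - 2):  # each BFS level places exactly one more point
        next_frontier = set()
        for points, rem_items in frontier:
            remaining = Counter(dict(rem_items))
            y = max(remaining)
            # The largest remaining distance must be measured from one end.
            for p in {y, width - y}:
                reduced = place(points, remaining, p)
                if reduced is not None:
                    next_frontier.add(
                        (points | {p}, frozenset(reduced.items()))
                    )
        frontier = next_frontier  # duplicates were dropped by the set
    return sorted(sorted(points) for points, rem in frontier if not rem)


if __name__ == "__main__":
    # A solution and its mirror image are both reported.
    print(partial_digest_bfs([2, 2, 3, 3, 4, 5, 6, 7, 8, 10]))
```

Storing each level as a set of hashable states is what removes repeated subproblems here: two different placement orders that reach the same partial point set are expanded only once.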

Original language | English |
---|---|
Article number | 510 |
Journal | BMC Bioinformatics |
Volume | 17 |
DOIs | https://doi.org/10.1186/s12859-016-1365-2 |
Publication status | Published - 22 Dec 2016 |

### Keywords

- Bioinformatics algorithm
- Breadth first search
- Digestion process
- DNA
- Partial digest problem
- Restriction site analysis

### ASJC Scopus subject areas

- Structural Biology
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Applied Mathematics

### Cite this

Abbas, M., & Bahig, H. M. (2016). A fast exact sequential algorithm for the partial digest problem. *BMC Bioinformatics*, *17*, [510]. https://doi.org/10.1186/s12859-016-1365-2

Research output: Contribution to journal › Article

TY - JOUR

T1 - A fast exact sequential algorithm for the partial digest problem

AU - Abbas, Mostafa

AU - Bahig, Hazem M.

PY - 2016/12/22

Y1 - 2016/12/22

KW - Bioinformatics algorithm

KW - Breadth first search

KW - Digestion process

KW - DNA

KW - Partial digest problem

KW - Restriction site analysis

UR - http://www.scopus.com/inward/record.url?scp=85006856701&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85006856701&partnerID=8YFLogxK

U2 - 10.1186/s12859-016-1365-2

DO - 10.1186/s12859-016-1365-2

M3 - Article

C2 - 28155644

AN - SCOPUS:85006856701

VL - 17

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 510

ER -