Review of General Algorithmic Features for Genome Assemblers for Next Generation Sequencers

Bilal Wajid, Erchin Serpedin

Research output: Contribution to journal, Review article

15 Citations (Scopus)

Abstract

In the realm of bioinformatics and computational biology, the most rudimentary data upon which all the analysis is built is the sequence data of genes, proteins and RNA. The sequence data of the entire genome is the solution to the genome assembly problem. The scope of this contribution is to provide an overview on the art of problem-solving applied within the domain of genome assembly in the next-generation sequencing (NGS) platforms. This article discusses the major genome assemblers that were proposed in the literature during the past decade by outlining their basic working principles. It is intended to act as a qualitative, not a quantitative, tutorial to all working on genome assemblers pertaining to the next generation of sequencers. We discuss the theoretical aspects of various genome assemblers, identifying their working schemes. We also discuss briefly the direction in which the area is headed towards along with discussing core issues on software simplicity.

Original language: English
Pages (from-to): 58-73
Number of pages: 16
Journal: Genomics, Proteomics and Bioinformatics
Volume: 10
Issue number: 2
DOI: 10.1016/j.gpb.2012.05.006
Publication status: Published - Apr 2012
Externally published: Yes

Fingerprint

Genome
Genes
Computational Biology
Bioinformatics
Sequencing
Review
RNA
Simplicity
Software
Entire
Gene
Protein
Proteins

Keywords

  • Comparative assembly
  • De Bruijn graphs
  • De novo assembly
  • Genome assembly
  • Next-generation sequencing

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Genetics
  • Computational Mathematics

Cite this

Review of General Algorithmic Features for Genome Assemblers for Next Generation Sequencers. / Wajid, Bilal; Serpedin, Erchin.

In: Genomics, Proteomics and Bioinformatics, Vol. 10, No. 2, 04.2012, p. 58-73.

@article{690dfb1232fe4cd29756adc5b785c239,
title = "Review of General Algorithmic Features for Genome Assemblers for Next Generation Sequencers",
abstract = "In the realm of bioinformatics and computational biology, the most rudimentary data upon which all the analysis is built is the sequence data of genes, proteins and RNA. The sequence data of the entire genome is the solution to the genome assembly problem. The scope of this contribution is to provide an overview on the art of problem-solving applied within the domain of genome assembly in the next-generation sequencing (NGS) platforms. This article discusses the major genome assemblers that were proposed in the literature during the past decade by outlining their basic working principles. It is intended to act as a qualitative, not a quantitative, tutorial to all working on genome assemblers pertaining to the next generation of sequencers. We discuss the theoretical aspects of various genome assemblers, identifying their working schemes. We also discuss briefly the direction in which the area is headed towards along with discussing core issues on software simplicity.",
keywords = "Comparative assembly, De Bruijn graphs, De novo assembly, Genome assembly, Next-generation sequencing",
author = "Bilal Wajid and Erchin Serpedin",
year = "2012",
month = "4",
doi = "10.1016/j.gpb.2012.05.006",
language = "English",
volume = "10",
pages = "58--73",
journal = "Genomics, Proteomics and Bioinformatics",
issn = "1672-0229",
publisher = "Beijing Genomics Institute",
number = "2",
}

TY - JOUR

T1 - Review of General Algorithmic Features for Genome Assemblers for Next Generation Sequencers

AU - Wajid, Bilal

AU - Serpedin, Erchin

PY - 2012/4

Y1 - 2012/4

N2 - In the realm of bioinformatics and computational biology, the most rudimentary data upon which all the analysis is built is the sequence data of genes, proteins and RNA. The sequence data of the entire genome is the solution to the genome assembly problem. The scope of this contribution is to provide an overview on the art of problem-solving applied within the domain of genome assembly in the next-generation sequencing (NGS) platforms. This article discusses the major genome assemblers that were proposed in the literature during the past decade by outlining their basic working principles. It is intended to act as a qualitative, not a quantitative, tutorial to all working on genome assemblers pertaining to the next generation of sequencers. We discuss the theoretical aspects of various genome assemblers, identifying their working schemes. We also discuss briefly the direction in which the area is headed towards along with discussing core issues on software simplicity.

KW - Comparative assembly

KW - De Bruijn graphs

KW - De novo assembly

KW - Genome assembly

KW - Next-generation sequencing

UR - http://www.scopus.com/inward/record.url?scp=84863444675&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863444675&partnerID=8YFLogxK

U2 - 10.1016/j.gpb.2012.05.006

DO - 10.1016/j.gpb.2012.05.006

M3 - Review article

VL - 10

SP - 58

EP - 73

JO - Genomics, Proteomics and Bioinformatics

JF - Genomics, Proteomics and Bioinformatics

SN - 1672-0229

IS - 2

ER -