Filtration of string proximity search via transformation

S. Alireza Aghili, Divyakant Agrawal, Amr El Abbadi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

The problem of proximity search in biological databases is addressed. We study vector transformations and apply DFT (Discrete Fourier Transformation) and DWT (Discrete Wavelet Transformation, Haar) dimensionality reduction techniques to DNA sequence proximity search in order to reduce the search time of range queries. Our empirical results on a number of Prokaryote and Eukaryote DNA contig databases demonstrate up to a 50-fold filtration ratio of the search space and up to 13 times faster filtration. The proposed transformation techniques may easily be integrated as a preprocessing phase on top of existing similarity search heuristics such as BLAST[1], PatternHunter[11], FastA[17], and QUASAR[4] to efficiently prune non-relevant sequences. We study the precision of applying dimensionality reduction techniques for faster and more efficient range query searches, and discuss the imposed trade-offs.
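
The filtration idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes equal-length DNA strings, a one-hot encoding per nucleotide, Hamming distance as the proximity measure, and only the coarse (approximation) coefficients of an orthonormal Haar transform. All function names (`one_hot`, `haar_approx`, `reduce_seq`, `range_query`) are hypothetical. Under these assumptions the squared Euclidean distance between reduced vectors never exceeds twice the Hamming distance, so any sequence whose reduced distance exceeds `2 * k` can be pruned without false dismissals.

```python
import math

ALPHABET = "ACGT"  # assumed alphabet; other symbols encode as all zeros


def one_hot(seq):
    """Encode a DNA string as four 0/1 indicator channels, one per base."""
    return [[1.0 if c == b else 0.0 for c in seq] for b in ALPHABET]


def haar_approx(vec, block):
    """Coarse orthonormal Haar coefficients: block sums scaled by 1/sqrt(block)."""
    scale = 1.0 / math.sqrt(block)
    return [scale * sum(vec[i:i + block]) for i in range(0, len(vec), block)]


def reduce_seq(seq, block=4):
    """Reduced representation: coarse Haar coefficients of every channel."""
    return [c for channel in one_hot(seq) for c in haar_approx(channel, block)]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def range_query(query, database, k, block=4):
    """Return (hits, candidates) for Hamming range k.

    Filter step: keep sequences whose reduced squared distance is <= 2k.
    Refine step: verify the surviving candidates with the exact distance.
    """
    q_red = reduce_seq(query, block)
    candidates = [s for s in database
                  if sq_dist(q_red, reduce_seq(s, block)) <= 2 * k + 1e-9]
    hits = [s for s in candidates if hamming(query, s) <= k]
    return hits, candidates


if __name__ == "__main__":
    db = ["ACGTACGT", "ACGTACGA", "TTTTTTTT", "GGGGCCCC"]
    hits, candidates = range_query("ACGTACGT", db, k=1)
    print(len(db), "sequences,", len(candidates), "candidates,", len(hits), "hits")
```

In practice the reduced vectors would be computed once for the whole database and stored, so the filter step touches only short vectors. A larger `block` gives shorter vectors but a looser bound, which is the kind of precision trade-off the abstract refers to.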

Original language: English
Title of host publication: Proceedings - 3rd IEEE Symposium on BioInformatics and BioEngineering, BIBE 2003
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 149-157
Number of pages: 9
ISBN (Print): 0769519075, 9780769519074
DOI: https://doi.org/10.1109/BIBE.2003.1188941
Publication status: Published - 2003
Externally published: Yes
Event: 3rd IEEE Symposium on BioInformatics and BioEngineering, BIBE 2003 - Bethesda, United States
Duration: 10 Mar 2003 to 12 Mar 2003

Other

Other: 3rd IEEE Symposium on BioInformatics and BioEngineering, BIBE 2003
Country: United States
City: Bethesda
Period: 10/3/03 to 12/3/03

Fingerprint

DNA sequences
DNA

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Aghili, S. A., Agrawal, D., & El Abbadi, A. (2003). Filtration of string proximity search via transformation. In Proceedings - 3rd IEEE Symposium on BioInformatics and BioEngineering, BIBE 2003 (pp. 149-157). [1188941] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BIBE.2003.1188941

@inproceedings{9978dbb1d58d49bfa9acd2a37c65c1c5,
title = "Filtration of string proximity search via transformation",
abstract = "The problem of proximity search in biological databases is addressed. We study vector transformations and apply DFT (Discrete Fourier Transformation) and DWT (Discrete Wavelet Transformation, Haar) dimensionality reduction techniques to DNA sequence proximity search in order to reduce the search time of range queries. Our empirical results on a number of Prokaryote and Eukaryote DNA contig databases demonstrate up to a 50-fold filtration ratio of the search space and up to 13 times faster filtration. The proposed transformation techniques may easily be integrated as a preprocessing phase on top of existing similarity search heuristics such as BLAST[1], PatternHunter[11], FastA[17], and QUASAR[4] to efficiently prune non-relevant sequences. We study the precision of applying dimensionality reduction techniques for faster and more efficient range query searches, and discuss the imposed trade-offs.",
author = "Aghili, S. Alireza and Agrawal, Divyakant and {El Abbadi}, Amr",
year = "2003",
doi = "10.1109/BIBE.2003.1188941",
language = "English",
isbn = "0769519075",
pages = "149--157",
booktitle = "Proceedings - 3rd IEEE Symposium on BioInformatics and BioEngineering, BIBE 2003",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY  - GEN
T1  - Filtration of string proximity search via transformation
AU  - Aghili, S. Alireza
AU  - Agrawal, Divyakant
AU  - El Abbadi, Amr
PY  - 2003
Y1  - 2003
AB  - The problem of proximity search in biological databases is addressed. We study vector transformations and apply DFT (Discrete Fourier Transformation) and DWT (Discrete Wavelet Transformation, Haar) dimensionality reduction techniques to DNA sequence proximity search in order to reduce the search time of range queries. Our empirical results on a number of Prokaryote and Eukaryote DNA contig databases demonstrate up to a 50-fold filtration ratio of the search space and up to 13 times faster filtration. The proposed transformation techniques may easily be integrated as a preprocessing phase on top of existing similarity search heuristics such as BLAST[1], PatternHunter[11], FastA[17], and QUASAR[4] to efficiently prune non-relevant sequences. We study the precision of applying dimensionality reduction techniques for faster and more efficient range query searches, and discuss the imposed trade-offs.
UR  - http://www.scopus.com/inward/record.url?scp=4544363220&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=4544363220&partnerID=8YFLogxK
U2  - 10.1109/BIBE.2003.1188941
DO  - 10.1109/BIBE.2003.1188941
M3  - Conference contribution
SN  - 0769519075
SN  - 9780769519074
SP  - 149
EP  - 157
BT  - Proceedings - 3rd IEEE Symposium on BioInformatics and BioEngineering, BIBE 2003
PB  - Institute of Electrical and Electronics Engineers Inc.
ER  -