PADS

Protein structure alignment using directional shape signatures

S. Alireza Aghili, Divyakant Agrawal, Amr El Abbadi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

A novel data mining approach for similarity search and knowledge discovery in protein structure databases is proposed. PADS (Protein structure Alignment by Directional shape Signatures) takes the three-dimensional coordinates of the main atoms of each amino acid and extracts a geometrical shape signature along with the direction of each amino acid. As a result, each protein structure is represented by a series of multidimensional feature vectors capturing the local geometry, shape, direction, and biological properties of its amino acid molecules. A distance matrix is then calculated and incorporated into a local alignment dynamic programming algorithm to find the similar portions of two given protein structures, followed by a sequence alignment step for more efficient filtration. The optimal superimposition of the detected similar regions is used to assess the quality of the results. The proposed algorithm is fast and accurate and hence could be used for analysis and knowledge discovery in large protein structures. The method has been compared with the results from CE, DALI, and CTSS using a representative sample of PDB structures. PADS detects several new structures that the other methods do not.
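
The distance-matrix and local-alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature definitions, scoring function, or gap handling, so the Euclidean distance, the `threshold - distance` score, the linear `gap` penalty, and the function name `local_align` are all assumptions chosen for illustration. The recurrence itself is the standard Smith-Waterman local alignment.

```python
import math


def local_align(a, b, threshold=1.0, gap=1.0):
    """Smith-Waterman-style local alignment of two feature-vector series.

    a, b      : lists of equal-dimension numeric tuples (one per residue).
    threshold : hypothetical cutoff; pairs closer than this score positive.
    gap       : hypothetical linear gap penalty.
    Returns (best_score, list of aligned (index_in_a, index_in_b) pairs).
    """
    n, m = len(a), len(b)
    # Distance matrix between every pair of feature vectors (assumed Euclidean).
    dist = [[math.dist(x, y) for y in b] for x in a]

    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = H[i - 1][j - 1] + (threshold - dist[i - 1][j - 1])
            H[i][j] = max(0.0, match, H[i - 1][j] - gap, H[i][j - 1] - gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + (threshold - dist[i - 1][j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


# Toy 2-D "signatures": the first three vectors of `a` reappear at b[1:4].
a = [(0, 0), (1, 0), (2, 0), (9, 9)]
b = [(5, 5), (0, 0), (1, 0), (2, 0)]
score, pairs = local_align(a, b)
print(score, pairs)  # 3.0 [(0, 1), (1, 2), (2, 3)]
```

In this toy input the shared run of three vectors is recovered as the best local alignment. A real pipeline would still need the sequence-based filtering and superimposition steps that the abstract mentions, which are not sketched here.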

Original language: English
Title of host publication: Lecture Notes in Computer Science
Editors: L. Zhou, B.C. Ooi, X. Meng
Pages: 17-29
Number of pages: 13
Volume: 3453
Publication status: Published - 2005
Externally published: Yes
Event: 10th International Conference on Database Systems for Advanced Applications, DASFAA 2005 - Beijing, China
Duration: 17 Apr 2005 to 20 Apr 2005

Other

Other: 10th International Conference on Database Systems for Advanced Applications, DASFAA 2005
Country: China
City: Beijing
Period: 17/4/05 to 20/4/05

Fingerprint

Proteins
Data mining
Amino acids
Dynamic programming
Atoms
Molecules
Geometry

Keywords

  • Bioinformatics
  • Biological data mining
  • Protein structure comparison
  • Shape similarity

ASJC Scopus subject areas

  • Computer Science (miscellaneous)

Cite this

Aghili, S. A., Agrawal, D., & El Abbadi, A. (2005). PADS: Protein structure alignment using directional shape signatures. In L. Zhou, B. C. Ooi, & X. Meng (Eds.), Lecture Notes in Computer Science (Vol. 3453, pp. 17-29)

PADS: Protein structure alignment using directional shape signatures. / Aghili, S. Alireza; Agrawal, Divyakant; El Abbadi, Amr.

Lecture Notes in Computer Science. ed. / L. Zhou; B.C. Ooi; X. Meng. Vol. 3453 2005. p. 17-29.

Aghili, SA, Agrawal, D & El Abbadi, A 2005, PADS: Protein structure alignment using directional shape signatures. in L Zhou, BC Ooi & X Meng (eds), Lecture Notes in Computer Science. vol. 3453, pp. 17-29, 10th International Conference on Database Systems for Advanced Applications, DASFAA 2005, Beijing, China, 17/4/05.
Aghili SA, Agrawal D, El Abbadi A. PADS: Protein structure alignment using directional shape signatures. In Zhou L, Ooi BC, Meng X, editors, Lecture Notes in Computer Science. Vol. 3453. 2005. p. 17-29
Aghili, S. Alireza; Agrawal, Divyakant; El Abbadi, Amr. / PADS: Protein structure alignment using directional shape signatures. Lecture Notes in Computer Science. editor / L. Zhou; B.C. Ooi; X. Meng. Vol. 3453 2005. pp. 17-29
@inproceedings{36419b7470404e249b2c6b018fe5bd42,
title = "PADS: Protein structure alignment using directional shape signatures",
abstract = "A novel data mining approach for similarity search and knowledge discovery in protein structure databases is proposed. PADS (Protein structure Alignment by Directional shape Signatures) incorporates the three dimensional coordinates of the main atoms of each amino acid and extracts a geometrical shape signature along with the direction of each amino acid. As a result, each protein structure is presented by a series of multidimensional feature vectors representing local geometry, shape, direction, and biological properties of its amino acid molecules. Furthermore, a distance matrix is calculated and is incorporated into a local alignment dynamic programming algorithm to find the similar portions of two given protein structures followed by a sequence alignment step for more efficient filtration. The optimal superimposition of the detected similar regions is used to assess the quality of the results. The proposed algorithm is fast and accurate and hence could be used for analysis and knowledge discovery in large protein structures. The method has been compared with the results from CE, DALI, and CTSS using a representative sample of PDB structures. Several new structures not detected by other methods are detected.",
keywords = "Bioinformatics, Biological data mining, Protein structure comparison, Shape similarity",
author = "Aghili, {S. Alireza} and Divyakant Agrawal and {El Abbadi}, Amr",
year = "2005",
language = "English",
volume = "3453",
pages = "17--29",
editor = "L. Zhou and B.C. Ooi and X. Meng",
booktitle = "Lecture Notes in Computer Science",

}

TY - GEN

T1 - PADS

T2 - Protein structure alignment using directional shape signatures

AU - Aghili, S. Alireza

AU - Agrawal, Divyakant

AU - El Abbadi, Amr

PY - 2005

Y1 - 2005

AB - A novel data mining approach for similarity search and knowledge discovery in protein structure databases is proposed. PADS (Protein structure Alignment by Directional shape Signatures) incorporates the three dimensional coordinates of the main atoms of each amino acid and extracts a geometrical shape signature along with the direction of each amino acid. As a result, each protein structure is presented by a series of multidimensional feature vectors representing local geometry, shape, direction, and biological properties of its amino acid molecules. Furthermore, a distance matrix is calculated and is incorporated into a local alignment dynamic programming algorithm to find the similar portions of two given protein structures followed by a sequence alignment step for more efficient filtration. The optimal superimposition of the detected similar regions is used to assess the quality of the results. The proposed algorithm is fast and accurate and hence could be used for analysis and knowledge discovery in large protein structures. The method has been compared with the results from CE, DALI, and CTSS using a representative sample of PDB structures. Several new structures not detected by other methods are detected.

KW - Bioinformatics

KW - Biological data mining

KW - Protein structure comparison

KW - Shape similarity

UR - http://www.scopus.com/inward/record.url?scp=24644524737&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=24644524737&partnerID=8YFLogxK

M3 - Conference contribution

VL - 3453

SP - 17

EP - 29

BT - Lecture Notes in Computer Science

A2 - Zhou, L.

A2 - Ooi, B.C.

A2 - Meng, X.

ER -