Toward efficient search for ultrascale storage systems

Joseph L. Naps, Mohamed Mokbel, David H.C. Du

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

As the rate at which scientific computing generates data continues to increase, managing this data, in all its facets, is quickly becoming more challenging. In many facilities with large-scale storage needs, this massive amount of data is stored in distributed, multi-tiered storage systems. It has become imperative to allow for efficient and effective search within these kinds of environments. For some search problems, specifically file system metadata search, traditional relational databases have been used with, initially, good results. As the scale of supercomputing has grown, though, it is becoming increasingly difficult for databases to scale with the volume of metadata that they manage. In this paper, we propose a new direction for database management techniques within the context of high-performance computing, specifically search within ultrascale storage systems. Instead of using databases as a layer sitting above the storage system, we suggest moving database components into the storage system itself. By taking this approach, we aim to leverage the decades of research and tuning that have made relational database technology successful. At the same time, this integration gives us the ability to maintain a better view of the storage system for search optimization. Through this effort, we can position these techniques to scale to the degree that the high-performance computing community requires, both now and in the future.

Original language: English
Title of host publication: HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11
Pages: 1-4
Number of pages: 4
DOIs: https://doi.org/10.1145/2125636.2125638
Publication status: Published - 1 Dec 2011
Externally published: Yes
Event: 1st Annual 2011 Workshop on High-Performance Computing Meets Databases, HPCDB'11, Co-located with Supercomputing, SC'11 - Seattle, WA, United States
Duration: 13 Nov 2011 → 13 Nov 2011

Other

Other: 1st Annual 2011 Workshop on High-Performance Computing Meets Databases, HPCDB'11, Co-located with Supercomputing, SC'11
Country: United States
City: Seattle, WA
Period: 13/11/11 → 13/11/11

Fingerprint

Metadata
Natural sciences computing
Tuning

Keywords

  • Databases
  • Exascale
  • File systems
  • Indexing
  • Search

ASJC Scopus subject areas

  • Computer Science Applications
  • Software

Cite this

Naps, J. L., Mokbel, M., & Du, D. H. C. (2011). Toward efficient search for ultrascale storage systems. In HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11 (pp. 1-4) https://doi.org/10.1145/2125636.2125638

@inproceedings{3d9c14134108445488f9a834c5eee389,
title = "Toward efficient search for ultrascale storage systems",
keywords = "Databases, Exascale, File systems, Indexing, Search",
author = "Naps, {Joseph L.} and Mohamed Mokbel and Du, {David H.C.}",
year = "2011",
month = "12",
day = "1",
doi = "10.1145/2125636.2125638",
language = "English",
isbn = "9781450311571",
pages = "1--4",
booktitle = "HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11",

}

TY - GEN

T1 - Toward efficient search for ultrascale storage systems

AU - Naps, Joseph L.

AU - Mokbel, Mohamed

AU - Du, David H.C.

PY - 2011/12/1

Y1 - 2011/12/1

KW - Databases

KW - Exascale

KW - File systems

KW - Indexing

KW - Search

UR - http://www.scopus.com/inward/record.url?scp=84857926748&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84857926748&partnerID=8YFLogxK

U2 - 10.1145/2125636.2125638

DO - 10.1145/2125636.2125638

M3 - Conference contribution

AN - SCOPUS:84857926748

SN - 9781450311571

SP - 1

EP - 4

BT - HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11

ER -