A scientific data management system for irregular applications

Jaechun No, R. Thakur, D. Kaushik, L. Freitag, A. Choudhary

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

Many scientific applications are I/O intensive and generate large data sets, spanning hundreds or thousands of "files." Management, storage, efficient access, and analysis of this data present an extremely challenging task. We have developed a software system, called Scientific Data Manager (SDM), that uses a combination of parallel file I/O and database support for high-performance scientific data management. SDM provides a high-level API to the user and, internally, uses a parallel file system to store real data and a database to store application-related metadata. In this paper, we describe how we designed and implemented SDM to support irregular applications. SDM can efficiently handle the reading and writing of data in an irregular mesh, as well as the distribution of index values. We describe the SDM user interface and how we have implemented it to achieve high performance. SDM makes extensive use of MPI-IO's noncontiguous collective I/O functions. SDM also uses the concept of a history file to optimize the cost of the index distribution using the metadata stored in the database. We present performance results with two irregular applications, a CFD code called FUN3D and a Rayleigh-Taylor instability code, on the SGI Origin2000 at Argonne National Laboratory.
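To make the idea of writing irregular-mesh data concrete, the following is a minimal single-process sketch, not SDM's implementation and not its API. It assumes each process holds a list of values plus a list of global node indices, and that a value belongs in the shared file at the offset given by its global index. The function name `write_irregular`, the `partitions` layout, and the use of a `bytearray` as a stand-in for a shared file are all hypothetical; real code would express the same offsets through an MPI-IO file view and a collective write.

```python
import struct

ITEM_SIZE = 8  # bytes per double-precision value


def write_irregular(buf, local_values, global_indices):
    """Place each local value at the offset implied by its global index.

    `buf` is a bytearray standing in for a shared file (an assumption of
    this sketch). Returns the sorted visiting order of the local elements.
    """
    # Visit elements in increasing global-index order so that the file
    # offsets are touched monotonically, as a noncontiguous layout would be.
    order = sorted(range(len(global_indices)), key=global_indices.__getitem__)
    for i in order:
        offset = global_indices[i] * ITEM_SIZE
        struct.pack_into("<d", buf, offset, local_values[i])
    return order


# Two simulated processes, each owning an irregular subset of 6 mesh nodes.
# Each entry maps a rank to (global node indices, values at those nodes).
partitions = {
    0: ([4, 0, 2], [40.0, 0.0, 20.0]),
    1: ([5, 1, 3], [50.0, 10.0, 30.0]),
}

shared_file = bytearray(6 * ITEM_SIZE)
orders = {}
for rank, (indices, values) in partitions.items():
    orders[rank] = write_irregular(shared_file, values, indices)

# Read the file back in global node order.
result = list(struct.unpack("<6d", shared_file))
print(result)  # [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
```

The sorted order returned for each rank is the kind of derived index information that could be saved and reused on a later run instead of being recomputed; treating it that way here is only an illustration of the history-file idea, not a description of what SDM stores.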

Original language: English
Title of host publication: Proceedings - 15th International Parallel and Distributed Processing Symposium, IPDPS 2001
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1216-1223
Number of pages: 8
ISBN (Electronic): 0769509908, 9780769509907
DOIs: https://doi.org/10.1109/IPDPS.2001.925096
Publication status: Published - 2001
Externally published: Yes
Event: 15th International Parallel and Distributed Processing Symposium, IPDPS 2001 - San Francisco, United States
Duration: 23 Apr 2001 to 27 Apr 2001

Other

Other: 15th International Parallel and Distributed Processing Symposium, IPDPS 2001
Country: United States
City: San Francisco
Period: 23/4/01 to 27/4/01

Fingerprint

Information management
Managers
Metadata
Storage management
Application programming interfaces (API)
User interfaces
Computational fluid dynamics
Costs

ASJC Scopus subject areas

  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

No, J., Thakur, R., Kaushik, D., Freitag, L., & Choudhary, A. (2001). A scientific data management system for irregular applications. In Proceedings - 15th International Parallel and Distributed Processing Symposium, IPDPS 2001 (pp. 1216-1223). [925096] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IPDPS.2001.925096

@inproceedings{2bc47c45a5bf41bd8a906cde2af321dd,
title = "A scientific data management system for irregular applications",
abstract = "Many scientific applications are I/O intensive and generate large data sets, spanning hundreds or thousands of {"}files.{"} Management, storage, efficient access, and analysis of this data present an extremely challenging task. We have developed a software system, called Scientific Data Manager (SDM), that uses a combination of parallel file I/O and database support for high-performance scientific data management. SDM provides a high-level API to the user and, internally, uses a parallel file system to store real data and a database to store application-related metadata. In this paper, we describe how we designed and implemented SDM to support irregular applications. SDM can efficiently handle the reading and writing of data in an irregular mesh, as well as the distribution of index values. We describe the SDM user interface and how we have implemented it to achieve high performance. SDM makes extensive use of MPI-IO's noncontiguous collective I/O functions. SDM also uses the concept of a history file to optimize the cost of the index distribution using the metadata stored in the database. We present performance results with two irregular applications, a CFD code called FUN3D and a Rayleigh-Taylor instability code, on the SGI Origin2000 at Argonne National Laboratory.",
author = "Jaechun No and R. Thakur and D. Kaushik and L. Freitag and A. Choudhary",
year = "2001",
doi = "10.1109/IPDPS.2001.925096",
language = "English",
pages = "1216--1223",
booktitle = "Proceedings - 15th International Parallel and Distributed Processing Symposium, IPDPS 2001",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY  - GEN
T1  - A scientific data management system for irregular applications
AU  - No, Jaechun
AU  - Thakur, R.
AU  - Kaushik, D.
AU  - Freitag, L.
AU  - Choudhary, A.
PY  - 2001
Y1  - 2001
AB  - Many scientific applications are I/O intensive and generate large data sets, spanning hundreds or thousands of "files." Management, storage, efficient access, and analysis of this data present an extremely challenging task. We have developed a software system, called Scientific Data Manager (SDM), that uses a combination of parallel file I/O and database support for high-performance scientific data management. SDM provides a high-level API to the user and, internally, uses a parallel file system to store real data and a database to store application-related metadata. In this paper, we describe how we designed and implemented SDM to support irregular applications. SDM can efficiently handle the reading and writing of data in an irregular mesh, as well as the distribution of index values. We describe the SDM user interface and how we have implemented it to achieve high performance. SDM makes extensive use of MPI-IO's noncontiguous collective I/O functions. SDM also uses the concept of a history file to optimize the cost of the index distribution using the metadata stored in the database. We present performance results with two irregular applications, a CFD code called FUN3D and a Rayleigh-Taylor instability code, on the SGI Origin2000 at Argonne National Laboratory.
UR  - http://www.scopus.com/inward/record.url?scp=84981156270&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84981156270&partnerID=8YFLogxK
U2  - 10.1109/IPDPS.2001.925096
DO  - 10.1109/IPDPS.2001.925096
M3  - Conference contribution
SP  - 1216
EP  - 1223
BT  - Proceedings - 15th International Parallel and Distributed Processing Symposium, IPDPS 2001
PB  - Institute of Electrical and Electronics Engineers Inc.
ER  -