Scalable I/O tracing and analysis

Karthik Vijayakumar, Frank Mueller, Xiaosong Ma, Philip C. Roth

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Citations (Scopus)

Abstract

As supercomputer performance has approached and then surpassed the petaflop level, I/O performance has become a major bottleneck for many scientific applications. Several tools exist to collect I/O traces to assist in the analysis of I/O performance problems. However, these tools either produce extremely large trace files that complicate performance analysis, or sacrifice accuracy to collect high-level statistical information. We propose a multi-level trace generator tool, ScalaIOTrace, that collects traces at several levels in the HPC I/O stack. ScalaIOTrace features aggressive trace compression that generates trace files of near-constant size for regular I/O patterns, and orders of magnitude smaller than flat traces for less regular ones. This enables the collection of I/O and communication traces of applications running on thousands of processors. Our contributions also include automated trace analysis, which collects selected statistical information about I/O calls by parsing the compressed trace on the fly, and time-accurate replay of communication events together with MPI-IO calls. We evaluated our approach with the Parallel Ocean Program (POP) climate simulation and the FLASH parallel I/O benchmark. POP uses NetCDF as its I/O library, while FLASH I/O uses the parallel HDF5 I/O library, which internally maps onto MPI-IO. We collected MPI-IO and low-level POSIX I/O traces to study application I/O behavior. Our results show constant-size trace files of only 145 KB irrespective of the number of nodes for the FLASH I/O benchmark, which exhibits regular I/O and communication patterns. For POP, we observe up to two orders of magnitude reduction in trace file size compared to flat traces. The statistical information gathered reveals insights into the number of I/O and communication calls issued by POP and FLASH I/O. Such concise traces are unprecedented for isolated I/O and combined I/O plus communication tracing.
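
The abstract describes intercepting calls at the MPI-IO level of the I/O stack. As a rough illustration of how such interposition is typically achieved, the sketch below wraps MPI_File_write through the MPI profiling interface (PMPI). This is an assumption about the general technique, not the paper's actual implementation; in particular, the flat per-call logging shown here stands in for ScalaIOTrace's pattern-based compression, and the trace file name is hypothetical.

```c
/* Minimal sketch of MPI-IO call interception via the MPI profiling
 * interface (PMPI). Illustrative only; not ScalaIOTrace's code. */
#include <mpi.h>
#include <stdio.h>

/* Hypothetical per-rank event log. A real tool like ScalaIOTrace would
 * compress repeated call patterns rather than append flat records. */
static FILE *trace_log = NULL;

int MPI_File_write(MPI_File fh, const void *buf, int count,
                   MPI_Datatype datatype, MPI_Status *status)
{
    double t0 = MPI_Wtime();
    /* Forward to the real implementation through the PMPI entry point. */
    int rc = PMPI_File_write(fh, buf, count, datatype, status);
    double t1 = MPI_Wtime();

    if (trace_log == NULL)
        trace_log = fopen("rank.trace", "w");

    /* Record the call, its payload size, and its duration. */
    int type_size = 0;
    MPI_Type_size(datatype, &type_size);
    fprintf(trace_log, "MPI_File_write bytes=%d dt=%.6f\n",
            count * type_size, t1 - t0);
    return rc;
}
```

Linking such wrappers into an application (or preloading them as a shared library) captures every MPI-IO call without modifying application source, which is what makes multi-level tracing across thousands of processes practical.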

Original language: English
Title of host publication: Proceedings of the 4th Annual Petascale Data Storage Workshop, PDSW '09, Held in conjunction with Supercomputing '09
Pages: 26-31
Number of pages: 6
DOI: 10.1145/1713072.1713080
ISBN: 9781605588834
Publication status: Published - 1 Dec 2009
Externally published: Yes
Event: 4th Annual Petascale Data Storage Workshop, PDSW '09, Held in conjunction with Supercomputing '09 - Portland, OR, United States
Duration: 15 Nov 2009 - 15 Nov 2009

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software

Cite this

Vijayakumar, K., Mueller, F., Ma, X., & Roth, P. C. (2009). Scalable I/O tracing and analysis. In Proceedings of the 4th Annual Petascale Data Storage Workshop, PDSW '09, Held in conjunction with Supercomputing '09 (pp. 26-31). https://doi.org/10.1145/1713072.1713080