Reliable and Efficient Encoding for HPC

Managing Data Efficiently in Scientific Simulations and Workflows

Transformative research in science and engineering (S&E) aimed at the Grand Challenges of our time, such as climate and weather prediction, the design of new combustion systems, and understanding the atomic and molecular structures that shape chemistry and biology, depends on increasingly sophisticated computational models and data analysis at scale running on High Performance Computing (HPC) systems. These simulations and analyses are increasingly constrained by the massive volumes of data they must use, generate, and analyze, and by software stacks that lack proper mechanisms to reduce data movement. Meanwhile, there has been growing interest in the use of inexact data or computation and its impact on existing scientific workflows. Therefore, to continue achieving the scientific discovery critical to NSF’s mission, research is needed to develop new mechanisms that efficiently reduce the volume of data generated and transferred while also enabling rapid execution of analysis kernels directly on compressed data.


The research goal of this project is to investigate novel data encoding and decoding schemes that optimize data movement and computation in simulations and analyses, enabling their performance to scale seamlessly on current and future extreme-scale platforms. The research objectives of this proposal are to: 1) investigate data encoding and decoding of scientific datasets, and methodologies for effectively harnessing encoded data, to revolutionize exascale simulation and analysis performance and cost-effectiveness; 2) employ and scale encoded datasets seamlessly within current extreme-scale scientific workflows, including optimizing machine learning and data mining (ML&DM) algorithms for data-driven science, to maximize use of computing power and minimize errors; and 3) validate the performance of these novel mechanisms on an evaluation framework that applies them to multiple extreme-scale scientific applications, including climate, multiphysics, and fluid dynamics.

[Figure] Block decomposition with DCT, illustrated on the rlds dataset. The method applies to data of any dimensionality, since all data are treated as a flattened stream.
[Figure] Compression ratios (bars, primary y-axis) for DCT-EC, SZ, and ZFP at a maximum error bound (P) of 1E−3, with NRMSE between original and reconstructed data (markers, secondary y-axis; blue circle: DCT-EC, orange triangle: SZ, grey square: ZFP).
[Figure] Maximum relative error across simulation timesteps for two solvers in FLASH. The y-axis shows the maximum relative error between data generated by restarting from reconstructed data and the original data.
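The block-decomposition-with-DCT approach referenced in the figures can be illustrated with a minimal sketch: flatten the data, split it into fixed-size blocks, apply an orthonormal DCT to each block, and retain only the largest-magnitude coefficients. This is an assumption-laden toy in NumPy, not the project's actual DCT-EC implementation; the block size and number of retained coefficients are hypothetical parameters, and a real encoder would also quantize and entropy-code the surviving coefficients to meet the error bound.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix: row k holds the k-th cosine basis vector.
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.cos(np.pi * (m + 0.5) * k / n)
    C[0] *= 1.0 / np.sqrt(n)      # DC row scaling for orthonormality
    C[1:] *= np.sqrt(2.0 / n)     # AC row scaling for orthonormality
    return C

def encode_blocks(data, block=64, keep=16):
    # Flatten data of any dimensionality, zero-pad to a multiple of the block size.
    flat = data.ravel().astype(float)
    pad = (-flat.size) % block
    flat = np.pad(flat, (0, pad))
    C = dct_matrix(block)
    coeffs = flat.reshape(-1, block) @ C.T   # DCT of every block at once
    # Zero all but the `keep` largest-magnitude coefficients in each block.
    drop = np.argsort(np.abs(coeffs), axis=1)[:, :-keep]
    np.put_along_axis(coeffs, drop, 0.0, axis=1)
    return coeffs, C, data.shape, pad

def decode_blocks(coeffs, C, shape, pad):
    # Inverse transform: C is orthonormal, so its inverse is its transpose.
    flat = (coeffs @ C).ravel()
    if pad:
        flat = flat[:-pad]
    return flat.reshape(shape)
```

For a smooth field, most energy concentrates in the low-frequency coefficients, so reconstruction error (e.g., NRMSE, computed as the RMS difference divided by the data range) stays small even when most coefficients are discarded.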

The educational objectives of this project are to: 1) broaden and strengthen the current University of Massachusetts Lowell (UML) computer engineering curriculum at both the undergraduate and graduate levels, including enhancements to three existing courses on parallel computer architecture and development of a new course combining data science with computer architecture and parallel computing; 2) create an interdisciplinary research program that leverages other academic disciplines through collaborations within UML and with other universities, national labs, and international partners to address ever-increasing data movement concerns on extreme-scale computing platforms; and 3) implement proven, research-based interventions to attract, retain, and educate female and underrepresented minority (URM) students in computer engineering, and expose them to HPC.