Skip to main content

Jinzhen Wang

New Jersey Institute of Technology, Electrical and Computer Engineering, Graduate Student

Shandong University, Information Science and Engineering, Undergraduate

Followers

28

Following

20

Co-author

1

Public Views

I'm currently a PhD student in the Department of Electrical and Computer Engineering, New Jersey Institute of Technology. My research interest lies in HPC and Big Data.
Address: Newark, New Jersey, United States

less

Armando Marques-Guedes

UNL - New University of Lisbon

Ludwig-Maximilians-Universität München

University of Technology Sydney

PALIMOTE JUSTICE

RIVERS STATE POLYTECHNIC

Bálint Molnár

Eötvös Loránd University

Indian Institute of Technology Indore

Mohamad Ivan Fanany

University of Indonesia

Vellingiriraj E.K

Kongu Engineering College

Nguyễn Văn Hiếu

Juan Carlos Gomez

Universidad de Guanajuato

Interests

Uploads

Project Report by Jinzhen Wang

SOUND SOURCE LOCALIZATION USING MICROPHONE ARRAYS

Sound source localization technique measures the sound field to determine the location of sound s... more Sound source localization technique measures the sound field to determine the location of sound source directly. In this experiment, we try to improve an existing localization method by improving the source sound to get better localization results.
We build up the experiment setup, debug the system so that it works properly and then we try with different kinds of sound sources to find out the best kind of sound that fits this localization method. Then we run analysis to see why the sound source suits this method or not.
We use an 8-microphone array that is connected to NI high speed A/D converter to collect the signals and digitize the signals so that we can collect and process data easily with computers. We use Matlab to collect the signals and process the signals for both location estimation and for analysis.

IMPLEMENTATION OF COLLABORATIVE FILTERING TECHNIQUES FOR MOVIE RECOMMENDATION

by Jinzhen Wang and Chris Markson

A recommendation system is a kind of information filtering system that identifies recommendations... more A recommendation system is a kind of information filtering system that identifies recommendations for individual users based on their purchase history, ratings or other types of interactions.
There are two types of techniques mainly used:
1) Content-based filtering: Constructs recommendations on the basis of a user's behavior (for example, historical browsing information).
2) Collaborative filtering: Constructs recommendations by searching a large group of people and finding a smaller set with similar taste to the user we are generating recommendation to.

Thesis by Jinzhen Wang

Undergraduate Senior Design Thesis

Feature Extraction is a concept in the field of Computer Vision and Image Processing. It means to... more Feature Extraction is a concept in the field of Computer Vision and Image Processing. It means to extract the digital information of digital images with digital computer, decide whether every point in the image meets a specific image feature, usually a digital feature. Feature Extraction operation is a first level computation step, for many computer graphic algorithms take Feature Extraction operation as the pre-processing step to images. A series of Feature Extraction method algorithms are developed aiming at different application purposes and different application environment. These algorithms can extract different kinds of features with various computation complexity and repeatability. With the characteristics of Feature Extraction, this paper aims at and studies the listed works:
Bring out and implement a unique Feature Extraction Method basing on foreground and feature segmentation (FEM). FEM aims at the real data from the experimental platform of bats flight dynamics. Thus the FEM is targeted, faster and more accurate than other algorithms.
Put forward and implement a tracking method basing on optic flow optimization. Then compare this method with Surf from Open CV and Kalman Filtering algorithm. This method is more accurate than Surf and faster than KF algorithm with low accuracy decline.
Put two parts together and design a Matlab program with which the data from the experimental platform of bats flight dynamics can be processed automatically with less time and manpower required.Finally summarized this work, pointed out its weakness and the outlook.

Papers by Jinzhen Wang

Analyzing the Impact of Lossy Data Reduction on Volume Rendering of Cosmology Data

2022 IEEE/ACM 8th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD)

Locality-based transfer learning on compression autoencoder for efficient scientific data lossy compression

Journal of Network and Computer Applications

Reducing the Training Overhead of the HPC Compression Autoencoder via Dataset Proportioning

2021 IEEE International Conference on Networking, Architecture and Storage (NAS)

As the storage overhead of high-performance computing (HPC) data reaches into the petabyte or eve... more As the storage overhead of high-performance computing (HPC) data reaches into the petabyte or even exabyte scale, it could be useful to find new methods of compressing such data. The compression autoencoder (CAE) has recently been proposed to compress HPC data with a very high compression ratio. However, this machine learning-based method suffers from the major drawback of lengthy training time. In this paper, we attempt to mitigate this problem by proposing a proportioning scheme to reduce the amount of data that is used for training relative to the amount of data to be compressed. We show that this method drastically reduces the training time without, in most cases, significantly increasing the error. We further explain how this scheme can even improve the accuracy of the CAE on certain datasets. Finally, we provide some guidance on how to determine a suitable proportion of the training dataset to use in order to train the CAE for a given dataset.

Exploring Transfer Learning to Reduce Training Overhead of HPC Data in Machine Learning

2019 IEEE International Conference on Networking, Architecture and Storage (NAS), 2019

Nowadays, scientific simulations on high-performance computing (HPC) systems can generate large a... more Nowadays, scientific simulations on high-performance computing (HPC) systems can generate large amounts of data (in the scale of terabytes or petabytes) per run. When this huge amount of HPC data is processed by machine learning applications, the training overhead will be significant. Typically, the training process for a neural network can take several hours to complete, if not longer. When machine learning is applied to HPC scientific data, the training time can take several days or even weeks. Transfer learning, an optimization usually used to save training time or achieve better performance, has potential for reducing this large training overhead. In this paper, we apply transfer learning to a machine learning HPC application. We find that transfer learning can reduce training time without, in most cases, significantly increasing the error. This indicates transfer learning can be very useful for working with HPC datasets in machine learning applications.

jw447/zPerf: First release

zPerf: A Statistical Gray-box Approach to Performance Modeling and Extrapolation for Scientific L... more

Robust and scalable deep learning for X-ray synchrotron image analysis

2017 New York Scientific Data Summit (NYSDS), 2017

X-ray scattering is a key technique in modern synchrotron facilities towards material analysis an... more X-ray scattering is a key technique in modern synchrotron facilities towards material analysis and discovery via structural characterization at the molecular scale and nano-scale. Image classification and tagging play a crucial role in recognizing patterns, inferring meaningful physical properties from sample, and guiding subsequent experiment steps. We designed deeplearning based image classification pipelines and gained significant improvements in terms of accuracy and speed. Constrained by available computing resources and optimization library, we need to make trade-off among computation efficiency, input image size and volume, and the flexibility and stability of processing images with different levels of qualities and artifacts. Consequently, our deep learning framework requires careful data preprocessing techniques to down-sample images and extract true image signals. However, X-ray scattering images contain different levels of noise, numerous gaps, rotations, and defects arising from detector limitations, sample (mis)alignment, and experimental configuration. Traditional methods of healing x-ray scattering images make strong assumptions about these artifacts and require hand-crafted procedures and experiment meta-data to de-noise, interpolate measured data to eliminate gaps, and rotate and translate images to align the center of samples with the center of images. These manual procedures are error-prone, experience-driven, and isolated from the intended image prediction, and consequently not scalable to the data rate of X-ray images from modern detectors. We aim to explore deeplearning based image classification techniques that are robust and capable of leverage high-definition experimental images with rich variations even in a production environment that is not defect-free, and ultimately automate labor-intensive data preprocessing tasks and integrate them seamlessly into our TensorFlow based experimental data analysis framework.

Identifying Latent Reduced Models to Precondition Lossy Compression

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2019

With the high volume and velocity of scientific data produced on high-performance computing syste... more With the high volume and velocity of scientific data produced on high-performance computing systems, it has become increasingly critical to improve the compression performance. Leveraging the general tolerance of reduced accuracy in applications, lossy compressors can achieve much higher compression ratios with a user-prescribed error bound. However, they are still far from satisfying the reduction requirements from applications. In this paper, we propose and evaluate the idea that data need to be preconditioned prior to compression, such that they can better match the design philosophies of a compressor. In particular, we aim to identify a reduced model that can be utilized to transform the original data to a more compressible form. We begin with a case study of Heat3d as a proof of concept, in which we demonstrate that a reduced model can indeed reside in the full model output, and can be utilized to improve compression ratios. We further explore more general dimension reduction techniques to extract the reduced model, including principal component analysis, singular value decomposition, and discrete wavelet transform. After preconditioning, the reduced model in conjunction with delta is stored, which results in higher compression ratios. We evaluate the reduced models on nine scientific datasets, and the results show the effectiveness of our approaches.

High-Ratio Lossy Compression: Exploring the Autoencoder to Compress Scientific Data

IEEE Transactions on Big Data, 2021

Estimating Lossy Compressibility of Scientific Data Using Deep Neural Networks

IEEE Letters of the Computer Society, 2020

Formal modeling of multi-agent systems is an active area of research. The use of precise and unam... more Formal modeling of multi-agent systems is an active area of research. The use of precise and unambiguous notation of formal methods is used to accurately describe and reason about the system under consideration at the design time. Multi-agent systems deployed in dynamic and unpredictable environment needs to have the ability of self-adaptation, making them adaptable to the failures. State of the art encourages the use of MAPE-K feedback loop for the provision of self-adaptation in any system. There is a dire need of formal vocabulary that can be used for the conceptual design of any real-time multi-agent system with selfadaptation. In this paper, we have proposed a set of predefined interfaces for the provision of self-adaptation in real-time multi-agent systems. The interfaces are based on monitor, analyze, plan, and execute phases of the MAPE-K feedback loop. We formally specify our interfaces using timed-communicating object-Z language. The complete framework is elaborated using a trivial case-study of conveyor belt system based on a real-time agent architecture. INDEX TERMS Formal methods, self-adaptation, autonomic computing, multi-agent systems, real-time systems.

Compression Ratio Modeling and Estimation across Error Bounds for Lossy Compression

IEEE Transactions on Parallel and Distributed Systems, 2019

Scientific simulations on high-performance computing (HPC) systems generate vast amounts of float... more Scientific simulations on high-performance computing (HPC) systems generate vast amounts of floating-point data that need to be reduced in order to lower the storage and I/O cost. Lossy compressors trade data accuracy for reduction performance and have been demonstrated to be effective in reducing data volume. However, a key hurdle to wide adoption of lossy compressors is that the trade-off between data accuracy and compression performance, particularly the compression ratio, is not well understood. Consequently, domain scientists often need to exhaust many possible error bounds before they can figure out an appropriate setup. The current practice of using lossy compressors to reduce data volume is, therefore, through trial and error, which is not efficient for large datasets which take a tremendous amount of computational resources to compress. This paper aims to analyze and estimate the compression performance of lossy compressors on HPC datasets. In particular, we predict the compression ratios of two modern lossy compressors that achieve superior performance, SZ and ZFP, on HPC scientific datasets at various error bounds, based upon the compressors' intrinsic metrics collected under a given base error bound. We evaluate the estimation scheme using twenty real HPC datasets and the results confirm the effectiveness of our approach.

DuoModel: Leveraging Reduced Model for Data Reduction and Re-Computation on HPC Storage

IEEE Letters of the Computer Society, 2018

High-performance computing (HPC) applications generate large amounts of floating-point data that ... more High-performance computing (HPC) applications generate large amounts of floating-point data that need to be stored and analyzed efficiently to extract the insights and advance knowledge discovery. With the growing disparities between compute and I/O, optimizing the storage stack alone may not suffice to cure the I/O problem. There has been a strong push in the HPC communities to perform data reduction before data is transmitted to storage in order to lower the I/O cost. However, as of now, neither lossless nor lossy compressors can achieve the adequate reduction ratio that is desired by applications. This paper proposes DuoModel, a new approach that leverages the similarity between the full and reduced application models, and further improve the data reduction ratio. DouModel further improves the compression ratio of state-ofthe-art compressors via compressing the differences (termed as delta) between the data products of the two models. For data analytics, the high fidelity data can be re-computed by launching the reduced model and applying the compressed delta. Our evaluations confirm that DuoModel can further push the limit of data reduction while the high fidelity of data is maintained.

SIRIUS: Enabling Progressive Data Exploration for Extreme-Scale Scientific Data

IEEE Transactions on Multi-Scale Computing Systems, 2018

Scientific simulations on high performance computing (HPC) platforms generate large quantities of... more Scientific simulations on high performance computing (HPC) platforms generate large quantities of data. To bridge the widening gap between compute and I/O, and enable data to be more efficiently stored and analyzed, simulation outputs need to be refactored, reduced, and appropriately mapped to storage tiers. However, a systematic solution to support these steps has been lacking in the current HPC software ecosystem. To that end, this paper develops SIRIUS, a progressive JPEG-like data management scheme for storing and analyzing big scientific data. It co-designs data decimation, compression, and data storage, taking the hardware characteristics of each storage tier into considerations. With reasonably low overhead, our approach refactors simulation data, using either topological or uniform decimation, into a much smaller, reduced-accuracy base dataset, and a series of deltas that is used to augment the accuracy if needed. The base dataset and deltas are compressed and written to multiple storage tiers. Data saved on different tiers can then be selectively retrieved to restore the level of accuracy that satisfies data analytics. Thus, SIRIUS provides a paradigm shift towards elastic data analytics and enables end users to make trade-offs between analysis speed and accuracy on-the-fly. This paper further develops algorithms to preserve statistics for data decimation, a common requirement for reducing data. We assess the impact of SIRIUS on unstructured triangular meshes, a pervasive data model used in scientific simulations. In particular, we evaluate two realistic use cases: the blob detection in fusion and highpressure area extraction in computational fluid dynamics. Diffraction Neuromuscular Hafnium compounds Adhesive strength Bit rate Antibacterial activity. Human-robot interaction Induction heating Long Term Evolution Wet etching X-rays Reluctance machines Data communication Satellite ground stations Task analysis Learning management systems Pulse shaping methods Probability distribution. Filament lamps Wavelength conversion Dynamic equilibrium Transfer molding Waste recovery. Computer performance Ribs Stray light Electromagnetic measurements MySpace Geophysics Knowledge transfer Brazing Electronics cooling Electric vehicle charging Silicon photonics Thin film inductors. Wet etching Log-periodic dipole antennas Integrated circuit testing Tunable circuits and devices Unmanned vehicles Bulk storage Fuel economy Stock markets Flip chip solder joints Breast tissue Semiconductor device measurement Epilepsy Piezooptic effects. Constellation diagram Matrix decomposition Gaze tracking Microstrip antennas Electrooptic modulators Gynecology Human voice. Stripline International Atomic Time Cryptocurrency Materials reliability Semiconductor device doping Particle collisions Switching converters.

Unbalanced Parallel I/O: An Often-Neglected Side Effect of Lossy Scientific Data Compression

2021 7th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-7), 2021

Lossy compression techniques have demonstrated promising results in significantly reducing the sc... more Lossy compression techniques have demonstrated promising results in significantly reducing the scientific data size while guaranteeing the compression error bounds. However, one important yet often neglected side effect of lossy scientific data compression is its impact on the performance of parallel I/O. Our key observation is that the compressed data size is often highly skewed across processes in lossy scientific compression. To understand this behavior, we conduct extensive experiments where we apply three lossy compressors MGARD, ZFP, and SZ, which are specifically designed and optimized for scientific data, to three real-world scientific applications Gray-Scott simulation, WarpX, and XGC. Our analysis result demonstrates that the size of the compressed data is always skewed even if the original data is evenly decomposed among processes. Such skewness widely exists in different scientific applications using different compressors as long as the information density of the data varies across processes. We then systematically study how this side effect of lossy scientific data compression impacts the performance of parallel I/O. We observe that the skewness in the sizes of the compressed data often leads to I/O imbalance, which can significantly reduce the efficiency of I/O bandwidth utilization if not properly handled. In addition, writing data concurrently to a single shared file through MPI-IO library is more sensitive to the unbalanced I/O loads. Therefore, we believe our research community should pay more attention to the unbalanced parallel I/O caused by lossy scientific data compression.

SOUND SOURCE LOCALIZATION USING MICROPHONE ARRAYS

Sound source localization technique measures the sound field to determine the location of sound s... more Sound source localization technique measures the sound field to determine the location of sound source directly. In this experiment, we try to improve an existing localization method by improving the source sound to get better localization results.
We build up the experiment setup, debug the system so that it works properly and then we try with different kinds of sound sources to find out the best kind of sound that fits this localization method. Then we run analysis to see why the sound source suits this method or not.
We use an 8-microphone array that is connected to NI high speed A/D converter to collect the signals and digitize the signals so that we can collect and process data easily with computers. We use Matlab to collect the signals and process the signals for both location estimation and for analysis.

IMPLEMENTATION OF COLLABORATIVE FILTERING TECHNIQUES FOR MOVIE RECOMMENDATION

by Jinzhen Wang and Chris Markson

A recommendation system is a kind of information filtering system that identifies recommendations... more A recommendation system is a kind of information filtering system that identifies recommendations for individual users based on their purchase history, ratings or other types of interactions.
There are two types of techniques mainly used:
1) Content-based filtering: Constructs recommendations on the basis of a user's behavior (for example, historical browsing information).
2) Collaborative filtering: Constructs recommendations by searching a large group of people and finding a smaller set with similar taste to the user we are generating recommendation to.

Undergraduate Senior Design Thesis

Feature Extraction is a concept in the field of Computer Vision and Image Processing. It means to... more Feature Extraction is a concept in the field of Computer Vision and Image Processing. It means to extract the digital information of digital images with digital computer, decide whether every point in the image meets a specific image feature, usually a digital feature. Feature Extraction operation is a first level computation step, for many computer graphic algorithms take Feature Extraction operation as the pre-processing step to images. A series of Feature Extraction method algorithms are developed aiming at different application purposes and different application environment. These algorithms can extract different kinds of features with various computation complexity and repeatability. With the characteristics of Feature Extraction, this paper aims at and studies the listed works:
Bring out and implement a unique Feature Extraction Method basing on foreground and feature segmentation (FEM). FEM aims at the real data from the experimental platform of bats flight dynamics. Thus the FEM is targeted, faster and more accurate than other algorithms.
Put forward and implement a tracking method basing on optic flow optimization. Then compare this method with Surf from Open CV and Kalman Filtering algorithm. This method is more accurate than Surf and faster than KF algorithm with low accuracy decline.
Put two parts together and design a Matlab program with which the data from the experimental platform of bats flight dynamics can be processed automatically with less time and manpower required.Finally summarized this work, pointed out its weakness and the outlook.

Analyzing the Impact of Lossy Data Reduction on Volume Rendering of Cosmology Data

2022 IEEE/ACM 8th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD)

Locality-based transfer learning on compression autoencoder for efficient scientific data lossy compression

Journal of Network and Computer Applications

Reducing the Training Overhead of the HPC Compression Autoencoder via Dataset Proportioning

2021 IEEE International Conference on Networking, Architecture and Storage (NAS)

As the storage overhead of high-performance computing (HPC) data reaches into the petabyte or eve... more As the storage overhead of high-performance computing (HPC) data reaches into the petabyte or even exabyte scale, it could be useful to find new methods of compressing such data. The compression autoencoder (CAE) has recently been proposed to compress HPC data with a very high compression ratio. However, this machine learning-based method suffers from the major drawback of lengthy training time. In this paper, we attempt to mitigate this problem by proposing a proportioning scheme to reduce the amount of data that is used for training relative to the amount of data to be compressed. We show that this method drastically reduces the training time without, in most cases, significantly increasing the error. We further explain how this scheme can even improve the accuracy of the CAE on certain datasets. Finally, we provide some guidance on how to determine a suitable proportion of the training dataset to use in order to train the CAE for a given dataset.

Exploring Transfer Learning to Reduce Training Overhead of HPC Data in Machine Learning

2019 IEEE International Conference on Networking, Architecture and Storage (NAS), 2019

Nowadays, scientific simulations on high-performance computing (HPC) systems can generate large a... more Nowadays, scientific simulations on high-performance computing (HPC) systems can generate large amounts of data (in the scale of terabytes or petabytes) per run. When this huge amount of HPC data is processed by machine learning applications, the training overhead will be significant. Typically, the training process for a neural network can take several hours to complete, if not longer. When machine learning is applied to HPC scientific data, the training time can take several days or even weeks. Transfer learning, an optimization usually used to save training time or achieve better performance, has potential for reducing this large training overhead. In this paper, we apply transfer learning to a machine learning HPC application. We find that transfer learning can reduce training time without, in most cases, significantly increasing the error. This indicates transfer learning can be very useful for working with HPC datasets in machine learning applications.

jw447/zPerf: First release

zPerf: A Statistical Gray-box Approach to Performance Modeling and Extrapolation for Scientific L... more

Robust and scalable deep learning for X-ray synchrotron image analysis

2017 New York Scientific Data Summit (NYSDS), 2017

X-ray scattering is a key technique in modern synchrotron facilities towards material analysis an... more X-ray scattering is a key technique in modern synchrotron facilities towards material analysis and discovery via structural characterization at the molecular scale and nano-scale. Image classification and tagging play a crucial role in recognizing patterns, inferring meaningful physical properties from sample, and guiding subsequent experiment steps. We designed deeplearning based image classification pipelines and gained significant improvements in terms of accuracy and speed. Constrained by available computing resources and optimization library, we need to make trade-off among computation efficiency, input image size and volume, and the flexibility and stability of processing images with different levels of qualities and artifacts. Consequently, our deep learning framework requires careful data preprocessing techniques to down-sample images and extract true image signals. However, X-ray scattering images contain different levels of noise, numerous gaps, rotations, and defects arising from detector limitations, sample (mis)alignment, and experimental configuration. Traditional methods of healing x-ray scattering images make strong assumptions about these artifacts and require hand-crafted procedures and experiment meta-data to de-noise, interpolate measured data to eliminate gaps, and rotate and translate images to align the center of samples with the center of images. These manual procedures are error-prone, experience-driven, and isolated from the intended image prediction, and consequently not scalable to the data rate of X-ray images from modern detectors. We aim to explore deeplearning based image classification techniques that are robust and capable of leverage high-definition experimental images with rich variations even in a production environment that is not defect-free, and ultimately automate labor-intensive data preprocessing tasks and integrate them seamlessly into our TensorFlow based experimental data analysis framework.

Identifying Latent Reduced Models to Precondition Lossy Compression

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2019

With the high volume and velocity of scientific data produced on high-performance computing syste... more With the high volume and velocity of scientific data produced on high-performance computing systems, it has become increasingly critical to improve the compression performance. Leveraging the general tolerance of reduced accuracy in applications, lossy compressors can achieve much higher compression ratios with a user-prescribed error bound. However, they are still far from satisfying the reduction requirements from applications. In this paper, we propose and evaluate the idea that data need to be preconditioned prior to compression, such that they can better match the design philosophies of a compressor. In particular, we aim to identify a reduced model that can be utilized to transform the original data to a more compressible form. We begin with a case study of Heat3d as a proof of concept, in which we demonstrate that a reduced model can indeed reside in the full model output, and can be utilized to improve compression ratios. We further explore more general dimension reduction techniques to extract the reduced model, including principal component analysis, singular value decomposition, and discrete wavelet transform. After preconditioning, the reduced model in conjunction with delta is stored, which results in higher compression ratios. We evaluate the reduced models on nine scientific datasets, and the results show the effectiveness of our approaches.

High-Ratio Lossy Compression: Exploring the Autoencoder to Compress Scientific Data

IEEE Transactions on Big Data, 2021

Estimating Lossy Compressibility of Scientific Data Using Deep Neural Networks

IEEE Letters of the Computer Society, 2020

Formal modeling of multi-agent systems is an active area of research. The use of precise and unam... more Formal modeling of multi-agent systems is an active area of research. The use of precise and unambiguous notation of formal methods is used to accurately describe and reason about the system under consideration at the design time. Multi-agent systems deployed in dynamic and unpredictable environment needs to have the ability of self-adaptation, making them adaptable to the failures. State of the art encourages the use of MAPE-K feedback loop for the provision of self-adaptation in any system. There is a dire need of formal vocabulary that can be used for the conceptual design of any real-time multi-agent system with selfadaptation. In this paper, we have proposed a set of predefined interfaces for the provision of self-adaptation in real-time multi-agent systems. The interfaces are based on monitor, analyze, plan, and execute phases of the MAPE-K feedback loop. We formally specify our interfaces using timed-communicating object-Z language. The complete framework is elaborated using a trivial case-study of conveyor belt system based on a real-time agent architecture. INDEX TERMS Formal methods, self-adaptation, autonomic computing, multi-agent systems, real-time systems.

Compression Ratio Modeling and Estimation across Error Bounds for Lossy Compression

IEEE Transactions on Parallel and Distributed Systems, 2019

Scientific simulations on high-performance computing (HPC) systems generate vast amounts of float... more Scientific simulations on high-performance computing (HPC) systems generate vast amounts of floating-point data that need to be reduced in order to lower the storage and I/O cost. Lossy compressors trade data accuracy for reduction performance and have been demonstrated to be effective in reducing data volume. However, a key hurdle to wide adoption of lossy compressors is that the trade-off between data accuracy and compression performance, particularly the compression ratio, is not well understood. Consequently, domain scientists often need to exhaust many possible error bounds before they can figure out an appropriate setup. The current practice of using lossy compressors to reduce data volume is, therefore, through trial and error, which is not efficient for large datasets which take a tremendous amount of computational resources to compress. This paper aims to analyze and estimate the compression performance of lossy compressors on HPC datasets. In particular, we predict the compression ratios of two modern lossy compressors that achieve superior performance, SZ and ZFP, on HPC scientific datasets at various error bounds, based upon the compressors' intrinsic metrics collected under a given base error bound. We evaluate the estimation scheme using twenty real HPC datasets and the results confirm the effectiveness of our approach.

DuoModel: Leveraging Reduced Model for Data Reduction and Re-Computation on HPC Storage

IEEE Letters of the Computer Society, 2018

High-performance computing (HPC) applications generate large amounts of floating-point data that ... more High-performance computing (HPC) applications generate large amounts of floating-point data that need to be stored and analyzed efficiently to extract the insights and advance knowledge discovery. With the growing disparities between compute and I/O, optimizing the storage stack alone may not suffice to cure the I/O problem. There has been a strong push in the HPC communities to perform data reduction before data is transmitted to storage in order to lower the I/O cost. However, as of now, neither lossless nor lossy compressors can achieve the adequate reduction ratio that is desired by applications. This paper proposes DuoModel, a new approach that leverages the similarity between the full and reduced application models, and further improve the data reduction ratio. DouModel further improves the compression ratio of state-ofthe-art compressors via compressing the differences (termed as delta) between the data products of the two models. For data analytics, the high fidelity data can be re-computed by launching the reduced model and applying the compressed delta. Our evaluations confirm that DuoModel can further push the limit of data reduction while the high fidelity of data is maintained.

SIRIUS: Enabling Progressive Data Exploration for Extreme-Scale Scientific Data

IEEE Transactions on Multi-Scale Computing Systems, 2018

Scientific simulations on high performance computing (HPC) platforms generate large quantities of... more Scientific simulations on high performance computing (HPC) platforms generate large quantities of data. To bridge the widening gap between compute and I/O, and enable data to be more efficiently stored and analyzed, simulation outputs need to be refactored, reduced, and appropriately mapped to storage tiers. However, a systematic solution to support these steps has been lacking in the current HPC software ecosystem. To that end, this paper develops SIRIUS, a progressive JPEG-like data management scheme for storing and analyzing big scientific data. It co-designs data decimation, compression, and data storage, taking the hardware characteristics of each storage tier into considerations. With reasonably low overhead, our approach refactors simulation data, using either topological or uniform decimation, into a much smaller, reduced-accuracy base dataset, and a series of deltas that is used to augment the accuracy if needed. The base dataset and deltas are compressed and written to multiple storage tiers. Data saved on different tiers can then be selectively retrieved to restore the level of accuracy that satisfies data analytics. Thus, SIRIUS provides a paradigm shift towards elastic data analytics and enables end users to make trade-offs between analysis speed and accuracy on-the-fly. This paper further develops algorithms to preserve statistics for data decimation, a common requirement for reducing data. We assess the impact of SIRIUS on unstructured triangular meshes, a pervasive data model used in scientific simulations. In particular, we evaluate two realistic use cases: the blob detection in fusion and highpressure area extraction in computational fluid dynamics. Diffraction Neuromuscular Hafnium compounds Adhesive strength Bit rate Antibacterial activity. Human-robot interaction Induction heating Long Term Evolution Wet etching X-rays Reluctance machines Data communication Satellite ground stations Task analysis Learning management systems Pulse shaping methods Probability distribution. Filament lamps Wavelength conversion Dynamic equilibrium Transfer molding Waste recovery. Computer performance Ribs Stray light Electromagnetic measurements MySpace Geophysics Knowledge transfer Brazing Electronics cooling Electric vehicle charging Silicon photonics Thin film inductors. Wet etching Log-periodic dipole antennas Integrated circuit testing Tunable circuits and devices Unmanned vehicles Bulk storage Fuel economy Stock markets Flip chip solder joints Breast tissue Semiconductor device measurement Epilepsy Piezooptic effects. Constellation diagram Matrix decomposition Gaze tracking Microstrip antennas Electrooptic modulators Gynecology Human voice. Stripline International Atomic Time Cryptocurrency Materials reliability Semiconductor device doping Particle collisions Switching converters.

Unbalanced Parallel I/O: An Often-Neglected Side Effect of Lossy Scientific Data Compression

2021 7th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-7), 2021

Lossy compression techniques have demonstrated promising results in significantly reducing the sc... more Lossy compression techniques have demonstrated promising results in significantly reducing the scientific data size while guaranteeing the compression error bounds. However, one important yet often neglected side effect of lossy scientific data compression is its impact on the performance of parallel I/O. Our key observation is that the compressed data size is often highly skewed across processes in lossy scientific compression. To understand this behavior, we conduct extensive experiments where we apply three lossy compressors MGARD, ZFP, and SZ, which are specifically designed and optimized for scientific data, to three real-world scientific applications Gray-Scott simulation, WarpX, and XGC. Our analysis result demonstrates that the size of the compressed data is always skewed even if the original data is evenly decomposed among processes. Such skewness widely exists in different scientific applications using different compressors as long as the information density of the data varies across processes. We then systematically study how this side effect of lossy scientific data compression impacts the performance of parallel I/O. We observe that the skewness in the sizes of the compressed data often leads to I/O imbalance, which can significantly reduce the efficiency of I/O bandwidth utilization if not properly handled. In addition, writing data concurrently to a single shared file through MPI-IO library is more sensitive to the unbalanced I/O loads. Therefore, we believe our research community should pay more attention to the unbalanced parallel I/O caused by lossy scientific data compression.