CRII: OAC: Enabling Quantities-of-Interest Error Control for Trust-Driven Lossy Compression



Project description


Scientific simulations and instruments are producing data at volumes and velocities that overwhelm network and storage systems. Although error-controlled lossy compressors have been employed to mitigate these data issues, many scientists remain reluctant to adopt them because these compressors provide no guarantee on the accuracy of downstream analysis results derived from the raw data. This project aims to fill this gap by developing a trust-driven lossy data compression infrastructure capable of strictly controlling the errors in downstream analysis, both in theory and in practice, to facilitate the use of data reduction in scientific applications. Success of this project will promote the progress of science in multiple disciplines via effective data reduction, and contribute to resolving important societal problems including electricity generation, weather forecasting, material design, and transportation. Moreover, this project will contribute to the growth and development of future generations of scientists and engineers through educational and engagement activities, including the development of new curricula and the recruitment of K-12 students.


Publications


IPDPS'25

Xuan Wu, Sheng Di, Congrong Ren, Pu Jiao, Mingze Xia, Cheng Wang, Hanqi Guo, Xin Liang*, Franck Cappello.
Enabling Efficient Error-controlled Lossy Compression for Unstructured Scientific Data.
Proceedings of the 39th IEEE International Parallel & Distributed Processing Symposium, Milan, Italy, June 3 - June 7, 2025. Nominated for the Best Paper Award. (*: Corresponding authors)

IPDPS'25

Pu Jiao, Sheng Di, Mingze Xia, Xuan Wu, Jinyang Liu, Xin Liang*, Franck Cappello.
Improving the Efficiency of Interpolation-Based Scientific Data Compressors with Adaptive Quantization Index Prediction.
Proceedings of the 39th IEEE International Parallel & Distributed Processing Symposium, Milan, Italy, June 3 - June 7, 2025. (*: Corresponding authors)

PacificVis'25

Nathan Gorski, Xin Liang, Hanqi Guo, Lin Yan, Bei Wang.
A General Framework for Augmenting Lossy Compressors with Topological Guarantees.
Proceedings of IEEE Transactions on Visualization and Computer Graphics (IEEE PacificVis 2025 Journal Track), Taipei, April 22-25, 2025.

SC'24

Xuan Wu, Qian Gong, Jieyang Chen, Qing Liu, Norbert Podhorszki, Xin Liang*, and Scott Klasky.
Error-controlled Progressive Retrieval of Scientific Data under Derivable Quantities of Interest.
Proceedings of the 36th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, Atlanta, GA, USA, Nov 17 - 22, 2024. (*: Corresponding authors)

EuroVis'24

Congrong Ren, Xin Liang, and Hanqi Guo.
A Prediction-Traversal Approach for Compressing Scientific Data on Unstructured Meshes with Bounded Error.
Proceedings of the 26th EG Conference on Visualization, Odense, Denmark, May 27 - May 31, 2024.

IPDPS'24

Zizhe Jian, Sheng Di, Jinyang Liu, Kai Zhao, Xin Liang, Haiying Xu, Robert Underwood, Shixun Wu, Jiajun Huang, Zizhong Chen, Franck Cappello.
CliZ: Optimizing Lossy Compression for Climate Datasets with Adaptive Fine-tuned Data Prediction.
Proceedings of the 38th IEEE International Parallel & Distributed Processing Symposium, San Francisco, California, May 27 - May 31, 2024.

ICDE'24

Mingze Xia, Sheng Di, Franck Cappello, Pu Jiao, Kai Zhao, Jinyang Liu, Xuan Wu, Xin Liang*, and Hanqi Guo.
Preserving Topological Feature with Sign-of-Determinant Predicates in Lossy Compression: A Case Study of Vector Field Critical Points.
Proceedings of the 40th IEEE International Conference on Data Engineering, Utrecht, Netherlands, May 13 - 16, 2024. (*: Corresponding authors)

SIGMOD'24

Jinyang Liu, Sheng Di, Kai Zhao, Xin Liang, Sian Jin, Zizhe Jian, Jiajun Huang, Shixun Wu, Zizhong Chen, Franck Cappello.
High-performance Effective Scientific Error-bounded Lossy Compression with Auto-tuned Multi-component Interpolation.
Proceedings of the 2024 ACM SIGMOD International Conference on Management of Data, Santiago, Chile, June 9 - 15, 2024.

BigData'23

Jinyang Liu, Sheng Di, Sian Jin, Kai Zhao, Xin Liang, Zizhong Chen, Franck Cappello.
Scientific Error-bounded Lossy Compression with Super-resolution Neural Networks.
Proceedings of the 2023 IEEE International Conference on Big Data, Sorrento, Italy, Dec 15 - Dec 18, 2023.

HiPC'23

Pu Jiao, Sheng Di, Jinyang Liu, Xin Liang*, and Franck Cappello.
Characterization and Detection of Artifacts for Error-controlled Lossy Compressors.
Proceedings of the 30th IEEE International Conference on High Performance Computing, Data, and Analytics, Goa, India, Dec 18 - 21, 2023. (*: Corresponding authors)

VIS'23

Lin Yan, Xin Liang, Hanqi Guo, Bei Wang.
TopoSZ: Preserving Topology in Error-Bounded Lossy Compression.
Proceedings of the 34th IEEE VIS Conference, Melbourne, Australia, Oct 22 - 27, 2023.

ICS'23

Jinyang Liu, Sheng Di, Kai Zhao, Xin Liang, Zizhong Chen, Franck Cappello.
FAZ: A flexible auto-tuned modular error-bounded compression framework for scientific data.
Proceedings of the 37th International Conference on Supercomputing, Orlando, FL, Jun 21 - 23, 2023. Best Paper Finalist.

VLDB'23

Pu Jiao, Sheng Di, Hanqi Guo, Kai Zhao, Jiannan Tian, Dingwen Tao, Xin Liang*, and Franck Cappello.
Toward Quantity-of-Interest Preserving Lossy Compression for Scientific Data.
Proceedings of the 49th International Conference on Very Large Data Bases, Vancouver, Canada, Aug 28 - Sep 1, 2023. (*: Corresponding authors)

SC'22

Jinyang Liu, Sheng Di, Kai Zhao, Xin Liang, Zizhong Chen, and Franck Cappello.
Dynamic Quality Metric Oriented Error Bounded Lossy Compression for Scientific Datasets.
Proceedings of the 34th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, Dallas, TX, USA, Nov 13 - 18, 2022.

SSDBM'22

Qian Gong, Ben Whitney, Chengzhu Zhang, Xin Liang, Anand Rangarajan, Jieyang Chen, Lipeng Wan, Paul Ullrich, Qing Liu, Robert Jacob, Sanjay Ranka, and Scott Klasky.
Region-adaptive, Error-controlled Scientific Data Compression using Multilevel Decomposition.
Proceedings of the 34th International Conference on Scientific and Statistical Database Management, Copenhagen, Denmark, July 6-8, 2022.

TVCG

Xin Liang, Sheng Di, Franck Cappello, Mukund Raj, Chunhui Liu, Kenji Ono, Zizhong Chen, Tom Peterka, and Hanqi Guo.
Toward Feature-Preserving Vector Field Compression.
IEEE Transactions on Visualization and Computer Graphics, 2022.

TBD

Xin Liang*, Kai Zhao*, Sheng Di, Sihuan Li, Robert Underwood, Ali M. Gok, Jiannan Tian, Junjing Deng, Jon C. Calhoun, Dingwen Tao, Zizhong Chen, and Franck Cappello.
SZ3: A Modular Framework for Composing Prediction-based Error-bounded Lossy Compressors (2023 Best Paper Award from IEEE Transactions on Big Data by the IEEE Computer Society Publications Board).
IEEE Transactions on Big Data, 2022. (*: Co-first authors)