Orthrus: Efficient and Timely Detection of Silent User Data Corruption in the Cloud with Resource-Adaptive Computation Validation
Published in SOSP 2025 (The 31st Symposium on Operating Systems Principles), 2025
Introduction:
Despite substantial efforts to test and validate processors, computational errors can still arise after installation. One category of CPU errors occurs silently, without crashing applications or triggering hardware warnings. These errors are hard to detect and pose a significant threat because they corrupt user data. This paper introduces Orthrus, a solution for the timely detection of silent user-data corruption caused by post-installation CPU errors.
Recommended citation: Chenxiao Liu, Zhenting Zhu, Quanxi Li, Yanwen Xia, Yifan Qiao, Xiangyun Deng, Youyou Lu, Tao Xie, Huimin Cui, Zidong Du, Harry Xu, and Chenxi Wang. 2025. Orthrus: Efficient and Timely Detection of Silent User Data Corruption in the Cloud with Resource-Adaptive Computation Validation. In Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles (SOSP '25). Association for Computing Machinery, New York, NY, USA, 286–304. https://doi.org/10.1145/3731569.3764832
