Li Shun (李顺): Processing missing information in big data environment
Author: Source: School of National Security (国家安全学院) Published: November 7, 2022
Abstract
Handling missing information is essential to the efficiency and robustness of database systems. In a big data environment, missing information tends to carry richer semantics, which leads to more complex computational logic and affects both operations and implementation. Existing methods either have limited semantic expressiveness or do not consider the influence of the big data environment. To solve these problems, this paper proposes a novel method for processing missing information. Drawing on practical cases from big data environments, we classify missing information into two types, unknown values and nonexistent values, and define a four-valued logic to support logical operations. The relational algebra is systematically extended to describe the data operations. We implement our approach on the dynamic table model in Muldas, a self-developed big data management system. Experimental results on real large-scale sparse data sets show that the proposed approach offers good semantic expressiveness and computational efficiency.
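To make the idea of two kinds of missing value and a four-valued logic concrete, the following is a minimal Python sketch. It is not the paper's definition: the abstract does not give the truth tables, so the precedence rules below (FALSE dominates AND, TRUE dominates OR, NONEXISTENT outranks UNKNOWN otherwise) are assumptions chosen only for illustration. The names `TV`, `Missing`, `eq`, `tv_not`, `tv_and`, `tv_or`, and `select` are all hypothetical, as is the convention that a key absent from a sparse row means a nonexistent value.

```python
from enum import Enum


class Missing(Enum):
    """Two kinds of missing value, following the abstract's classification."""
    UNKNOWN = "unknown"          # a value exists but is not known
    NONEXISTENT = "nonexistent"  # the attribute does not apply to this row


class TV(Enum):
    """Four truth values (hypothetical naming)."""
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"
    NONEXISTENT = "nonexistent"


def eq(a, b):
    """Four-valued equality. Assumption: NONEXISTENT takes priority over UNKNOWN."""
    if a is Missing.NONEXISTENT or b is Missing.NONEXISTENT:
        return TV.NONEXISTENT
    if a is Missing.UNKNOWN or b is Missing.UNKNOWN:
        return TV.UNKNOWN
    return TV.TRUE if a == b else TV.FALSE


def tv_not(a):
    """Negation swaps TRUE and FALSE and leaves the two missing values fixed."""
    if a is TV.TRUE:
        return TV.FALSE
    if a is TV.FALSE:
        return TV.TRUE
    return a


def tv_and(a, b):
    """Assumed conjunction: FALSE wins, then NONEXISTENT, then UNKNOWN, else TRUE."""
    for v in (TV.FALSE, TV.NONEXISTENT, TV.UNKNOWN):
        if v in (a, b):
            return v
    return TV.TRUE


def tv_or(a, b):
    """Assumed disjunction, dual to tv_and: TRUE wins, then NONEXISTENT, then UNKNOWN."""
    for v in (TV.TRUE, TV.NONEXISTENT, TV.UNKNOWN):
        if v in (a, b):
            return v
    return TV.FALSE


def select(rows, predicate):
    """Selection keeps only rows whose predicate evaluates to TRUE."""
    return [row for row in rows if predicate(row) is TV.TRUE]


# Sparse rows as dicts; an absent key is read as NONEXISTENT (an assumption).
rows = [
    {"name": "a", "phone": "123"},
    {"name": "b", "phone": Missing.UNKNOWN},
    {"name": "c"},
]
matches = select(rows, lambda r: eq(r.get("phone", Missing.NONEXISTENT), "123"))
```

Under these assumed tables, negation and the two connectives satisfy De Morgan's laws, and selection behaves like SQL's treatment of NULL in that only rows evaluating to TRUE are returned. The extended relational algebra described in the abstract may define these operations differently.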