Status | Published |
Cuttle: Enabling cross-column compression in distributed column stores | |
Liu, Hao1; Xiao, Jiang2; Guo, Xianjun3; Tan, Haoyu1; Luo, Qiong1; Ni, Lionel M.4 | |
2017 | |
Conference Name | 1st Asia-Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data, APWeb-WAIM 2017 |
Source Publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 10367 LNCS |
Pages | 219-226 |
Conference Date | July 7, 2017 - July 9, 2017 |
Conference Place | Beijing, China |
Author of Source | Springer Verlag |
Abstract | We observe that, in real-world distributed data warehouse systems, data columns from different sources often exhibit redundancy. Even though these systems can employ both general and column-oriented compression schemes to reduce the data storage pressure, such cross-column redundancy (CCR) is not recognized or exploited effectively. Therefore, we propose Cuttle, a column storage system that enables cross-column compression to reduce CCR. Specifically, we identify three kinds of CCR and develop a referential transformation encoding (RTE) scheme to compress multiple columns of data with CCR. Furthermore, we address the CCR selection problem and propose a greedy algorithm to generate cross-column compression schemes. Our experiments on real-world datasets show that Cuttle can further reduce data size by half after applying both the column-oriented and general compression schemes, and that the query processing performance with Cuttle is improved by 20% without any change to the application programs. © Springer International Publishing AG 2017. |
DOI | 10.1007/978-3-319-63564-4_18 |
Language | English |
WOS ID | WOS:000452448300018 |
Scopus ID | 2-s2.0-85028471149 |
Document Type | Conference paper |
Collection | University of Macau |
Affiliation | 1.Department of Computer Science and Engineering, HKUST, Kowloon, Hong Kong; 2.Huazhong University of Science and Technology, Wuhan, China; 3.Deepera Inc., Ocean Coast City, Shenzhen, China; 4.University of Macau, Zhuhai, China |
Recommended Citation GB/T 7714 | Liu, Hao,Xiao, Jiang,Guo, Xianjun,et al. Cuttle: Enabling cross-column compression in distributed column stores[C]. Springer Verlag, 2017, 219-226. |
APA | Liu, Hao., Xiao, Jiang., Guo, Xianjun., Tan, Haoyu., Luo, Qiong., & Ni, Lionel M. (2017). Cuttle: Enabling cross-column compression in distributed column stores. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10367 LNCS, 219-226. |
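The abstract names cross-column redundancy (CCR) and a referential transformation encoding (RTE) scheme but does not describe their mechanics. As a rough illustration of the general idea only, the sketch below assumes one simple kind of redundancy: a target column that can mostly be derived from a reference column by a fixed transformation, so only the rows that deviate need to be stored. The function names, the `plus_two` transformation, and the sample data are all hypothetical and are not taken from the paper; this should not be read as Cuttle's actual RTE scheme.

```python
from typing import Callable, Dict, List


def encode_referential(
    reference: List[int],
    target: List[int],
    transform: Callable[[int], int],
) -> Dict[int, int]:
    """Keep only the rows where target differs from transform(reference).

    Hypothetical helper for illustration; not the paper's RTE definition.
    """
    if len(reference) != len(target):
        raise ValueError("columns must have the same number of rows")
    return {
        row: value
        for row, (ref, value) in enumerate(zip(reference, target))
        if transform(ref) != value
    }


def decode_referential(
    reference: List[int],
    exceptions: Dict[int, int],
    transform: Callable[[int], int],
) -> List[int]:
    """Rebuild the target column from the reference column plus exceptions."""
    return [exceptions.get(row, transform(ref)) for row, ref in enumerate(reference)]


def plus_two(day: int) -> int:
    """Assumed relationship: the target is usually the reference plus 2."""
    return day + 2


# Made-up sample columns: ship_day is order_day + 2 except in row 2.
order_day = [100, 101, 101, 103, 107]
ship_day = [102, 103, 104, 105, 109]

exceptions = encode_referential(order_day, ship_day, plus_two)
restored = decode_referential(order_day, exceptions, plus_two)
```

In this toy case the five-value target column shrinks to a single stored exception, and decoding reproduces the original column exactly. Which column pairs are worth encoding this way is the kind of selection question the abstract says the greedy algorithm addresses.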