Status: Published
FastThetaJoin: An Optimization on Multi-way Data Stream θ-join with Range Constraints
Ziyue Hu (1,2); Xiaopeng Fan (1); Yang Wang (1); Chengzhong Xu (3)
2020-09-29
Conference Name: 20th International Conference on Algorithms and Architectures for Parallel Processing
Source Publication: ICA3PP 2020: Algorithms and Architectures for Parallel Processing
Conference Date: 2020/10/02-2020/10/04
Conference Place: New York, NY
Country: USA
Abstract

In this paper, we propose FastThetaJoin, an optimization technique for the θ-join operation on multi-way data streams, an essential query used in many data analytics tasks. The θ-join on multi-way data streams is notoriously difficult because moving data between multiple operation components incurs a tremendous shuffle cost, which makes it hard to implement efficiently in a distributed environment. Like previous methods, FastThetaJoin tries to minimize the number of θ-joins, but it differs from them in how it makes partitions, deletes unnecessary data items, and performs the Cartesian product. FastThetaJoin not only minimizes the number of θ-joins effectively but also substantially improves the efficiency of its operations in a distributed environment. We implemented FastThetaJoin in the Spark Streaming framework, characterized by its efficient bucket implementation of parameterized windows. The experimental results show that, compared with existing solutions, the proposed method speeds up θ-join processing while reducing its overhead. The size of the improvement depends on the nature of the data streams: the greater the difference in the data, the more apparent the optimization effect.
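
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea behind range-based pruning for a θ-join with the predicate a < b. It is not the FastThetaJoin implementation and does not use Spark Streaming. The function names, the fixed bucket boundaries, and the use of plain Python lists as stand-ins for stream windows are all assumptions made for this example.

```python
from bisect import bisect_right
from itertools import product


def naive_less_than_join(r, s):
    """Baseline: test the predicate a < b on every pair of the Cartesian product."""
    return [(a, b) for a, b in product(r, s) if a < b]


def partition_by_range(values, boundaries):
    """Group values into range buckets delimited by the sorted `boundaries`."""
    buckets = {}
    for v in values:
        buckets.setdefault(bisect_right(boundaries, v), []).append(v)
    return buckets


def range_pruned_less_than_join(r, s, boundaries):
    """Evaluate a < b bucket by bucket, using each bucket's value range.

    Hypothetical helper for illustration only.
    Returns (matching pairs, number of per-pair predicate evaluations).
    """
    r_buckets = partition_by_range(r, boundaries)
    s_buckets = partition_by_range(s, boundaries)
    pairs, comparisons = [], 0
    for r_vals in r_buckets.values():
        r_min, r_max = min(r_vals), max(r_vals)
        for s_vals in s_buckets.values():
            s_min, s_max = min(s_vals), max(s_vals)
            if r_min >= s_max:
                # No pair from these two buckets can satisfy a < b: skip them.
                continue
            if r_max < s_min:
                # Every pair satisfies a < b: emit the Cartesian product directly.
                pairs.extend(product(r_vals, s_vals))
                continue
            # Ranges overlap: fall back to checking each pair.
            for a, b in product(r_vals, s_vals):
                comparisons += 1
                if a < b:
                    pairs.append((a, b))
    return pairs, comparisons
```

For example, with r = [1, 5, 9, 14], s = [3, 7, 12], and boundaries [5, 10], only one bucket pair has overlapping ranges, so the sketch evaluates the predicate on 2 pairs where the baseline evaluates it on all 12, and both return the same 6 matches. The gain depends on how far apart the value ranges of the buckets are.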

Keyword: Θ-join; Theta Join; Multi-way Data Streams; Data Streams; Spark Streaming
DOI: 10.1007/978-3-030-60245-1_12
Indexed By: CPCI-S
Language: English
WOS Research Area: Computer Science
WOS Subject: Computer Science, Hardware & Architecture; Computer Science, Software Engineering; Computer Science, Theory & Methods
WOS ID: WOS:000719289200012
Scopus ID: 2-s2.0-85092650298
Document Type: Conference paper
Collection: Faculty of Science and Technology
Corresponding Author: Yang Wang
Affiliation: 1. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
2. University of Chinese Academy of Sciences, Beijing, China
3. University of Macau, Macau, China
Recommended Citation
GB/T 7714: Ziyue Hu, Xiaopeng Fan, Yang Wang, et al. FastThetaJoin: An Optimization on Multi-way Data Stream θ-join with Range Constraints[C], 2020.
APA: Ziyue Hu, Xiaopeng Fan, Yang Wang, & Chengzhong Xu (2020). FastThetaJoin: An Optimization on Multi-way Data Stream θ-join with Range Constraints. ICA3PP 2020: Algorithms and Architectures for Parallel Processing.