Status | Published |
Improvised Methods for Tackling Big Data Stream Mining Challenges: Case Study of Human Activity Recognition | |
Simon Fong1; Kexing Liu1; Kyungeun Cho2; Raymond Wong3; Sabah Mohammed4; Jinan Fiaidhi4 | |
2016-02-16 | |
Source Publication | JOURNAL OF SUPERCOMPUTING |
ISSN | 0920-8542 |
Volume | 72, Issue: 10, Pages: 3927-3959 |
Abstract | Big data streams are a much-hyped topic, but they pose a practical computational challenge rooted in the data streams that are now prevalent in applications. It is well known that data streams originating from monitoring sensors accumulate continuously to very large volumes, making traditional batch-based model induction algorithms infeasible for real-time data mining or just-in-time data analytics. In this position paper, following a new data stream mining methodology, namely stream-based holistic analytics and reasoning in parallel (SHARP), a list of data analytic challenges as well as improvised methods are examined. In particular, two types of decision tree algorithms, batch-mode and incremental-mode, are tested on sensor data that represents a typical big data stream. We investigate whether, and to what extent, two improvised methods (outlier removal and balancing imbalanced class distributions) affect prediction performance in big data stream mining. SHARP is founded on incremental learning, which does not require all the training data to be loaded into memory. This fundamental concept needs to be supported not only by the decision tree algorithms but also by the other improvised methods, usually applied at the preprocessing stage. This paper sheds some light on this area, which is often overlooked by data analysts when it comes to big data stream mining. |
Keyword | Data Stream Mining ; Big Data ; Very Fast Decision Tree ; Resampling ; Sensor Data |
DOI | 10.1007/s11227-016-1639-5 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Computer Science ; Engineering |
WOS Subject | Computer Science, Hardware & Architecture ; Computer Science, Theory & Methods ; Engineering, Electrical & Electronic |
WOS ID | WOS:000385417400014 |
Publisher | SPRINGER, VAN GODEWIJCKSTRAAT 30, 3311 GZ DORDRECHT, NETHERLANDS |
Scopus ID | 2-s2.0-84958742980 |
Document Type | Journal article |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Corresponding Author | Simon Fong |
Affiliation | 1. Department of Computer and Information Science, University of Macau, Macau, SAR, China; 2. Department of Multimedia Engineering, Dongguk University, Seoul, Korea; 3. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia; 4. Department of Computer Science, Lakehead University, Thunder Bay, Canada |
First Author Affiliation | University of Macau |
Corresponding Author Affiliation | University of Macau |
Recommended Citation GB/T 7714 | Simon Fong, Kexing Liu, Kyungeun Cho, et al. Improvised Methods for Tackling Big Data Stream Mining Challenges: Case Study of Human Activity Recognition[J]. JOURNAL OF SUPERCOMPUTING, 2016, 72(10), 3927-3959. |
APA | Simon Fong, Kexing Liu, Kyungeun Cho, Raymond Wong, Sabah Mohammed, & Jinan Fiaidhi (2016). Improvised Methods for Tackling Big Data Stream Mining Challenges: Case Study of Human Activity Recognition. JOURNAL OF SUPERCOMPUTING, 72(10), 3927-3959. |
MLA | Simon Fong, et al. "Improvised Methods for Tackling Big Data Stream Mining Challenges: Case Study of Human Activity Recognition". JOURNAL OF SUPERCOMPUTING 72.10 (2016): 3927-3959. |