Fast Cluster-learning with Prior Probability from Big Dataset

doi:10.1109/ISCMI.2018.8703219


Residential College	false
Status	Published
Title	Fast Cluster-learning with Prior Probability from Big Dataset
Author	Tengyue Li 1; Simon Fong 1; Joao Alexandre Lobo Marques 2; Raymond K. Wong 3
	2019-05-02
Conference Name	2018 5th International Conference on Soft Computing & Machine Intelligence (ISCMI)
Source Publication	5th International Conference on Soft Computing and Machine Intelligence, ISCMI 2018
Pages	60-66
Conference Date	21-22 Nov. 2018
Conference Place	Nairobi, Kenya
Publisher	IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA
Abstract	Association rule mining by the Apriori method has been one of the popular data mining techniques for decades; it harvests knowledge in the form of item-association rules from a dataset. The quality of these rules nevertheless depends on the concentration of frequent items in the input dataset. When the dataset becomes large, the items are scattered far apart. Previous literature shows that clustering helps produce data groups in which frequent items are concentrated. Among all the clusters generated by a clustering algorithm, there must be one or more that contain suitable, frequent items. In turn, the association rules mined from such clusters would be assured of better quality, in terms of high confidence, than those mined from the whole dataset. However, it is not known in advance which cluster is the suitable one until association rule mining has been tried on all of them, and testing them by brute force is time-consuming. In this paper, a statistical property called prior probability is investigated as a means of selecting the best of the many clusters produced by a clustering algorithm, as a pre-processing step before association rule mining. Experimental results indicate a correlation between the prior probability of the best cluster and the relatively high quality of the association rules generated from that cluster. The results are significant because they make it possible to know which cluster is best used for association rule mining instead of testing them all exhaustively.
Keyword	Association Rule Mining; Clustering; Preprocessing; Prior Probability
DOI	10.1109/ISCMI.2018.8703219
Indexed By	CPCI-S
Language	English
WOS Research Area	Computer Science
WOS Subject	Computer Science, Artificial Intelligence ; Computer Science, Theory & Methods
WOS ID	WOS:000470762100011
Scopus ID	2-s2.0-85065698950
Document Type	Conference paper
Collection	DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author	Tengyue Li
Affiliation	1. Department of Computer and Information Science, University of Macau, Macao 2. School of Business, University of Saint Joseph, Macao 3. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
First Author Affiliation	University of Macau
Corresponding Author Affiliation	University of Macau
Recommended Citation GB/T 7714	Tengyue Li,Simon Fong,Joao Alexandre Lobo Marques,et al. Fast Cluster-learning with Prior Probability from Big Dataset[C]:IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA, 2019, 60-66.
APA	Tengyue Li., Simon Fong., Joao Alexandre Lobo Marques., & Raymond K. Wong (2019). Fast Cluster-learning with Prior Probability from Big Dataset. 5th International Conference on Soft Computing and Machine Intelligence, ISCMI 2018, 60-66.
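
The abstract's premise, that a rule mined from a well-chosen cluster can have higher confidence than the same rule mined from the whole dataset, can be illustrated with a minimal Python sketch. Everything here is hypothetical: the toy transactions, the cluster labels, and the function names are invented for illustration, and the labels are assigned by hand instead of by a clustering algorithm. The prior computed below is simply each cluster's share of the records, which is an assumption; this record does not state how the paper defines prior probability.

```python
from collections import Counter

def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Standard rule confidence: support(A and C) / support(A)."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return support(transactions, antecedent | consequent) / base

def cluster_priors(labels):
    """ASSUMPTION: prior of a cluster = its share of all records."""
    counts = Counter(labels)
    return {c: n / len(labels) for c, n in counts.items()}

# Hypothetical toy data: eight market-basket transactions.
transactions = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "jam"}),
    frozenset({"bread"}),
    frozenset({"milk", "jam"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "jam"}),
]
# Hypothetical cluster labels, assigned by hand for the example.
labels = [0, 0, 0, 1, 1, 1, 1, 1]

rule_from, rule_to = frozenset({"bread"}), frozenset({"butter"})

# Confidence of bread -> butter on the whole dataset and inside each cluster.
whole = confidence(transactions, rule_from, rule_to)
by_cluster = {
    c: confidence(
        [t for t, lab in zip(transactions, labels) if lab == c],
        rule_from,
        rule_to,
    )
    for c in sorted(set(labels))
}
priors = cluster_priors(labels)

print(f"whole dataset: confidence={whole:.3f}")
for c, conf in by_cluster.items():
    print(f"cluster {c}: prior={priors[c]:.3f} confidence={conf:.3f}")
```

In this toy data the rule is stronger inside cluster 0 than on the whole dataset and weaker inside cluster 1, which is why picking the right cluster matters. The sketch makes no claim about how prior probability and rule quality relate in the paper's experiments.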
