广义无冗余情节规则抽取方法研究

尤涛, 徐伟, 杨凯, 杜承烈, 钟冬

电子学报 ›› 2015, Vol. 43 ›› Issue (2) : 269-275.

PDF(1998 KB)
PDF(1998 KB)
电子学报 ›› 2015, Vol. 43 ›› Issue (2) : 269-275. DOI: 10.3969/j.issn.0372-2112.2015.02.010
学术论文

广义无冗余情节规则抽取方法研究

  • 尤涛, 徐伟, 杨凯, 杜承烈, 钟冬
作者信息 +

Research on Extracting Generalized Non-Redundant Episode Rules

  • YOU Tao, XU Wei, YANG Kai, DU Cheng-lie, ZHONG Dong
Author information +
文章历史 +

摘要

情节规则挖掘旨在发现频繁情节之间的因果关联,现有无损情节规则挖掘方法没有考虑多规则间的关联关系,故而存在大量冗余.利用演绎推导特性对情节规则间的关联关系进行建模,引入无冗余情节迹规则的概念,分析了情节迹冗余的原因,通过最大重叠项冗余性检查给出广义无冗余情节规则抽取算法;证明了广义无冗余情节规则对情节规则的等价表达能力.理论分析和实验评估表明该算法在处理效率基本不变的前提下,提高了情节规则的生成质量.

Abstract

Aiming at the problem that current nondestructive episode rule mining algorithms don't consider the relationship between episode rules and generate redundancy,we model the relationship among the episode rules by using deduction characteristic,and introduce the concept of non-redundant episode trace rules.We also analyze reasons for episode trace redundancy,and present the generalized non-redundant episode rules mining algorithm based on the redundant checking on maximum overlap items.Then we prove that generalized non-redundant episode rules keep the equivalent expression ability to episode rules.Theoretical analysis and experiments demonstrate this algorithm improved the quality of generatedepisode rules with almost the same efficiency.

关键词

事件序列 / 演绎 / 情节迹 / 最大重叠项 / 情节规则

Key words

event sequence / deduction / episode trace / maximum overlap items / episode rule

引用本文

导出引用
尤涛, 徐伟, 杨凯, 杜承烈, 钟冬. 广义无冗余情节规则抽取方法研究[J]. 电子学报, 2015, 43(2): 269-275. https://doi.org/10.3969/j.issn.0372-2112.2015.02.010
YOU Tao, XU Wei, YANG Kai, DU Cheng-lie, ZHONG Dong. Research on Extracting Generalized Non-Redundant Episode Rules[J]. Acta Electronica Sinica, 2015, 43(2): 269-275. https://doi.org/10.3969/j.issn.0372-2112.2015.02.010
中图分类号: TP311   

参考文献

[1] Mannila H,Toivonen H,Verkamo A I.Discovering frequent episodesin sequences extended abstract[A].1st Conference on Knowledge Discovery and Data Mining[C].Montreal:CA.1995.210-215.
[2] Koh Y S,Pears R.Efficient negative association rule mining based on chance thresholds[J].Intelligent Data Analysis,2014,18(2):243-260.
[3] Lam H T,Calders T,Yang J,et al.Zips:mining compressing sequential patterns in streams[A].Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics[C].New York:ACM,2013.54-62.
[4] Van Der Aalst W.Process mining:Overview and opportunities[J].ACM Transactions on Management Information Systems,2012,3(2):1-17.
[5] Laxman S,Tankasali V,White R W.Stream prediction using a generative model based on frequent episodes in event sequences[A].Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining[C].Las Vegas,NV:ACM,2008.453-461.
[6] Cho C W,Zheng Y,Chen A L P.Continuously Matching Episode Rules for Predicting Future Events over Event Streams[M].Advances in Data and Web Management.2007.884-891.
[7] Cho C W,Wu Y H,Yen S J,et al.On-line rule matching for event prediction[J].The VLDB Journal,2011,20(3):303-334.
[8] Hatonen K,Klemettinen M,Mannila H,et al.Knowledge discovery from telecommunication network alarm databases[A].Proceedings of the Twelfth IEEE International Conference[C].Trois-Rivières:IEEE,1996.115-122.
[9] Méger N,Rigotti C.Constraint-based mining of episode rules and optimal window sizes[A].Knowledge Discovery in Databases:PKDD 2004[C].Berlin Heidelberg:PKDD,2004.313-324.
[10] Lo D,Khoo S C,Li J.Mining and ranking generators of sequential patterns[A].SIAM International Conference on Data Mining[C].Atlanta Georgia:SIAM,2008.553-564.
[11] 朱辉生,汪卫,施伯乐.基于频繁闭情节及其生成子的无冗余情节规则抽取[J].计算机学报,2012,35(1):53-64. Zhu Huisheng,Wang wei,Shi Bole.Extracting non-redundant episode rules based onfrequent closed episodes and their generators[J].Journal of Computers,2012,35(1):53-64.(in Chinese)
[12] Fournier-Viger P,Tseng V S.Tns:mining top-k non-redundantsequential rules[A].Proceedings of the 28th Annual ACM Symposium on Applied Computing[C].Coimbra,Portugal:ACM,2013.164-166.
[13] Zhang S,Wu X.Fundamentals of association rules in data mining and knowledge discovery[J].Wiley Interdisciplinary Reviews:Data Mining and Knowledge Discovery,2011,1(2):97-116.
[14] Rudin C,Letham B,Madigan D.Learning theory analysis for association rules and sequential event prediction[J].The Journal of Machine Learning Research,2013,14(1):3441-3492.
[15] Toon Calders,Bart Goethals.Non-derivable itemset mining[J].Data Mining and Knowledge Discovery,2007,14(1):171-206.
[16] Ao F,Du J,Yan Y,et al.An efficient algorithm for mining closed frequent itemsets in data Streams[A].Computer and Information Technology Workshops of IEEE 8th International Conference 2008[C].Aizu-Wakamatsu:IEEE,2008.37-42.
[17] Agrawal R,Srikant R.Fast algorithms for mining association rules[A].Proceedings of 20th International Conference on Very Large Data Bases[C].Santiago de Chile:ACM,1994.487-499.

基金

国家自然科学基金 (No.61303225); 航空科学基金 (No.20135553034); 中央高校基本科研业务费专项资金 (No.3102014JSJ0008)

PDF(1998 KB)

1626

Accesses

0

Citation

Detail

段落导航
相关文章

/