Acta Electronica Sinica (电子学报) ›› 2014, Vol. 42 ›› Issue (8): 1556-1563. DOI: 10.3969/j.issn.0372-2112.2014.08.015

• Research Article •

基于本体与模式的网络用户兴趣挖掘

苏雪阳, 左万利, 王俊华   

  1. 吉林大学计算机科学与技术学院, 吉林长春 130012;
    2. 符号计算与知识工程教育部重点实验室(吉林大学), 吉林长春 130012
  • 收稿日期:2013-06-04 修回日期:2013-08-12 出版日期:2014-08-25 发布日期:2014-08-25
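  • Corresponding author: ZUO Wan-li
  • About the authors: SU Xue-yang, male, born in February 1989, from Lianyungang, Jiangsu. Since 2011 he has been pursuing a master's degree at the College of Computer Science, Jilin University, working on Web data mining, natural language processing, and search engines. E-mail: suxueyang2011@gmail.com; WANG Jun-hua, female, born in March 1982, from Heze, Shandong. Since 2010 she has been pursuing a doctoral degree at the College of Computer Science, Jilin University, working on Web data mining, natural language processing, and search engines. E-mail: wangjunhua_1982@126.com
  • Supported by:

    National Natural Science Foundation of China (No.60973040); National Natural Science Foundation of China, Young Scientists Fund (No.61300148); Jilin Province Key Science and Technology Research Project (No.20130206051GX); Jilin Province Science and Technology Development Plan, Youth Fund (No.20130522112JH)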

Web User Interest Mining Based on Ontology and Patterns

SU Xue-yang, ZUO Wan-li, WANG Jun-hua   

  1. College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China;
    2. Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University) Ministry of Education, Changchun, Jilin 130012, China
  • Received:2013-06-04 Revised:2013-08-12 Online:2014-08-25 Published:2014-08-25

摘要:

本文探讨了用户兴趣挖掘的新方法,首先从用户搜索日志中获取访问行为元素,并借助通用本体中的概念描述网页所体现的用户个体兴趣,然后提出了一种兴趣得分计算方法,并在此基础上从用户个体兴趣序列中识别不同的兴趣模式,判断用户的短期兴趣,并利用通用本体得出用户兴趣的集合表示,最后根据短期兴趣的增量积累推算长期兴趣.整个过程避开了以往兴趣挖掘方法中通过相似度计算和文档聚类算法进行兴趣合并的问题,为兴趣发现提供了新思路.实验结果表明,本文的方法对用户兴趣的描述更具体,取得了更优化的兴趣合并结果.

关键词: 搜索引擎, 用户兴趣, 通用本体, 兴趣模式

Abstract:

This paper proposes a novel method for mining user interests. First, visiting-behavior items are extracted from a user's search engine log, and the individual interest reflected by each webpage is described with concepts from a common ontology. Then, a method for computing interest scores is proposed. Based on these scores, different interest patterns are identified in the user's individual interest sequence and used to determine the user's short-term interests, and the common ontology is used to derive a set-based representation of those interests. Finally, long-term interests are inferred from the incremental accumulation of short-term interests. The whole procedure avoids merging interests through similarity computation and document clustering, as earlier interest mining methods do, and thus offers a new approach to interest discovery. Experimental results show that the proposed method describes user interests more concretely and yields better interest merging results.

Key words: search engine, user interest, common ontology, interest pattern
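
The pipeline outlined in the abstract (score interests per concept, keep the strong ones as short-term interests, merge them through the ontology, then accumulate them into long-term interests) might be sketched roughly as follows. The abstract gives no formulas, so everything here is an assumption made for illustration: the toy `PARENT` ontology, the dwell-time-plus-clicks score, the threshold, and the decay factor are all hypothetical and are not the authors' definitions.

```python
from collections import defaultdict

# Hypothetical toy ontology (child concept -> parent concept); an assumption,
# not the common ontology used in the paper.
PARENT = {
    "basketball": "sports",
    "football": "sports",
    "python": "programming",
    "java": "programming",
}


def interest_score(dwell_seconds, clicks):
    """Placeholder score (assumed): minutes of dwell time plus click count."""
    return dwell_seconds / 60.0 + clicks


def short_term_interests(session, threshold=2.0):
    """Sum scores per concept in one session; keep concepts at or above threshold."""
    totals = defaultdict(float)
    for concept, dwell_seconds, clicks in session:
        totals[concept] += interest_score(dwell_seconds, clicks)
    return {c: s for c, s in totals.items() if s >= threshold}


def generalize(interests):
    """Merge sibling concepts under their shared ontology parent.

    This stands in for ontology-based merging in place of similarity
    computation or clustering; a lone concept is kept as-is.
    """
    by_parent = defaultdict(dict)
    for concept, score in interests.items():
        by_parent[PARENT.get(concept, concept)][concept] = score
    merged = {}
    for parent, children in by_parent.items():
        if len(children) > 1:
            merged[parent] = sum(children.values())
        else:
            merged.update(children)
    return merged


def accumulate(long_term, short_term, decay=0.5):
    """Decay existing long-term scores (assumed), then add new short-term scores."""
    updated = {c: s * decay for c, s in long_term.items()}
    for concept, score in short_term.items():
        updated[concept] = updated.get(concept, 0.0) + score
    return updated


# Each record: (ontology concept of the visited page, dwell seconds, clicks).
session = [("basketball", 120, 1), ("football", 60, 2), ("python", 30, 0)]
short = generalize(short_term_interests(session))
long_term = accumulate({"sports": 2.0, "programming": 4.0}, short)
print(short)      # short-term interest set after ontology merging
print(long_term)  # long-term interests after one incremental update
```

In this sketch, merging goes through the ontology's parent links, so no pairwise similarity between pages or concepts is ever computed, which is the property the abstract highlights.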
