

College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001
Published:2013
ZHU Guan-wen, WANG Nian-bin, WANG Hong-bin. An Improved Method for Deep Web Sources Classification Based on the Theme and Form Attributes[J]. Acta Electronica Sinica, 2013, 41(2): 260-266. DOI: 10.3969/j.issn.0372-2112.2013.02.009.
The Deep Web holds a vast and rapidly growing amount of high-quality information. Because Deep Web sources are distributed, heterogeneous, and autonomous, users face a major challenge in finding the information they are interested in efficiently and quickly. Classifying Deep Web data sources by domain is the foundation for addressing this challenge. Based on statistics and analysis of more than 200 data sources from four domains (airfares, books, automobiles, and real estate), this paper makes full use of theme information and form attributes to propose a new classification method for Deep Web data sources, together with an improved similarity measure for query interfaces, and thereby classifies Deep Web data sources automatically. The paper also presents a strategy for tagging query interfaces that reduces the influence of randomly chosen initial centers. Experimental results show that the method achieves high classification accuracy.
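The abstract does not give the similarity formula or the classification procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each query interface is reduced to two term sets (theme terms and form attribute labels), that similarity is a weighted mix of Jaccard scores, and that a few hand-tagged interfaces per domain stand in for randomly chosen initial centers. The names `QueryInterface`, `interface_similarity`, `classify`, and the weight `theme_weight` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryInterface:
    """A Deep Web query form reduced to two sets of lowercase terms (assumed model)."""
    name: str
    theme_terms: frozenset      # e.g. terms from the page title or text near the form
    form_attributes: frozenset  # e.g. labels of the form's input fields


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def interface_similarity(x, y, theme_weight=0.5):
    """Weighted mix of theme and form-attribute similarity (the weight is an assumption)."""
    theme = jaccard(x.theme_terms, y.theme_terms)
    form = jaccard(x.form_attributes, y.form_attributes)
    return theme_weight * theme + (1.0 - theme_weight) * form


def classify(unlabeled, tagged):
    """Assign each interface to the domain of its most similar tagged seed.

    `tagged` maps a domain name to a non-empty list of hand-tagged interfaces,
    which replace random initial centers in this sketch.
    """
    result = {}
    for qi in unlabeled:
        result[qi.name] = max(
            tagged,
            key=lambda d: max(interface_similarity(qi, s) for s in tagged[d]),
        )
    return result


# Illustrative data only; these interfaces are made up for the example.
seeds = {
    "Books": [QueryInterface("seed-books",
                             frozenset({"book", "store"}),
                             frozenset({"title", "author", "isbn"}))],
    "Airfares": [QueryInterface("seed-air",
                                frozenset({"flight", "ticket"}),
                                frozenset({"from", "to", "departure date"}))],
}
unknown = [QueryInterface("site-a",
                          frozenset({"book", "search"}),
                          frozenset({"title", "author"}))]
labels = classify(unknown, seeds)
print(labels)  # {'site-a': 'Books'}
```

Assigning each interface to its nearest tagged seed is the simplest way to show how tagging removes the dependence on random starting points. A clustering variant could instead use the tagged interfaces as initial centers and then iterate.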