Search Result Clustering Based on Centroid Optimization by Ontology Extraction

CHEN Yi-heng; QIN Bing; SONG Fan; LIU Ting; LI Sheng

Updated: 2025-07-16

- Acta Electronica Sinica, Vol. 36, Issue S1, Pages 166-170, 156 (2008)
- Author affiliation: Information Retrieval Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001
- CLC: TP391.2
- Published: 2008
CHEN Yi-heng, QIN Bing, SONG Fan, et al. Search Result Clustering Based on Centroid Optimization by Ontology Extraction[J]. Acta Electronica Sinica, 2008, 36(S1): 166-170,156.

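Abstract (Chinese version, translated)

As the volume of data on the Internet keeps growing, it is increasingly difficult for search engines to return accurate results. To improve the structured clustering of search engine results and make information faster to locate, this paper adopts a label-based clustering algorithm. Using dependency parsing and dictionary resources from natural language processing to mine semantic structure in depth, it proposes a K-means clustering method based on optimized initial selection. The paper analyzes the characteristics of the K-means algorithm in detail and improves the algorithm with cluster-label techniques. Experiments show that the algorithm not only outperforms general clustering algorithms in effectiveness but also helps considerably in describing the results, and that its efficiency is greatly improved as well.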

Abstract

Along with the constant development of the Internet and the ever-increasing amount of data, the role of search engines has become increasingly evident. More users rely on search engines to find the information they need. To cluster search results more effectively, and thus make information easier to locate among the original unstructured results, this paper introduces a new label-based clustering algorithm. The key idea is to use dictionary resources and dependency parsing from NLP to extract the ontologies related to the query. These extracted ontologies then guide the choice of centroids in K-means clustering. Furthermore, the features of the K-means algorithm are investigated in detail, and an improvement based on cluster labels is proposed. Experiments show that the algorithm not only yields more effective clustering results but also provides more informative descriptions of the results; meanwhile, efficiency is also largely improved.
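
The idea of letting extracted labels guide the initial centroids can be illustrated with a small sketch. This is not the authors' implementation: the seed labels are simply passed in as hypothetical stand-ins for the query-related ontology terms (the extraction step with dependency parsing and a dictionary is not shown), and the bag-of-words vectors, cosine similarity, and the function name `label_seeded_kmeans` are all assumptions made for illustration.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts for one snippet (lowercased, whitespace split)."""
    return Counter(text.lower().split())

def mean(vectors):
    """Component-wise mean of a non-empty list of sparse term vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: count / len(vectors) for term, count in total.items()}

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def label_seeded_kmeans(snippets, seed_labels, max_iter=10):
    """K-means over snippets with one initial centroid per seed label.

    seed_labels are assumed to be lowercase single tokens; in the paper they
    would come from ontology extraction, which is not reproduced here.
    """
    docs = [vectorize(s) for s in snippets]
    centroids = []
    for label in seed_labels:
        # Seed the centroid from the snippets that mention the label;
        # fall back to a one-term vector if no snippet mentions it.
        members = [d for d in docs if label in d]
        centroids.append(mean(members) if members else {label: 1.0})

    assignment = None
    for _ in range(max_iter):
        # Assign each snippet to its most similar centroid.
        new_assignment = [
            max(range(len(centroids)), key=lambda k: cosine(d, centroids[k]))
            for d in docs
        ]
        if new_assignment == assignment:
            break  # converged: no snippet changed cluster
        assignment = new_assignment
        # Recompute each centroid from its current members.
        for k in range(len(centroids)):
            members = [d for d, a in zip(docs, assignment) if a == k]
            if members:
                centroids[k] = mean(members)
    return assignment

snippets = [
    "jaguar car dealer price",
    "jaguar cat habitat jungle",
    "car engine price review",
    "cat jungle wildlife",
]
print(label_seeded_kmeans(snippets, ["car", "cat"]))  # [0, 1, 0, 1]
```

Because each cluster starts from a label, the label can double as a readable description of the cluster, and a reasonable starting point typically means fewer iterations than random initialization.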
