An IB Algorithm Based on Data Selection Model

LOU Zheng-zheng; YANG Chen; YE Yang-dong

doi:10.3969/j.issn.0372-2112.2014.09.027


Updated: 2025-07-16

- Acta Electronica Sinica Vol. 42, Issue 9, Pages: 1839-1846(2014)
- Author affiliation:
  
  School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450052
- Funding:
  
  National Natural Science Foundation of China (No.61170223)
- DOI: 10.3969/j.issn.0372-2112.2014.09.027
- CLC: TP18
- Published: 2014

LOU Zheng-zheng, YANG Chen, YE Yang-dong. An IB Algorithm Based on Data Selection Model[J]. Acta Electronica Sinica, 2014, 42(9): 1839-1846. DOI: 10.3969/j.issn.0372-2112.2014.09.027.

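Abstract (translated from the Chinese)

Data objects differ in how clearly they exhibit a pattern, and this causes problems for data analysis with the IB (Information Bottleneck) method. To address this, we define a data selection model based on a clarity factor, so that the IB method can select from the data set the objects whose pattern characteristics are relatively clear and analyze their patterns, and we propose the DSIB (Data Selection Information Bottleneck) algorithm. DSIB uses the information loss incurred during data compression as the criterion for deciding whether an object's pattern characteristics are clear, and it optimizes the DSIB objective function with a sequential draw-and-merge strategy that selects data while learning. Experimental results show that, as the data selection standard is raised, DSIB improves the precision of data analysis while sacrificing little recall; compared with data analysis algorithms that perform no selection, DSIB better identifies the intrinsic patterns inherent in the data.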

Abstract

In the original IB (Information Bottleneck) algorithms, all the data points are employed to learn the cluster patterns. However, in many real-world applications, some data show clear coherent behavior and can be summarized well, while other data show only weak tendencies to be assigned to any particular pattern. For such situations, this paper proposes a DSIB (Data Selection Information Bottleneck) algorithm, which can select data points with clear coherent behavior and find their corresponding cluster patterns. To realize this goal, the DSIB algorithm takes as its data selection criterion the information loss generated when a data point is compressed into one of the clusters. The DSIB algorithm adopts a sequential draw-and-merge procedure to select the data and learn the cluster patterns. This learning process can take full account of each datum's natural pattern. Experimental results show that, as the data selection criterion becomes stricter, the DSIB algorithm improves clustering precision at a small cost in recall. In our evaluation, the DSIB algorithm is found to be consistently superior to all the other clustering methods we examine.
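
The abstract does not state the DSIB objective function, so the following is only a minimal sketch of how a draw-and-merge loop with a selection threshold might look, not the authors' algorithm. It assumes the merge cost of the standard sequential IB procedure, (p(x) + p(t)) times a weighted Jensen-Shannon divergence between p(y|x) and p(y|t). The names `dsib_sketch`, `max_loss` and `UNSELECTED`, the deterministic starting partition, and the zero cost for joining an empty cluster are all hypothetical choices made for illustration.

```python
import math

UNSELECTED = -1  # hypothetical label for points left out by the selection step


def kl(p, q):
    """KL divergence D(p || q) in nats; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def merge_cost(px, py_x, pt, py_t):
    """Information loss of merging point x into cluster t (standard sIB cost):
    (p(x) + p(t)) * JS divergence with weights proportional to p(x) and p(t)."""
    total = px + pt
    w_x, w_t = px / total, pt / total
    mix = [w_x * a + w_t * b for a, b in zip(py_x, py_t)]
    return total * (w_x * kl(py_x, mix) + w_t * kl(py_t, mix))


def cluster_stats(t, labels, p_x, p_y_given_x, skip):
    """Mass p(t) and conditional p(y|t) of cluster t with point `skip` drawn out."""
    members = [i for i, lab in enumerate(labels) if lab == t and i != skip]
    if not members:
        return 0.0, None
    mass = sum(p_x[i] for i in members)
    n_y = len(p_y_given_x[0])
    py_t = [sum(p_x[i] * p_y_given_x[i][y] for i in members) / mass
            for y in range(n_y)]
    return mass, py_t


def dsib_sketch(p_x, p_y_given_x, k, max_loss, max_sweeps=20):
    """Sequential draw-and-merge with a selection threshold (illustrative only).

    A point is merged into its cheapest cluster only if the information loss
    is at most `max_loss`; otherwise it is marked UNSELECTED for this sweep.
    """
    n = len(p_x)
    # Deterministic start for reproducibility; sIB normally starts at random.
    labels = [i % k for i in range(n)]
    for _ in range(max_sweeps):
        changed = False
        for x in range(n):
            costs = []
            for t in range(k):
                mass, py_t = cluster_stats(t, labels, p_x, p_y_given_x, skip=x)
                # Assumption: joining an empty cluster loses no information.
                costs.append(0.0 if py_t is None
                             else merge_cost(p_x[x], p_y_given_x[x], mass, py_t))
            best = min(range(k), key=costs.__getitem__)
            new_label = best if costs[best] <= max_loss else UNSELECTED
            if new_label != labels[x]:
                labels[x] = new_label
                changed = True
        if not changed:
            break
    return labels


# Two clear groups (A-like, B-like) plus one ambiguous point M.
A, B, M = [0.9, 0.1], [0.1, 0.9], [0.5, 0.5]
rows = [A, B, A, B, M]
p_x = [0.2] * 5
print(dsib_sketch(p_x, rows, k=2, max_loss=0.02))  # → [0, 1, 0, 1, -1]
```

In this toy input the ambiguous point costs about 0.057 nats to merge into either cluster, while each clear point costs at most about 0.017 nats, so a threshold of 0.02 keeps the four clear points and leaves the ambiguous one unselected. Setting `max_loss=math.inf` disables selection and reduces the loop to plain sequential clustering of all points.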
