Acta Electronica Sinica ›› 2016, Vol. 44 ›› Issue (12): 2908-2915. DOI: 10.3969/j.issn.0372-2112.2016.12.014

• Academic Papers •

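Correlated Feature Extraction via Kernel Canonical Correlation Analysis and Domain Adaptation Learning with Kernel Logistic Regression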

刘建伟1, 孙正康1, 刘泽宇2, 罗雄麟1   

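  1. Department of Automation, China University of Petroleum (Beijing), Beijing 102249;
  2. National Engineering Research Center for Fundamental Software, Institute of Software, Chinese Academy of Sciences, Beijing 100190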
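  • Received: 2015-05-25 Revised: 2015-11-18 Publication date: 2016-12-25 Release date: 2016-12-25
  • Corresponding author: LIU Jian-wei (刘建伟)
  • About the author: SUN Zheng-kang (孙正康), male, born in 1990, holds a master's degree. Master's student at the College of Geophysics and Information Engineering, China University of Petroleum (Beijing); research interest: machine learning. E-mail: sunzhengkang@126.com
  • Funding:

    National Basic Research Program of China (973 Program) (No. 2012CB720500)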

Domain Adaptation Learning with Kernel Logistic Regression and Kernel Canonical Correlation Analysis

LIU Jian-wei1, SUN Zheng-kang1, LIU Ze-yu2, LUO Xiong-lin1   

  1. Department of Automation, China University of Petroleum, Beijing 102249, China;
  2. National Engineering Research Center for Fundamental Software, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2015-05-25 Revised: 2015-11-18 Online: 2016-12-25 Published: 2016-12-25

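Abstract (translated from the Chinese version):

This paper proposes an algorithm that uses kernel canonical correlation analysis to extract the maximally correlated features of the source and target domains and then performs domain adaptation learning with a kernel logistic regression model. The algorithm is called KCCA-DAML (Kernel Canonical Correlation Analysis for Domain Adaptation Learning). Based on correlation analysis of the feature sets, it effectively reduces the discrepancy between the probability distributions of the source and target domains, and it feeds the extracted maximally correlated features into a kernel logistic regression model to achieve cross-domain learning from the source domain to the target domain. The experiments compare a kernel logistic regression model trained on source-domain data, one trained on target-domain data, one trained on both source- and target-domain data, and the KCCA-DAML model; the results show that KCCA-DAML successfully achieves cross-domain learning on real-world data sets.

Key words (translated from the Chinese version): domain adaptation, probability distribution discrepancy, correlation analysis, kernel logistic regression, regularization model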

Abstract:

A domain adaptation learning algorithm based on a kernel logistic regression model is proposed. The approach uses kernel canonical correlation analysis to extract the maximally correlated features of the source and target domains. We call it KCCA-DAML (Kernel Canonical Correlation Analysis for Domain Adaptation Learning). The algorithm is based on canonical correlation analysis: it simultaneously minimizes the incompatibility among source features, target features, and instance labels, extracts the maximally correlated features from them, and then performs domain adaptation learning with kernel logistic regression. In the experiments, kernel logistic regression models trained on source-domain data, on target-domain data, and on both source- and target-domain data are compared with the KCCA-DAML model, and the effectiveness of the technique is demonstrated on the following real-world data sets: Reuters 20 Newsgroups, MNIST handwritten digits, and UCI Dermatology.
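
The two building blocks named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the KCCA-DAML objective, so the sketch assumes a standard regularized KCCA eigenproblem, an RBF kernel, paired source/target samples, and a plain gradient-descent kernel logistic regression. All function names, hyperparameters (`gamma`, `reg`, `lam`), and the synthetic data are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # k(a, b) = exp(-gamma * ||a - b||^2) for every pair of rows of A and B
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def center(K):
    # Center the Gram matrix in feature space: H K H with H = I - 11^T / n
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca(Kx, Ky, reg=0.1, n_components=2):
    # Assumed formulation: (Kx + reg I)^-1 Ky (Ky + reg I)^-1 Kx alpha = rho^2 alpha
    I = np.eye(Kx.shape[0])
    M = np.linalg.solve(Kx + reg * I, Ky) @ np.linalg.solve(Ky + reg * I, Kx)
    vals, vecs = np.linalg.eig(M)
    order = np.argsort(-vals.real)[:n_components]
    rho = np.sqrt(np.clip(vals.real[order], 0.0, None))
    alpha = vecs.real[:, order]
    # Matching target-side coefficients follow from the source-side ones
    beta = np.linalg.solve(Ky + reg * I, Kx @ alpha) / np.maximum(rho, 1e-12)
    return alpha, beta, rho

def standardize(Z):
    # Zero mean, unit variance per column
    return (Z - Z.mean(axis=0)) / (Z.std(axis=0) + 1e-12)

def klr_loss(K, y, a, lam):
    # Mean logistic loss plus RKHS-norm penalty (lam / 2) * a^T K a
    return np.mean(np.logaddexp(0.0, -y * (K @ a))) + 0.5 * lam * (a @ K @ a)

def fit_klr(K, y, lam=0.01, n_iter=200):
    # Labels y are in {-1, +1}; gradient descent with step 1 / (Lipschitz bound)
    n = K.shape[0]
    s = np.linalg.norm(K, 2)
    step = 1.0 / (s**2 / (4.0 * n) + lam * s)
    a = np.zeros(n)
    for _ in range(n_iter):
        margin = y * (K @ a)
        grad = -K @ (y / (1.0 + np.exp(margin))) / n + lam * (K @ a)
        a -= step * grad
    return a

def demo(n=40, seed=0, lam=0.01):
    # Synthetic paired "source" and "target" views driven by shared latent factors
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, 2))
    Xs = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(n, 4))
    Xt = np.tanh(latent @ rng.normal(size=(2, 3))) + 0.1 * rng.normal(size=(n, 3))
    y = np.where(latent[:, 0] > 0, 1.0, -1.0)  # labels used on the source side only
    Kx, Ky = center(rbf_kernel(Xs, Xs)), center(rbf_kernel(Xt, Xt))
    alpha, beta, rho = kcca(Kx, Ky)
    Zs, Zt = standardize(Kx @ alpha), standardize(Ky @ beta)  # correlated features
    K_train = rbf_kernel(Zs, Zs)
    a = fit_klr(K_train, y, lam=lam)
    pred = np.where(rbf_kernel(Zt, Zs) @ a >= 0, 1.0, -1.0)  # target predictions
    return {"alpha": alpha, "beta": beta, "rho": rho, "Zt": Zt, "pred": pred,
            "loss_start": klr_loss(K_train, y, np.zeros(n), lam),
            "loss_end": klr_loss(K_train, y, a, lam)}
```

Calling `demo()` returns the KCCA coefficients, the estimated canonical correlations `rho`, and the classifier's predictions on the target-side features. Pairing source and target samples one-to-one is a simplification made for this sketch; the paper itself should be consulted for how KCCA-DAML couples the two domains and the labels.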

Key words: domain adaptation, distribution discrepancy, correlation analysis, kernel logistic regression, regularization model

CLC Number: