Journal of Guangdong University of Technology, 2024, Vol. 41, Issue (01): 79-85. doi: 10.12052/gdutxb.220027

• Computer Science and Technology •

A Least Squares Twin Support Vector Machine Method with Uncertain Data

Liu Jin-neng1, Xiao Yan-shan1, Liu Bo2

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China;
  2. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2022-02-21 Online: 2024-01-25 Published: 2024-02-01
  • Corresponding author: Liu Bo (born 1978), male, professor, Ph.D.; main research interests: machine learning and data mining. E-mail: csboliu@163.com
  • About the author: Liu Jin-neng (born 1996), male, master's student; main research interest: support vector machines
  • Funding:
    National Natural Science Foundation of China (62076074)

Abstract: The twin support vector machine (TWSVM) learns two nonparallel hyperplanes by solving two quadratic programming problems and is used for binary classification. In practical applications, however, the data usually contain uncertain information, which makes it difficult to construct the classification model. This paper proposes an uncertain-data-based least squares twin support vector machine (ULSTSVM) method to address data uncertainty. First, a noise vector is assigned to each example to model its uncertain information. Second, the noise vectors are incorporated into the least squares TWSVM and optimized during training. Finally, a two-step iterative heuristic framework is employed to solve the resulting learning problem, alternately training the least squares TWSVM classifier and updating the noise vectors. The experiments show that the proposed ULSTSVM outperforms the baselines in training time while achieving comparable classification accuracy. In sum, ULSTSVM adopts noise vectors to model the uncertain information and transforms the quadratic programming problems of TWSVM into linear equations, such that better classification accuracy and higher training efficiency can be obtained.

Key words: least squares, twin support vector machine, nonparallel plane learning, data uncertainty, classification
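
The abstract notes that the least squares formulation replaces the two quadratic programs of TWSVM with linear equations. As a rough illustration of that step only, the following is a minimal NumPy sketch of the standard linear least squares TWSVM of Kumar and Gopal (2009). It is not the ULSTSVM method of this paper: the noise vectors and the two-step alternating update are omitted because the abstract does not specify them. The function names `fit_lstsvm` and `predict_lstsvm`, the penalty parameters `c1` and `c2`, and the small ridge term `reg` are assumptions made for this example.

```python
import numpy as np


def fit_lstsvm(A, B, c1=1.0, c2=1.0, reg=1e-8):
    """Standard linear LSTSVM (no noise vectors), solved via two linear systems.

    A: samples of class +1, shape (m1, n).
    B: samples of class -1, shape (m2, n).
    Returns ((w1, b1), (w2, b2)), the two nonparallel hyperplanes.
    """
    E = np.hstack([A, np.ones((A.shape[0], 1))])  # augmented matrix [A e]
    F = np.hstack([B, np.ones((B.shape[0], 1))])  # augmented matrix [B e]
    EtE, FtF = E.T @ E, F.T @ F
    # Assumption: a tiny ridge term keeps the systems well conditioned.
    ridge = reg * np.eye(E.shape[1])

    # Plane 1 lies close to class +1 and about unit distance from class -1:
    # [w1; b1] = -(F^T F + (1/c1) E^T E)^{-1} F^T e
    u = -np.linalg.solve(FtF + EtE / c1 + ridge, F.T @ np.ones(F.shape[0]))

    # Plane 2 lies close to class -1 and about unit distance from class +1:
    # [w2; b2] = (E^T E + (1/c2) F^T F)^{-1} E^T e
    v = np.linalg.solve(EtE + FtF / c2 + ridge, E.T @ np.ones(E.shape[0]))

    return (u[:-1], u[-1]), (v[:-1], v[-1])


def predict_lstsvm(X, plane1, plane2):
    """Assign each row of X to the class whose hyperplane is nearer."""
    (w1, b1), (w2, b2) = plane1, plane2
    d1 = np.abs(X @ w1 + b1) / np.linalg.norm(w1)
    d2 = np.abs(X @ w2 + b2) / np.linalg.norm(w2)
    return np.where(d1 <= d2, 1, -1)


if __name__ == "__main__":
    # Synthetic, well-separated data for demonstration only.
    rng = np.random.default_rng(0)
    A = rng.normal(loc=2.0, scale=0.3, size=(50, 2))
    B = rng.normal(loc=-2.0, scale=0.3, size=(50, 2))
    plane1, plane2 = fit_lstsvm(A, B)
    X = np.vstack([A, B])
    y = np.concatenate([np.ones(50), -np.ones(50)])
    print("training accuracy:", np.mean(predict_lstsvm(X, plane1, plane2) == y))
```

Each hyperplane comes from a single call to `np.linalg.solve` on an (n+1) by (n+1) system, which is where the training-time advantage over a quadratic programming solver would come from. An uncertain-data variant in the spirit of the abstract would presumably alternate between such a solve and an update of per-example noise vectors, but that update rule is not given here.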

CLC number:

  • TP391