Journal of Guangdong University of Technology, 2017, Vol. 34, Issue (03): 1-7. doi: 10.12052/gdutxb.170008

Classification Method Based on Dimension Reduction

Teng Shao-hua, Lu Dong-lue, Huo Ying-xiang, Zhang Wei   

  1. School of Computers, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2017-01-11 Online: 2017-05-09 Published: 2017-05-09

Abstract:

Data mining algorithms in the era of big data must handle massive data efficiently. Traditional classification algorithms take a long time to train a model and to classify a test dataset, and they are difficult to understand. To address these problems, this paper proposes a classification method based on dimension reduction. Projection transforms the multidimensional classification problem into a combination of multiple 2D projection planes, and a density model of the projection planes is trained for classification. Compared with Support Vector Machines (SVM), Logistic Regression (LR), K-Nearest Neighbor (KNN) and other algorithms, the proposed method trains and classifies more efficiently without loss of accuracy. The method is easy to implement, so it suits real-time applications such as intrusion detection and traffic scheduling.
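
The abstract gives no implementation details, so the following is only a minimal sketch of one way the idea could look, not the authors' algorithm. It assumes that each 2D projection is an axis-aligned pair of features, that the density model is a per-class histogram on a fixed grid, and that per-projection densities are combined by summing their logarithms. The class name `ProjectionDensityClassifier`, the `bins` parameter, and the smoothing rule are all hypothetical.

```python
import math
from itertools import combinations


class ProjectionDensityClassifier:
    """Illustrative sketch: per-class histogram densities on 2D feature-pair projections."""

    def __init__(self, bins=8):
        self.bins = bins  # grid resolution per axis (assumed, not from the paper)

    def _bin(self, value, j):
        # Map a feature value to a grid index, clipping values outside the training range.
        span = self.hi[j] - self.lo[j]
        if span == 0:
            return 0
        idx = int((value - self.lo[j]) / span * self.bins)
        return min(max(idx, 0), self.bins - 1)

    def fit(self, X, y):
        n_features = len(X[0])
        self.lo = [min(row[j] for row in X) for j in range(n_features)]
        self.hi = [max(row[j] for row in X) for j in range(n_features)]
        # Every unordered pair of features defines one 2D projection plane.
        self.pairs = list(combinations(range(n_features), 2))
        self.classes = sorted(set(y))
        self.class_counts = {c: 0 for c in self.classes}
        # counts[k][(cell, label)] = number of training points of `label`
        # that fall into grid cell `cell` of projection k.
        self.counts = [dict() for _ in self.pairs]
        for row, label in zip(X, y):
            self.class_counts[label] += 1
            for k, (i, j) in enumerate(self.pairs):
                key = ((self._bin(row[i], i), self._bin(row[j], j)), label)
                self.counts[k][key] = self.counts[k].get(key, 0) + 1
        return self

    def predict_one(self, row):
        scores = {}
        for c in self.classes:
            score = 0.0
            for k, (i, j) in enumerate(self.pairs):
                cell = (self._bin(row[i], i), self._bin(row[j], j))
                count = self.counts[k].get((cell, c), 0)
                # Add-one smoothing keeps empty cells from producing log(0).
                density = (count + 1) / (self.class_counts[c] + self.bins ** 2)
                score += math.log(density)
            scores[c] = score
        # Pick the class whose projections are densest around the query point.
        return max(scores, key=scores.get)

    def predict(self, X):
        return [self.predict_one(row) for row in X]
```

In this sketch, training is a single counting pass over the data and prediction is a fixed number of table lookups per sample, which is the kind of cost profile the abstract's efficiency claim suggests. Whether the paper uses this grid, this pairing of features, or this combination rule is not stated in the abstract.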

Key words: data mining, classification, orthogonal projection, dimension reduction

CLC Number: TP391

