Journal of Guangdong University of Technology, 2021, Vol. 38, Issue (03): 17-21. doi: 10.12052/gdutxb.200124

A Small Sample Data Prediction Method Based on Global Data Shuffling

Lai Jun, Liu Zhen-yu, Liu Sheng-hai   

  1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2020-09-22; Online: 2021-05-10; Published: 2021-03-12

Abstract: Using the Guangzhou license plate auction price data set, this paper studies prediction on a small sample data set with linear regression combined with k-fold cross-validation. Locally specific data in a small sample set can inflate the validation error; to address this, a strategy of shuffling the data globally before validation is proposed. Experiments show that this strategy significantly reduces the validation error. Building on this, multiple sets of experiments are used to determine appropriate parameters, and the results show that the overall average accuracy of the final predictions reaches 95%.
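
The strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (a straight line whose last quarter is offset, standing in for locally specific samples), the model is a one-feature least squares fit solved in closed form rather than by stochastic gradient descent, and the names `fit_line` and `k_fold_mse` are hypothetical. The only point shown is where the global shuffle sits relative to the fold split.

```python
import random


def fit_line(xs, ys):
    """Closed-form least squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def k_fold_mse(xs, ys, k, shuffle, seed=0):
    """Mean validation MSE over k contiguous folds.

    With shuffle=True the sample order is permuted once, over the whole
    data set, before the folds are cut.
    """
    order = list(range(len(xs)))
    if shuffle:
        random.Random(seed).shuffle(order)
    n = len(order)
    fold_mses = []
    for fold in range(k):
        start, stop = fold * n // k, (fold + 1) * n // k
        val_idx = order[start:stop]
        train_idx = order[:start] + order[stop:]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        sq_errors = [(ys[i] - (a * xs[i] + b)) ** 2 for i in val_idx]
        fold_mses.append(sum(sq_errors) / len(sq_errors))
    return sum(fold_mses) / k


# Synthetic data (an assumption, not the auction data): a clean line whose
# last quarter is offset, imitating locally specific samples.
xs = [float(i) for i in range(40)]
ys = [2.0 * x + 1.0 + (30.0 if x >= 30 else 0.0) for x in xs]

err_plain = k_fold_mse(xs, ys, k=4, shuffle=False)
err_shuffled = k_fold_mse(xs, ys, k=4, shuffle=True)
print(f"validation MSE without shuffling: {err_plain:.1f}")
print(f"validation MSE with global shuffling: {err_shuffled:.1f}")
```

Without shuffling, one fold consists entirely of the offset samples and is validated by a model that never saw them. With the global shuffle, those samples are spread across folds, so every training split contains some of them.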

Key words: linear regression, k-fold cross-validation, stochastic gradient descent, data shuffling, deep learning

CLC Number: TP183