Research on a Text Classification Method Based on an Improved TF-IDF Algorithm

Journal of Guangdong University of Technology, 2016, Vol. 33, Issue (05): 49-53. doi: 10.3969/j.issn.1007-7162.2016.05.009

A Research on Text Classification Method Based on Improved TF-IDF Algorithm

He Ke-da, Zhu Zheng-tao, Cheng Yu

School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China

Received: 2015-09-22; Online: 2016-09-10; Published: 2016-09-10

Abstract

Establishing category keywords is the key problem in text classification and must be solved first. Building on text classification with category keywords and the TF-IDF algorithm, this paper proposes an improved TF-IDF algorithm to overcome a shortcoming of the vector space model, which cannot adjust term weights well. First, a category keyword library is established, then expanded and de-duplicated. The weight of each keyword in a document is then corrected by taking the document length into account, which effectively addresses the weak class-discrimination ability of the original feature terms. Experiments with a Bayesian classifier verify the effectiveness of the algorithm and show improved text classification accuracy.

Key words: keyword extraction; feature selection; text classification; preprocessing
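
The abstract does not give the paper's exact weighting formula, so the following is only a minimal sketch of the general idea it describes: TF-IDF weights computed over a de-duplicated category keyword list, with the term frequency corrected by document length. The function name `length_normalized_tfidf`, the choice of dividing the raw count by the document length, and the `log(N / df)` form of IDF are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def length_normalized_tfidf(docs, keywords):
    """Return one {keyword: weight} dict per tokenized document.

    Assumptions (not taken from the paper):
      - tf  = raw keyword count / number of tokens in the document
      - idf = log(N / df), and 0 when the keyword occurs in no document
    """
    # De-duplicate the keyword list while preserving its order.
    keywords = list(dict.fromkeys(keywords))
    n_docs = len(docs)
    # Document frequency: number of documents containing each keyword.
    df = {k: sum(1 for d in docs if k in d) for k in keywords}

    weights = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc) or 1  # guard against empty documents
        row = {}
        for k in keywords:
            tf = counts[k] / length
            idf = math.log(n_docs / df[k]) if df[k] else 0.0
            row[k] = tf * idf
        weights.append(row)
    return weights


# Hypothetical toy corpus: each document is already tokenized.
docs = [
    ["goal", "match", "goal", "team"],
    ["stock", "market", "team"],
]
keywords = ["goal", "stock", "team", "goal"]  # note the duplicate entry
weights = length_normalized_tfidf(docs, keywords)
```

Here a keyword that appears in every document ("team") receives zero weight, while a keyword concentrated in one document keeps a positive weight scaled by that document's length. Vectors of this kind could then be passed to a Bayesian classifier, as the abstract indicates, though the details of that step are not given there.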

HE Ke-da, ZHU Zheng-tao, CHENG Yu. A Research on Text Classification Method Based on Improved TF-IDF Algorithm[J]. Journal of Guangdong University of Technology, 2016, 33(05): 49-53.
