Journal of Guangdong University of Technology ›› 2017, Vol. 34 ›› Issue (03): 8-14. doi: 10.12052/gdutxb.170036

• Special Topic on Fundamental Theory and Applications of Big Data •

A Fine-grained Sentiment Analysis Algorithm for Automotive Reviews

Chen Bing-feng1, Hao Zhi-feng1,2, Cai Rui-chu1, Wen Wen1, Wang Li-juan1, Huang Hao1, Cai Xiao-feng1

  1. School of Computers, Guangdong University of Technology, Guangzhou 510006, China;
  2. School of Mathematics and Big Data, Foshan University, Foshan 528000, China
  • Received: 2017-02-20 Online: 2017-05-09 Published: 2017-05-09
  • About the author: Chen Bing-feng (born 1983), male, assistant researcher and doctoral candidate; main research interests: data mining and natural language processing. E-mail: 735180@qq.com
  • Supported by:

    National Natural Science Foundation of China (U1501254, 61472089, 61572143); Natural Science Foundation of Guangdong Province (2014A030308008); Guangdong Natural Science Funds for Distinguished Young Scholars (2014A030306004); Science and Technology Planning Project of Guangdong Province (2015B010108006); Project of the Department of Education of Guangdong Province (2015KQNCX027)

Abstract:

Sentiment analysis can mine valuable information from large volumes of automotive reviews and has considerable application value in automotive product design and brand marketing. To meet the fine-grained analysis requirements of automotive reviews, this paper proposes an entity-based fine-grained sentiment analysis method. First, the review data undergo fine-grained text preprocessing, and a linear-chain CRF model is used for sentiment entity recognition and sentiment polarity classification. Then the linear-chain CRF model is improved: a method for constructing a two-level CRF model is proposed to capture the dependency between the two tasks. Experimental results show that the two-level CRF model outperforms the linear-chain CRF model in sentiment analysis and can meet the needs of sentiment entity recognition and sentiment polarity classification for automotive reviews.

Key words: automotive reviews, sentiment analysis, sentiment lexicon, fine-grained, conditional random field
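
The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' model: the abstract does not say how the two levels are coupled, so the sketch assumes a simple cascade in which the first pass tags entities and the second pass sees those tags as features. The weights are set by hand rather than learned (a real CRF would estimate them from labeled data), the lexicons `ENTITIES`, `POSITIVE`, and `NEGATIVE` are hypothetical, and English tokens stand in for segmented Chinese text. Both passes share one Viterbi decoder over a linear-chain score.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def viterbi(
    obs: Sequence,
    labels: Sequence[str],
    emit: Callable[[object, str], float],
    trans: Dict[Tuple[str, str], float],
) -> List[str]:
    """Best label sequence under a linear-chain score:
    sum_t emit(x_t, y_t) + sum_t trans[(y_{t-1}, y_t)] (missing pairs score 0).
    """
    if not obs:
        return []
    # best[t][y]: best score of any path that ends with label y at position t
    best = [{y: emit(obs[0], y) for y in labels}]
    back: List[Dict[str, str]] = []
    for x in obs[1:]:
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + trans.get((p, y), 0.0))
            scores[y] = best[-1][prev] + trans.get((prev, y), 0.0) + emit(x, y)
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    y = max(labels, key=lambda l: best[-1][l])
    path = [y]
    for pointers in reversed(back):
        y = pointers[y]
        path.append(y)
    return path[::-1]


# Hypothetical toy lexicons and hand-set weights (assumptions, not from the paper).
ENTITIES = {"engine", "seat"}
POSITIVE = {"powerful", "quiet"}
NEGATIVE = {"hard", "noisy"}
ENTITY_LABELS = ["B", "I", "O"]
POLARITY_LABELS = ["POS", "NEG", "NEU"]
ENTITY_TRANS = {("O", "I"): -5.0}  # an inside tag should not follow an outside tag
POLARITY_TRANS = {(y, y): 0.5 for y in POLARITY_LABELS}  # polarity tends to persist


def emit_entity(tok: str, y: str) -> float:
    """Level 1: score an entity tag for a token."""
    if tok in ENTITIES:
        return 2.0 if y == "B" else 0.0
    return 1.0 if y == "O" else 0.0


def emit_polarity(x: Tuple[str, str], y: str) -> float:
    """Level 2: score a polarity tag for a (token, entity tag) pair."""
    tok, ent = x
    if tok in POSITIVE and y == "POS":
        return 2.0
    if tok in NEGATIVE and y == "NEG":
        return 2.0
    if ent != "O" and y == "NEU":
        return -1.0  # level-1 feature: tagged entities usually carry a polarity
    return 0.0


def analyze(tokens: List[str]) -> List[Tuple[str, str]]:
    """Cascade: tag entities first, then decode polarity given those tags."""
    ents = viterbi(tokens, ENTITY_LABELS, emit_entity, ENTITY_TRANS)
    pols = viterbi(list(zip(tokens, ents)), POLARITY_LABELS,
                   emit_polarity, POLARITY_TRANS)
    return [(t, p) for t, e, p in zip(tokens, ents, pols) if e != "O"]
```

With these toy weights, `analyze(["engine", "is", "powerful"])` returns `[("engine", "POS")]`: the opinion word sets the polarity, and the transition bonus carries it back to the entity token during decoding. A jointly trained two-level model could let the two tasks inform each other in both directions, which this one-way cascade does not attempt.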

CLC Number:

  • TP181

