Journal of Guangdong University of Technology ›› 2022, Vol. 39 ›› Issue (06): 36-43. doi: 10.12052/gdutxb.220042

• Comprehensive Studies •

Consensus Opinion Enhancement in Social Network with Multi-agent Reinforcement Learning

Xie Guang-qiang, Xu Hao-ran, Li Yang, Chen Guang-fu   

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2022-03-06  Online: 2022-11-10  Published: 2022-11-25
  • Corresponding author: Li Yang (1980-), female, professor, Ph.D.; her main research interests include multi-agent systems and differential privacy protection. E-mail: liyang@gdut.edu.cn
  • About the author: Xie Guang-qiang (1979-), male, professor, Ph.D.; his main research interests include multi-agent systems, intelligent control and differential privacy protection.
  • Supported by: the National Natural Science Foundation of China (61972102)



Abstract: To address the problem of consensus enhancement in the opinion dynamics of social networks, a consensus opinion enhancement with intelligent perception (COEIP) model based on multi-agent reinforcement learning is proposed. For the Markov decision process arising in opinion dynamics, the agents' decision-making model is first built with a bidirectional recurrent neural network, which handles perception inputs of varying length. Then, following the idea of difference rewards, an effective reward function is designed for three objectives: convergence efficiency, connectivity and communication cost. Finally, to optimize the COEIP model, a policy-gradient-based multi-agent exploration and update algorithm is designed, which lets agents, while interacting with each other, adaptively learn from the reward signal a neighborhood selection strategy that balances the three objectives. Extensive simulations verify that, in social-network opinion dynamics scenarios, COEIP effectively reconciles conflicts between agents and reduces the number of opinion clusters at steady state, thereby enhancing the consensus of the system. The model provides a new solution for improving the agreement of opinions in large-scale social networks and offers useful theoretical guidance.
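The abstract describes the agents' decision model only at a high level and the page includes no code. Purely as an illustration of the variable-length perception idea, the sketch below uses a bidirectional GRU (in PyTorch) to map however many neighbor opinions an agent currently perceives to per-neighbor keep probabilities; the class name, scalar feature choice and hyperparameters are hypothetical and are not the authors' implementation, and the policy-gradient training step is omitted.

```python
# Illustrative sketch only (not the paper's code): a bidirectional GRU policy that
# maps a variable-length sequence of perceived neighbor opinions to per-neighbor
# selection probabilities. All names and sizes are hypothetical.
import torch
import torch.nn as nn

class NeighborSelectionPolicy(nn.Module):
    def __init__(self, hidden_size: int = 32):
        super().__init__()
        # Each sequence element is one perceived neighbor's opinion difference
        # (a single scalar feature here, for simplicity).
        self.encoder = nn.GRU(input_size=1, hidden_size=hidden_size,
                              batch_first=True, bidirectional=True)
        # Per-step head: probability of keeping the corresponding neighbor.
        self.head = nn.Linear(2 * hidden_size, 1)

    def forward(self, neighbor_opinions: torch.Tensor) -> torch.Tensor:
        # neighbor_opinions: shape (1, n_neighbors, 1); n_neighbors may vary per call.
        encoded, _ = self.encoder(neighbor_opinions)   # (1, n_neighbors, 2*hidden)
        keep_prob = torch.sigmoid(self.head(encoded))  # (1, n_neighbors, 1)
        return keep_prob.squeeze(-1)

# An agent with opinion 0.4 perceives 3 neighbors in one step and 5 in the next;
# the same parameters handle both lengths without padding to a fixed size.
policy = NeighborSelectionPolicy()
for opinions in ([0.1, 0.5, 0.9], [0.2, 0.3, 0.45, 0.7, 0.95]):
    diffs = torch.tensor([[[o - 0.4] for o in opinions]])
    print(policy(diffs))
```

Because the recurrent encoder consumes a sequence, the number of perceived neighbors does not have to be fixed in advance, which is the motivation the abstract gives for using a bidirectional recurrent network.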

Key words: multi-agent systems, social network, opinion dynamics, consensus enhancement
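The reward design is likewise only sketched in the abstract. The difference-reward idea it builds on (see reference [28]) credits each agent with the change in a global score when that agent's own action is replaced by a default one. A minimal, hypothetical illustration for the three stated objectives (convergence efficiency, connectivity, communication cost) might look as follows; the weights and component scores are placeholders, not the paper's actual reward function.

```python
# Illustrative sketch only: difference reward D_i = G(z) - G(z with agent i's
# action replaced by a default "select no neighbors" action).
from typing import Dict, List

def global_score(neighborhoods: Dict[int, List[int]],
                 opinions: Dict[int, float],
                 weights=(1.0, 1.0, 0.1)) -> float:
    w_conv, w_conn, w_cost = weights
    # Convergence efficiency: negative spread of opinions (higher is better).
    spread = max(opinions.values()) - min(opinions.values())
    # Connectivity: fraction of agents that keep at least one neighbor.
    connected = sum(1 for nbrs in neighborhoods.values() if nbrs) / len(neighborhoods)
    # Communication cost: total number of links used this step.
    cost = sum(len(nbrs) for nbrs in neighborhoods.values())
    return -w_conv * spread + w_conn * connected - w_cost * cost

def difference_reward(i: int,
                      neighborhoods: Dict[int, List[int]],
                      opinions: Dict[int, float]) -> float:
    # Counterfactual: the same joint action, except agent i selects no neighbors.
    counterfactual = dict(neighborhoods)
    counterfactual[i] = []
    return global_score(neighborhoods, opinions) - global_score(counterfactual, opinions)

# Example with three agents: agent 0 keeps two neighbors, agents 1 and 2 keep one each.
nbrs = {0: [1, 2], 1: [0], 2: [0]}
ops = {0: 0.2, 1: 0.5, 2: 0.8}
print(difference_reward(0, nbrs, ops))
```

Such a per-agent signal isolates each agent's own contribution to the shared objective, which is what makes it usable as the reward in the policy-gradient update the abstract describes.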

CLC number: TP391

References:
[1] DONG Y C, ZHA Q B, ZHANG H J, et al. Consensus reaching in social network group decision making: research paradigms and challenges [J]. Knowledge-Based Systems, 2018, 162: 3-13.
[2] ZHANG Z, GAO Y, LI Z L. Consensus reaching for social network group decision making by considering leadership and bounded confidence [J]. Knowledge-Based Systems, 2020, 204: 106240.
[3] SCOTT J, CARRINGTON P J. The SAGE handbook of social network analysis[M]. California: SAGE Publications, 2011.
[4] LI Y H, KOU G, LI G X, et al. Multi-attribute group decision making with opinion dynamics based on social trust network [J]. Information Fusion, 2021, 75: 102-115.
[5] LI T Y, ZHU H M. Effect of the media on the opinion dynamics in online social networks [J]. Physica A: Statistical Mechanics and its Applications, 2020, 551: 124117.
[6] JIAO Y R, LI Y L. An active opinion dynamics model: the gap between the voting result and group opinion [J]. Information Fusion, 2021, 65: 128-146.
[7] DOUVEN I, HEGSELMANN R. Mis- and disinformation in a bounded confidence model [J]. Artificial Intelligence, 2021, 291: 103415.
[8] BISWAS K, BISWAS S, SEN P. Block size dependence of coarse graining in discrete opinion dynamics model: application to the US presidential elections [J]. Physica A: Statistical Mechanics and its Applications, 2021, 566: 125639.
[9] ZHU L X, HE Y L, ZHOU D Y. Neural opinion dynamics model for the prediction of user-level stance dynamics [J]. Information Processing & Management, 2020, 57(2): 102031.
[10] BRAVO-MARQUEZ F, GAYO-AVELLO D, MENDOZA M, et al. Opinion dynamics of elections in Twitter[C]//2012 Eighth Latin American Web Congress. Colombia: IEEE, 2012: 32-39.
[11] ZHA Q B, KOU G, ZHANG H J, et al. Opinion dynamics in finance and business: a literature review and research opportunities [J]. Financial Innovation, 2020, 6(1): 1-22.
[12] DONG Y C, ZHAN M, KOU G, et al. A survey on the fusion process in opinion dynamics [J]. Information Fusion, 2018, 43: 57-65.
[13] SÎRBU A, LORETO V, SERVEDIO V D P, et al. Opinion dynamics: models, extensions and external effects[M]//Participatory sensing, opinions and collective awareness. Berlin: Springer, 2017: 363-401.
[14] URENA R, CHICLANA F, MELANCON G, et al. A social network based approach for consensus achievement in multiperson decision making [J]. Information Fusion, 2019, 47: 72-87.
[15] CABRERIZO F J, AL-HMOUZ R, MORFEQ A, et al. Soft consensus measures in group decision making using unbalanced fuzzy linguistic information [J]. Soft Computing, 2017, 21(11): 3037-3050.
[16] LI G X, KOU G, PENG Y. Heterogeneous large-scale group decision making using fuzzy cluster analysis and its application to emergency response plan selection [J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2021, 52(6): 3391-3403.
[17] XU S, WANG P, LYU J. Iterative neighbour-information gathering for ranking nodes in complex networks [J]. Scientific Reports, 2017, 7(1): 1-13.
[18] NEDIĆ A, OLSHEVSKY A, RABBAT M G. Network topology and communication-computation tradeoffs in decentralized optimization [J]. Proceedings of the IEEE, 2018, 106(5): 953-976.
[19] ZHANG K Q, YANG Z R, BAŞAR T. Multi-agent reinforcement learning: a selective overview of theories and algorithms [J]. Handbook of Reinforcement Learning and Control, 2021: 321-384.
[20] ZHENG S Y, CUI M, ZHANG G C. Reinforcement learning-based online trajectory optimization for secure UAV communications [J]. Journal of Guangdong University of Technology, 2021, 38(04): 59-64. (in Chinese)
[21] SHOU Z Y, DI X. Reward design for driver repositioning using multi-agent reinforcement learning [J]. Transportation Research Part C: Emerging Technologies, 2020, 119: 102738.
[22] SUN X Z, QIU J. Two-stage volt/var control in active distribution networks with multi-agent deep reinforcement learning method [J]. IEEE Transactions on Smart Grid, 2021, 12(4): 2903-2912.
[23] ZHANG K Q, YANG Z R, LIU H, et al. Fully decentralized multi-agent reinforcement learning with networked agents[C]//International Conference on Machine Learning. Sweden: IMLS, 2018: 5872-5881.
[24] DEY R, SALEM F M. Gate-variants of gated recurrent unit (GRU) neural networks[C]//2017 IEEE 60th International Midwest Symposium on Circuits and Systems. Michigan: IEEE, 2017: 1597-1600.
[25] SILVER D, SINGH S, PRECUP D, et al. Reward is enough [J]. Artificial Intelligence, 2021, 299: 103535.
[26] SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. Massachusetts: MIT Press, 2018.
[27] FOERSTER J, FARQUHAR G, AFOURAS T, et al. Counterfactual multi-agent policy gradients[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Louisiana: AAAI Press, 2018, 32(1): 2974-2982.
[28] AGOGINO A, TUMER K. Multi-agent reward analysis for learning in noisy domains[C]//Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems. Utrecht: IFAAMAS, 2005: 81-88.
[29] BARRAT A, BARTHELEMY M, PASTOR-SATORRAS R, et al. The architecture of complex weighted networks [J]. Proceedings of the National Academy of Sciences, 2004, 101(11): 3747-3752.
[30] SILVER D, LEVER G, HEESS N, et al. Deterministic policy gradient algorithms[C]//International Conference on Machine Learning. Beijing: IMLS, 2014: 387-395.
[31] BLONDEL V D, HENDRICKX J M, TSITSIKLIS J N. On Krause's multi-agent consensus model with state-dependent connectivity [J]. IEEE Transactions on Automatic Control, 2009, 54(11): 2586-2597.
[32] WU C W. Algebraic connectivity of directed graphs [J]. Linear and Multilinear Algebra, 2005, 53(3): 203-223.
[33] ESFAHANIAN A H. Connectivity algorithms[M]//Topics in structural graph theory. Cambridge: Cambridge University Press, 2013: 268-281.
[34] WANG H J, SHANG L H. Opinion dynamics in networks with common-neighbors-based connections [J]. Physica A: Statistical Mechanics and its Applications, 2015, 421: 180-186.
[35] CHENG C, YU C B. Opinion dynamics with bounded confidence and group pressure [J]. Physica A: Statistical Mechanics and its Applications, 2019, 532: 121900.
Related articles in this journal:
[1] Gu Zhi-hua, Peng Shi-guo, Huang Yu-jia, Feng Wan-dian, Zeng Zi-xian. Leader-following Consensus of Nonlinear Multi-agent Systems with ROUs and RONs Based on Event-triggered Impulsive Control [J]. Journal of Guangdong University of Technology, 2023, 40(01): 50-55.
[2] Qu Shen, Che Wei-wei. Distributed Model-free Adaptive Control for Nonlinear Multi-agent Systems under FDI Attacks [J]. Journal of Guangdong University of Technology, 2022, 39(05): 75-82.
[3] Liu Jian-hua, Li Jia-hui, Liu Xiao-bin, Mu Shu-juan, Dong Hong-li. Non-fragile Consensus Control of Multi-rate Multi-agent Systems under an Event-triggered Mechanism [J]. Journal of Guangdong University of Technology, 2022, 39(05): 102-111.
[4] Zeng Zi-xian, Peng Shi-guo, Huang Yu-jia, Gu Zhi-hua, Feng Wan-dian. Mean-square Quasi-consensus of Stochastic Multi-agent Systems under Two Different Impulsive Deception Attacks [J]. Journal of Guangdong University of Technology, 2022, 39(01): 71-77.
[5] Xie Guang-qiang, Zhao Jun-wei, Li Yang, Xu Hao-ran. Cooperative Lane-changing Control of Vehicles Based on Multi-cluster Systems [J]. Journal of Guangdong University of Technology, 2021, 38(05): 1-9.
[6] Zhang Hong-ye, Peng Shi-guo. Consensus of Multi-agent Systems with Random Time Delays Based on a Model Simplification Method [J]. Journal of Guangdong University of Technology, 2019, 36(02): 86-90,96.
[7] Peng Jia-en, Deng Xiu-qin, Liu Tai-heng, Liu Fu-chun, Li Wen-zhou. A Latent Factor Model Recommendation Algorithm Integrating Social and Tag Information [J]. Journal of Guangdong University of Technology, 2018, 35(04): 45-50.
[8] Rao Dong-ning, Wang Jun-xing, Wei Lai, Wang Ya-li. A Parallel Minimum Cut Algorithm and Its Application in Financial Social Networks [J]. Journal of Guangdong University of Technology, 2018, 35(02): 46-50.
[9] Zhang Zhen-hua, Peng Shi-guo. Leader-following Consensus of Second-order Multi-agent Systems under Switching Topologies [J]. Journal of Guangdong University of Technology, 2018, 35(02): 75-80.
[10] Luo He-fu, Peng Shi-guo. Distributed Formation Control of Multi-agent Systems with Multiple Time-varying Delays [J]. Journal of Guangdong University of Technology, 2017, 34(04): 89-96.
[11] Rao Dong-ning, Wen Yuan-li, Wei Lai, Wang Ya-li. A Spark-based Weighted Centrality Algorithm for Social Networks in Different Cultural Environments [J]. Journal of Guangdong University of Technology, 2017, 34(03): 15-20.
[12] Wang Xiao-tong. Measuring the Influence of Microblog Users Based on PageRank [J]. Journal of Guangdong University of Technology, 2016, 33(03): 49-54.
[13] Tang Ping, Yang Yi-min. Research on Multi-agent Systems and the Architecture of Robot Soccer Systems [J]. Journal of Guangdong University of Technology, 2001, 18(4): 1-4.