Classification Method Based on Dimension Reduction
Teng Shao-hua, Lu Dong-lue, Huo Ying-xiang, Zhang Wei
Journal of Guangdong University of Technology. 2017, 34 (03): 1-7.
DOI: 10.12052/gdutxb.170008
Data mining algorithms in the era of big data need to handle massive data efficiently. Traditional classification algorithms take a long time to train a model and classify the test dataset, and they are difficult to understand. To address these problems, a classification method based on dimension reduction is proposed in this paper. The multidimensional classification problem is transformed by projection into a combination of multiple 2D projection surfaces, and a density model of the projection surfaces is trained for classification. Compared with Support Vector Machines (SVM), Logistic Regression (LR), K-Nearest Neighbor (KNN) and other algorithms, the proposed method has higher training and classification efficiency without loss of accuracy. The method is easy to implement, so it can be used for real-time applications such as intrusion detection and traffic scheduling.
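The abstract does not give the details of the density model, so the following is only a minimal sketch of the general idea, under stated assumptions: every pair of features is treated as one 2D projection, each projection is discretized into a fixed grid (the `bins` parameter is hypothetical), per-class cell counts with add-one smoothing serve as the density estimate, and the log densities are summed across projections. None of these choices is confirmed as the authors' method.

```python
from collections import defaultdict
from itertools import combinations
from math import log


class ProjectionDensityClassifier:
    """Toy classifier: one per-class 2D histogram for every feature pair."""

    def __init__(self, bins=4):
        self.bins = bins  # hypothetical grid resolution per axis

    def _cell(self, x, i, j):
        # Map features i and j of sample x to a (row, column) grid cell.
        def idx(k):
            lo, hi = self.lo[k], self.hi[k]
            if hi == lo:
                return 0
            b = int((x[k] - lo) / (hi - lo) * self.bins)
            return min(max(b, 0), self.bins - 1)
        return idx(i), idx(j)

    def fit(self, X, y):
        d = len(X[0])
        self.lo = [min(row[k] for row in X) for k in range(d)]
        self.hi = [max(row[k] for row in X) for k in range(d)]
        self.pairs = list(combinations(range(d), 2))
        self.classes = sorted(set(y))
        self.n = {c: sum(1 for label in y if label == c) for c in self.classes}
        # counts[(pair, cls)][cell] = training points of cls falling in cell
        self.counts = defaultdict(dict)
        for x, c in zip(X, y):
            for p in self.pairs:
                cell = self._cell(x, *p)
                self.counts[(p, c)][cell] = self.counts[(p, c)].get(cell, 0) + 1
        return self

    def predict(self, x):
        cells = self.bins * self.bins

        def score(c):
            # Sum of add-one-smoothed log densities over all 2D projections.
            return sum(
                log((self.counts[(p, c)].get(self._cell(x, *p), 0) + 1)
                    / (self.n[c] + cells))
                for p in self.pairs
            )
        return max(self.classes, key=score)


X = [[0, 0, 0], [0.1, 0.2, 0.1], [0.2, 0.1, 0.0],
     [1, 1, 1], [0.9, 0.8, 1.0], [0.8, 0.9, 0.9]]
y = [0, 0, 0, 1, 1, 1]
clf = ProjectionDensityClassifier(bins=4).fit(X, y)
```

Training here is a single counting pass and prediction is a handful of table lookups per projection, which is one plausible reason a grid-based density model could be cheaper than SVM or KNN.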
A Fine-grained Sentiment Analysis Algorithm for Automotive Reviews
Chen Bing-feng, Hao Zhi-feng, Cai Rui-chu, Wen Wen, Wang Li-juan, Huang Hao, Cai Xiao-feng
Journal of Guangdong University of Technology. 2017, 34 (03): 8-14.
DOI: 10.12052/gdutxb.170036
Sentiment analysis can mine valuable information from a mass of automotive reviews and has great application value in automotive product design and brand marketing. To meet the requirements of fine-grained analysis, an entity-based fine-grained sentiment analysis algorithm is put forward. First, the automotive reviews are preprocessed, and a linear-chain CRF model is used for sentiment entity recognition and sentiment classification. Second, to relate entity recognition to sentiment classification, the linear-chain CRF model is improved and a two-level CRF method is proposed. Experimental results show that the two-level CRF outperforms the linear-chain CRF in sentiment analysis and can meet the demand for fine-grained sentiment analysis of automotive reviews.
A Weighted Centrality Algorithm for Social Networks Based on Spark Platform in Different Cultural Environments
Rao Dong-ning, Wen Yuan-li, Wei Lai, Wang Ya-li
Journal of Guangdong University of Technology. 2017, 34 (03): 15-20.
DOI: 10.12052/gdutxb.170023
Social networks have developed rapidly and are widely used in fields such as science and technology, business, economics and biology. Centrality is often used to quantify the importance of nodes in a social network. However, existing centrality algorithms use only a single centrality measure and do not consider the combined effect of different measures. Therefore, a weighted centrality is proposed, which is a function of different centrality measures. The experiments use a real social network database, BoardEX, provided by our cooperative research institution, the University of Hong Kong. The database is about 600G in size, which motivates the use of the Apache Spark platform to process data of this scale. The experimental social network is divided into four regions: the U.S.A., the United Kingdom, Europe, and others. First, the degree centrality of certain persons in each region, e.g. the chief technology officers or chief information officers of quoted companies, is calculated. Then, a weighted function is constructed to calculate the average centrality. Experimental results show that, by setting the weights, the difference between the weighted centralities of the regions is minimized. In addition, the weights reflect the contributions of the various centrality measures to the weighted centrality. The use of a real social network database and big data cluster computing shows a more practical and promising application prospect.
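As a rough illustration of a weighted centrality, the sketch below combines two standard measures on a small in-memory graph. It makes several assumptions that the abstract does not confirm: the combination is linear, closeness centrality is the second measure alongside degree centrality, and the weights `w_degree` and `w_closeness` are hypothetical. It runs in plain Python and does not use Spark.

```python
from collections import deque


def degree_centrality(adj):
    # Fraction of the other nodes each node is directly connected to.
    n = len(adj)
    return {v: len(nb) / max(n - 1, 1) for v, nb in adj.items()}


def closeness_centrality(adj):
    # Reachable node count divided by total BFS distance from each source.
    out = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        out[s] = (len(dist) - 1) / total if total else 0.0
    return out


def weighted_centrality(adj, w_degree=0.5, w_closeness=0.5):
    # Assumed linear combination; the weights are illustrative only.
    deg = degree_centrality(adj)
    clo = closeness_centrality(adj)
    return {v: w_degree * deg[v] + w_closeness * clo[v] for v in adj}


# Star graph: "a" is connected to every other node.
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
wc = weighted_centrality(star)
```

In this star graph the hub scores highest under both measures, so it also has the highest weighted centrality whatever non-negative weights are chosen.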
Group Role Assignment and its Optimization with Preorder Constraints
Liu Dong-ning, Lu Ming-jun, Huang Bao-ying, Liang Lu
Journal of Guangdong University of Technology. 2017, 34 (03): 21-29.
DOI: 10.12052/gdutxb.170013
If every person or unit in a team is assigned specific work, cooperation between teammates is much easier than without specific assignments. Nonetheless, due to the complexity of data coupling and space-time, assignment under conflict constraints is a big challenge. As one of the most important but intractable constraints, the preorder constraint determines the prerequisites of assignments. Therefore, roles are introduced to abstract and model the assignment problem and to express assignment with preorder constraints. Tested by the exhaustive method, the complexity of the proposed problem is of Σ2P. To optimize the solution and accelerate processing, a multiple objective linear programming approach is proposed using IBM ILOG CPLEX. Simulation experiments are conducted to verify the approach. Its optimization rate reaches 80% to 100%, with an average of 94%, which meets the requirement of solving a certain number of problems within limited time while guaranteeing excellent team performance, and hence helps support collaboration and management effectively.
Research on Using Information Flow and HowNet to Build Big Data Semantic Sharing Channel
Mao Li-na, Li Wei-hua
Journal of Guangdong University of Technology. 2017, 34 (03): 30-35.
DOI: 10.12052/gdutxb.170026
Because of its huge volume, wide range, differing information structures, grammatical and semantic conflicts, and highly heterogeneous and dynamic nature, big data is difficult to share. To share the semantic information in big data, there must be a sharing mechanism that is dynamic, heterogeneous and large-scale. Information Flow theory, also called Channel Theory, and HowNet are analyzed; combining them provides a basis for big data semantic understanding. The idea of building a big data semantic sharing channel based on both Information Flow theory and HowNet is presented. An information resource classification ontology, a society ontology and a channel ontology act as the kernel of semantic sharing, and the big data semantic sharing channel is built by infomorphisms. A preliminary practice is carried out with professional information sharing as a case study. The experimental results show the effectiveness of the constructed channel. Combining Information Flow theory and HowNet technology can form a useful big data semantic sharing channel.
The Many to Many Friend Recommendation of Online Community Based E-CARGO
Zhang Wei, Zhang Si-qin, Song Jing-jing, Teng Shao-hua, Liu Yan
Journal of Guangdong University of Technology. 2017, 34 (03): 36-42.
DOI: 10.12052/gdutxb.170040
Friend recommendation is an effective method for establishing an online community. However, overly frequent recommendations may have the opposite effect and become a nuisance to users. To improve users' experience, a new method of friend recommendation is proposed via many-to-many assignment. This method limits the number of recommended and accepted friends. Its application background is the website http://www.scholat.com/, a large higher education and research collaboration platform. Recommendation is modeled via Role-Based Collaboration and its E-CARGO model. The Kuhn-Munkres with Backtracking (KMB) algorithm is then used to solve the optimal assignment of the proposed method. Simulation experiments show that the proposed recommendation method is friendly, efficient and accurate. It can improve online community recommendation mechanisms, which can support the development of a virtual society.
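To make the idea of capacity-limited many-to-many recommendation concrete, here is a small sketch. It is not the KMB algorithm named in the abstract: it is a greedy stand-in that picks the highest-scoring pairs first and, unlike a Kuhn-Munkres style method, does not guarantee an optimal total score. The `scores` table and the two limits `max_recommended` (recommendations shown to one user) and `max_accepted` (times one candidate may be recommended) are hypothetical interpretations of the limits mentioned above.

```python
def greedy_many_to_many(scores, max_recommended, max_accepted):
    """Pick (user, candidate) pairs by descending score under two caps.

    scores[(u, v)] is a hypothetical affinity of recommending v to u.
    """
    shown = {}      # number of recommendations already given to each user
    used = {}       # number of times each candidate was recommended
    chosen = []
    for (u, v), _ in sorted(scores.items(), key=lambda kv: -kv[1]):
        if u == v:
            continue  # never recommend a user to themselves
        if shown.get(u, 0) < max_recommended and used.get(v, 0) < max_accepted:
            chosen.append((u, v))
            shown[u] = shown.get(u, 0) + 1
            used[v] = used.get(v, 0) + 1
    return chosen


scores = {("a", "x"): 0.9, ("a", "y"): 0.8, ("b", "x"): 0.7, ("b", "y"): 0.6}
pairs = greedy_many_to_many(scores, max_recommended=1, max_accepted=1)
```

With both caps set to 1 the problem reduces to a one-to-one assignment, which is the case an exact Kuhn-Munkres solver handles optimally; raising the caps gives the many-to-many setting.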
A Research on Mapping Knowledge Domains of Strategic Niche Management (SNM) in China: Based on the Quantitative Analysis of CiteSpace Ⅲ
Liu Yi-xin, Zhang Guang-yu, Yang Shi-wei, Zhang Yu-lei
Journal of Guangdong University of Technology. 2017, 34 (03): 59-66.
DOI: 10.12052/gdutxb.170033
At present, international academia is actively researching Strategic Niche Management (SNM), which aims to help brand-new technologies leap over the "valley of death" and achieve sustainable development, and the SNM method has also aroused interest in China. To sort out the current status of domestic SNM research, 93 Chinese literature sources are collected from CNKI as data. Using knowledge domain mapping and CiteSpace Ⅲ, a visual analysis is conducted from three aspects: the researcher cooperation network, the research institution cooperation network, and the keyword co-occurrence knowledge map, with the aim of comprehensively reviewing the status and focus of SNM research. The research indicates that: (1) domestic SNM research has formed a relatively significant research team, but cooperation between research institutions is low and not widespread; (2) SNM has many research hot spots, and they show obvious dynamic evolution characteristics; (3) the domestic SNM research path can be roughly divided into two branches, namely the path of "Niche-Technology Niche-Enterprise Technical Ability" and the path of "Technology Niche-Strategic Niche Management-Localization Application"; (4) the frontier of SNM research mainly focuses on the construction, evacuation and effect evaluation of protective space.
A Research on the Application of Master Data Management Technology in Enterprise Information Integration
Lin Sui, Li Yu-zhen, Sun Wei-jun
Journal of Guangdong University of Technology. 2017, 34 (03): 67-71.
DOI: 10.12052/gdutxb.170015
With the wide application of big data, accurate management of enterprise information needs to be strengthened. Against the background of big data, and given the problem of multi-source heterogeneous data, constructing a master data platform is the best way to realize enterprise information integration, owing to the criticality, uniqueness and long-term validity of master data. Following the principle of "multiple data one source, one source multiple distributions", a complete, unified and centralized enterprise information integration mechanism can be built, and a data management system conforming to enterprise information norms can be established. This realizes comprehensive sharing of basic enterprise information and unified data distribution, allowing government decision-making departments to understand comprehensively, dynamically and accurately the business registration, production and management of enterprises.
A Study of Traffic Status and Dynamic Control Based on IC Card Data
Wu Jin-cheng, Xie Zhen-dong, Wu Guan-hua, Fang Qiu-shui, Yu Hong-ling
Journal of Guangdong University of Technology. 2017, 34 (03): 77-82.
DOI: 10.12052/gdutxb.170010
In view of rapidly growing travel demand and an unreasonable travel structure, a dynamic charge model for bus travel and its application are put forward. People have different travel behaviors in different periods of the day, which leads to different traffic statuses. By analyzing IC card data, a curve model is built to discover the travel patterns of urban residents and further optimize transportation in rush hour. IC card data from bus lines in the city are studied through statistics, comparison and model building to analyze travel time, demand and their variation curves, and thus to find the travel habits of urban residents. Finally, dynamic charge measures are proposed to control travel flow at busy times. The results show that this dynamic control design can optimize the daily travel structure and reduce travel density, which may provide a reference for public transportation management.
A Research on Text Information Extraction from Annual Report Based on Domain Ontology
Liang Zhuo-qian, Wang Dong, Zhu Hui, Pan Ding
Journal of Guangdong University of Technology. 2017, 34 (03): 89-95.
DOI: 10.12052/gdutxb.170029
Significant financial information can be retrieved from the vast amount of textual data in Chinese business accounting reports (annual reports). Nevertheless, because it is unstructured, this textual information is usually difficult to obtain and analyze with traditional computer and database techniques. To address this issue, a unified domain-specific ontology is presented and combined with Chinese natural language processing (NLP) to transform accounting reports from unstructured text into a structured XBRL-based form along three dimensions: word attribute description, word relation organization, and related knowledge links.
RFID-based Production Logistic Synchronization Intelligent Management System in the Industrial Park
Wu Qiang, Liu Xuan, Qu Ting, Zhang Ting
Journal of Guangdong University of Technology. 2017, 34 (03): 96-104.
DOI: 10.12052/gdutxb.160125
Industrial parks face many problems, such as unsynchronized production logistics information, low execution efficiency and high operation costs. To address these problems, the standard AUTOM information infrastructure is extended into a new infrastructure that supports seamless information exchange across multiple stages and multiple decision units, and a new "three-stage two degree" production logistic synchronization is proposed. An intelligent management system for production logistic synchronization in industrial parks is proposed, based on analysis of the production logistic operation process and advanced IoT technology. The system includes four key technologies: intelligent equipment synchronization, sensing synchronization, information coupling synchronization and decision-making synchronization. Besides realizing real-time synchronization and intelligent management of production logistic information, it also improves execution efficiency.
Prediction of Short-Term Load Based on Big Data Mining
Chen Li, Cao Xi, Lin Jun-jie, Gao Hong-ming, Liu Fei-ya, Li Yan-yan
Journal of Guangdong University of Technology. 2017, 34 (03): 105-109.
DOI: 10.12052/gdutxb.170044
The risk of power load has become a hot spot in the electric power industry. However, because it relies on single-factor evaluation, the traditional power load forecasting model is not sufficiently comprehensive or systematic. Hence, it cannot accurately predict the risk, which may leave hidden dangers of power failure. To address this issue, the risk of power load is analyzed and forecast using data collected from multiple sources: the customer service center, machines, historical weather records and so on. The data are first cleaned and sorted; then, through K-Means clustering, variables that correlate strongly with the risk degree of a transformer are chosen to construct Bayesian discriminant models. The experimental results show that this model can accurately predict transformer risk with a probability of 99.53%. In practice, this model can provide prevention schemes and control decisions for power supply security, help reduce customers' power failures and improve customer satisfaction.
A Study of Construction and Cultivation of Big Data Capacity of Enterprise
Xie Zhen-dong, Wu Jin-cheng, Li Zhi-ming, Wu Guan-hua
Journal of Guangdong University of Technology. 2017, 34 (03): 110-114.
DOI: 10.12052/gdutxb.170009
In the era of big data, there is a social consensus on the development and application of big data, which has permeated all walks of life. Enterprises are one of the important sources of big data and also its key carriers. Big data technology has become an important trend in future industrial development and in enterprise transformation and upgrading. However, most enterprises do not have a set of methods for dealing with big data. In view of this, some ideas are put forward on the construction of enterprise big data capacity, which are expected to provide measures and thoughts on building and developing that capacity.