Current Issue
  • Volume 31 Issue 3
    Comprehensive Studies
    The Application of the Adjustable Multitimes Clustering Algorithm in Telecom Data
    TENG Shao-Hua, WU Hao, LI Ri-Gui, ZHANG Wei, LIU Dong-Ning, LIANG Lu
    Journal of Guangdong University of Technology. 2014, 31 (3): 1-7.   DOI: 10.3969/j.issn.1007-7162.2014.03.001
    Abstract
    Huge amounts of telecom data are generated every day, and extracting useful information from them is a core data mining problem. Because different applications need different numbers of clusters, a single run of the K-means clustering algorithm sometimes cannot generate the user-specified K clusters. An adjustable multitimes clustering method is proposed. A large value of K is used in the first run of the K-means algorithm, yielding K clusters. These clusters are then used to select the number of clusters and the initial cluster centers for the second run. The experimental results show that the method is effective and can be applied to mining different numbers of clusters and to big data analysis.
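
The two-run idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the first-run clusters determine the second-run centers, so the rule used here (seed the second run with the centers of the largest first-run clusters) is an assumption, and the names `two_pass_kmeans`, `big_k` and `target_k` are hypothetical.

```python
import random

def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centers[i])))

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the given initial centers."""
    centers = list(centers)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [tuple(sum(xs) / len(g) for xs in zip(*g)) if g else c
                   for c, g in zip(centers, groups)]
    # Final assignment against the final centers.
    groups = [[] for _ in centers]
    for p in points:
        groups[nearest(p, centers)].append(p)
    return centers, groups

def two_pass_kmeans(points, big_k, target_k, seed=0):
    """Run K-means with a large K, then rerun with target_k seeded from run 1."""
    rng = random.Random(seed)
    # Run 1: deliberately over-segment the data with a large K.
    centers, groups = kmeans(points, rng.sample(points, big_k))
    # ASSUMED selection rule: keep the centers of the largest first-run clusters.
    ranked = sorted(zip(centers, groups), key=lambda cg: len(cg[1]), reverse=True)
    seeds = [c for c, _ in ranked[:target_k]]
    # Run 2: ordinary K-means with the user-specified number of clusters.
    return kmeans(points, seeds)
```

Calling `two_pass_kmeans(points, big_k=9, target_k=3)` on a list of coordinate tuples returns the final centers and the points grouped by center. Other selection rules, such as merging nearby first-run centers, would fit the same skeleton.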
    The Storage Model of Big Data Basic-elements in HBase Database and Its Realization
    LI Qiao-Xing, QIANG Bao-Hua, YANG Chun-Yan
    Journal of Guangdong University of Technology. 2014, 31 (3): 8-13.   DOI: 10.3969/j.issn.1007-7162.2014.03.002
    Abstract
    Big data will have a profound impact on the economy, society and daily life in the near future, and research on the integration and storage of big data may play an important theoretical and practical role in broadening and deepening its applications. This paper uses the data storage structure of the distributed database HBase and the basic-element of Extenics to integrate heterogeneous data sets, and then stores the processed data set in an HBase database. The new data set, obtained by extracting the typical characteristics of the data and their values, especially for semi-structured and unstructured data, provides not only a new way to analyze and interpret data but also research ideas and strategy generation for domain-specific problems from the perspective of big data.
    Multi-objective Optimization Algorithm Using Cloud Model and Favour Ranking
    GAO Ying, YU Qi, LIU Wai-Xi
    Journal of Guangdong University of Technology. 2014, 31 (3): 14-20.   DOI: 10.3969/j.issn.1007-7162.2014.03.003
    Abstract
    A multi-objective optimization algorithm inspired by the cloud model and using favour ranking is introduced. The innovation of the algorithm lies in estimating good solution regions and producing new solutions according to cloud model theory. The algorithm uses information obtained during optimization to build a cloud model of the good solution regions, and estimates the three digital characteristics of the cloud model with backward cloud generators. Forward cloud generators are then used to generate the current offspring population from the three digital characteristics. The union of the current population and the current offspring population is sorted using favour ranking, and the best individuals are selected to form the next population. The proposed algorithm was tested on a set of benchmark functions and compared with several other algorithms. The experimental results show that the algorithm is effective on the benchmark functions.
    Privacy-preserving Algorithm for Cross-organizational Collaborative
    LIU Hong-Wei, LIU Zhi-Hui, ZHU Hui, LU Tao
    Journal of Guangdong University of Technology. 2014, 31 (3): 21-26.  
    Abstract
    Cross-organizational data in collaborative optimization decision-making has the typical characteristics of big data, such as distribution, heterogeneity and privacy. Secure multiparty computation (SMC) is a privacy-preserving approach based on collaborative mechanisms or protocols. However, commonly used methods, such as monotone span programs, cannot avoid high computational complexity. This paper discusses two kinds of problems in collaborative optimization decision-making and proposes secure multiparty computation protocols that preserve the privacy of constraint parameters and decision variables. A security proof is then given. The results show that the SMC protocols can reduce the computational complexity of collaborative optimization decisions, and that computation on some private information can be completed without transferring that information for processing.
    Research and Implementation on Competitive Intelligence System Based on Big Data
    WANG Yong, XU Zhong-Tao, WANG Ying
    Journal of Guangdong University of Technology. 2014, 31 (3): 27-31.   DOI: 10.3969/j.issn.1007-7162.2014.03.005
    Abstract
    To help enterprises obtain timely, accurate and reliable intelligence in a big data environment, this paper proposes a way to build an enterprise competitive intelligence system based on big data. The system uses focused web crawlers to collect intelligence information and employs a Hadoop-based KNN algorithm to process and classify it within a B/S (Browser/Server) framework, which addresses problems such as the high time complexity of the KNN classification algorithm in a big data environment. The system also allows some user customization, making it more personalized.
    Some Challenges in Clustering Analysis
    JIANG Sheng-Yi, WANG Lian-Xi
    Journal of Guangdong University of Technology. 2014, 31 (3): 32-38.   DOI: 10.3969/j.issn.1007-7162.2014.03.006
    Abstract
    The aim of clustering is to help people discover and understand the unknown, and thereby accumulate knowledge for real life. Clustering analysis is an important topic in unsupervised learning, and is widely used across disciplines as a tool for exploring unknown data and its regularities. This paper analyzes the clustering procedure and briefly surveys related achievements. It then summarizes the problems clustering algorithms face in processing various data types, high-dimensional data and unbalanced data, and discusses in detail the scalability of algorithms and the selection of evaluation indices. Finally, some directions for future research are proposed. This work can serve as a reference for further studies of clustering and data mining.
    A Recognition Algorithm of Noise Applied to Environments with Intensive Noise-data Distribution
    CHEN Ping-Hua, ZHOU Peng
    Journal of Guangdong University of Technology. 2014, 31 (3): 39-43.   DOI: 10.3969/j.issn.1007-7162.2014.03.007
    Abstract
    To improve the noise-data recognition rate of DBSCAN in environments with an intensive noise-point distribution, this paper combines the PageRank algorithm with the features of intensive noise data, constructs an inner-cluster mapping function for voting, and proposes NoiseRank, an inter-cluster voting noise recognition algorithm. Experimental results show that in environments with an intensive noise-point distribution, the noise-data recognition rate of NoiseRank is much higher than that of DBSCAN.
    Hybrid Dynamic Collaborative Filtering Algorithm Based on Big Data Sets
    WANG Ling, FU Xiu-Fen, WANG Xiao-Mu
    Journal of Guangdong University of Technology. 2014, 31 (3): 44-48.   DOI: 10.3969/j.issn.1007-7162.2014.03.008
    Abstract
    Collaborative filtering has been widely used in recommendation systems, but the traditional algorithm has some limitations: it adapts poorly to the sparsity of user-item rating matrices, and when calculating item similarity it fails to consider item categories, user scores, changes in interest over time and other factors. To address these limitations, this paper proposes a hybrid dynamic collaborative filtering algorithm for big data sets, based on the traditional collaborative filtering algorithm. When calculating item similarity, the algorithm introduces time decay functions and considers both the similarity of item scores and the similarity of item categories. The weights of the integrated item similarity can be adjusted automatically. The algorithm also improves similarity computation and the selection of neighboring items. To verify its effectiveness, experiments were carried out on the MovieLens data sets. Experimental results show that the algorithm performs better than traditional recommendation algorithms.
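
As a rough illustration of how a time decay function can enter item similarity, the sketch below down-weights each co-rating by its age inside a cosine-style score. The abstract does not give the decay function or how it is combined with category similarity, so the exponential form, the `rate` parameter and the function names are all assumptions made for this example.

```python
import math

def time_decay(age_days, rate=0.01):
    """ASSUMED exponential decay: 1.0 for a fresh rating, falling toward 0."""
    return math.exp(-rate * age_days)

def decayed_item_similarity(ratings_a, ratings_b, now, rate=0.01):
    """Cosine-style similarity of two items over their common raters.

    ratings_a, ratings_b: dict mapping user -> (score, day_rated).
    Each co-rating is weighted by the decay of the older of the two ratings.
    """
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        return 0.0
    numerator = norm_a = norm_b = 0.0
    for user in common:
        score_a, day_a = ratings_a[user]
        score_b, day_b = ratings_b[user]
        weight = time_decay(now - min(day_a, day_b), rate)
        numerator += weight * score_a * score_b
        norm_a += score_a ** 2
        norm_b += score_b ** 2
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return numerator / math.sqrt(norm_a * norm_b)
```

With this weighting, two items rated identically by the same users score close to 1 when the ratings are fresh and progressively lower as the ratings age.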
    A Secure Cloud Storage Model Based on HDFS
    LIN Sui, HUANG Jian, JIANG Wen-Chao, QIN Guo-Min
    Journal of Guangdong University of Technology. 2014, 31 (3): 49-54.   DOI: 10.3969/j.issn.1007-7162.2014.03.009
    Abstract
    This paper proposes a novel secure cloud storage model (ASOM) based on HDFS. By isolating the metadata from the physical storage and the communication between data nodes and metadata subservers, the ASOM model can guarantee secure metadata management. The main advantages of the approach include avoiding the superposition of complex security policies and the mistrust between users and the platform. Furthermore, the secure storage service can be easily integrated into a cloud computing environment.
    Research on Intrusion Prevention Based on Trust in Cloud Environments
    WANG Shuang-Tu, HAN Jian-Hua, LUO Jun
    Journal of Guangdong University of Technology. 2014, 31 (3): 55-61.   DOI: 10.3969/j.issn.1007-7162.2014.03.010
    Abstract
    Cloud computing has been dynamic, virtual and open by nature since it came into use, and large-scale cloud security incidents of all kinds have led to frequent questions about the safety of cloud environments. To ensure the security of cloud environments, this paper proposes an intrusion prevention framework model based on trusted computing, combining intrusion prevention technologies with trusted computing ideas. The model starts from the principle of intrusion prevention and obtains users' behavioral characteristics. These features are then normalized step by step, and the weight of each feature is determined to obtain the credibility of user nodes. Next, a variety of cloud cluster server engines are used for detection and defense, together with integrated decision analysis and cluster analysis, enabling the cloud to perform intrusion prevention in a timely manner. This avoids the drawbacks of traditional intrusion prevention, in which each system works in isolation and lags behind in detecting and preventing attacks. The model provides cloud users with comprehensive intrusion prevention services and helps the cloud withstand attacks, keeping the cloud and its users secure.
    An Improved RFID Mutual Authentication Protocol Based on Hash Function
    XIE Jin-Biao, OU Yu-Yi, LING Jie
    Journal of Guangdong University of Technology. 2014, 31 (3): 62-66.   DOI: 10.3969/j.issn.1007-7162.2014.03.011
    Abstract
    In view of the defects of existing Hash-function-based RFID authentication protocols and the low efficiency of security protocol authentication in Internet of Things applications, this paper proposes an improved RFID mutual authentication protocol based on the Hash function. The protocol can protect data privacy and prevent replay attacks, privacy tracking and spoofing attacks. Compared with other protocols of this kind in security and performance, this protocol, which uses the tag authentication marks Tuse and Tstore and a dynamic secret value S, can prevent desynchronization attacks, has higher efficiency, and is suitable for low-cost RFID systems.
    Anonymous Authentication Protocol Based on Bilinear Pairing and Nonce in Cloud Computing
    ZHAO Guang-Qiang, LING Jie
    Journal of Guangdong University of Technology. 2014, 31 (3): 67-71.   DOI: 10.3969/j.issn.1007-7162.2014.03.012
    Abstract
    To meet the high security requirements of cloud computing services, this paper proposes an anonymous authentication protocol for cloud computing and designs a model suited to identity authentication. The temporary identity of the user is constructed using bilinear pairings. In addition, a nonce is used in place of time stamps to avoid the clock synchronization problem. The protocol implements mutual authentication based on the computational hardness of the Discrete Logarithm Problem and the irreversibility of hash functions. The protocol is efficient and highly secure. It can be applied in distributed cloud environments that need to protect users' privacy.
    Design of CAE Software Integrated System Based on Integration of Cloud Computing and Super Computing
    LIN Xin-Da, LIN Sui
    Journal of Guangdong University of Technology. 2014, 31 (3): 72-76.   DOI: 10.3969/j.issn.1007-7162.2014.03.013
    Abstract
    Cloud computing and super computing are products of the information age. To meet users' needs for high performance computing when using CAE software and its complex business processes, this paper proposes an architecture design and the key implementation technologies for a CAE software system based on the integration of cloud computing and super computing, and gives an example to illustrate them.
    Task Scheduling Algorithm Based on Simulated Annealing Ant Colony Algorithm in Cloud Computing Environment
    ZHANG Hao-Rong, CHEN Ping-Hua, XIONG Jian-Bin
    Journal of Guangdong University of Technology. 2014, 31 (3): 77-82.   DOI: 10.3969/j.issn.1007-7162.2014.03.014
    Abstract
    This paper studies task scheduling in cloud computing and proposes a hybrid scheduling algorithm (ACOSA) that combines the ant colony algorithm and the simulated annealing algorithm for the MapReduce programming framework of cloud computing. The algorithm aims to minimize scheduling time and introduces task-resource matching factors and load balancing. First, the ant colony algorithm is used to obtain the optimal solution for a set of tasks and resources. Then, the path is optimized and the pheromone of the solution is updated by the simulated annealing algorithm. Finally, the ACOSA algorithm is implemented by extending and recompiling the CloudSim cloud computing simulation platform. The experimental results show that the algorithm performs well in scheduling time and load balancing.
    Clusterhead Selection Mechanism Based on Energy Ratio in Wireless Sensor Networks
    LI He, LIU Guang-Cong, HU Die
    Journal of Guangdong University of Technology. 2014, 31 (3): 83-87.   DOI: 10.3969/j.issn.1007-7162.2014.03.015
    Abstract
    Because nodes in a wireless sensor network consume energy to different degrees, this paper proposes a cluster head selection mechanism based on the cost of energy. In this mechanism, a common node is first treated as the cluster head node, and the node's necessary energy consumption and current residual energy are computed. With the proposed concepts and formulas, these energy-related values are converted into a value for the energy loss speed of the selected clusters at measurable sensor nodes. Simulation results show that the algorithm selects better cluster heads and can effectively extend the network life cycle.
    Recognition and Normalization of Chinese Time Expressions Based on Rules
    ZUO Ya-Yao, LONG Yao-Fa, LI Jie-Jun
    Journal of Guangdong University of Technology. 2014, 31 (3): 88-94.   DOI: 10.3969/j.issn.1007-7162.2014.03.016
    Abstract
    To address the recognition and normalization of time expressions in text, which are diverse and unstructured, this paper proposes describing temporal elements in order to classify the types of time expressions and their normalized forms. Combining regular expressions with a Trie tree structure, it builds a recognition tree that recognizes time expressions automatically. Finally, it proposes a normalization algorithm and a correction algorithm to process the recognized results. The results indicate that the approach performs well.
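
The rule-based recognition and normalization step can be illustrated with a single rule for absolute dates. This is only a sketch under stated assumptions: the pattern, the ISO 8601 target format and the name `normalize_dates` are illustrative choices, and the paper's Trie-based recognition tree and correction algorithm are not reproduced here.

```python
import re

# One hypothetical rule: an absolute date such as 2014年3月5日 (year, month, day).
DATE_PATTERN = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")

def normalize_dates(text):
    """Find absolute Chinese date expressions and return them as YYYY-MM-DD."""
    return ["{:04d}-{:02d}-{:02d}".format(int(year), int(month), int(day))
            for year, month, day in DATE_PATTERN.findall(text)]
```

A fuller rule set would also need to cover relative expressions such as 明天 ("tomorrow"), which is where a normalization step with a reference time becomes necessary.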
    Research on Sentiment Classification of Texts Based on SVM
    CHEN Pei-Wen, FU Xiu-Fen
    Journal of Guangdong University of Technology. 2014, 31 (3): 95-101.   DOI: 10.3969/j.issn.1007-7162.2014.03.017
    Abstract
    The key problem in sentiment analysis of texts is sentiment polarity classification. Based on an analysis of the factors affecting sentiment classification of texts, this paper builds a sentiment lexicon, extracts sentiment features and weights them. It then uses a support vector machine (SVM) classifier for emotion recognition and text classification. Finally, it runs the classification model on corpus data sets on a single-machine platform and on the Spark distributed computing platform to analyze classification accuracy and time cost. The experimental results verify the effectiveness of the text sentiment polarity classification model on both platforms.
    Research on the Strategy for Temporal Information Index Based on HBase
    CHEN Lei, FENG Chao-Yong
    Journal of Guangdong University of Technology. 2014, 31 (3): 102-108.   DOI: 10.3969/j.issn.1007-7162.2014.03.018
    Abstract
    To meet the need to store and quickly retrieve massive unstructured temporal information, this paper proposes using the distributed, unstructured database HBase on the Hadoop platform to store temporal data. It builds a temporal data storage model with the temporal set as the storage unit, and designs a multi-level indexed Distributed Hash Table (tDHT) algorithm to retrieve the temporal attribute values of temporal columns quickly and efficiently. Temporal attribute values are mapped to a two-dimensional space, converting temporal data into spatial objects; the temporal data area is divided using spatial data processing methods to generate multi-level temporal data subareas; and a multi-level indexed DHT directory is constructed using the DHT methodology and stored in HBase. The experimental results show that the index strategy achieves good performance and can accelerate temporal data retrieval in HBase tables to a certain extent.
    Improvement of the Concurrency Correctness Testing Method Based on Nondeterministic Test Method
    LI Zhen, XU Hai-Shui
    Journal of Guangdong University of Technology. 2014, 31 (3): 109-113.   DOI: 10.3969/j.issn.1007-7162.2014.03.019
    Abstract
    The nondeterminism and asynchrony of multithreaded execution make it fairly difficult to test the correctness of concurrent programs. To improve efficiency, this paper proposes a method for testing the correctness of concurrent programs based on the nondeterministic test method. By intensifying competition for resources among concurrent programs, potential concurrency errors are exposed and the correctness of the programs is tested. The experimental results show that this method effectively improves the efficiency of concurrency correctness testing, and that errors in concurrent programs can be found more efficiently.
    Research on Applied Models Based on Storm
    DENG Li-Long, XU Hai-Shui
    Journal of Guangdong University of Technology. 2014, 31 (3): 114-118.   DOI: 10.3969/j.issn.1007-7162.2014.03.020
    Abstract
    This paper discusses the core ideas and programming model of Storm, and analyzes its working mode and application methods. It then carries out performance and horizontal scaling tests of a data processing system based on Storm. The experimental results show that the performance and scalability of Storm are superior to those of the traditional data processing system.
    K Nearest Neighbor Query Based on Improved Kd-Tree Construction Algorithm
    CHEN Xiao-Kang, LIU Zhu-Song
    Journal of Guangdong University of Technology. 2014, 31 (3): 119-123.   DOI: 10.3969/j.issn.1007-7162.2014.03.021
    Abstract
    The K nearest neighbor query algorithm is one of the commonly used algorithms for querying massive spatial data. First, an index of large-scale spatial data is constructed with a Kd-Tree, which divides the search space hierarchically. Then, the k nearest neighbor query is used to ensure search efficiency. However, the traditional Kd-Tree construction has two drawbacks. First, each k nearest neighbor query for a test data point must backtrack to the root, which affects query efficiency. Second, the Kd-Tree uses the split domain at each level to divide space into cubes (rectangles for two-dimensional data), so extra space is included when intersections are judged, causing unnecessary data comparisons and again affecting query efficiency. To address these two shortcomings, this paper proposes a corresponding improved algorithm, the RB algorithm. Experimental results show that the algorithm has higher query efficiency than the traditional KD algorithm. The paper makes two main contributions: (1) it quickly constructs Kd-Tree indexes to support the KNN algorithm in classifying large-scale data; (2) it proposes the RB algorithm to improve the traditional Kd-Tree index construction method, improving the query efficiency of the KNN algorithm.
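
For reference, the traditional Kd-Tree construction and k nearest neighbor query that this abstract takes as its baseline can be sketched as below. This is the textbook method, not the paper's RB algorithm, whose details the abstract does not give; the function names are illustrative.

```python
import heapq

def build_kdtree(points, depth=0):
    """Build a Kd-Tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build_kdtree(points[:mid], depth + 1),
            "right": build_kdtree(points[mid + 1:], depth + 1)}

def _search(node, target, k, heap):
    """Recursive search; heap holds (-squared_distance, point) for the best k."""
    if node is None:
        return
    dist = sum((a - b) ** 2 for a, b in zip(node["point"], target))
    if len(heap) < k:
        heapq.heappush(heap, (-dist, node["point"]))
    elif dist < -heap[0][0]:
        heapq.heapreplace(heap, (-dist, node["point"]))
    diff = target[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    _search(near, target, k, heap)
    # Visit the far side only if the splitting plane is closer than the worst hit.
    if len(heap) < k or diff * diff < -heap[0][0]:
        _search(far, target, k, heap)

def k_nearest(tree, target, k):
    """Return the k nearest (squared_distance, point) pairs, closest first."""
    heap = []
    _search(tree, target, k, heap)
    return sorted((-neg_dist, point) for neg_dist, point in heap)
```

The backtracking in `_search`, which unwinds toward the root and checks each far subtree on the way, corresponds to the first drawback the abstract describes.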
    Power Efficient Bimodal Electronic Shelf Label System
    LI Zheng, FENG Yong-Jin, JIANG Zhi-Wen
    Journal of Guangdong University of Technology. 2014, 31 (3): 124-129.   DOI: 10.3969/j.issn.1007-7162.2014.03.022
    Abstract
    To replace traditional paper shelf labels, this paper proposes a power-efficient bimodal electronic shelf label system based on Internet of Things and wireless sensor network technologies. To achieve power efficiency, the proposed electronic shelf label adopts an electrophoretic electronic paper display and a Bluetooth module as its display and communication modules, respectively. Moreover, the label has two working states, a wake-up state and a sleep state, so that it can switch state on server command, report label status, and update display contents with low power consumption. The experimental results show that the system can meet the basic requirements of its application scenarios and has lower running costs than traditional paper labels.
    Sleep Method of Wireless Electronic Shelf Labels Using Relative Time
    FENG Yong-Jin, LI Zheng, ZHANG Hai-Xiao
    Journal of Guangdong University of Technology. 2014, 31 (3): 130-136.   DOI: 10.3969/j.issn.1007-7162.2014.03.023
    Abstract
    This paper proposes a sleep method for wireless electronic shelf label systems using relative time, based on the Internet of Things. It provides a device model for electronic shelf labels controlled by relative time, the system workflow, a time-sharing queuing algorithm, and a relative-time sleep algorithm. The experimental results show that the effective use of relative time helps to manage a large number of asynchronous electronic shelf labels working together. The working status of each label is controlled accurately and simply by a server that commands each label's sleep, wake-up and communication times, thus increasing each label's power efficiency.
    A Method of RFID Tag Data Validation and Recovery Based on Cloud Storage Technology
    LI Zhi-Ke, LIU Zhu-Song
    Journal of Guangdong University of Technology. 2014, 31 (3): 137-142.   DOI: 10.3969/j.issn.1007-7162.2014.03.024
    Abstract
    To store and manage RFID tag data more efficiently and reliably, this paper designs an RFID tag data management system based on cloud storage and proposes a data integrity check algorithm based on a Hash function. From the fields of the RFID tag data, the algorithm generates a unique checksum used to confirm whether the data stored in the RFID tag has been damaged. If the tag data has been damaged, the corresponding data can be read from the cloud storage system over any network to recover it.
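
The validate-then-recover flow described above might look like the following sketch. The abstract does not name the hash function, the field layout or the storage interface, so SHA-256, the field names and the in-memory `cloud_store` dictionary standing in for the cloud storage system are all assumptions.

```python
import hashlib

def checksum(fields):
    """Hash the tag's fields in sorted key order (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256()
    for key in sorted(fields):
        digest.update(key.encode("utf-8") + b"\x00")
        digest.update(str(fields[key]).encode("utf-8") + b"\x00")
    return digest.hexdigest()

def validate_and_recover(tag_id, tag_fields, cloud_store):
    """Return (fields, recovered): the tag's data, restored from the cloud if damaged.

    cloud_store is a hypothetical stand-in: a dict mapping
    tag_id -> {"fields": ..., "checksum": ...}.
    """
    record = cloud_store[tag_id]
    if checksum(tag_fields) == record["checksum"]:
        return tag_fields, False          # data on the tag is intact
    return dict(record["fields"]), True   # damaged: restore the cloud copy
```

Storing the checksum next to the cloud copy means a reader can detect damage from the tag contents alone and only fetches the full record when recovery is needed.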