2021, 32(12):3669-3683. DOI: 10.13328/j.cnki.jos.006164
Abstract:As a two-dimensional formal method, graph grammar provides an intuitive and formal way to specify visual programming languages. However, most existing graph grammar formalisms are deficient in handling spatial semantics, which limits the expressive power and practical application scope of graph grammars. To address these problems, this study defines virtual nodes to build a new spatial graph grammar formalism, vCGG (virtual-node based coordinate graph grammar). Different from other spatial graph grammars, vCGG uses virtual nodes to specify the syntactic-structure and spatial-semantic relationships between host graphs and productions, which preserves the power of abstraction and improves the specification of spatial semantics. Compared with other spatial graph grammars, vCGG performs well in intuitiveness, normalization, expressive power, and analysis efficiency.
YANG Ming-Gang , CHEN Meng-Fan , YANG Shuang-Yuan , ZHANG De-Fu
2021, 32(12):3684-3697. DOI: 10.13328/j.cnki.jos.006161
Abstract:The two-dimensional strip packing problem is a classic NP-hard combinatorial optimization problem that is widely encountered in daily life and industrial production. This study proposes a reinforcement learning heuristic algorithm for it. Reinforcement learning is used to provide an initial packing sequence for the heuristic algorithm, effectively alleviating the heuristic's cold-start problem. The reinforcement learning model is self-driven, using only the value of the heuristically computed solution as the reward signal to optimize the network, so that the network can learn a better packing sequence. A simplified pointer network is used to decode the output packing sequence; the model consists of an embedding layer, a decoder, and an attention mechanism. An actor-critic algorithm is used to train the model, which improves training efficiency. The reinforcement learning heuristic algorithm is tested on 714 standard problem instances and 400 generated problem instances. Experimental results show that the proposed algorithm effectively alleviates the cold-start problem and outperforms state-of-the-art heuristics with much higher solution quality.
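As an illustrative sketch (not the authors' heuristic), the following shows the reward idea the abstract describes: a simple next-fit level packer scores a packing sequence, and its negative strip height can serve as the reward for a sequence-learning model. The rectangle data and strip width are toy stand-ins.

```python
# Minimal sketch: a next-fit level heuristic packs rectangles in a given
# order; the negative resulting height acts as an RL reward signal.
def pack_height(order, rects, strip_width):
    """rects: list of (w, h); order: permutation of indices."""
    level_y, level_h, x = 0, 0, 0
    for i in order:
        w, h = rects[i]
        if x + w > strip_width:          # rectangle does not fit: new level
            level_y += level_h
            x, level_h = 0, 0
        x += w
        level_h = max(level_h, h)
    return level_y + level_h

rects = [(3, 2), (4, 5), (2, 2), (5, 3), (1, 4)]
order = list(range(len(rects)))
reward = -pack_height(order, rects, strip_width=7)  # higher is better
print(reward)
```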
JIANG Jing , WU Qiu-Di , ZHANG Li
2021, 32(12):3698-3709. DOI: 10.13328/j.cnki.jos.006127
Abstract:In the open source community, the coding skill of developers varies, and code reviews are required to check the quality of submitted code. Decision makers are the key persons in code review, auditing the submitted code and finding software defects; code reviews thus affect the quality of open source software. Therefore, it is necessary to establish a code review process measurement system to understand the code review situation and promote the quality of open source software projects. Existing software process measurement methods mainly consider code submissions and review comments but lack consideration of decision-making activities, making it difficult to fully measure review behavior. This study considers the decision-maker factor and proposes a review process measurement system for the open source community, including review activity indicators and personnel distribution indicators. Review activity indicators include the number of reviews, the length of review comments, the number of changed lines of code, and review time. The personnel distribution indicators mainly consider the proportions and numbers of modifiers, commenters, and decision makers. This study then collects data from three popular open source projects and analyzes the relationship between the review process metrics and the number of software defects. Through empirical research and analysis, it is found that the number of decision makers, as well as the proportions of decision makers with few changes, few comments, and few decisions, is moderately positively correlated with the number of software defects. Meanwhile, compared with a measurement system without the decision-maker indicators, the measurement system with them has a higher correlation with software defects. The results of the empirical study verify the effectiveness of the review process measurement system and illustrate the necessity of adding decision-maker-related indicators.
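As an illustrative sketch of the kind of analysis described (with entirely hypothetical per-module data, not the paper's dataset or indicator definitions), one can correlate review-process metrics, including a decision-maker indicator, with defect counts:

```python
# Hypothetical data: correlate review-process metrics with defect counts
# using the Spearman rank correlation, as in the empirical analysis.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_modules = 50
metrics = {
    "num_reviews":         rng.poisson(20, n_modules),
    "review_text_length":  rng.poisson(300, n_modules),
    "changed_loc":         rng.poisson(150, n_modules),
    "num_decision_makers": rng.poisson(3, n_modules),
}
defects = rng.poisson(5, n_modules)

for name, values in metrics.items():
    rho, p = spearmanr(values, defects)
    print(f"{name:20s} rho={rho:+.2f} p={p:.3f}")
```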
ZHANG Yang , SHAO Shuai , ZHANG Dong-Wen
2021, 32(12):3710-3727. DOI: 10.13328/j.cnki.jos.006132
Abstract:As coarse-grained locks have a negative impact on the scalability of concurrent programs, this study proposes an automatic refactoring approach that converts a coarse-grained lock into a fine-grained one. Several static analyses, such as visitor pattern analysis, alias analysis, and side-effect analysis, are employed in this approach. The read and write pattern of a critical section is inferred by side-effect analysis, a pushdown automaton is proposed to identify the pattern, and refactoring is then conducted based on these results. An automatic tool, FLock, is implemented as an Eclipse plug-in. The approach is evaluated on eleven open-source projects, including HSQLDB, Jenkins, and Cassandra, reporting the number of refactored locks, changed lines of code, refactoring time, accuracy, and program performance after refactoring. FLock is also compared with the existing tools Relocker and CLOCK. The experimental results show that a total of 1757 built-in monitors are refactored and each refactoring takes 17.5 seconds on average. The experiments reveal that the proposed tool helps developers convert coarse-grained locks into fine-grained locks effectively.
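The paper refactors Java built-in monitors; as a language-neutral sketch of the underlying pattern (Python has no built-in read-write lock, so a minimal one is written here; this illustrates the idea, not FLock's Java output), a coarse mutex around a read-mostly critical section is replaced by a read-write lock so readers no longer serialize:

```python
# Sketch: replace one coarse mutex with a minimal read-write lock so
# that concurrent readers proceed in parallel and only writers exclude.
import threading

class ReadWriteLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0

    def acquire_read(self):
        with self._cond:
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        self._cond.acquire()
        while self._readers > 0:     # wait until all readers leave
            self._cond.wait()

    def release_write(self):
        self._cond.release()

table, rw = {}, ReadWriteLock()

def lookup(key):                     # read pattern -> shared lock
    rw.acquire_read()
    try:
        return table.get(key)
    finally:
        rw.release_read()

def update(key, value):              # write pattern -> exclusive lock
    rw.acquire_write()
    try:
        table[key] = value
    finally:
        rw.release_write()
```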
WANG Bo , ZHANG Yu , GENG Jia-Ning , LI Xiang-Yang
2021, 32(12):3728-3750. DOI: 10.13328/j.cnki.jos.006098
Abstract:Smart home systems make home devices smart and are widely welcomed by users. To meet diverse user needs, service providers use the "trigger-action" programming (TAP) mode to support user-tailored rules. However, the Event-State paradigm, which is now popular in TAP programming and smart home rule engines, is highly error-prone, and modifying rules and tracking errors are difficult. After a systematic analysis of the causes of TAP defects, a scheme called SSRules is proposed that is easy to write and modify and can detect abnormal rule execution. SSRules lets users enter rules written in an improved State-State paradigm and, based on the Z3 theorem prover, translates them into Event-State rules accepted by the open-source smart home system Home Assistant. Since smart homes need to track device dynamics in real time, SSRules introduces a runtime subsystem to obtain state information and check rule execution validity. Finally, a smart home simulator, HA-Simulator, is developed in Unity3D. Tests on it show that SSRules is more concise than traditional methods, reducing the number of rules by around 60% on average; it can detect transient anomalies promptly and record their causes, making it easier for users to understand and use.
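As a hedged sketch of how a solver like Z3 can be applied to state-state rules (the rules and device names below are hypothetical, and SSRules' actual encoding is more elaborate), one can check whether two rules can be triggered simultaneously while demanding contradictory device states:

```python
# Sketch: detect a conflict between two state-state rules with Z3.
# If the conjunction of both rules and both trigger conditions is
# unsatisfiable, the rules demand contradictory states when co-active.
from z3 import Bools, Solver, Implies, Not, unsat

temp_high, window_open, ac_on = Bools("temp_high window_open ac_on")

s = Solver()
s.add(Implies(temp_high, ac_on))         # rule 1: hot room  -> AC on
s.add(Implies(window_open, Not(ac_on)))  # rule 2: open window -> AC off
s.add(temp_high, window_open)            # both triggers active at once
print("rules conflict" if s.check() == unsat else "rules compatible")
```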
XIAO Yong , LIU Jian-Xun , HU Rong , CAO Bu-Qing , CAO Ying-Cheng
2021, 32(12):3751-3767. DOI: 10.13328/j.cnki.jos.006102
Abstract:With the development of SOA technology, Web services are widely used and the number of services is growing rapidly. Classifying Web services correctly and efficiently is very important for improving the quality of service discovery and the efficiency of service composition. However, existing Web service classification technologies suffer from problems such as sparse description text and insufficient consideration of attribute information and structural relationships, making it difficult to effectively improve classification accuracy. To solve this problem, this study proposes a GAT2VEC-based Web service classification method. Firstly, according to the structural relationships between Web services and their own attribute information, corresponding structure graphs and attribute bipartite graphs are constructed, and a random walk algorithm is used to generate the structural and attribute contexts of Web services. Then, the SkipGram model is trained on the joint context to obtain word vectors that fuse the multidimensional information. Finally, an SVM model is used to classify and predict Web services. The experimental results show that, compared with the Doc2Vec, LDA, DeepWalk, Node2Vec, and TriDNR methods, the proposed method achieves improvements of 135.3%, 60.3%, 12.4%, 10.5%, and 4.3% in Macro-F1, respectively, effectively improving the accuracy of service classification.
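A minimal sketch of this pipeline, under toy assumptions (a stand-in graph with built-in labels rather than real service structure/attribute graphs, and uniform walks rather than GAT2VEC's exact construction): random walks produce a corpus, skip-gram learns node vectors, and an SVM classifies them.

```python
# Sketch: random walks -> skip-gram embeddings -> SVM classification.
import random
import networkx as nx
from gensim.models import Word2Vec
from sklearn.svm import SVC

g = nx.karate_club_graph()              # toy stand-in for service graphs

def random_walks(graph, num_walks=10, walk_len=8):
    walks = []
    for _ in range(num_walks):
        for node in graph.nodes():
            walk, cur = [str(node)], node
            for _ in range(walk_len - 1):
                cur = random.choice(list(graph.neighbors(cur)))
                walk.append(str(cur))
            walks.append(walk)
    return walks

w2v = Word2Vec(random_walks(g), vector_size=32, window=5,
               sg=1, min_count=0, epochs=5)        # sg=1: skip-gram
X = [w2v.wv[str(n)] for n in g.nodes()]
y = [g.nodes[n]["club"] for n in g.nodes()]        # toy labels
clf = SVC().fit(X, y)
print(clf.score(X, y))
```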
WANG Jing-Lian , GONG Bin , LIU Hong , LI Shao-Hui
2021, 32(12):3768-3781. DOI: 10.13328/j.cnki.jos.006133
Abstract:The evolution of cloud computing from high performance to high efficiency is an urgent need for environmental protection and sustainable human development. However, on the one hand, the room for further hardware energy saving is only moderate; on the other hand, meta-heuristic scheduling algorithms, such as genetic algorithms and artificial immune algorithms, underperform in optimization dynamics because of the balance conflict between convergence and distribution. In fact, there are inevitable, logical relationships between every candidate solution (scheduling scheme) and its physical feedback, and the nonlinearity and heterogeneity of the allocated resources lead to large discrepancies in the feedback effects of different scheduling schemes, such as their energy efficiency. The approach of this study is therefore to respect these physical laws and follow hardware energy-saving principles, in order to inject new power into the algorithm's optimization ability and further enhance the energy-saving advantage of software methods. A green heterogeneous scheduling algorithm, GHSA_di/II, is presented that deeply integrates hardware and software energy-saving principles, with multi-angle, all-round improvements to the internal drive of co-evolutionary simulation in meta-heuristic algorithms. The experimental results show that, compared with three other meta-heuristic heterogeneous scheduling algorithms, the GHSA_di/II algorithm has obvious advantages in overall performance, energy saving, and scalability, for both data-intensive and computing-intensive instances.
KONG Fang , GE Hai-Zhu , ZHOU Guo-Dong
2021, 32(12):3782-3801. DOI: 10.13328/j.cnki.jos.006119
Abstract:As a common phenomenon in Chinese, zero anaphora plays an important role in many natural language processing tasks, such as machine translation, text summarization, and machine reading comprehension, and it has become a research hotspot in natural language processing. Towards better discourse analysis, this study proposes a representation architecture for Chinese zero anaphora from the discourse perspective. Firstly, the elementary discourse unit (EDU) is taken as the object of investigation to determine whether it contains zero elements. Secondly, according to their roles in the EDU, zero elements are divided into two categories: the core type and the modifier type. Thirdly, the discourse rhetorical tree of the paragraph is used as the basic unit for evaluating Chinese zero coreferential relationships; according to the positional relationship between the antecedent and the zero element, the coreferential relationship is classified into two types, Intra-EDU and Inter-EDU. For the Inter-EDU type, the coreferential relationship is further divided into four categories according to the status of the antecedent: entity, event, union, and others. Finally, this study selects the 325 texts shared by the Chinese treebank (CTB), the connective-driven Chinese discourse treebank (CDTB), and the OntoNotes corpus to annotate Chinese zero anaphora. Evaluation shows the high quality of the constructed corpus. Moreover, a complete zero anaphora resolution baseline system is constructed, showing the appropriateness and effectiveness of the proposed representation architecture from a computability perspective.
ZHANG Jian , DING Shi-Fei , DING Ling , ZHANG Cheng-Long
2021, 32(12):3802-3813. DOI: 10.13328/j.cnki.jos.006126
Abstract:The restricted Boltzmann machine (RBM) is a probabilistic undirected graphical model, and most traditional RBM models assume that their hidden units are binary. The advantage of binary units is that their calculation and sampling processes are relatively simple; however, binarized hidden units may cause information loss in feature extraction and data reconstruction. Therefore, a key research point of RBM theory is to construct real-valued visible and hidden units while maintaining the effectiveness of model training. In this study, the binary units are extended to real-valued units to model data and extract features. Specifically, an auxiliary unit is added between the visible layer and the hidden layer, and a graph regularization term is introduced into the energy function. Based on the binary auxiliary unit and the graph regularization term, data on the manifold has a higher probability of being mapped to a parameterized truncated Gaussian distribution, while data far from the manifold has a higher probability of being mapped to Gaussian noise. The hidden units can then be sampled as real values from the parameterized Gaussian distribution and the Gaussian noise. The resulting model is called the restricted Boltzmann machine with auxiliary units (ARBM). The effectiveness of the proposed model is analyzed theoretically, and its effectiveness in image reconstruction and image generation tasks is verified by experiments.
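For orientation only, the standard Bernoulli RBM energy and a generic graph-Laplacian regularizer are recalled below; ARBM's actual energy function, with its auxiliary units, differs and is defined in the paper.

```latex
% Standard Bernoulli RBM energy over visible v and hidden h:
E(\mathbf{v},\mathbf{h}) = -\mathbf{a}^{\top}\mathbf{v}
  - \mathbf{b}^{\top}\mathbf{h}
  - \mathbf{v}^{\top} W \mathbf{h}
% A generic graph regularizer over representations f(x_i), with
% similarity S_{ij} and graph Laplacian L = D - S (F stacks the f(x_i)):
\Omega = \lambda \sum_{i,j} S_{ij}\,\lVert f(x_i)-f(x_j)\rVert^{2}
       = 2\lambda\,\mathrm{Tr}\!\left(F^{\top} L F\right)
```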
SUN Zhe-Ren , HUANG Yu-Hua , CHEN Zhi-Yuan
2021, 32(12):3814-3828. DOI: 10.13328/j.cnki.jos.006109
Abstract:The surrogate-assisted evolutionary algorithm (SAEA) is an effective way to solve expensive optimization problems. This study proposes a diversity-based surrogate-assisted evolutionary algorithm (DSAEA) to solve expensive multi-objective optimization problems. DSAEA approximates each objective with a Kriging model in place of the original objective function evaluation, accelerating the optimization process of the evolutionary algorithm. It decomposes the problem into several subproblems with reference vectors, establishes the correlation between solutions and reference vectors according to the angle between them, and then computes the minimum correlative solution set. Based on this set, the candidate-producing operator and the selection operator tend to preserve diverse solutions. In addition, the training set, archive A, is updated after each iteration, deleting low-value samples according to diversity to reduce modeling time. In the experiments, large-scale 2- and 3-objective comparative experiments were conducted for DSAEA and several current popular SAEAs. Each algorithm was run independently 30 times on each test problem, and the inverted generational distance (IGD), hypervolume (HV), and running time were collected. Finally, the rank-sum test was used to analyze the results, which show that DSAEA performs better on most of the test problems and is therefore effective and feasible.
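A small sketch of the decomposition step the abstract names, under toy assumptions (random objective vectors and uniformly spread reference vectors; not DSAEA's code): each solution is associated with the reference vector forming the smallest angle with its objective vector.

```python
# Sketch: angle-based association of solutions with reference vectors.
import numpy as np

rng = np.random.default_rng(1)
objs = rng.random((20, 2))                    # 20 solutions, 2 objectives
refs = np.array([[np.cos(t), np.sin(t)]      # reference directions
                 for t in np.linspace(0, np.pi / 2, 5)])

objs_n = objs / np.linalg.norm(objs, axis=1, keepdims=True)
refs_n = refs / np.linalg.norm(refs, axis=1, keepdims=True)
angles = np.arccos(np.clip(objs_n @ refs_n.T, -1.0, 1.0))

assignment = angles.argmin(axis=1)            # nearest reference vector
for k in range(len(refs)):
    print(f"ref {k}: solutions {np.where(assignment == k)[0].tolist()}")
```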
HAN Peng-Yu , YU Zheng-Tao , GAO Sheng-Xiang , HUANG Yu-Xin , GUO Jun-Jun
2021, 32(12):3829-3838. DOI: 10.13328/j.cnki.jos.006110
Abstract:Case-related public opinion summarization is the task of extracting a few sentences that summarize the subject information from a set of case-related news documents. It can be regarded as multi-document summarization in a specific field: compared with general multi-document summarization, the topic information can be characterized by case elements that run through the entire text cluster. In text clusters, sentences are associated with one another, and case elements are associated with sentences to varying degrees; these associations play an important role in extracting summary sentences. A case-related public opinion summarization method is proposed based on graph convolution over a sentence association graph with case elements. It models the whole text cluster with a graph structure, taking sentences as the main nodes and words and case elements as auxiliary nodes to enhance the relationships between sentences, and uses multiple features to compute the relationships between different nodes. A graph convolutional neural network is then used to learn over this sentence association graph and classify sentences to obtain candidate summary sentences. Finally, the candidate sentences are deduplicated and ranked to obtain the case-related public opinion summary. Experiments on a case-related public opinion summarization dataset show that the method achieves better results than the benchmark models, indicating that both the graph construction method and the graph convolution learning method are effective.
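A minimal sketch of one symmetric-normalized GCN layer over a sentence association graph, with a per-sentence classification head; the adjacency, features, and dimensions are toy stand-ins, not the paper's multi-feature graph.

```python
# Sketch: one GCN layer H' = ReLU(D^-1/2 (A+I) D^-1/2 H W), then a
# linear head scoring each sentence node as summary / not summary.
import torch

def gcn_layer(A, H, W):
    A_hat = A + torch.eye(A.size(0))            # add self-loops
    d = A_hat.sum(dim=1)
    D_inv_sqrt = torch.diag(d.pow(-0.5))
    return torch.relu(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

n, f, hdim = 6, 16, 8                           # 6 sentence nodes
A = (torch.rand(n, n) > 0.5).float()
A = ((A + A.T) > 0).float()                     # symmetrize
H = torch.randn(n, f)
W1 = torch.randn(f, hdim)
W2 = torch.randn(hdim, 2)

H1 = gcn_layer(A, H, W1)
logits = H1 @ W2
print(logits.softmax(dim=1))                    # per-sentence scores
```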
YU Rui-Yun , LIN Fu-Yu , GAO Ning-Wei , LI Jie
2021, 32(12):3839-3851. DOI: 10.13328/j.cnki.jos.006115
Abstract:With the increasing popularity of taxi services such as Didi and Uber, passenger demand prediction has gradually become an important part of smart cities and smart transportation. An accurate prediction model can not only meet users' travel needs but also reduce the no-load rate of road vehicles, effectively avoiding waste of resources and relieving traffic pressure. Vehicle service providers can collect large amounts of GPS data and passenger demand data, but how to use this big data to forecast demand is a key practical problem. This study proposes a deformable convolution spatial-temporal network (DCSN) model that combines urban POI (point of interest) information to predict regional ride demand. Specifically, the model consists of two parts: a deformable convolution spatial-temporal model and a POI demand correlation model. The former models the correlation between future demand and time and space through deformable convolutional networks and LSTM, while the latter captures similarity relationships among regions through a regional POI differentiation index and a demand differentiation index. Finally, the two models are integrated by a fully connected network to obtain the prediction results. Experiments on large-scale real ride demand data from Didi trips show that the proposed method outperforms existing forecasting methods in prediction accuracy.
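A hedged sketch of the spatial-temporal backbone (shapes, sizes, and the offset predictor are illustrative, not the paper's DCSN configuration): torchvision's DeformConv2d is applied to each time step of a demand grid, and an LSTM runs over the per-frame features.

```python
# Sketch: deformable convolution per frame + LSTM over time.
import torch
from torch import nn
from torchvision.ops import DeformConv2d

B, T, C, H, W = 2, 6, 1, 8, 8                     # batch, time, grid size
offset_conv = nn.Conv2d(C, 2 * 3 * 3, 3, padding=1)  # offsets per 3x3 tap
deform = DeformConv2d(C, 16, kernel_size=3, padding=1)
lstm = nn.LSTM(input_size=16 * H * W, hidden_size=64, batch_first=True)
head = nn.Linear(64, H * W)                       # next-step demand/cell

x = torch.randn(B, T, C, H, W)
feats = []
for t in range(T):
    frame = x[:, t]
    out = deform(frame, offset_conv(frame))       # (B, 16, H, W)
    feats.append(out.flatten(1))
seq = torch.stack(feats, dim=1)                   # (B, T, 16*H*W)
_, (h_n, _) = lstm(seq)
pred = head(h_n[-1]).view(B, H, W)
print(pred.shape)
```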
CHEN Ming , ZHANG Lei , MA Tian-Yi
2021, 32(12):3852-3868. DOI: 10.13328/j.cnki.jos.006128
Abstract:Data privacy protection has become one of the major challenges for recommendation systems. With the release of the Cybersecurity Law of the People's Republic of China and the General Data Protection Regulation in the European Union, data privacy and security have become a worldwide concern. Federated learning can train a global model without exchanging user data, thus protecting users' privacy. Nevertheless, federated learning still faces many issues, such as the small size of local data on each device, over-fitting of local models, and data sparsity, which make it difficult to reach high accuracy. Meanwhile, with the advent of the 5G (5th-generation mobile communication technology) era, the data volume and transmission rate of personal devices are expected to be 10 to 100 times higher than current levels, which requires higher model efficiency. Knowledge distillation can transfer knowledge from a teacher model to a more compact student model so that the student approaches or even surpasses the teacher's performance, effectively addressing the problems of large model parameters and high communication cost; however, the accuracy of the student model after distillation is typically lower than that of the teacher. Therefore, a federated distillation approach with attention mechanisms is proposed for recommendation systems. First, the method introduces Kullback-Leibler divergence and a regularization term into the objective function of federated distillation to reduce the impact of heterogeneity between the teacher and student networks. It then introduces a multi-head attention mechanism that improves model accuracy by adding information to the embeddings. Finally, an improved adaptive training mechanism for the learning rate is introduced to automatically switch optimizers and choose appropriate learning rates, increasing the convergence speed of the model. Experimental results validate the efficiency of the proposed method: compared with the baselines, its training time is reduced by 52%, accuracy is increased by 13%, average error is reduced by 17%, and NDCG is increased by 10%.
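As a hedged sketch of the distillation objective (the temperature, weights, and L2 regularizer below are illustrative choices, not the paper's exact loss), the KL term compares temperature-scaled teacher and student predictions:

```python
# Sketch: KD loss = alpha * T^2 * KL(student || teacher) +
#                   (1 - alpha) * cross-entropy + L2 regularization.
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, labels, model,
                 T=2.0, alpha=0.5, weight_decay=1e-4):
    kl = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * T * T
    ce = F.cross_entropy(student_logits, labels)
    reg = weight_decay * sum(p.pow(2).sum() for p in model.parameters())
    return alpha * kl + (1 - alpha) * ce + reg

model = torch.nn.Linear(8, 4)                # toy student
x = torch.randn(16, 8)
teacher_logits = torch.randn(16, 4)          # stand-in teacher output
labels = torch.randint(0, 4, (16,))
print(distill_loss(model(x), teacher_logits, labels, model))
```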
CHEN Jing-Xia , HAO Wei , ZHANG Peng-Wei , MIN Chong-Dan , LI Yue-Chen
2021, 32(12):3869-3883. DOI: 10.13328/j.cnki.jos.006123
Abstract:This study proposes a data representation of electroencephalogram (EEG) signals that transforms 1D chain-like EEG vector sequences into 2D mesh-like matrix sequences. The mesh structure of the matrix at each time point corresponds to the distribution of EEG electrodes, which better represents the spatial correlation of EEG signals among physically adjacent electrodes. A sliding window then divides the 2D mesh sequence into segments of equal length, each segment being an EEG sample that integrates the temporal and spatial correlation of the raw recordings. Two hybrid deep learning models are also proposed: a cascaded convolutional recurrent neural network (CASC_CNN_LSTM) and a cascaded double convolutional neural network (CASC_CNN_CNN). Both use a CNN to capture the spatial correlation between physically adjacent EEG signals from the converted 2D EEG meshes; the former uses an LSTM to learn the time dependency of the EEG sequence, while the latter uses another CNN to extract deeper discriminative local spatiotemporal features. Extensive binary valence classification experiments are carried out on the large open DEAP dataset (32 subjects, 9,830,400 EEG recordings). The results show that the average classification accuracy of the proposed CASC_CNN_LSTM and CASC_CNN_CNN networks on the spatiotemporal 2D mesh-like EEG sequences reaches 93.15% and 92.37%, respectively, significantly outperforming the baseline models and state-of-the-art methods. This demonstrates that the proposed method effectively improves the accuracy and robustness of EEG emotion classification through its ability to jointly learn deeper spatiotemporally correlated features with hybrid deep neural networks.
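A small sketch of the 1D-to-2D conversion and windowing (the 9x9 layout below is a hypothetical row-major placement, not DEAP's actual electrode map, and the recording is random toy data):

```python
# Sketch: map each 32-channel EEG sample vector into a sparse 2D mesh
# mirroring electrode positions, then cut the stream into windows.
import numpy as np

MESH = np.full((9, 9), -1)        # -1 marks cells with no electrode
for ch in range(32):              # hypothetical placement, not DEAP's
    MESH[ch // 8 + 2, ch % 8] = ch

def to_mesh(sample_1d):
    mesh = np.zeros((9, 9))
    rows, cols = np.where(MESH >= 0)
    mesh[rows, cols] = sample_1d[MESH[rows, cols]]
    return mesh

def windows(recording, win=128):  # recording: (channels, time)
    return [np.stack([to_mesh(recording[:, t]) for t in range(s, s + win)])
            for s in range(0, recording.shape[1] - win + 1, win)]

rec = np.random.randn(32, 512)    # 4 s at 128 Hz, toy data
segs = windows(rec)
print(len(segs), segs[0].shape)   # 4 segments of shape (128, 9, 9)
```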
CHEN Xiao-Qi , XIE Zhen-Ping , LIU Yuan , ZHAN Qian-Yi
2021, 32(12):3884-3900. DOI: 10.13328/j.cnki.jos.006118
Abstract:Data sampling is an important means of efficiently extracting useful information from huge original datasets. To meet the requirements of efficiently handling ever larger-scale data, a novel incremental data sampling method derived from the affinity propagation method is proposed, which integrates two algorithmic strategies: hierarchical incremental processing and dynamic weighting of data samples. The proposed method balances computational efficiency and sampling quality well. The hierarchical incremental processing strategy first samples data items in batches and then composites the samples hierarchically. The dynamic weighting strategy dynamically re-weights the preference during incremental sampling to retain better global consistency of the samples over the data space. In the experiments, artificial datasets, UCI datasets, and image datasets are used to analyze sampling performance. The results with several compared algorithms indicate that the proposed method achieves similar sampling quality with notably higher computational efficiency, especially on larger datasets. The study further applies the new method to the data augmentation task in deep learning, and the corresponding results show that it performs excellently: when the basic training dataset is enhanced by sampling with the proposed method, the performance of models trained on a similar number of training samples is significantly improved compared with traditional data augmentation strategies.
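A hedged sketch of the hierarchical incremental idea (the preference re-weighting rule below is an illustrative choice, not the paper's exact scheme): run affinity propagation per batch, then re-cluster the exemplars with preferences weighted by how many points each exemplar represents.

```python
# Sketch: batched affinity propagation, then weighted re-clustering
# of the batch exemplars to produce the final sample.
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(200, 2)) for c in (0, 3, 6)])

exemplars, counts = [], []
for batch in np.array_split(X, 6):               # incremental batches
    ap = AffinityPropagation(damping=0.7, random_state=0).fit(batch)
    for k, idx in enumerate(ap.cluster_centers_indices_):
        exemplars.append(batch[idx])
        counts.append(np.sum(ap.labels_ == k))

E, w = np.array(exemplars), np.array(counts, dtype=float)
pref = np.log(w / w.sum())                       # heavier -> higher pref
final = AffinityPropagation(damping=0.7, preference=pref,
                            random_state=0).fit(E)
print("final sample size:", len(final.cluster_centers_indices_))
```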
ZHOU Bo-Yang , CHEN Chun-Yu , WANG Qiang , ZHOU Fu-Cai
2021, 32(12):3901-3916. DOI: 10.13328/j.cnki.jos.006129
Abstract:To solve the problems of high preprocessing cost and public verifiability in verifiable outsourced database schemes, a publicly verifiable outsourced database scheme with full delegation is proposed. The architecture and the definitions of correctness and security of the model are presented. Based on bilinear maps and a verifiable outsourced modular exponentiation protocol, a publicly verifiable outsourced database scheme with full delegation is constructed, and each algorithm is designed in detail. A rigorous security proof is presented under the bilinear Diffie-Hellman exponent (BDHE) assumption. Compared with the scheme without full delegation and with existing schemes, the data owner in the proposed scheme outsources more operations to the cloud thanks to the verifiable outsourced modular exponentiation protocol. Theoretical analysis and simulation confirm that the cost of the proposed scheme is lower in the preprocessing phase, making it more efficient and practical. In the verification phase, any user can verify the result, since the verification algorithm does not take any secret key as input; the proposed scheme therefore achieves public verifiability.
TIAN Zhen , PAN La-Mei , YIN Pu , WANG Rui
2021, 32(12):3917-3928. DOI: 10.13328/j.cnki.jos.006141
Abstract:Matrix factorization is widely used in collaborative filtering recommendation algorithms because of its simplicity and ease of implementation, but it models the non-linear interaction between users and items with a simple linear inner product, which limits the model's expressive power. He et al. proposed the generalized matrix factorization model, which extends matrix factorization through a non-linear activation function and connection weights, giving the model the ability to capture second-order non-linear interactions between users and items. Nevertheless, generalized matrix factorization is a shallow model and does not capture higher-order interactions between users and items, which may limit its performance. Inspired by the generalized matrix factorization model, this study proposes a deep matrix factorization model, abbreviated DMF. Building on generalized matrix factorization, hidden layers are introduced, and a deep neural network is used to learn the higher-order interactions between users and items. The deep matrix factorization model has good expressive power: it not only overcomes the linearity of the simple inner product but also models higher-order interactions between users and items. In addition, rich comparative experiments are performed on two datasets, MovieLens and Anime; the results confirm the model's feasibility and effectiveness, and the optimal parameters of the model are determined through experiments.
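A minimal sketch of a GMF-style deep matrix factorization model (the layer sizes are illustrative, not the paper's tuned configuration): the element-wise product of user and item embeddings is passed through hidden layers before the prediction.

```python
# Sketch: GMF interaction (element-wise product) + deep hidden layers.
import torch
from torch import nn

class DMF(nn.Module):
    def __init__(self, n_users, n_items, dim=32, hidden=(64, 32)):
        super().__init__()
        self.user = nn.Embedding(n_users, dim)
        self.item = nn.Embedding(n_items, dim)
        layers, prev = [], dim
        for h in hidden:                       # deep part over the
            layers += [nn.Linear(prev, h), nn.ReLU()]  # GMF interaction
            prev = h
        self.mlp = nn.Sequential(*layers, nn.Linear(prev, 1))

    def forward(self, u, i):
        z = self.user(u) * self.item(i)        # generalized MF term
        return torch.sigmoid(self.mlp(z)).squeeze(-1)

model = DMF(n_users=100, n_items=200)
u = torch.randint(0, 100, (8,))
i = torch.randint(0, 200, (8,))
print(model(u, i))                             # predicted preferences
```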
GUO Yu-Hong , TONG Yun-Hai , SU Yan-Qing
2021, 32(12):3929-3944. DOI: 10.13328/j.cnki.jos.006101
Abstract:Existing randomization methods for privacy-preserving frequent pattern mining use a uniform randomization parameter for all individuals, without considering differences in privacy requirements. Such equal protection cannot satisfy individuals' privacy preferences. This study proposes a privacy-preserving frequent pattern mining method based on grouping randomization (GR-PPFM). In this method, individuals are grouped according to their privacy protection requirements, and each group of data is assigned a privacy protection level with a corresponding randomization parameter. Experimental results on both synthetic and real-world data show that, compared with the uniform single-parameter randomization of MASK, the multi-parameter grouping randomization of GR-PPFM can not only meet the diverse privacy protection needs of different groups but also improve the accuracy of mining results under the same overall privacy protection level.
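A small sketch of grouping randomization under toy assumptions (three hypothetical groups with illustrative retention probabilities): each group flips item bits with its own probability, and the true item support is recovered per group with the standard randomized-response inversion s = (s_obs + p - 1) / (2p - 1).

```python
# Sketch: per-group bit randomization and per-group support estimation.
import numpy as np

rng = np.random.default_rng(0)
true_bits = rng.random(30000) < 0.2          # record contains the item?
groups = rng.integers(0, 3, size=true_bits.size)
p_g = np.array([0.9, 0.8, 0.7])              # per-group retention prob.

keep = rng.random(true_bits.size) < p_g[groups]
observed = np.where(keep, true_bits, ~true_bits)

est = 0.0
for g in range(3):
    mask = groups == g
    s_obs = observed[mask].mean()
    est += mask.mean() * (s_obs + p_g[g] - 1) / (2 * p_g[g] - 1)
print(f"true support {true_bits.mean():.3f}, estimated {est:.3f}")
```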
YANG Shu , CHEN Zi-Teng , CUI Lai-Zhong , MING Zhong-Xing , CHENG Lu , TANG Xiao-Lin , XIAO Wei
2021, 32(12):3945-3959. DOI: 10.13328/j.cnki.jos.006113
Abstract:With the development of big data and machine learning, network traffic and data computation are growing fast. Researchers have developed the content delivery network (CDN), edge computing, and other technologies to meet these challenges. Nevertheless, the CDN only addresses data storage, and managing and scheduling resources among edge clusters remains challenging in edge computing. Containerization has been widely employed in edge computing, but current container orchestrators use inefficient orchestration schemes, which leads to high computation latency. This study proposes the function delivery network (FDN). On the one hand, FDN provides an interface and a containerized computation platform for users to access edge computing resources; on the other hand, FDN optimizes resource utilization and computation latency by orchestrating containers onto appropriate edge clusters. Moreover, a heuristic container orchestration algorithm is developed that enables inter-cluster container orchestration. The FDN system is implemented based on OpenWhisk, deployed in the China Mobile network, and evaluated. The results show that the proposed FDN system decreases task computation latency, and the heuristic container orchestration algorithm outperforms traditional container orchestration schemes.
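As an illustrative sketch only (the cost model and cluster parameters below are hypothetical, not FDN's actual algorithm), a greedy inter-cluster placement rule can pick, for each function container, the cluster minimizing an estimated latency of network RTT plus a load-dependent queueing penalty:

```python
# Sketch: greedy placement by estimated latency = RTT + queue penalty.
clusters = [  # name, round-trip time (ms), capacity, current load
    {"name": "edge-A", "rtt": 5.0,  "cap": 4,  "load": 0},
    {"name": "edge-B", "rtt": 12.0, "cap": 8,  "load": 0},
    {"name": "cloud",  "rtt": 40.0, "cap": 64, "load": 0},
]

def place(task_ms):
    def cost(c):
        queue_penalty = task_ms * c["load"] / c["cap"]
        return c["rtt"] + queue_penalty
    best = min(clusters, key=cost)
    best["load"] += 1
    return best["name"]

for t in range(10):
    print(f"task {t} -> {place(task_ms=20.0)}")
```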
DENG Cheng-Long , GUAN Bei , LIU De-Feng , LIU Lan-Xiang , SHI Qing-Lei , WANG Hao-Ran , WANG Yong-Ji
2021, 32(12):3960-3976. DOI: 10.13328/j.cnki.jos.006136
Abstract:For patients with cervical squamous cell carcinoma (SCC) of stage IIB~IVA, complete or incomplete remission may occur in the tumor area after radiotherapy and chemotherapy. According to clinical experience, if the tumor area is not completely relieved after chemoradiotherapy, the patient's survival rate is very low, and other treatments such as surgery or oral targeted drugs are rarely effective. It is therefore necessary to screen out patients who are insensitive to radiotherapy and chemotherapy before treatment and explore personalized treatment plans for them. In view of the above, this paper regards prediction of the efficacy of radiotherapy and chemotherapy as an image classification problem, proposes a model based on the random forest algorithm to predict the efficacy for SCC, and screens out insensitive patients. First, the 3D SCC MRI (magnetic resonance imaging) data are preprocessed by wavelet transform and Laplacian-of-Gaussian filtering; second, U-Net is used to segment the tumor area in the MR images; then, combining the 3D SCC MRI with the corresponding tumor segmentation results, texture and shape features of the lesions are extracted and screened to train the random forest. The experimental dataset consists of pre-treatment MR image slices of 85 patients with stage IIB~IVA SCC. The experimental results show that the random-forest-based model predicts the efficacy of radiotherapy and chemotherapy for SCC with an AUC of 0.921, better than the most advanced prediction methods.
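A hedged sketch of the classification stage only (the feature matrix below is a random stand-in for the extracted texture/shape descriptors, and the feature-screening step is simplified to univariate selection): screened features feed a random forest evaluated by AUC.

```python
# Sketch: feature screening + random forest + AUC, on stand-in data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(85, 120))          # 85 patients, 120 toy features
y = rng.integers(0, 2, size=85)         # 1 = complete remission (toy)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)
clf = make_pipeline(SelectKBest(f_classif, k=20),
                    RandomForestClassifier(n_estimators=300,
                                           random_state=0))
clf.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```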
JIANG Ze-Tao , QIN Lu-Lu , QIN Jia-Qi , ZHANG Shao-Qin
2021, 32(12):3977-3991. DOI: 10.13328/j.cnki.jos.006112
Abstract:Due to low-quality problems such as low brightness, poor contrast, noise, and color imbalance, images collected in low-illumination environments seriously degrade the performance of downstream image processing applications. The purpose of this paper is to improve the quality of low-illumination images and obtain natural, clear images with complete structure and details. Combining Retinex theory and convolutional neural networks, this paper proposes a low-light image enhancement method based on MDARNet, which includes attention mechanism modules and dense convolution modules to improve performance. Firstly, MDARNet uses convolution kernels of three different scales, containing both two-dimensional and one-dimensional kernels, for preliminary feature extraction, and a pixel attention module for targeted learning on the multi-scale feature maps. Secondly, a skip-connection structure is designed for feature extraction so that image features can be utilized to the greatest extent. Finally, a channel attention module and a pixel attention module are employed to perform weight learning and illumination estimation on the extracted feature maps simultaneously. The experimental results show that MDARNet can effectively improve the brightness, contrast, and color of low-light images; compared with some classical algorithms, it achieves better results in both visual effects and objective evaluation (PSNR, SSIM, MS-SSIM, MSE).
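A hedged sketch of the two attention blocks the abstract names (layer sizes and structure are illustrative, not MDARNet's modules): channel attention re-weights feature maps per channel, and pixel attention gates each spatial location.

```python
# Sketch: squeeze-and-excitation-style channel attention and a
# per-location pixel attention gate over feature maps.
import torch
from torch import nn

class ChannelAttention(nn.Module):
    def __init__(self, ch, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(ch, ch // reduction, 1),
            nn.ReLU(), nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.fc(x)               # per-channel weights

class PixelAttention(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(ch, 1, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.conv(x)             # per-pixel weights

x = torch.randn(1, 16, 64, 64)              # toy feature maps
out = PixelAttention(16)(ChannelAttention(16)(x))
print(out.shape)
```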
LIU Chang , WU Yan-Jun , WU Jing-Zheng , ZHAO Chen
2021, 32(12):3992-4024. DOI: 10.13328/j.cnki.jos.006490
Abstract:The ISA (instruction set architecture) is the interface specification between software and hardware, and the origin point of an information technology ecosystem. RISC-V is the inevitable product of computer architecture gradually moving towards openness. It brings a new paradigm for systems research, i.e., software research issues can be tracked down to the ISA, which expands or even subverts traditional full-stack design theory on system function, performance, security, and other issues, showing a promising development prospect. This paper reviews research results on the RISC-V architecture in recent years. Firstly, the development status of the RISC-V instruction set is introduced, and the scope of instruction sets that deserve attention in RISC-V research is pointed out. Secondly, current RISC-V CPU platforms, particularly RISC-V processors, are analyzed, and their design points and application scopes are summarized. Then, focusing on RISC-V CPU design, the paper discusses four fundamental research topics: instruction set, function implementation, performance improvement, and security strategy, and reviews recent research results on each. Finally, with the help of specific cases, the paper expounds the role of RISC-V in specific domains and analyzes possible future directions of RISC-V research.
ZHANG Yun-Peng , WANG Hong-Yuan , ZHANG Ji , CHEN Li , WU Lin-Yu , GU Jia-Hui , CHEN Qiang
2021, 32(12):4025-4035. DOI: 10.13328/j.cnki.jos.006108
Abstract:To address the difficulty of labeling video-based person re-identification datasets, a neighborhood center iteration strategy for one-shot video-based person re-identification is proposed in this study, which gradually optimizes the network with pseudo-labeled tracklets to obtain the best model. To address the low accuracy of pseudo labels predicted for unlabeled tracklets, a novel label evaluation method is proposed: after each round of training, the center points of each class in the features of the selected pseudo-labeled and labeled tracklets are used as the measurement centers for predicting pseudo labels in the next round. At the same time, a loss control strategy based on cross-entropy loss and online instance matching loss is proposed, which makes the training process more stable and the pseudo labels more accurate. Experiments on two large datasets, MARS and DukeMTMC-VideoReID, demonstrate that the proposed method outperforms current state-of-the-art methods.
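A small sketch of the label evaluation idea, under toy assumptions (random feature vectors stand in for tracklet features, and the 30% selection ratio is illustrative): class centers from labeled data serve as measurement centers, unlabeled tracklets take the label of the nearest center, and only the most confident fraction is kept for the next round.

```python
# Sketch: nearest-class-center pseudo labeling with confidence selection.
import numpy as np

rng = np.random.default_rng(0)
feat_lab = rng.normal(size=(60, 128))      # labeled tracklet features
labels = rng.integers(0, 6, size=60)
feat_unl = rng.normal(size=(200, 128))     # unlabeled tracklet features

centers = np.stack([feat_lab[labels == c].mean(axis=0) for c in range(6)])
dists = np.linalg.norm(feat_unl[:, None, :] - centers[None, :, :], axis=2)

pseudo = dists.argmin(axis=1)              # nearest measurement center
conf = -dists.min(axis=1)                  # nearer center = higher conf
keep = np.argsort(conf)[-int(0.3 * len(feat_unl)):]   # top 30%
print("selected", len(keep), "pseudo-labeled tracklets")
```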