ZHANG Su, ZHANG Ying, ZHANG Wei, HUANG Gang
2024, 35(11):4949-4972. DOI: 10.13328/j.cnki.jos.007001
Abstract:As an important production factor, data need to be exchanged between different entities to create value. In this process, data integrity needs to be ensured; in other words, data cannot be tampered with without authorization, or otherwise extremely serious consequences may follow. Existing work realizes data evidence preservation by combining distributed ledgers with data encryption and verification technology to ensure the integrity of the data to be exchanged during transmission, storage, and other related data processing phases. However, it is difficult for such work to confirm the integrity of the data provided by the data supplier: once the data supplier provides forged data, all subsequent integrity assurance becomes meaningless. Therefore, this study proposes a method for verifying the integrity of data services based on remote attestation. By using the trusted execution environment as the trust anchor, this method can measure and verify the integrity of the static code, execution process, and execution result of a specific data service. It also optimizes the integrity verification of a specific data service through program slicing, thus extending the scope of data integrity assurance to the time point when the data supplier provides the data. A series of experiments is carried out on 25 data services of three real Java information systems to validate the proposed method.
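The measurement-and-verification flow described above can be illustrated with a minimal, hedged sketch: a SHA-256 hash of the (sliced) service code and of the returned result stands in for the TEE-backed measurements, and the verifier compares them against reference values. All function names and the toy service below are hypothetical; the paper's actual protocol relies on a trusted execution environment and program slicing rather than plain hashing.

```python
import hashlib
import json

def measure(code_bytes: bytes, result_bytes: bytes) -> dict:
    """Produce an integrity report for one data-service invocation."""
    return {
        "code_hash": hashlib.sha256(code_bytes).hexdigest(),
        "result_hash": hashlib.sha256(result_bytes).hexdigest(),
    }

def verify(report: dict, expected_code_hash: str, result_bytes: bytes) -> bool:
    """Check that the measured code matches the expected reference value
    and that the returned data matches the measured result."""
    return (report["code_hash"] == expected_code_hash and
            report["result_hash"] == hashlib.sha256(result_bytes).hexdigest())

# Toy usage: the verifier holds a reference hash of the sliced service code.
service_code = b"public List<Row> query(...) { ... }"     # hypothetical slice
result = json.dumps([{"id": 1, "value": 42}]).encode()
report = measure(service_code, result)
assert verify(report, hashlib.sha256(service_code).hexdigest(), result)
```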
CAO Xue-Jie, CHEN Jun-Jie, YAN Ming, YOU Han-Mo, WU Zhuo, WANG Zan
2024, 35(11):4973-4992. DOI: 10.13328/j.cnki.jos.007005
Abstract:Nowadays, deep neural networks (DNNs) are widely used in autonomous driving, medical diagnosis, speech recognition, face recognition, and other safety-critical fields. Therefore, DNN testing is critical to ensure the quality of DNNs. However, labeling test cases to judge whether the DNN model predictions are correct is costly. Therefore, selecting test cases that reveal incorrect behavior of DNN models and labeling them earlier can help developers debug DNN models as soon as possible, thus improving the efficiency of DNN testing and ensuring the quality of DNN models. This study proposes a test case selection method based on data mutation, namely DMS. In this method, a data mutation operator is designed and implemented to generate mutation models to simulate model defects and capture the dynamic bug-revealing pattern of test cases, so as to evaluate the bug-revealing ability of test cases. Experiments are conducted on 25 combinations of deep learning test sets and models. The results show that DMS is significantly better than existing test case selection methods in terms of both the proportion of bug-revealing cases and the diversity of bug-revealing directions among the selected samples. Specifically, taking the original test set as the candidate set, DMS can filter out 53.85%–99.22% of all bug-revealing test cases when selecting 10% of the test cases. Moreover, when 5% of the test cases are selected, the cases selected by DMS can cover almost all bug-revealing directions. Compared with the eight comparison methods, DMS finds 12.38%–71.81% more bug-revealing cases on average, which demonstrates the significant effectiveness of DMS in test case selection.
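As a hedged illustration of mutation-based selection, the sketch below scores each unlabeled test case by how many mutant models disagree with the original model on it and then selects the highest-scoring cases. This is a simplified stand-in written for this listing; DMS defines its own data mutation operators and dynamic bug-revealing patterns, which are not reproduced here.

```python
import numpy as np

def mutation_score(orig_pred: np.ndarray, mutant_preds: np.ndarray) -> np.ndarray:
    """orig_pred: (N,) predicted labels of the original model.
    mutant_preds: (M, N) predicted labels of M mutant models.
    Returns a per-test-case score: the fraction of mutants whose prediction
    differs from the original model ('killed' by that input)."""
    return (mutant_preds != orig_pred[None, :]).mean(axis=0)

def select(orig_pred, mutant_preds, budget):
    scores = mutation_score(orig_pred, mutant_preds)
    return np.argsort(-scores)[:budget]   # indices of the highest-scoring cases

# Toy usage with random predictions for 1000 candidate inputs and 20 mutants.
rng = np.random.default_rng(0)
orig = rng.integers(0, 10, size=1000)
mutants = np.where(rng.random((20, 1000)) < 0.1,
                   rng.integers(0, 10, size=(20, 1000)), orig)
picked = select(orig, mutants, budget=100)
```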
LIU Jing-Yu, LI Xuan-Song, CHEN Zhi-Fei, YE Hai-Bo, SONG Wei
2024, 35(11):4993-5015. DOI: 10.13328/j.cnki.jos.007031
Abstract:The utilization range of Internet of Things (IoT) devices keeps expanding, and model checking is an effective approach to improving the reliability and security of such devices. However, commonly adopted model checking methods cannot adequately describe the cross-space movement and communication behavior that is common in such devices. To this end, this study proposes a modeling and verification method for the mobile and communication behavior of IoT devices to verify their spatio-temporal properties. Push/pull actions and a global communication mechanism are integrated into ambient calculus to propose the ambient calculus with global communication (ACGC), and a model checking algorithm for ACGC against the ambient logic is provided. Then, the modeling language for mobility and communication (MLMC) is put forward to describe the mobile and communication behavior of IoT devices, and a method to convert an MLMC-based description into an ACGC model is given. Furthermore, a model checking tool, ACGCCk, is implemented to verify whether the properties of IoT devices are satisfied, and some optimizations are conducted to accelerate the checking. Finally, the effectiveness of the proposed method is demonstrated by a case study and experimental analysis.
ZUO Zheng-Kang, HUANG Zhi-Peng, HUANG Qing, SUN Huan, ZENG Zhi-Cheng, HU Ying, WANG Chang-Jing
2024, 35(11):5016-5039. DOI: 10.13328/j.cnki.jos.007034
Abstract:Free from the limitations of state space, formal verification based on mechanized theorem proving is an important method to ensure software correctness and avoid serious losses caused by potential software bugs. The left-leaning red-black tree (LLRB) is a variant of the binary search tree whose structure imposes an additional left-leaning constraint on top of traditional red-black trees. During verification, conventional proof strategies cannot be employed, which requires more manual intervention and effort; thus, LLRB correctness verification is widely acknowledged as a challenging problem. To this end, based on the Isabelle verification framework for binary search tree algorithms, this study refines the additional-property part of the framework and provides a concrete verification scheme. The LLRB insertion and deletion operations are functionally modeled in Isabelle, with modular treatment of the LLRB invariants, and the functional correctness is then verified. This is the first mechanized verification of functional LLRB insertion and deletion algorithms in Isabelle. Compared with the existing Dafny verification of the LLRB algorithm, the number of theorems is reduced from 158 to 84, and no intermediate assertions need to be constructed, which alleviates the verification burden. Meanwhile, this study provides a reference for the functional modeling and verification of complex tree structure algorithms.
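For readers unfamiliar with the invariants being verified, the following is a minimal executable restatement of the LLRB well-formedness conditions (BST ordering, no red right link, no two consecutive red links, and equal black height on all paths). It is an illustrative Python checker, not the paper's Isabelle formalization, and the node representation is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = True, False

@dataclass
class Node:
    key: int
    color: bool                      # color of the link from the parent
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None

def is_red(n): return n is not None and n.color == RED

def llrb_invariant(n, lo=float("-inf"), hi=float("inf")):
    """Returns (ok, black_height). Checks BST ordering, the left-leaning rule
    (no red right link), no two consecutive red links, and equal black height
    on every path."""
    if n is None:
        return True, 1
    if not (lo < n.key < hi):
        return False, 0
    if is_red(n.right):                       # right links must be black
        return False, 0
    if is_red(n) and is_red(n.left):          # no two reds in a row
        return False, 0
    ok_l, bh_l = llrb_invariant(n.left, lo, n.key)
    ok_r, bh_r = llrb_invariant(n.right, n.key, hi)
    ok = ok_l and ok_r and bh_l == bh_r
    return ok, bh_l + (0 if is_red(n) else 1)
```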
LIU Zhe, WANG Jun-Jie, CHEN Chun-Yang, CHE Xing, SU Yu-Hui, WANG Qing
2024, 35(11):5040-5064. DOI: 10.13328/j.cnki.jos.007043
Abstract:The graphical user interface (GUI/UI) provides a visual bridge between an application and its end users, through which users interact with the application. With the development of mobile applications, GUIs that combine aesthetics and interaction design have become more and more complex, and users are increasingly concerned about the usability and accessibility of applications. However, the complexity of the GUI also brings great challenges to its design and implementation. Due to user-defined settings on mobile devices and differences in device models and screen resolutions, UI display issues frequently occur. For example, owing to software or hardware compatibility, display issues such as text overlap, component occlusion, and image loss often arise when interfaces are rendered on different devices. They have a negative impact on the usability and accessibility of applications, resulting in a poor user experience. Unfortunately, little is known about the causes of UI display issues in mobile applications. In order to cope with this challenge, this study collects 6729 screenshots of applications with UI display issues from the Baidu crowdtesting platform and 1016 application screenshots provided by issue reports on GitHub, and it identifies nine types of UI display issues using thematic analysis. Through the analysis of 1061 UI issue reports from GitHub and the corresponding defective code, the essence and causes of UI display issues are summarized. The research finds that (1) 62.1% of the screenshots in the crowdtesting dataset are defective screenshots with UI display issues; (2) to a great extent, UI display issues are caused by mismatches between the font scaling setting and the adaptive settings of components; (3) the layout settings of the interface can also lead to display issues; and (4) if hardware acceleration is not enabled, the normal display of the interface will be affected.
JIANG Jing, LIU Zi-Hao, ZHANG Li, WANG Liang
2024, 35(11):5065-5082. DOI: 10.13328/j.cnki.jos.007047
Abstract:As the scale of open-source artificial intelligence (AI) systems expands, software development and maintenance become difficult. GitHub is one of the most important hosting platforms for open-source projects in the open-source community, and developers can easily participate in the development of open-source projects through the pull request mechanism provided by GitHub. The description of a pull request helps the project's core team understand its content and the developer's intention, and it promotes the acceptance of the pull request. At present, a considerable proportion of developers do not provide descriptions for their pull requests, which not only increases the workload of the core team but also hinders the future maintenance of the project. This study proposes a method named PRSim to automatically generate descriptions for pull requests. This method extracts features including commit messages, comment updates, and code changes from pull requests, builds a syntax modification tree, and uses a tree-structured autoencoder to find other pull requests with similar code changes. Then, with the help of the description of a similar pull request, it summarizes commit messages and comment updates through an encoder-decoder network to generate the description of a new pull request. The experimental results show that PRSim achieves F1 scores of 36.47%, 27.69%, and 35.37% on the ROUGE-1, ROUGE-2, and ROUGE-L metrics, respectively, which are 34.3%, 75.2%, and 55.3% higher than those of LeadCM, 16.2%, 22.9%, and 16.8% higher than those of Attn+PG+RL, and 23.5%, 72.0%, and 24.8% higher than those of PRHAN.
ZHOU Guang-You, LI Peng-Fei, XIE Peng-Hui, LUO Chang-Yin
2024, 35(11):5083-5097. DOI: 10.13328/j.cnki.jos.007002
Abstract:Temporal knowledge graph reasoning aims to fill in missing links or facts in knowledge graphs, where each fact is associated with a specific timestamp. The dynamic variational framework based on the variational autoencoder is particularly effective for this task: by jointly modeling entities and relations with Gaussian distributions, it not only offers high interpretability but also handles complex probability distributions. However, traditional variational autoencoder-based methods often suffer from overfitting during training, which limits their ability to accurately capture the semantic evolution of entities over time. To address this challenge, this study proposes a new temporal knowledge graph reasoning model based on diffusion probability distributions. Specifically, the model uses a bi-directional iterative process to divide the entity semantic modeling process into multiple sub-modules, each of which uses a forward noising transformation and a backward Gaussian sampling to model a small-scale evolution process of entity semantics. Compared with variational autoencoder-based methods, the proposed model obtains more accurate modeling by learning the dynamic representation of entity semantics in the metric space over time through the joint modeling of multiple sub-modules. In terms of MRR, the model improves over variational autoencoder-based methods by 4.18% and 1.87% on the Yago11k and Wikidata12k datasets and by 1.63% and 2.48% on the ICEWS14 and ICEWS05-15 datasets, respectively.
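To make the forward noising / backward Gaussian sampling pair concrete, the sketch below applies standard DDPM-style updates to a toy entity embedding. It is a generic illustration under assumed parameters (linear beta schedule, 64-dimensional embedding, a perfectly predicted noise term standing in for a learned network); the paper's sub-module design and schedules are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
betas = np.linspace(1e-4, 0.02, T)              # noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def forward_noising(x0, t):
    """Closed-form forward step q(x_t | x_0) applied to an entity embedding."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1 - alpha_bars[t]) * eps, eps

def reverse_step(xt, t, eps_pred):
    """One DDPM-style reverse (Gaussian sampling) step, given a predicted noise
    eps_pred from some learned network (not shown here)."""
    mean = (xt - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps_pred) / np.sqrt(alphas[t])
    if t == 0:
        return mean
    return mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)

x0 = rng.standard_normal(64)                    # a toy 64-dim entity embedding
xt, eps = forward_noising(x0, t=50)
x_prev = reverse_step(xt, t=50, eps_pred=eps)   # perfect-prediction toy case
```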
WANG Yue-Tian, FU Si-Chao, PENG Qin-Mu, ZOU Bin, JING Xiao-Yuan, YOU Xin-Ge
2024, 35(11):5098-5115. DOI: 10.13328/j.cnki.jos.007007
Abstract:In real-world settings where data sources are diverse and manual labeling is difficult, semi-supervised multi-view classification algorithms have important research significance in various fields. In recent years, graph neural network-based semi-supervised multi-view classification algorithms have achieved great progress. However, most existing graph neural networks carry out multi-view information fusion only in the classification stage, while neglecting the multi-view information interaction of the same sample during the training stage. To solve this issue, this study proposes a model for semi-supervised classification, named multi-view interaction graph convolutional network (MIGCN). The Transformer encoder module is introduced into the graph convolution layers trained on different views, which aims to adaptively acquire complementary information between different views for the same sample during the training stage. More importantly, the study introduces a consistency constraint loss to make the similarity relationships of the final feature representations of different views as consistent as possible. This enables the graph convolutional network to better exploit the consistency and complementarity information between different views during the classification stage, thereby further improving the robustness of the fused multi-view features. Extensive experiments on several real-world multi-view datasets demonstrate that, compared with graph-based semi-supervised multi-view classification models, MIGCN can better learn the essential features of multi-view data, thereby improving the accuracy of semi-supervised multi-view classification.
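One plausible reading of the consistency constraint is sketched below: compute the pairwise cosine-similarity structure each view induces over the same batch of samples and penalize differences between these structures. The exact form of the loss in MIGCN may differ; this numpy version is only an assumption-labeled illustration.

```python
import numpy as np

def similarity_matrix(Z):
    """Row-normalized cosine similarity between samples of one view."""
    Z = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-8)
    return Z @ Z.T

def consistency_loss(view_embeddings):
    """Penalize differences between the pairwise-similarity structures that
    different views induce over the same set of samples."""
    sims = [similarity_matrix(Z) for Z in view_embeddings]
    loss, count = 0.0, 0
    for i in range(len(sims)):
        for j in range(i + 1, len(sims)):
            loss += np.mean((sims[i] - sims[j]) ** 2)
            count += 1
    return loss / max(count, 1)

rng = np.random.default_rng(0)
views = [rng.standard_normal((32, 16)) for _ in range(3)]   # 3 views, 32 samples
print(consistency_loss(views))
```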
ZHANG Dong-Ming, JIN Guo-Qing, LU Ding-Yu, ZHANG Jing, ZHANG Yong-Dong
2024, 35(11):5116-5132. DOI: 10.13328/j.cnki.jos.007033
Abstract:In natural scenes, logos such as trademarks and traffic signs are susceptible to shooting angle, carrier deformation, and scale changes, which reduces logo detection accuracy. Thus, this study proposes an attention-guided logo detection and recognition network (AGLDN) to jointly optimize model robustness to multi-scale changes and complex deformation. First, a logo synthesis dataset is established through image collection and mask generation of logo templates, selection of logo background images, and logo image generation. Then, based on RetinaNet and FPN, multi-scale features are extracted and high-level semantic feature maps are formed. Finally, an attention-guided network is employed to focus on the logo area and suppress the influence of logo deformation on feature robustness, thereby improving logo detection and recognition. Experimental results show that the proposed method can reduce the influence of scale changes and non-rigid deformation and improve detection accuracy.
WANG Hao-Cheng, HE Rui-Fang, WU Chen-Hao, LIU Huan-Yu
2024, 35(11):5133-5148. DOI: 10.13328/j.cnki.jos.007035
Abstract:Detecting latent topics in social media texts is a meaningful task, but short and informal posts cause serious data sparsity. Additionally, models based on variational autoencoders (VAEs) ignore the social relationships among users during topic inference, and the VAE assumes that each input data point is independent. This results in a lack of correlation information between the inferred latent topic variables and in incoherent topics. Social network structure information can not only provide clues for aggregating contextual messages but also indicate topic correlation among users. Therefore, this study proposes a microblog topic model based on message passing and a graph prior distribution. The model encodes richer context information with a graph convolutional network (GCN) and integrates users' interaction relationships through the graph prior distribution during VAE topic inference, so as to better capture the complex correlation among multiple data points and mine social media topic information. Experiments on three real-world datasets validate the effectiveness of the proposed model.
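The message-passing step the model relies on is ordinary graph convolution; a minimal numpy sketch of one propagation layer over a toy user/message graph is given below. The graph, feature sizes, and weight matrix are all assumptions for illustration, and the paper's full architecture (including the VAE with a graph prior) is not reproduced.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution propagation step over a user/message graph:
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(0)
A = (rng.random((20, 20)) < 0.2).astype(float)   # toy social/reply graph
A = np.maximum(A, A.T)                            # make it symmetric
H = rng.standard_normal((20, 32))                 # message/user features
W = rng.standard_normal((32, 16))
H1 = gcn_layer(A, H, W)
```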
CAI Yu-Xiang, LUO Da, GAN Yang-Lei, HOU Rui, LIU Xue-Yi, LIU Qiao, SHI Xiao-Jun
2024, 35(11):5149-5162. DOI: 10.13328/j.cnki.jos.007040
Abstract:Named entity recognition (NER) is a fundamental task in information extraction that aims to locate the boundaries of entities in a sentence and classify them. To address the fuzzy boundaries of nested entities in span-based detection models, this study proposes a nested NER model based on span boundary perception. Firstly, it utilizes a biaffine attention mechanism to capture the semantic relevance among word tokens and generate a span semantic representation matrix. Secondly, it designs a second-order diagonal neighborhood difference operator and establishes a span semantic difference mechanism to extract semantic difference information among spans. Additionally, a span boundary perception mechanism is introduced, which employs the local feature extraction ability of sliding windows to enhance the semantic differences at span boundaries and thereby accurately locate entity spans. The model is validated on three benchmark datasets: ACE04, ACE05, and Genia. The experimental results show that the proposed model outperforms related work in entity recognition accuracy. Additionally, the study conducts ablation experiments and case studies to verify the effectiveness of the proposed semantic difference mechanism and span boundary perception mechanism, providing new ideas and empirical evidence for further research on NER.
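As one possible reading of the sliding-window boundary enhancement (not the paper's exact operator), the sketch below contrasts each cell of a span score matrix with the mean of its local k×k neighborhood so that spans standing out from their surroundings are emphasized. The matrix, window size, and scoring are all hypothetical.

```python
import numpy as np

def boundary_contrast(S, k=3):
    """Enhance each cell of a span score matrix S (S[i, j] scores the span from
    token i to token j) by subtracting the mean of its k x k neighborhood, so
    cells that stand out from their local context are emphasized."""
    n = S.shape[0]
    out = np.zeros_like(S)
    r = k // 2
    for i in range(n):
        for j in range(n):
            window = S[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            out[i, j] = S[i, j] - window.mean()
    return out

rng = np.random.default_rng(0)
scores = rng.random((12, 12))            # toy span score matrix for 12 tokens
enhanced = boundary_contrast(scores)
```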
ZHENG Xiao, WANG Quan-Xin, HUANG Jun
2024, 35(11):5163-5178. DOI: 10.13328/j.cnki.jos.007044
Abstract:Due to the complex features of multi-view data, multi-view outlier detection has become a very challenging research topic in outlier detection. There are three types of outliers in multi-view data, namely class outliers, attribute outliers, and class-attribute outliers. Most early multi-view outlier detection methods are based on the clustering assumption, which makes it difficult to detect outliers when the data have no clustering structure. In recent years, many multi-view outlier detection methods have replaced the clustering assumption with the multi-view consistent nearest neighbor assumption, but they still detect new data inefficiently. In addition, most existing multi-view outlier detection methods are unsupervised, which means they are affected by outliers during model learning and do not work well on datasets with high outlier rates. To address these issues, this study proposes an intra-view reconstruction and cross-view generation network for effective multi-view outlier detection, which detects all three types of outliers and consists of two modules: intra-view reconstruction and cross-view generation. By training on normal data, the proposed method can fully capture the features of each view of the normal data and better reconstruct and generate the corresponding views. In addition, a new outlier score calculation method is proposed to assign each sample an outlier score, so that new data can be detected efficiently. Extensive experimental results show that the proposed method significantly outperforms existing methods. This work applies a deep model based on generative adversarial networks to multi-view outlier detection.
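A hedged sketch of the scoring idea: combine each sample's intra-view reconstruction error with its cross-view generation error, using trained models passed in as callables. The weighting, the identity "networks" in the toy usage, and all names are assumptions; the paper's actual score definition may differ.

```python
import numpy as np

def outlier_scores(views, reconstructors, generators, w=0.5):
    """views: list of (N, d_v) arrays, one per view.
    reconstructors[v]: trained intra-view autoencoder for view v (callable).
    generators[(u, v)]: trained network mapping view u to view v (callable).
    The score combines intra-view reconstruction error with cross-view
    generation error; higher scores indicate likely outliers."""
    n_views, n = len(views), views[0].shape[0]
    recon_err = np.zeros(n)
    cross_err = np.zeros(n)
    for v in range(n_views):
        recon_err += np.linalg.norm(views[v] - reconstructors[v](views[v]), axis=1)
        for u in range(n_views):
            if u != v:
                cross_err += np.linalg.norm(views[v] - generators[(u, v)](views[u]), axis=1)
    return w * recon_err + (1 - w) * cross_err

# Toy usage with identity "networks" standing in for trained models.
rng = np.random.default_rng(0)
views = [rng.standard_normal((10, 4)) for _ in range(2)]
identity = lambda x: x
scores = outlier_scores(views, {0: identity, 1: identity},
                        {(0, 1): identity, (1, 0): identity})
```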
SONG Ling-Yun, LIU Zhi-Zhen, ZHANG Yang, LI Zhan-Huai, SHANG Xue-Qun
2024, 35(11):5179-5195. DOI: 10.13328/j.cnki.jos.007051
Abstract:A heterogeneous graph is a graph with multiple types of nodes and edges, also known as a heterogeneous information network, and it is often used to model systems with rich features and association patterns in the real world. Link prediction between heterogeneous nodes is a fundamental task in network analysis. In recent years, the development of heterogeneous graph neural networks (HGNNs) has greatly advanced link prediction, which is usually treated as a feature similarity analysis between nodes or a binary classification problem based on paired node features. However, when learning node feature representations, existing HGNNs usually focus only on the associations between adjacent nodes or on meta-path-based structural information. This not only makes it difficult for these HGNNs to capture the semantic information of the ring structures inherent in heterogeneous graphs but also ignores the complementarity of structural information at different levels. To solve these issues, this study proposes a cascade graph convolution network based on multi-level graph structures (CGCN-MGS), which is composed of graph neural networks based on three graph structures of different levels: neighboring, meta-path, and ring structures. CGCN-MGS can mine rich and complementary information from multi-level features and improve the representation ability of the learned node features with respect to the semantic and structural information of nodes. Experimental results on several benchmark datasets show that CGCN-MGS achieves state-of-the-art performance on the link prediction of heterogeneous graphs.
ZHAI De-Ming, SHEN Si-Xian, ZHOU Xiong, JIANG Jun-Jun, LIU Xian-Ming, JI Xiang-Yang
2024, 35(11):5196-5209. DOI: 10.13328/j.cnki.jos.007054
Abstract:Deep learning has been widely employed in many fields and yields excellent performance. However, this often requires the support of large amounts of labeled data, which usually means high costs and harsh application conditions. Therefore, with the development of deep learning, how to break through data limitations in practical scenarios has become an important research problem. As one of the most important research directions, semi-supervised learning greatly relieves the data requirement pressure of deep learning by learning with the assistance of abundant unlabeled data and a small amount of labeled data. The pseudo-labeling method plays a significant role in semi-supervised learning, and the quality of the generated pseudo labels influences the final results of semi-supervised learning. Focusing on pseudo-labeling in semi-supervised learning, this study proposes a pseudo-labeling method based on optimal transport theory, which constrains the pseudo-labeling procedure by using the labeled data to guide the generation process. On this basis, the pseudo-labeling procedure is converted into an optimal transport optimization problem, which offers a new formulation for pseudo-label generation. Meanwhile, to solve this problem, the study introduces the Sinkhorn-Knopp algorithm for fast approximate solutions, avoiding a heavy computational burden. As an independent module, the proposed method can be combined with other semi-supervised learning techniques such as consistency regularization to form complete semi-supervised learning. Finally, this study conducts experiments on four classic public image classification datasets, CIFAR-10, SVHN, MNIST, and FashionMNIST, to verify the effectiveness of the proposed method. The experimental results show that, compared with state-of-the-art semi-supervised learning methods, the proposed method yields better performance, especially when labeled data are scarce.
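To show how a pseudo-label assignment can be cast as entropy-regularized optimal transport and solved with Sinkhorn-Knopp, here is a minimal sketch: the cost is the negative log-probability of each class for each unlabeled sample, row marginals spread mass uniformly over samples, and column marginals follow a class prior (e.g., estimated from labeled data). The constraint design in the paper may differ; parameters such as eps and n_iter are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sinkhorn_pseudo_labels(logits, class_prior, eps=0.1, n_iter=50):
    """Cast pseudo-label assignment as entropy-regularized optimal transport and
    solve it with Sinkhorn-Knopp. logits: (N, K) model outputs on unlabeled data;
    class_prior: (K,) target class proportions, e.g. estimated from labeled data."""
    N, K = logits.shape
    cost = -np.log(softmax(logits) + 1e-12)       # transport cost per (sample, class)
    Kmat = np.exp(-cost / eps)
    r = np.full(N, 1.0 / N)                       # each sample ships equal mass
    c = class_prior / class_prior.sum()
    u = np.ones(N)
    for _ in range(n_iter):
        v = c / (Kmat.T @ u)
        u = r / (Kmat @ v)
    plan = u[:, None] * Kmat * v[None, :]         # approximate transport plan
    return plan.argmax(axis=1)                    # hard pseudo labels

rng = np.random.default_rng(0)
pseudo = sinkhorn_pseudo_labels(rng.standard_normal((256, 10)), np.full(10, 0.1))
```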
ZHANG Yao-Yuan, YUAN Ji-Dong, LIU Hai-Yang, WANG Zhi-Hai, ZHAO Pei-Xiang
2024, 35(11):5210-5227. DOI: 10.13328/j.cnki.jos.007056
Abstract:Time series forecasting models have been widely used in various domains of daily life, and attacks against these models concern the security of data in applications. At present, adversarial attacks on time series mostly perform large-scale perturbation at the global level, which makes the adversarial samples easy to perceive. At the same time, the effectiveness of adversarial attacks decreases significantly as the perturbation magnitude shrinks. Therefore, how to generate imperceptible adversarial samples while maintaining competitive attack performance is an urgent problem in the field of adversarial attacks on time series forecasting. This study first proposes a local perturbation strategy based on sliding windows to narrow the perturbation interval of the adversarial sample. Second, it employs the differential evolution algorithm to find the optimal attack points and combines a segmentation function to partition the perturbation interval, which further reduces the perturbation range and completes the semi-white-box attack. Comparison experiments with existing adversarial attack methods on several deep learning models show that the proposed method can generate less perceptible adversarial samples and effectively change the prediction trend of the model. The proposed method achieves sound attack results in four challenging tasks, namely stock trading, electricity consumption, sunspot observation, and temperature prediction.
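A hedged sketch of the sliding-window idea: perturb only one window of the series and keep the window position that shifts a black-box forecaster the most. The exhaustive window search and the toy mean-of-last-10-points "model" below are stand-ins; the paper searches attack points with differential evolution and uses a segmentation function, which are not reproduced here.

```python
import numpy as np

def perturb_window(series, start, width, delta):
    """Add a small perturbation only inside one sliding window so that most of
    the series stays untouched (less perceptible than a global perturbation)."""
    adv = series.copy()
    adv[start:start + width] += delta
    return adv

def best_window_attack(series, width, delta, attack_gain):
    """Slide the window over the series and keep the position that changes the
    forecaster's output the most. attack_gain(adv_series) is a black-box score,
    e.g. the shift in the model's prediction."""
    best_start, best_gain = 0, -np.inf
    for start in range(len(series) - width + 1):
        gain = attack_gain(perturb_window(series, start, width, delta))
        if gain > best_gain:
            best_start, best_gain = start, gain
    return perturb_window(series, best_start, width, delta)

# Toy usage: the "model" is the mean of the last 10 points; gain is its shift.
rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 8 * np.pi, 200)) + 0.1 * rng.standard_normal(200)
predict = lambda s: s[-10:].mean()
adv = best_window_attack(x, width=20, delta=0.3,
                         attack_gain=lambda s: abs(predict(s) - predict(x)))
```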
WANG Ji-Hu, LIU Zi-Yan, NI Jin-Chao, KONG Fan-Yu, SHI Yu-Liang
2024, 35(11):5228-5248. DOI: 10.13328/j.cnki.jos.007003
Abstract:In the field of cyber security, the fake domains generated by domain generation algorithms (DGAs) are called DGA domains. Similar to real domains, they are usually random combinations of characters or numbers, which makes DGA domains highly camouflaged. Hackers take advantage of the disguised nature of DGA domains to carry out cyber attacks and bypass security detection, so how to effectively detect DGA domains has become a research hotspot. Traditional statistical machine learning detection methods require the manual construction of domain feature sets. However, the quality of domain features constructed manually or semi-automatically varies, which affects detection accuracy. In view of the powerful automatic feature extraction and representation capability of deep neural networks, a DGA domain detection method based on multi-view contrastive learning (MCL4DGA) is proposed. Different from existing methods, it incorporates attentional neural networks, convolutional neural networks, and recurrent neural networks to effectively capture global, local, and bidirectional multi-view feature dependencies of domain sequences. Besides, the self-supervision signals derived from contrastive learning can enhance the expressiveness of the multi-view feature encoders and thus improve detection accuracy. The effectiveness of the proposed method is verified by experimental comparison with current methods on a real dataset.
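The cross-view contrastive objective can be illustrated with a generic InfoNCE loss between two encoders' embeddings of the same batch of domain names: matching rows attract, all other pairs repel. The temperature, batch size, and symmetric form below are assumptions; MCL4DGA's exact multi-view pairing scheme is not reproduced.

```python
import numpy as np

def logsumexp(x, axis):
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))

def info_nce(Z1, Z2, tau=0.1):
    """Symmetric InfoNCE between two view encoders' embeddings of the same batch
    of domain names: row i of Z1 should match row i of Z2 and repel the rest."""
    Z1 = Z1 / np.linalg.norm(Z1, axis=1, keepdims=True)
    Z2 = Z2 / np.linalg.norm(Z2, axis=1, keepdims=True)
    logits = Z1 @ Z2.T / tau                      # (N, N) cross-view similarities
    loss_12 = -np.mean(np.diag(logits - logsumexp(logits, axis=1)))
    loss_21 = -np.mean(np.diag(logits.T - logsumexp(logits.T, axis=1)))
    return 0.5 * (loss_12 + loss_21)

rng = np.random.default_rng(0)
print(info_nce(rng.standard_normal((64, 32)), rng.standard_normal((64, 32))))
```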
ZHANG Shao-Bo, ZHANG Ji-Yong, ZHU Geng-Ming, LONG Sai-Qin, LI Zhe-Tao
2024, 35(11):5249-5262. DOI: 10.13328/j.cnki.jos.007032
Abstract:Federated learning has attracted much attention because it can address the problem of data silos. However, it also faces challenges such as the risk of privacy leakage and performance degradation caused by model heterogeneity under non-independent and identically distributed data. To this end, this study proposes a personalized federated learning method based on Bregman divergence and differential privacy (FedBDP). This method employs the Bregman divergence to measure the differences between local and global parameters and adopts it as a regularization term in the loss function, thereby reducing model differences and improving model accuracy. Meanwhile, adaptive differential privacy technology is utilized to perturb local model parameters, and a decay coefficient is defined to dynamically adjust the differential privacy noise level in each round, so as to allocate the privacy noise reasonably and improve model utility. Theoretical analysis shows that FedBDP satisfies convergence conditions for both strongly convex and non-convex smooth functions. Experimental results demonstrate that FedBDP can guarantee accuracy on the MNIST and CIFAR10 datasets while satisfying differential privacy.
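A minimal sketch of the two ingredients named above: a local objective with a Bregman-divergence regularizer pulling the local model toward the global one, and a Gaussian perturbation whose scale decays each round. The squared Euclidean distance is used as the Bregman divergence (the one generated by 1/2·||w||²), and the decay schedule, mu, and sigma values are assumptions; FedBDP's actual choices may differ.

```python
import numpy as np

def local_objective(w_local, w_global, task_loss, mu=0.1):
    """Local training objective: task loss plus a Bregman-divergence regularizer
    that pulls local parameters toward the global model. Squared Euclidean
    distance is the Bregman divergence generated by 1/2 * ||w||^2."""
    return task_loss(w_local) + mu * 0.5 * np.sum((w_local - w_global) ** 2)

def perturb_update(w_local, round_idx, sigma0=0.1, decay=0.95, rng=None):
    """Adaptive differential-privacy perturbation: Gaussian noise whose scale
    decays by a fixed coefficient each communication round."""
    rng = rng or np.random.default_rng()
    sigma = sigma0 * (decay ** round_idx)
    return w_local + rng.normal(0.0, sigma, size=w_local.shape)

# Toy usage with a quadratic "task loss".
rng = np.random.default_rng(0)
w_g = np.zeros(5)
w_l = rng.standard_normal(5)
print(local_objective(w_l, w_g, task_loss=lambda w: np.sum((w - 1.0) ** 2)))
print(perturb_update(w_l, round_idx=3, rng=rng))
```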
LUO Yi-Nuo, CHEN Jie, WANG Chao
2024, 35(11):5263-5278. DOI: 10.13328/j.cnki.jos.007049
Abstract:In the white-box attack context, an attacker can access the implementation of a cryptographic algorithm, observe its dynamic execution and internal details, and modify them arbitrarily. In 2002, Chow et al. proposed the concept of white-box cryptography and presented white-box implementations of the AES and DES algorithms based on lookup table techniques, known as the CEJO framework. A white-box implementation obfuscates an existing cryptographic algorithm, protects the key in software under white-box attacks, and ensures the correctness of the algorithm's results. SIMON is a lightweight block cipher that is widely used in Internet of Things devices because of its excellent software and hardware performance, so studying its white-box implementation is of great practical significance. This study presents two white-box implementations of the SIMON algorithm. The first scheme (SIMON-CEJO) uses the classical CEJO framework to protect the lookup tables with network encodings, so as to obfuscate the key; it occupies 369.016 KB of memory. The security analysis shows that SIMON-CEJO can resist the BGE attack and the affine equivalence algorithm attack, but it fails to resist differential computation analysis. The second scheme (SIMON-Masking) uses the encoding method proposed by Battistello et al. to encode the plaintext and key information, and it uses the homomorphism of the encoding to convert the XOR and AND operations into modular multiplications and table lookups; the corresponding ciphertext is finally obtained by decoding. During the operation of the algorithm, a Boolean mask is added to the AND operation. The randomness of the encodings protects the real key information and improves the scheme's ability to resist differential computation analysis and other attacks. SIMON-Masking occupies 655.81 KB of memory, and the time complexity of second-order differential computation analysis based on the Legendre symbol is O(n²klog₂p). The comparison of the two schemes shows that the classical CEJO framework cannot effectively defend against differential computation analysis, whereas using new encodings and adding masks is an effective white-box implementation approach.
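For reference, the sketch below shows the plain SIMON round function (SIMON32/64 parameters assumed: 16-bit words) and a two-share Boolean-masked AND of the kind that masking-based white-box designs protect; recombining the AND shares matches the unmasked result. This is an illustration of the underlying cipher and the masking idea only, not the paper's encoded, table-based implementation.

```python
import secrets

WORD = 16                       # SIMON32/64 word size (assumed for illustration)
MASK = (1 << WORD) - 1

def rol(x, r):
    return ((x << r) | (x >> (WORD - r))) & MASK

def simon_round(x, y, k):
    """One unmasked SIMON round; a masked white-box version would replace the
    AND below with a shared computation such as masked_and."""
    fx = (rol(x, 1) & rol(x, 8)) ^ rol(x, 2)
    return (y ^ fx ^ k) & MASK, x

def masked_and(a_shares, b_shares, r):
    """Two-share Boolean-masked AND: given a = a0 ^ a1, b = b0 ^ b1 and a fresh
    random word r, return shares of a & b without recombining either operand."""
    a0, a1 = a_shares
    b0, b1 = b_shares
    c0 = (a0 & b0) ^ r
    c1 = (a1 & b1) ^ r ^ (a0 & b1) ^ (a1 & b0)
    return c0, c1

# Sanity check: the recombined masked AND equals the plain AND.
a, b = 0x1234, 0xBEEF
a0 = secrets.randbits(WORD); a1 = a ^ a0
b0 = secrets.randbits(WORD); b1 = b ^ b0
c0, c1 = masked_and((a0, a1), (b0, b1), secrets.randbits(WORD))
assert (c0 ^ c1) == (a & b)
```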
YUE Jing-Tao, XIAO Jiang, ZHANG Shi-Jie, CHENG Feng, CHEN Han-Hua, JIN Hai
2024, 35(11):5279-5305. DOI: 10.13328/j.cnki.jos.007050
Abstract:A directed acyclic graph (DAG)-based blockchain adopts a parallel topology and can significantly improve system performance compared with conventional chain-based blockchains with a serial topology; as a result, it has attracted wide attention from the industry. However, the storage model and the consensus protocol of existing DAG-based blockchains are tightly coupled, which lacks the flexibility to meet diversified application demands. Furthermore, most DAG-based blockchains lack flexibility at the consensus protocol level and are limited to probabilistic consensus protocols, which makes it difficult to balance confirmation latency and security and is especially unfriendly to latency-sensitive applications. Therefore, this study presents an elastic DAG-based blockchain, namely ElasticDAG. The core idea is to decouple the storage model and the consensus protocol so that they proceed in parallel and independently, thereby flexibly adapting to diversified applications. To improve the throughput and liveness of the system, an adaptive block confirmation strategy and an epoch-based block ordering algorithm are designed for the storage model. To reduce transaction confirmation latency, a low-latency hybrid consensus protocol for DAG blockchains is designed. Experimental results demonstrate that the ElasticDAG prototype achieves a throughput exceeding 11 Mb/s in WAN with a confirmation latency of tens of seconds. Compared with OHIE and Haootia, ElasticDAG reduces confirmation latency by a factor of 17 and improves security from 91.04% to 99.999914% while maintaining the same throughput and consensus latency.
LIU Hong-Biao, SONG Cheng-Hao, WANG Ting-Yu, JIANG Jing-Jing, QIAO Lei, YANG Meng-Fei
2024, 35(11):5306-5318. DOI: 10.13328/j.cnki.jos.007036
Abstract:Partitioned deadline-monotonic (DM) scheduling of sporadic real-time tasks is a classic research problem. This study proposes a partitioned scheduling algorithm, PDM-FFD (partitioned deadline-monotonic first-fit decrease), with higher processor utilization for constrained-deadline sporadic tasks. In PDM-FFD, tasks are first sorted in non-decreasing order of relative deadline; then the first-fit strategy is used to select the processor core for each task, and each core adopts the DM scheduling policy. Finally, a tighter schedulability test is obtained by analyzing task interference time. This study proves that the speedup factor of PDM-FFD is 3 - (3Δ + 1)/(m + Δ) and that its time complexity is O(n²) + O(nm), where Δ = Σ_{τj∈τ} Cj×uj/Dmax, τj is a task in the task set τ, Cj is its worst-case execution time, uj is its utilization, Dmax is the maximum relative deadline, n is the number of tasks, and m is the number of processor cores. The speedup factor of PDM-FFD is strictly less than 3 - 1/m, which outperforms the existing multi-core partitioned scheduling algorithm FBB-FFD. Experiments show that PDM-FFD improves processor utilization by 18.5% compared with other available algorithms on a four-core processor, and its performance improves as the number of processor cores, the task set utilization, and the number of tasks increase. Owing to its high performance, PDM-FFD can be widely used in typical real-time systems such as resource-constrained spacecraft, autonomous vehicles, and industrial robots.
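A hedged sketch of the partitioning loop: sort tasks by relative deadline and place each on the first core that passes a sufficient schedulability test. For brevity, a classic FBB-FFD-style approximate request-bound test is used here as a stand-in for the paper's tighter interference-based analysis; the task tuples (C, D, T) and the toy task set are assumptions.

```python
def fits(task, assigned):
    """Sufficient DM schedulability test in the style of FBB-FFD (a stand-in for
    the paper's tighter interference analysis). task = (C, D, T); tasks already
    on the core have smaller deadlines, hence higher DM priority."""
    C, D, T = task
    interference = sum(Cj + (Cj / Tj) * D for (Cj, Dj, Tj) in assigned)
    return C + interference <= D

def pdm_ffd(tasks, m):
    """Partitioned DM with first fit: sort by relative deadline, then place each
    task on the first core that passes the schedulability test."""
    cores = [[] for _ in range(m)]
    for task in sorted(tasks, key=lambda t: t[1]):     # non-decreasing deadline
        for core in cores:
            if fits(task, core):
                core.append(task)
                break
        else:
            return None                                # not partitionable by this test
    return cores

# Toy task set: (worst-case execution time C, relative deadline D, period T).
tasks = [(1, 4, 5), (2, 6, 8), (1, 5, 10), (3, 10, 12), (2, 9, 9)]
print(pdm_ffd(tasks, m=2))
```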
HE Fei-Jia, SHEN Li-Wei, SONG Yu-Chen, NIU Jun-Yu, ZHAO Wen-Yun
2024, 35(11):5319-5340. DOI: 10.13328/j.cnki.jos.007037
Abstract:Ubiquitous computing for human-cyber-physical integration is becoming a new requirement and trend in software development. Based on this new computing paradigm, human-cyber-physical applications further extend software technology to the effective utilization of offline resources, including physical devices and human resources. As a typical human-cyber-physical scenario, the collaboration between device and human resources in the physical world features resource selectivity, high task frequency, and worker dynamics. Traditional resource scheduling techniques cannot meet the scheduling requirements of this task type (referred to as the DHRC task). Thus, this study proposes an optimal scheduling method for collaborative tasks between device and human resources, which consists of a device resource scheduling stage and a human resource scheduling stage. In the device resource scheduling stage, a device resource scheduling algorithm based on NSGA-II is proposed to optimize task resource selection by comprehensively considering factors such as task distance, device load, and the number of workers around the device location. In the human resource scheduling stage, a human resource scheduling algorithm based on DPSO is put forward to optimize worker selection and the corresponding path planning according to factors such as worker location and collaboration dependency. Experiments in a simulated environment show that the first-stage algorithm matches the compared algorithm (discrete particle swarm optimization) in efficiency and outperforms it in utility, while the second-stage algorithm outperforms the compared algorithm (a genetic algorithm improved with a tournament mechanism) in both efficiency and utility.
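The core selection step of the NSGA-II-based device scheduling is Pareto dominance over the listed objectives; the sketch below extracts the first non-dominated front from toy device-assignment candidates scored by (task distance, device load, negated nearby-worker count), all treated as minimization objectives. Crowding distance and the genetic operators of full NSGA-II, as well as the candidate values themselves, are omitted or assumed.

```python
def dominates(a, b):
    """Pareto dominance for minimization: a dominates b if it is no worse on
    every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_front(candidates):
    """Return the first Pareto front, the core selection step of NSGA-II;
    the full algorithm also uses crowding distance and genetic operators."""
    front = []
    for i, a in enumerate(candidates):
        if not any(dominates(b, a) for j, b in enumerate(candidates) if j != i):
            front.append(i)
    return front

# Toy device-assignment candidates: (task distance, device load, -nearby workers),
# all to be minimized (the worker count is negated so "more workers" is better).
candidates = [(120.0, 0.4, -3), (80.0, 0.7, -1), (80.0, 0.3, -2), (200.0, 0.2, -5)]
print(non_dominated_front(candidates))   # indices of non-dominated assignments
```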