• Online First

    • Survey on High-dimensional Bayesian Optimization

      Online: March 05, 2025 DOI: 10.13328/j.cnki.jos.007304

      Abstract: Bayesian optimization is a technique for optimizing black-box functions. Owing to its high sample efficiency, it is widely applied across scientific and engineering fields, such as hyperparameter tuning of deep models, compound design, drug development, and material design. However, its performance deteriorates significantly when the input space is high-dimensional. To overcome this limitation, numerous studies have extended Bayesian optimization methods to high dimensions. This study categorizes these methods into three types according to their assumptions and characteristics: methods based on the effective low-dimensional hypothesis, methods based on additive assumptions, and methods based on local search. It first analyzes the research progress of each type and then compares the advantages and disadvantages of each method in the application of Bayesian optimization. Finally, the main research trends in high-dimensional Bayesian optimization at the current stage are summarized, and future development directions are discussed.

    • Robustness Evaluation of ChatGPT Against Chinese Adversarial Attacks

      Online: February 26, 2025 DOI: 10.13328/j.cnki.jos.007299

      Abstract: Large language models (LLMs) such as ChatGPT have found widespread applications across various fields due to their strong natural language understanding and generation capabilities. However, deep learning models are vulnerable to adversarial example attacks. In natural language processing, current research on adversarial example generation typically uses CNN-based models, RNN-based models, and Transformer-based pre-trained models as target models; few studies explore the robustness of LLMs under adversarial attacks or quantify criteria for evaluating LLM robustness. Taking ChatGPT under Chinese adversarial attacks as an example, this study introduces a novel concept termed offset average difference (OAD) and proposes a quantifiable LLM robustness evaluation metric based on OAD, named OAD-based robustness score (ORS). In a black-box attack scenario, this study selects nine mainstream Chinese adversarial attack methods based on word importance to generate adversarial texts, which are then used to attack ChatGPT and obtain the attack success rate of each method. The proposed ORS assigns a robustness score to LLMs for each attack method based on the attack success rate. In addition to ChatGPT, which outputs hard labels, this study designs ORS for target models with soft-labeled outputs based on the attack success rate and the proportion of adversarial texts misclassified with high confidence. Meanwhile, this study extends the scoring formula to the fluency assessment of adversarial texts, proposing an OAD-based adversarial text fluency scoring method, named OAD-based fluency score (OFS). Compared to traditional methods requiring human involvement, the proposed OFS greatly reduces evaluation costs. Experiments on real-world Chinese news and sentiment classification datasets provide preliminary evidence that, for text classification tasks, the robustness score of ChatGPT against adversarial attacks is nearly 20% higher than that of Chinese BERT. However, ChatGPT still produces erroneous predictions under adversarial attacks, with the highest attack success rate exceeding 40%.

    • Implicit-enhanced Causal Modeling Method for Phrasal Visual Grounding

      Online: February 26, 2025 DOI: 10.13328/j.cnki.jos.007303

      Abstract: Phrasal visual grounding, a fundamental and critical research task in multimodal studies, aims at predicting fine-grained alignment relationships between textual phrases and image regions. Despite the remarkable progress achieved by existing phrasal visual grounding approaches, they all ignore the implicit alignment relationships between textual phrases and their corresponding image regions, commonly referred to as implicit phrase-region alignment. Predicting such relationships can effectively evaluate the ability of models to understand deep multimodal semantics. Therefore, to model implicit phrase-region alignment relationships, this study proposes an implicit-enhanced causal modeling (ICM) approach for phrasal visual grounding, which employs the intervention strategies of causal reasoning to mitigate the confusion caused by shallow semantics. To evaluate models’ ability to understand deep multimodal semantics, this study annotates a high-quality implicit dataset and conducts extensive experiments. Multiple sets of comparative experimental results demonstrate the effectiveness of the proposed ICM approach in modeling implicit phrase-region alignment relationships. Furthermore, the proposed ICM approach outperforms some advanced multimodal large language models (MLLMs) on the implicit dataset, further promoting research on MLLMs in more implicit scenarios.

    • Twin Support Function Machine for Set-valued Data

      Online: February 26, 2025 DOI: 10.13328/j.cnki.jos.007306

      Abstract: Twin support vector machine (TSVM) can effectively tackle data such as cross or XOR data. However, when handling set-valued data, TSVM usually relies on statistical information of set-valued objects, such as the mean and the median. Unlike TSVM, this study proposes the twin support function machine (TSFM), which can deal with set-valued data directly. Using support functions defined for set-valued objects, TSFM obtains nonparallel hyperplanes in a Banach space. To suppress outliers in set-valued data, TSFM adopts the pinball loss function and introduces weights for set-valued objects. Because TSFM involves optimization problems in an infinite-dimensional space, the measure is taken in the form of a linear combination of Dirac measures, which yields an optimization model in a finite-dimensional space. To solve the optimization model effectively, this study employs a sampling strategy to transform the model into quadratic programming (QP) problems. The dual formulations of the QP problems are derived, which provide theoretical foundations for determining which sampling points are support vectors. To classify set-valued data, the distance from a set-valued object to the hyperplane in a Banach space is defined, and the decision rule is derived from it. This study also considers the kernelization of support functions to capture the nonlinear features of data, which makes the proposed model applicable to indefinite kernels. Experimental results demonstrate that TSFM can capture the intrinsic structure of cross-plane set-valued data and obtain good classification performance in the presence of outliers or of set-valued objects containing a few high-dimensional examples.
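
      The abstract does not state the exact form of the pinball loss or of the object weights used by TSFM. As a rough illustration only, the sketch below uses the form commonly seen in SVM-style classifiers, where `u` is a margin residual and `tau` in [0, 1] controls how strongly points on the "safe" side are penalized. The names `pinball_loss` and `weighted_pinball_risk`, and the way weights enter, are assumptions made for this example and are not taken from the paper.

```python
def pinball_loss(u, tau):
    """Pinball loss in a form commonly used for SVM-style classifiers.

    u   : margin residual (assumed here to be something like 1 - y * f(x))
    tau : value in [0, 1]; tau = 0 reduces to the ordinary hinge loss max(0, u)
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    # Residuals u >= 0 are penalized linearly; residuals u < 0 are
    # penalized with the smaller slope tau.
    return u if u >= 0 else -tau * u


def weighted_pinball_risk(margins, weights, tau):
    """Weighted sum of pinball losses (hypothetical weighting scheme).

    A small weight reduces the influence of a suspected outlier.
    """
    if len(margins) != len(weights):
        raise ValueError("margins and weights must have the same length")
    return sum(w * pinball_loss(u, tau) for u, w in zip(margins, weights))


if __name__ == "__main__":
    margins = [0.5, -1.0, 2.0]   # illustrative residuals
    weights = [1.0, 1.0, 0.2]    # the third object is down-weighted
    print(weighted_pinball_risk(margins, weights, tau=0.5))
```

      Unlike the hinge loss, this loss is nonzero for negative residuals whenever `tau > 0`, which is the usual reason given for its lower sensitivity to noise near the decision boundary.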

    • Multi-layer Recurrent GCN Cross-domain Recommendation with Pseudo-overlap Detection Mechanism

      Online: February 26, 2025 DOI: 10.13328/j.cnki.jos.007289

      Abstract: Cross-domain recommendation (CDR) alleviates the cold start problem by transferring user-item rating patterns from a dataset in a densely rated auxiliary domain to one in a sparsely rated target domain, and has been widely studied in recent years. The clustering methods adopted by most CDR algorithms are designed for single-domain recommendation; they fail to utilize overlapping information effectively and adapt poorly to CDR, resulting in inaccurate clustering results. In CDR, graph convolutional network (GCN) methods can fully utilize the associations between nodes to improve recommendation accuracy. However, GCN-based CDR often employs static graph learning for node embedding, ignoring the fact that user preferences may change across recommendation scenarios, which leads to poor model performance across different recommendation tasks and ineffective mitigation of data sparsity. To this end, a multi-layer recurrent GCN CDR model based on a pseudo-overlap detection mechanism is proposed. First, by fully leveraging overlapping data with the Louvain community clustering algorithm, a pseudo-overlap detection mechanism is designed to mine user trust relationships and similar user communities, thereby enhancing the adaptability and accuracy of clustering algorithms in CDR. Second, a multi-layer recurrent GCN consisting of an embedding learning module and a graph learning module is proposed to learn dynamic domain-shared features, domain-specific features, and dynamic graph structures. By iteratively enhancing the two modules, the latest user preferences are obtained to alleviate data sparsity. Finally, a multi-layer perceptron (MLP) is employed to model user-item interactions and obtain predicted ratings. Comparative results with 12 related models across four groups of data domains demonstrate the effectiveness of the proposed method, with average improvements of 5.47%, 3.44%, and 2.38% in MRR, NDCG, and HR, respectively.
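
      The three reported metrics can be illustrated with a minimal sketch. It assumes a leave-one-out style evaluation in which each user has exactly one held-out target item and the model returns a ranked list; the abstract does not describe the paper's actual protocol or cutoff, so the function names, the cutoff `k`, and the sample data below are hypothetical.

```python
import math


def reciprocal_rank(ranked_items, target):
    """1 / rank of the target (1-indexed), or 0.0 if it is not in the list."""
    for position, item in enumerate(ranked_items, start=1):
        if item == target:
            return 1.0 / position
    return 0.0


def hit_rate_at_k(ranked_items, target, k):
    """1.0 if the target appears among the top-k items, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """NDCG@k with a single relevant item, so the ideal DCG equals 1."""
    for position, item in enumerate(ranked_items[:k], start=1):
        if item == target:
            return 1.0 / math.log2(position + 1)
    return 0.0


def average(values):
    """Mean over users; MRR is the average of per-user reciprocal ranks."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    # Hypothetical ranked lists and held-out targets for two users.
    results = [(["a", "b", "c"], "b"), (["x", "y", "z"], "q")]
    print("MRR   :", average(reciprocal_rank(r, t) for r, t in results))
    print("HR@2  :", average(hit_rate_at_k(r, t, 2) for r, t in results))
    print("NDCG@2:", average(ndcg_at_k(r, t, 2) for r, t in results))
```

      MRR and NDCG reward placing the target near the top of the list, while HR only records whether it appears within the cutoff.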

    • Fine-grained Dichotomies for Symmetric 2-spin System on Regular Graphs

      Online: February 26, 2025 DOI: 10.13328/j.cnki.jos.007302

      Abstract: This study discusses the computational complexity of the partition function of the symmetric 2-spin system on regular graphs. Based on the counting exponential time hypothesis (#ETH) and the randomized exponential time hypothesis (rETH), this study develops the classical dichotomies of this problem class into exponential dichotomies, also known as fine-grained dichotomies. In other words, this study proves that when the given tractability conditions are satisfied, the problem is solvable in polynomial time; otherwise, there is no sub-exponential time algorithm if #ETH holds. This study also proposes two solutions to the ineffectiveness of existing interpolation methods in building sqrt-sub-exponential time reductions under the restriction to planar graphs. It then uses these two solutions to discuss the related fine-grained complexity and dichotomy of this problem under the planar graph restriction.

    • Visual-language Multimodal Pre-training Based on Multi-entity Alignment

      Online: February 26, 2025 DOI: 10.13328/j.cnki.jos.007321

      Abstract: Visual-language pre-training (VLP) aims to obtain a powerful multimodal representation by learning on a large-scale image-text multimodal dataset. Multimodal feature fusion and alignment is a key challenge in multimodal model training. Most existing visual-language pre-training models address this problem by feeding the extracted visual features and text features directly into the Transformer model. Since the attention mechanism in the Transformer calculates the similarity between pairs, it is difficult to achieve alignment among multiple entities. Because the hyperedges of hypergraph neural networks connect multiple entities and encode high-order entity correlations, they make it possible to establish relationships among multiple entities. Accordingly, this study proposes a visual-language multimodal pre-training method based on multi-entity alignment with hypergraph neural networks. In this method, a hypergraph neural network learning module is introduced into the Transformer multimodal fusion encoder to learn the alignment relationships of multimodal entities, thereby enhancing the entity alignment ability of the multimodal fusion encoder in the pre-training model. The proposed model is pre-trained on large-scale image-text datasets and fine-tuned on multiple visual-language downstream tasks such as visual question answering, image-text retrieval, visual grounding, and natural language visual reasoning. The experimental results indicate that, compared with the baseline method, the proposed method improves performance on multiple downstream tasks, with accuracy improved by 1.8% on the NLVR2 task.

    • Collective Emotional Stabilization Method for Social Network Rumor Detection

      Online: February 26, 2025 DOI: 10.13328/j.cnki.jos.007322

      Abstract: Online information comes from numerous and varied sources. Judging in a timely and accurate manner whether a piece of information is a rumor is a crucial issue in research on the cognitive domain of social media. Most previous studies concentrate on the text content of rumors, user characteristics, or features inherent to the propagation mode, ignoring the key clues provided by the collective emotions generated when users take part in event discussions and by the emotional steady-state characteristics hidden in the spread of rumors. This study proposes a social network rumor detection method that is oriented by collective emotional stabilization and integrates temporal and spatial steady-state features. Based on the text features and user behaviors in rumor propagation, the temporal and spatial steady-state features of collective emotions are combined for the first time, achieving strong expressiveness and detection accuracy. Specifically, the method takes the emotional keywords that express users’ attitudes towards an event or topic as its basis and uses recurrent neural networks to construct emotional steady-state features of the temporal relationship, giving the collective emotions temporally consistent and expressive features that reflect their convergence over time. A heterogeneous graph neural network is used to establish the connections between users and keywords, as well as between texts and keywords, so that the collective emotions possess fine-grained steady-state features of the spatial relationship. Finally, the two types of local steady-state features are fused to obtain a global representation and improve feature expression, and a classifier then produces the rumor detection results. The proposed method is evaluated on two publicly available and widely used Twitter datasets. Compared with the best-performing baseline, the accuracy is improved by 3.4% and 3.2%, the T-F1 value by 3.0% and 1.8%, the N-F1 value by 2.7% and 2.3%, and the U-F1 value by 2.3% and 1.0%, respectively.

    • Offline Reinforcement Learning Method with Diffusion Model and Expectation Maximization

      Online: February 19, 2025 DOI: 10.13328/j.cnki.jos.007296

      Abstract: Offline reinforcement learning has yielded significant results in tasks with continuous and dense rewards. However, since the training process does not interact with the environment, the generalization ability is reduced, and the performance is difficult to guarantee in environments with discrete and sparse rewards. The diffusion model combines the information in the neighborhood of the sample data with noise addition to generate actions that are close to the distribution of the sample data, which strengthens the learning and generalization ability of the agents. To this end, offline reinforcement learning with diffusion models and expectation maximization (DMEM) is proposed. The method updates the objective function by maximizing the expected log-likelihood to make the policy more generalizable. Additionally, the diffusion model is introduced into the policy network so that its diffusion characteristics enhance the ability of the policy to learn from data samples. Meanwhile, expectile regression is employed to update the value function from the perspective of high-dimensional space, and a penalty term is introduced to make the evaluation of the value function more accurate. DMEM is applied to a series of tasks with discrete and sparse rewards, and experiments show that DMEM has a large performance advantage over other classical offline reinforcement learning methods.
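
      The abstract does not give DMEM's value-function objective or its penalty term. The sketch below shows only the generic asymmetric squared loss that underlies expectile regression, with `tau` in (0, 1) as the expectile level; the names `expectile_loss` and `mean_expectile_loss`, and the sample values, are assumptions made for illustration.

```python
def expectile_loss(diff, tau):
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2.

    diff : target minus prediction (for example, a Q-value minus a V-value)
    tau  : expectile level in (0, 1); tau = 0.5 gives half the squared error
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie in (0, 1)")
    # Under-predictions (diff > 0) get weight tau; over-predictions get
    # weight 1 - tau, so tau > 0.5 pushes the estimate upward.
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def mean_expectile_loss(targets, predictions, tau):
    """Average expectile loss over a batch of (target, prediction) pairs."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    losses = [expectile_loss(t - p, tau) for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    targets = [1.0, 2.0, 4.0]       # illustrative values only
    predictions = [2.0, 2.0, 2.0]
    print(mean_expectile_loss(targets, predictions, tau=0.7))
```

      With `tau` above 0.5 the minimizer lies above the mean of the targets, which is why losses of this kind are used to approximate an upper value estimate from data without querying out-of-dataset actions.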

    • Impact of Mislabeled Changes by SZZ on Performance and Interpretation of Just-in-time Defect Prediction for Mobile APP

      Online: February 19, 2025 DOI: 10.13328/j.cnki.jos.007297

      Abstract: In recent years, SZZ, an algorithm for identifying bug-introducing changes, has been widely employed in just-in-time software defect prediction. Previous studies show that the SZZ algorithm may mislabel data during data annotation, which could affect dataset quality and consequently the performance of the defect prediction model. Researchers have therefore improved the SZZ algorithm and proposed multiple variants. However, no empirical study has explored the effect of SZZ annotation quality on the performance and interpretability of just-in-time defect prediction for mobile apps. To investigate the influence of changes mislabeled by SZZ on just-in-time defect prediction for mobile apps, this study conducts an extensive and in-depth empirical comparison of four SZZ algorithms. First, 17 large-scale mobile app projects are selected from GitHub, and software metrics are extracted with the PyDriller tool. Second, B-SZZ (original SZZ), AG-SZZ, MA-SZZ, and RA-SZZ are employed for data annotation. Third, just-in-time defect prediction models are built with random forest, naive Bayes, and logistic regression classifiers based on time-series data partitioning. Finally, the performance of the models is evaluated by the traditional measures AUC, MCC, and G-mean and the effort-aware measures F-measure@20% and IFA, and a statistical significance test and an interpretability analysis are conducted on the results by employing SKESD and SHAP, respectively. Comparing the annotation performance of the four SZZ algorithms yields the following results. (1) The data annotation quality conforms to the progressive relationship among SZZ variants. (2) The changes mislabeled by B-SZZ, AG-SZZ, and MA-SZZ can reduce AUC and MCC to varying degrees, but do not reduce G-mean. (3) B-SZZ is likely to reduce F-measure@20%, while B-SZZ, AG-SZZ, and MA-SZZ are unlikely to increase effort during code inspection. (4) In terms of model interpretation, different SZZ algorithms influence the three metrics with the largest contribution during prediction, and the la metric has a significant influence on the prediction results.
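
      Two of the traditional measures named above, MCC and G-mean, can be computed directly from a binary confusion matrix. The sketch below uses their standard definitions; the function names and the sample counts are made up for illustration and do not come from the study.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


def g_mean(tp, fp, tn, fn):
    """Geometric mean of recall (true positive rate) and specificity."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall * specificity)


if __name__ == "__main__":
    # Hypothetical counts: 50 defect-inducing and 50 clean changes.
    counts = dict(tp=40, fn=10, fp=20, tn=30)
    print("MCC   :", mcc(**counts))
    print("G-mean:", g_mean(**counts))
```

      Both measures account for performance on the clean class as well as the defective class, which makes them less sensitive to class imbalance than plain accuracy.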

Copyright: Institute of Software, Chinese Academy of Sciences. Beijing ICP No. 05046678-4
Address: 4# South Fourth Street, Zhong Guan Cun, Beijing. Postal Code: 100190
Phone: 010-62562563 Fax: 010-62562533 Email: jos@iscas.ac.cn
Technical Support: Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063