• Volume 29, Issue S2, 2018 Table of Contents
    • Modality Compensation Based Action Recognition

      2018, 29(S2):1-15.

      Abstract: With the prevalence of depth cameras, video data of different modalities have become more common, and human action recognition based on multi-modal data is attracting increasing attention. Different modalities describe human actions from distinct perspectives, so how to effectively exploit their complementary information is a key topic in this area. In this study, we propose a modality compensation based method for action recognition. With RGB/optical flow as the source modality and skeletons as the auxiliary modality, we aim to compensate the feature learning of the source modality by exploring the common space between the source and auxiliary modalities. The proposed model uses a deep convolutional neural network (CNN) and a long short-term memory (LSTM) network to extract spatial and temporal features. With the help of residual learning, a modality adaptation block is proposed to align the distributions of different modalities and achieve modality compensation. To handle different alignments of source and auxiliary modal data, we propose hierarchical modality adaptation schemes. The proposed model requires auxiliary modal data only during training and improves recognition performance with only source modal data at test time, which broadens its application scenarios. Experimental results show that the proposed method outperforms other state-of-the-art approaches.

    • Visual Feature Combination Approach for Zero-Shot Learning

      2018, 29(S2):16-29.

      Abstract: Zero-shot learning is an important research topic in machine learning and image recognition. Zero-shot learning methods normally use the semantic information shared by seen and unseen classes to transfer knowledge learned from examples of seen classes to unseen classes, so as to recognize and classify examples of unseen classes. In this study, a zero-shot learning approach based on the construction of visual feature combinations is proposed. The approach generates many examples of unseen classes at the visual feature level by means of feature combination, which is proposed here for the first time, and thus transforms the zero-shot learning problem into a traditional classification problem that can be solved by supervised learning. The approach mimics the human cognitive process of associative memory and includes four steps: feature-attribute relation extraction, example construction, example screening, and domain adaptation. On training examples of seen classes, the relationship between class attributes and feature dimensions is extracted; at the visual feature level, examples of unseen classes are generated by visual feature combination; a dissimilarity representation is introduced to filter the generated examples of unseen classes; and semi-supervised and unsupervised feature domain adaptation are proposed to linearly transform the generated examples of unseen classes so that they are more effective. The proposed approach shows superior performance on three benchmark datasets (AwA, AwA2, and SUN); on AwA in particular, it obtains 82.6% top-1 accuracy, which is the best result as far as we know. Experimental results demonstrate the effectiveness and superiority of the proposed approach.

    • Image Description Method Based on Generative Adversarial Networks

      2018, 29(S2):30-43.

      Abstract: In recent years, deep learning has gained more and more attention in image description. Existing deep learning methods use CNNs to extract features and RNNs to assemble them into a sentence. Nevertheless, when dealing with complex images, the feature extraction is inaccurate, and the fixed mode of the sentence generation model leads to inconsistent sentences. To solve these problems, this study proposes a method that combines a channel-wise attention model with GANs, named CACNN-GAN. The channel-wise attention mechanism is added to each convolutional layer to extract features, compare them with the COCO dataset, and select the top features to generate the sentence. The sentences are generated by GANs, through the game process between the generator and the discriminator. The resulting sentence generator produces sentences with varied syntax, smooth wording, and rich vocabulary. Experiments on real datasets show that CACNN-GAN can effectively describe images and achieves higher accuracy than the state of the art.

    • Fusion of Gray Scale Cost Aggregation for Stereo Matching

      2018, 29(S2):44-53.

      Abstract: Coarse-to-fine (CTF) processing, hierarchical strategies, and cross-scale cost aggregation have effectively extended cost aggregation methods and yield fairly accurate disparity maps. They aim to find the correct matching points in weakly textured regions. However, these methods require multiple scales as a prerequisite and usually need the assistance of an image pyramid. They are limited by the propagation of errors from coarse to fine levels and by poor recovery of thin structures. In this study, a generic framework for fusing gray-scale cost aggregation is proposed, which integrates the cost aggregation of the gray image into the initial cost aggregation. The main purpose of the Gaussian-filtered gray image is to better match corresponding pixels in weakly textured regions of the image. Meanwhile, the framework does not need to downscale the image to build an image pyramid and aggregate cost at each scale, which accelerates the cost aggregation step. Furthermore, guided image filtering and fast weighted median filtering are introduced for cost aggregation and disparity refinement. In addition, to avoid the ambiguous choices that WTA (winner-take-all) brings, the relationship between the minimum and the second-smallest aggregated cost is used to determine the final disparity. Evaluation on Middlebury shows that the proposed framework leads to significant improvement.
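
The idea of checking the minimum aggregated cost against the second-smallest one, rather than applying plain winner-take-all, can be illustrated with a small sketch. The abstract does not state the actual rule, so the ratio test below, the function name `select_disparity`, and the `ratio_threshold` parameter are all assumptions made for illustration; costs are assumed to be non-negative.

```python
def select_disparity(costs, ratio_threshold=0.9):
    """Pick a disparity for one pixel from its aggregated costs.

    Hypothetical rule (not taken from the paper): accept the lowest-cost
    disparity as confident only if it is clearly lower than the
    second-smallest cost, i.e. best / second < ratio_threshold.
    Returns (disparity_index, is_confident).
    """
    if len(costs) < 2:
        raise ValueError("need at least two disparity candidates")

    # Plain winner-take-all: index of the minimum cost.
    best = min(range(len(costs)), key=costs.__getitem__)

    # Second-smallest cost among the remaining candidates.
    second = min(c for d, c in enumerate(costs) if d != best)

    # If the runner-up cost is zero, the two best costs are tied at zero,
    # so the match is treated as ambiguous.
    if second == 0:
        return best, False

    ratio = costs[best] / second
    return best, ratio < ratio_threshold
```

A pixel flagged as not confident could then be filled in by a later refinement step; how the paper itself handles such pixels is not described in the abstract.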

    • Robotic Writing System with Intelligent Interactive Learning Ability

      2018, 29(S2):54-61.

      Abstract: In this study, an intelligent robotic writing system is built on the Uarm to learn Chinese character strokes. The system can automatically split an unfamiliar character into strokes and write it. In addition, based on dialogue technology and image processing technology, the system can learn the correct strokes from humans. First, the system obtains the keyword that the user wants to write and the user's intention from the input voice information and the word image captured by the camera. Then, if the keyword is detected, it analyzes the word image and splits and extracts the strokes. If the word is being taught by a human, the system records the stroke order and learns the correct way to write the character. Through dialogue management, the Uarm can interact with humans through writing and dialogue, learn from humans, and write the characters correctly. According to the experimental analysis and the subjective evaluation in the test, the system has been well received.

    • Gesture Interaction Model Based on Muscle Sensing

      2018, 29(S2):62-74.

      Abstract: As human-computer interaction shifts from computer-centered to human-centered techniques, gesture interaction based on muscle sensing has received great attention due to its wearability, implicit interaction, and reliability. However, current research lacks a unified semantic model and system model. This paper discusses the classification of interactive gestures, summarizes the input primitives suitable for physiological computing technology, and proposes a muscle sensing based gesture interaction semantic model and a hierarchical processing system model. Finally, this paper implements a prototype for recognizing object manipulation gestures in an office scenario.

    • Eye Tracking and Gesture Based Interaction for Target Selection on Large Display

      2018, 29(S2):75-85.

      Abstract: Mouse based target selection requires a lot of movement when the target lies far away on a large display. Eye tracking, on the other hand, can locate a distant target more easily and quickly, so it has high potential for fast target selection on large displays. However, eye tracking still faces challenges such as low accuracy and a high error rate in selection operations, especially for gaze-only interaction. This study proposes a multimodal interaction method that combines gaze with gestures. The method first uses gaze for rough selection and then uses a hand gesture to confirm the accurate selection. Furthermore, to maintain selection accuracy when targets are small and crowded, a semi-fixed gaze cursor and a secondary selection mechanism are used to optimize the selection process. Finally, a user study was conducted with different target sizes and different distances among targets. The results show that the selection speed and accuracy rate of the proposed method are higher than those of the gaze-only method by 16% and 82.6%, respectively. For the selection of hierarchical menu items, the selection speed and accuracy rate of the proposed method are higher than those of the gaze-only method by 13.6% and 55.7%, respectively. In addition, the overall performance of the proposed method is similar to that of mouse based selection, which also validates the effectiveness of the proposed method.
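
The two-stage flow described above (gaze for rough selection, then a hand gesture for confirmation) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the `radius` parameter, and the `"next"`/`"confirm"` gesture vocabulary are hypothetical, and the sketch does not model the semi-fixed gaze cursor or the paper's actual secondary selection mechanism.

```python
import math


def rough_select(gaze, targets, radius):
    """Stage 1 (gaze): names of targets within `radius` of the gaze point.

    `targets` maps a target name to its (x, y) position. Candidates are
    returned nearest first. The radius is a hypothetical parameter.
    """
    gx, gy = gaze
    near = []
    for name, (x, y) in targets.items():
        dist = math.hypot(x - gx, y - gy)
        if dist <= radius:
            near.append((dist, name))
    near.sort()
    return [name for _, name in near]


def refine_with_gestures(candidates, gestures):
    """Stage 2 (gesture): step through candidates and confirm one.

    Hypothetical gesture vocabulary: "next" moves the highlight to the
    next candidate (wrapping around), "confirm" selects the highlighted
    one. Returns the selected name, or None if nothing was confirmed.
    """
    if not candidates:
        return None
    index = 0
    for gesture in gestures:
        if gesture == "next":
            index = (index + 1) % len(candidates)
        elif gesture == "confirm":
            return candidates[index]
    return None
```

For example, with targets `{"a": (0, 0), "b": (3, 4), "c": (100, 100)}`, a gaze point at `(0, 0)`, and a radius of 10, the first stage keeps `a` and `b`; the gesture sequence `["next", "confirm"]` then selects `b`.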

    • Method of Behavioral Correlated Stress Perception in Smart Driving

      2018, 29(S2):86-95.

      Abstract: Driver stress detection has great potential for assisted driving because people's stress is closely related to their behavior, especially in smart driving. Existing stress perception methods are mostly used in static environments and lack convenience, so they can hardly meet the needs of highly dynamic smart driving environments. This study proposes a behavior-assisted stress perception method based on a wearable system to achieve natural, accurate, and reliable stress detection in smart driving. The method distinguishes stress states based on behavior and multiple metrics, which effectively improves stress detection accuracy. The basic principle is that each person's physiological characteristics and behavioral habits under different stress conditions have unique effects on stress-related PPG data and behavior-related IMU data. The driver's physiological and motion information is measured with a multi-sensor wearable glove, and reliable physiological and behavioral metrics are then obtained through multi-signal fusion techniques. Finally, an SVM model is used to classify the driver's stress state because of its good generalization performance. Based on the proposed method, a verification experiment is carried out in a simulated driving environment; the experimental results show that the stress classification accuracy can reach 95%.

    • Leakage-Resilient Password Entry on Smartwatches Based on Semantic Tactile Feedback Guide

      2018, 29(S2):96-107.

      Abstract: Nowadays, smartwatches are increasingly used in our daily lives. Smartwatches store a large amount of users' personal information, and it is necessary to design appropriate ways to protect it. PIN entry is a widely adopted method, but it is not resistant to shoulder-surfing. This work proposes a smartwatch-based identity authentication scheme that builds on traditional PIN authentication and prompts password entry by vibration. Three experiments were designed to examine the performance of this method. The first experiment tests which combination of vibration durations is more acceptable; the results show that the combination of 400 ms and 100 ms is the optimal one. In the second experiment, a vibration prompt scheme is designed to establish the mapping between vibrations and numbers; the results show that the scheme can be effectively remembered and practiced. In the last experiment, the actual password input process is simulated and compared with the traditional unlock method. The results show that inputting four digits of a five-digit password leads to fast overall entry speed and high accuracy while maintaining high security. This study offers insights into identification design for smartwatches.

    • BCI Assisted Dynamic Target Selection Technique

      2018, 29(S2):108-119.

      Abstract: Dynamic target selection is one of the most basic interactive tasks in modern interaction interfaces. A variety of assistive techniques exist, but their design and parameters are largely based on experimental data and cannot be adjusted according to the user's current state. To solve this problem, this study proposes a brain-computer interface (BCI) assisted dynamic target selection technique based on two assumptions about cognitive load and difficulty perception. The technique uses functional near-infrared spectroscopy (fNIRS) signals to perceive the user's cognitive load and adjusts the parameters of target selection techniques in real time. It can provide personalized assistance to different users and is applicable to different scenarios, user states, and task difficulties. The proposed hypothesis is verified through a set of experiments, and the BCI assisted dynamic target selection technique built on this assumption performs better than both the auxiliary and the fixed auxiliary techniques. Specifically, the selection error rate is reduced by 20.55% and 12.09%, respectively, and the completion time is reduced by 998.35 ms and 208.67 ms, respectively.

    • Real-Time Force Generation Algorithms Based on Double-Touch Interaction

      2018, 29(S2):120-126.

      Abstract: Force generation and interaction can improve the immersion and authenticity of a virtual environment and are an important research direction in human-computer interaction. To address the limitations of single-touch force interaction, this study proposes a real-time force generation method based on double-touch interaction. First, double-touch interaction is divided into four states. Then, a real-time force generation method is proposed for double-touch interaction in each state. Finally, an experimental environment for double-touch interaction is established to test the proposed method. The evaluation of force sense generation shows that the method can generate the double-touch force sense in real time, enhance the sense of immersion and the authenticity of the virtual environment, and improve the naturalness of human hand haptic interaction.

Contact Information
  • Journal of Software
  • Sponsor: Institute of Software, CAS, China
  • Postal code: 100190
  • Phone: 010-62562563
  • Email: jos@iscas.ac.cn
  • Website: https://www.jos.org.cn
  • Journal No.: ISSN 1000-9825
  •           CN 11-2560/TP
  • Domestic price: CNY 70
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address: 4# South Fourth Street, Zhong Guan Cun, Beijing 100190, Postal Code: 100190
Phone: 010-62562563 Fax: 010-62562533 Email: jos@iscas.ac.cn