Abstract: This paper proposes a framework for gesture detection and classification in continuous sequence data. The goal of detection is to determine the start and end frames of each gesture in the continuous sequence. The detection method uses multi-modal features for robustness and high accuracy. Detected gestures are represented by covariance matrices, and a distance measure on the Grassmann manifold is presented to strengthen the discriminative power of classification. The framework is evaluated on the ChaLearn Multimodal Gesture dataset 2013, where both recall and precision are higher than 93%.
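
To make the classification idea concrete, the following is a minimal sketch, not the method proposed in this paper. It assumes a detected gesture is given as a (frames x features) array, summarizes it with a covariance matrix, maps that matrix to a point on the Grassmann manifold through its leading eigenvectors, and compares two such points with the standard projection metric based on principal angles. The function names, the subspace dimension `k`, and the choice of the projection metric are illustrative assumptions; the distance actually presented in the paper may differ.

```python
import numpy as np


def covariance_descriptor(frames):
    """Covariance matrix (d x d) of a gesture given as a (T x d) feature array."""
    frames = np.asarray(frames, dtype=float)
    # rowvar=False: each column is a feature, each row is one frame.
    return np.cov(frames, rowvar=False)


def leading_subspace(cov, k):
    """Orthonormal basis (d x k) spanned by the top-k eigenvectors of cov.

    Assumption: the gesture is mapped to the Grassmann manifold by keeping
    the k dominant eigenvectors of its covariance descriptor.
    """
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -k:]


def projection_distance(U1, U2):
    """Projection metric between two k-dimensional subspaces.

    The singular values of U1^T U2 are the cosines of the principal angles,
    so the distance is sqrt(sum_i sin^2(theta_i)) = sqrt(k - sum_i cos^2(theta_i)).
    """
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against rounding error
    k = U1.shape[1]
    return float(np.sqrt(max(k - np.sum(cosines ** 2), 0.0)))


def classify_nearest(query, templates, k=3):
    """Return the label of the template gesture closest to the query.

    templates: list of (label, frames) pairs (hypothetical training gestures).
    """
    Uq = leading_subspace(covariance_descriptor(query), k)
    best_label, best_dist = None, np.inf
    for label, frames in templates:
        Ut = leading_subspace(covariance_descriptor(frames), k)
        dist = projection_distance(Uq, Ut)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

A nearest-neighbor rule is used here only because it is the simplest classifier that can consume a pairwise distance; any distance-based classifier could be substituted.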