Abstract: This study proposes a data representation for electroencephalogram (EEG) signals that transforms 1D chain-like EEG vector sequences into 2D mesh-like matrix sequences. The mesh structure of the matrix at each time point corresponds to the spatial distribution of the EEG electrodes, which better represents the spatial correlation of EEG signals among physically adjacent electrodes. A sliding window is then used to divide the 2D mesh sequence into segments of equal duration, and each segment is treated as an EEG sample that integrates the temporal and spatial correlations of the raw EEG recordings. Two hybrid deep learning models are also proposed: a cascaded convolutional recurrent neural network (CASC_CNN_LSTM) and a cascaded double convolutional neural network (CASC_CNN_CNN). Both use a CNN to capture the spatial correlation between physically adjacent EEG signals from the converted 2D EEG meshes. The former uses an LSTM to learn the temporal dependency of the EEG sequence, while the latter uses a second CNN to extract deeper discriminative local spatiotemporal features. Extensive binary valence classification experiments are carried out on the large-scale open DEAP dataset (32 subjects, 9,830,400 EEG recordings). The results show that the average classification accuracies of the proposed CASC_CNN_LSTM and CASC_CNN_CNN networks on the spatiotemporal 2D mesh-like EEG sequences reach 93.15% and 92.37%, respectively, significantly outperforming the baseline models and state-of-the-art methods. This demonstrates that the proposed method effectively improves the accuracy and robustness of EEG emotion classification, owing to its ability to jointly learn deeper spatiotemporally correlated features with hybrid deep neural networks.
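To make the data representation concrete, the following is a minimal sketch of the 1D-to-2D mesh conversion and sliding-window segmentation described above. The 9x9 grid size, the partial channel-to-cell mapping, and the function names are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

# Hypothetical mapping of DEAP channel indices to cells of a 9x9 grid
# approximating the scalp layout; the paper's actual mesh dimensions
# and electrode placement are assumptions here.
MESH_ROWS, MESH_COLS = 9, 9
CHANNEL_POS = {
    0: (0, 3), 1: (1, 3), 2: (2, 2), 3: (2, 0),
    # ... remaining channels would be filled in per the montage
}

def to_mesh_sequence(eeg, positions, rows=MESH_ROWS, cols=MESH_COLS):
    """Convert a 1D chain-like recording (channels x time) into a
    2D mesh-like sequence (time x rows x cols); unmapped cells stay zero."""
    n_channels, n_samples = eeg.shape
    mesh = np.zeros((n_samples, rows, cols), dtype=eeg.dtype)
    for ch, (r, c) in positions.items():
        mesh[:, r, c] = eeg[ch]
    return mesh

def sliding_windows(mesh_seq, win_len, step):
    """Divide a mesh sequence into equal-duration segments (EEG samples)."""
    n = mesh_seq.shape[0]
    return np.stack([mesh_seq[s:s + win_len]
                     for s in range(0, n - win_len + 1, step)])
```

DEAP's preprocessed signals are sampled at 128 Hz, so a one-second window would correspond to win_len=128; the actual window length and step used in the paper are not stated in the abstract.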
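The cascaded architecture could be assembled as in the following hedged tf.keras sketch: a per-frame CNN captures spatial correlation within each 2D mesh, and an LSTM models the time dependency across frames. Filter counts, kernel sizes, and the single LSTM layer are placeholder choices rather than the paper's configuration; the CASC_CNN_CNN variant would replace the LSTM with a further CNN over the temporal axis.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_casc_cnn_lstm(win_len=128, rows=9, cols=9):
    """Illustrative cascaded CNN-LSTM for binary valence classification."""
    inputs = layers.Input(shape=(win_len, rows, cols, 1))
    # Apply the same small CNN to every mesh frame in the window
    # to learn spatial correlation among adjacent electrodes.
    x = layers.TimeDistributed(
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"))(inputs)
    x = layers.TimeDistributed(
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"))(x)
    x = layers.TimeDistributed(layers.Flatten())(x)
    # LSTM over the sequence of per-frame spatial feature vectors
    # to learn the temporal dependency of the EEG sequence.
    x = layers.LSTM(128)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # binary valence
    return models.Model(inputs, outputs)
```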