Abstract: The memorability of a video is a metric that describes how memorable the video is. Memorable videos are highly valuable, and automatically predicting the memorability of large numbers of videos has applications in digital content recommendation, advertisement design, education systems, and so on. This study proposes a framework based on global and local information to predict video memorability. The framework consists of three components: global context representation, spatial layout, and local object attention. In the experiments, global context representation and local object attention yield strong results, and spatial layout also contributes substantially to the prediction. Finally, the proposed model improves on the performance of the baseline of the MediaEval 2018 Media Memorability Prediction Task.