Abstract: Knowledge graph completion aims to infer missing fact triples from the existing triples (head entity, relation, tail entity) in a knowledge graph. Existing research focuses primarily on the structural information of the knowledge graph and overlooks the other modalities it contains, which may also help completion. In addition, because general pre-trained models typically do not encode task-specific knowledge, incorporating task-related knowledge into modal information extraction becomes crucial. Moreover, because different modal features contribute differently to knowledge graph completion, preserving the useful multimodal information is a significant challenge. To address these issues, this study proposes a multimodal knowledge graph completion method that incorporates task knowledge. The method first uses a multimodal encoder fine-tuned on the current task to obtain entity vector representations for each modality. A modal fusion-filtering module based on recurrent neural networks then removes task-irrelevant multimodal features. Finally, a simple isomorphic graph network represents and updates all features to perform multimodal knowledge graph completion. Experimental results demonstrate that the approach extracts information effectively from different modalities, and that the additional multimodal filtering and fusion strengthens entity representations and thereby improves performance on multimodal knowledge graph completion.