Abstract:In healthcare domain, the care process is critical for the care quality. Clinical pathway (CP), which integrates a lot of medical knowledge, is a tool for standardizing the care process. However, most of existing CPs are designed by experts with limited experience and data, and consequently they are always static and non-adaptive for implementation. According to authors' previous work, topic-based CP mining is an effective approach which can discover the process model from clinical data. The various clinical activities are summarized into several topics by latent dirichlet allocation (LDA), and each clinical day in the patient trace is converted to a topic distribution. A CP model can be derived by applying process mining method on the topic-based sequences. However, LDA ignores the similarity between clinical days, which means that in some cases, two similar days may be assigned quite different topic distributions. This paper proposes an optimized topic model for clinical topic discovering by incorporating the similarity constraint, which is based on the domain knowledge. Experiments on real data demonstrate that this new approach can discover quality topics which are useful for topic-based CP mining.