Abstract: This paper introduces a system of structural integrity constraints for XML (XSICs) that specifies five structural relationships between paths or nodes in XML documents: path implication, path co-occurrence, path mutual exclusion, obligatory inclusion, and exclusive inclusion. It defines the syntax and semantics of these XSICs and studies their core role in XML query optimization. Based on the concept of a sub-path, it proposes an algorithm for minimizing path expressions in the presence of XSICs. Using the path implication closure as a tool, the algorithm can not only eliminate redundant nodes and predicates effectively but also identify invalid path expressions. Experimental results demonstrate the effectiveness and efficiency of the proposed minimization algorithm.
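
To make the idea of closure-based minimization concrete, the following is a minimal illustrative sketch and not the algorithm proposed in this paper. It assumes a deliberately simplified model in which each predicate of a step is a plain label, a path implication is a pair (p, q) read as "wherever p exists, q exists", and a mutual exclusion is a pair of labels that can never both exist. The function names `implication_closure` and `minimize_predicates` and the sample labels are hypothetical.

```python
def implication_closure(implications):
    """Map each label to the set of labels it implies (reflexive, transitive)."""
    graph = {}
    nodes = set()
    for src, dst in implications:
        graph.setdefault(src, set()).add(dst)
        nodes.update((src, dst))
    closure = {}
    for start in nodes:
        seen = {start}
        stack = [start]
        while stack:  # depth-first walk over implication edges
            cur = stack.pop()
            for nxt in graph.get(cur, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        closure[start] = seen
    return closure


def minimize_predicates(predicates, implications, exclusions):
    """Return a reduced predicate list, or None if the expression is invalid.

    Assumption: all predicates apply to the same context node, so they
    must hold together.
    """
    closure = implication_closure(implications)

    def implied_by(p):
        return closure.get(p, {p})

    # Everything that must exist when all predicates hold.
    required = set()
    for p in predicates:
        required |= implied_by(p)

    # Two mutually exclusive labels both required: no document can match.
    for a, b in exclusions:
        if a in required and b in required:
            return None

    # Greedily drop a predicate when another remaining predicate implies it.
    kept = list(dict.fromkeys(predicates))  # de-duplicate, keep order
    for p in list(kept):
        if any(p in implied_by(q) for q in kept if q != p):
            kept.remove(p)
    return kept
```

Under these assumptions, a step such as `book[author][title][isbn]` with the constraint "isbn implies title" reduces to `book[author][isbn]`, while a step that requires two labels whose closures conflict with a mutual exclusion is reported as invalid by returning `None`. Checking the remaining candidates rather than the original list keeps one representative when two predicates imply each other.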