Abstract:Computing a repair for inconsistent XML documents is significant in applications. But getting an optimum repair is a NP complete problem, especially when XML documents violate both the function dependence and the key constraints. This paper proposes a cost-based heuristic algorithm, which can find a repair with the lowest cost in polynomial time. It first scans the original XML documents once to get the inconsistent data. Then it computes the general candidate repairs for each inconsistent data, and gets a whole document repair heuristicallybased on its cost. The experimental evaluation show that even when XML documents are large, with high percent of dirty elements, and against many different constraints, the algorithm can still run in less than O(n3) w.r.t. the size of inconsistent elements.