Abstract: Highest cumulative reward (HCR) is a rule for developing conventions in multi-agent systems. However, once a convention has emerged, HCR keeps the system locked into it and prevents evolution toward more rational conventions as the conditions of the system change. In this paper, the notion of a convention is defined and the stability of conventions is analyzed. Furthermore, two rules, highest average reward (HAR) and highest recent reward (HRR), are introduced. Both guarantee that stable conventions can continue to evolve, and both have a better convergence rate than HCR.