Abstract: In the big data era, database systems face three challenges. First, traditional empirical optimization techniques (e.g., cost estimation, join order selection, knob tuning) cannot meet the high-performance requirements of large-scale data, varied applications, and diverse users. Learning-based techniques are needed to make databases more intelligent. Second, many database applications need AI algorithms, e.g., image search in a database. AI algorithms can be embedded into the database, so that database techniques accelerate the AI algorithms and AI capabilities are provided inside the database. Third, traditional databases focus on general-purpose hardware (e.g., CPU) and cannot fully utilize new hardware (e.g., ARM, GPU, AI chips). Moreover, besides the relational model, the tensor model can be used to accelerate AI operations. Thus, new techniques are needed to make full use of new hardware. To address these challenges, an AI-native database is designed. On the one hand, AI techniques are integrated into the database to provide self-configuring, self-optimizing, self-monitoring, self-diagnosis, self-healing, self-assembling, and self-security capabilities. On the other hand, the database is enabled to provide AI capabilities through declarative languages, lowering the barrier to using AI. This study introduces five levels of AI-native databases and presents several open challenges in designing an AI-native database. Autonomous database knob tuning, a deep reinforcement learning based optimizer, machine learning based cardinality estimation, and an autonomous index/view advisor are taken as examples to showcase the advantages of AI-native databases.