Abstract: Request load balancing is a core issue in metadata management for distributed file systems. Maximizing the throughput of the metadata service requires a load balancing framework that adapts to the request load. This paper presents a distributed cache framework layered above existing distributed metadata management schemes; it achieves request load balancing by managing only hotspots rather than all metadata. Compared with existing distributed metadata load balancing frameworks, its two-tier load balancing structure is more flexible and has a better view of the overall load. It also avoids the hotspot redistribution and namespace structure destruction caused by metadata migration. Metadata differs from data in distinct ways: individual items are small and there are many of them. As a result, prefetching metadata that goes unused costs much less than prefetching data that goes unused. Building on this observation, a time period-based prefetching strategy and a prefetching-based adaptive replacement cache algorithm are devised to improve the performance of the distributed caching layer and adapt it to constantly changing workloads. Finally, the presented approach is evaluated on a Hadoop Distributed File System cluster.