Mortise: Composable and Customizable Memory Allocator Framework
Author:
Affiliation:

Clc Number:

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    Dynamic memory allocators are fundamental components of modern applications. They manage free memory and handle user memory requests. Modern general-purpose dynamic memory allocators ensure the balance of performance and memory footprint. However, in view of different memory footprints and optimization goals in application scenarios, a general-purpose memory allocator is not the optimal solution. Special-purpose memory allocators for specific application scenarios usually can better satisfy system requirements. However, they are time-consuming and error-prone to implement. Developers often use the memory allocation framework to build special-purpose dynamic memory allocators. However, the existing memory allocator framework has the problems of poor abstraction ability and insufficient composability and customizability. For this reason, this study proposes a composable and customizable dynamic memory allocator framework, namely mortise, based on function composability by reviewing the dynamic memory allocation process from the perspective of functional programming. The framework abstracts system memory allocation as a composition of hierarchical functions of several multiple decoupled memory allocations, and these functions can provide policies to ensure higher customizability and composability. Mortise is implemented by using standard C. To achieve zero performance overhead of hierarchical function composition, mortise uses the metaprogramming features offered by the C preprocessor. Developers can quickly build a memory allocator for targeted application scenarios by composing and customizing the hierarchical function of allocators. In order to prove the effectiveness of mortise, this study presents three different memory allocator instances, namely tlsfcc, hslab, and wfslab, by using mortise. Specifically, tlsfcc is designed for multi-core embedded application scenarios, which improves the parallel throughput by replacing the synchronization strategy; hslab is a core-aware slab-type allocator, which optimizes performance on heterogeneous hardware by customizing thread cache; wfslab is a low-latency and wait-free/lock-free allocator. This study runs benchmarks to compare these allocators with several existing memory allocators. The experiments are carried out on an 8-core x86/64 platform and an 8-core heterogeneous aarch64 embedded platform, and the experimental results show that tlsfcc achieves a mean speedup of 1.76 and 1.59 on the two platforms compared with the original tlsf allocator; hlsab achieves only 69.6% and 85.0% execution time compared with the tcmalloc with a similar architecture; the worst-case memory request latency of wfslab is the smallest among all memory allocators in the experiment, including the state-of-art lock-free memory allocators: mimalloc and snmalloc.

    Reference
    Related
    Cited by
Get Citation

欧阳湘臻,朱怡安,史先琛.榫卯:一种可组合的定制化内存分配框架.软件学报,2024,35(4):2076-2098

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:April 17,2022
  • Revised:August 02,2022
  • Adopted:
  • Online: June 28,2023
  • Published: April 06,2024
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063