Writeback I/O Scheduler Based on Small Synchronous Writes

Fund Project:

National Basic Research Program of China (973) (2015CB352400); National Natural Science Foundation of China (61572487, 61672513, 61572377, U1401258, 61550110250); Science and Technology Planning Project of Guangdong Province (2015B010129011, 2016A030313183); Shenzhen Overseas High-Caliber Personnel Innovation Funds (KQCX20140521115045446); Research Program of Shenzhen (JSGG20150512145714248, JSGG20160229200957727)

    Abstract:

    Small synchronous writes are pervasive across computing environments and appear at every level of the software stack, from device drivers to application software. Behind a block interface, a small write can cause serious write amplification, which can substantially degrade overall I/O performance. To address this issue, this paper presents a block I/O scheduler named Hitchhike. Hitchhike identifies small writes and embeds them into other data blocks through data compression. With Hitchhike, a writeback buffer can be enabled for small synchronous writes, which not only removes the write amplification but also dramatically improves the performance of small synchronous writes. Hitchhike is implemented on top of the Deadline I/O scheduler in Linux 2.6.32 and evaluated with the Filebench benchmark. Test results show that, compared to traditional approaches, Hitchhike improves the performance of small synchronous writes by up to 48.6%.
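
The embedding idea in the abstract can be illustrated with a minimal user-space sketch. It is not the authors' kernel implementation: the abstract does not specify the block layout or the compression algorithm, so the 4 KB block size, the use of zlib, the 4-byte header, and the names `embed` and `extract` are all assumptions made for illustration. The sketch compresses a full "host" block and, if the space freed is large enough, stores a small write payload in the same block.

```python
import struct
import zlib

BLOCK_SIZE = 4096                # assumed block size; not stated in the abstract
HEADER = struct.Struct("<HH")    # hypothetical header: (compressed length, payload length)


def embed(host_block: bytes, small_write: bytes):
    """Try to pack small_write into the slack left after compressing host_block.

    Returns a BLOCK_SIZE-byte packed block, or None if the payload does not fit.
    """
    if len(host_block) != BLOCK_SIZE:
        raise ValueError("host block must be exactly BLOCK_SIZE bytes")
    compressed = zlib.compress(host_block)
    needed = HEADER.size + len(compressed) + len(small_write)
    if needed > BLOCK_SIZE:
        return None              # not enough slack: the small write cannot hitch a ride
    packed = HEADER.pack(len(compressed), len(small_write)) + compressed + small_write
    return packed.ljust(BLOCK_SIZE, b"\x00")   # pad back to a full block


def extract(packed: bytes):
    """Recover (host_block, small_write) from a block produced by embed()."""
    comp_len, small_len = HEADER.unpack_from(packed)
    start = HEADER.size
    host = zlib.decompress(packed[start:start + comp_len])
    small = packed[start + comp_len:start + comp_len + small_len]
    return host, small
```

In this sketch one block write carries both the host data and the small payload, which is the sense in which a separate small write is avoided. How Hitchhike chooses host blocks, tracks embedded data, and handles recovery is described in the paper itself and is not modeled here.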

Citation

Liu X, Jiang S, Wang Y, Fan XP, Xu CZ. Writeback I/O scheduler based on small synchronous writes. Ruan Jian Xue Bao/Journal of Software, 2017,28(8):1968-1981

History
  • Received: August 7, 2016
  • Revised: September 21, 2016
  • Online: August 15, 2017