EPJ Web Conf.
Volume 214, 2019
23rd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2018)
Number of page(s): 8
Section: T4 - Data handling
Published online: 17 September 2019
Best Practices in Accessing Tape-Resident Data in HPSS*
Scientific Data & Computing Center, Brookhaven National Laboratory,
* Corresponding author: firstname.lastname@example.org (or email@example.com)
Tape is an excellent choice for archival storage because of its capacity, cost per GB, and long retention intervals; its main drawback is slow access time, a consequence of the sequential nature of the medium. Modern enterprise tape drives now support Recommended Access Ordering (RAO), which is designed to reduce data recall times. The mass storage system at BNL SDCC currently holds more than 100 PB of data on tape, managed by HPSS. HPSS version 7.5.1 introduced a new feature called “Tape Order Recall” (TOR), which supports both RAO and non-RAO drives. It can improve file access performance by 30% to 60% over random file access. Prior to HPSS 7.5.1, we used in-house scheduling software known as ERADAT, which accesses files in order of their logical position on tape. ERADAT has performed well over the past decade of use at BNL. In this paper we present a series of test results comparing the performance of TOR and ERADAT under different configurations, to show how effective TOR (RAO) and ERADAT are and to identify the best solution for recalling data from SDCC's tape storage.
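To make the idea of recalling files in logical position order concrete, the following is a minimal Python sketch. It is not ERADAT or TOR code; the names `RecallRequest`, `order_recalls`, and `seek_distance`, as well as the integer `position` field, are hypothetical and chosen only for illustration. The sketch groups pending requests by tape and sorts each group by position, then uses the total distance travelled along a simplified linear tape as a crude proxy for seek cost. It does not attempt to model RAO, which relies on information from the drive itself.

```python
from collections import defaultdict
from typing import NamedTuple


class RecallRequest(NamedTuple):
    """A pending file recall (hypothetical structure, for illustration only)."""
    path: str
    tape_id: str
    position: int  # assumed: logical position of the file on its tape


def order_recalls(requests):
    """Group requests by tape, then sort each group by logical position."""
    by_tape = defaultdict(list)
    for req in requests:
        by_tape[req.tape_id].append(req)
    return {
        tape: sorted(reqs, key=lambda r: r.position)
        for tape, reqs in by_tape.items()
    }


def seek_distance(reqs):
    """Total distance travelled when reading in the given order.

    Assumes a linear tape and a head starting at position 0; this is a
    crude proxy for seek cost, not a model of a real drive.
    """
    total, here = 0, 0
    for r in reqs:
        total += abs(r.position - here)
        here = r.position
    return total


requests = [
    RecallRequest("/data/a", "T1", 900),
    RecallRequest("/data/b", "T1", 100),
    RecallRequest("/data/c", "T2", 50),
    RecallRequest("/data/d", "T1", 500),
]

ordered = order_recalls(requests)
arrival_order_t1 = [r for r in requests if r.tape_id == "T1"]
print(seek_distance(arrival_order_t1), seek_distance(ordered["T1"]))
```

In this toy input, reading tape T1 in arrival order jumps back and forth along the tape, whereas the sorted order makes a single forward pass; the gap between the two printed numbers illustrates why ordered recall can outperform random access.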
© The Authors, published by EDP Sciences, 2019
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.