Overview:
Standby should be used for infrequently accessed data that is still directly available for reference, retrieval, or analysis. Standby can act as an intermediary location: a space to organize and prepare research data for long-term retention. Data stored in Standby does not need to be recalled or downloaded for access; it is available immediately.
Standby is accessible via desktop or the transfer cluster on O2, and provides a degree of data protection similar to that of Active storage, including snapshots and off-site backup.
How to Access:
Standby Storage can be provisioned over both NFS and SMB. Windows and macOS clients typically access shared storage over SMB, whereas Linux hosts access Standby Storage via NFS.
To access and transfer data to/from Standby and the O2 cluster:
- Log in to the transfer cluster at transfer.rc.hms.harvard.edu; /n/standby is not available from O2 login or compute nodes.
- To migrate data from an existing O2 folder to a Standby folder, we recommend you utilize this approved method:
- rsync - Copies files either to or from a remote host, or locally on the current host
- (e.g. rsync --archive --verbose --log-file=/path/to/logfile.txt /path/to/source/data /path/to/destination)
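Before starting a large migration, it can help to preview what rsync would transfer. The following is a minimal sketch, assuming a POSIX-style shell on the transfer cluster with rsync installed. The function name standby_sync is hypothetical (it is not part of any HMS tooling), and the paths in the example call are placeholders. The "preview" mode adds rsync's --dry-run option so nothing is written; the "copy" mode uses the same options as the example command above.

```shell
# Illustrative helper (hypothetical name): preview or perform an rsync copy.
# Usage: standby_sync preview|copy SRC DEST [LOGFILE]
standby_sync() {
    mode=$1
    src=$2
    dest=$3
    log=$4

    case "$mode" in
        preview)
            # --dry-run lists what would be transferred without writing anything.
            rsync --archive --verbose --dry-run "$src" "$dest"
            ;;
        copy)
            # Real transfer, with the same options as the recommended command.
            rsync --archive --verbose --log-file="$log" "$src" "$dest"
            ;;
        *)
            echo "usage: standby_sync preview|copy SRC DEST [LOGFILE]" >&2
            return 2
            ;;
    esac
}

# Example calls with placeholder paths (replace with your own):
# standby_sync preview /path/to/source/data /path/to/destination
# standby_sync copy    /path/to/source/data /path/to/destination /path/to/logfile.txt
```

Note that rsync treats a trailing slash on the source as meaningful: /path/to/source/data creates a data folder inside the destination, while /path/to/source/data/ copies only the contents of data.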
Data Transfer Validation
- The du command should NOT be used for transfer validation; it reports physical disk usage, which changes with disk pool, compression, etc., all of which are “invisible” to the end user. The most definitive tool for copy/transfer with validation is rsync, since it checks time, date, ownership, and group and, most importantly, can do a checksum on the file’s contents.
- See above for example rsync command
- For individual file validation, use an md5 checksum
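The two validation approaches above can be sketched as follows, assuming a Linux host where rsync and the GNU md5sum utility are available. The function names rsync_verify and md5_compare are hypothetical and the paths are placeholders. rsync_verify re-runs rsync with --checksum and --dry-run, so file contents are compared but nothing is written; md5_compare checks a single pair of files.

```shell
# Illustrative helper (hypothetical name): list anything that still differs
# between a source and a destination, comparing file contents by checksum.
# --dry-run guarantees that no data is written.
rsync_verify() {
    rsync --archive --checksum --dry-run --itemize-changes "$1" "$2"
}

# Illustrative helper (hypothetical name): compare the MD5 digests of two files.
# Prints "match" and returns 0 when they agree, prints "MISMATCH" and returns 1
# when they differ, and returns 2 if either file is missing.
md5_compare() {
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "md5_compare: both arguments must be existing files" >&2
        return 2
    fi
    # Reading from stdin keeps the file name out of md5sum's output.
    sum_a=$(md5sum < "$1" | cut -d ' ' -f 1)
    sum_b=$(md5sum < "$2" | cut -d ' ' -f 1)
    if [ "$sum_a" = "$sum_b" ]; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}

# Example calls with placeholder paths (replace with your own):
# rsync_verify /path/to/source/data /path/to/destination
# md5_compare /path/to/source/data/file.dat /path/to/destination/data/file.dat
```

When rsync_verify prints no itemized lines, rsync found nothing it would need to transfer or update. Any line it does print identifies a file or attribute that differs between the two locations.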
To transfer data to/from Standby and research.files (collaborations):
- To connect to Standby from a desktop, a user needs to connect to (aka mount) standby.files, similar to how a user mounts research.files. How to proceed depends on whether you have a Windows or Mac computer:
Windows:
How users can connect to an HMS file server (like research.files or standby.files) via Windows
\\standby.files.med.harvard.edu
Mac OSX:
How users can connect to an HMS file server (like research.files or standby.files) via Mac
smb://standby.files.med.harvard.edu/
Once you mount/access the top level of standby.files you will have to navigate to the directory that you are interested in. You will see many folders to which you do not have access.
- Begin to 'drag and drop' folders from research.files into the Standby ‘collaborations’ subfolder
IMPORTANT: At this time, please use only the drag-and-drop method to transfer data to Standby. Users should refrain from copying and pasting data or using the Windows “Move to Folder” option, as this has led to confusion with duplicate datasets. Please do not migrate data from /n/files to the Standby ‘collaborations’ subfolder using the transfer server on O2; this will interfere with the existing permissions structure.
- Note: Transferring files while using VPN will be quite slow and can interfere with other users accessing HMS resources over the VPN.
- If a remote desktop is not available, please contact rdmhelp@hms.harvard.edu to discuss further options.
Permissions
When a Standby folder is created, HMS Storage will add two subfolders within the lab folder, called ‘collaborations’ and ‘compute’, which are designed to mirror permissions set on Active storage. The ‘compute’ subfolder will store data generated from computations on O2, and the ‘collaborations’ subfolder will store data originating from collaboration folders (research.files). Please make sure data are moved into the correct subfolder, depending on the source location; this is necessary to maintain correct permissions in Standby.
Directory Structure
For new labs, a Standby storage folder will be created as part of the Active storage request, to encourage integration of Standby in your lab’s data workflow. If you belong to an established lab, you can request creation of a Standby folder by filling out the Standby Storage Form. Modification of the directory structure is available to all users of the directory. Researchers are advised to structure the folders to correspond to how the records were generated, to complement proposed or existing workflows.
Storage Limits
Filesystem quotas will be applied to all lab-level directories. They enable HMS to appropriately scale stored research data and manage its lifecycle. Defining and limiting filesystem sizes allows for adequate provisioning of Standby capacity to the HMS community.
Storage Limit Notifications
Storage limits are designed to allow researchers to continue to conduct research while optimizing the use and cost of their storage, and can be modified to accommodate changing research needs. When a folder grows above the pre-established limit, automatic email notifications are sent to the responsible party (often the PI) and the Lab Data Manager (if one has been designated by the PI) so they can work with lab members to manage the lab’s storage usage. If you are the responsible party for more than one folder, you will receive individual emails with details about each folder.
Eligibility
HMS, HSDM, and HSPH quad-based faculty and staff are currently eligible to use Standby storage. Other HMS affiliates may also qualify for access. Please email rdmhelp@hms.harvard.edu to discuss available storage options.
Request Storage:
Allocation amounts depend on lab needs and available resources.
To request storage: Standby Storage Request Form
Contact:
Research Data Management: rdmhelp@hms.harvard.edu