Securely store and maintain your research data
HMS offers three primary storage types — Active, Standby, and Cold — each designed with distinct behaviors, performance, and means of access. Storage options listed here focus on HMS resources, so eligibility requirements may apply. Institutions around the Longwood Medical Area (LMA) maintain additional storage solutions, so comparing and selecting the best storage option for your research is essential.
The HMS Research Data Management (RDM) team collaborates with researchers to navigate the complexities of data storage and dissemination throughout the research data lifecycle. Proper storage maintenance throughout the lifecycle is imperative to ensure data remains secure and adheres to recommended safety protocols.
All tools and dashboards maintained by HMS IT display storage amounts in tebibytes (TiB). For filesystems that utilize snapshots (refer to “Data Protection” within the HMS Storage Comparison Chart below), deleted data can be recovered from snapshots within 60 days of the last weekly snapshot taken before the deletion. Snapshots are taken weekly, along with daily snapshots retained for 14 days. Please note the temporary modification to the Standby storage snapshot retention schedule summarized in the “Upcoming data center relocation” news article.
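Because HMS IT tools report tebibytes, a size quoted in decimal terabytes (as on many instruments and vendor datasheets) shows up as a smaller number on the dashboards. The following is a minimal Python sketch of the conversion using the standard unit definitions (1 TiB = 2^40 bytes, 1 TB = 10^12 bytes). The function names are illustrative and are not part of any HMS tool.

```python
BYTES_PER_TIB = 2 ** 40   # tebibyte (binary unit)
BYTES_PER_TB = 10 ** 12   # terabyte (decimal unit)


def bytes_to_tib(num_bytes: int) -> float:
    """Convert a byte count to tebibytes."""
    return num_bytes / BYTES_PER_TIB


def tb_to_tib(terabytes: float) -> float:
    """Convert decimal terabytes to tebibytes."""
    return terabytes * BYTES_PER_TB / BYTES_PER_TIB


if __name__ == "__main__":
    # A dataset advertised as 10 TB occupies roughly 9.09 TiB.
    print(f"{tb_to_tib(10):.2f} TiB")
```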
To learn more about best practices and support services for research data lifecycles, visit the Harvard Biomedical Data Management website or contact the HMS Research Data Management team at rdmhelp@hms.harvard.edu.
-
Eligibility
Eligibility varies by storage type.
Researchers Staff
Quad Status: Active Compute (O2), Standby, Cold
Quad A: Available*, Available*, Available
Quad B: Up to 10TiB, Up to 10TiB, TBD, Up to 10TiB
Quad C: Up to 10TiB, Up to 10TiB, TBD, Up to 10TiB
None**: Up to 10TiB (fee-based), Up to 10TiB (fee-based), N/A
* Allocation amounts are dependent on lab needs and available resources.
** None refers to HMS professors at affiliate hospitals without an appointment in an HMS Basic Science department.
Additional information about quad status is available in the External Use of HMS High-Performance Computing (HPC) policy.
-
Security
Level 3 security across all storage options
Each storage offering integrates robust security measures to protect and preserve research data. Deleted data can be recovered from snapshots within 60 days of the last weekly snapshot taken before the deletion. Snapshots are taken weekly, along with daily snapshots retained for 14 days.
- All storage offerings displayed support up to Harvard Security Level 3.
- An active HMS account ID is required to access HMS Storage offerings.
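The recovery window described above can be estimated with a few lines of date arithmetic. This is a sketch under stated assumptions: the weekday on which weekly snapshots run is not given here, so `snapshot_weekday` is a hypothetical parameter, and a snapshot taken on the same calendar day as the deletion is conservatively ignored. Confirm actual dates with HMS IT before relying on them.

```python
from datetime import date, timedelta

WEEKLY_RECOVERY_DAYS = 60  # recovery window stated in the text above


def last_weekly_snapshot_before(deleted_on: date, snapshot_weekday: int) -> date:
    """Most recent weekly snapshot date strictly before the deletion date.

    snapshot_weekday follows date.weekday(): Monday is 0, Sunday is 6.
    It is an assumed value, not a documented HMS schedule.
    """
    days_back = (deleted_on.weekday() - snapshot_weekday) % 7
    if days_back == 0:
        days_back = 7  # ignore a snapshot taken on the deletion day itself
    return deleted_on - timedelta(days=days_back)


def weekly_recovery_deadline(deleted_on: date, snapshot_weekday: int) -> date:
    """Last day on which the deleted data should still be recoverable."""
    snapshot = last_weekly_snapshot_before(deleted_on, snapshot_weekday)
    return snapshot + timedelta(days=WEEKLY_RECOVERY_DAYS)


if __name__ == "__main__":
    # Example: file deleted on Wednesday 2024-05-15, snapshots assumed on Sundays.
    print(weekly_recovery_deadline(date(2024, 5, 15), snapshot_weekday=6))
```

Note that the window counts from the snapshot, not from the deletion, so the effective time left after a deletion is somewhat shorter than 60 days.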
-
HMS Storage Comparison Chart
Storage options description and overview
Description
- Active Compute (O2): Shared group or project folders connected to the High Performance Compute Cluster (O2).
- Scratch (O2): Temporary storage (days to weeks) for data that can be easily regenerated (transient files). An automatic deletion process periodically removes files.
- Active Collaborations (research.files): Shared group or project drives designed for active research data. Allows individuals to share documents and files within and outside of their department.
- Standby: Shared group folders for infrequently accessed data; available for reference/retrieval.
- Cold: Long-term storage of inactive research data; retained to meet data retention requirements.
Use Cases
- Active Compute (O2): Active research data that is frequently accessed, modified, or computed against. Next-gen sequencing analysis, molecular dynamics, mathematical modeling, image analysis, proteomics, and other research areas.
- Scratch (O2): Pipeline that writes intermediate files that can be easily regenerated and do not require permanent retention. Temporary storage location for additional data analysis outside of Active Compute.
- Active Collaborations (research.files): Storage location for data derived from tools, instruments, or desktops. Allows individuals to share documents and files within or outside of their department/group.
- Standby: An intermediary location; data associated with lab members who have recently departed or projects in the process of being completed.
- Cold: Examples found in the Cold Storage Frequently Asked Questions documentation.
Filesystem Paths
- Active Compute (O2): /n/data1, /n/data2, /n/groups
- Scratch (O2): /n/scratch3
- Active Collaborations (research.files): research.files, /n/files
- Standby: /n/standby, standby.files
- Cold: N/A
Location
- Active Compute (O2): HPC (O2), o2.hms.harvard.edu. Transfer cluster: transfer.rc.hms.harvard.edu
- Scratch (O2): HPC (O2), o2.hms.harvard.edu. Transfer cluster: transfer.rc.hms.harvard.edu
- Active Collaborations (research.files): Local workstation (research.files). Viewable in O2 (/n/files)
- Standby: Local workstation (standby.files). Viewable in O2 (/n/standby)
- Cold: Amazon Web Services (Cloud)
Read/Write Speed
- Active Compute (O2): Fast
- Scratch (O2): Fast
- Active Collaborations (research.files): Moderate
- Standby: Moderate
- Cold: N/A
Eligibility
- Active Compute (O2): Available on request to researchers at HMS. Fee-based for affiliate hospitals.
- Scratch (O2): Available on request to researchers at HMS and its affiliate hospitals. 10TiB per user*
- Active Collaborations (research.files): Available on request to researchers at HMS. Storage limits may apply based on eligibility.
- Standby: Available on request to researchers at HMS. Fee-based for affiliate hospitals.
- Cold: Available on request to researchers at HMS. Storage limits may apply based on eligibility.
Data Protection
- Active Compute (O2): Snapshots and disaster recovery.
- Scratch (O2): No snapshots or backups.
- Active Collaborations (research.files): Snapshots and disaster recovery.
- Standby: Snapshots and disaster recovery.
- Cold: Off-site disaster recovery.
Cost to HMS
- Active Compute (O2): Very High
- Scratch (O2): Very High
- Active Collaborations (research.files): High
- Standby: Medium
- Cold: Low
Request Storage
- Active Compute (O2): Active Compute (O2) Storage Request Form
- Scratch (O2): To Create a User Scratch Directory
- Active Collaborations (research.files): Active Collaborations Storage Request Form
- Standby: Standby Storage Request Form
- Cold: Cold Storage Request Form
* Labs can inquire with Research Computing to discuss collaborative options (e.g., a group storage space).
-
Active Compute (O2)
Shared high-performance computing environment
High-performance Computing
The O2 cluster is a shared high-performance computing environment with dedicated hardware available for high-memory and GPU-intensive tasks. Active Compute comprises shared group or project folders accessible when signed in to the O2 cluster or the transfer cluster. It is intended for active research data that is frequently accessed, modified, or computed against.
- Access: Log in to the O2 or transfer cluster
- Eligibility: The O2 cluster is available to researchers at HMS and its affiliate hospitals. An HMS ID is required. Use of O2 CPUs and GPUs is free for labs in HMS Basic Science departments, and fee-based for affiliates. If ineligible, please contact HMS Research Computing to discuss further options.
- Notifications: When a folder grows above the pre-established storage limit, automatic email notifications are sent to the responsible party (often the PI) and the Lab Data Manager (if the PI has designated one), so they can work with lab members to manage the lab's storage usage.
Contact
- To request storage, complete the Active Compute Storage request form. Allocation amounts are dependent on lab needs and available resources.
- Contact Research Computing Consultants: rchelp@hms.harvard.edu
Resources
-
Active Scratch
Temporary compute storage
High-performance Computing
Scratch space is available on O2, mounted at /n/scratch/. It is intended to be used as temporary storage (days to weeks) for data that can be easily regenerated, such as temporary files written during a single job on the HMS High-Performance Compute (HPC) Cluster.
Note – Store non-temporary data elsewhere: an automatic deletion process periodically removes files from scratch.
- Access: Create a user scratch directory
- Eligibility: An O2 account is required to use the scratch storage space. Additional information is provided in the resources below. 25TiB is provided per user.
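The intended usage pattern, keeping regenerable intermediates in scratch and copying only the results worth retaining to permanent storage, might look like the sketch below. It is an illustration only: `run_with_scratch`, the file names, and the directory arguments are hypothetical, and the actual scratch and results paths depend on your O2 account and lab allocation.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_scratch(scratch_root: Path, results_dir: Path) -> Path:
    """Write intermediates under scratch_root and keep only the final result.

    scratch_root stands in for a user scratch directory and results_dir for a
    permanent storage location; both are placeholders supplied by the caller.
    """
    results_dir.mkdir(parents=True, exist_ok=True)
    # Per-job working directory inside scratch, removed when the block exits.
    with tempfile.TemporaryDirectory(dir=scratch_root, prefix="job_") as workdir:
        intermediate = Path(workdir) / "intermediate.txt"
        intermediate.write_text("regenerable intermediate data\n")

        final = Path(workdir) / "result.txt"
        final.write_text(intermediate.read_text().upper())

        # Copy the result that must be retained out of scratch before cleanup.
        return Path(shutil.copy2(final, results_dir / final.name))
```

Cleaning up the working directory at the end of the job, rather than waiting for the automatic deletion process, also keeps usage well under the per-user allocation.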
Contact
Contact Research Computing Consultants: rchelp@hms.harvard.edu.
Resources
- Additional information about scratch storage is available on the Research Computing Confluence web pages.
-
Active Collaborations (research.files)
Shared group folders
Department shares Storage
Active Collaborations are shared group or project folders designed for active research data that is frequently accessed or modified. Active Collaborations allows individuals to share documents and files with colleagues, both within and outside their department.
- Access – Shared group or project drives accessed from desktops and laptops. Also available on the O2 cluster as /n/files.
- Eligibility – Requires at least two co-owners who can add, edit, and remove files and grant additional user access. If you seek access to an existing collaboration, you need to have the manager or owner of the collaboration grant access.
- Notifications – When a folder grows above the pre-established limit, automatic email notifications are sent to the responsible party (often the principal investigator) and the Lab Data Manager (if the principal investigator has designated one) so they can work with lab members to manage the lab's storage usage. Storage limit notifications have been applied to all Active Collaboration folders over 1TiB in size.
Contact
- To request storage, complete the Active Collaborations storage request form. Allocation amounts are dependent on lab needs and available resources.
- Contact Research Data Management: rdmhelp@hms.harvard.edu
Resources
-
Standby storage
Standby storage suits data accessed less frequently
Transitional Intermediate
We recommend Standby storage for infrequently accessed data that is still directly available for reference, retrieval, or analysis. It can act as an intermediary location, a space to organize and prepare research data for long-term retention. Data stored in Standby does not need to be recalled or downloaded for access; it is available immediately.
Standby is accessible via desktop or the transfer cluster on O2 and provides a similar degree of data protection as Active storage, including snapshots and off-site backup. Please note the temporary modification to the Standby storage snapshot retention schedule summarized in the “Upcoming data center relocation” news article.
- Access – Standby Storage can be provisioned in both NFS and SMB. Windows and macOS clients typically access shared storage over SMB, whereas Linux hosts access Standby Storage via NFS. HMS Storage will provide two subfolders within the lab folder called collaborations and compute, which are designed to mirror permissions set on Active storage. The compute subfolder will store data generated from computations on O2, and the collaborations subfolder will store data originating from collaboration folders (research.files). Ensure data are moved into the correct subfolder, depending on the source location; this is necessary to maintain correct permissions in Standby.
- Eligibility – HMS, HSDM, and HSPH Quad-based faculty and staff are currently eligible to use Standby storage. Other HMS affiliates may also qualify for access. Email rdmhelp@hms.harvard.edu to discuss available storage options.
- Notifications – When a folder grows above the pre-established limit, automatic email notifications are sent to the responsible party (often the principal investigator) and the Lab Data Manager (if the principal investigator has designated one) so they can work with lab members to manage the lab's storage usage.
- Storage limits – Standby storage enforces filesystem quotas to optimize storage capacity. Exceeding these limits triggers email notifications to the responsible party and the Lab Data Manager. Storage limits are flexible and can be adjusted to meet changing research needs.
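The subfolder rule in the Access item above can be expressed as a small lookup. The sketch below assumes the Active storage paths listed in the HMS Storage Comparison Chart (/n/data1, /n/data2, and /n/groups for Active Compute; /n/files for Active Collaborations); the function name and the example paths are hypothetical, and the layout of your lab folder in Standby may differ.

```python
from pathlib import PurePosixPath

# Source roots taken from the comparison chart; adjust if your lab differs.
COMPUTE_ROOTS = ("/n/data1", "/n/data2", "/n/groups")
COLLABORATION_ROOTS = ("/n/files",)


def _is_under(path: PurePosixPath, root: str) -> bool:
    """True if path is root itself or lies somewhere below it."""
    root_parts = PurePosixPath(root).parts
    return path.parts[:len(root_parts)] == root_parts


def standby_subfolder(source: str) -> str:
    """Return the Standby subfolder that matches the data's source location."""
    path = PurePosixPath(source)
    if any(_is_under(path, root) for root in COMPUTE_ROOTS):
        return "compute"
    if any(_is_under(path, root) for root in COLLABORATION_ROOTS):
        return "collaborations"
    raise ValueError(f"Unrecognized source location: {source}")


if __name__ == "__main__":
    print(standby_subfolder("/n/data1/example_lab/run_01"))   # compute
    print(standby_subfolder("/n/files/example_dept/report"))  # collaborations
```

Raising an error for unrecognized sources, instead of guessing, avoids placing data where permissions would not match the original location.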
Learn more about how to access and transfer data across HMS filesystems.
Contact
To request storage, complete the Standby Storage Request Form. Allocation amounts are dependent on lab needs and available resources.
If you have any questions, email Research Data Management at rdmhelp@hms.harvard.edu.
-
Cold storage
Cold storage is designed for long-term retention of inactive data
Long-term Project completion
Cold storage is a low-cost data storage service intended for long-term storage of inactive research data, such as after project completion, that must be retained to meet data retention requirements. Data identified for Cold storage should not be expected to be accessed or retrieved except in rare and unexpected circumstances. Due to the slow transfer rates and costs of transfer (currently covered by HMS), only the minimum amount of data needed should be retrieved. The current Cold Storage offering does not support any sharing or distribution. If you are interested in data sharing and distribution, email rdmhelp@hms.harvard.edu.
Learn more about the Cold storage offering.
- Access – Direct access to the files in Cold storage will be limited to HMS IT. Metadata associated with the migrated files will be viewable using your group’s Starfish Zone dashboard and corresponding manifest files. HMS IT will perform the data migrations to Cold storage while it continues to investigate options for self-service. HMS IT will scan the dataset prior to the data transfer to document the files to be moved, and will inform the lab if it encounters any unsupported files during the pre-migration screening process.
- Eligibility – Cold storage is available to labs whose PIs have a primary or secondary faculty appointment in an HMS Quad department. If ineligible, HMS IT encourages labs to contact their affiliate institutions to discuss other long-term storage options. There is currently no cost for eligible labs to utilize Cold Storage. HMS IT is currently assuming all costs associated with the administration of Cold storage (including storage and transfer).
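Since files in Cold storage are not directly accessible afterward, a lab may want its own record of what a dataset contained before requesting a migration. The sketch below is not the HMS IT scan or the Starfish manifest format; it is a hypothetical, lab-side inventory that lists each file's relative path, size, and SHA-256 checksum in a CSV file.

```python
import csv
import hashlib
from pathlib import Path


def write_manifest(dataset_dir: Path, manifest_path: Path) -> int:
    """Write one CSV row per file under dataset_dir and return the file count.

    Save the manifest outside dataset_dir so it does not list itself.
    """
    count = 0
    with manifest_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["relative_path", "size_bytes", "sha256"])
        for path in sorted(dataset_dir.rglob("*")):
            if not path.is_file():
                continue
            digest = hashlib.sha256()
            with path.open("rb") as data:
                # Read in 1 MiB chunks so large files do not exhaust memory.
                for chunk in iter(lambda: data.read(1024 * 1024), b""):
                    digest.update(chunk)
            writer.writerow([
                path.relative_to(dataset_dir).as_posix(),
                path.stat().st_size,
                digest.hexdigest(),
            ])
            count += 1
    return count
```

Keeping the checksums alongside the lab's records makes it possible to verify any files that are later retrieved.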
Examples for when it is appropriate to move data to Cold storage:
- Project Completion
- Grant or funding agency data retention requirements for project datasets
- Journal stipulations for data retention of published data
- Harvard institutional policy stipulating 7-year retention of “essential research records”
- Data retained for intellectual property purposes (e.g., patents)
- Inactive data associated with a departed lab member
- Raw data difficult to regenerate, associated with completed projects
Contact
To request that data be moved to or retrieved from Cold Storage, please complete the Cold Storage Request Form. You can also contact the Research Data Management team: rdmhelp@hms.harvard.edu.