HMS IT invests in several projects, driven by research needs and informed by research priorities, to support the research enterprise at Harvard Medical School.
This page provides a summary of those projects, including their scope and timelines. We hope this will be informative, and we welcome participation (advice and testing) from interested members of the research community.
If you have questions, contact the Research Computing Lead for each project or email us at rchelp@hms.harvard.edu.
Active Projects
Electronic Lab Notebooks
Description
In recent years, Research Computing (RC) has implemented a Pilot ELN service based on eLabNext technology. We are now engaged in a Transition to Operation project aimed at fully establishing and supporting the ELN service. The primary objective of the service remains to help principal investigators set electronic documentation standards for their laboratories, in accordance with the findable, accessible, interoperable, and reusable (FAIR) resource guidelines set forth by the NIH. The service also provides a platform for labs to track samples effectively and integrate them seamlessly with experiment documentation.
Objectives
- Define the ELN service
- Decide on the user authentication method (HMS ID or HarvardKey)
- Decide whether we should move to a private cloud
- Clarify roles and responsibilities for IT and non-IT personnel
- Define the eLabNext onboarding, offboarding, and support models
- Plan long-term platform financing
- Ensure that STAT provisioning processes are complete
- Provide effective online support tools
- Market and launch a production ELN service
Stakeholders
This project, led by Research Computing under the guidance of Lindsey Sudbury, is advised by the HMS ELN Advisory Committee, which includes individuals drawn from the HMS community: Steve Blacklow, David Corey, Lucas Farnung, Kristen Buttinger, Katherine Stebbins, Dan Wainstock, Elain Martin, Johanna Gutlerner, Lindsey Sudbury, Caroline Shamu, Jim Gould, Kelly Arnett, Paula Montero Llopis, Marie Bao, Jeremy Muhlich, Alon Oyler-Yaniv, and David Heitmeyer.
If you are interested in more details, please contact: Lindsey Sudbury (lindsey.sudbury@hms.harvard.edu)
Status
Implementation:
- As of June 27, 2024, there are 107 activated users from 10 Cores and 11 labs.
- The ELN service model is being refined, including processes for user onboarding and offboarding, training, and support.
- The service governance model is being defined.
- The service website has been updated, with more updates to follow.
- Best practice guidelines are being developed by a non-IT working group.
Research Data Visualization Service
Description
In Phase II of the Research Data Visualization Platform (RDVP) project, a Pilot platform was implemented and delivered. It is architected on cloud infrastructure, providing a redundant, expandable design and a highly available environment that makes applications accessible to collaborators whenever they need access to the data.
Phase III of the project will deliver a production Research Data Visualization Service that meets the needs of the majority of identified HMS use cases. It will also define policies for use of the platform and develop improved processes to ensure a secure and stable service.
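To give a concrete sense of what the platform hosts, the sketch below shows a minimal interactive app of the kind that groups publish to a Posit Connect server. It uses Shiny for Python purely for illustration; many RDVP apps are written in R, and the framework, contents, and publishing workflow for any given group may differ.

# Minimal illustrative Shiny for Python app; not an actual RDVP application.
# Requires the "shiny" package (pip install shiny); run locally with: shiny run app.py
from shiny import App, render, ui

app_ui = ui.page_fluid(
    ui.h3("Sample size explorer"),
    ui.input_slider("n", "Number of observations", min=10, max=1000, value=100),
    ui.output_text("summary"),
)

def server(input, output, session):
    @output
    @render.text
    def summary():
        # Reactive text that updates whenever the slider moves.
        return f"The current selection is {input.n()} observations."

app = App(app_ui, server)

An app like this could then be pushed to a Connect instance with the rsconnect-python tooling (for example, rsconnect deploy shiny .), although the publishing route documented for a given RDVP group may differ.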
Objectives
- Develop policies regarding cost allocation and technical service limits, and potentially the provision of a paid service for non-Quad users
- Identify requirements that are not being met by services provided in RDVP Phase II
- Enhance existing RDVP services to provide a data publishing solution for the majority of identified HMS research use cases for which a solution is reasonably achievable
- Ensure that users have access to documentation that allows them to publish from both O2 and shinyapps.io
- Define a support model for both application support and back-end support
Status
Phase II: Complete. Pilot System Delivered
The team has developed a reusable cloud-based architecture and is currently supporting Posit Connect servers for 12 different HMS groups that are publishing their own apps. In addition, the team is collaborating closely with the Center for Computational Biomedicine (CCB) to support an on-premises Posit Connect server, enabling the CCB to support other groups while developing their new apps.
Phase III: Implementation:
As of June 27, 2024, there are 21 Posit Connect instances in operation supporting the RDVP service. Operational procedures, including monthly updates and user notifications, are running smoothly and effectively. The team has met many newly identified user requirements, such as handling large data volumes and providing database access. Recently, a cost-effective method for collecting cost data from the AWS/EKS Connect instances was identified, and the team is analyzing those data.
Software Containerization
Description
The O2 Containerization Production Service project aims to create a production-quality system for managing and deploying high-performance computing (HPC) workloads in a containerized environment. The goal is to provide a scalable, flexible, and efficient way of running HPC workloads, reducing the complexity of deployment and making it easier to manage software dependencies while leveraging the existing cluster scheduling system.
This project builds on earlier work done as part of the Containerization Pilot project.
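As a general illustration of the pattern (and not the O2 service's actual interface), the sketch below submits a containerized job through a Slurm scheduler from Python. The job settings, image path, and analysis command are hypothetical placeholders.

# Illustrative sketch only: submit a containerized job to a Slurm cluster.
# Assumes sbatch (Slurm) and Apptainer/Singularity are available on the host;
# the image path and analysis command are hypothetical placeholders.
import subprocess
import tempfile

BATCH_SCRIPT = """#!/bin/bash
#SBATCH --job-name=containerized-demo
#SBATCH --time=00:30:00
#SBATCH --mem=4G

# Run the analysis inside the container image so that software dependencies
# travel with the image rather than with the host environment.
apptainer exec /path/to/analysis.sif python3 /opt/analysis/run.py
"""

def submit() -> None:
    # Write the batch script to a temporary file and hand it to sbatch,
    # which prints the assigned job ID on success.
    with tempfile.NamedTemporaryFile("w", suffix=".sbatch", delete=False) as f:
        f.write(BATCH_SCRIPT)
        script_path = f.name
    result = subprocess.run(
        ["sbatch", script_path], capture_output=True, text=True, check=True
    )
    print(result.stdout.strip())

if __name__ == "__main__":
    submit()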
Objectives
- Define policies for building, scanning, using, and storing containers for containerized workflows on O2
- Improve the robustness and functionality of the container scanning tool chain
- Provide additional tools to facilitate container building
- Facilitate improved container organization and management
- Provide RCCs with training to enable them to support the build/scan process
- Improve documentation of the O2 Containerization service
Stakeholders
This Research Computing project will be assisted by HMS community members. The list of key community stakeholders will be finalized and published prior to the commencement of the project.
If you are interested in more details, please contact: Amir Karger (Amir_Karger@hms.harvard.edu)
Status
On Hold:
- Research Computing continues to support a limited Pilot containerization service.
- The production service project will commence as soon as appropriate DevOps and DevSecOps resources are available.
Research Data Migration
Description
HMS IT is upgrading to a new storage system designed to manage large amounts of data quickly and efficiently. The Research Data Migration project aims to streamline the storage architecture into a centralized, stable, and secure infrastructure.
O2 home folders and scratch space have already been migrated to the new storage successfully, yielding several benefits and efficiencies:
- Automated deduplication has increased the total available capacity in these directories by 35%.
- Automated data compression functionality has reduced data traffic to the storage system by 60%.
- The new storage system reduced energy usage by 75% compared to energy use on Isilon.
- The storage footprint has been reduced by 66%, leading to significant savings in future data center operational costs.
Stakeholders
This Research Computing project will be assisted by HMS community members.
If you are interested in more details, contact rchelp@hms.harvard.edu.
Status
In progress: Review the current progress on the Research Data Migration page.
Completed Projects
Cold Storage
Description
The goal of this project is to create an IT service that supports long-term storage and data management workflows for the HMS research community. The initial step in this initiative is a solution for storing inactive data that prioritizes preservation and low cost over accessibility; we use the term ‘Cold Storage’ to describe this storage tier.
Objectives
- Validate research use cases with pilot participants
- Develop data movement and identification functionality, and set up storage targets and applicable network infrastructure
- Connect the data movement tool with the storage target and the required network to enable a minimal viable workflow
- Pilot cold storage service with research participants
- Transition piloted service to operations
Stakeholders
This project was executed by Research Computing in collaboration with IT Infrastructure, along with seven labs that participated in pilot testing.
Status
Completed: Cold Storage is now available for use.
Project Dates
- December 2020 to April 2023
FISMA Moderate Secure Enclave
Description
HMS IT will establish a FISMA Moderate certified institutional secure enclave for use by HMS researchers for studies that require FISMA-certified environments for sensitive data, such as large protected health information data sets. Ultimately, this environment will achieve FedRAMP certification, which will support multiple FISMA Moderate research projects as well as CMS and other regulated data sets.
DBMI is seeking FISMA Moderate certification for an IT platform, BioData Catalyst, funded by the NIH National Heart, Lung, and Blood Institute (NHLBI) as part of an extensive national collaborative research program that increases access to NHLBI datasets and innovative data analysis capabilities. NHLBI has required HMS IT to manage this FISMA environment. HMS IT will use the BioData Catalyst project as the initial FISMA Moderate pilot, which will establish a model for other HMS research projects.
Objectives
- Implement a FISMA Moderate IT environment in AWS as a pilot in coordination with Paul Avillach (DBMI and the NHLBI/NIH)
- Identify organizational gaps that must be addressed to support ongoing FISMA Moderate operations
- Assess reproducibility, scalability, and service model to achieve FedRAMP certification.
Stakeholders
This project is a collaboration with Paul Avillach (DBMI) and the areas of HMS IT Research Computing, Infrastructure, and Security and Compliance. Other stakeholders include the Center for Computational Biomedicine and the Departments of Health Care Policy, Global Health, and Biomedical Informatics.
If you are interested in more details, please contact: Bill Barnett (wbarnett@hms.harvard.edu)
Status
Completed:
The BioData Catalyst FISMA Moderate environment is in production. Third-party assessment of the environment was completed successfully, and a formal, three-year Authority to Operate (ATO) was granted by the National Heart, Lung, and Blood Institute in March 2021.
Project Dates
- July 2019 to March 2021
GPU Investments
Description
Adding new GPUs and storage to support cutting-edge research.
This project will design and implement new GPU (graphics processing unit) and related data storage architectures to support research at HMS. It is driven by new research needs arising from image analysis and data science (e.g., machine learning) applications, both generally across the Quad and for Blavatnik Institute needs, including those from the Center for Computational Biomedicine and Foundry awards. It builds on a prior project to assess the performance of the new NVIDIA RTX 6000 single-precision GPU card, which determined that the RTX 6000 performs very well when tested by HMS IT and a number of different research labs.
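For researchers planning machine learning or image analysis workloads, a quick check like the one below confirms which GPUs a job's Python environment can actually see. It assumes PyTorch is installed and is not specific to O2 or to any particular GPU model.

# Quick sanity check of GPU visibility from within a job's Python environment.
# Assumes PyTorch is installed; the devices reported depend on the allocation.
import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 2**30:.1f} GiB")
else:
    print("No CUDA-capable GPU is visible to this process.")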
Objectives
- Identify Research Needs
- Develop Technical Specifications to meet Research Needs
- Implement and operate new GPU environments
Stakeholders
This Research Computing project is benefiting from participation of HMS community members including Wei-Chung Lee, the Peter Sorger lab, and SBGrid who contributed their RTX 6000 test results and will provide advice. We also would like to thank the many members of the HMS research community, including the Debora Marks lab, for their responses to our GPU needs survey. Data from both these sources is helping to inform the design and specification of additional GPU purchases for the HMS research community.
If you are interested in more details, please contact: Amir Karger (Amir_Karger@hms.harvard.edu)
Status
Completed:
- HMS IT has expanded the O2 cluster's GPU capacity with funding from the Blavatnik Institute. The addition of 71 new NVIDIA GPUs, including 44 RTX 8000 and 21 Tesla V100 cards, significantly increases O2's GPU computing capabilities to support HMS research. At this time, the additional GPUs are available only for labs with a primary or secondary appointment in a pre-clinical HMS department.
- Further details can be found on the O2 GPU Wiki page
Project Dates
- April 29, 2020 to March 30, 2021
OMERO System Refresh
Description
The current OMERO server hardware is outdated and needs to be refreshed. Our goal is to set up a virtual infrastructure with an up-to-date operating system and migrate the OMERO application and data to a platform that will improve performance and reliability while replacing the aging infrastructure.
Objectives
- Design and approval of new production and development system architecture
- Replace the existing hardware with scalable infrastructure that can run all OMERO microservices and can be fully supported on an ongoing basis
- Establish a change control process to ensure regular planned system maintenance
- Provide monitoring and infrastructure visibility
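For context on how the refreshed platform is typically consumed, the sketch below connects to an OMERO server from Python using the omero-py gateway and lists the projects visible to an account. The host name and credentials are placeholders; connection details for the HMS deployment are not shown here.

# Illustrative omero-py sketch: connect to an OMERO server and list projects.
# Requires the omero-py package; host, username, and password are placeholders.
from omero.gateway import BlitzGateway

conn = BlitzGateway("username", "password", host="omero.example.org", port=4064)
try:
    if not conn.connect():
        raise RuntimeError("Could not connect to the OMERO server")
    # Iterate over the projects this account can see.
    for project in conn.getObjects("Project"):
        print(project.getId(), project.getName())
finally:
    conn.close()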
Stakeholders
This project is led by Research Computing with guidance from HMS constituents in the Laboratory of Systems Pharmacology, the MicRoN Core, the Nikon Imaging Center, and the Neurobiology Imaging Facility, along with faculty advisors, and is implemented in collaboration with Glencoe.
If you are interested in more details, please contact: Neil Coplan (Neil_Coplan@hms.harvard.edu)
Status
Completed:
- The project team has completed architectural design of the new system, deployed to a development environment, and tested extensively. Cutover to the new system was successfully completed on March 11th, 2021.
Open OnDemand for O2 (O2 Portal)
Description
This project designed and implemented an Open OnDemand (OOD) web-based environment (https://openondemand.org/) as a production Research Computing service on the O2 cluster, lowering the barrier for non-expert O2 users to access tools and software. Open OnDemand provides a single point of entry to O2 services through a web portal: it offers job composition, submission, and monitoring features; supports file management and editing; and provides interactive login shells and remote desktops. It also supports Jupyter computational notebooks, RStudio, MATLAB, and other applications in the O2 environment.
The resultant service offering has been named the “O2 Portal”.
Objectives
- Identify Research Needs
- Develop Implementation Specifications to meet research needs.
- Implement and operate an Open OnDemand service
Stakeholders
This project was initiated by the Research Computing group to fill a gap that was recognized in RC’s service offerings rather than to meet the requirements of specific HMS labs. In order to assess OOD requirements, RC reached out to members of the following labs: the Churchman lab, the Seidman lab, the Reich lab, and the Farhat lab.
Status
Complete:
- The O2 Portal is in production, available to all O2 users
- As of September 2022, researchers from 91 different labs/groups are using the O2 Portal. Some example feedback from pilot users:
- “My experience with the pilot has been overall really positive. It’s a nice intuitive UI that makes it easy to submit jobs” – Neurobiology researcher
- “I haven’t noticed a difference in latency when running Rstudio via OOD or my local machine. This opens up a slew of uses for O2 that didn’t exist before.” – Center for Computational Biomedicine staff
- “This is such an incredible resource for the users, I am very excited about it” – Genetics researcher
- “I’ve been telling other members of my lab about it… I know a lot of people who are going to want to use it” – Genetics researcher
- “I’m thrilled to be able to use this [graphical] pipeline now on O2. I’m able to log onto OOD and the ClearMap viewer worked perfectly…. This is soooo helpful” – Neurobiology researcher
- “I really like the drag-and-drop file transfer – it’s very convenient” – Systems Biology researcher
If you are interested in more details, please contact: Amir Karger (Amir_Karger@hms.harvard.edu)
Project Dates
- April 2020 to May 2022
Research Cores Facilities Management System
Description
HMS is implementing a single robust and flexible core facility management system that lessens the burden and cost of managing services on staff, HMS administrators, and researchers and provides improved financial integration and reporting. This system implementation will aid in the billing, scheduling, and administration of Core services. In 2019, HMS executed a community-driven RFP process and selected a vendor and solution (Stratocore's PPMS system).
Objectives
- Implement a new, centralized, scalable instance of PPMS for HMS Cores
- Integrate with Harvard's Identity and Access Management system to enable single sign-on for Harvard IT account owners
- Integrate with Harvard's Oracle Financials to automate the invoicing of internal or external Core customers
- Roll out PPMS and onboard all HMS Cores over a multi-year timeline, including training of Core staff
Stakeholders
There are currently 35 HMS-sponsored Cores that play a crucial role in enhancing research competitiveness, securing research funding, and supporting collaboration for HMS researchers and beyond. This project is a collaboration with the HMS Cores, Research Operations, and the areas of HMS IT Information Systems and Research Computing.
If you are interested in more details, please contact: Bill Barnett (wbarnett@hms.harvard.edu)
Status
Completed
Project Dates
- July 2019 to June 2022
Research Data Visualization Platform - Phase I
Description
In Phase I, the deployment of on-premises infrastructure and RStudio Connect software was completed, and the system is being used by the Center for Computational Biomedicine (CCB).
Objective
- Provided RStudio Connect Servers for the Center for Computational Biomedicine (CCB).
Status
Production system delivered
Storage Usage Reporting and Visualization (SURV)
Description
The project's goal was to provide accurate, interactive reports and visualizations of storage usage for stakeholders across HMS to proactively manage short and long-term data management needs. The previous reporting solution was highly manual, slow, labor-intensive, and error-prone.
Objectives
By optimizing storage usage reporting, the HMS IT Research Data Management team and other stakeholders can more effectively:
- Help researchers better locate and manage their data.
- Reduce time and cost to produce storage reports and improve quality of storage reporting.
- Identify which departments, labs, and individuals utilizing HMS storage resources are driving storage growth, in order to improve accountability, manage expansion and IT spending, and balance immediate and long-term research data management needs.
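As a purely hypothetical illustration of the kind of aggregation that sits behind such reporting, the sketch below rolls up per-lab usage from a flat usage export. The file name and column names are invented for this example and do not reflect the actual SURV implementation.

# Hypothetical illustration of storage-usage aggregation; the CSV name and
# its columns (department, lab, bytes_used) are invented for this sketch.
import pandas as pd

usage = pd.read_csv("storage_usage.csv")

per_lab = (
    usage.groupby(["department", "lab"], as_index=False)["bytes_used"]
    .sum()
    .sort_values("bytes_used", ascending=False)
)
per_lab["tib_used"] = per_lab["bytes_used"] / 2**40  # convert bytes to TiB

# The top consumers are a natural starting point for dashboards and for
# conversations about growth, accountability, and spending.
print(per_lab.head(10).to_string(index=False))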
Stakeholders
The project focused on gathering requirements, validating the storage reporting needs, and building solutions for labs, departments, cores, administrators, and HMS IT teams and leadership.
If you are interested in more details, please contact: Jessica Pierce (rdmhelp@hms.harvard.edu)
Status
Completed:
Lab and departmental dashboards of the Storage Usage Monitor have been developed and tested by stakeholders. Underlying infrastructure upgrades to allow access without VPN are complete.
Project Dates
- July 2019 to August 2022