HMS IT invests in several projects, driven by research needs and informed by research priorities, to support the research enterprise at Harvard Medical School. 

This page provides a summary of those projects, including their scope and timelines. We hope it is informative, and we welcome participation (advice and testing) from interested members of the research community.

If you have questions, contact the Research Computing lead for each project or email us at rchelp@hms.harvard.edu.

Active Projects

  • Electronic Lab Notebooks

    Description

    The ELN Pilot project addresses the increasingly important needs of efficient, collaborative research: improving research data capture and quality, strengthening research reproducibility, and enabling the sharing of data and results. The goal is to support principal investigators in establishing electronic documentation standards for their labs, such as alignment with the findable, accessible, interoperable, and reusable (FAIR) principles recommended by the NIH. An extensive collaborative discovery effort between Research Computing (RC) and interested members of the HMS research community led RC to select eLabNext for a 3-year on-premises pilot.

    Objectives

    • Pilot Electronic Lab Notebook (ELN) products in collaboration with at least 3 representative user groups
    • Integrate eLabNext with HMS IT systems for authentication and storage
    • Develop a production ELN service model, to include training, support, and user documentation
    • Determine the next steps for providing a fully supported ELN service to the HMS community

    Stakeholders

    This project, led by Research Computing, is advised by a steering committee of the following individuals drawn from the HMS community: Caroline Shamu, William Barnett, Rachel Cahoon, Jeremy Muhlich, David Corey, Elaine Martin, Stephen Blacklow, Johanna Gutlerner, and David Smallwood.
    If you are interested in more details or would like to express interest in becoming a future pilot lab, please contact Bill Barnett (wbarnett@hms.harvard.edu).

    Status

    Implementation:

    • As of May 2023, participants in the eLabNext pilot include members of the David Corey lab (Neurobiology), the George Church lab (Genetics), the Charles Weitz lab (Neurobiology), the Lucas Farnung lab (Cell Biology), the Dennis Kasper lab (Immunology), the MicRoN imaging core facility, the Center for Macromolecular Interactions, the Neurobiology Imaging Facility, and BCMP Proteomics.
    • The platform has been integrated with HMS IT security systems and authenticates users through HMS federated sign-on.
    • The ELN service model has been developed, including processes for quarterly software updates, user onboarding and off-boarding, training, and support.
  • Research Data Visualization Service

    Description

    In Phase II of the Research Data Visualization Platform (RDVP) project, a pilot platform was implemented and delivered. It is architected on cloud infrastructure, providing a redundant, expandable design and a highly available environment that keeps applications accessible to collaborators whenever they need access to the data.

    Phase III of the project will deliver a production Research Data Visualization Service that meets the needs of the majority of identified HMS use cases. It will also define policies for use of the platform and develop improved processes to ensure a secure and stable service.

    Objectives

    • Develop policies regarding cost allocation and technical service limits, and potentially the provision of a paid service for non-Quad users
    • Identify requirements that are not being met by services provided in RDVP Phase II
    • Enhance existing RDVP services to provide a data publishing solution for the majority of identified HMS research use cases for which a solution is reasonably achievable
    • Ensure that users have access to documentation that allows them to publish from both O2 and shinyapps.io (a brief deployment sketch follows this list)
    • Define a support model for both application support and back-end support
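
    To make the publishing workflow concrete, here is a minimal sketch of a Python app that could be published to a Posit Connect instance using the rsconnect-python client. It is illustrative only: the app and its dataset are invented, the server URL and nickname below are placeholders, and R users would follow the analogous rsconnect workflow from RStudio.

      # app.py -- a minimal Dash app, purely illustrative (not an actual RDVP application)
      import dash
      from dash import dcc, html
      import plotly.express as px

      app = dash.Dash(__name__)
      df = px.data.iris()  # example dataset bundled with plotly

      app.layout = html.Div([
          html.H3("Example visualization"),
          dcc.Graph(figure=px.scatter(df, x="sepal_width", y="sepal_length",
                                      color="species")),
      ])

      # Posit Connect serves the app through its underlying Flask/WSGI object.
      server = app.server

      if __name__ == "__main__":
          app.run(debug=False)

    Publishing is then a one-time server registration followed by a deploy, along the lines of "rsconnect add --server https://connect.example.hms.harvard.edu --name hms --api-key <key>" and "rsconnect deploy dash . --name hms"; again, the URL and nickname are invented for illustration.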

    Status

    Phase II: Complete. Pilot System Delivered

    The team has developed a reusable cloud-based architecture and currently supports Posit Connect servers for 12 different HMS groups that publish their own apps. In addition, the team is collaborating closely with the Center for Computational Biomedicine (CCB) to support an on-premises Posit Connect server, enabling the CCB to assist other groups as they develop new apps.

    Phase III: Planning

    The team is currently gathering requirements for the production service and exploring methods for tracking costs and allocating them to specific Posit Connect instances and apps.

  • Software Containerization

    Description

    The O2 Containerization Production Service project aims to create a production-quality system for managing and deploying high-performance computing (HPC) workloads in a containerized environment. The goal is to provide a scalable, flexible, and efficient way of running HPC workloads, reducing the complexity of deployment and making it easier to manage software dependencies while leveraging the existing cluster scheduling system.

    This project builds on earlier work done as part of the Containerization Pilot project.
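
    As a rough illustration of the workflow this service targets, here is a minimal sketch of composing and submitting a containerized batch job from Python. It assumes the SLURM scheduler and a Singularity-style container runtime, consistent with the description above; the partition name, container image path, and workload are hypothetical placeholders, not the pilot's actual configuration.

      #!/usr/bin/env python3
      """Sketch: compose and submit a containerized job on a SLURM cluster.

      The partition, image path, and workload below are placeholders.
      """
      import subprocess
      import tempfile

      JOB = """#!/bin/bash
      #SBATCH --partition=short
      #SBATCH --time=01:00:00
      #SBATCH --mem=4G
      #SBATCH --job-name=container-demo

      # Run the workload inside a scanned, approved container image
      # (both the image path and the command are placeholders).
      singularity exec /path/to/approved_image.sif python3 /opt/analysis/run.py
      """

      def submit() -> None:
          # Write the job script to a temporary file, then hand it to sbatch.
          with tempfile.NamedTemporaryFile("w", suffix=".sbatch", delete=False) as fh:
              fh.write(JOB)
          # sbatch prints "Submitted batch job <id>" on success.
          out = subprocess.run(["sbatch", fh.name], capture_output=True,
                               text=True, check=True)
          print(out.stdout.strip())

      if __name__ == "__main__":
          submit()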

    Objectives

    1. Define policies for building, scanning, using, and storing containers for containerized workflows on O2
    2. Improve the robustness and functionality of the container scanning tool chain 
    3. Provide additional tools to facilitate container building
    4. Facilitate improved container organization and management
    5. Provide RCCs with training to enable them to support the build/scan process
    6. Improve documentation of the O2 Containerization service

    Stakeholders

    This Research Computing project will be assisted by HMS community members. The list of key community stakeholders will be finalized and published prior to the commencement of the project.

    If you are interested in more details, please contact Amir Karger (Amir_Karger@hms.harvard.edu).

    Status

    On Hold:

    • Research Computing continues to support a limited Pilot containerization service.
    • The production service project will commence as soon as appropriate DevOps and DevSecOps resources are available.
  • Research Data Migration

    Description 

    HMS IT is upgrading to a new storage system designed to manage large amounts of data quickly and efficiently. The Research Data Migration project aims to streamline the storage architecture into a centralized, stable, and secure infrastructure.

    O2 home folders and scratch have already been migrated successfully to the new storage, with several benefits and efficiencies achieved:

    • Automated deduplication has increased the total available capacity in these directories by 35%.  
    • Automated data compression functionality has reduced data traffic to the storage system by 60%. 
    • The new storage system reduced energy usage by 75% compared to energy use on Isilon. 
    • The storage footprint has been reduced by 66%, leading to significant savings in future data center operational costs. 

    Stakeholders 

    This Research Computing project will be assisted by HMS community members.  

    If you are interested in more details, contact rchelp@hms.harvard.edu.

    Status 

    In progress: Review the current progress on the Research Data Migration page.

Completed Projects

  • Cold Storage

    Description

    The goal of this project was to create an IT service that supports long-term storage and data management workflows for the HMS research community. The initial step in this initiative created a solution for the storage of inactive data that prioritizes preservation and low cost over accessibility; the term ‘Cold Storage’ describes this storage tier.

    Objectives

    • Validate research use cases with pilot participants
    • Develop data movement and identification functionality, and set up storage targets and the applicable network infrastructure
    • Connect the data movement tool with the storage target and the required network to enable a minimal viable workflow
    • Pilot cold storage service with research participants
    • Transition piloted service to operations

    Stakeholders

    This project was executed by Research Computing in collaboration with IT Infrastructure, along with seven labs that participated in pilot testing.

    Status

    Completed: Cold Storage is now available for use. 

    Project Dates

    • December 2020 to April 2023
  • FISMA Moderate Secure Enclave

    Description

    HMS IT will establish a FISMA Moderate certified institutional secure enclave for use by HMS researchers in studies that require FISMA-certified environments for sensitive data, such as large protected health information data sets. Ultimately, this environment will achieve FedRAMP certification, which will support multiple FISMA Moderate research projects as well as CMS and other regulated data sets.

    DBMI is seeking FISMA Moderate certification for BioData Catalyst, an IT platform funded by the NIH National Heart, Lung, and Blood Institute (NHLBI) as part of an extensive national collaborative research program that increases access to NHLBI datasets and innovative data analysis capabilities. NHLBI has required HMS IT to manage this FISMA environment. HMS IT will use the BioData Catalyst project as the initial FISMA Moderate pilot, which will establish a model for other HMS research projects.

    Objectives

    • Implement a FISMA Moderate IT environment in AWS as a pilot, in coordination with Paul Avillach (DBMI) and the NHLBI/NIH
    • Identify organizational gaps that must be addressed to support ongoing FISMA Moderate operations
    • Assess reproducibility, scalability, and service model to achieve FedRAMP certification.

    Stakeholders

    This project is a collaboration with Paul Avillach (DBMI) and the HMS IT areas of Research Computing, Infrastructure, and Security and Compliance. Other stakeholders include the Center for Computational Biomedicine and the Departments of Health Care Policy, Global Health, and Biomedical Informatics.

    If you are interested in more details, please contact Bill Barnett (wbarnett@hms.harvard.edu).

    Status

    Completed: 

    The BioData Catalyst FISMA Moderate environment is in production. Third-party assessment of the environment completed successfully, and a formal three-year Authority to Operate (ATO) was granted by the National Heart, Lung, and Blood Institute in March 2021.

    Project Dates

    • July 2019 to March 2021
  • GPU Investments

    Description

    Adding new GPUs and storage to support cutting-edge research.

    This project will design and implement new GPU (graphics processing unit) and related data storage architectures to support research at HMS. It is driven by new research needs in image analysis and data science (e.g., machine learning) applications, arising both generally across the Quad and from the Blavatnik Institute, including the Center for Computational Biomedicine and Foundry awards. It builds on a recent project to assess the performance of the NVIDIA RTX 6000 single-precision GPU card, which determined that the RTX 6000 performs very well when tested by HMS IT and a number of different research labs.

    Objectives

    • Identify Research Needs
    • Develop Technical Specifications to meet Research Needs
    • Implement and operate new GPU environments

    Stakeholders

    This Research Computing project benefited from the participation of HMS community members, including Wei-Chung Lee, the Peter Sorger lab, and SBGrid, who contributed their RTX 6000 test results and provided advice. We also thank the many members of the HMS research community, including the Debora Marks lab, for their responses to our GPU needs survey. Data from both of these sources helped inform the design and specification of additional GPU purchases for the HMS research community.

    If you are interested in more details, please contact Amir Karger (Amir_Karger@hms.harvard.edu).

    Status

    Completed: 

    • HMS IT has expanded the O2 cluster's GPU capacity with funding from the Blavatnik Institute. The addition of 71 new NVIDIA GPUs, including 44 RTX 8000 and 21 Tesla V100 cards, significantly increases O2's GPU computing capabilities to support HMS research. At this time, the additional GPUs are available only to labs with a primary or secondary appointment in a pre-clinical HMS department.
    • Further details can be found on the O2 GPU Wiki page; a brief sketch of requesting one of these GPUs follows.
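
    The sketch below shows how such a request might look, using SLURM's generic resource (GRES) syntax. The partition name and GPU type string are assumptions based on common SLURM conventions; consult the O2 GPU Wiki page for the authoritative values.

      # Sketch: submit a batch job requesting one RTX 8000 via SLURM GRES.
      # The partition name, GPU type string, and workload are assumptions.
      import subprocess

      cmd = [
          "sbatch",
          "--partition=gpu",           # assumed GPU partition name
          "--gres=gpu:rtx8000:1",      # one GPU of the (assumed) rtx8000 type
          "--time=02:00:00",
          "--mem=8G",
          "--wrap=python3 train.py",   # "train.py" is a placeholder workload
      ]
      result = subprocess.run(cmd, capture_output=True, text=True)
      print(result.stdout.strip())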

    Project Dates

    • April 29, 2020 to March 30, 2021
  • OMERO System Refresh

    Description

    The current OMERO server hardware is outdated and needs to be refreshed. Our goal is to set up virtual infrastructure with an up-to-date operating system and migrate the OMERO application and data to a platform that will improve performance and reliability while replacing the aging infrastructure.

    Objectives

    • Design and approval of new production and development system architecture 

    • Replace the existing hardware with scalable infrastructure that can run all OMERO microservices and be fully supported on an ongoing basis

    • Establish a change control process to ensure regular planned system maintenance 

    • Provide monitoring and infrastructure visibility

    Stakeholders

    This project is led by Research Computing, with guidance from HMS constituents in the Laboratory of Systems Pharmacology, the MicRoN core, the Nikon Imaging Center, and the Neurobiology Imaging Facility, advice from faculty, and a collaborative implementation with Glencoe Software.

    If you are interested in more details, please contact Neil Coplan (Neil_Coplan@hms.harvard.edu).

    Status

    Completed: 

    • The project team completed the architectural design of the new system, deployed it to a development environment, and tested it extensively. Cutover to the new system was successfully completed on March 11, 2021.

  • Open OnDemand for O2 (O2 Portal)

    Description

    This project designed and implemented an Open OnDemand (OOD) web-based environment (https://openondemand.org/) as a production Research Computing service on the O2 cluster, lowering the barrier for non-expert O2 users to access tools and software. Open OnDemand provides a single point of entry to O2 services through a web portal: it offers job composition, submission, and monitoring features; supports file management and editing; and provides interactive login shells and remote desktops. It also supports the use of Jupyter computational notebooks, RStudio, MATLAB, and other applications in the O2 environment.

    The resultant service offering has been named the “O2 Portal”.

    Objectives

    • Identify Research Needs
    • Develop Implementation Specifications to meet Research Needs
    • Implement and operate an Open OnDemand service

    Stakeholders

    This project was initiated by the Research Computing group to fill a gap that was recognized in RC’s service offerings rather than to meet the requirements of specific HMS labs. In order to assess OOD requirements, RC reached out to members of the following labs: the Churchman lab, the Seidman lab, the Reich lab, and the Farhat lab.

    Status

    Completed:

    • The O2 Portal is in production, available to all O2 users
    • As of September 2022, researchers from 91 different labs/groups are using the O2 Portal. Some example feedback from pilot users:
      • “My experience with the pilot has been overall really positive. It’s a nice intuitive UI that makes it easy to submit jobs” – Neurobiology researcher
      • “I haven’t noticed a difference in latency when running Rstudio via OOD or my local machine. This opens up a slew of uses for O2 that didn’t exist before.” – Center for Computational Biomedicine staff
      • “This is such an incredible resource for the users, I am very excited about it” – Genetics researcher
      • “I’ve been telling other members of my lab about it… I know a lot of people who are going to want to use it” – Genetics researcher
      • “I’m thrilled to be able to use this [graphical] pipeline now on O2. I’m able to log onto OOD and the ClearMap viewer worked perfectly…. This is soooo helpful” – Neurobiology researcher
      • “I really like the drag-and-drop file transfer – it’s very convenient” – Systems Biology researcher

    If you are interested in more details, please contact Amir Karger (Amir_Karger@hms.harvard.edu).

    Project Dates 

    April 2020 to May 2022

  • Research Cores Facilities Management System

    Description

    HMS implemented a single, robust, and flexible core facility management system that lessens the burden and cost that managing services places on Core staff, HMS administrators, and researchers, and that provides improved financial integration and reporting. The system aids in the billing, scheduling, and administration of Core services. In 2019, HMS executed a community-driven RFP process and selected a vendor and solution (Stratocore's PPMS system).

    Objectives

    • Implement a new, centralized, scalable instance of PPMS for HMS Cores
    • Integrate with Harvard's Identity and Access Management to enable single sign-on for Harvard IT account holders
    • Integrate with Harvard's Oracle Financials to automate the invoicing of internal and external Core customers
    • Roll out PPMS and onboard all HMS Cores over a multi-year timeline, including training of Core staff

    Stakeholders

    There are currently 35 HMS-sponsored Cores that play a crucial role in enhancing research competitiveness, securing research funding, and supporting collaboration for HMS researchers and beyond. This project is a collaboration with the HMS Cores, Research Operations, and the areas of HMS IT Information Systems and Research Computing.

    If you are interested in more details, please contact Bill Barnett (wbarnett@hms.harvard.edu).

    Status

    Completed

    Project Dates

    • July 2019 to June 2022
  • Research Data Visualization Platform - Phase I

    Description

    In Phase I, the deployment of on-premises infrastructure and RStudio Connect software was completed, and the system is in use by the Center for Computational Biomedicine (CCB).

    Objective

    • Provided RStudio Connect Servers for the Center for Computational Biomedicine (CCB).

    Status

    Completed: Production system delivered

  • Storage Usage Reporting and Visualization (SURV)

    Description

    The project's goal was to provide accurate, interactive reports and visualizations of storage usage for stakeholders across HMS to proactively manage short- and long-term data management needs. The previous reporting solution was highly manual, slow, labor-intensive, and error-prone.

    Objectives

    By optimizing storage usage reporting, the HMS IT Research Data Management team and other stakeholders can more effectively:  

    • Help researchers better locate and manage their data. 

    • Reduce time and cost to produce storage reports and improve quality of storage reporting. 

    • Identify which departments, labs, and individuals using HMS storage resources are driving storage growth, in order to improve accountability, manage expansion and IT spending, and balance immediate and long-term research data management needs (a minimal illustrative sketch of such a per-lab roll-up follows this list)
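
    To make this kind of automated roll-up concrete, here is a minimal, purely illustrative sketch that aggregates per-directory usage records to the lab level. The input format, lab names, and paths are invented; the production system's actual data sources and tooling are not described here.

      # Illustrative only: roll per-directory usage records up to lab level.
      # The CSV layout, lab names, and paths are invented for this sketch.
      import io
      import pandas as pd

      raw = io.StringIO("""lab,path,bytes_used
      smith,/n/groups/smith/data,5497558138880
      smith,/n/groups/smith/results,1099511627776
      jones,/n/groups/jones,2199023255552
      """)

      df = pd.read_csv(raw)
      report = (df.groupby("lab")["bytes_used"].sum()
                  .div(1024 ** 4)             # bytes -> TiB
                  .round(2)
                  .rename("TiB_used")
                  .sort_values(ascending=False))
      print(report.to_string())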

    Stakeholders

    The project focused on gathering requirements, validating the storage reporting needs, and building solutions for labs, departments, cores, administrators, and HMS IT teams and leadership. 

    If you are interested in more details, please contact Jessica Pierce (rdmhelp@hms.harvard.edu).

    Status

    Completed:

    Lab and departmental dashboards of the Storage Usage Monitor have been developed and tested by stakeholders. Underlying infrastructure upgrades to allow access without VPN are complete.

    Project Dates

    • July 2019 to August 2022