HMS DevOps to upgrade the Slurm job scheduler

To increase the efficiency and security of the O2 cluster, HMS DevOps will upgrade the Slurm job scheduler starting on Thursday, July 13 at 7:00 PM and concluding on Sunday, July 16 at noon.

Begin: Thursday, July 13 at 7:00 PM
End: Sunday, July 16 at 12:00 PM

The O2 cluster will be offline during the upgrade, so no new jobs will be accepted during this period. To avoid disruption to your work, make sure all running jobs finish before the upgrade begins. Pending jobs that would overlap with the outage window will be allowed to start after the maintenance is completed.

Certain services related to the O2 cluster will be affected during the upgrade period. In particular, the O2 sign-in servers at o2.hms.harvard.edu and the O2 Portal will be offline, and you will not be able to submit or run jobs, including from web-based interfaces. The /n/groups filesystem will also be offline during this maintenance.

Some services will remain available. The O2 transfer servers at transfer.rc.hms.harvard.edu will stay operational, and services that do not rely on the O2 job scheduler will continue to function as usual during the upgrade period.

The upgrade is vital to keep our systems current with necessary security and bug fixes, which in turn improves performance for users. It involves a time-consuming database schema modification, which is why downtime is required. For more information, visit the O2 Cluster Status page.

If you have any questions or concerns, contact Research Computing at rchelp@hms.harvard.edu.

Sincerely,
HMS Research Computing

IT Department