Advice: Euramoo maintenance outages - 2016-09-02 to 2016-09-20 completed


To prepare for the rollout of new high-speed, wide-area data storage services for QRIScloud, we will be moving Euramoo's user /home directories and /30days directories to a different file server. As this data move cannot be done safely while jobs are running on the system, we will be progressively disabling Euramoo's job queues over a two-week period to allow all running jobs to "drain".

IMPORTANT NOTE: As part of this work, we will be purging old files from the "/30days" file system. Files that were created more than 30 days before the day Euramoo goes offline (Monday 19th Sept) are liable to be deleted. If the files are valuable to you, make sure you have a copy stored somewhere else.
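To check which of your files might fall under the purge, something like the following sketch may help. It is not the purge procedure itself, which is not described here. It assumes that the file modification time is a reasonable stand-in for the creation date (Linux file systems often do not record creation time), and the function name files_at_risk is made up for this illustration.

```python
import os
from datetime import datetime, timedelta


def files_at_risk(root, offline_day=datetime(2016, 9, 19), max_age_days=30):
    """Return the files under root last modified before the purge cut-off.

    Assumption: modification time (mtime) approximates the creation date
    mentioned in the notice; the real purge may use a different criterion.
    """
    cutoff = (offline_day - timedelta(days=max_age_days)).timestamp()
    at_risk = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symbolic links are judged on their own mtime
                if os.lstat(path).st_mtime < cutoff:
                    at_risk.append(path)
            except OSError:
                # The file vanished or is unreadable; skip it
                continue
    return sorted(at_risk)
```

For example, files_at_risk("/30days/your_username") would list candidate files to copy elsewhere; the directory layout under /30days is an assumption, so substitute your own path.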

Once we have moved the Euramoo file systems, we will re-enable normal Euramoo job submission and execution.

The proposed schedule is as follows:

Date / time: 2 Sep @ 4pm
Action: Turn off job submission for the LongWallTime queue. Disable starting of jobs in the LongWallTime queue.
System state: Draining jobs with walltime > 7 days

Date / time: 9 Sep @ 4pm
Action: Turn off job submission for the Intel, AMD and BioLinux queues with walltime > 48 hours. Disable starting of jobs with walltime > 48 hours.
System state: Draining jobs with walltime 2 - 7 days

Date / time: 14 Sep @ 4pm
Action: Turn off job submission for the Intel, AMD and BioLinux queues with walltime > 24 hours. Disable starting of jobs with walltime > 24 hours.
System state: Draining jobs with walltime 1 - 2 days

Date / time: 15 Sep @ 4pm
Action: Turn off job submission completely for all queues. Disable starting of any new jobs.
System state: Draining jobs with walltime < 1 day

Date / time: 19 Sep @ 9am
Action:
  • Disable Euramoo user login and force logout any user sessions.
  • Purge old files from /30days
  • Copy the /home and /30days file trees to their new locations.
  • Other maintenance as required
System state: Offline

Date / time: 20 Sep @ 2pm (estimated)
Action: Enable user login, enable job submission and start job queues.
System state: Normal operation
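The cut-offs in the schedule can be expressed as a small check, sketched below, for whether a job could still be submitted at a given time. This is only an illustration of the published dates, not a description of how the scheduler is configured. The function name submission_open is hypothetical, the year 2016 is taken from the notice title, the reopening time is the estimate given above, and the sketch assumes the walltime limits apply only to the Intel, AMD and BioLinux queues as the schedule states.

```python
from datetime import datetime

# Queues that the schedule names for the walltime-based cut-offs
STANDARD_QUEUES = {"Intel", "AMD", "BioLinux"}

# (cut-off time, longest walltime in hours still accepted after it)
WALLTIME_CUTOFFS = [
    (datetime(2016, 9, 9, 16), 48),
    (datetime(2016, 9, 14, 16), 24),
]
LONGWALLTIME_CLOSED = datetime(2016, 9, 2, 16)
ALL_CLOSED = datetime(2016, 9, 15, 16)
REOPEN_ESTIMATE = datetime(2016, 9, 20, 14)  # estimated, may change


def submission_open(queue, walltime_hours, when):
    """Return True if the schedule allows submitting this job at 'when'."""
    if when >= REOPEN_ESTIMATE:
        return True
    if when >= ALL_CLOSED:
        return False
    if queue == "LongWallTime":
        return when < LONGWALLTIME_CLOSED
    if queue in STANDARD_QUEUES:
        # Apply the most recent walltime limit that has taken effect
        for cutoff, max_hours in reversed(WALLTIME_CUTOFFS):
            if when >= cutoff:
                return walltime_hours <= max_hours
    # Assumption: other queues are affected only by the full shutdown
    return True
```

For instance, submission_open("Intel", 72, datetime(2016, 9, 10, 9)) reflects the 48-hour limit that takes effect on 9 Sep. A job accepted by this check may still remain queued until normal operation resumes.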

Please note:

  • The schedule allows for short (< 24 hour) jobs to be submitted up to 4pm on the Friday before the Monday 9am shutdown. 
  • Any jobs that are queued before the respective cut-off times will remain queued until Euramoo is returned to normal operations.

If you cannot wait for Euramoo to be returned to full service, please contact the QRIScloud Help Desk (support@qriscloud.org.au). There may be alternatives for running your jobs on other institutional HPC systems during the outage.

UPDATE: 2016-09-19 10:30 - The file system data transfer is taking longer than was anticipated.  We have updated the estimate for returning Euramoo to full service to 2pm on Tuesday 20th.

UPDATE: 2016-09-20 16:40 - Euramoo is now fully operational.  Unfortunately, due to an incident with file system quotas, some of the jobs in the queues have failed, and will need to be resubmitted.  We apologize for the inconvenience.

 

  
