QRIScloud Compute Improvements - Rolling outages on March 17th & 18th


QRIScloud users are advised that on March 17th and 18th there will be a series of rolling outages affecting a large number of QRIScloud NeCTAR instances. These outages will allow us to make important hardware upgrades as part of QCIF's regular program of QRIScloud / NeCTAR service improvement. Specifically, 23 of QRIScloud's 54 general compute nodes are going to be fitted with new Solid State Drives (SSDs). This will allow more instances to be run on these nodes, increasing the overall capacity of QRIScloud.

The upgrade work will be performed progressively, with 1 or 2 compute nodes out of service at any one time. The planned procedure for each compute node is as follows:

  1. All RUNNING NeCTAR instances on the compute node will be shut down.
  2. The compute node will be shut down and powered off.
  3. The new SSD will be fitted and tested.
  4. The compute node will be restarted.
  5. Each instance that was shut down in step 1 will be restarted.

We have sent emails to the Tenant Managers and Tenant Members for all instances that will be affected by the outages. These emails include some advice on precautions that users should take prior to the outage. We have also emailed the contacts for a small number of QRIScloud RDSI "collection VMs" that will be affected.

We cannot give a precise schedule for the outages, but we will use the @QRIScloud Twitter feed to post updates as the upgrade progresses.

If you have concerns about the impact of these outages, please contact QRIScloud support.

UPDATE (2016-03-18 17:45)

Further to QCIF’s previous advice regarding brief rolling outages to QRIScompute VM instances on 17 and 18 March: QCIF has experienced hardware chassis mounting issues in three of the 23 compute nodes, which are preventing a small number of instances from being returned to service.

This hardware was installed by certified vendors, and the current problem is possibly the result of manufacturing defects. This matter has been escalated for urgent manufacturer advice. In the meantime, QCIF is investigating options to restore services as quickly as possible and aims to have firm advice regarding service restoration by noon on Monday 21 March.

QCIF apologises for the extended outage, which is the result of unforeseen circumstances. We will continue to keep you apprised of developments. Please be assured that all data associated with your instances is safe and unaffected by this outage.

If you require any further information about the impact of these outages, please contact QRIScloud support.
