Section 4: NeCTAR RC questions
Q4.1 - How do I find out what NeCTAR RC / OpenStack terms mean?
The following FAQs explain some of the more commonly used terms. For more information, refer to:
- The OpenStack Glossary.
- The VWranglers NeCTAR / OpenStack Glossary.
An "Instance" is NeCTAR terminology for a virtual machine running on NeCTAR's OpenStack infrastructure. An "Instance" runs on a "compute node"; i.e. a physical computer populated with processor chips, memory chips and so on. OpenStack does not support instances that span multiple compute nodes, so the theoretical maximum dimensions of an instance are determined by the compute node hardware.
The term VCPU is short for "virtual CPU"; i.e. a virtual processor for a virtual machine. A virtual machine can have a number of VCPUs. At first glance, a VCPU is like a "core" on a typical modern desktop or laptop computer. The difference is that a VCPU can actually represent a fractional share of a core on the physical machine; see below.
VCPU hours is an OpenStack measure of the resources that your instances are using. The VCPU hours measure for an instance is calculated as:
"number VCPUs" / "over-commit ratio" x "lifetime of instance"
where the lifetime of an instance starts when the instance is created, and ends when it is terminated.
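For example, here is a purely illustrative calculation of the formula above in Python. The instance size, over-commit ratio and timestamps are made-up values, and the actual accounting done by NeCTAR may differ in detail:

```python
from datetime import datetime

def vcpu_hours(num_vcpus, over_commit_ratio, created_at, terminated_at):
    """VCPU hours = ("number of VCPUs" / "over-commit ratio") x "lifetime of instance"."""
    lifetime_hours = (terminated_at - created_at).total_seconds() / 3600.0
    return (num_vcpus / over_commit_ratio) * lifetime_hours

# A hypothetical 2 VCPU instance with a 1.0 over-commit ratio, alive for 10 days.
created = datetime(2015, 3, 1, 9, 0)
terminated = datetime(2015, 3, 11, 9, 0)
print(vcpu_hours(2, 1.0, created, terminated))   # 480.0 VCPU hours
```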
Note that the VCPU hours measure includes time when an instance is paused or shut down. That is because an instance that exists on a compute node in a paused or shutdown state still prevents other instances from being launched on that node. NeCTAR OpenStack is designed on the assumption that you should be able to unsuspend or power up an instance at any time, and have the "level of service" that you expect. (This is analogous to Amazon EC2 Reserved Instances.)
A flavor is OpenStack terminology for a "virtual hardware template". It specifies the dimensions of an Instance; e.g. the number of VCPUs, amount of memory and the size of the local file systems. The standard NeCTAR flavors are as follows:
Name | VCPUs | Memory | Primary disk | Ephemeral disk | Notes
---- | ----- | ------ | ------------ | -------------- | -----
m1.small | 1 | 4 GB | 10 GB | 30 GB |
m1.medium | 2 | 8 GB | 10 GB | 60 GB |
m1.large | 4 | 16 GB | 10 GB | 120 GB |
m1.xlarge | 8 | 32 GB | 10 GB | 240 GB |
m1.xxlarge | 16 | 64 GB | 10 GB | 480 GB |
m2.tiny | 1 | 768 MB | 5 GB | 0 GB | 1, 2, 3
m2.xsmall | 1 | 2 GB | 10 GB | 0 GB | 1, 2
m2.small | 1 | 4 GB | 30 GB | 0 GB | 1, 2
m2.medium | 2 | 6 GB | 30 GB | 0 GB | 1
m2.large | 4 | 12 GB | 30 GB | 80 GB | 1
m2.xlarge | 8 | 48 GB | 30 GB | 360 GB | 1
Notes:
1. The m2 flavors may be subject to node-specific over-commit.
2. The m2.xsmall and m2.tiny flavors are intended for small footprint webservers, and will typically have 2x and 4x CPU scaling (respectively) relative to other flavors.
3. The primary disk on the m2.tiny flavor is too small for some of the standard NeCTAR images.
In addition, there are various private and/or node-specific flavors that are made available to selected tenants when launching in specific availability zones. The flavors that are available to you are shown in the "Launch Instance" wizard: select the "Details" panel and use the "Flavor" selector. The details for the currently selected flavor are shown on the right hand side.
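If you prefer to script things, the flavors visible to your tenant can also be listed via the OpenStack APIs. The following is a minimal sketch using the openstacksdk Python library; the "nectar" clouds.yaml entry is an assumption, and attribute names can vary between SDK releases:

```python
import openstack

# Connect using a pre-configured clouds.yaml entry (the name "nectar" is an assumption).
conn = openstack.connect(cloud="nectar")

# List the flavors visible to your tenant, mirroring the table above.
for flavor in conn.compute.flavors():
    print("{0}: {1} VCPUs, {2} MB RAM, {3} GB primary, {4} GB ephemeral".format(
        flavor.name, flavor.vcpus, flavor.ram, flavor.disk, flavor.ephemeral))
```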
Q4.1.5 - What is "over-commit"?
The economics of cloud computing are based on the observation that most servers that run applications are under-utilised. This means that you can often "commit" more resources to virtual machines than are available as physical resources on the computer hardware. This is known as "over-commit".
In the NeCTAR OpenStack world, there are 3 primary resources that are committed to virtual machines:
- Processor cores can be shared (as VCPUs). Over-commit is typically implemented by time slicing, so that each virtual machine gets a fair share according to its CPU scaling relative to other active VMs (see the sketch after this list).
- Memory can be shared using the system's virtual memory hardware. Memory is divided into pages. Resident virtual memory pages are mapped to physical memory pages, and non-resident pages are stored on disk. If an application on a virtual machine tries to access a virtual memory page that is not currently resident, the hypervisor fetches the page from disk.
- Disk over-commit only makes sense if virtual machines don't write to some of the disk space that has been allocated to them. Effectively the VMs are sharing empty disk blocks.
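As a purely illustrative sketch of what a CPU over-commit ratio means, consider a hypothetical compute node (the 16 cores and 2:1 ratio are made-up numbers, not NeCTAR policy):

```python
# Hypothetical compute node: 16 physical cores with a 2:1 CPU over-commit ratio.
physical_cores = 16
over_commit_ratio = 2.0

# The scheduler can place up to this many VCPUs on the node ...
schedulable_vcpus = physical_cores * over_commit_ratio    # 32.0 VCPUs

# ... so if every VCPU is busy at the same time, each one gets roughly
# this fraction of a physical core.
share_per_vcpu = physical_cores / schedulable_vcpus       # 0.5 of a core
print(schedulable_vcpus, share_per_vcpu)
```

In practice most VCPUs are idle most of the time, which is why the system as a whole still performs acceptably (see the next question).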
Q4.1.6 - Is over-commit bad for performance?
It depends.
- Over-commit of processor cores generally works well. The exception is when the instances on a host are running compute-intensive workloads: each instance then gets a smaller share of the CPU, though the system as a whole still runs efficiently.
- Over-commit of memory is OK if the overall memory demand is low, but it can lead to serious performance problems due to "thrashing".
- Over-commit of disk can be problematic. When the sum of disk space actually used reaches the space available, file writes to new disk blocks cannot be fulfilled.
Current NeCTAR policy is that nodes are free to implement their own policies on over-commit on the "m1" and "m2" flavors.
Q4.2 - What is an Availability Zone?
An Availability Zone (or AZ) is an OpenStack term for a collection of physical compute and storage resources that is managed as a "cell". For example:
- The "QRIScloud" availability zone contains the NeCTAR resources in the Polaris Data Centre in Brisbane, managed by QCIF.
- Until recently, we ran another AZ called "qld", in a data centre on the UQ St Lucia campus.
Q4.3 - What is an Image?
An Image is an OpenStack term for a bootable disk image that can be used to create a virtual machine. An image will typically include a Linux operating system kernel, libraries, utilities and configurations. It may also include other software and/or data.
Images are managed using the OpenStack Glance service. They are typically created by taking a snapshot of an existing virtual machine.
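For instance, here is a minimal sketch of listing images and snapshotting a running instance with the openstacksdk Python library. The cloud name, instance name and snapshot name are made-up values:

```python
import openstack

conn = openstack.connect(cloud="nectar")    # assumed clouds.yaml entry

# Glance is exposed through the image proxy; list the images visible to your project.
for image in conn.image.images():
    print(image.name, image.visibility)

# Snapshotting an existing instance produces a new project image in Glance.
server = conn.compute.find_server("my-web-server")    # hypothetical instance name
conn.compute.create_server_image(server, name="my-web-server-snapshot")
```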
Q4.3.1 - What Images are available on NeCTAR?
You can see what public images are available by opening the NeCTAR Dashboard in a web browser and looking at the "Images" panel. You will see tabs for 4 categories of image:
- NeCTAR Official images
- Project images
- Images that have been shared with your project.
- Public images.
The NeCTAR Official images are (typically) installs of various Linux distributions. These images are refreshed regularly by the NeCTAR Core Services team to incorporate the latest patches. They typically consist of a "headless" server install, with a small amount of NeCTAR-specific tailoring. (For example, the SSH daemon is configured in a particular way, and fail2ban is installed to fend off hackers.)
Q4.3.2 - What Operating Systems are available?
NeCTAR provides images for various releases of the following common Linux distributions: Ubuntu, Debian, CentOS, Scientific Linux, Fedora and OpenSuse.
Only versions that are still maintained by the supplier are made available as NeCTAR images. (When an OS version goes off-maintenance, it no longer receives timely security patches, so you would be well advised to upgrade.)
Q4.3.3 - What Linux images are recommended?
We can't give a general recommendation. The best choice depends on what you are trying to do. For instance:
- If you want an instance that you can use for a long time, or that you want to use to run a server, you might choose a distribution with a long support lifetime.
- If you want to use the latest versions of standard Linux applications, you might pick a distribution with a short release cycle.
- Your choice may be constrained by the domain applications you need to run.
- It may come down to individual preference; e.g. which distribution you are most familiar with.
The following summarizes the key properties of the Linux "distro" families that NeCTAR supports.
Distribution | Release cycle | Support lifetime | Notes
------------ | ------------- | ---------------- | -----
Ubuntu | ~6 months | ~2 years | 1
Ubuntu "LTS" | ~2 years | ~5 years | 1, 2
Debian | ~1-3 years | 1 year from next release | 3
CentOS | ~3-4 years | ~10 years | 4
Scientific Linux | ~3-4 years | ~10 years | 4
Fedora | ~6 months | ~1 year | 5
OpenSUSE | ~1 year | ~1.5 years | 6
OpenSUSE "Evergreen" | ~2 years | ~3 years | 6
Notes:
1. Ubuntu's primary focus is to provide Linux for end users.
2. Ubuntu LTS (long term support) has major releases roughly every 2 years, and minor releases every 6 months that consolidate the accumulated patches.
3. Debian is an "open source purist" distro: no proprietary binary drivers, etc.
4. CentOS and Scientific Linux are derived from Red Hat Enterprise Linux (RHEL), and their primary focus is on providing a stable platform for running services. Release cycles are tied to RHEL release cycles: major releases every 3 to 4 years, minor releases every 6 to 12 months. End-of-life is 10 years from a major release.
5. Fedora has a reputation for being a "bleeding edge" distribution. Fedora is supported by Red Hat and serves as a proving ground for new developments before they go into RHEL.
6. OpenSUSE is based on SUSE Linux by SUSE / Novell / Attachmate.
For Ubuntu and OpenSUSE, check the distro name to see if it is a "long term support" release.
All of the above Linux families push security patches in a timely fashion; see FAQ below.
Q4.3.4 - Why isn't there a NeCTAR Windows Image?
In general, Microsoft Windows is not available on NeCTAR for licensing reasons. Some NeCTAR nodes support Windows for "local members". QRIScloud does not.
Q4.4 - What disk storage is available on an Instance?
Three kinds of disk storage are available to a NeCTAR instance:
- Local Storage: typically disk drives or solid state drives on the compute nodes.
- Volume Storage: typically implemented using Ceph-based file servers.
- Object Storage: objects accessible via Swift or Amazon S3 APIs.
In addition, QRIScloud collections can be NFS mounted on a NeCTAR instance.
Q4.4.1 - What are the Primary & Ephemeral file systems?
The primary file system for a NeCTAR instance contains the instance's operating system, application software and (typically) the user home directories. The ephemeral file system is mounted as "/mnt" by default, and provides instance-local disk space for general use.
When you launch a NeCTAR instance (from an image), primary and ephemeral disk space is allocated for the exclusive use of the instance. The sizes of these spaces are determined by the flavor that you launch, and file systems are built on them by the launch process. (The primary file system is initialized from the launch image, and the ephemeral file system is created empty.)
The primary and ephemeral file systems are tied to the instance. Both will persist if an instance is shut down and rebooted, but both will "go away" when the instance is terminated. The key difference between primary and ephemeral space is that the former can be preserved in an instance snapshot, but the latter cannot. That is the rationale for calling the latter "ephemeral".
(Beware that the ephemeral file systems for instances in the NCI AZ are currently implemented with a non-journalling file system, and are therefore vulnerable to significant file system corruption in the event of a power failure. Some users have lost data because of this.)
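Once you are logged in to an instance, you can check how much of the flavor's ephemeral allocation is available. A minimal sketch in Python (it assumes the default "/mnt" mount point; your instance may be configured differently):

```python
import shutil

# The ephemeral file system is mounted at /mnt by default on NeCTAR instances.
usage = shutil.disk_usage("/mnt")
gib = 1024 ** 3
print("total {:.1f} GiB, used {:.1f} GiB, free {:.1f} GiB".format(
    usage.total / gib, usage.used / gib, usage.free / gib))
```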
Q4.4.2 - What is Volume Storage?
Volume storage in the OpenStack world is provided by the Cinder service. It consists of network accessible Volumes (virtual disks) that can be attached to an instance in the same availability zone, and used to host a file system.
- Volumes cannot be shared. A volume can only be attached to one instance at a time.
- Volumes can be snapshotted.
- Volumes have a lifetime that is independent of an instance.
- Your volume storage usage is subject to a separate quota. You need to request quota in a specific availability zone as part of your NeCTAR allocation request.
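As an illustration, here is a minimal sketch of creating a volume via the Cinder API and attaching it to an existing instance, using the openstacksdk Python library. The cloud name, volume size and instance name are made-up values, and the exact call names and parameters can differ between SDK releases:

```python
import openstack

conn = openstack.connect(cloud="nectar")    # assumed clouds.yaml entry

# Create a 50 GB volume; this counts against your volume storage quota in the AZ.
volume = conn.block_storage.create_volume(name="data-vol", size=50)
conn.block_storage.wait_for_status(volume, status="available")

# Attach it to an existing instance. It then appears inside the instance as a new
# block device (e.g. /dev/vdc) that you can partition, format and mount.
server = conn.compute.find_server("my-instance")    # hypothetical instance name
conn.compute.create_volume_attachment(server, volume_id=volume.id)
```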
Q4.4.3 - What is Object Storage?
Object Storage works differently to local storage, volume storage and NFS storage. Unlike those forms, Object Storage cannot be mounted as a regular file system. Instead the application uses HTTP / HTTPS to "get" and "put" the objects in the store.
NeCTAR Object Store is provided using the OpenStack Swift service. NeCTAR Swift is configured so that each object is replicated at 3 locations in the NeCTAR federation's Swift cluster.
NeCTAR Object Storage is also accessible via Amazon S3 compatibility APIs.
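For example, a minimal sketch of putting and listing objects with the Swift API via the openstacksdk Python library. The cloud, container and object names are made up, and the S3 compatibility APIs would use different client libraries (e.g. boto):

```python
import openstack

conn = openstack.connect(cloud="nectar")    # assumed clouds.yaml entry

# Create a container and "put" an object into it over HTTPS.
conn.object_store.create_container(name="my-results")
with open("output.csv", "rb") as f:
    conn.object_store.upload_object(container="my-results",
                                    name="run-001/output.csv",
                                    data=f.read())

# "Get" a listing back later, from any machine that can reach the Swift endpoint.
for obj in conn.object_store.objects("my-results"):
    print(obj.name)
```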
Q4.5 - Is NeCTAR storage backed up?
None of the forms of NeCTAR storage are backed up. Backups are the responsibility of the user.
Q4.6 - Is NeCTAR storage safe?
Object Storage is relatively safe due to the fact that the data is (eventually) replicated to 3 (or more) locations. However, this replication does not occur immediately, and there have been situations where all complete replicas of an object are (temporarily) offline. In addition, there is no protection in scenarios where you mistakenly delete or overwrite objects.
In all other cases, it depends on how the respective NeCTAR nodes have implemented their data centres in general, and their storage platforms more specifically.
- All forms of storage associated with an instance should persist safely over a normal shutdown / restart of the instance, provided that the shutdown is performed cleanly. (However, as noted above, the instance's primary and ephemeral file systems do not survive when the instance is terminated.)
- If the underlying storage platform has built-in redundancy (e.g. RAID, or Ceph replication) then there is a degree of protection against loss due to media failure; e.g. hard disk errors.
- If the file systems that run on the disk storage are "journalled", then there is a degree of protection against file system damage due to an unclean shutdown; e.g. power failure.
However, there have been cases where instances have been terminated by accident, or file systems have been deleted or destroyed due to "operational issues". It is therefore extremely unwise to assume that your data will always be safe. Certainly, neither NeCTAR nor any of the Node operators can guarantee this.
Q4.7 - Can I connect my RDSI collection to my instance?
Brisbane-based QRISdata RDSI collections are provisioned on NFS servers on a private network within the Polaris data centre. A collection can be exported to all instances in a NeCTAR tenant in the QRIScloud availability zone.
There are some important caveats:
- This only applies to QRISdata RDSI collections in Brisbane, and the QRIScloud AZ.
- If you expose a collection to a NeCTAR tenant, all instances in the tenant can access it. Access control becomes your problem.
- If an RDSI collection is exposed to NeCTAR instances, access via other mechanisms will need to be restricted. (Details are still being sorted out.)
Note: this option is not available at other NeCTAR nodes, because they have implemented RDSI collection storage differently.