NFS (Network File System) is a protocol that allows one Linux or Unix system to "mount" a file system, consisting of a collection of files and directories, that is stored on a different system.
The NFS protocol is designed to allow fast access to files over a local area network. It supports the full range of Linux / Unix file system functionality, and allows different systems to simultaneously access and update shared files and directories.
An NFS file system is "exported" by a server to one or more client machines. The clients gain access by "mounting" the NFS file system within the namespace of the client's local file system.
In the QRIScloud context, it is possible to NFS mount a QRIScloud collection on a NeCTAR (or other) VM running in the QRIScloud availability zone. The steps are as follows:
- The collection manager requests QRIScloud support to make the collection NFS-Only. This has significant implications for the way that a collection can be accessed, and we will only do this if we are sure that you and the manager understand this; see below "What does NFS-Only access mean?".
- The collection manager requests QRIScloud support to enable NFS export of the collection to the relevant NeCTAR project or projects.
- You install NFS client software on the NeCTAR instance in the project.
- You configure the instance's second ethernet interface ("eth1") on the storage network.
- You create the configuration files to mount the NFS file system, and check that the file system mounts correctly.
Once the NFS mounts have been set up, your applications can access the files and directories in exactly the same way that they would access local files and directories.
What does NFS-only access mean?
QRIScloud collections support two access modes:
- Standard Access means that a collection can be accessed:
- as /RDS or /QRIScloud on the HPC systems,
- as /data on the SSH data access systems,
- via the QRIScloud Nextcloud service, or
- via the Medici data fabric.
- NFS-only means that a collection can only be accessed via an NFS mount on a NeCTAR instance (virtual machine) in the QRIScloud AZ.
The access modes are mutually exclusive: a collection may be Standard Access or NFS-only, but not both.
It is possible to change a collection between Standard Access mode and NFS-only mode, but you need to lodge a QRIScloud support request, and it takes time to implement.
Requesting NFS access
QRIScloud collections are enabled for NFS access on request. The collection custodian or technical contact needs to make the request, either directly as a QRIScloud support ticket or by contacting an eRA. The request needs to state which NeCTAR project or projects require the access.
The constraints on providing NFS access are as follows:
- We only allow access to NeCTAR virtual machines; i.e. Instances launched from NeCTAR Projects.
- We only allow access within the Polaris data centre (i.e. the QRIScloud availability zone).
- We do not allow access for NeCTAR "PT" projects.
- We discourage NFS access to HSM collections.
The storage pools, NFS servers and mount details
QRIScloud collections are stored in large file systems that we refer to as storage pools. The storage pools (in Polaris) are as follows:
|NFS server IP address||Storage pool names||Pool size||Storage characteristics|
| ||/tier2c1, /tier2c2, /tier2c3, /tier2c4, /tier2c5, /tier2c6, /tier2c7, /tier2c8, /tier2d1, /tier2d2|| || |
| ||/tier3a1||n/a||HSM (large disk cache)|
|10.255.122.70||/gpfs/general, /gpfs/UQ, etcetera||n/a||GPFS|
When we provision a collection, we will tell you what pool it is stored in, and you will need to use this information in the NFS configurations on your NeCTAR VMs. Specifically, you will need the following:
- The NFS server IP address (<NFS-IP>) which you can look up in the table above.
- The NFS mount path (<NFS-PATH>) which is formed from the pool name and the collection id. If the pool name is "tierxyz" and the collection is "Q0xxx", then the NFS mount path is:
- The NFS mount options (<NFS-OPTS>) are:
for Ubuntu & Debian, or:
for RHEL, CentOS, Scientific Linux & Fedora. (Change "rw" to "ro" for a read-only mount.)
Scripted NFS setup
The simplest way to set up NFS access is to download and run the "q-storage-setup.sh" script. You simply need to provide the collection id and the storage pool name, and the script works out the rest.
The script can be downloaded from Github as follows:
$ curl -O -L https://github.com/qcif/cloud-utils/raw/master/q-storage-setup.sh
$ chmod a+x q-storage-setup.sh
(The "-L" option tells curl to follow redirects like a web browser would do. The "chmod ..." command makes the script that you downloaded executable.)
You can then run it as follows to configure and NFS mount a QRIScloud RDSI collection, and examine its contents:
$ ./q-storage-setup.sh Q0xxx
$ sudo ls /data/Q0xxx
If you want to set up a read-only mount (or mount a collection that is exported read-only), include the "--read-only" option when running the script. The full, current documentation for the "q-storage-setup.sh" script is here.
Note that the script is updated from time to time, so it is advisable to download a fresh copy from GitHub each time you need to use it, rather than keeping your own copy.
Note that the script sets up the collection as an NFS automount. If you need a hard NFS mount, you will need to use the manual NFS setup procedure.
Manual NFS setup
This section outlines the procedures for setting up NFS access to a collection for a NeCTAR virtual machine. It is lengthy, and some of the steps require care. You will require administrator (i.e. "sudo") access on the NeCTAR virtual machine. (We have written some more general documentation on NFS mounting which you can find on the NeCTAR support site.)
The following instructions assume that you have basic Linux administration skills, and the confidence to do the tasks required. (If you are nervous about breaking something, we suggest that you launch a small NeCTAR VM just for testing.)
Also note that some of the details depend on your Linux distribution. For example, package installation details differ, as do the mechanisms for starting and enabling system services.
Installing NFS client software
You can install the necessary NFS client software on a Debian, Ubuntu or similar system by running the following commands:
$ sudo apt-get update
$ sudo apt-get install nfs-common autofs
On RedHat, CentOS, Scientific Linux or Fedora, run the following:
$ sudo yum install nfs-utils autofs
For more information on the YUM and APT package managers, please refer to the respective online manual entries.
(Fedora 22 and later use the DNF package manager in place of YUM, though the basic commands are the same.)
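The choice between the two package sets can be sketched as a small shell function. This is only an illustration: the function name "nfs_client_packages" is hypothetical, and it assumes that the distribution identifier is taken from the ID field of /etc/os-release, which the instructions above do not mention. Distributions not listed in the function would need their own entries.

```shell
#!/bin/sh
# Hypothetical helper: print the NFS client packages for a distribution ID.
# Assumption: the ID values match the ID field in /etc/os-release.
nfs_client_packages() {
    case "$1" in
        debian|ubuntu)      echo "nfs-common autofs" ;;
        rhel|centos|fedora) echo "nfs-utils autofs" ;;
        *)                  echo "unknown distribution: $1" >&2; return 1 ;;
    esac
}

# Example (not run here):
#   . /etc/os-release && nfs_client_packages "$ID"
```

The function only prints package names; installing them is still done with "apt-get install" or "yum install" as shown above.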
Configuring the Storage Network interface
The NFS servers are on a private storage network in the 10.255.xxx.xxx range that is connected to each (QRIScloud) NeCTAR VM's second network interface (eth1). Unfortunately, the operating systems in some NeCTAR images do not configure the second interface.
We therefore recommend that you run one of the following commands:
$ ip link show
$ ifconfig -a
and look at the output. If you see the "eth1" interface listed as active with an IP address in the "10.255.xxx.xxx" range, then the second interface is configured. If this is the case, you can skip to the next step: checking NFS access.
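If you would rather script this check than read the output by eye, something like the following may help. It is a sketch under stated assumptions: the function name "has_storage_address" is hypothetical, and it expects input in the one-line-per-address format produced by "ip -o -4 addr show" (used here instead of "ip link show", which does not list IP addresses).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the input shows eth1 holding a 10.255.x.x
# address. Input is assumed to be in "ip -o -4 addr show" format, e.g.
#   3: eth1    inet 10.255.x.x/NN brd ... scope global eth1
has_storage_address() {
    awk '$2 == "eth1" && $3 == "inet" && $4 ~ /^10\.255\./ { found = 1 }
         END { exit !found }'
}

# Example (not run here):
#   ip -o -4 addr show | has_storage_address && echo "eth1 is configured"
```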
On RedHat, CentOS, Scientific Linux or Fedora, the network configurations live in the "/etc/sysconfig/network-scripts" directory. Look in that directory for a file called "ifcfg-eth1".
- If the file does not exist, then we recommend that you create it by copying the "ifcfg-eth0" file, and then editing the copy to change the "DEVICE=..." line to say "eth1" instead of "eth0".
- If the file already exists, make sure that:
- the "DEVICE" line says "eth1",
- the "BOOTPROTO" line says "dhcp",
- the "TYPE" line says "Ethernet", and
- the "ONBOOT" line says "yes"
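The four settings listed above can be checked with a short script. This is a minimal sketch, not part of the QRIScloud tooling: the function name "check_ifcfg" is hypothetical, and it assumes each setting appears on its own line as KEY=value, optionally with double quotes around the value.

```shell
#!/bin/sh
# Hypothetical helper: check that an ifcfg file contains the four settings
# described above. Prints each setting that is missing or different.
check_ifcfg() {
    file=$1
    status=0
    for setting in DEVICE=eth1 BOOTPROTO=dhcp TYPE=Ethernet ONBOOT=yes; do
        key=${setting%%=*}
        want=${setting#*=}
        # Accept KEY=value or KEY="value", with optional trailing spaces.
        if ! grep -Eq "^${key}=\"?${want}\"?[[:space:]]*\$" "$file"; then
            echo "missing or wrong: $setting"
            status=1
        fi
    done
    return $status
}

# Example (not run here):
#   check_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth1
```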
When you have created or corrected the network configuration files, bring up the network interface by running:
$ sudo ifdown eth1
$ sudo ifup eth1
and then use "ip link show" or "ifconfig -a" (as previously) to check that the "eth1" interface is now active.
Checking NFS access
Before setting up NFS to automount your collection, it is a good idea to check that you can mount it by hand. The procedure is simple:
$ mkdir /tmp/mnt
$ sudo mount -t nfs -o <NFS-OPTS> <NFS-IP>:<NFS-PATH> /tmp/mnt
$ cd /tmp/mnt
$ ls -l
If you have configured eth1 correctly, and formed the NFS options, server IP and path correctly, the "ls -l" command should list your collection's root directory.
Finally, unmount your collection as follows:
$ cd /
$ sudo umount /tmp/mnt
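To confirm from a script that the test mount is (or is no longer) in place, one option is to look it up in the kernel's mount table. The sketch below assumes a Linux system where /proc/mounts is available; the function name "is_nfs_mounted" is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given directory appears as an NFS
# mount in mount-table text read from stdin (the /proc/mounts format:
# device, mount point, file system type, options, ...).
is_nfs_mounted() {
    awk -v dir="$1" '$2 == dir && $3 ~ /^nfs4?$/ { found = 1 }
                     END { exit !found }'
}

# Example (not run here):
#   is_nfs_mounted /tmp/mnt < /proc/mounts && echo "still mounted"
```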
Configuring NFS auto-mounting
It is possible to mount your collection by hand each time you want to use it, using "sudo mount" as above. However, it is likely to be more convenient if your virtual machine mounts the collection automatically. You can configure it to mount the collection when your VM starts, but it is better to use an automounter.
Here are the instructions for configuring automounting using "autofs".
- Install the "autofs" package (as above).
- Edit the "/etc/auto.master" file, and add the following line at the end of the file:
- Create a "/etc/auto.qriscloud" with the following content:
/data/Q0xxx <NFS-OPTS> <NFS-IP>:<NFS-PATH>
- Start the "autofs" service:
$ sudo service autofs start
(Some Linux distributions handle starting of system services in other ways. If the "service" command does not exist, use "systemctl start" or the "/etc/init.d/autofs" script.)
- Check that the collection mounts:
$ cd /data/Q0xxx
$ ls -l
- Configure the "autofs" service to start automatically on system boot:
$ sudo chkconfig --add autofs
(In some Linux distributions, you can use "systemctl enable" instead of "chkconfig".)
Creating the user accounts and groups
The final step in manual setup is to create local accounts and groups to access your collection.
- The "q0xxx" account needs to have uid and gid of 54xxx. This corresponds to the "q0xxx-rw" account on your collection VM.
- The "webdav" account needs to have uid and gid of 48.
$ sudo groupadd --gid 48 webdav
$ sudo useradd --uid 48 --gid 48 --no-create-home \
--shell /sbin/nologin webdav
$ sudo useradd --uid 54xxx --groups webdav q0xxx
Run "man useradd" or "man adduser" for details of the command for adding a user account. The details can differ for different Linux distributions.
- You should ensure that the "webdav" account never allows login. Setting the login shell to "/sbin/nologin" is a good way to do this.
- There is a good case for disabling login for "q0xxx" as well, and using "sudo -u q0xxx bash" to assume the "q0xxx" user identity, as required.
- Making the "q0xxx" account a member of the "webdav" group is helpful if you use webdav on the collection VM.
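After creating the accounts, it may be worth confirming that the numeric ids came out as intended. The following is a sketch only: the function name "has_uid_gid" is hypothetical, and it reads passwd-format text (as printed by "getent passwd") from standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named account has the given uid and
# gid in passwd-format input (name:password:uid:gid:...).
has_uid_gid() {
    awk -F: -v name="$1" -v uid="$2" -v gid="$3" \
        '$1 == name && $3 == uid && $4 == gid { found = 1 }
         END { exit !found }'
}

# Example (not run here):
#   getent passwd | has_uid_gid webdav 48 48 || echo "webdav ids are wrong"
```

The same check can be applied to the "q0xxx" account by substituting your collection's account name and its 54xxx id.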
NFS security and access control
When you request us to NFS export a collection to a NeCTAR tenant, any instance running in the tenant is able to mount the collection as a local file system. This means:
- Any person who is a tenant manager or a member of the tenant can launch a new instance, and use that instance to gain full access to the collection.
- Any person who can log in to a tenant instance's root account, or to an admin account with sudo rights, can gain full access to the collection. If it is not mounted, they can mount it.
- If someone has a non-privileged local account, they could potentially exploit an unpatched security flaw to gain root access, and then proceed as above.
- If someone is able to hack into the system, they can then proceed as above.
Anyone who has full access to the collection will be able to override any Linux access controls implemented using permission bits or access control lists. They will be able to add, modify and delete files and directories at will.
In short, when your collection is NFS exported to a NeCTAR tenant, you need to pay close attention to the following:
- Ensure that you don't grant Tenant access to the wrong people.
- Ensure that you don't provide instance accounts with "sudo" rights to the wrong people.
- Ensure that you keep your instances patched with the latest security patches to address possible external vulnerabilities AND local privilege escalation vulnerabilities.
- Ensure that you take all necessary steps to minimize the "attack profile" for your instance:
- SSH should not allow password authentication
- Shut down (or do not configure) unnecessary public services; e.g. Web servers, FTP servers, NFS servers, and so on.
- If feasible, use NeCTAR security groups and / or instance-internal firewalling to restrict access to only a few IP addresses or networks.
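As an illustration of the first hardening point, the check below reports whether SSH password authentication is turned off. It is a sketch that assumes the instance runs OpenSSH, whose "sshd -T" option prints the effective configuration with lower-case keywords; the function name "password_auth_disabled" is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the sshd configuration text on stdin
# sets PasswordAuthentication to "no". Intended for "sshd -T" output, which
# prints one "keyword value" pair per line.
password_auth_disabled() {
    awk 'tolower($1) == "passwordauthentication" { value = tolower($2) }
         END { exit (value != "no") }'
}

# Example (not run here):
#   sudo sshd -T | password_auth_disabled || echo "password logins are enabled"
```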
NFS trouble-shooting and tuning
Problems configuring "eth1"
Some people have reported problems with configuring the "eth1" network interface.
- On some Linux distros (e.g. CentOS 7), the dynamic network management tool (NetworkManager) interferes with what the "q-storage-setup.sh" script is trying to do. The temporary workaround is to disable NetworkManager as per the OpenStack Networking documentation.
- It has been known for QRIScloud compute nodes to end up with misconfigured networking at the hypervisor level. This leads to the guest OS (i.e. your instance) seeing only one network interface, and "sudo ifup eth1" failing with a message that says that the device does not exist. This problem requires operator intervention to fix: please open a QRIScloud support request.
Checking network connectivity
One cause of problems is that your NeCTAR VM does not have, or has lost, network-level connectivity to the NFS servers. This will manifest as a failure to mount or remount a collection.
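One way to test network-level connectivity is to try opening a TCP connection to the NFS server. The sketch below is an assumption-laden illustration rather than an official procedure: it assumes that the server accepts NFS traffic on the standard port 2049, that "bash" (for its /dev/tcp feature) and the coreutils "timeout" command are installed, and the function name "nfs_port_open" is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a TCP connection to the host can be
# opened within 5 seconds. The port defaults to 2049 (standard NFS port).
nfs_port_open() {
    host=$1
    port=${2:-2049}
    # bash treats /dev/tcp/HOST/PORT as a request to open a TCP connection.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (not run here), using the server address for your storage pool:
#   nfs_port_open <NFS-IP> || echo "cannot reach the NFS server"
```

A failure here points to a network problem (for example, an unconfigured "eth1") rather than a problem with the NFS options or path.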
DMF tools don't work with autofs
It has been reported to us that the SGI DMF tools do not work on an NFS file system that has been automounted.
- The "dmls" utility reports "(N/A)" for the dmf state attributes of all files.
- The "dmget" utility reports an error like this:
sgi_dmf-7029 dmusrcmd: warning Unsupported filesystem type for <file>
The work-around is to replace the NFS automount with a hard mount. However, see the next section!
Incorrect hard mounts can "brick" an instance
If you use the incorrect settings on an NFS hard mount (i.e. in the "/etc/fstab" file), then you can easily get the instance into a state where it gets stuck during reboot. The problem arises when the "fstab" settings tell the initialization procedure that an NFS file system needs to be mounted early in the startup. If the NFS server is offline or if its IP address changes, then the initialization will block. If this happens before the instance has gone into multi-user mode, it will be difficult to repair the "/etc/fstab" file to remove the offending line.
With some versions of Ubuntu, there is a "nobootwait" mount option that allows a mount attempt to time out. Unfortunately, support for "nobootwait" was removed as of Ubuntu 16.04, so this is no longer an option.
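The best way to deal with this is to use the "noauto" mount option, and then add something to mount the collection explicitly (e.g. "mount <mount-point>") when the system has reached multi-user mode. Note that "mount -a" skips entries marked "noauto", so it cannot be used for this. Better still, use the NFS automount approach if you can.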
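To audit an existing "/etc/fstab" for the risky case described above, a filter like the following could be used. It is a minimal sketch: the function name "nfs_entries_without_noauto" is hypothetical, and it only looks at the "noauto" option, not at any other option that might affect boot behaviour on your distribution.

```shell
#!/bin/sh
# Hypothetical helper: read fstab-format text on stdin and print the mount
# point of every NFS entry whose options do not include "noauto".
nfs_entries_without_noauto() {
    awk '$1 !~ /^#/ && NF >= 4 && $3 ~ /^nfs4?$/ &&
         ("," $4 ",") !~ /,noauto,/ { print $2 }'
}

# Example (not run here):
#   nfs_entries_without_noauto < /etc/fstab
```

Any mount point that it prints will be mounted during startup, and so could block the boot if the NFS server is unreachable.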