Advice: Tier2 NFS server problems - 2017-01-03 - 9:00 to 12:55 (resolved)


We are currently experiencing problems with the Tier2 NFS server. The Tier2 NFS service is going offline for extended periods, then coming back. This affects all "on-disk" collections and any systems that attempt to access them. (Similar problems occurred over the Christmas break.)

Operations staff are attempting to diagnose the problem.

We apologize for the inconvenience this is causing you.

UPDATE - 2017-01-03 - 13:05 - We have identified a possible cause of the NFS problems and put a fix in place at 12:55. So far, it seems to be working.

(Apparently one of the new Medici AFM cache machines was hammering the NFS server. This tied up all of the server's NFS kernel threads and denied service to other clients. We have doubled the number of kernel threads, which has relieved the problem, but we still need to understand why the AFM cache machine was generating that load.)

UPDATE - 2017-01-03 - 15:30 - There have been no recurrences of this problem since the last update.

