[OmniOS-discuss] NFS baulks under load

Tom Robinson tom.robinson at motec.com.au
Wed Dec 11 00:01:57 UTC 2013


OmniOS v11 r151006

Hi,

I'm having many stability/performance issues with NFS. Server end is OmniOS; client end is CentOS 5.

When the server end is functioning, I can mount OK, but simple operations such as listing a
directory take a very long time, and I often get an I/O error. I have been using NFSv4, but I'm
thinking I should just cap both the server and client maximum version at NFSv3.

Currently the NFS server is hosed. This happened yesterday as well. The only way I could bring it
back to life was to reboot the hardware, which is something I want to avoid. Is there a way to
recover the network/nfs/server service without rebooting?

I had these settings:
# sharectl get nfs
servers=16
lockd_listen_backlog=32
lockd_servers=20
lockd_retransmit_timeout=5
grace_period=90
server_versmin=2
server_versmax=4
client_versmin=2
client_versmax=4
server_delegation=on
nfsmapid_domain=
max_connections=-1
protocol=ALL
listen_backlog=32
device=

But after reading this: http://virtuallyhyper.com/2013/04/installing-and-configuring-omnios/
I have changed to these settings to try to improve responsiveness under load:

# sharectl get nfs
servers=512
lockd_listen_backlog=256
lockd_servers=128
lockd_retransmit_timeout=5
grace_period=90
server_versmin=2
server_versmax=3
client_versmin=2
client_versmax=3
server_delegation=on
nfsmapid_domain=
max_connections=-1
protocol=ALL
listen_backlog=32
device=
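
For anyone comparing the two listings above, the properties that differ can be extracted
mechanically. This is only a minimal sketch: it hardcodes the two `sharectl get nfs` outputs
exactly as pasted above, and the function name `changed_props` is a hypothetical helper, not part
of any OmniOS tool.

```shell
#!/bin/sh
# Settings before the change (copied from the first `sharectl get nfs` listing).
before='servers=16
lockd_listen_backlog=32
lockd_servers=20
lockd_retransmit_timeout=5
grace_period=90
server_versmin=2
server_versmax=4
client_versmin=2
client_versmax=4
server_delegation=on
nfsmapid_domain=
max_connections=-1
protocol=ALL
listen_backlog=32
device='

# Settings after the change (copied from the second listing).
after='servers=512
lockd_listen_backlog=256
lockd_servers=128
lockd_retransmit_timeout=5
grace_period=90
server_versmin=2
server_versmax=3
client_versmin=2
client_versmax=3
server_delegation=on
nfsmapid_domain=
max_connections=-1
protocol=ALL
listen_backlog=32
device='

# changed_props (hypothetical helper): print "property: old -> new" for
# every property whose value differs between the two listings.
changed_props() {
    {
        # Tag each line with its origin so awk can tell the listings apart.
        printf '%s\n' "$before" | sed 's/^/B /'
        printf '%s\n' "$after"  | sed 's/^/A /'
    } | awk '
        { split($2, kv, "=") }
        $1 == "B" { old[kv[1]] = kv[2]; order[++n] = kv[1] }
        $1 == "A" { cur[kv[1]] = kv[2] }
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                if (old[k] != cur[k]) print k ": " old[k] " -> " cur[k]
            }
        }
    '
}

changed_props
```

Only the thread counts, the lockd listen backlog, and the maximum protocol versions differ;
listen_backlog is still 32 in both listings.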

The problem is that I can't re-enable the service, and I don't want to have to reboot to fix this.

# svcs -a | grep -e nfs -e rpc
disabled       17:24:56 svc:/network/nfs/cbd:default
disabled       17:24:56 svc:/network/nfs/client:default
disabled       17:24:57 svc:/network/nfs/log:default
disabled       17:27:55 svc:/network/rpc/meta:default
disabled       17:27:55 svc:/network/rpc/metamh:default
disabled       17:27:55 svc:/network/rpc/rex:default
disabled       17:27:55 svc:/network/rpc/metamed:default
disabled       17:27:55 svc:/network/rpc/mdcomm:default
online         17:25:55 svc:/network/rpc/bind:default
online         17:25:56 svc:/network/rpc/keyserv:default
online         17:25:56 svc:/network/nfs/status:default
online         17:27:55 svc:/network/nfs/mapid:default
online         17:27:56 svc:/network/rpc/gss:default
online         17:27:56 svc:/network/rpc/smserver:default
online         17:27:56 svc:/network/nfs/rquota:default
online*        10:05:07 svc:/network/nfs/server:default
online         10:32:20 svc:/network/nfs/nlockmgr:default


# svcs -xv network/nfs/server
svc:/network/nfs/server:default (NFS server)
 State: online since 11 December 2013 10:05:07 AM EST
   See: man -M /usr/share/man -s 1M nfsd
   See: /var/svc/log/network-nfs-server:default.log
Impact: None.

# svcs -vl network/nfs/server
fmri         svc:/network/nfs/server:default
name         NFS server
enabled      true
state        online
next_state   offline
state_time   11 December 2013 10:05:07 AM EST
logfile      /var/svc/log/network-nfs-server:default.log
restarter    svc:/system/svc/restarter:default
contract_id  96
dependency   require_any/error svc:/milestone/network (online)
dependency   require_all/error svc:/network/nfs/nlockmgr (online)
dependency   optional_all/error svc:/network/nfs/mapid (online)
dependency   require_all/restart svc:/network/rpc/bind (online)
dependency   optional_all/none svc:/network/rpc/keyserv (online)
dependency   optional_all/none svc:/network/rpc/gss (online)
dependency   optional_all/none svc:/network/shares/group (multiple)
dependency   optional_all/none svc:/system/filesystem/reparse (online)
dependency   require_all/error svc:/system/filesystem/local (online)

# svcs -vp network/nfs/server
STATE          NSTATE        STIME    CTID   FMRI
online         offline       10:05:07     96 svc:/network/nfs/server:default
               17:27:56      692 nfsd
               10:05:07     1123 nfs-server
               10:05:07     1134 sharemgr

# ps -elf | grep -e share -e nfs -e rpc
 0 S   daemon   465     1   0  40 20        ?    796        ? 17:25:56 ?           0:00 /usr/sbin/rpcbind
 0 S   daemon   582     1   0  40 20        ?   1554        ? 17:27:56 ?           0:00 /usr/lib/nfs/nfsmapid
 0 S   daemon   545     1   0  40 20        ?    758        ? 17:25:57 ?           0:00 /usr/lib/nfs/statd
 0 S   daemon   692     1   0  39  0        ?    732        ? 17:27:56 ?           5:27 /usr/lib/nfs/nfsd
 0 S     root  1123    10   0  40 20        ?    946        ? 10:05:08 ?           0:00 /sbin/sh /lib/svc/method/nfs-server
 0 S     root  1134  1123   0  40 20        ?   1607        ? 10:05:08 ?           0:00 /usr/sbin/sharemgr stop -P nfs -a
 0 S     root  1462  1318   0  50 20        ?    578        ? 11:00:43 pts/4       0:00 grep -e share -e nfs -e rpc
 0 S   daemon  1274     1   0  39  0        ?    713        ? 10:32:20 ?           0:00 /usr/lib/nfs/lockd

/var/svc/log/network-nfs-server:default.log
[ Dec 11 10:05:07 Stopping because service restarting. ]
[ Dec 11 10:05:07 Executing stop method ("/lib/svc/method/nfs-server stop 96"). ]

I'm really stuck. Any assistance is much appreciated.

Kind regards,
Tom

-- 

Tom Robinson
IT Manager/System Administrator

MoTeC Pty Ltd

121 Merrindale Drive
Croydon South
3136 Victoria
Australia

T: +61 3 9761 5050
F: +61 3 9761 5051   
E: tom.robinson at motec.com.au

