[OmniOS-discuss] strange io-pattern

Henk Langeveld henk at hlangeveld.nl
Fri Dec 13 14:24:22 UTC 2013


There's a known problem with iostat -xn on multi-processor systems that I posted about on the illumos list
back in September/October, where we occasionally see an astronomical spike in the I/O wait and service times.

This appears to be caused by the hires kernel timer used by the kstat_io routines, which produces increasing
timestamp values only *per* *physical* *cpu*.  When the I/O events for one request are handled by different
cpus, the delta_t can become negative, and a 64-bit integer underflow then turns it into a huge value.
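
To illustrate the arithmetic only: the sketch below is not the kstat_io code, and the timestamps, the
500 ns skew and the helper name elapsed_ns are made up for the example. It assumes the delta ends up
being treated as an unsigned 64-bit quantity somewhere along the way, which is my reading of the symptom
rather than something I have confirmed in the source.

```python
MASK64 = (1 << 64) - 1  # width of an unsigned 64-bit value


def elapsed_ns(t_start, t_end):
    """Subtract two nanosecond timestamps the way unsigned 64-bit arithmetic would."""
    return (t_end - t_start) & MASK64


# Hypothetical timestamps: the second cpu's timer lags the first by 500 ns.
start_on_cpu_a = 1_000_000_500
end_on_cpu_b = 1_000_000_200  # the later event carries the smaller timestamp

# Both events on one cpu: a small, sensible delta.
same_cpu = elapsed_ns(1_000_000_500, 1_000_000_900)

# Events on two cpus: the true delta is -300 ns, which wraps around.
cross_cpu = elapsed_ns(start_on_cpu_a, end_on_cpu_b)

print(same_cpu)
print(cross_cpu)
```

A single wrapped delta of that size, added into a cumulative wait or run time, would be enough to swamp
every real sample in the same interval.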

These occurrences are rare, but frequent enough to mess up those wait times.  Also, the wait times only show
up with the combined '-x' and '-n' options.

Can you eliminate the possibility of such an incident?
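
One way to check would be to scan the iostat output for service times that are too large to be real.
This is only a rough sketch: it assumes comma-separated output as produced with -r, finds the wsvc_t and
asvc_t columns from the header line instead of relying on their position, and uses an arbitrary cut-off
of 1e9 ms. The function name, the cut-off and the sample lines are all made up for illustration.

```python
def find_suspect_samples(lines, limit_ms=1e9):
    """Return (device, column, value) for service times above limit_ms."""
    header = None
    suspects = []
    for line in lines:
        fields = [f.strip() for f in line.strip().split(",")]
        # A header line names the columns for the rows that follow.
        if "wsvc_t" in fields and "asvc_t" in fields:
            header = fields
            continue
        # Skip timestamps, banners and anything that is not a data row.
        if header is None or len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        for col in ("wsvc_t", "asvc_t"):
            try:
                value = float(row[col])
            except ValueError:
                continue
            if value > limit_ms:
                suspects.append((row.get("device", "?"), col, value))
    return suspects


# Made-up sample input, not real measurements.
sample = [
    "1386944662",
    "extended device statistics",
    "r/s,w/s,kr/s,kw/s,wait,actv,wsvc_t,asvc_t,%w,%b,device",
    "1.0,2.0,8.0,16.0,0.0,0.1,0.0,4.2,0,1,c0t0d0",
    "1.0,2.0,8.0,16.0,0.0,0.1,0.0,6148914691236.5,0,1,c0t1d0",
]
suspects = find_suspect_samples(sample)
print(suspects)
```

If nothing is flagged over a long run, the timer problem is probably not what you are seeing.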

I intended to post a bug report on this, but I've moved on since then, and don't have access to
any multi-cpu hardware right now.  I *think* I've seen it once in a multi-cpu VirtualBox instance, but have not
been able to reproduce that.  (This would suggest that VirtualBox actually emulates the physical cpu registers.)

Cheers,
Henk


On 13/12/2013 14:13, Tobias Oetiker wrote:
> I created a little plugin for collectd to interface with iostat. I
> guess having one for vfsstat and arcstat along the same lines would
> give a better picture as to what users actually experience but this
> one gives some impression as to what happens deep down.
>
> #!/usr/bin/perl
> my $filter = $ARGV[0] || '.+';
>
> my $pid = open my $iostat, "-|", "/usr/bin/iostat","-Tu","-xnr",int($ENV{COLLECTD_INTERVAL}) or die "launching iostat: $!";
>


