[OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes

Matej Zerovnik matej at zunaj.si
Tue May 12 05:13:35 UTC 2015


I know building a single 50-drive RaidZ2 is a bad idea. As I said, it's 
a legacy setup that I can't easily change. I already have a backup pool 
with 7x 10-drive RaidZ2 vdevs, to which I hope to switch this week. I 
hope to get better results and less crashing...
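
For reference, the new pool layout is roughly the sketch below (the 
pool name and device names are just placeholders, not the actual 
controller paths):

  # sketch only: 'backup' and the cXtYd0 names are placeholders
  zpool create backup \
      raidz2 c2t0d0  c2t1d0  c2t2d0  c2t3d0  c2t4d0 \
             c2t5d0  c2t6d0  c2t7d0  c2t8d0  c2t9d0 \
      raidz2 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0 \
             c2t15d0 c2t16d0 c2t17d0 c2t18d0 c2t19d0
  # ... plus five more 10-drive raidz2 vdevs for the full 7x10 layout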

What is interesting is that when the 'event' happens, the server works 
normally: ZFS is accessible and writable (at least, there are no errors 
in the log files); only iSCSI reports errors and drops the connection. 
Another interesting thing is that after the 'event', all writes stop 
while reads continue for another 30 minutes. After those 30 minutes, 
all traffic stops for half an hour. After that, everything starts 
coming back up... Weird?!
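
Next time the event happens, I plan to check something like the 
following to separate the ZFS side from the iSCSI/COMSTAR side (just a 
rough checklist; 'tank' stands in for the real pool name):

  # 'tank' below is a placeholder for the real pool name
  zpool status -x          # only pools that report problems
  zpool iostat -v tank 5   # is the pool still doing I/O, per vdev?
  iostat -xn 5             # per-device busy, wait and service times
  svcs -xv                 # SMF services in maintenance/degraded state
  stmfadm list-state       # COMSTAR operational state
  itadm list-target -v     # iSCSI target state and sessions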

Matej

On 09. 05. 2015 02:49, Richard Elling wrote:
>
>> On May 5, 2015, at 9:48 AM, Matej Zerovnik <matej at zunaj.si> wrote:
>>
>> I will replace the hardware in about 4 months with all SAS drives, 
>> but I would love to have a working setup for the time being as well ;)
>>
>> I looked at the SMART stats and there don't seem to be any errors. 
>> Also, no hard/soft/transfer errors reported by any drive. I will 
>> take a look at the service times tomorrow, and maybe feed the drive 
>> stats into Graphite to look at them over a longer period.
>>
>> I looked at the iostat -x output today, and the stats for the pool 
>> itself reported 100% busy most of the time, 98-100% wait, 500-1300 
>> transactions in queue, around 500 active... The first line, which is 
>> the average since boot, shows an average service time of around 
>> 1600ms, which seems like a lot. Can it be due to the really big queue?
>>
>> Would it help to create five 10-drive raidz pools instead of one 
>> with 50 drives?
>
> It is a bad idea to build a single raidz set with 50 drives. Very bad. 
> Hence the zpool
> man page says, "The recommended number is between 3 and 9 to help 
> increase performance."
> But this recommendation applies to reliability, too.
>  -- richard
>
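
PS: Regarding the iostat -x numbers quoted above: since the first 
report is the cumulative average since boot, I will rely on interval 
samples instead, roughly like this ('tank' again stands in for the 
real pool name):

  # 'tank' is a placeholder for the real pool name
  iostat -xn 10 6          # six 10-second samples; skip the first
                           # (since-boot) report
  zpool iostat -v tank 10  # per-vdev operations and bandwidth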
