[OmniOS-discuss] big zfs storage?
Mick Burns
bmx1955 at gmail.com
Wed Oct 7 20:59:14 UTC 2015
So... how does Nexenta cope with hot spares and all kinds of disk failures?
Adding hot spares is part of their administration manuals, so can we
assume things are almost always handled smoothly? I'd like to hear
about tangible experiences in production.
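For reference, what I mean is the stock ZFS spare handling, roughly this
(pool and device names below are made up):

  # add a hot spare to an existing pool
  zpool add tank spare c5t8d0

  # when FMA faults a data disk, the zfs-retire agent pulls the spare in;
  # once the failed disk has been replaced and resilvered, detaching the
  # spare returns it to the available list
  zpool detach tank c5t8d0

What I'm wondering is whether that actually kicks in cleanly on real-world
failures.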
thanks
On Mon, Jul 13, 2015 at 7:58 AM, Schweiss, Chip <chip at innovates.com> wrote:
> Liam,
>
> This report is encouraging. Please share some details of your
> configuration. What disk failure parameters have you set? Which
> JBODs and disks are you running?
>
> I have mostly DataON JBODs and some Supermicro. The DataON units have PMC SAS
> expanders and the Supermicro have LSI; both setups show pretty much the same
> behavior on disk failures. All my servers are Supermicro with LSI HBAs.
>
> If there's a magic combination of hardware and OS config out there that
> solves the disk failure panic problem, I will certainly change my builds
> going forward.
>
> -Chip
>
> On Fri, Jul 10, 2015 at 1:04 PM, Liam Slusser <lslusser at gmail.com> wrote:
>>
>> I have two 800T ZFS systems on OmniOS and a bunch of smaller <50T systems.
>> Things generally work very well. We lose a disk here and there, but it's
>> never resulted in downtime. They're all on Dell hardware with LSI or Dell
>> PERC controllers.
>>
>> Setting smaller disk failure parameters, so bad disks get faulted more
>> quickly, was a big help when something does go wrong with a disk.
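>>
>> By "disk failure parameters" I mean the sd driver timeout/retry tunables,
>> something along these lines (the values and drive model string here are
>> illustrative, not our exact config):
>>
>>   * /etc/system -- lower the default 60-second per-command timeout
>>   set sd:sd_io_time = 10
>>
>>   # /kernel/drv/sd.conf -- per-model retry limits via sd-config-list
>>   sd-config-list = "HGST    HUS724040AL",
>>     "retries-timeout:2,retries-busy:2";
>>
>> With the defaults, a dying disk can hang I/O for a long time before
>> anything gives up on it.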
>>
>> thanks,
>> liam
>>
>>
>> On Fri, Jul 10, 2015 at 10:31 AM, Schweiss, Chip <chip at innovates.com>
>> wrote:
>>>
>>> Unfortunately, for the past couple of years panics on disk failure have been
>>> the norm. All my production systems are HA with RSF-1, so at least things
>>> come back online relatively quickly. There are quite a few open tickets in
>>> the Illumos bug tracker for mpt_sas-related panics.
>>>
>>> Most of the work to fix these problems has been committed in the past
>>> year, though problems still exist. For example, my systems are dual-path
>>> SAS, yet mpt_sas will panic if you pull a cable instead of dropping a
>>> path to the disks. Dan McDonald is actively working to resolve this. He
>>> is also pushing a bug fix in genunix from Nexenta that appears to fix a lot
>>> of the panic problems. I'll know for sure in a few months, after I see a
>>> disk or two drop, whether it truly fixes things. Hans Rosenfeld at Nexenta
>>> is responsible for most of the updates to mpt_sas, including support for
>>> the 3008 (12G SAS).
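>>>
>>> (For anyone checking their own dual-path setups: with scsi_vhci doing the
>>> multipathing, something like the following shows whether both paths are
>>> actually present -- the device name here is just an example:
>>>
>>>   mpathadm list lu
>>>   mpathadm show lu /dev/rdsk/c0t5000C500ABCD1234d0s2
>>>
>>> Total and operational path counts should both read 2 on a healthy disk.)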
>>>
>>> I haven't run any 12G SAS yet, but plan to on my next build in a couple of
>>> months. That will be about 300TB using an 84-disk JBOD. All the code from
>>> Nexenta to support the 3008 appears to be in Illumos now, and they fully
>>> support it, so I suspect it's pretty stable by now. From what I understand,
>>> there may be some 12G performance fixes coming at some point.
>>>
>>> The fault manager is nice when the system doesn't panic. When it panics,
>>> the fault manager never gets a chance to take action. It is still the
>>> consensus that it is better to run pools without hot spares, because there
>>> are situations where the fault manager will do bad things. I witnessed this
>>> myself when building a system: the fault manager replaced 5 disks in a
>>> raidz2 vdev inside of a minute, trashing the pool. I haven't completely
>>> yielded to that "best practice", though; I now run one hot spare per pool.
>>> I figure that with raidz2, the odds of the fault manager causing something
>>> catastrophic are much lower.
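>>>
>>> (When the box stays up, the standard FMA tools at least show what the
>>> retire agent was reacting to:
>>>
>>>   fmadm faulty   # resources FMA currently considers faulted
>>>   fmdump         # the fault log: which diagnoses were made, and when
>>>   fmdump -eV     # the raw error reports behind those diagnoses
>>>
>>> which makes it possible to reconstruct after the fact why it pulled a disk.)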
>>>
>>> -Chip
>>>
>>>
>>>
>>> On Fri, Jul 10, 2015 at 11:37 AM, Linda Kateley <lkateley at kateley.com>
>>> wrote:
>>>>
>>>> I have to build and maintain my own system. I usually help others
>>>> build (I teach ZFS and FreeNAS classes and do consulting). I really love
>>>> fault management in Solaris and miss it. I just thought that since it's my
>>>> system and I get to choose, I would use OmniOS. I have 20+ years using
>>>> Solaris and only 2 on FreeBSD.
>>>>
>>>> I like FreeBSD for how well tuned it is for ZFS out of the box. I miss the
>>>> network, v12n, and resource controls in Solaris.
>>>>
>>>> Concerned about panics on disk failure. Is that common?
>>>>
>>>>
>>>> linda
>>>>
>>>>
>>>> On 7/9/15 9:30 PM, Schweiss, Chip wrote:
>>>>
>>>> Linda,
>>>>
>>>> I have 3.5 PB running under OmniOS. All my systems have LSI 2108 HBAs,
>>>> which are generally considered the best choice of HBA.
>>>>
>>>> Illumos leaves a bit to be desired in handling faults from disks or
>>>> SAS problems, but things under OmniOS have been improving, thanks in large
>>>> part to Dan McDonald and OmniTI. We have paid support on all of our
>>>> production systems with OmniTI; their response and dedication have been
>>>> very good. Other than the occasional panic and restart from a disk failure,
>>>> OmniOS has been solid. ZFS, of course, has never lost a single bit of
>>>> information.
>>>>
>>>> I'd be curious why you're looking to move; have there been specific
>>>> problems under BSD or ZoL? I've been slowly evaluating FreeBSD ZFS, but of
>>>> course the skeletons in the closet never seem to come out until you do
>>>> something big.
>>>>
>>>> -Chip
>>>>
>>>> On Thu, Jul 9, 2015 at 4:21 PM, Linda Kateley <lkateley at kateley.com>
>>>> wrote:
>>>>>
>>>>> Hey, is there anyone out there running big ZFS on OmniOS?
>>>>>
>>>>> I have been doing mostly ZoL and FreeBSD for the last year, but I have to
>>>>> build a 300+TB box and I want to come back home to my roots (Solaris).
>>>>> Feeling kind of hesitant :) Also, if you had it to do over, is there
>>>>> anything you would do differently?
>>>>>
>>>>> Also, what is the go-to HBA these days? It seems like I saw stable code
>>>>> for the LSI 3008?
>>>>>
>>>>> TIA
>>>>>
>>>>> linda
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Linda Kateley
>>>> Kateley Company
>>>> Skype ID-kateleyco
>>>> http://kateleyco.com
>>>
>>>
>>>
>>>
>>
>
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>