[OmniOS-discuss] ZFS data corruption

Doug Hughes doug at will.to
Sat Aug 22 17:02:12 UTC 2015


I've recently been seeing spontaneous checksum failures/corruption on read 
at the zvol level on a box running r12 as well. None of the disks show any 
errors. All of the errors show up at the zvol level until every disk in the 
vol is marked as degraded, and then a reboot clears it up. Repeated scrubs 
find files to delete, but after additional heavy read I/O activity, more 
checksum-on-read errors occur and more files need to be removed. So far on 
r14 I haven't seen this, but I'm keeping an eye on it.
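
A minimal sketch of how one might watch for this pattern (checksum counts 
appearing on pool or vdev rows while the disk rows stay at 0). It assumes 
the usual NAME/STATE/READ/WRITE/CKSUM table printed by `zpool status`; the 
function name and the sample text are hypothetical, not output from this box.

```python
# Hypothetical sample in the layout printed by `zpool status`; the pool name,
# device names and counts are made up for illustration.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0    12
          raidz1-0  DEGRADED     0     0    24
            c1t0d0  DEGRADED     0     0     0
            c1t1d0  DEGRADED     0     0     0
            c1t2d0  DEGRADED     0     0     0

errors: 3 data errors, use '-v' for a list
"""


def cksum_errors(status_text):
    """Return {row_name: cksum_field} for table rows whose CKSUM is not "0".

    The CKSUM field is kept as a string because large counts may be
    abbreviated in the output rather than printed as plain integers.
    """
    flagged = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True  # header row: device rows follow
            continue
        if in_table:
            if not fields:
                in_table = False  # a blank line ends the device table
                continue
            if len(fields) >= 5 and fields[4] != "0":
                flagged[fields[0]] = fields[4]
    return flagged


if __name__ == "__main__":
    for name, count in cksum_errors(SAMPLE).items():
        print(name, count)
```

In practice the text would come from running `zpool status` on the host 
rather than from a hard-coded string.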

The write activity on this server is very low. I'm currently trying to 
evacuate it with zfs send | mbuffer to another host over 10GbE, so the 
read activity is very high and sustained over a long period, since I 
have to move about 10TB.
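
For a sense of how long that read load lasts, here is a rough lower-bound 
estimate. It assumes decimal terabytes and a 10 Gbit/s link; the 
`efficiency` parameter is a hypothetical knob for the fraction of line rate 
actually achieved, not a measured value.

```python
def transfer_hours(size_tb, link_gbit_s, efficiency=1.0):
    """Estimate hours needed to move size_tb terabytes over a link.

    size_tb     -- data size in decimal terabytes (1 TB = 1e12 bytes)
    link_gbit_s -- nominal link speed in gigabits per second
    efficiency  -- assumed fraction of line rate actually achieved (0..1]
    """
    size_bits = size_tb * 1e12 * 8
    rate_bits_per_s = link_gbit_s * 1e9 * efficiency
    return size_bits / rate_bits_per_s / 3600


if __name__ == "__main__":
    # Best case at full line rate, then an assumed 50% of line rate.
    print(round(transfer_hours(10, 10), 2))
    print(round(transfer_hours(10, 10, efficiency=0.5), 2))
```

At full line rate that is a little over two hours; any real transfer will 
take longer, so the pool sees sustained reads for at least that long.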


On 8/21/2015 2:06 AM, wuffers wrote:
> Oh, the PSOD is not caused by the corruption in ZFS - I suspect it was 
> the other way around (VMware host PSOD -> ZFS corruption). I've 
> experienced the PSOD before, it may be related to IO issues which I 
> outlined in another post here:
> http://lists.omniti.com/pipermail/omnios-discuss/2015-June/005222.html
>
> Nobody chimed in, but it's an ongoing issue. I need to dedicate more 
> time to troubleshoot, but other projects are taking my attention right 
> now (coupled with a personal house move, time is at a premium!).
>
> Also, I've had many improper shutdowns of the hosts and VMs, and this 
> was the first time I've seen a ZFS corruption.
>
> I know I'm repeating myself, but my question is still:
> - Can I safely use this block device again now that it reports no 
> errors? Again, I've moved all data off of it.. and there are no other 
> signs of hardware issues. Recreate it?
>
> On Wed, Aug 19, 2015 at 12:49 PM, Stephan Budach 
> <stephan.budach at jvm.de> wrote:
>
>     Hi Joerg,
>
>     On 19.08.15 at 14:59, Joerg Goltermann wrote:
>
>         Hi,
>
>         The PSOD you got can cause the problems with your Exchange database.
>
>         Can you check the ESXi logs for the root cause of the PSOD?
>
>         I never got a PSOD on such a "corruption". I still think this is
>         a "cosmetic" bug, but this should be verified by one of the ZFS
>         developers ...
>
>          - Joerg
