[OmniOS-discuss] Ang: Re: Two panics now while writing eager zeros to zvols

Richard Jahnel rjahnel at ellipseinc.com
Tue Oct 13 14:30:07 UTC 2015


Will experiment with it this afternoon. If it hasn't panicked by tomorrow evening, odds are this has identified and fixed the issue.

-----Original Message-----
From: Dan McDonald [mailto:danmcd at omniti.com]
Sent: Tuesday, October 13, 2015 6:40 AM
To: Johan Kragsterman; Dan McDonald
Cc: wuffers; Richard Jahnel; imemo; omnios-discuss at lists.omniti.com
Subject: Re: Ang: Re: [OmniOS-discuss] Two panics now while writing eager zeros to zvols


> On Oct 10, 2015, at 4:49 AM, Johan Kragsterman <johan.kragsterman at capvert.se> wrote:
>
> Dan, is this upstreamed at all...?

No, it's not.  The following not-yet-upstreamed diff is a subset of a larger fix from illumos-nexenta:

diff --git a/usr/src/uts/common/io/comstar/lu/stmf_sbd/sbd_scsi.c b/usr/src/uts/common/io/comstar/lu/stmf_sbd/sbd_scsi.c
index cb6e115..7242d15 100644
--- a/usr/src/uts/common/io/comstar/lu/stmf_sbd/sbd_scsi.c
+++ b/usr/src/uts/common/io/comstar/lu/stmf_sbd/sbd_scsi.c
@@ -2347,6 +2347,7 @@ write_same_xfer_done:
                if (scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
                        stmf_scsilib_send_status(task, STATUS_CHECK,
                            STMF_SAA_WRITE_ERROR);
+                       ret = (int)SBD_FAILURE;
                } else {
                        ret = sbd_write_same_data(task, scmd);
                        if (ret != SBD_SUCCESS) {
@@ -2355,15 +2356,24 @@ write_same_xfer_done:
                        } else {
                                stmf_scsilib_send_status(task, STATUS_GOOD, 0);
                        }
+                       if ((scmd->flags & SBD_SCSI_CMD_TRANS_DATA) &&
+                           scmd->trans_data != NULL) {
+                               kmem_free(scmd->trans_data, scmd->trans_data_len);
+                               scmd->trans_data = NULL;
+                               scmd->trans_data_len = 0;
+                               scmd->flags &= ~SBD_SCSI_CMD_TRANS_DATA;
+                       }
                }
                /*
-                * Only way we should get here is via handle_write_same(),
-                * and that should make the following assertion always pass.
+                * Do the send_status afterwards, because of a potential
+                * double-free problem.
                 */
-               ASSERT((scmd->flags & SBD_SCSI_CMD_TRANS_DATA) &&
-                   scmd->trans_data != NULL);
-               kmem_free(scmd->trans_data, scmd->trans_data_len);
-               scmd->flags &= ~SBD_SCSI_CMD_TRANS_DATA;
+               if (ret != SBD_SUCCESS) {
+                       stmf_scsilib_send_status(task, STATUS_CHECK,
+                           STMF_SAA_WRITE_ERROR);
+               } else {
+                       stmf_scsilib_send_status(task, STATUS_GOOD, 0);
+               }
                return;
        }
        sbd_do_write_same_xfer(task, scmd, dbuf, dbuf_reusable);

It fixes a double-free in this path, which the eager-zero writes seem to trigger.
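
For readers who want to see the guard pattern from the diff in isolation, here is a minimal userland sketch. It is not the stmf_sbd code: cmd_t, CMD_TRANS_DATA, cmd_attach_trans_data() and cmd_release_trans_data() are hypothetical stand-ins, and malloc/free replace the kernel allocator. The only point it illustrates is that checking the ownership flag and the pointer, then clearing both after the free, turns a second release into a no-op.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for SBD_SCSI_CMD_TRANS_DATA; the value is arbitrary. */
#define	CMD_TRANS_DATA	0x01u

/* Hypothetical, trimmed-down analogue of the per-command state. */
typedef struct cmd {
	uint32_t	flags;
	uint8_t		*trans_data;
	size_t		trans_data_len;
} cmd_t;

/*
 * Allocate a zeroed transfer buffer and mark the command as owning it.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int
cmd_attach_trans_data(cmd_t *c, size_t len)
{
	c->trans_data = calloc(1, len);
	if (c->trans_data == NULL)
		return (-1);
	c->trans_data_len = len;
	c->flags |= CMD_TRANS_DATA;
	return (0);
}

/*
 * Release the transfer buffer at most once.  The flag and pointer are
 * both checked before freeing and both cleared afterwards, so a second
 * call finds nothing to free.  Returns 1 if this call freed the buffer,
 * 0 otherwise.
 */
static int
cmd_release_trans_data(cmd_t *c)
{
	if ((c->flags & CMD_TRANS_DATA) && c->trans_data != NULL) {
		free(c->trans_data);
		c->trans_data = NULL;
		c->trans_data_len = 0;
		c->flags &= ~CMD_TRANS_DATA;
		return (1);
	}
	return (0);
}
```

With this shape, calling cmd_release_trans_data() twice on the same command frees the buffer on the first call and does nothing on the second. The old unconditional kmem_free() at the end of the path had no such protection.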

I'm attaching an stmf_sbd binary as well.  Anyone affected by this can try the binary as follows:

1.) beadm create test-be

2.) beadm mount test-be /mnt

3.) cp stmf_sbd /mnt/kernel/drv/amd64/stmf_sbd

4.) bootadm update-archive -R /mnt

5.) Reboot, watch for grub

6.) Select "test-be" from the grub menu.  This will be a boot-once-into-the-new-BE test.

7.) Try your eager-zero test.

I look forward to seeing any experimental results.

Thanks,
Dan
