<?xml version="1.0" encoding="UTF-8"?>

<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule">

<channel>
<title>Topic &#x27;Re: Problem with Open Solaris + disks&#x27; in forum &#x27;All Things Unix&#x27; - dslreports.com</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22440439</link>
<description></description>
<language>en</language>
<pubDate>Sat, 11 Feb 2012 10:49:25 EDT</pubDate>
<lastBuildDate>Sat, 11 Feb 2012 10:49:25 EDT</lastBuildDate>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22637720</link>
<description><![CDATA[galacticroot posted : I finally managed to move the files over to an Openfiler box and I just installed Debian on the system that had OpenSolaris.<br><br>I used smartmontools to check both hard drives.  Disk 1 is fine, but disk 2 has quite a few reallocated sectors (currently at 1996).  I will definitely replace it if I get any more reallocated sectors.  I suspect that the read failures caused by disk 2 were not being handled well by OpenSolaris (or rather the SATA driver).<br><br>Linux seems to handle the errors correctly, although I haven't tried it out with the Xen VMs yet (I'm currently transferring the images).]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22637720</guid>
<pubDate>Wed, 01 Jul 2009 00:24:54 EDT</pubDate>
</item>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22466625</link>
<description><![CDATA[galacticroot posted : This was just a personal system which I built primarily as a NAS box, but also to run a couple VMs for various things.  I was more or less running OpenSolaris strictly for the features of ZFS.  I thought I would end up adding a lot more space and using some other features of ZFS which I never ended up using.<br><br>I will look at btrfs.  It sounds like it could be very nice in the future.<br><br>For now, OpenFiler looks like it will work well enough for NAS, and Linux or BSD will be good for running the VMs.]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22466625</guid>
<pubDate>Fri, 29 May 2009 21:26:18 EDT</pubDate>
</item>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22464359</link>
<description><![CDATA[koitsu posted : <div class="bquote"><small>said by <a href="/profile/156851" onClick="this.blur(); return popup(event,'/uidpop?ajh=1&uid=156851');">beerbum</a>:</small><br><br>huh??.. maybe you missed it.. the OP is running OpenSolaris (SunOS 5.11), not Solaris (SunOS 5.10).. ...</div>Well colour me stupid.  For the longest while now I've been under the impression that Solaris 10 (5.10) was in fact OpenSolaris.  Good lord, there's something seriously wrong when an administrator of machines doesn't even know what the official title of his OS is.<br><br>I think I might save this thread to remind me of my stupid moments.<br><br>Thanks for clearing that up for me -- I appreciate it.  (Damn you Sun...)<br><small>--<br>Making life hard for others since 1977.<br>I speak for myself and not my employer/affiliates of my employer.</small>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22464359</guid>
<pubDate>Fri, 29 May 2009 13:45:20 EDT</pubDate>
</item>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22462382</link>
<description><![CDATA[beerbum posted : <div class="bquote"><small>said by <a href="/profile/659143" onClick="this.blur(); return popup(event,'/uidpop?ajh=1&uid=659143');">koitsu</a>:</small><br><br>Given that my place of employment has relied on Solaris 8 through 10 (~90% of our machines are using 10 at this point) for 5+ years now, my experience is quite the opposite. </div>huh??.. maybe you missed it.. the OP is running OpenSolaris (SunOS 5.11), not Solaris (SunOS 5.10).. I would not recommend anyone run <i>Open</i>Solaris in a production system.  Heck, no admin worth anything would recommend that.. Comparing <i>Open</i>Solaris to Solaris, the SATA support in Solaris is much more robust than the same in OpenSolaris.. While the new features and whatnot in OpenSolaris do make it into production Solaris, one should consider OpenSolaris as a beta product.<br><br>Hell I'm pretty sure even Sun does not recommend using OpenSolaris in a production environment..<br><br><div class="bquote">Use whatever OS gets the job done.  If that's Linux, great.  If that's Solaris 10, great.  If that's OS/2, I'll punch you.  ;-) </div>I started out (*nix) adminning on IBM RS2K's.. guess what I used on my peecee - yup OS/2 Warp..  In fact, my Rexx-fu is what helped me land my first gig as an admin..]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22462382</guid>
<pubDate>Fri, 29 May 2009 07:12:40 EDT</pubDate>
</item>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22462339</link>
<description><![CDATA[koitsu posted : <div class="bquote"><small>said by <a href="/profile/1008872" onClick="this.blur(); return popup(event,'/uidpop?ajh=1&uid=1008872');">galacticroot</a>:</small><br><br>Okay, I tried changing to AHCI mode, but the Open Solaris drivers seem to have an issue with that and can't mount the root filesystem.  Returning to the original mode allows it to boot.</div>This isn't a "driver issue" -- it's probably that the device label (and all underlying filesystem slices) has changed.  The disks probably won't be named "c3d0s0" any more, but could be "cXd0s0" where X is some new number, or be labelled "sdX".<br><br>We have the same problem on FreeBSD, and Windows requires an entire reinstall.  :-)<br><br>ZFS should be able to cope with the device names changing.<br><br><div class="bquote"><small>said by <a href="/profile/156851" onClick="this.blur(); return popup(event,'/uidpop?ajh=1&uid=156851');">beerbum</a>:</small><br><br>may I ask a stupid question.. why are you using OpenSolaris to begin with? if this is a mission critical server, heck even if it's a mission optional machine, I do not recommend running OpenSolaris.<br><br>Sun's production Solaris costs just the same and is (IMO) the more reliable route to go in a business setting.</div>Given that my place of employment has relied on Solaris 8 through 10 (~90% of our machines are using 10 at this point) for 5+ years now, my experience is quite the opposite.  I'm talking multiple thousands of machines, all x86 (at this point), and all are mission + time-critical (all production, and are involved with VoIP + IVR; 2-3 second "stalls" or other oddities a server might encounter result in horrible caller experience, and we can't have that).<br><br>We open SunSolve cases for strange things we encounter and Sun is responsive.  I'm not saying "you're wrong", I'm saying my experience is entirely different.  Of course, low-level administration of devices and hardware on Solaris is significantly better on SPARC (and I do mean significantly), but x86 is standard these days.<br><br>Btrfs is the only thing on Linux that even remotely behaves like the OP's ZFS configuration.  I would HIGHLY recommend the OP read the following thread (and news article!):<br><br>&raquo;<A HREF="/forum/r22399545-Chris-Mason-Interview-BTRFS-Founder-Lead-Developer">Chris Mason Interview - BTRFS Founder & Lead Developer</A><br><br>Use whatever OS gets the job done.  If that's Linux, great.  If that's Solaris 10, great.  If that's OS/2, I'll punch you.  ;-)<br><small>--<br>Making life hard for others since 1977.<br>I speak for myself and not my employer/affiliates of my employer.</small>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22462339</guid>
<pubDate>Fri, 29 May 2009 06:50:06 EDT</pubDate>
</item>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22460750</link>
<description><![CDATA[beerbum posted : may I ask a stupid question.. why are you using OpenSolaris to begin with?  if this is a mission critical server, heck even if it's a mission optional machine, I do not recommend running OpenSolaris.<br><br>Sun's production Solaris costs just the same and is (IMO) the more reliable route to go in a business setting.]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22460750</guid>
<pubDate>Thu, 28 May 2009 20:53:05 EDT</pubDate>
</item>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22460601</link>
<description><![CDATA[galacticroot posted : Okay, I tried changing to AHCI mode, but the Open Solaris drivers seem to have an issue with that and can't mount the root filesystem.  Returning to the original mode allows it to boot.<br><br>The hardware I find that definitely works well with Open Solaris seems to all be higher end than I can afford right now.  I'm going to switch the file storage over to a separate NAS setup running OpenFiler, and replace Open Solaris with Linux on this system and run the VMs on that.]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22460601</guid>
<pubDate>Thu, 28 May 2009 20:22:00 EDT</pubDate>
</item>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22444023</link>
<description><![CDATA[koitsu posted : The errors indicate LBA read failures on Disk 0 and Disk 2.  The fact that the errors occurred (according to your logs) within 5 seconds of one another is a little suspicious.<br><br>I'd recommend looking at SMART stats on both of these disks to see if you can discern what's going on.  However, Solaris 10 doesn't offer an API for obtaining SMART data from ATA/SATA disks -- only SCSI.  So smartmontools won't help you here.<br><br>SMART statistics will help determine if the disks themselves are actually witnessing bad blocks (and remapping them), or if the controller is responsible.<br><br>But you'll have to boot into another OS (FreeBSD, Linux, Windows) to get SMART statistics with smartmontools.<br><br>If you bought both ATA/SATA disks at the same time, it's possible both have problems.  Otherwise, if both Disk 0 and Disk 2 are on the same physical controller (which they appear to be), I'd recommend the following:<br><br>1) Are these internal SATA drives?  If so, replace the SATA cables.  You should only have to do this once.  If the problem recurs, it's not the cables.<br><br>2) If the SATA drives are external via eSATA, are you using a SATA-to-eSATA adapter bracket (e.g. a cable runs from the onboard SATA controller to the backplane, with an eSATA connector)?  If so, get rid of it -- chances are you're exceeding SATA cable length.  Buy yourself a real PCI/PCIe-based eSATA controller.<br><br>3) If your motherboard BIOS has support for AHCI, enable it in the BIOS.  I have no idea if the SB780 supports AHCI or not.  You should always go with AHCI if given the chance, especially on server systems.<br><br>4) Open a SunSolve case, as the problem could be Solaris having buggy support for the SB780 SATA controller (do not even for a moment think all SATA controllers are alike).  My money is on this being the root cause.<br><br>zpool status indicates that the disks are literally falling off the SATA bus ("cannot open").  There's got to be other messages on your console to indicate that, not just sector read errors.<br><br>Regarding hardware compatibility:<br><br>I've never seen Solaris 10 behave flawlessly with SATA disks.  At work we have Intel ICHx-based controllers with both SATA SSDs and standard drives.  Our machines work fine -- no problems -- except during boot-up we see some CMD errors to the disks (iostat -e shows the same).  There's nothing wrong with the SATA controllers here -- it's that Solaris is trying to issue SCSI commands to SATA disks, and the SCSISATA layer doesn't properly remap the commands (or remove ones not supported).  Thus, the errors seen are false.  We use ZFS and we don't have problems with the disks falling off the bus, though -- the only errors occur during boot.<br><br>If I had to make a recommendation, I'd say go with a motherboard that provides an on-board Intel ICHx controller.  ICH7 would probably be best, possibly ICH9 (newer).  I don't think Solaris 10 has decent support for ICH10 yet (too new).<br><small>--<br>Making life hard for others since 1977.<br>I speak for myself and not my employer/affiliates of my employer.</small>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22444023</guid>
<pubDate>Tue, 26 May 2009 04:00:42 EDT</pubDate>
</item>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22441647</link>
<description><![CDATA[galacticroot posted : Now the other disk in rpool went offline and doesn't come back up even when I reboot for some reason.  I probably have to power cycle the box, not just reboot.<br><br>I am willing to replace the controller with a PCIe one, but I'm not sure what Open Solaris supports well.  Are there any PCIe SATA cards with >=4 ports that it supports well?<br><br>I am even willing to replace the motherboard if that will help.  I just don't want to spend a lot of money on hardware that may not even work well with it.]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22441647</guid>
<pubDate>Mon, 25 May 2009 17:11:08 EDT</pubDate>
</item>

<item>
<title>Re: Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22441393</link>
<description><![CDATA[beerbum posted : if you were running SCSI I'd suggest replacing the cable<br><br>can you move one of the disks (Disk 2) to a different controller?  or even better attach an additional drive to a different controller.<br><br>it's possible you could be overloading the controller - something easy to do with the poor SATA drivers that are packaged.]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Re-Problem-with-Open-Solaris-disks-22441393</guid>
<pubDate>Mon, 25 May 2009 16:15:53 EDT</pubDate>
</item>

<item>
<title>Problem with Open Solaris + disks</title>
<link>http://www.dslreports.com/forum/Problem-with-Open-Solaris-disks-22440439</link>
<description><![CDATA[galacticroot posted : I've got a server using an SB780 onboard SATA controller running Open Solaris.  I am using it as a file server and to run several Xen VMs.  My problems started when I recently added a VM to use as a mail server.  I allocated a 40GB zfs volume on rpool for its disk image and installed Debian on it.  Everything went as expected and I got the server set up, then the problems started.<br><br>Usually, what will happen is that the VM will crash, accompanied by messages like this on the host machine:<br><pre class="brush: text">May 25 11:22:51 hydrogen xpv_psm: [ID 803547 kern.info] xVM_psm: ide (ata) instance 2 vector 0xe ioapic 0x2 intin 0xe is bound to cpu 1
May 25 11:22:51 hydrogen xpv_psm: [ID 803547 kern.info] xVM_psm: ide (ata) instance 3 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 0
May 25 11:22:51 hydrogen genunix: [ID 408114 kern.info] /xpvd/xnb@4,0 (xnbo0) online
May 25 11:22:51 hydrogen xpv_psm: [ID 803547 kern.info] xVM_psm: ide (ata) instance 2 vector 0xe ioapic 0x2 intin 0xe is bound to cpu 1
May 25 11:22:51 hydrogen xpv_psm: [ID 803547 kern.info] xVM_psm: ide (ata) instance 3 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 0
May 25 11:22:51 hydrogen mac: [ID 469746 kern.info] NOTICE: vnic1009 registered
May 25 11:23:32 hydrogen genunix: [ID 698548 kern.notice] ata_disk_start: select failed
May 25 11:24:02 hydrogen last message repeated 6 times
May 25 11:24:02 hydrogen scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci-ide@11/ide@0 (ata0):
May 25 11:24:02 hydrogen        timeout: early timeout, target=0 lun=0
May 25 11:24:02 hydrogen gda: [ID 107833 kern.warning] WARNING: /pci@0,0/pci-ide@11/ide@0/cmdk@0,0 (Disk0):
May 25 11:24:02 hydrogen        Error for command 'read sector' Error Level: Informational
May 25 11:24:02 hydrogen gda: [ID 107833 kern.notice]   Sense Key: aborted command
May 25 11:24:02 hydrogen gda: [ID 107833 kern.notice]   Vendor 'Gen-ATA ' error code: 0x3
May 25 11:24:08 hydrogen genunix: [ID 698548 kern.notice] ata_disk_start: select failed
May 25 11:24:18 hydrogen last message repeated 2 times
May 25 11:24:18 hydrogen scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci-ide@11/ide@0 (ata0):
May 25 11:24:18 hydrogen        timeout: early timeout, target=1 lun=0
May 25 11:24:23 hydrogen genunix: [ID 698548 kern.notice] ata_disk_start: select failed
May 25 11:24:23 hydrogen gda: [ID 107833 kern.warning] WARNING: /pci@0,0/pci-ide@11/ide@0/cmdk@1,0 (Disk2):
May 25 11:24:23 hydrogen        Error for command 'read sector' Error Level: Informational
May 25 11:24:23 hydrogen gda: [ID 107833 kern.notice]   Sense Key: aborted command
May 25 11:24:23 hydrogen gda: [ID 107833 kern.notice]   Vendor 'Gen-ATA ' error code: 0x3
May 25 11:24:28 hydrogen genunix: [ID 698548 kern.notice] ata_disk_start: select failed
May 25 11:24:37 hydrogen scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci-ide@11/ide@0 (ata0):
</pre><!--end code block--><br>Zpool will typically show problems:<br><pre class="brush: text">  pool: rpool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: scrub completed after 1h48m with 0 errors on Sun May 17 22:55:01 2009
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       DEGRADED     0     0     0
          mirror    DEGRADED     1     0     0
            c3d0s0  UNAVAIL      8 32.7K     0  cannot open
            c4d0s0  ONLINE       1     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Mon May 25 09:44:42 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c3d1    ONLINE       0     4     0  85K resilvered
            c4d1    ONLINE       0     0     0  54K resilvered
</pre><!--end code block--><br>If I online any offline devices, clear the errors, restart the VM, and scrub the pools, everything works fine for a while, until it happens again.  It seems to happen when a moderate, but not necessarily heavy IO load occurs.<br><br>I am thinking this is a problem with the disk controller, not the disks.  Although the first disk gets far more errors than the others, sometimes the other disks are the ones to have failures and the first disk continues working.<br><br>This is the first time a disk actually went offline.  I couldn't successfully online it without rebooting.<br><br>Has anyone had this problem before?  Is there likely a software fix for this, or should I just get a new SATA controller (provided it isn't actually the disks)?]]></description>
<guid isPermaLink="true">http://www.dslreports.com/forum/Problem-with-Open-Solaris-disks-22440439</guid>
<pubDate>Mon, 25 May 2009 12:20:15 EDT</pubDate>
</item>

</channel>
</rss>

