said by rockisland: My question is whether you think the drive is salvageable or will it always be suspect and I'd be better off replacing it. If it's worth a shot I'd give writing to it a try. It can't hurt anything at this point. :)
From my perspective there's absolutely nothing anomalous about the drive aside from at least 1 sector that may or may not be bad (won't know until a write is issued to the LBA). Your choices here:
1. Zero the entire drive (writing zeros to every LBA). HD Tune Pro can do this via the Erase tab, or you can use whatever other utility you want (CCleaner, for example, has this feature too). FORMAT will not do this (at least not on XP), nor will Disk Management. Take a screenshot/snapshot of the SMART attributes before and after the drive is erased. I can do the post-analysis from there.
This choice has the advantage of detecting and dealing with any other LBAs/sectors that may cause issues. Meaning: right now you only know of one, but there may be others (on other areas of the drive you haven't used yet).
On the downside, zeroing the entire drive takes a while.
I tend to recommend this method because it's easiest and can also reveal other sectors that may have issues.
I also tend to recommend that after zeroing, you issue an Error Scan (if using HD Tune Pro) of every LBA on the disk (i.e. un-check the Quick checkbox). This takes a while too, but ensures that every LBA is readable before you put the drive back into the array.
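To make the idea of a full (non-Quick) scan concrete, here is a minimal sketch in Python of what "read every LBA and note the ones that fail" amounts to. It is not how HD Tune Pro implements its Error Scan; `scan_lbas` and the `read_lba` callback are hypothetical names invented for illustration, and the sketch assumes a failed read surfaces as an `OSError`.

```python
def scan_lbas(read_lba, total_sectors):
    """Try to read every LBA in [0, total_sectors) and collect the failures.

    read_lba is a hypothetical callback: it takes an LBA number, returns the
    sector's bytes, and is assumed to raise OSError when the read fails.
    """
    unreadable = []
    for lba in range(total_sectors):
        try:
            read_lba(lba)
        except OSError:
            # Keep going: the point of a full scan is to find every bad LBA,
            # not just the first one.
            unreadable.append(lba)
    return unreadable
```

An empty result is what you want to see before the drive goes back into the array. A real scanner would read in large multi-sector chunks for speed and only fall back to single sectors around a failure.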
2. Issue a write to the individual LBA that the drive has issues with (LBA 10447767). The drive will re-analyse the individual sector and either remap the LBA to a spare or decide the sector is fine and keep the existing mapping.
This has the advantage of being very quick to do (a single write takes milliseconds), and does not require you to back up any data from the drive to begin with (the latter doesn't apply in your case since it's used for RAID).
On the downside, doing this is tricky and requires familiarity with tools such as dd (I don't trust any other utility) and exactly what arguments to use (messing these up or omitting one can result in the entire drive being zeroed). You also have to read from that individual LBA first -- why? Because I have seen cases where the drive firmware says LBA X while the OS insists LBA X is perfectly fine and it's LBA X+1 which has the issue (don't ask; this is not an off-by-one mistake, this is just downright something bizarre that I've seen reported here).
In general, on RAID arrays where the filesystem does not do checksumming (e.g. NTFS, FAT, ext2, ext3, ext4), I do not recommend this method unless after doing so you immediately tell the RAID management software to nuke the metadata on the disk and rebuild the array entirely with that drive (i.e. treat the now-repaired drive as a new disk). Failure to do this can/will result in one of your files, when read, returning 512 bytes of zeros where there was previously data. Which file is affected is also unknown. There's nothing you can do about this situation, sadly (think about the situation if it was a standalone, non-RAID disk).
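For anyone unsure what "a write to an individual LBA" means in terms of offsets, here is a minimal sketch in Python. It assumes 512-byte logical sectors (consistent with the 512 bytes of zeros mentioned above); the helper names are hypothetical and this is not a substitute for dd. It also only illustrates the read-first-then-write order on an ordinary file or image: raw device access on a real system has permission, buffering and alignment requirements that differ by OS and are not handled here.

```python
import os

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def lba_to_offset(lba, sector_size=SECTOR_SIZE):
    """Byte offset at which the given LBA starts."""
    return lba * sector_size


def read_sector(path, lba, sector_size=SECTOR_SIZE):
    """Read exactly one sector; do this first to confirm which LBA fails."""
    with open(path, "rb") as f:
        f.seek(lba_to_offset(lba, sector_size))
        return f.read(sector_size)


def zero_sector(path, lba, sector_size=SECTOR_SIZE):
    """Overwrite exactly one sector with zeros and leave everything else alone."""
    with open(path, "r+b") as f:  # r+b: open for update without truncating
        f.seek(lba_to_offset(lba, sector_size))
        f.write(b"\x00" * sector_size)
        f.flush()
        os.fsync(f.fileno())  # push the write out of the OS cache


# LBA 10447767 starts at byte 10447767 * 512 = 5349256704
```

The write is bounded on both sides: the seek fixes where it starts and the single-sector buffer fixes where it stops. Getting either of those wrong is exactly the kind of mistake that zeros far more than one sector.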
3. RMA the drive (preferably an Advanced RMA, since it ensures you get a replacement drive first, which you can test fully before sending the other drive back).
This has the advantage of being the simplest choice and usually the least painful, i.e. box the drive up and ship it off.
On the downside, Advanced RMA requires that you have a credit card handy (in case they don't receive the bad drive you get charged for the new one, at a significantly increased price), that you have proper shipping materials (anti-static peanuts/foam, anti-static bags, sturdy box, etc.) for the bad drive, and that it takes about a week to get the replacement drive. The other downside is that if you do this over the phone (please try to avoid that) you have to "prove" to the person you speak to that the drive is bad. They also ask you the question "is this drive in a RAID array?" to which you should answer NO. I've ranted about this sneaky/tricky question in a DSLR/BBR post in the past; I can dig it up if you want. Just answer no and move on. Their website, AFAIK, does not ask this question. For the RMA reason, just say "bad sectors".
said by rockisland: Then what to do with the drive with 30 Ultra DMA CRC Errors in 94 hours of use? That seems like too much especially when compared to the other drives with many times the hours of use. That one may actually be under warranty because it was a replacement last year.
I already answered this. Quote:
said by koitsu: ... If you really did replace the drive 92 hours ago, I recommend waiting until the next array degradation event happens and then see if the CRC error count [has] increased. ...
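Comparing two SMART snapshots (before/after an erase, or before/after the next degradation event) boils down to diffing raw values. Here is a minimal sketch in Python, assuming you have copied the raw values into dictionaries keyed by attribute ID by hand. `smart_diff` is a hypothetical helper, and every number in the usage example is a placeholder for illustration except the CRC count of 30 quoted above.

```python
def smart_diff(before, after):
    """Return {attribute_id: (old_raw, new_raw)} for every raw value that changed.

    An attribute present in only one snapshot is reported with None on the
    side where it is missing.
    """
    changed = {}
    for attr_id in sorted(set(before) | set(after)):
        old, new = before.get(attr_id), after.get(attr_id)
        if old != new:
            changed[attr_id] = (old, new)
    return changed


# Attribute IDs: 5 = reallocated sectors, 197 = current pending sectors,
# 198 = offline uncorrectable, 199 = UltraDMA CRC errors.
# Placeholder snapshots, not real readings from this drive:
earlier = {5: 0, 197: 1, 198: 0, 199: 30}
later = {5: 0, 197: 1, 198: 0, 199: 30}
```

With those placeholders, `smart_diff(earlier, later)` returns an empty dict. If attribute 199 shows up in the result after the next event, the CRC count went up during that window.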