vue666
Small block Chevies rule
Premium
join:2007-12-07
Halifax, NS
kudos:1

Ubuntu Disk Utility and SMART info...

The Ubuntu Disk Utility is reporting a SMART warning that I should replace one of my data drives, a WD 500 GB SATA drive.

Reallocated Sectors Count
NORM 143
WORST 143
THRESHOLD 140
VALUE 452 SECTORS

However, when I run the WD Diags utility from the Ultimate Boot CD, I don't get any errors. I ran both the QUICK and EXTENDED tests and both reported ZERO errors...


ReplaceIt

@pnap.net
Replace the disk. I had one replaced under a warranty RMA with as few as 20 reallocated sectors. 452 reallocated sectors is a pretty good indicator the disk may be failing. Use smartctl -a /dev/path/to/your/hdd to get the serial number and other information.

Go here to check your warranty status. I've done warranty swaps and the process was seamless. Some of the older WD drives even have a *lifetime* warranty that is still honored -- »westerndigital.secure.force.com/···?lang=en

When I RMA'd my drive I simply noted "High number of Reallocated_Sector_Ct drive is remapping sectors and is failing/failed"

From »en.wikipedia.org/wiki/S.M.A.R.T.
quote:
Count of reallocated sectors. When the hard drive finds a read/write/verification error, it marks that sector as "reallocated" and transfers data to a special reserved area (spare area). This process is also known as remapping, and reallocated sectors are called "remaps". The raw value normally represents a count of the bad sectors that have been found and remapped. Thus, the higher the attribute value, the more sectors the drive has had to reallocate. This allows a drive with bad sectors to continue operation; however, a drive which has had any reallocations at all is significantly more likely to fail in the near future.[2] While primarily used as a metric of the life expectancy of the drive, this number also affects performance. As the count of reallocated sectors increases, the read/write speed tends to become worse because the drive head is forced to seek to the reserved area whenever a remap is accessed. A workaround which will preserve drive speed at the expense of capacity is to create a disk partition over the region which contains remaps and instruct the operating system to not use that partition.


Bill_MI
Bill In Michigan
Premium,MVM
join:2001-01-03
Royal Oak, MI
kudos:2
reply to vue666
I gained a respect for the utility (Ubuntu 10.04).

Got a warranty replacement for a Hitachi Deskstar 2TB just a few months ago. Ubuntu Disk Utility called it all the way - exactly the same parameter.

But I didn't need a second opinion. I had several distros on it and some wouldn't mount or boot. Glad it was a 2nd drive.

There's also a parameter "Current Pending Sector Count" that had better be zero or, I think, the drive has run out of spare sectors. Mine got to over 1000.


vue666
In my previous dealings with WD, they always asked for the code the WD Diags utility spits out. In this case there is no error, so I suspect WD will say there is no problem with the drive...


Bill_MI
Just a thought. Keep watching that reallocated sector count. Is it going up? Perhaps it WILL eventually hit WD's threshold. They have every reputation-saving reason to flag problems as seldom as possible.

If things are really going bad it may not take long. If it stays frozen you may be ok, too, according to WD, anyway.
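
One way to watch the count over time is to pull the raw value out of smartctl's attribute table on a schedule and compare readings. Here's a minimal Python sketch. The helper name raw_value is made up for illustration, the two sample rows are copied from the smartctl output posted in this thread, and it assumes the raw value is a plain integer in the tenth column (true for attribute 5 on this drive, not necessarily for every attribute or vendor).

```python
def raw_value(smartctl_output, attr_id):
    """Return the RAW_VALUE of one SMART attribute, or None if it is absent.

    Hypothetical helper: assumes the usual 10-column attribute rows
    printed by `smartctl -A` / `smartctl -a`, with an integer raw value.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0] == str(attr_id):
            return int(fields[9])
    return None


# Two rows taken from the output posted in this thread.
sample = """\
  5 Reallocated_Sector_Ct   0x0033   143   143   140    Pre-fail  Always       -       452
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       1
"""

previous = 452                      # reading noted on an earlier day
current = raw_value(sample, 5)      # reading taken now
print("reallocated sectors:", current, "change:", current - previous)
```

Save each reading with a date. A count that keeps climbing is the thing to worry about.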


koitsu
Premium,MVM
join:2002-07-16
Mountain View, CA
kudos:23
reply to vue666
Please stop using the Ubuntu Disk Snake Dildo Butt tool. I hate how this tool actually hides very key information from you. The information you've provided is too vague/ambiguous.

Please install smartmontools version 5.43 or newer (6.0 is much preferred) and run it against your drive (smartctl -a /dev/sda for example), and provide the output here. Please enclose all the output in a [code] block (use brackets, not less-than/greater-than) to retain formatting. You will need to run this as root / via sudo.

I can provide an analysis of your drive after seeing that output.

--
Making life hard for others since 1977.
I speak for myself and not my employer/affiliates of my employer.


vue666
reply to vue666
The repositories I'm using only had ver 5.39 so I installed that version...


vue666
reply to vue666
Here's the output

quote:
kenmo@asrock:~$ sudo smartctl -a /dev/sde
[sudo] password for kenmo:
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, »smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Blue Serial ATA family
Device Model: WDC WD5000AAKS-00TMA0
Serial Number: WD-WCAPW4073536
Firmware Version: 12.01C01
User Capacity: 500,107,862,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sun Oct 21 12:54:51 2012 ADT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (12000) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 150) minutes.
Conveyance self-test routine
recommended polling time: ( 6) minutes.
SCT capabilities: (0x303f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       15
  3 Spin_Up_Time            0x0003   180   172   021    Pre-fail  Always       -       5975
  4 Start_Stop_Count        0x0032   097   097   000    Old_age   Always       -       3943
  5 Reallocated_Sector_Ct   0x0033   143   143   140    Pre-fail  Always       -       452
  7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   079   079   000    Old_age   Always       -       15368
 10 Spin_Retry_Count        0x0012   100   100   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       794
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       253
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       4095
194 Temperature_Celsius     0x0022   112   098   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   134   134   000    Old_age   Always       -       66
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   197   172   051    Old_age   Offline      -       203

SMART Error Log Version: 1
ATA Error Count: 119 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 119 occurred at disk power-on lifetime: 15341 hours (639 days + 5 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 00 8a c0 e0 Error: UNC at LBA = 0x00c08a00 = 12618240

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 da 00 00 8a c0 2b 00 17:06:54.099 READ VERIFY SECTOR(S) EXT
42 da 00 00 89 c0 2b 00 17:06:54.097 READ VERIFY SECTOR(S) EXT
42 da 00 00 88 c0 2b 00 17:06:54.095 READ VERIFY SECTOR(S) EXT
42 da 00 00 87 c0 2b 00 17:06:54.093 READ VERIFY SECTOR(S) EXT
42 da 00 00 86 c0 2b 00 17:06:54.091 READ VERIFY SECTOR(S) EXT

Error 118 occurred at disk power-on lifetime: 15339 hours (639 days + 3 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 00 a4 c0 e0 Error: UNC at LBA = 0x00c0a400 = 12624896

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 da 00 00 a4 c0 2b 00 14:19:19.262 READ VERIFY SECTOR(S) EXT
42 da 00 00 a3 c0 2b 00 14:19:18.087 READ VERIFY SECTOR(S) EXT
42 da 00 00 a2 c0 2b 00 14:19:18.085 READ VERIFY SECTOR(S) EXT
42 da 00 00 a1 c0 2b 00 14:19:18.075 READ VERIFY SECTOR(S) EXT
42 da 00 00 a0 c0 2b 00 14:19:13.660 READ VERIFY SECTOR(S) EXT

Error 117 occurred at disk power-on lifetime: 4997 hours (208 days + 5 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 80 1f 6b 8a e0 Error: UNC 128 sectors at LBA = 0x008a6b1f = 9071391

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 80 1f 6b 8a 01 00 8d+00:25:32.983 READ DMA EXT
25 00 08 0f 6b 8a 01 00 8d+00:25:32.983 READ DMA EXT
35 00 08 37 00 60 00 00 8d+00:25:32.982 WRITE DMA EXT
35 00 08 f7 a9 5f 00 00 8d+00:25:32.982 WRITE DMA EXT
35 00 08 9f 6a a4 0b 00 8d+00:25:32.982 WRITE DMA EXT

Error 116 occurred at disk power-on lifetime: 4997 hours (208 days + 5 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 80 17 6b 8a e0 Error: UNC 128 sectors at LBA = 0x008a6b17 = 9071383

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 80 17 6b 8a 01 00 8d+00:25:30.898 READ DMA EXT
25 00 08 0f 6b 8a 01 00 8d+00:25:30.898 READ DMA EXT
35 00 08 ff a5 60 00 00 8d+00:25:30.898 WRITE DMA EXT
35 00 08 bf 8b 5b 00 00 8d+00:25:30.898 WRITE DMA EXT
25 00 80 0f 6b 8a 01 00 8d+00:25:28.672 READ DMA EXT

Error 115 occurred at disk power-on lifetime: 4997 hours (208 days + 5 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 80 0f 6b 8a e0 Error: UNC 128 sectors at LBA = 0x008a6b0f = 9071375

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 80 0f 6b 8a 01 00 8d+00:25:28.672 READ DMA EXT
25 00 08 07 6b 8a 01 00 8d+00:25:28.672 READ DMA EXT
35 00 08 c7 8b 5b 00 00 8d+00:25:28.671 WRITE DMA EXT
35 00 aa af fb 4c 10 00 8d+00:25:28.671 WRITE DMA EXT
35 00 00 af fa 4c 10 00 8d+00:25:28.670 WRITE DMA EXT

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Conveyance offline Completed without error 00% 15330 -
# 2 Conveyance offline Completed without error 00% 15328 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.




koitsu
reply to vue666
That's nonsense. Ubuntu has got to have something newer. Oh look, it does:

»packages.ubuntu.com/search?keywo···tion=all

Install the package from one of those other versions, preferably the 5.43 version. 5.39 is just abysmally old and may not have your drive in the drive DB.

EDIT: You got lucky:

Device is: In smartctl database [for details use: -P show]
 

I will read the attributes momentarily.


koitsu
reply to vue666
Thank you for not following my instructions (I asked you to insert the data into a [code] block to retain formatting). I will do my best. :P

SMART attributes show the following:

1. Drive has been in use for 15368 hours, or roughly 640 days, so a little under 2 years. Drive is 3.5" form factor.

2. SMART attribute 1 indicates the drive does have some minor issues re-reading certain sectors (note: not "reading", but actually "re-reading" in error conditions).

3. SMART attribute 5 indicates you have a total of 452 LBAs which have been remapped. This is a 512-byte sector drive, so that means you've lost a total of 231424 bytes of data over time. Which data was lost/what files were impacted is impossible to tell without a checksumming filesystem (ZFS, Btrfs, etc.).

4. SMART attribute 196 indicates there have been a total of 66 reallocation events, whether successful or failed. This number is a little hard to describe, since you might expect it to be the same (or greater than) what's in attribute 5. The tricky part about this number is that it actually fluctuates; it's a counter, but it doesn't increment -- it can be any value. It will go back to 0 when the drive has finished dealing with reallocation analysis.

5. SMART attribute 197 indicates you have 1 LBA which is "suspect", thus presently unreadable. Any attempts to read data from that LBA will result in an I/O error. The only way to get the drive to determine if the sector is bad is to initiate a write to the LBA, which will naturally cause it to lose its contents. So consider this another 512 bytes of data loss (total lost bytes at this point is 231936)

6. SMART attribute 200 indicates the drive is beginning to show signs of physical media wear or has physical/magnetic problems with the substrate layer. This affects writing especially. The drive has seen worse conditions in the past (current adjusted value is 197, but sometime during its lifetime has gotten down to 172).

7. Your SMART error log indicates a total of 119 events pertaining to errors (sector reads/writes which failed, or other anomalies). It looks like these have been gradually accumulating over time.
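
For anyone wanting to check the arithmetic in items 1, 3 and 5, here's a minimal Python sketch. The numbers come straight from the posted attribute table. The variable names are just for illustration, and the 512-byte sector size is the assumption stated in item 3.

```python
SECTOR_BYTES = 512        # assumed logical sector size, per item 3

power_on_hours = 15368    # attribute 9, raw value
reallocated = 452         # attribute 5, raw value
pending = 1               # attribute 197, raw value

days = power_on_hours / 24
lost_remapped = reallocated * SECTOR_BYTES
lost_total = (reallocated + pending) * SECTOR_BYTES

print(round(days), lost_remapped, lost_total)
```

This prints 640 231424 231936, matching the figures above.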

Due to the hard disk vendor choosing threshold trip values which are extremely lax, no SMART attributes have tripped, thus the overall SMART health value of the drive is OK/GOOD.
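
To show how close attribute 5 actually is to tripping, here's a rough Python sketch of the comparison. It assumes the rule documented for smartmontools, that an attribute counts as failed when its normalized VALUE is less than or equal to THRESH. The function names tripped and margin are made up for illustration, and the three rows are taken from the posted output.

```python
# (name, normalized VALUE, THRESH) copied from the posted attribute table
attrs = [
    ("Raw_Read_Error_Rate", 200, 51),
    ("Reallocated_Sector_Ct", 143, 140),
    ("Multi_Zone_Error_Rate", 197, 51),
]


def tripped(value, thresh):
    """Assumed rule: an attribute has failed once VALUE <= THRESH."""
    return value <= thresh


def margin(value, thresh):
    """How many normalized points remain before the attribute trips."""
    return value - thresh


for name, value, thresh in attrs:
    print(f"{name}: margin {margin(value, thresh)}, tripped={tripped(value, thresh)}")
```

Under that assumption, Reallocated_Sector_Ct has only 3 normalized points of headroom left, even though the overall health check still reports PASSED.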

I could not give half of a rat's butt what Western Digital's own test tool says about the state of your drive -- that tool is only sometimes useful. It can never, EVER replace an actual human being when it comes to attribute analysis.

Overall advice:

Replace the drive. If it's under warranty still, do an Advanced RMA with WD (this ensures you get back a new drive first, which you can test fully -- let me know if you want me to point you to the procedure used to do this; it's actually much easier in *IX OSes than on Windows). Regarding the replacement: if you submit an Advanced RMA via their website, they will not require you to give them a "WD Diagnostic test code" -- you just simply pick "Bad sectors" as the reason and that's it.

If it's not under warranty, buy a new drive. Please avoid WD Green drives or any WD drive with "-GP" in its name at this time. You're free to purchase any other brand you wish, though I do not recommend Seagate's current models since they tend to also aggressively park their heads (and without any way to track it; you'll just hear it going "thunk" randomly). If choosing Samsung, please be aware of catastrophic SMART-related firmware bugs that can result in data loss (there are firmware patches that fix this).


vue666
Thanks kindly. It's out of warranty according to the serial number and the WDC Warranty page...

I was thinking of picking up these Seagate but I've had a few problems with their external drives and also note your remarks on them...

Again many thanks...


koitsu
You're welcome!


vue666
reply to vue666
I did install smartmon ver 5.39 from your link.

I'm running natty (Zorin OS 5.2 based on Ubuntu 11.04)...


vue666
reply to koitsu
said by koitsu:

Please avoid WD Green drives or any WD drive with "-GP" in its name at this time.

Sorry but what does the -GP indicate? Refurbished Green??

Again many thanks...


koitsu
"GreenPower"