
andyross
Premium,MVM
join:2003-05-04
Schaumburg, IL

1 edit

[hard drive] Samsung drive with bad spot

I have a 2007 Dell XPS 410 with a Samsung HD501LJ drive.

A few days ago, I had a program seem to freeze, and I heard clicking from the drive. After a defrag yesterday, on a reboot, it froze at the black screen between the Win7 logo and the login, with that clicking, for about 2 minutes.

I tried to make a Ghost backup, but it failed due to a read error. I was able to eventually make one after enabling the option to ignore bad sectors.

I then tried to find some diagnostics for the drive. From what I can tell, it should be a program called HUTIL. I found a copy and made a bootable disk, but the FreeDOS seemed to give a few errors and it hung. I then booted an old Win98 boot floppy (yes, the computer has a floppy drive) and tried running HUTIL from a copy I put on another disk. That ran, but couldn't see any drives.

I tried searching, but it looks like Seagate bought the Samsung drive business, and trying to get at info keeps jumping to the wrong area. I tried the SeaTools for Windows, and it shows the drive, but I can't really do anything beyond basic scanning.

I did run a "CHKDSK C: /R" on the drive, and Windows found a bad cluster in wbengine.exe (part of Windows Backup). On the first reboot, it did click a few times, but not as long, showed a 'good CHKDSK' text, then went to the login screen. I replaced wbengine.exe from another Win7 computer (fun with ownership and permissions.) I rebooted again, and it booted up normally. I think the immediate crisis is past, but it does need replacing.

So, #1: Does anyone know of a program that will let me try and do better diagnostics on this drive?

#2: I will probably need to replace the drive. I haven't cloned a drive since the NT/Win2K days, and used Partition Magic back then (just hook up two drives, boot the floppies, and clone.) Anything as simple to use today? Also, will Windows (Win7 32-bit) require re-registering, or will it tolerate small changes like a drive change?

bbear2
Premium
join:2003-10-06
94045
kudos:5
said by andyross:

...
A few days ago, I had a program seem to freeze, and I heard clicking from the drive. After a defrag yesterday, on a reboot, it froze at the black screen between the Win7 logo and the login, with that clicking, for about 2 minutes.

...

So, #1: Does anyone know of a program that will let me try and do better diagnostics on this drive?

#2: I will probably need to replace the drive. I haven't cloned a drive since the NT/Win2K days, and used Partition Magic back then (just hook up two drives, boot the floppies, and clone.) Anything as simple to use today? Also, will Windows (Win7 32-bit) require re-registering, or will it tolerate small changes like a drive change?

#1 Why do you want a program that does diagnostics? The clicking noise you heard WAS your diagnosis that the drive is going bad; read: that drive is on borrowed time.

#2 Change probably to definitely. Depending on which drive you replace it with, often drive manufacturers supply a drive cloning tool.

If it were me and that was my drive, here's my 1,2, 3...

1. Refrain from any excess drive activity (e.g. defrags). If any such tasks are scheduled to run automatically, disable them, and don't use the computer except for getting your data off.

2. Back up any data you care about; this drive is on its last legs and you have no idea how much time you have left.

3. Buy a replacement drive and try the drive manufacturer's cloning application.


koitsu
Premium,MVM
join:2002-07-16
Mountain View, CA
kudos:23
reply to andyross
I'm trying to figure out what it is you actually want to accomplish with this drive exactly.

Are you trying to do data recovery from it (i.e. copy off as much as you can to a working/replacement drive)?

I know you're asking for "a program that does better diagnostics", but what exactly are you wanting to diagnose? Or are you wanting to see if there's an actual problem with the drive (vs. the problem being elsewhere)? smartmontools / GSmartControl (if you want a GUI) is by far the best thing for this task, but it requires the user to understand/interpret the results properly (most people don't).

If there are suspect LBAs (which are unreadable) on the drive, then this means you've already lost data -- nothing can be done about that. The fact that you ran Ghost and it reported at least 1 read error indicates this is the case.

I would not do a "raw copy" ("disk clone", etc.) of this drive onto a new one and then expect everything to run happily, because there may be files which utilise the unreadable LBAs, and skipping read errors means the file will then contain zeros where there used to be data. The end result here can be programs randomly crashing, a kernel that doesn't work, or giant bees flying out of your CD drive -- okay, the last is a bit of a stretch, but my point is that the effects are literally infinite. It's not easy to figure out what LBA is in use by a file on the filesystem (it should be, but it isn't).

In these situations, when the end-user is not familiar with data recovery, I always suggest copying off files at the filesystem level (just a standard file copy) instead, as you'll then know what files are impacted (if unreadable). You can then restore those files from backups, or from other source media if you have it. This requires the person know what files to copy in the first place -- My Documents or the \Users\{username} directory is a good start.
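A minimal sketch of that file-level approach in Python, assuming a plain recursive copy is good enough for your purposes. The function name `copy_tree_logging_failures` and the example paths are hypothetical, not from any particular tool; the point is only that a file-by-file copy tells you which files could not be read, so you know what to restore from backups.

```python
import os
import shutil


def copy_tree_logging_failures(src_root, dst_root):
    """Copy every file under src_root into dst_root.

    Returns a list of (source_path, error_message) for files that
    could not be copied, e.g. because of a read error on the drive.
    """
    failed = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            try:
                shutil.copy2(src, dst)  # copies contents plus timestamps
            except OSError as exc:
                # Keep going; record the file so it can be restored later.
                failed.append((src, str(exc)))
    return failed


# Example call (hypothetical paths, old drive mounted as D:):
# for path, err in copy_tree_logging_failures(r"D:\Users\username", r"C:\Recovered"):
#     print("could not copy:", path, "->", err)
```

Anything that ends up in the returned list is a candidate for restoring from an earlier backup.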

So basically: get a replacement drive first, install Windows 7 on it (fresh install) as if it were a brand new system, then from the wonky drive copy files over to the new.

Final tip: when a drive begins to behave this way, you don't then go and do backups -- by then it's too late. The entire point of a backup is that when this situation happens, you have something that's known to be good which you can restore from.
--
Making life hard for others since 1977.
I speak for myself and not my employer/affiliates of my employer.


andyross
Premium,MVM
join:2003-05-04
Schaumburg, IL
I do have earlier backups, but wanted to get one that was the latest, just in case. In addition to the Ghost image backup, I also use SyncToy to copy the user directories and other files more often.

So far, the CHKDSK /R only showed the one file bad, which I already replaced. No other issues so far.

I have bought a new drive (Seagate Barracuda) and will see about moving everything to it when I get time. I'll probably try a clone first, as I'd rather not start from scratch if I don't have to.

As far as the diagnostics, I've heard that on some drives, it can help reformat or force the drive to 'redirect' the bad areas to a spare area. I've done that in the past, and the drives have gone on to long lives. It's just that I haven't found the software that works properly with the Samsung yet.


lugnut

@communications.com
reply to andyross
I've successfully used this program, HDD Regenerator, to bring back MANY drives from the dead. It's well worth the money. It fixes errors that Spinrite barfs on.

www.dposoft.net/

At the very least, it should work well enough for you to pull your files off of the drive before you send it back for a replacement.


dbarber

join:2000-07-25
West Chester, PA
said by lugnut :

I've successfully used this program, HDD Regenerator, to bring back MANY drives from the dead.

+1
I've used it to do data salvage on a number of occasions. It hasn't always regenerated ALL of the bad sectors, but it has allowed me to copy off most of the data. After salvage, I have NEVER trusted the drives again. I dispose of them.


koitsu
Premium,MVM
join:2002-07-16
Mountain View, CA
kudos:23

1 recommendation

reply to andyross
said by andyross:

So far, the CHKDSK /R only showed the one file bad, which I already replaced. No other issues so far.

You (like most people -- it's not your fault) misunderstand how filesystems work. CHKDSK is looking at the allocation table for problems, i.e. "if the unreadable LBA is located within the MFT or is used by some allocation tables that define information about a file". It does not and cannot address or repair problems if the LBA is used for the contents of a file. The only type of filesystem that can detect (and/or repair) this is a checksumming filesystem like ZFS or Btrfs.

So the short version is: depending on what all has happened to your disk, you may in fact have files that contain unreadable data.

said by andyross:

I have bought a new drive (Seagate Barracuda) and will see about moving everything to it when I get time. I'll probably try a clone first, as I'd rather not start from scratch if I don't have to.

See above -- doing a disk-to-disk clone is not going to tell you what files are impacted if an LBA cannot be read. All you're going to get from Ghost is "I can't read sector/LBA 12345". This doesn't help you determine what files need to be restored from backups.

Also, just an FYI: be aware that many of Seagate's present-day drives will exhibit clicking during normal operation. This is caused by excessive head parking (what has been coined on the Internet as the "LCC issue"). So going forward, depending on the model of drive you got, if you hear clicking it doesn't necessarily mean something is wrong. Just something you need to be aware of about most of Seagate's drives at this point in time.

said by andyross:

As far as the diagnostics, I've heard that on some drives, it can help reformat or force the drive to 'redirect' the bad areas to a spare area. I've done that in the past, and the drives have gone on to long lives. It's just that I haven't found the software that works properly with the Samsung yet.

You don't understand how LBA remapping works. It's okay -- again, most people don't. You don't need special software to accomplish this task -- it can be done with almost literally any kind of software that issues writes to a disk (no joke, really).

But none of this allows you to get data back in any way -- anything that is unreadable is lost. Period. NOTHING can recover that data. LBA remapping will just cause LBA 12345 to no longer have a 1:1 correlation with sector 12345; it instead maps LBA 12345 to sector 834904329834. But to induce remapping, you have to issue a write to that LBA, which means you have to know what LBAs are unreadable, and more importantly, what files utilise that LBA. Otherwise, if you "don't care", and you issue a write to the unreadable LBA, you effectively zero out 512 bytes (or 4096 for 4096-byte sector drives) of data within that file (if that LBA is used for file contents, rather than the MFT or file metadata tables). CHKDSK isn't going to detect the former situation. CHKDSK /B (available on Vista onwards only), however, might -- but I make no guarantees.

People seem to think that by magically issuing a write to an unreadable LBA, thus inducing remaps, "magically lets them get all their data off the drive". What they do not understand is that any file utilising that LBA can/will contain zeros once the write is issued. So yes, now you'll have a drive "you can copy without any errors", but now you've got random files that contain zeros when previously they contained data (that became unreadable). This is why knowing what files to restore from backups is important.
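To make the "file now contains zeros" point concrete, here is a toy Python sketch. It assumes, purely for illustration, that a file is stored contiguously starting at a known LBA (real filesystems fragment files, so this is a simplification), and the helper name `zero_sector_in_file` is made up.

```python
SECTOR_SIZE = 512  # bytes per sector; 4096 on 4096-byte sector drives


def zero_sector_in_file(file_bytes, first_lba, bad_lba, sector_size=SECTOR_SIZE):
    """Return the file contents after the sector at bad_lba is overwritten with zeros.

    Assumption (hypothetical): the file occupies consecutive LBAs
    beginning at first_lba.
    """
    start = (bad_lba - first_lba) * sector_size
    if start < 0 or start >= len(file_bytes):
        return file_bytes  # the bad LBA is not part of this file
    end = min(start + sector_size, len(file_bytes))
    return file_bytes[:start] + bytes(end - start) + file_bytes[end:]


original = b"A" * 1500  # toy 1500-byte file spanning three sectors
damaged = zero_sector_in_file(original, first_lba=12344, bad_lba=12345)

# The file is the same length and copies without errors,
# but bytes 512..1023 are now zeros instead of data.
print(len(damaged), damaged[510:514])
```

The copy succeeds either way; only a comparison against a known-good backup reveals that the middle of the file is gone.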

If you want to talk about this in further detail, I can do so, but it is a serious time sink for me. I'd recommend looking at my profile here + looking through the past 2-3 years of posts of mine where I step people through how to do this and explain the caveats, how it all works, etc...

Finally:

DO NOT RUN ANY SOFTWARE THAT MAKES MODIFICATIONS TO THE DRIVE (LIKE HDD REGENERATOR OR ANYTHING ELSE). This can actually make your situation worse, depending on what the situation is. There isn't enough information for me to go off of at this point / to condone use of such software. You could start by using smartmontools or GSmartControl and providing me output from smartctl -a C: (assuming drive is C:), or alternately install HD Tune Pro (trial version please; do not use the free version) and provide me a screenshot of the Health tab. I strongly prefer smartctl output, as this gives me a lot of other insights to other features of SMART that HD Tune Pro does not support.


47717768
Premium
join:2003-12-08
Birmingham, AL
kudos:2
reply to andyross
I agree with koitsu. Check S.M.A.R.T first.


andyross
Premium,MVM
join:2003-05-04
Schaumburg, IL
reply to koitsu
Again, there really isn't any lost data, other than the one damaged file (CHKDSK /R listed the full name and path of the damaged file it found the bad cluster in), which was replaced. Windows has marked the cluster containing the bad sector(s) as bad, so it won't use it again. From what I could tell, CHKDSK /R does scan the full drive (it took nearly 3 hours for this 500G drive).


andyross
Premium,MVM
join:2003-05-04
Schaumburg, IL
reply to andyross
Attachment: smartctl.txt (5,350 bytes)
Here is the smartctl output.

I do see the issues involving read errors. That said, if the issue is spreading, would those numbers continue to increase? If they stay the same, would that mean it's stable for the short term?


koitsu
Premium,MVM
join:2002-07-16
Mountain View, CA
kudos:23

1 edit
said by andyross:

Here is the smartctl output.

I do see the issues involving read errors. That said, if the issue is spreading, would those numbers continue to increase? If they stay the same, would that mean it's stable for the short term?

Thanks -- this greatly helps shed some light on what the condition is (it's actually quite good). My familiarity with Samsung model drives is somewhat limited (I'm familiar with WD, Seagate, Fujitsu, Intel (SSDs), and then Samsung -- in that particular order), but I can understand the data. Here are the key pieces. Keep in mind all of these represent the state of the drive during its entire lifetime (5575 hours):


Attribute 1 indicates that there is likely an area on the platter(s) which is not in good condition -- meaning the drive has to re-attempt reads when reading certain areas. This is done transparently (meaning the OS/controller has no idea it's being done), but it happens. This number is quite small/low, and that's good.

Attribute 13 indicates the number of times a read submitted by the OS/controller resulted in data that could not be fixed/repaired using per-sector ECC. How this manifests itself is unknown (Samsung doesn't describe it), which means it could a) return partial results (meaning lots of bits would be wrong and you'd have no idea), or b) return an I/O error. However, you'll notice the counter matches that of attribute 195, and that's important here.

Attribute 195 indicates the number of times a read submitted by the OS/controller resulted in data that was repaired using per-sector ECC. Since this number matches that of attribute 1, it's safe to say that there is an area of your disk which is in not-so-great condition, but the drive is able to recover from that situation.

Note for readers (because this will come up I am certain): this model of Samsung disk treats 13 and 195 as raw counters, and the data is NOT vendor-encoded. This differs from Seagate disks, where the data is vendor-encoded and thus appears "preposterously large" and changing at all times; only Seagate knows the format of this data, although I think smartmontools has some knowledge of how to decode some of the fields (varies per drive model). smartmontools is the only utility to be able to do this, to my knowledge -- one of the MANY reasons I prefer it over other "SMART monitoring" tools.


Keeping this one simple: here we can see there have been no LBA remaps. This doesn't mean there haven't been unreadable/suspect LBAs in the past -- it just means that the drive has never had to do a remap. The latter attribute indicates the drive has never had to return a failure status for an attempted remap (often happens when the drive runs out of spare sectors, but varies per drive model).

This is a key piece of information, and was one of the things I was wanting to see quite badly given your description of the situation.


And here we have another useful piece of information: your drive currently has a total of 1 LBA which is "suspect" (unreadable). The drive has tried to read this single LBA in the past, and had difficulty doing so (returning an I/O error to the OS/controller). As such, the LBA has been marked "suspect". Data at the sector pointed to by that LBA has been lost (so, 512 bytes of data).

The only way to make a drive analyse an LBA to see if the sector it points to is actually usable (good) or not is to issue a write to it. This causes the drive to perform a whole series of operations that it goes through to see if it can write data to the sector pointed to by the LBA.

If it can, the LBA is once again marked readable (and the SMART attribute would begin showing 0 instead of 1). If it cannot, the drive will perform a remap operation, pointing the LBA in question to a spare sector (which there should be some given the health of your drive), so that the LBA can again be used (but will obviously not have a 1:1 correlation between that LBA number and that sector number).
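The behaviour described in the last two paragraphs can be modelled with a small Python toy. This is a sketch under loud assumptions: the `ToyDrive` class, its fields and the sector numbers are all invented for illustration, and real firmware is far more involved than this.

```python
class ToyDrive:
    """Toy model of suspect-LBA handling; not how any real firmware is written."""

    def __init__(self, sectors, spares, unreadable=(), defective=()):
        self.mapping = {lba: lba for lba in range(sectors)}  # starts out 1:1
        self.spares = list(range(sectors, sectors + spares))  # spare physical sectors
        self.unreadable = set(unreadable)  # sectors whose current data cannot be read
        self.defective = set(defective)    # sectors that also fail when written
        self.suspect = set()               # LBAs flagged after a failed read
        self.remaps = 0                    # number of remap operations performed

    def read(self, lba):
        if self.mapping[lba] in self.unreadable:
            self.suspect.add(lba)          # drive marks the LBA as suspect
            raise IOError(f"cannot read LBA {lba}")
        return "data"

    def write(self, lba):
        sector = self.mapping[lba]
        if sector in self.defective:
            if not self.spares:
                raise IOError("no spare sectors left")
            # Write failed verification: point the LBA at a spare sector.
            self.mapping[lba] = self.spares.pop(0)
            self.remaps += 1
        else:
            # Sector accepted the write: it is readable again.
            self.unreadable.discard(sector)
        self.suspect.discard(lba)          # either way the LBA is usable again


drive = ToyDrive(sectors=100, spares=4, unreadable={42, 77}, defective={77})
for lba in (42, 77):
    try:
        drive.read(lba)
    except IOError:
        pass                               # both reads fail; both LBAs become suspect

drive.write(42)  # sector is fine after all: suspect flag cleared, no remap
drive.write(77)  # sector is truly bad: LBA 77 now points at spare sector 100
```

Note that in both cases the old contents are gone; the write only makes the LBA usable again, it does not bring the data back.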

Hopefully my description in the last 2 paragraphs above gives you an idea of what some programs do to try and bring a "bad drive" back into a "healthy state".

Remember: all of this is being done purely within the disk itself. The disk has no knowledge of filesystems or files or any other data.

This is where crazy/insane/snake oil programs like SpinRite make their claim. Some of these programs have filesystem knowledge *in addition* to the above understanding of how suspect LBAs and remapped LBAs work. They do stupid crap like attempt to re-read the LBA a zillion times, or LBAs before or after it, "hoping" to get some "good data" or cause the drive to -- by total chance -- actually get one good final read. SpinRite is particularly snake oil due to how it behaves in this regard, and I do not want to get into a discussion about it.

HDD Regenerator is one I haven't used, but chances are it operates under similar pretences. I really don't know/care, to be honest, and that's because I know what the drive is doing under the hood.

So as I said: if you wanted your drive to show 0 for the number of suspect/unreadable LBAs, you could find the LBA which is unreadable (there are many ways to do this), then issue a write to that LBA (again, many ways to do this), then check SMART statistics afterwards.

To determine the unreadable LBA, you can either have the drive itself do it (via a SMART selective scan, which your drive does support), or you can do it using an application in the OS (like HD Tune Pro's Error Scan feature; make sure you uncheck Quick Scan, else the software won't try to read every LBA). I prefer using a combination of the two methods, because in rare cases I've found that the LBA number the SMART selective test returns is different from the one HD Tune Pro reports. Let me know which method you want to use. The SMART selective scan, by the way, is a non-modifying operation and can be done while the drive is in use.
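For the OS-side approach, the idea is simply "try to read every sector and note which ones fail". A read-only Python sketch follows; it is not how HD Tune Pro or a SMART selective scan actually works internally, and the `FlakyImage` class is a hypothetical stand-in for a disk so the example can run without touching real hardware.

```python
import io

SECTOR_SIZE = 512


def find_unreadable_lbas(device, total_sectors, sector_size=SECTOR_SIZE):
    """Read every sector once and return the LBAs whose read raised an error."""
    bad = []
    for lba in range(total_sectors):
        try:
            device.seek(lba * sector_size)
            device.read(sector_size)
        except OSError:
            bad.append(lba)  # note it and carry on with the next sector
    return bad


class FlakyImage(io.BytesIO):
    """Hypothetical fake disk: reads starting in chosen sectors raise OSError."""

    def __init__(self, data, bad_lbas):
        super().__init__(data)
        self.bad_lbas = set(bad_lbas)

    def read(self, size=-1):
        if self.tell() // SECTOR_SIZE in self.bad_lbas:
            raise OSError("simulated unrecoverable read error")
        return super().read(size)


image = FlakyImage(bytes(SECTOR_SIZE * 8), bad_lbas={3})
print(find_unreadable_lbas(image, total_sectors=8))
```

On a real drive you would need raw device access with administrator rights, reading one sector at a time is slow, and OS caching can hide or shift errors, so treat this purely as an illustration of the principle.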

The easiest way to issue a write to the LBA (to induce a remap or clear it from the suspect list), without getting into using utilities like dd for Windows, is to use HD Tune Pro's Erase tab to zero every single LBA on the drive. (I really wish they'd add a Start/End field!) This takes a long time, but ensures the entire drive is zeroed, and every single LBA is written to. You'd take a snapshot of the SMART statistics before you did the zeroing, followed by another snapshot after the zeroing, and compare the results.

If you want to know how to go about doing this with dd, I can step you through it, but I warn you: a single typo can result in lost data (irreversible). I'm familiar with this process/do it regularly so I'm used to it, but for some folks this is too much/risk is too high.

So anyway, overall your drive is in "okay" shape given that it's been used for a total of ~232 days. There may be a section of the platter(s) that it has difficulty reading from time to time, causing intermittent unreadable LBAs. That would bug me too, so yes, I would suggest replacing the drive.

This brings me to my final point, re: using Ghost or some other disk cloning utility. "Given the state of my drive, would you recommend using such?" Yes, as long as you have a way to ignore/skip read errors, and have already run CHKDSK /R (which you have -- and you found one file which was corrupted), you should end up with a fully 100% usable system (assuming you can restore that corrupted file from backups), with no other repercussions.

So yes, given the health of that disk + what you did with CHKDSK, using a disk cloning program would be fine. I would suggest that you take a snapshot of the SMART attributes after the cloning, however, just to see if things look the same or are worse. (For example you might find attribute 197 begins showing "2" instead of "1", which means during the clone the drive was unable to read some other new/different LBA, so now we're back to square one, rinse lather repeat...)
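Comparing two SMART snapshots by eye is error-prone, so here is a rough Python sketch that diffs the RAW_VALUE column of two saved `smartctl -A` style attribute tables. The sample text and its numbers are made up for illustration (they are not from the attached output), and the parser assumes the usual ten-column attribute table layout.

```python
def parse_raw_values(smartctl_text):
    """Map attribute ID -> (name, raw value string) from an attribute table."""
    raw = {}
    for line in smartctl_text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) >= 10 and parts[0].isdigit():
            raw[int(parts[0])] = (parts[1], " ".join(parts[9:]))
    return raw


def diff_snapshots(before_text, after_text):
    """Return {id: (name, old_raw, new_raw)} for attributes whose raw value changed."""
    before = parse_raw_values(before_text)
    after = parse_raw_values(after_text)
    changes = {}
    for attr_id, (name, new_raw) in after.items():
        old_raw = before.get(attr_id, (name, None))[1]
        if old_raw != new_raw:
            changes[attr_id] = (name, old_raw, new_raw)
    return changes


# Hypothetical snapshots: only attribute 197 differs.
BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
"""
AFTER = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
"""

for attr_id, (name, old, new) in diff_snapshots(BEFORE, AFTER).items():
    print(f"{attr_id} {name}: {old} -> {new}")
```

In practice you would save the tool's output to a text file before and after the clone and feed both files to `diff_snapshots`.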

Welcome to the beginnings of drive/data recovery. :-)

I DO NOT recommend using SpinRite or HDD Regenerator or even native Samsung (now Seagate) utilities on the drive. There's absolutely no point -- for the former 2, all this will do is potentially damage/worsen the condition of files on your filesystem ("black box" software with no real explanation of what it's doing at a low level, just "black magic" -- sorry, I do not want it!), and the latter won't induce remappings anyway (I do not remember SeaTools ever inducing remaps via LBA writes).


andyross
Premium,MVM
join:2003-05-04
Schaumburg, IL
Thanks for the information. I was reading the 'value' entries and not the 'raw_value' and thought that some of them seemed high.

I do worry about the 'pre-fail' entries, though. One of them is 'spin up time'. Does that mean it's having problems starting (although I've never noticed an issue), or is that just the number of times it's been powered up?

That is one reason why I was looking for manufacturer diagnostics. Years ago, I thought I had read that some drives would remap only when special diagnostics are used, like what I used on an old WD drive a number of years ago.

The drive itself is nearly 6 years old (Sept 2007), but I don't leave the computer on 24 hours.

Since Windows has marked the cluster as bad, it won't write to it anymore, so I'm apparently safe for now. I may still clone over to the new drive, but could eventually re-purpose the old one as a second drive. I've been occasionally playing with VM's, and it would make a nice place to put those large files, and not really worry if something goes wrong. (I was going to play around with Win8.1 Preview when all this started happening.)


koitsu
Premium,MVM
join:2002-07-16
Mountain View, CA
kudos:23
said by andyross:

Thanks for the information. I was reading the 'value' entries and not the 'raw_value' and thought that some of them seemed high.

I do worry about the 'pre-fail' entries, though. One of them is 'spin up time'. Does that mean it's having problems starting (although I've never noticed an issue), or is that just the number of times it's been powered up?

You're misunderstanding what the TYPE field represents (don't feel bad, lots of people do). The TYPE field represents how to read/interpret VALUE vs. THRESH. It is not an indicator of "a pre-failure condition" or other such things. Per the smartmontools docs:
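In rough terms (a sketch of that convention as I understand it, not the documentation's own wording): an attribute is considered failed once its normalized VALUE drops to or below THRESH, and TYPE only says what kind of failure that would be. The function name and return strings below are made up for illustration.

```python
def attribute_status(value, thresh, attr_type):
    """Classify a SMART attribute from its normalized VALUE, THRESH and TYPE."""
    if value > thresh:
        # Above threshold: healthy, no matter whether TYPE says Pre-fail or Old_age.
        return "ok"
    # VALUE has dropped to or below THRESH; TYPE tells us how to read that.
    if attr_type == "Pre-fail":
        return "failing: drive failure predicted"
    return "failing: worn out / end of expected life"


# A 'Pre-fail' attribute sitting well above its threshold is fine.
print(attribute_status(100, 51, "Pre-fail"))   # ok
print(attribute_status(40, 51, "Pre-fail"))    # failing: drive failure predicted
print(attribute_status(1, 10, "Old_age"))      # failing: worn out / end of expected life
```

So a 'Pre-fail' label on Spin_Up_Time does not by itself mean anything is wrong; it only matters if VALUE ever falls to THRESH.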


said by andyross:

That is one reason why I was looking for manufacturer diagnostics. Years ago, I thought I had read that some drives would remap only when special diagnostics are used, like what I used on an old WD drive a number of years ago.

I haven't seen hard disks behave like that since the 80s and 90s, specifically classic MFM and RLL drives where the defect list was user-maintained. (I'll add that SCSI, even today, supports both physical and grown defect lists -- ATA hides all of this entirely from the user, with no way to get access to it).

said by andyross:

Since Windows has marked the cluster as bad, it won't write to it anymore, so I'm apparently safe for now. I may still clone over to the new drive, but could eventually re-purpose the old one as a second drive. I've been occasionally playing with VM's, and it would make a nice place to put those large files, and not really worry if something goes wrong. (I was going to play around with Win8.1 Preview when all this started happening.)

Understood.