 Maxo Your tax dollars at work. Premium,VIP join:2002-11-04 Tallahassee, FL | Data recovery help
I do the IT work at a local food cooperative. Our server is a Pandaboard running Ubuntu Server 12.04 that had been rocking until last week. I tried to run updates and got some errors. Some Googling revealed the errors to be related to filesystem problems. I shut the server down and put the SD card in my laptop. fsck came up with a bunch of errors, but didn't fix anything. After trying a bunch of stuff I put the card back in the Pandaboard and booted it up. While the server came up, it wouldn't do much of anything and was very, very slow. Apache, MySQL, etc. would not start and the console was constantly throwing I/O errors.
At first I wasn't too worried. I back up the database daily and the code is all on GitHub. However, I have discovered that my database backup file is a cool 0 kB in size. My guess is that the last time it tried to back up it couldn't pull any data, and then piped the empty result into my backup file. I do have a file from last October, but I really need to recover the data if possible.
When I put the SD card in my laptop I can browse all the directories, but many files are not readable. I can open the files in /usr/share/applications, for example, but not in /var/lib/mysql.
5.5.28root@HP:/media/3fe54ae1-490a-4bcc-a6fe-a546bb9bdc4f/var/lib/mysql# cat ib_logfile1
cat: ib_logfile1: Input/output error
Here's what fsck gives me.
➜ ~ sudo fsck -y /dev/mmcblk0p2
fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
/dev/mmcblk0p2: recovering journal
Error writing block 2359734 (Attempt to write block to filesystem resulted in short write). Ignore error? yes
Error writing block 2359734 (Attempt to write block to filesystem resulted in short write). Ignore error? yes
Error writing block 2359734 (Attempt to write block to filesystem resulted in short write). Ignore error? yes
Error writing block 2359734 (Attempt to write block to filesystem resulted in short write). Ignore error? yes
Error writing block 2359734 (Attempt to write block to filesystem resulted in short write). Ignore error? yes
...
This goes on and on
...
fsck.ext4: unable to set superblock flags on /dev/mmcblk0p2
/dev/mmcblk0p2: ********** WARNING: Filesystem still has errors **********
-- "Padre, nobody said war was fun now bowl!" - Sherman T Potter
»maxolasersquad.com/
»maxolasersquad.blogspot.com
»www.facebook.com/maxolasersquad |
|
 leibold Premium,MVM join:2002-07-09 Sunnyvale, CA kudos:6 Reviews:
·SONIC.NET
| Please don't take this personally, but you have already made several big mistakes:
1.) When you receive filesystem-related errors, the first priority is to stop writing to that filesystem. Some data that is still in the buffer pool may yet be readable, so before shutting down, attempt to save truly critical files (obviously not onto the damaged filesystem or any other filesystem on the same physical media; if possible use a network connection to copy those files to another system altogether).
2.) After shutting down a damaged disk (or in this case SD card), prevent any writing until all data that can be saved has been saved. Always use read-only mode when mounting media from which you want to recover data, and beware of the unintentional writing that happens with some journalled filesystems even when mounting read-only! Otherwise you are at the very least putting the data at unnecessary risk, and far more likely increasing the already existing amount of data corruption. It is safer to make a copy of the raw disk partition than to try to mount the damaged filesystem.
3.) I know some books say differently, but I say never run fsck with the -y option unless you have a good copy of the damaged filesystem (dd of the raw partition) and you have determined the cause of the problem and either fixed it or determined that it no longer prevents a repair of the filesystem.
4.) Never ever back up anything over the last known good backup. As you found out, if the backup fails your previously good backup is then gone too. That rule applies even if you haven't had any filesystem errors before starting the backup.
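Point 4 can be sketched in shell. This is only an illustration under assumptions that are not from the thread: dump_database is a hypothetical stand-in for whatever command produces the dump, and the backup directory and file naming are made up.

```shell
#!/bin/sh
# Sketch: never let a failed backup replace the last good one.
# ASSUMPTION: dump_database is a hypothetical stand-in for the real dump
# command; here it only prints one line so the script runs anywhere.
dump_database() {
    echo "-- pretend SQL dump"
}

backup_dir=$(mktemp -d)    # hypothetical location; use real backup storage
stamp=$(date +%Y%m%d-%H%M%S)
tmp="$backup_dir/db-$stamp.sql.partial"
final="$backup_dir/db-$stamp.sql"

# Write to a temporary name first, then promote it only if the dump
# command succeeded AND produced a non-empty file (-s tests size > 0).
if dump_database > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$final"
    echo "backup ok: $final"
else
    rm -f "$tmp"
    echo "backup FAILED, older backups left untouched" >&2
fi
```

Because every run writes a new dated file, an empty or failed dump can at worst cost you that one run, not the previous good copy.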
From the error messages it is clear that your media is damaged and at least block 2359734 can no longer be written to. My first action would be to get an SD card with the same capacity and attempt a raw copy from the defective card to the new card (if you are really lucky all blocks are still readable even though some can no longer be written). If there are unreadable areas on the defective SD card you may have to restart dd with seek and skip options (or experiment with conv=noerror). Once you have a copy of your filesystem on good media you can attempt to repair the filesystem. Replaying the ext4 journal may actually repair some of the damage once it can write data from the journal to the proper filesystem block. Fsck should do the rest to restore integrity to the filesystem, but it will do (almost) nothing to recover lost data (it does recover orphaned files and directories). It cannot fix data that was never written properly to begin with or that was somehow overwritten.
Good luck. -- Got some spare cpu cycles ? Join Team Helix or Team Starfire! |
|
 Maxo | said by leibold: Please don't take this personally, but you have already made several big mistakes:
No offense taken. You are absolutely correct on all points. I did not take the initial errors seriously enough. Also, my backup logic was not solid, which is what really turned this inconvenience into a disaster. I have already tried making an image of the disk:
➜ ~ sudo dd if=/dev/mmcblk0 of=/home/david/sdcard.img
dd: reading `/dev/mmcblk0': Input/output error
4587520+0 records in
4587520+0 records out
2348810240 bytes (2.3 GB) copied, 264.791 s, 8.9 MB/s
Do you think that setting an identical SD card as the of= target would be beneficial? I used the only identical card I had to get the server back up. I was able to get all the code back up, all the data from last October, and some up-to-date data that gets synced to the registers: the membership data and inventory. |
|
 Maxo | I'm dd'ing it with noerror right now to see what happens. |
|
|
|
 pablo MVM join:2003-06-23 kudos:1 | Hi,
See lugnut's comment in this thread:
»system wont boot
It may be the ticket to recovering some of your data.
Cheers, -pablo -- openSUSE 12.2/KDE 4.x ISP: TekSavvy Bonded DSL; backhauled via a 6KM wireless link Assorted goodies: »pablo.blog.blueoakdb.com |
|
 leibold | reply to Maxo, said by Maxo: Do you think that setting an identical SD card as the of= target would be beneficial?
Yes, there is a chance you might get some of the lost data back.
Your dd was done without specifying a block size, which means you used the default of 512 bytes. Larger block sizes would copy faster, but in an error recovery scenario you want the block size to be small, so this is good (you could have used 1kB or whatever block size your filesystem is using). In order to see whether the block where dd stopped is the same block as in the earlier filesystem error messages you need to do some calculations (don't forget that you are doing the dd on the entire block device and the damaged filesystem is on the 2nd partition of that block device). Assuming your ext4 filesystem used a 1kB block size and your first partition on the SD card is about 64MB, then the dd stopped at the same place as the filesystem error you posted earlier. With luck this is the only bad spot on the media.
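The first step of that calculation is turning dd's record count into a byte position on the whole device. A small sketch using only the numbers dd printed earlier in the thread; the partition start is deliberately left out here because it has to be read from the real partition table.

```shell
#!/bin/sh
# Sketch: where on the device did dd stop?
records=4587520      # "records in" reported by dd earlier in this thread
bs=512               # dd's default block size when bs= is not given
bytes=$((records * bs))
kib=$((bytes / 1024))
echo "dd stopped at byte $bytes ($kib KiB) from the start of /dev/mmcblk0"
```

The byte count agrees with the "2348810240 bytes" line in dd's own output, which is a quick sanity check that the 512-byte assumption is right.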
Carefully check the messages produced by dd and the resulting output file. If the dd you are using substitutes zero-filled blocks for input blocks it can't read, that is fine. However, if the output is too short because dd didn't write anything when it encountered a bad input block, the copy is not usable for filesystem recovery. I seem to remember that not all implementations of dd behave identically in the presence of input errors, which is why I prefer to use skip & seek (or iseek & oseek for dd versions that have those options) to eliminate the guesswork. |
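The skip & seek approach can be tried out safely on ordinary files first. In this sketch a temp file stands in for the card and block 3 is only pretended to be unreadable; both are assumptions for illustration, not values from the thread. With GNU dd, conv=noerror,sync is the other common route: it continues after a read error and pads the short block with zeros.

```shell
#!/bin/sh
# Sketch of resuming a copy past an unreadable block with skip= and seek=.
# ASSUMPTION: the source is a temp file standing in for the card, and
# block 3 is only *pretended* to be unreadable.
work=$(mktemp -d)
src="$work/card.bin"
img="$work/card.img"
bs=512
bad=3                              # hypothetical bad block number

# Build an 8-block stand-in "device" filled with non-zero bytes.
dd if=/dev/zero bs=$bs count=8 2>/dev/null | tr '\0' 'A' > "$src"

# First pass: copy everything before the bad block.
dd if="$src" of="$img" bs=$bs count=$bad 2>/dev/null

# Second pass: skip= moves the read position and seek= moves the write
# position by the same number of blocks, so offsets stay aligned.
# conv=notrunc keeps what the first pass already wrote.
next=$((bad + 1))
dd if="$src" of="$img" bs=$bs skip=$next seek=$next conv=notrunc 2>/dev/null

src_size=$(wc -c < "$src" | tr -d ' ')
img_size=$(wc -c < "$img" | tr -d ' ')
# The skipped block is a hole in the image and reads back as zero bytes.
hole_nonzero=$(dd if="$img" bs=$bs skip=$bad count=1 2>/dev/null | tr -d '\0' | wc -c | tr -d ' ')
echo "source=$src_size image=$img_size nonzero bytes in hole=$hole_nonzero"
```

The image ends up the same length as the source, which is the property that matters for later filesystem recovery: every surviving block sits at its original offset.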
|
 Reviews:
·Comcast
| reply to Maxo FWIW, I tried to image a failing Mandriva system hard drive using dd, which aborted upon reaching a bad sector. dd_rescue, reporting about 45 bad sectors, made a complete image that I put on a different drive and that has been working since. An annotated post I made earlier is here: »bjoernvold.com/forum/viewtopic.p···4#p14714 |
|
 leibold | Thanks for the reminder, I completely forgot about dd_rescue! If it is available, it is preferable to plain dd for data rescue operations (it automatically switches between large blocks for speed and small blocks for maximum data recovery as needed). |
|
 Maxo | reply to leibold dd is still running this morning. It looks like this.
2348810240 bytes (2.3 GB) copied, 54020.2 s, 43.5 kB/s
dd: reading `/dev/mmcblk0': Input/output error
4587520+0 records in
4587520+0 records out
2348810240 bytes (2.3 GB) copied, 54021 s, 43.5 kB/s
dd: reading `/dev/mmcblk0': Input/output error
4587520+0 records in
4587520+0 records out
2348810240 bytes (2.3 GB) copied, 54021.8 s, 43.5 kB/s
dd: reading `/dev/mmcblk0': Input/output error
4587520+0 records in
4587520+0 records out
2348810240 bytes (2.3 GB) copied, 54022.5 s, 43.5 kB/s
dd: reading `/dev/mmcblk0': Input/output error
4587520+0 records in
4587520+0 records out
2348810240 bytes (2.3 GB) copied, 54023.3 s, 43.5 kB/s
|
 leibold | How big is the SD card? You can check where dd currently is by sending it a SIGUSR1 signal: kill -USR1 PID# (where PID# is the process ID of the running dd command). If it is stuck at that block and doesn't continue beyond it, you will have to stop dd and determine by trial and error which block is the next readable one (using the skip or iseek option of dd). If you have dd_rescue on your system, use that instead since it will do all that work for you. |
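For anyone who wants to see the SIGUSR1 behaviour without touching a failing card, here is a small sketch. It assumes GNU coreutils dd on Linux, and it uses a harmless /dev/zero to /dev/null copy as a stand-in for the real one. The report is written to dd's own stderr, so it shows up wherever dd is running (here, a log file), not in the shell that sends the signal.

```shell
#!/bin/sh
# Sketch: ask a running GNU dd for a progress report with SIGUSR1.
# ASSUMPTION: GNU coreutils dd on Linux. BSD/macOS dd does not treat USR1
# as a progress request (it uses SIGINFO), and USR1 would terminate it.
log=$(mktemp)

# A harmless long-running dd standing in for the real copy; its stderr
# (where the report is printed) goes to a log file instead of a terminal.
dd if=/dev/zero of=/dev/null bs=1M 2>"$log" &
dd_pid=$!

sleep 1                      # give dd time to install its signal handler
kill -USR1 "$dd_pid"         # dd prints records in/out and bytes copied
sleep 1
kill "$dd_pid"               # stop the stand-in copy
wait "$dd_pid" 2>/dev/null || true

cat "$log"
```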
|
 Maxo | The card's 4GB. It is sitting at 2.3 GB copied. I now have ddrescue installed. kill -USR1 is not giving me anything:
➜ ~ sudo kill -USR1 3091
➜ ~
Do you think killing it and restarting with ddrescue is best? If so, what are the best options to use? Right now I'm just putting the output to a .img file in my home directory. I'm going to order a few more SD cards this weekend from Newegg. |
|
 leibold | said by Maxo: kill -USR1 is not giving me anything
➜ ~ sudo kill -USR1 3091
➜ ~
The output would be in the terminal where dd is running, not in the terminal where you sent the USR1 signal.
said by Maxo: Do you think killing it and restarting with ddrescue is best?
Given that dd is stuck, definitely use dd_rescue.
The option syntax is different from dd, so check dd_rescue -h to get a list of options. Most defaults are fine, but I would recommend saving the list of bad blocks:
dd_rescue -o bad_block_list /dev/mmcblk0 /home/david/sdcard.img
|
 Maxo | Now I'm getting somewhere ... I think.
sudo dd_rescue -o bad_block_list /dev/mmcblk0 /home/david/sdcard.img
dd_rescue: (info) expect to copy 3872256kB from /dev/mmcblk0
dd_rescue: (info): ipos: 2293760.0k, opos: 2293760.0k, xferd: 2293760.0k
* errs: 0, errxfer: 0.0k, succxfer: 2293760.0k
+curr.rate: 0kB/s, avg.rate: 8634kB/s, avg.load: 2.0%
>------------------------.................< 59% ETA: 0:03:02
dd_rescue: (warning): read /dev/mmcblk0 (2293760.0k): Success!
dd_rescue: (info): ipos: 2293760.5k, opos: 2293760.5k, xferd: 2293760.5k
* errs: 1, errxfer: 0.5k, succxfer: 2293760.0k
+curr.rate: 1kB/s, avg.rate: 8609kB/s, avg.load: 2.0%
>-----------------------x.................< 59% ETA: 0:03:03
dd_rescue: (warning): read /dev/mmcblk0 (2293760.5k): Success!
dd_rescue: (info): ipos: 2293761.0k, opos: 2293761.0k, xferd: 2293761.0k
* errs: 2, errxfer: 1.0k, succxfer: 2293760.0k
+curr.rate: 1kB/s, avg.rate: 8584kB/s, avg.load: 2.0%
>-----------------------x.................< 59% ETA: 0:03:03
dd_rescue: (warning): read /dev/mmcblk0 (2293761.0k): Success!
dd_rescue: (info): ipos: 2293761.5k, opos: 2293761.5k, xferd: 2293761.5k
* errs: 3, errxfer: 1.5k, succxfer: 2293760.0k
+curr.rate: 1kB/s, avg.rate: 8559kB/s, avg.load: 2.0%
>-----------------------x.................< 59% ETA: 0:03:04
dd_rescue: (warning): read /dev/mmcblk0 (2293761.5k): Success!
dd_rescue: (info): ipos: 2293762.0k, opos: 2293762.0k, xferd: 2293762.0k
* errs: 4, errxfer: 2.0k, succxfer: 2293760.0k
+curr.rate: 1kB/s, avg.rate: 8535kB/s, avg.load: 2.0%
>-----------------------x.................< 59% ETA: 0:03:04
dd_rescue: (warning): read /dev/mmcblk0 (2293762.0k): Success!
dd_rescue: (info): ipos: 2293762.5k, opos: 2293762.5k, xferd: 2293762.5k
* errs: 5, errxfer: 2.5k, succxfer: 2293760.0k
+curr.rate: 1kB/s, avg.rate: 8511kB/s, avg.load: 2.0%
>-----------------------x.................< 59% ETA: 0:03:05
dd_rescue: (warning): read /dev/mmcblk0 (2293762.5k): Success!
|
 leibold | 5 consecutive bad sectors. Let's hope that this is all there is. Remember that the data for at least one of the blocks (2 sectors) is present in the ext4 filesystem journal. Once the SD card data is copied to working media, mounting the filesystem will hopefully recover that block through journal replay.
Most of your data ought to be intact (2.5kB out of 4GB is next to nothing, though of course not every block is equally important). |
|
 leibold | reply to Maxo You don't have to wait until you get a new 4GB SD card to recover the data.
Once dd_rescue is finished (I hope it long since has) and if you have the space for it, make a 2nd copy for your recovery attempts. Turn the 2nd image into a block device with the use of loopback devices (see losetup) and mount the 2nd partition (ext4). If you don't know the starting offset of the 2nd partition, create a loop device for the entire SD card image and use fdisk to read the partition table (be careful not to mix up 512-byte sectors and 1kB blocks). |
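The sector-versus-byte conversion is the part that is easy to get wrong, so here is a minimal sketch of it. The start sector is an assumption: 147456 is used because it corresponds to the 75497472-byte offset that appears elsewhere in this thread, and the image file name sdcard-copy.img is made up. Read the real start sector from your own partition table.

```shell
#!/bin/sh
# Sketch: turn a partition's start sector into a byte offset for losetup.
# ASSUMPTION: start_sector is the value fdisk lists for the 2nd partition;
# 147456 is illustrative and matches the 75497472-byte offset seen in
# this thread.
start_sector=147456
sector_size=512                 # fdisk reports starts in 512-byte sectors
offset=$((start_sector * sector_size))
echo "partition 2 starts at byte $offset"

# The root-only steps are shown as comments (file names are hypothetical):
#   losetup /dev/loop1 sdcard-copy.img       # whole image as a block device
#   fdisk -l /dev/loop1                      # read the partition table
#   losetup -o "$offset" /dev/loop0 sdcard-copy.img
#   mount /dev/loop0 /mnt                    # journal replay writes to the copy
```

Working on the 2nd copy matters here because mounting read-write lets the journal replay modify the image.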
|
 Maxo | Here are the final details of dd_rescue.
dd_rescue: (info): read /dev/mmcblk0 (3872256.0k): EOF
dd_rescue: (info): Summary for /dev/mmcblk0 -> /home/david/sdcard.img:
dd_rescue: (info): ipos: 3872256.0k, opos: 3872256.0k, xferd: 3872256.0k
errs: 286720, errxfer: 143360.0k, succxfer: 3728896.0k
+curr.rate: 303kB/s, avg.rate: 17kB/s, avg.load: 0.1%
>-----------------------xxx--------------.< 99% ETA: 0:00:00
It ran through the weekend while I was out of town.
|
 Maxo | One thing I didn't consider is that I do not know how to mount a single partition from a .img file of a whole disk. |
|
 Maxo | Some Googling and I did this.
root@HP:/home/baucumd# sudo losetup /dev/loop0 sdcard.img -o $((75497472))
root@HP:/home/baucumd# mkdir /media/sdcard
root@HP:/home/baucumd# fsck -fv /dev/loop0
fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
Pass 1: Checking inodes, blocks, and sizes
Inode 90322 has an invalid extent node (blk 2327776, lblk 3855)
Clear<y>? yes
Inode 90322, i_blocks is 11186, should be 7818. Fix<y>? yes
HTREE directory inode 278684 has an invalid root node.
Clear HTree index<y>? yes
HTREE directory inode 278787 has an invalid root node.
Clear HTree index<y>? yes
...
...
/dev/loop0: ***** FILE SYSTEM WAS MODIFIED *****
74314 inodes used (15.64%)
598 non-contiguous files (0.8%)
346 non-contiguous directories (0.5%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 67135/184/1
2306852 blocks used (60.73%)
0 bad blocks
0 large files
58220 regular files
7957 directories
56 character device files
25 block device files
0 fifos
4294967170 links
8046 symbolic links (6904 fast symbolic links)
1 socket
--------
74133 files
|
 Maxo | I'm pretty pumped as all of the MySQL files are now readable and appear to be intact. It will be a while before I have the opportunity to actually try restoring those files and seeing if they work. If this works I'm shipping some beers out your way, leibold. |
|
 leibold | said by Maxo: If this works I'm shipping some beers out your way, leibold.
Better not, I don't drink.
The file with inode number 90322 got truncated during the fsck. To identify which file this was, you can use the find command after mounting the filesystem, e.g.:
mount -r /dev/loop0 /mnt
find /mnt -inum 90322 -print
It is also possible to determine which files have been corrupted due to the defects in the SD card by checking the list of bad blocks that dd_rescue reported. This is a bit more involved and you can find some related information here.
Note 1: you don't need to run any programs to find bad blocks since you already have the bad block list (running badblocks on the copy wouldn't find any).
Note 2: in addition to the calculation regarding block sizes you also need to subtract the offset of the 2nd partition. |
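The arithmetic from Note 2 can be sketched as follows. The two input numbers are taken from this thread (the first failing position dd_rescue reported and the losetup offset of the 2nd partition); the 1 kB filesystem block size is an assumption that should be confirmed against the real filesystem.

```shell
#!/bin/sh
# Sketch: map a bad position on the whole card to an ext4 block number.
# Inputs from this thread: dd_rescue's first failing position (2293760.0k)
# and the byte offset of the 2nd partition used with losetup.
# ASSUMPTION: the filesystem uses 1 kB blocks.
bad_pos_kib=2293760
part_offset_bytes=75497472
fs_block_size=1024

bad_pos_bytes=$((bad_pos_kib * 1024))
# Subtract the partition offset first, then divide by the block size.
fs_block=$(( (bad_pos_bytes - part_offset_bytes) / fs_block_size ))
echo "device offset ${bad_pos_kib}k is filesystem block $fs_block"
```

A block number obtained this way should be usable with debugfs (icheck to get the inode, then ncheck to get the path) to see which file owns the damaged block.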
|