Recovering damaged CDs or DVDs with Linux

On Windows there is a slew of file recovery tools that will peer intently at an optical disc, retrying until they have recovered every file they can. The leading tool is probably IsoBuster, but there are dozens of candidates for the title. Linux and UNIX(tm) platforms have few automated (or even user-friendly) data recovery tools, but common tools that ship with the core system, or that can be installed through the official package system, are often sufficient for this critical task.

One particularly frustrating way to lose data is to burn it to an optical disc and store it. One often attempts to preserve data this way, only to have cheap media or a cheap storage container (especially binders) damage the disc beyond repair. Sometimes, however, the data around the error (or at least up to it, which is still better than nothing) may be readable if you use a tool more sophisticated than the 'cp' command (or selecting and dragging files in the file manager of your choice).

Copying an entire disc

One excellent starting point is GNU dd (from GNU coreutils), or another similarly capable implementation, for recovering the data on a damaged optical disc. A handful of the available options are especially helpful. Here's a possibly excessive example command line for copying a whole disc:

dd if=/dev/sr0 of=image.iso bs=2048 conv=noerror,notrunc iflag=nonblock

dd is an exceptionally useful utility. The GNU dd manpage says of dd that it will "Copy a file, converting and formatting according to the operands." We don't do any conversion, but we do specify some options as to how to go about reading the data:

From my first CD-ROM drive (if=/dev/sr0) I read to an appropriately named ISO file (of=image.iso). I specify a block size (bs=2048 means 2048 bytes, or 2 KiB, the sector size of a data disc) and some conversion options (conv=noerror,notrunc): noerror causes dd to continue after a read error, and notrunc stops dd from truncating an existing output file when it opens it. Be aware that noerror on its own writes nothing for a block it could not read, so the data after it ends up earlier in the image than it was on the disc; adding sync (conv=noerror,sync,notrunc) pads each failed read with zeros and keeps everything at its original offset. iflag=nonblock sets an "input flag" that asks dd to use non-blocking I/O, which should minimize the impact on your system at the possible expense of speed during the copy. Since I always assume that the copy will take a more or less indefinite period of time, this does not offend me at all, but I admit that it also helps to have an external DVD burner lying around as a backup in case I change my mind and decide that I really need my DVD-ROM.
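The effect of these options can be tried safely on an ordinary file before pointing dd at a disc. The following is only a sketch: copy_image is a made-up helper name, and it uses the conv=noerror,sync,notrunc variant, on the assumption that you want short or failed reads padded with zeros so the image stays a whole number of 2048-byte blocks.

```shell
# copy_image SRC DEST: hypothetical helper wrapping dd with the options
# discussed in this section, plus conv=sync (an assumption about what you
# want: short or failed reads are zero-padded to a full 2048-byte block).
copy_image() {
    dd if="$1" of="$2" bs=2048 conv=noerror,sync,notrunc 2>/dev/null
}

# Try it on a scratch file instead of a disc.
# 5000 bytes is two full blocks plus a partial block of 904 bytes.
scratch=$(mktemp)
image=$(mktemp)
head -c 5000 /dev/zero > "$scratch"
copy_image "$scratch" "$image"
wc -c < "$image"    # 6144: the partial block was padded to a third full block
rm -f "$scratch" "$image"
```

On a real disc the call would be something like copy_image /dev/sr0 image.iso, with the device name depending on your system.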

This is what it looks like when there are errors:

dd: reading `/dev/sr0': Input/output error
2306+0 records in
2306+0 records out
4722688 bytes (4.7 MB) copied, 88.0203 seconds, 53.7 kB/s
dd: reading `/dev/sr0': Input/output error
2308+0 records in
2308+0 records out
4726784 bytes (4.7 MB) copied, 88.0543 seconds, 53.7 kB/s

As you can see, although errors are clearly occurring, the copying continues (thanks to conv=noerror.) How much of this disc is going to be usable, however, is another matter. This is a DVD that became unreadable; perhaps next I'll try just running dvdbackup against it.
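The record counts in that output translate directly into bytes: the number before the "+" counts full blocks of bs bytes, and the number after it counts partial blocks. A small sketch of the arithmetic (records_to_bytes is just an illustrative name, not a standard tool):

```shell
# records_to_bytes RECORDS [BLOCK_SIZE]: bytes covered by a count of full
# dd records. BLOCK_SIZE defaults to 2048, the bs= value used above.
records_to_bytes() {
    echo $(( $1 * ${2:-2048} ))
}

records_to_bytes 2306    # prints 4722688, matching the first status report
records_to_bytes 2308    # prints 4726784, matching the second
```

Comparing the result with the expected size of the disc gives a rough idea of how far the copy has progressed.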

Recovering Files

If this ISO image contains data we can mount it and attempt to copy files from it as if it were a real optical disc. These techniques sometimes work on damaged optical discs as well, but the way that many operating systems react to optical disc errors is not graceful and it can be much more efficient to make an ISO and work from that.

GNU tar

One option for copying large volumes of files is to use GNU tar. Tar will often successfully copy from a mounted filesystem that has errors, but it can take very long periods of time. So I like to read the optical disc to an image with dd (as above) and then mount it using the loopback driver, and then read from that.

mount -o ro,loop image.iso /mnt/mountpoint

or, if you get errors about filesystem type, something like

mount -o ro,loop -t iso9660 image.iso /mnt/mountpoint

then

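cd /mnt/mountpoint
tar cvpf - . | ( cd /destination && tar xvpf - )

This copies every file under the mountpoint, hidden ones included, to the destination, preserving permissions. Using && rather than ; means the second tar never runs in the wrong directory if /destination does not exist.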
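The pipeline can also be wrapped in a small function. What follows is only a sketch (copy_tree is a made-up name): it archives "." so that hidden files come along, and it uses tar's -C option to change directory, after checking that both directories exist, so a missing destination aborts the copy instead of unpacking somewhere unexpected.

```shell
# copy_tree SRC DEST: hypothetical wrapper around a tar-to-tar pipeline.
# "." (rather than "*") makes tar include hidden files as well; -p keeps
# the original permissions on extraction.
copy_tree() {
    [ -d "$1" ] && [ -d "$2" ] || { echo "usage: copy_tree SRC DEST" >&2; return 1; }
    tar -C "$1" -cpf - . | tar -C "$2" -xpf -
}

# Example (paths are placeholders):
#   copy_tree /mnt/mountpoint /destination
```

The v flag is left out here so that tar's error messages are not buried in the file listing; add it back if you want to watch progress.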

Using dd to copy files

Here's another fun thing to do with dd. It's not as effective as when reading a block device, but you can read a file with dd (as far as a Unix program is concerned, a device and a file look pretty similar) and pass it options to try to convince it to copy a file with errors:

$ dd if=/mnt/cdrom/filename of=/someplace/filename conv=noerror,notrunc
657595+1 records in
657595+1 records out
336688970 bytes (337 MB) copied, 144.927 seconds, 2.3 MB/s

(The filename has been changed, but the output and the rest of the command line are real. This wasn't the output from the problem file, though.)
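After a copy like this it is worth checking whether the result is as long as the original. The sketch below compares the two sizes; size_shortfall is a hypothetical name, and it assumes the stat command from GNU coreutils, which reads the size from the file's metadata rather than from the damaged data itself.

```shell
# size_shortfall ORIGINAL COPY: print how many bytes COPY is shorter than
# ORIGINAL. 0 means the sizes match; it does not prove the contents are intact.
size_shortfall() {
    echo $(( $(stat -c %s "$1") - $(stat -c %s "$2") ))
}

# Example (paths are placeholders):
#   size_shortfall /mnt/cdrom/filename /someplace/filename
```

A non-zero result means some blocks were dropped along the way, so anything after the first bad spot in the copy should be treated with suspicion.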

Unstoppable Copier

I've also found a little Qt-lib program called the unstoppable copier ("unstopcp") which will aggressively retry on failure, on a file-by-file basis. I found it to be a bit less likely to truncate files than dd, even with the "notrunc" option (it may be filling them with garbage, though, for all I know.) Sometimes it works better to use unstopcp on files on a mounted optical disc directly than by making an ISO and mounting it, so it's worthwhile to experiment in some cases.

Recovering Video

If the image contains video, we can try extracting data from it as if it were a device. Most programs will happily read a file in place of a device, since in keeping with the Unix metaphor the two are as little different as possible. Video CD, Super Video CD, DVD-Video and other video disc formats are often written in forms that make it difficult or impossible to get the data from them by simply copying files, and additional utilities are necessary.

DVD

For example, if this is a video DVD, you could use the dvdbackup utility from the command line:

dvdbackup -i image.iso -o output_directory -M

output_directory is a precreated, empty directory. The -M flag instructs dvdbackup to copy the DVD's complete contents. If you have libdvdcss installed and dvdbackup is built to use it, then this step also includes CSS removal, a necessary prelude to burning the resulting content, e.g. for the purposes of recovering a protected video DVD which has become unplayable. If not, then on x86-compatible platforms you could run DVDShrink (or something similar) under wine to remove this "protection".

(S)VCD

If the disc is a VCD or SVCD, use vcdxrip (part of vcdimager) to extract the MPEG data from the disc.

Identifying Discs

One fun thing to find in your disc collection is a disc that obviously contains data (you can flip it over and see the difference between the written and unwritten area on any disc not completely filled) but which hasn't been labeled. Perhaps you couldn't find your Sharpie(tm) and the disc was originally in a labeled jewel case, which was right on top of the DVD player, but nobody ever puts them in there... Er, where was I? Oh yeah, here's one way to identify an optical disc, using dd and the file command:

dd if=/dev/sr0 bs=2048k count=1 | file -

This grabs the first couple of megabytes (bs=2048k count=1) and submits them to the file command ("determine file type") for identification. You can experiment with the value of bs (block size) if you're not getting a good ID. Naturally, if=image.iso would also work fine, to identify an ISO file. Here's some example output:

/dev/stdin: UDF filesystem data (version 1.5) 'DVD_VOLUME                     '

This is handy because GNU dd and the file command are on pretty much every Linux system (except the very tiny, which tend to use busybox.)
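For ISO 9660 discs specifically, the volume label sits at a fixed position, so dd alone can read it. In the ISO 9660 layout the Primary Volume Descriptor normally starts at sector 16 (byte 32768), with the signature "CD001" at offset 1 and a 32-byte, space-padded volume identifier at offset 40. The sketch below relies on those offsets; iso_label is a hypothetical name, and it will not help with UDF-only discs such as the DVD in the sample output above.

```shell
# iso_label IMAGE_OR_DEVICE: print the ISO 9660 volume identifier.
# Assumes the Primary Volume Descriptor is the first descriptor, at byte
# 32768: bytes 1-5 hold "CD001" and bytes 40-71 hold the volume label.
iso_label() {
    magic=$(dd if="$1" bs=1 skip=32769 count=5 2>/dev/null)
    [ "$magic" = "CD001" ] || { echo "not ISO 9660: $1" >&2; return 1; }
    label=$(dd if="$1" bs=1 skip=32808 count=32 2>/dev/null)
    # The label is space-padded to 32 bytes; strip the trailing padding.
    printf '%s\n' "$label" | sed 's/ *$//'
}

# Example (device name and image name are placeholders):
#   iso_label /dev/sr0
#   iso_label image.iso
```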

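You may also have the vol_id program (from the "volumeid" package); it shipped with older udev releases, and current systems provide blkid from util-linux for the same job. With vol_id it looks like this: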

$ vol_id /dev/sr0
ID_FS_USAGE=filesystem
ID_FS_TYPE=iso9660
ID_FS_VERSION=
ID_FS_UUID=
ID_FS_UUID_ENC=
ID_FS_LABEL=Edubuntu 7.10 i386 Bin-1
ID_FS_LABEL_ENC=Edubuntu\x207.10\x20i386\x20Bin-1
ID_FS_LABEL_SAFE=Edubuntu_7.10_i386_Bin-1

You might also have some other programs available to you. Here's another way using the "isoinfo" command, which is part of the "genisoimage" program (and at least on Ubuntu, that is the name of the package as well.)

$ isoinfo -d -i /dev/sr0
CD-ROM is in ISO 9660 format
System id:
Volume id: NEW
Volume set id:
Publisher id:
Data preparer id:
Application id: NERO___BURNING_ROM
Copyright File id:
Abstract File id:
Bibliographic File id:
Volume set size is: 1
Volume set sequence number is: 1
Logical block size is: 2048
Volume size is: 164579
Joliet with UCS level 3 found
NO Rock Ridge present

It can tell you a little more than you really want to know; you could pipe it through "head -n 3" and get everything you typically need. The last two lines can be interesting, too: depending on which representation of the original, pre-iso9660 filenames you want, you might want to mount the volume with either the "nojoliet" or "norock" option (look them up in the mount(8) manpage, Linux Programmer's Manual, 2004-12-16, Manpages for util-linux-ng for Linux 2.6 or a similar reference.)
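Two of those lines are also useful during recovery. Multiplying "Volume size is" (a count of blocks) by "Logical block size is" gives the number of bytes the filesystem occupies, which can be compared with the size of an image made with dd. Here is a sketch that does the multiplication from isoinfo's output; expected_image_bytes is a hypothetical name, and it assumes the two lines are formatted exactly as in the sample above.

```shell
# expected_image_bytes: read "isoinfo -d" output on stdin and print
# block size x volume size, i.e. the bytes the ISO 9660 filesystem occupies.
expected_image_bytes() {
    block_size=
    blocks=
    while IFS= read -r line; do
        case $line in
            "Logical block size is: "*) block_size=${line#*: } ;;
            "Volume size is: "*)        blocks=${line#*: } ;;
        esac
    done
    [ -n "$block_size" ] && [ -n "$blocks" ] || return 1
    echo $(( block_size * blocks ))
}

# Example (device name is a placeholder):
#   isoinfo -d -i /dev/sr0 | expected_image_bytes
```

For the sample disc above this works out to 164579 x 2048 = 337057792 bytes. An image that is noticeably smaller than that is missing data; one that is slightly larger is not necessarily a problem, since a disc can carry extra sectors after the filesystem.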

Conclusion

These are some of my favorite file recovery techniques. I hope they're as useful to you as they have been to me. Sometimes I have better luck with file recovery tasks on Windows, sometimes Linux, and it doesn't seem to have anything in particular to do with the format of the disc. I could perhaps chalk it up to differences in optical drives, which can be truly significant. Finally, I have had some luck with the manual disc sanding machines (e.g. Dr. DVD) which really are capable of making almost any disc readable unless it has damage to the upper metal layer. I've also heard good things about Brasso, but have never tried it myself. Polishing a disc and then reading it off with dd is a great option for getting that one last read.

Comments

Since writing this article I've also become familiar with ddrescue, which is a lovely little tool that is willing to try, and try, and try again to read your data.

This saved me an important DVD with images and I love the "KISS" way using the core tools. I fiddled with dd a bit myself but didn't know the right options to use, so I googled and found this article. Thanks a lot.
