Ubuntu Manpage: rdd-copy - copy a file, even if read errors occur

NAME

       rdd-copy - copy a file, even if read errors occur

SYNOPSIS

       rdd-copy [OPTION] src [dst]

       rdd-copy -C [CLIENT OPTION] src [host:]dst

       rdd-copy -S [SERVER OPTION]

DESCRIPTION

       Rdd-copy  is  a  file  and  device  copying  utility that includes features that are useful in a forensic
       environment.  In particular, rdd-copy can compute cryptographic hashes over the data it copies, is robust
       with respect to read errors, and can copy data across a network.

       Rdd-copy is best understood as a program that consists of a reader  stage  and  one  or  more  processing
       stages.  The reader stage reads input data in a robust way.  It will retry failed reads.  If a read error
       persists,  the  reader  stage  substitutes  zero  bytes  for  the input bytes that it fails to read.  The
       resulting bytes are passed to all subsequent processing stages.

       The processing stages are enabled through command-line options.  The  current  stages  are:  checksumming
       (Adler32 and CRC32), hashing (MD5 and SHA1), file output, network output, and statistics.
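
       As a rough illustration only (this is not rdd-copy's source code), the staged design can be sketched in
       Python.  The function name run_stages and the decision to checksum each read block are assumptions made
       for this sketch; rdd-copy itself uses separate, configurable block sizes for its checksum stages.

```python
import hashlib
import zlib


def run_stages(source, block_size=64 * 1024):
    """Read `source` block by block and feed each block to every stage.

    `source` is any binary file-like object.  Hypothetical helper, not
    part of rdd-copy.
    """
    md5 = hashlib.md5()        # hashing stage (whole stream)
    sha1 = hashlib.sha1()      # hashing stage (whole stream)
    adler_per_block = []       # checksumming stage (one value per block)
    total = 0                  # trivial statistics stage

    while True:
        block = source.read(block_size)
        if not block:          # end of input
            break
        # Every stage sees exactly the same bytes, in the same order.
        md5.update(block)
        sha1.update(block)
        adler_per_block.append(zlib.adler32(block) & 0xFFFFFFFF)
        total += len(block)

    return {
        "bytes": total,
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "adler32": adler_per_block,
    }
```

       Because each stage receives the reader's output, any zero bytes substituted by the reader would be hashed
       and checksummed like ordinary data.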

       Rdd-copy  can  be  run  in  local mode, in client mode, and in server mode.  The mode is indicated by the
       first command-line argument.

       Copying data across a network requires two rdd-copy processes: a client process that reads the data  from
       disk  and  transmits it across the network, and a server process that reads the data from the network and
       writes it to a file or device.

LOCAL MODE

       In local mode, rdd-copy copies source file src to destination file dst, handling read errors according to
       the options.  If dst is not specified, the data in src will be read and optionally hashed,  but  it  will
       not be written.  To write to standard output, specify - as dst.

       Rdd-copy  will  optionally compute an MD5 or a SHA1 hash value over the input bytes and the zero bytes it
       substitutes for blocks it cannot read.  These hash values should be interpreted with care (see below).

       Rdd-copy does NOT guarantee that the bytes it reads are the same bytes  that  are  stored  on  the  input
       medium.  It simply takes what read(2) returns.  Any hash values (see options) are computed over the bytes
       that read(2) returns or, if read(2) fails, over zero-valued fill bytes.

       Rdd-copy  does  NOT  guarantee that the bytes that it reads into memory (or the zero-valued bytes that it
       substitutes when a read error occurs) will be written to the output  file  correctly.   If  you  wish  to
       verify  the  correspondence  between  what  rdd-copy  saw  and what got written to disk, you will have to
       recompute the MD5 and/or SHA1 hash values over the output file and compare  them  with  the  hash  values
       reported  by  rdd-copy.   This  is  a  useful  verification  step,  but beware that even this step cannot
       guarantee perfect correspondence with the data stored on the source medium.
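
       The recomputation step might look like the following Python sketch.  The helper names file_digest and
       matches_reported are hypothetical, and the sketch assumes that the hash value reported by rdd-copy has
       been copied by hand as a hexadecimal string; the exact format of rdd-copy's report is not described here.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of the file at `path` ("md5" or "sha1")."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so that large images do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_reported(path, reported_hex, algorithm="md5"):
    """Compare the recomputed digest with a hex string taken from the log."""
    return file_digest(path, algorithm) == reported_hex.strip().lower()
```

       A match only shows that the output file contains the bytes that rdd-copy hashed, including any substituted
       zero bytes.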

       The best end-to-end test is probably to read back the output file and compare each  output  byte  to  the
       corresponding  input  byte, unless that input byte was part of a block for which rdd-copy reported a read
       error.

       Rdd-copy does NOT recover from persistent write errors.  Rdd-copy was designed to handle unfriendly
       source media only.  If you get write errors, you should replace your target medium.

READ ERRORS

In local mode and in client mode, rdd-copy reads from disk. Rdd-copy assumes that the source disk may be
faulty and tries to be robust with respect to disk-read errors. In server mode, rdd-copy reads from the
network and makes no attempt to survive read errors. The explanation below applies only to read errors
that occur in local mode and in client mode.

When a read error occurs, rdd-copy reduces the block size to the minimum block size (see
--min-block-size) and resets the read pointer to the location at which it started the read that failed.

Next, rdd-copy tries to read a series of minimum-sized blocks (see --min-block-size). When such a read
fails, it is retried a user-specified number of times (see --nretry). If the read failure persists,
rdd-copy normally skips a minimum-sized block of input data and writes a minimum-sized block of zero
bytes to the destination file. These zero bytes are also passed to all other rdd-copy processing stages
(checksumming, hashing, and statistics).

Any persistent read failure counts toward the maximum number of read errors that the user will tolerate
(see --max-read-err). If this maximum is reached, rdd-copy will exit immediately. By default, however,
the number of read errors is unlimited.

After a read failure, rdd-copy continues to use the minimum block size to read data until it has read
block-size bytes of data without errors. (block-size is the user-specified block size, see
--block-size.) Only then will rdd-copy increase its block size again, doubling the size at each
successful read, until it reaches the default block size.
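
The strategy described in this section can be approximated by the Python sketch below. It is an
illustration, not rdd-copy's implementation. The names robust_read and read_at are hypothetical;
read_at(offset, size) stands for any function that returns bytes or raises OSError. The sketch also
assumes that a read that succeeds on a retry counts as error-free, and that reaching the error limit
stops the copy before the failed block is zero-filled; the text above does not settle either point.

```python
def robust_read(read_at, total, block_size=4096, min_block_size=512,
                nretry=1, max_read_err=0):
    """Copy `total` bytes via read_at(); return (data, bad_offsets).

    max_read_err=0 means that an unlimited number of errors is tolerated.
    """
    out = bytearray()
    bad_offsets = []          # offsets of blocks replaced by zero bytes
    pos = 0                   # current read position
    cur = block_size          # current read size
    clean = block_size        # bytes read without error since last failure

    while pos < total:
        want = min(cur, total - pos)
        data = None
        # Retries apply only to minimum-sized reads.
        attempts = 1 + (nretry if cur == min_block_size else 0)
        for _ in range(attempts):
            try:
                data = read_at(pos, want)
                break
            except OSError:
                continue

        if data is None:                      # the read failed
            clean = 0
            if cur > min_block_size:
                # Shrink the block size; pos is unchanged, so the next
                # read starts where the failed read started.
                cur = min_block_size
                continue
            bad_offsets.append(pos)           # persistent failure
            if max_read_err and len(bad_offsets) >= max_read_err:
                break
            out += bytes(want)                # substitute zero bytes
            pos += want
            continue

        if not data:                          # unexpected end of input
            break
        out += data
        pos += len(data)
        clean += len(data)
        # Grow again only after block_size clean bytes, doubling per read.
        if cur < block_size and clean >= block_size:
            cur = min(cur * 2, block_size)

    return bytes(out), bad_offsets
```

With block_size=4096 and min_block_size=512, a single unreadable sector costs one zero-filled 512-byte
block, after which eight clean 512-byte reads are needed before the read size starts doubling again.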

CLIENT MODE

       In client mode, rdd-copy operates as in local mode, except that the data will not be copied  to  a  file,
       but will be written to a TCP connection to an rdd-copy server process.

       In  client  mode,  a  destination  file,  dst,  on  a  destination host must be specified.  If no host is
       specified, localhost will be used.
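
       The [host:]dst convention can be illustrated with a small Python sketch.  The function name split_target
       is hypothetical, and splitting at the first colon is an assumption; how rdd-copy treats file names that
       themselves contain a colon is not specified here.

```python
def split_target(target, default_host="localhost"):
    """Split a '[host:]dst' argument into (host, dst)."""
    host, sep, path = target.partition(":")
    if not sep:
        # No colon present: the whole argument is the destination file.
        return default_host, target
    return host, path
```

       For example, split_target("snake:/images/disk.img") yields the host snake and the file /images/disk.img.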

SERVER MODE

       In server mode, rdd-copy accepts one TCP connection from an rdd-copy client.  The server process must  be
       started  before  the  client  process.  In server mode, rdd-copy will read data from a TCP connection and
       write it to a target file.  For now, the target file must always be specified by the  client.   The  main
       reason  for  this  decision  is to keep open the option of having inetd(8) or xinetd(8) start an rdd-copy
       server process.

OUTPUT

       Informative messages, error messages, and statistics are all written to stderr.

OPTIONS

-C, --client
Run rdd-copy in client mode. If you use this option, it must come first.

-S, --server
Run rdd-copy in server mode. If you use this option, it must come first.

-p, --port <portnum>
Modes: client, server.

Specifies the port number <portnum> at which the server listens for an incoming connection. The
default port is 4832.

-?, --help
Modes: all.

Print a usage message that includes this list of options.

-V, --version
Modes: all.

Print version information and exit.

-v, --verbose
Modes: all.

Be verbose.

-q, --quiet
Modes: all.

Do not pose interactive questions.

-l, --log-file <logfile>
Modes: all.

Log all messages except progress messages to <logfile>.

-f, --force
Modes: local, server.

Force existing files to be overwritten. The default behavior is to bail out when the output file
already exists.

-b, --block-size <size>
Modes: local, client.

Specify the default block size; <size> must be a power of two. While no read errors occur,
rdd-copy will read and write blocks of <size> bytes.

-m, --min-block-size <size>
Modes: local, client.

Specify the minimum read size; <size> must be a power of two. When a persistent read error
occurs, at least this many bytes of data will be skipped and replaced with zero bytes in the
destination file.

-n, --nretry <count>
Modes: local, client.

Retry failed reads up to <count> times. In many cases, using a large retry value makes little
sense, because the operating system's device driver will not indicate a failed read until it has,
itself, retried the read several times.

-o, --offset <size>
Modes: local, client.

Skip <size> bytes from the start of the input file before reading any data. The bytes that are
skipped will not be included in any hash computation and will not be written to the output file.

-c, --count <size>
Modes: local, client.

Read at most <size> input bytes, stopping earlier if end-of-file is reached.

-z, --compress
Modes: client.

Compress network data.

-s, --split <size>
Modes: local, server.

If necessary, create multiple output files, none of which will be larger than <size> bytes. Each
output file will have a name that consists of a sequence number followed by a dash and the name
specified on the command line.

-r, --raw
Modes: local, client.

Read the input through the raw device (see raw(8)), so that the data does not travel through the
buffer cache.

-P, --progress <sec>
Modes: all.

Report progress (bytes read and percentage of data covered) every <sec> seconds.

-M, --max-read-err <count>
Modes: local, client.

Give up after <count> read errors.

--md5
Modes: all.

Compute an MD5 hash value over all data that was read without errors and over the zero-filled
blocks that are used to replace bad blocks.

--sha, --sha1
Modes: all.

Compute a SHA1 hash value over all data that was read without errors and over the zero-filled
blocks that are used to replace bad blocks.

--checksum, --adler32 <file>
Modes: all.

Compute an Adler32 checksum value over blocks of data produced by the reader stage. The last
block to be checksummed may be smaller than the block size that is used. All checksum values
are written to <file>.

--checksum-block-size, --adler32-block-size <size>
Modes: all.

Compute Adler32 checksum values over data blocks with a size of <size> bytes. Only the last data
block to be checksummed may be smaller than <size>. The default block size is 32 Kbyte.

--crc32 <file>
Modes: all.

Compute a CRC32 checksum value over blocks of data produced by the reader stage. The last block
to be checksummed may be smaller than the block size that is used. All checksum values are
written to <file>.

--crc32-block-size <size>
Modes: all.

Compute CRC32 checksum values over data blocks with a size of <size> bytes. Only the last data
block to be checksummed may be smaller than <size>. The default block size is 32 Kbyte.

-H, --histogram <file>
Modes: all.

Compute a histogram over each block of data produced by the reader stage. The histogramming block
size can be set by the user (see --hist-block-size). For each block, write a single text line of
statistics to <file>.

-h, --hist-block-size <size>
Modes: all.

Set the histogramming block size to <size> bytes. The default block size is 256 Kbyte.

--block-md5 <file>
Modes: all.

Compute the MD5 hash value over blocks of data produced by the reader stage. The last block to be
hashed may be smaller than the block size. All MD5 values are written to text file <file>. Each
line in this file contains a block number, followed by a space, followed by the hash value of the
corresponding block.

--block-md5-size <size>
Modes: all.

Sets the block size of the block-wise MD5 computation. The default block size is 4 Kbyte.

A <size> argument may be followed by one of the following multiplicative suffixes: c (1), w (2),
b (512), k (1024), M (1,048,576), or G (1,073,741,824).
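
The suffix rule and the power-of-two requirement of --block-size and --min-block-size can be
illustrated with a short Python sketch. The helper names parse_size and is_power_of_two are
hypothetical, and treating the suffixes as case-sensitive, exactly as listed, is an assumption.

```python
# Multipliers as listed in this manual page.
SUFFIXES = {"c": 1, "w": 2, "b": 512, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a <size> argument such as '16k' or '512' to a byte count."""
    text = text.strip()
    if text and text[-1] in SUFFIXES:
        # Number followed by exactly one multiplicative suffix.
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)  # plain byte count; raises ValueError on bad input


def is_power_of_two(n):
    """True if n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and n & (n - 1) == 0
```

Under these assumptions, parse_size("16k") gives 16384, which is a power of two and would therefore
be acceptable as a block size.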

EXAMPLES

       rdd-copy --md5 /dev/hda1

              Compute and print the MD5 hash value over  /dev/hda1.   On  Linux,  /dev/hda1  denotes  the  first
              partition of the primary master disk.

       rdd-copy -b 16k -m 512 -l rdd-log.txt /dev/fd0 f.img

              Create  an image of a floppy disk (/dev/fd0).  Copy 16 Kbyte at a time, but use blocks as small as
              a single sector (512 bytes) when read errors occur.  Write all log messages to the file
              rdd-log.txt.

       On the server: rdd-copy -S --sha1

       On the client: rdd-copy -C --sha1 /dev/hdb snake:/images/disk.img

              Copy the primary slave disk to host snake and store the data in file /images/disk.img.  The client
              host  computes  a  SHA1 hash over the data it reads from the disk; the server host computes a SHA1
              hash over the data it receives from the network.

       rdd-copy --count 512 /dev/hda mbr.img

              Copy the master boot record (MBR) from the primary master disk to file mbr.img.

NOTES

       If you encounter read errors, do examine /var/log/messages (or the  equivalent  file  on  your  operating
       system).  It may contain useful device driver error messages.

       On Linux (kernel 2.4 and earlier), rdd-copy and other programs that read from a block device may yield an
       I/O error when they reach the end of the device, even if there's nothing wrong with the device.   To  the
       best  of  my  knowledge, this is a Linux problem rather than an rdd-copy problem; the same problem occurs
       with GNU dd and other programs.  The problem is described in the following document:
       http://www.cftt.nist.gov/Notes_on_dd_and_Odd_Sized_Disks4.doc.  The problem has apparently been solved in
       the Linux 2.6 kernel.

       If  you use rdd-copy to access a device, consider using the raw device (see raw(8)).  This way, your data
       will not travel through the buffer cache.

BUGS

       Server-side errors are not reported back to the client.  Users must watch the server's output.

REPORTING BUGS

       Report bugs to <rdd@holmes.nl>.

ACKNOWLEDGEMENTS

       Many thanks to all who reported bugs and successes, and who suggested improvements.   You  know  who  you
       are.

COPYRIGHT

       Copyright © 2002-2003 Netherlands Forensic Institute
       This software comes with NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

HISTORY

       Up to version 1.2-7a, rdd-copy (then called rdd) used a different error recovery strategy.  With the new
       strategy, users can no longer set the recovery threshold, so the --recovery-len option has been retired.

rdd 2.0                                           December 2004                                           RDD(1)