[one-liner]: Using the Linux Command, dirsplit, to Dynamically Back Up a Directory Over Multiple DVDs

Background

At my day job I deal with a fair amount of image data. We typically ship the data out on hard drives or thumb drives, or via SFTP. On occasion we will burn it to a CD or a DVD. Until today, every shipment was either large (200-400GB) or small (less than 1-2GB). Today’s shipment, however, was 18GB. What to do? I didn’t have a spare USB thumb drive handy, so I thought, ah, I’ll just throw it on a couple of single-layer DVDs. My first order of business was to figure out how many. As is usual with Linux/UNIX, there is pretty much already a tool for everything, if only you look hard enough 8-).

For this particular shipment all the image data was organized into a couple dozen folders, each weighing in at ~100-200MB. I quickly figured that 5 DVDs should be more than enough, but how to optimally fill each DVD? Luckily there’s a program called dirsplit which made this a breeze.
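
The back-of-the-envelope disc count can be sketched in a few lines of shell. This is only a rough lower bound, not what dirsplit does: min_discs is a hypothetical helper name, the 4488 MiB capacity is borrowed from dirsplit's default medium size, and I'm assuming "18GB" means roughly 18 GiB.

```shell
#!/bin/bash
# min_discs is a hypothetical helper, not part of dirsplit or cdrkit.
# It prints ceil(total_bytes / capacity), a lower bound on the disc count.
min_discs() {
  local total_bytes=$1
  local capacity_mib=${2:-4488}   # assumption: dirsplit's default medium size
  local capacity_bytes=$(( capacity_mib * 1024 * 1024 ))
  echo $(( (total_bytes + capacity_bytes - 1) / capacity_bytes ))
}

# Example: 18 GiB of data on default-sized media.
min_discs $(( 18 * 1024 * 1024 * 1024 ))   # prints 5
```

On a system with GNU du, something like `min_discs "$(du -sb . | cut -f1)"` would feed in the real size of the current directory. The true count can be higher, since whole folders have to stay together and the ISO filesystem adds some overhead.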

Solution

Yet another tool I’d never heard of, dirsplit is actually a Perl script that can analyze a directory and work out an efficient way to burn it to a set of DVDs. Once it’s done analyzing a directory, it writes a set of .list files, one per DVD’s worth of files. dirsplit is part of the cdrkit package, which includes the following programs:

  • dirsplit: dirsplit utility
  • genisoimage: Creates an image of an ISO9660 filesystem
  • icedax: A utility for sampling/copying .wav files from digital audio CDs
  • wodim: A command line CD/DVD recording program – (“write optical disk media”) – a cdrecord replacement

It can get a little confusing, but cdrkit, at least under Fedora & CentOS, is split into 4 individual RPMs, so we only need to install the one that provides dirsplit. I installed it like so:

yum install dirsplit

dirsplit’s basic usage:

% dirsplit [options] [advanced options] < directory >
 
 -H|--longhelp Show the long help message with more advanced options
 -n|--no-act   Only print the commands, no action (implies -v)
 -s|--size     NUMBER - Size of the medium (default: 4488M)
 -e|--expmode  NUMBER - directory exploration mode (recommended, see long help)
 -m|--move     Move files to target dirs (default: create mkisofs catalogs)
 -p|--prefix   STRING - first part of catalog/directory name (default: vol_)
 -h|--help     Show this option summary
 -v|--verbose  More verbosity
 
The complete help can be displayed with the --longhelp (-H) option.
The default mode is creating file catalogs useable with:
    mkisofs -D -r --joliet-long -graft-points -path-list CATALOG
 
Example:
dirsplit -m -s 700M -e2 random_data_to_backup/

Once it’s installed, cd to <image data directory> and run the following command:

# -e takes a number (1-4). In our case we're using 2
 
#  2: like 1, but all files in directory are put together (as "atom") onto the
#     same medium. This does not apply to subdirectories, however.
 
# analyze current directory, i.e. the dot
 
% dirsplit -e2 .
Building file list, please wait...
Calculating, please wait...
....................
Calculated, using 5 volumes.
Wasted: 7827961 Byte (estimated, check mkisofs -print-size ...)
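
If you want those two summary numbers in a friendlier form, a small filter can pull them out. This is a minimal sketch: summarize_dirsplit is a made-up helper name, and it only understands the "Calculated" and "Wasted" lines exactly as they appear in the sample output above.

```shell
#!/bin/bash
# summarize_dirsplit is a hypothetical helper, not part of dirsplit.
# It reads dirsplit's output on stdin and prints the volume count and
# the wasted space converted from bytes to MiB.
summarize_dirsplit() {
  awk '
    /^Calculated, using [0-9]+ volumes/ { volumes = $3 }
    /^Wasted: [0-9]+ Byte/              { wasted = $2 }
    END { printf "%d volumes, %.2f MiB wasted\n", volumes, wasted / 1048576 }
  '
}

# Feed it the two summary lines from the run above.
summarize_dirsplit <<'EOF'
Calculated, using 5 volumes.
Wasted: 7827961 Byte (estimated, check mkisofs -print-size ...)
EOF
# prints: 5 volumes, 7.47 MiB wasted
```

Assuming dirsplit writes its summary to standard output, `dirsplit -e2 . | summarize_dirsplit` would do the same for a live run.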

In addition to telling us how many DVDs we’ll require, dirsplit also tells us how much space will be wasted, and writes one .list file per DVD. For the above run, dirsplit generated the following 5 .list files:

-rwxrwxr-x 1 root root 1679528 Feb 18 21:18 vol_1.list
-rwxrwxr-x 1 root root 1689556 Feb 18 21:18 vol_2.list
-rwxrwxr-x 1 root root 1694680 Feb 18 21:18 vol_3.list
-rwxrwxr-x 1 root root 1694300 Feb 18 21:18 vol_4.list
-rwxrwSr-x 1 root root   17110 Feb 18 21:18 vol_5.list
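
Before building anything, it can be reassuring to total up what a catalog actually points at. The sketch below assumes each line of a .list file is a graft-point entry of the form target=source (the format mkisofs's -graft-points option reads) and that no path contains an "=" character. catalog_bytes is a hypothetical helper, so treat it as a rough check only.

```shell
#!/bin/bash
# catalog_bytes is a hypothetical helper, not part of dirsplit.
# Assumption: every catalog line looks like "target=source", with no "="
# inside either path. Only regular files are counted; directory entries
# and missing files are skipped, so the total may undercount.
catalog_bytes() {
  local list=$1 total=0 line src size
  while IFS= read -r line; do
    src=${line#*=}                 # keep the source path after the first "="
    [ -f "$src" ] || continue
    size=$(wc -c < "$src")
    total=$(( total + size ))
  done < "$list"
  echo "$total"
}
```

Running `catalog_bytes vol_1.list` would then print the number of bytes that volume's files occupy on disk, which should come in under the medium size passed to dirsplit.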

The beauty of dirsplit is that the .list files can be fed to mkisofs to generate .iso files, one for each .list file. mkisofs has an option, -path-list, which takes the .list file as an argument. I used the following command to generate a single .iso for the 1st .list file, vol_1.list.

% mkisofs -o ~/backup1.iso -D -r --joliet-long -V "BACKUP DISC1" -graft-points -path-list vol_1.list
INFO:   UTF-8 character encoding detected by locale settings.
        Assuming UTF-8 encoded filenames on source filesystem,
        use -input-charset to override.
  0.22% done, estimate finish Sat Feb 18 04:41:56 2012
  0.44% done, estimate finish Sat Feb 18 04:41:57 2012
  0.65% done, estimate finish Sat Feb 18 04:41:57 2012
  0.87% done, estimate finish Sat Feb 18 04:41:57 2012
  1.09% done, estimate finish Sat Feb 18 04:40:25 2012
  1.31% done, estimate finish Sat Feb 18 04:40:41 2012
  ...
  ...
  99.46% done, estimate finish Sat Feb 18 04:41:58 2012
  99.68% done, estimate finish Sat Feb 18 04:41:57 2012
  99.90% done, estimate finish Sat Feb 18 04:41:57 2012
Total translation table size: 0
Total rockridge attributes bytes: 1086116
Total directory bytes: 2136064
Path table size(bytes): 4858
Max brk space used bfe000
2292383 extents written (4477 MB)
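
The last line of that output is worth a quick sanity check. mkisofs counts in 2048-byte extents, so the arithmetic is easy to reproduce. In this sketch both helper names are made up, and the default capacity of 2295104 sectors is the commonly quoted figure for single-layer DVD+R media, which is an assumption you should check against your own discs.

```shell
#!/bin/bash
# Hypothetical helpers for sanity-checking an image size; not part of cdrkit.

# mkisofs reports 2048-byte extents (sectors); convert to whole MiB.
extents_to_mib() {
  echo $(( $1 * 2048 / 1048576 ))
}

# Succeeds if the image fits on the disc. The default of 2295104 sectors is
# an assumed single-layer DVD+R capacity; pass your own as the 2nd argument.
fits_on_disc() {
  local extents=$1 capacity_sectors=${2:-2295104}
  [ "$extents" -le "$capacity_sectors" ]
}

extents_to_mib 2292383                           # prints 4477
fits_on_disc 2292383 && echo "fits on one disc"
```

The 4477 agrees with the "(4477 MB)" that mkisofs printed, and the image squeaks in under the assumed capacity with a couple thousand sectors to spare.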

You could use something more advanced to generate all the .iso files:

#!/bin/bash

# build one .iso per catalog, vol_1.list through vol_5.list
for i in $(seq 1 5); do
  mkisofs -o ~/backup${i}.iso -D -r --joliet-long -V "BACKUP DISC${i}" -graft-points -path-list "vol_${i}.list"
done
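
The loop above hard-codes five volumes. A variation that picks up however many catalogs dirsplit produced might look like the sketch below. build_all_isos is a hypothetical name, it assumes the default vol_ prefix, and it only prints the mkisofs commands so you can look them over first.

```shell
#!/bin/bash
# build_all_isos is a hypothetical helper. It prints (rather than runs) one
# mkisofs command per vol_N.list catalog in the current directory.
# The printed lines are for inspection only, since echo drops the quoting.
# Remove the leading "echo" to actually build the images.
build_all_isos() {
  local list n
  for list in vol_*.list; do
    [ -e "$list" ] || continue        # the glob matched nothing
    n=${list#vol_}                    # vol_3.list -> 3.list
    n=${n%.list}                      # 3.list     -> 3
    echo mkisofs -o "$HOME/backup${n}.iso" -D -r --joliet-long \
      -V "BACKUP DISC${n}" -graft-points -path-list "$list"
  done
}

build_all_isos
```

Deriving the number from the file name keeps each image and volume label lined up with its catalog, whatever order the shell expands the glob in.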

Once you’ve got .iso files, you can use your favorite burning software to write them to DVDs. I usually just do something like this:

# run from the directory containing the .iso files
% sudo growisofs -Z /dev/dvd=backup1.iso


NOTE: For further details regarding my one-liner blog posts, check out my one-liner style guide primer.
