Backing up film assets
last update 20051102
Introduction to BUP
Archiving 'typical' footage.mov (e.g. assets under Cinelerra) onto a cdr
-
Check files' attributes before compression
Compress 'sparse' footage.mov
Compress and create image of sparse footage.mov for a 650Mb cdr in one go
Mount image (to run MD5SUM) before burning
Extract file (to run MD5SUM) before burning
Extract file (to run MD5SUM) before burning - faster alternative
Burn the calculated image of 'sparse' footage.mov into a cdr
Calculate image & burn 'sparse' compressed & split footage.mov onto a cdr
-
Create & compress an archive of assets with 'sparse' footage.mov
Blank DVD-RW if necessary
Size of BUP media
Split archive of footage_mov_over2G.star.gz for cdr / DVD / double-layer DVD
Burning DVDs with ready-made compressed & split archive
-
Unsplit compressed archive
Extract & 'un-archive' footage.star.gz
Restoring Cinelerra's xml using 'vi' and 'substitute'
Backing Up Assets (BUP)
- Compression of films for back-up & storage, as opposed to compression for download & viewing, is poorly documented. Likewise, splitting (un)compressed archives of films (or footage captured under Cinelerra) across several cdrs/dvds and building the iso images is difficult to do without corrupting the contents; this is due to the recurring EOF (end of file) error on sparse files
- The approach described below is not quite viable yet on large scale productions because of extra drive space required for temporarily building archives & iso images. It should be possible to work out a script to retrieve data from the standardised layout of a film project's tree and back-up all assets & master releases (regardless of compression & format used) in a couple of operations.
- Since a typical 26min documentary project may involve some 400 - 500 files, one needs to work out an nle administration strategy before actually starting to capture & edit
- I plan to write a series of scripts to do all that - see blog - section TODO scripts
- The BUP path described below is likely to change as soon as I can get my hands on Win & Mac nle-dedicated machines. The point is: since you want/need to back up an xyz-month film project, you might just as well do it in formats compatible with the assets & masters used not only under Cinelerra but also recognised by Avid DV Express & Final Cut
- Working freelance, the idea is to carry out cost-effective nle on a reliable linux box and final film & sound mastering in whatever format & piece of software the local film production house demands in terms of Tv industry standards
Sparse files
- most so-called raw footage - i.e. captured without compression - cannot be compressed like any 'usual file'
- traditional tar -cf followed by gzip -9 will generate an EOF error (end of file, zeros which should not be ignored during compression)
- mpeg encoded films are not subject to this kind of EOF error
- you won't notice an EOF error when checking a film.tar.gz locally either, but you can create its cdr/dvd image and mount it to check whether your compression was successful, or use the command md5sum to check integrity (see how below)
- you may burn the image of a non-compressed film_dvc.mov onto a cdr or dvd without EOF, but you cannot burn the image of a compressed film_dvc.mov without first calculating its so-called print-size with mkisofs; that value is then fed to the tsize option when actually burning the image
- info tar says 'A file is sparse if it contains blocks of zeros whose existence is recorded, but that have no space allocated on disk.'
- info star says 'Gnu tar often creates tar archives with incorrect logical EOF marks. The standard requires two blocks that are completely zeroed, whereas gnutar often only adds one of them.'
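To see whether a given file really is sparse in the sense quoted from info tar, one can compare its apparent size with the space actually allocated on disk. This is only a minimal sketch, assuming GNU stat; is_sparse is a made-up helper name, and a compressing file system could give a false positive.

```shell
# is_sparse FILE: succeeds (exit 0) when FILE occupies less disk space
# than its apparent size, i.e. when it contains unallocated "holes".
# Assumption: GNU stat, where %s = size in bytes, %b = allocated blocks
# and %B = size in bytes of each block counted by %b.
is_sparse() {
    info=$(stat -c '%s %b %B' "$1") || return 2
    set -- $info
    # allocated bytes ($2 * $3) smaller than apparent size ($1)?
    [ $(( $2 * $3 )) -lt "$1" ]
}
```

Typical use would be: is_sparse footage.mov && echo "footage.mov is sparse"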
Compression issues
- video footage which has already been strongly compressed for screening can hardly be compressed more: examples (using tar -cf followed by gzip -9)
- star compress-program=gzip can shrink size of footage to approx. 56% of its original size, but only on rendered films (see 26min documentary example below), not captured footage (perhaps because captured raw footage is usually short time-wise and therefore it's too small to have much redundant data). In short, we can't expect to reduce significantly the size of assets captured under Cinelerra - this is bad news when backing up...
- examples (using star -compress-program=gzip)
- out of pure curiosity: should raw_footage be converted to another format before editing, so that assets shrink better?
(of course, don't do that if you want to preserve quality) - practical implementation?
tests on mixed assets (dvc, mpeg2, divx5 etc, i.e. sparse and non-sparse files)
  54M film_26min_divx5.avi
  52M film_26min_divx5_avi.tar.gz
 544M film_26min.mpg
 526M film_26min_mpg.tar.gz

finding optimum codec & format for both capture & bup
-----------------------------------------------------
(by default Cinelerra captures raw footage encapsulated in quicktime for linux)
tests on both YUV & RGB colour models give similar results
 114M raw_footage_30sec_captured_by_cinelerra.mov
 105M raw_footage_30sec_captured_by_cinelerra_mov.star.gz
tests on YUV colour model only
 138M rendered_raw_footage_30sec_mov_to_jpeg.mov
 129M rendered_raw_footage_30sec_mov_to_jpeg.mov.star.gz
tests on YUV colour model only
 154M rendered_raw_footage_30sec_mov_to_jpeg.avi
 147M rendered_raw_footage_30sec_mov_to_jpeg.avi.star.gz
tests on YUV colour model only
 114M rendered_raw_footage_30sec_mov_to_dvc.avi
  70M rendered_raw_footage_30sec_mov_to_dvc.avi.star.gz

finding optimum codec & format for master to be bup
---------------------------------------------------
tests on RGB colour model only (but will test on YUV too)
 4.6G rendered_film_26min_dvc_2_complements.mov
 2.6G rendered_film_26min_dvc_2_complements_mov.star.gz
test on RGB colour model only (but will test on YUV too)
 5.8G rendered_film2_dvc_2_complements.mov
 5.0G rendered_film2_dvc_2_complements_mov.star.gz

29.0G all_assets_any_codec.any_format
27.0G all_assets_any_codec.any_format.star.gz
Compression opportunities - summary
- it seems more efficient to archive & compress masters & releases separately; reducing the size of original assets is virtually hopeless
- use jpeg codec and 2 complements encapsulated into mov for maximum quality (as cinelerra tutorial / secrets recommends)
- use jpeg codec and 2 complements encapsulated into avi for maximum compatibility (I still have to check that under avid & final cut)
- use
star -c all_assets_any_codec.any_format f=all_assets_any_codec.any_format.star
Life span of backups
- it is quite frequent to have the tracks of a home-made dvd not recognised by a consumer dvd player hardly 4-6 months after it was burnt; loss is (fortunately) less noticeable when read by an internal dvd unit
- life span of cdrs seems to be longer than that of dvds, perhaps because cdrs around 1997-98 were burnt at double speed; time will tell whether dvds are more trustworthy
- rule of thumb:
a slow burning speed on a cdr or a DVD favours long-lasting legibility, while the higher speeds & track densities of dvds probably produce less legible tracks which will fade away anyway - buy quality cdrs & DVDs for backups
Preliminaries:
- this is rather unlikely nowadays, but depending on kernel, OS and compiled software you may or may not be able to backup a file with a size exceeding 2GB onto another media, although you may not have noticed it earlier during editing or playback.
Check if your OS can handle file sizes over 2 Gigabytes
- create a dummy file > 2G
- run K3b > new dvd project
open (...)/home/tmp/dummy_file_over2G - if you can drag & drop, you're fine
if k3b doesn't display the contents of /home/tmp, or protests that you should add files to your project before burning the image, then you need to split footage_over2G.mov - note that when running growisofs from the command line you will actually be able to burn the dvd image of dummy_file_over2G, but on opening you'll get an EOF error
- in short: you may create an image & burn it regardless of the number of files & total size as long as none of these files exceeds 2G
dd if=/dev/zero of=/home/tmp/dummy_file_over2G bs=1MB count=2500
ls -sh
2,4G /home/tmp/dummy_file_over2G
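Before building an image it may help to list which files of a project actually exceed the 2G limit and would therefore need splitting. A small sketch only: list_over_2g is a made-up helper name, and the threshold assumes that the limit in question is the classic 2^31-byte one.

```shell
# list_over_2g DIR: print every regular file under DIR (default: current
# directory) whose size is 2^31 bytes or more.
# The 'c' suffix makes find count in bytes, so "+2147483647c" means
# "strictly more than 2147483647 bytes".
list_over_2g() {
    find "${1:-.}" -type f -size +2147483647c -print
}
```

For instance list_over_2g /home/tmp should print /home/tmp/dummy_file_over2G once the dummy file above has been created.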
Possible solutions
- switch from ext2 to ext3
(doesn't involve reformatting, only perhaps losing some journals...)
tune2fs -j /dev/hda#
- mount the file system temporarily
mount -t ext3
or replace ext2 by ext3 in your /etc/fstab
/dev/hda# /home ext3 defaults
- upgrade to a more recent kernel
Cdrecord config & basic commands
- check system config for cdrecord
cdrecord -scanbus
- run a test
cdrecord -v -dummy -dao -dev=0,0,0 sthg.iso
- blank cdrw if necessary
cdrecord -v -blank=fast -dev=0,0,0
Back-up strategy
Intermediate backup
- once editing is over and film versions are released for appraisal, either on the net or in various mpeg formats:
- create archive of assets (w/o compression) & copy to a different drive rather than burn onto dvdrw
- create archive, compress & split masters to burn onto re-writable dvds and check md5sum
- burn film in various formats for limited release
- generate an nle_film_project_report.txt (which consists mainly of ls -shR | cat > assets_film_project.txt and md5sum of masters)
- intermediate back-ups have really just one goal: a tmp bup in case of disk crash (which happens more often than you think - in my case ~ 12-15months per disk)
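The report mentioned in the list above could be produced along these lines. This is a sketch under assumptions: make_report and all file names are placeholders, and the layout of the report is my own guess at what "mainly ls -shR and md5sum of masters" would look like.

```shell
# make_report PROJECT_DIR REPORT MASTER...
# Writes a recursive size listing of PROJECT_DIR, followed by the md5sum
# of each MASTER file, into REPORT (overwriting it).
make_report() {
    project=$1
    report=$2
    shift 2
    {
        echo "== listing of $project =="
        ls -shR "$project"
        echo "== md5sum of masters =="
        for m in "$@"; do
            md5sum "$m"
        done
    } > "$report"
}
```

Keep the report outside the project directory, otherwise it ends up in its own listing.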
Final backup
- following feedback from limited release, add suggested or necessary changes
- update files in uncompressed archive of assets (i.e. overwrite old versions), split and create image.iso for dvds
- unsplit compressed archive of masters, add new compressed masters to existing compressed archive (i.e. incremental, but keep old versions), split and create image.iso for cdrs or dvds
- update nle_film_project_report.txt (and possibly generate an html version with 'txt2html')
- burn all image.iso for dvds & cdrs
- blank re-writable dvds with masters from intermediate back-ups
Remarks
- if a rough selection of footage has been done before capturing to hd, expect about 20G of assets for a 26min documentary film (shooting ratio 1:8)
- make provision for another 10G for various masters & releases
- the back-up itself needs a lot of working space, because it is undertaken in 3 different steps: compression of archives (the gain depends on format, from barely 2% up to 60%; approx. 20+10*60%, more or less 18-20G), splitting archives into chunks of 1.1G (another 20G) & creating iso images of the split compressed archives (another 20G). You may therefore need at least 60G for back-up purposes alone - this is not satisfactory, but it's the only way I have found so far
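Since the working space adds up quickly, it seems sensible to check the free space on the scratch drive before starting. A rough sketch: have_space is a made-up helper name and the 60G figure is simply the estimate given above.

```shell
# have_space DIR NEEDED_GB: succeeds when the file system holding DIR has
# at least NEEDED_GB gigabytes (units of 1024*1024 KB) available.
# df -Pk prints one POSIX-format line per file system; the 4th column of
# the 2nd line is the available space in 1024-byte blocks.
have_space() {
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}
```

For instance: have_space /home/films_to_compress 60 || echo "not enough room for bup"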
Archiving sparse footage.mov (eg. assets under Cinelerra) onto a cdr
- this is hardly worth it, because you probably want to back up onto a dvd, but it is provided as a generic way for paranoiacs who trust cdrs more than dvds
Check file's attributes before compression (MD5SUM)
- keep this footage_mov_md5sum file until the end of the compression & decompression cycle, as you will need it for the integrity check, i.e. to confirm that no EOF error has occurred
md5sum footage.mov | cat > /home/films_to_compress/footage_mov_md5sum
Compress 'sparse' films_with_any_codec.any_format
- tar and gnutar with sparse option don't seem to work in the context of sparse footage encapsulated in Quicktime for Linux
- 'star' seems to include some sort of 'sparse' option by default, and is more powerful too (POSIX compliant, etc) - read man or info star
- using the -sparse option in the command line results in virtually no compression, while its absence doesn't seem to generate EOF errors
- note that listing the contents of compressed archives doesn't seem to be possible: you have to extract the whole archive, so you might be better off doing
ls -shR | cat > ls_bup.txt_not_included_in_compression
cat ls_bup.txt_not_included_in_compression >> nle_film_project_report.txt
star -v compress-program=gzip -c films_with_any_codec.any_format f=films_with_any_codec.any_format.star.gz
Compress & create image of a 'sparse' footage.mov for a 650Mb cdr in one go
star compress-program=gzip -v -c footage.mov | mkisofs -v -o footage_mov.iso -stream-media-size 333000
Mount image (to run MD5SUM before burning)
mount footage_mov.iso -r -t iso9660 -o loop /mnt
ls -shR /mnt
Extract file (to run MD5SUM before burning)
cp /mnt/stream.img /home/films_to_extract/
cd /home/films_to_extract/
star -v -x -f=stream.img
Extract file (to run MD5SUM before burning) - faster alternative
star -v -zxp -C /home/films_to_extract/ f=/mnt/stream.img
md5sum footage.mov | cat >> /home/films_to_compress/footage_mov_md5sum
vi /home/films_to_compress/footage_mov_md5sum
diff md5sum_footage.mov_before_compression.txt md5sum_footage.mov_after_compression.txt
umount /mnt
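Rather than comparing the two checksums by eye in vi, md5sum can do the comparison itself with its -c option. A minimal sketch, assuming GNU md5sum; record_sum and verify_sum are made-up helper names and the paths in the usage lines are the ones used above.

```shell
# record_sum FILE SUMFILE: store the checksum of FILE before compression.
# The checksum is taken from inside FILE's directory so that SUMFILE
# contains a bare file name, not a path.
record_sum() {
    ( cd "$(dirname "$1")" && md5sum "$(basename "$1")" ) > "$2"
}

# verify_sum DIR SUMFILE: after extraction into DIR, recompute and compare.
# SUMFILE must be given as an absolute path. md5sum -c exits non-zero
# when a listed file is missing or its checksum differs.
verify_sum() {
    ( cd "$1" && md5sum -c "$2" )
}
```

Usage would be something like:
record_sum /home/films_to_compress/footage.mov /home/films_to_compress/footage_mov_md5sum
verify_sum /home/films_to_extract /home/films_to_compress/footage_mov_md5sum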
Burn the calculated image of a 'sparse' footage.mov onto a cdr
- running K3b seems to work fine (without EOF error) on small size files (perhaps due to max 2GB limit)
- alternatively, use image.iso calculated with stream-media-size option set for a 650M cdr
cdrecord -v -dao -dev=0,0,0 footage_mov.iso
Calculate image & burn 'sparse' compressed & split footage.mov onto a cdr
- this could be useful when your split archive (total size 8.98G) fits for instance on 2 dvds (6x1.43G) and one cdr (1x400M)
- note that some file names may have been slightly truncated - md5sum should nonetheless give the same attributes; this means that when copying from cdr back to hd for a restore operation, you may have to rename some files for the 'unsplit' process to succeed
mkisofs -print-size footage_mov.star.gz
feed the output of print-size, followed by 's', to tsize in the command below, e.g. tsize=65796s
('s' is compulsory)
cdrecord -v -dao -dev=0,0,0 footage_mov.star.gz tsize=65796s
mkisofs -print-size part7_out_of_7_parts_films_mov.star.gz.split
cdrecord -v -dao -dev=0,0,0 \
 part7_out_of_7_parts_films_mov.star.gz.split.iso \
 -tsize=409600s
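The value printed by mkisofs -print-size is a number of 2048-byte sectors, which is why the 's' suffix is needed after it. The two helpers below (made-up names, a sketch only) format the argument and convert the sector count to megabytes, so you can check that the image fits the medium before burning.

```shell
# tsize_arg SECTORS: format a sector count as cdrecord's tsize argument.
# The trailing 's' marks the number as a count of 2048-byte sectors.
tsize_arg() {
    printf 'tsize=%ss\n' "$1"
}

# sectors_to_mb SECTORS: size of the image in megabytes of 1024*1024
# bytes, rounded up to the next whole megabyte.
sectors_to_mb() {
    echo $(( ($1 * 2048 + 1048575) / 1048576 ))
}
```

With the example above, tsize_arg 65796 prints tsize=65796s and sectors_to_mb 65796 prints 129.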
growisofs -Z /dev/dvd -r \
 -l part1_out_of_7_parts_films_mov.star.gz.split part2_out_of_7_parts_films_mov.star.gz.split (...)
Archiving footage.mov (eg. assets under Cinelerra) onto a dvd
- this section is probably what you will need most (although the life span of home-made dvds is still questionable)
Create and compress an archive of assets with 'sparse' footage.mov
star compress-program=gzip -v -c footage.mov f=footage_mov.star.gz
Blank DVD-RW if necessary
growisofs -Z /dev/scd0=/dev/zero
dvd+rw-format -format=blank /dev/dvd
Size of bup media
- use the table below as a source of inspiration to avoid wasting space when backing up on cdr / dvd / double-layer dvd
size of file     possible #    MB * 1024 =     # of media
or directory     of parts      stream media    (cdr - dvd -
                               size option     2-layer dvd)
----------------------------------------------------------------
100 - 650MB      1 X 650M      665600          1 X cdr
800 - 1950MB     2 X 2G        1996800         1 X dvd
2GB - 4.3GB      3 X 1.43G     1467392         1 X dvd
----------------------------------------------------------------
4.3 - 7.6GB      6 X 1.1G      1126400         2 X dvds
(if OS accepts only < 2G file size)
----------------------------------------------------------------
4.3 - 7.6GB      2 X 3.8G      3891200         1 X 2-layer dvd
(if OS accepts > 2G file size)
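The third column of the table is just the size of one part in MB multiplied by 1024, and the number of parts is the total size divided by the part size, rounded up. A small sketch of that arithmetic in shell; part_length and parts_needed are made-up helper names.

```shell
# part_length MB: size of one part in units of 1024 bytes, as used in the
# "MB * 1024" column of the table (and by tar -L).
part_length() {
    echo $(( $1 * 1024 ))
}

# parts_needed TOTAL_MB PART_MB: number of parts, rounded up so that the
# last, shorter part is counted too.
parts_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}
```

For example, part_length 1100 prints 1126400, and parts_needed 6600 1100 prints 6.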
Split archive footage_mov_over2G.star.gz for cdr / DVD / double-layer DVD
- watch out
- suffix is irrelevant; the whole file name is only there for better legibility - in practice you may just as well use part1 - part2 - etc
- if you give a path, as in
--file=/home/split_parts/01split_footage_mov_over2G.star.gz
other parts will be in the same directory you're running the command from
- the length of each part is set with
-L 1126400
(in units of 1024 bytes, i.e. 1.1G here - see table above)
- 1. create the first part 01split_footage_mov_over2G.star.gz
tar -c -M -L 1126400 --file=01split_footage_mov_over2G.star.gz footage_mov_over2G.star.gz
- after creating 01split_footage_mov_over2G.star.gz, you will get the following prompt
'Prepare volume #2 for 01split_footage_mov_over2G.star.gz and hit return'
- 2. type n (new name), the name of your second part and hit 'enter'
n 02split_footage_mov_over2G.star.gz
- you will get a confirmation prompt for the name of the next part
'Prepare volume #2 for 02split_footage_mov_over2G.star.gz and hit return'
- hit 'enter'
- after creating 02split_footage_mov_over2G.star.gz, you will get the following prompt
'Prepare volume #3 for 02split_footage_mov_over2G.star.gz and hit return'
- 3. type n (new name), the name of your third part and hit 'enter'
n 03split_footage_mov_over2G.star.gz
- you will get a confirmation prompt for the name of the next part
'Prepare volume #3 for 03split_footage_mov_over2G.star.gz and hit return'
- hit 'enter' (...)
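If answering the tar -M prompts by hand gets tedious, a non-interactive alternative is to cut the archive with split and glue it back with cat. Note that this is a different technique from the multi-volume tar used above, not a scripted version of it. A sketch assuming GNU split (for the -d option); split_archive and join_archive are made-up helper names.

```shell
# split_archive FILE SIZE PREFIX: cut FILE into pieces of at most SIZE
# (e.g. 1100M, or a plain number of bytes) named PREFIX00, PREFIX01, ...
# -d asks for numeric suffixes, -a 2 for two digits.
split_archive() {
    split -b "$2" -d -a 2 "$1" "$3"
}

# join_archive PREFIX OUT: concatenate the pieces back in name order.
join_archive() {
    cat "$1"[0-9][0-9] > "$2"
}
```

For instance split_archive footage_mov_over2G.star.gz 1100M part and later join_archive part footage_mov_over2G.star.gz; run md5sum on the result to make sure nothing was lost on the way.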
Burning DVDs with ready-made compressed and split archive
- in the example given above, each part is 1.1GB, so you may nicely fit 4 parts onto one dvd
growisofs -Z /dev/dvd -r \
 -l contents_not_to_be_compressed.txt 01split_footage_mov_over2G.star.gz 02split_footage_mov_over2G.star.gz (...)
Restoring assets.star.gz from cdr / DVD
Unsplit compressed archive
- 1. extract the first part (be careful to use the exact name of the original, unsplit file!)
tar -x -M \
 -f 01split_footage_mov_over2G.star.gz /home/assets_to_extract/footage_mov_over2G.star.gz
- 2. you will get the following prompt
'Prepare volume #2 for 01split_footage_mov_over2G.star.gz and hit return'
- type n (new name), the name of your second part and hit 'enter'
n 02split_footage_mov_over2G.star.gz
- accept the confirmation prompt
'Prepare volume #2 for 02split_footage_mov_over2G.star.gz and hit return'
- hit 'enter'
- 3. you will get the next prompt
'Prepare volume #3 for 02split_footage_mov_over2G.star.gz and hit return'
- type n (new name), the name of your third part and hit 'enter'
n 03split_footage_mov_over2G.star.gz
- accept the confirmation prompt
'Prepare volume #3 for 03split_footage_mov_over2G.star.gz and hit return'
- hit 'enter' (...)
Extract & 'un-archive' footage.star.gz
star -v -zxp \
 -C /home/assets_restored/ \
 f=/home/assets_to_extract/footage_mov_over2G.star.gz
Restoring Cinelerra's xml editing list
- after changing drives, or after a restore operation of all assets, you need to update the path of the /film/soundtrack/*.wav files
/mnt/old_drive/film/soundtrack/*wav
/mnt/new_drive/restored_film/soundtrack/*wav
Solution using 'vi' and 'substitute'
- run vi (or any editor like emacs) to rebuild index (i.e. replace path) after changing hd
- alternative: right click in media box on given file for info & re-enter manually the new path
(valid solution if only a few assets need to be re-indexed)
vi film_project.xml
:% substitute/\/mnt\/old_drive\/old_directory\//\/mnt\/new_drive\/new_directory\//g
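The same substitution can be done without opening an editor, which helps when several project files need re-indexing. A sketch using sed; repath is a made-up helper name. Using '|' as delimiter avoids escaping every slash, but it means the two prefixes must not contain '|', '&' or regular expression metacharacters.

```shell
# repath PROJECT_XML OLD_PREFIX NEW_PREFIX: replace every occurrence of
# OLD_PREFIX by NEW_PREFIX in PROJECT_XML, keeping the untouched original
# as PROJECT_XML.bak.
repath() {
    sed -i.bak "s|$2|$3|g" "$1"
}
```

For instance: repath film_project.xml /mnt/old_drive/old_directory/ /mnt/new_drive/new_directory/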
Your beloved commands for BUP
- this is a list of all commands used above
md5sum films.* | cat > /films_to_compress/films_md5sum_log.txt
star compress-program=gzip -c films f=films.star.gz
mkisofs -print-size films.star.gz [??????]
cdrecord -v -dao -dev=0,0,0 footage_mov.star.gz tsize=??????s
star compress-program=gzip -c films | mkisofs -v -o films.iso -stream-media-size 333000
mount films.iso -r -t iso9660 -o loop /mnt
cdrecord -v -dao -dev=0,0,0 films.iso
star -zxp -C /home/films_to_extract/ f=/mnt/stream.img
star compress-program=gzip -c films f=films.star.gz
run bc -l then ###MB * 1024 = tsize = -L [??????]
tar -c -M -L ?????? --file=part1_films.star.gz.split films.star.gz
growisofs -Z /dev/dvd -r -l *log.txt part1_films.star.gz.split part2_films.star.gz.split (...)
tar -x -M -f part1_films.star.gz.split /assets_to_extract/films.star.gz
star -v -zxp -C /assets_restored/ f=/assets_to_extract/films.star.gz
cdrecord -scanbus [?,?,?]
cdrecord -v -dummy -dao -dev=?,?,? sthg.iso
cdrecord -v -blank=fast|all -dev=0,0,0
growisofs -Z /dev/scd0=/dev/zero
dvd+rw-format -format=blank /dev/dvd
:% substitute/\/mnt\/old_drive\/old_directory\//\/mnt\/new_drive\/new_directory\//g