----------- Introduction, copyright, license

debdelta is an application suite designed to compute changes between
Debian packages. These changes (deltas) are similar to the output of
the "diff" program in that they may be used to store and transmit only
the changes between Debian packages.  This suite contains
'debdelta-upgrade', which downloads debdeltas and uses them to create
all Debian packages needed for an 'apt-get upgrade'.

debdelta is  Copyright (C) 2006 Andrea Mennucci

debdelta is free software.  See the file COPYING for copying conditions.


debdelta uses 'minigzip', which is a simplified version of
  /usr/share/doc/zlib1g-dev/examples/minigzip.c.gz

minigzip is released under a permissive license
(see minigzip.c or debian/copyright).

minigzip is Copyright (C) 1995-2002 Jean-loup Gailly

---------- Description

The debdelta application suite is composed of several applications.

Warning: all these applications are still at a beta stage; do not run them as
the 'root' user.

---- debdelta

'debdelta'  computes the delta, that is, a file that encodes the difference
between two Debian packages.

Example:

$ a=/var/cache/apt/archives 
$ debdelta -v $a/emacs-snapshot-common_1%3a20060512-1_all.deb $a/emacs-snapshot-common_1%3a20060518-1_all.deb /tmp/emacs.debdelta

the result is:
 deb delta is  12.5% of deb ; that is, 15452kB would be saved

---- debpatch

'debpatch' can use the delta file and a copy of the old Debian package to
recreate the new Debian package. If the old Debian package is not available,
but is installed on the host, it can use the installed data; in this
case, '/' is used in lieu of the old .deb.

Example:

$ debpatch  /tmp/emacs.debdelta / /tmp/emacs.deb

---- debdeltas

'debdeltas' can be used to generate deltas for many debs at once.
It will generate delta files with names such as
 package_old-version_new-version_architecture.debdelta
and put them in the directory where the new .deb is.

If the delta exceeds ~70% of the deb, 'debdeltas' will delete it
and leave a stamp of the form
 package_old-version_new-version_architecture.debdelta-too-big

Example:

$ debdeltas /var/cache/apt/archives/*deb

With the --dir argument, 'debdeltas' will put the deltas in a different tree
(this is necessary if you use 'debmirror', since 'debmirror' will
 delete any file that it does not recognize).

Example:

$ m=where_your_mirror_is
$ d=where_to_put_deltas
$ cd $m
$ find pool -type d -mtime -1 | xargs -r debdeltas --dir $d//

The // means that the pool directory tree will be mimicked in the deltas
directory tree.
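
The naming scheme and the ~70% threshold described above can be
illustrated with a short Python sketch. This is not the actual
'debdeltas' code: the function names are hypothetical, the threshold is
taken as exactly 0.70, and it is assumed that the delta is compared
against the size of the new .deb.

```python
# Hypothetical helpers, for illustration only (not debdelta's own code).

TOO_BIG_RATIO = 0.70  # assumption: the "~70%" threshold, taken as exact


def delta_name(package, old_version, new_version, architecture):
    """Build a file name of the form
    package_old-version_new-version_architecture.debdelta"""
    return f"{package}_{old_version}_{new_version}_{architecture}.debdelta"


def stamp_name(package, old_version, new_version, architecture):
    """Name of the stamp left behind when the delta is discarded."""
    return delta_name(package, old_version, new_version, architecture) + "-too-big"


def is_too_big(delta_size, deb_size, ratio=TOO_BIG_RATIO):
    """True if the delta exceeds the given fraction of the .deb size.
    Assumption: deb_size is the size of the new .deb."""
    return delta_size > ratio * deb_size


if __name__ == "__main__":
    # Sizes below are made up; both are in the same unit (kB).
    args = ("libc6", "2.3.6-9", "2.3.6-11", "i386")
    if is_too_big(delta_size=800, deb_size=1000):
        print("would leave stamp:", stamp_name(*args))
    else:
        print("would keep delta:", delta_name(*args))
```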

---- debdelta-upgrade

This command downloads the necessary deltas from my mirror
and uses them to create the debs for an 'apt-get upgrade'.

This is currently a hack; it should be replaced by an APT method
(this is work in progress), but for this I need help from the APT and
python-apt authors.

By default, debs are saved in /tmp/archives; do not forget to
move them to /var/cache/apt/archives.

-------------- Statistics

I am currently running 'debdeltas' on a mirror that mirrors
'etch' and 'sid' for i386.

Statistics are at
http://tonelli.sns.it/pub/mennucc1/debdelta/histograms
and daily logs at
http://tonelli.sns.it/pub/mennucc1/debdelta/daily-logs/

Currently, on average, debdeltas are 28% of the original .deb
(28% is the ratio of the total size of the debdeltas created
 to the total size of the .debs processed).
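
As a small illustration of how such a figure is computed, the sketch
below takes the ratio of the totals, which is not the same as the
average of the per-package ratios. The function name and the sample
sizes are made up for the example; they are not taken from the
statistics above.

```python
# Illustrative sketch; the sizes below are invented, not real measurements.

def overall_ratio(pairs):
    """Ratio of total delta size to total .deb size.

    pairs: iterable of (delta_size, deb_size), both in the same unit.
    """
    pairs = list(pairs)
    total_delta = sum(delta for delta, _ in pairs)
    total_deb = sum(deb for _, deb in pairs)
    return total_delta / total_deb


def mean_of_ratios(pairs):
    """Plain average of the per-package ratios, shown for comparison."""
    pairs = list(pairs)
    return sum(delta / deb for delta, deb in pairs) / len(pairs)


if __name__ == "__main__":
    sample = [(10, 100), (900, 1000)]  # (delta kB, deb kB), made up
    print("ratio of totals: %.3f" % overall_ratio(sample))
    print("mean of ratios:  %.3f" % mean_of_ratios(sample))
```

Large packages weigh more in the ratio of totals, so it reflects the
bandwidth actually saved better than the plain average does.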

In some cases, though, the benefit of using deltas is much greater than that:
for example, 'debdelta' can express the difference between 'tetex-doc'
 3.0-17 and 3.0-18 in a delta of a mere 260kB!

------------- Tests and comparisons on backend binary delta difference compressors

Debdelta is just a wrapper that analyzes the .deb files and prepares
chunks of data to pass to a backend; the backend may be any binary
delta difference compressor program. Chunks are ~4MB in size, but
note that (currently) chunks must contain at least one file, so if the
original .deb contains large files, debdelta will generate large
chunks (example: the 'gcl' package contains a file of ~20MB size).
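
The effect of the "at least one file per chunk" rule can be sketched as
follows. This is only a guess at a possible greedy grouping, not the
algorithm that debdelta really uses; the function name, the exact limit
and the file list are assumptions made for the example.

```python
# Hypothetical greedy grouping; debdelta's real chunking may differ.

MB = 1024 * 1024
CHUNK_LIMIT = 4 * MB  # assumption: the "~4MB" figure, taken as exact


def group_into_chunks(files, limit=CHUNK_LIMIT):
    """Group (name, size) pairs into chunks of at most `limit` bytes.

    A file is never split, so a file larger than `limit` ends up alone
    in a chunk that exceeds the limit.
    """
    chunks = []
    current, current_size = [], 0
    for name, size in files:
        # Close the current chunk if adding this file would overflow it.
        if current and current_size + size > limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Made-up file list; "big" plays the role of a ~20MB file.
    files = [("a", 1 * MB), ("b", 2 * MB), ("big", 20 * MB), ("c", 1 * MB)]
    for chunk in group_into_chunks(files):
        print(chunk)
```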

Debdelta can use different programs as backend; I did many
tests, using a pool of 'difficult' debs; results are in
http://tonelli.sns.it/pub/mennucc1/debdelta/tests

So far I tested:

bdiff
http://www.webalice.it/g_pochini/bdiff/
Comment: It is extremely slow, and it performs badly.


rdiff 0.9.7-1 
(package in Debian)
Average size: 111%
Comment: On average, deltas are about 11% larger than the original .deb!
Although rsync is a wonderful technology, its rdiff counterpart is definitely
not usable.


zdelta 2.1
http://cis.poly.edu/zdelta/
Average size: 42%
Delta speed: 495
Patch speed: 828
Comment: it is outperformed by xdelta, both in speed and in size


xdelta3 30e
http://xdelta.sf.net
Comment: It is still too buggy; it crashes in many cases (with "assertion failed").


bsdiff 4.3-2   (tested with ~50MB RAM limit)
(package in Debian)
Average size: 26%
Delta speed: 97
Patch speed: 761
Comment: The king of delta compression; in my tests it achieves an
average of 26%, so it is the best choice for archiving. But it is
very slow, even when patching, so it may not be the best choice for
people who download deltas instead of .debs: if a user has a slow
CPU, patching may be so slow that the user would not gain any
benefit compared to simply downloading the full .deb. Moreover, delta
creation uses a LOT of memory (even though I try to use small chunks
of data), and is very slow on large files (i.e. when the .deb
contains large files); for example, it took 8 hours to create a
delta for the package gcl!


xdelta 1.1.3-6   (with bzip2 compression of deltas)
(package in Debian)
Average size: 33%
Delta speed: 592
Patch speed: 1602
Comment: It is so fast that it may be a good replacement for bsdiff
in many cases (since the difference in size is just an extra 7%); unfortunately
it does not (currently) work on 64-bit CPUs.


bdelta 0.1.0 
http://deltup.sf.net
Average size: 83%
Delta speed: 437
Patch speed: 1429
Comment: It was supposed to be the delta engine behind deltup (which is used
 in Gentoo), but it was left at a beta stage and does not perform well enough.


diffball 0.7.2
http://developer.berlios.de/projects/diffball/
Average size: 76%
Delta speed: 103
Patch speed: 1741
Comment: It does not perform well; moreover,
 like bsdiff, it can be extremely slow on large files.


Winners: bsdiff and xdelta


-------------- Speed

Warning: this section refers to experiments in which the backend for
delta encoding was 'xdelta'; currently the default backend is
'bsdiff', which is much slower; work is in progress to find a
compromise.

On a desktop with an Athlon64 3000 CPU and an average hard disk,
$ debdelta mozilla-browser_1.7.8-1sarge3_i386.deb  mozilla-browser_1.7.8-1sarge6_i386.deb /tmp/m-b.debdelta
processes the 10MB of mozilla-browser in ~11sec,
that is, at a speed of ~900kB per second.

Then debpatch applies the above delta in 16sec,
at a speed of ~600kB per second.

Numbers drop on an old PC, or on a notebook (like mine, which has a
1600MHz Athlon and slow disks), where data are chewed at ~200kB per
second. Still, since I have an ADSL line that downloads at
80kB per second at most, I benefit from downloading deltas.

In a theoretical example, downloading an 80MB package at 80kB/sec would
take 1000 seconds, whereas downloading a delta that is 20% of 80MB
takes 200 seconds, and applying it takes 80MB / (200kB/sec) = 400 seconds,
for a total of 600 seconds. So I may get a "virtual speed" of 80MB /
600sec = ~130kB/sec.
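
The arithmetic of the theoretical example above can be reproduced with
a few lines of Python. The function name is hypothetical, 1MB is
counted as 1000kB as in the text, and download and patching are assumed
to happen one after the other.

```python
# Worked version of the theoretical example (sequential download + patch).

def virtual_speed(deb_kb, delta_ratio, download_kbps, patch_kbps):
    """Return (total_seconds, virtual_speed_in_kB_per_sec).

    deb_kb        size of the full .deb, in kB
    delta_ratio   size of the delta as a fraction of the .deb
    download_kbps download speed, in kB/sec
    patch_kbps    patching speed, in kB of .deb produced per second
    """
    download_time = deb_kb * delta_ratio / download_kbps
    patch_time = deb_kb / patch_kbps
    total_time = download_time + patch_time
    return total_time, deb_kb / total_time


if __name__ == "__main__":
    # 80MB package, delta at 20%, 80kB/sec line, 200kB/sec patching.
    total, speed = virtual_speed(80000, 0.20, 80, 200)
    print("total time: %.0f sec" % total)          # about 600 sec
    print("virtual speed: %.0f kB/sec" % speed)    # about 133 kB/sec
    print("full download: %.0f sec" % (80000 / 80))
```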

Note that delta downloading and delta patching are done in parallel:
if 4 packages as above have to be downloaded, then the total time for
downloading the full debs would be 4000 seconds, while the time for
downloading and applying the deltas in parallel may be as low as
1400 seconds.

This is a real example of running 'debdelta-upgrade' :
 Looking for a delta for libc6 from 2.3.6-9 to 2.3.6-11
 Looking for a delta for udev from 0.092-2 to 0.093-1
 Patching done, time: 22sec, speed: 204kB/sec, result: libc6_2.3.6-11_i386.deb
 Patching done, time: 4sec, speed: 57kB/sec, result: udev_0.093-1_i386.deb
 Delta-upgrade download time 28sec speed 21.6k/sec
               total time: 53sec; virtual speed: 93.9k/sec.

(Note that the "virtual speed" of 93.9k/sec, while less than the
130kB/sec of the theoretical example above, is still more than the
80kB/sec that my ADSL line would allow).

Of course the above is even better for people with fast disks and/or
slow modems.

Actually, an APT delta method could make a smart decision about how many
deltas to download, and in which order, to optimize the result (given
the delta sizes, the package sizes, the downloading speed and the
patching speed).
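
A very simple form of such a decision might look like the sketch below.
It is only an illustration, not part of debdelta or APT: the function
name is hypothetical, and it ignores ordering and the overlap between
downloading and patching, comparing only the time to fetch the full
.deb with the time to fetch and then apply the delta.

```python
# Hypothetical decision rule; a real APT method would need to be smarter.

def delta_is_worthwhile(deb_kb, delta_kb, download_kbps, patch_kbps):
    """True if downloading and applying the delta is expected to be
    faster than downloading the full .deb (sizes in kB, speeds in kB/sec)."""
    full_time = deb_kb / download_kbps
    delta_time = delta_kb / download_kbps + deb_kb / patch_kbps
    return delta_time < full_time


if __name__ == "__main__":
    # Slow line (80kB/sec): the delta wins (600 sec against 1000 sec).
    print(delta_is_worthwhile(80000, 16000, 80, 200))
    # Fast line (1000kB/sec): the full .deb wins (80 sec against 416 sec).
    print(delta_is_worthwhile(80000, 16000, 1000, 200))
```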
