bzip2, bunzip2 – a block-sorting file compressor, v1.0.4

Connected From This

A UNIX Command

$wget http://www.kernel.org/pub/linux/kernel/v3.0/patch-3.0.4.bz2
--2011-09-03 19:22:43--  http://www.kernel.org/pub/linux/kernel/v3.0/patch-3.0.4.bz2
Resolving www.kernel.org... 130.239.17.5, 149.20.4.69, 199.6.1.165, ...
Connecting to www.kernel.org|130.239.17.5|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 96120 (94K) [application/x-bzip2]
Saving to: `patch-3.0.4.bz2'

100%[===================================================================================>] 96,120      85.3K/s   in 1.1s

2011-09-03 19:22:45 (85.3 KB/s) - `patch-3.0.4.bz2' saved [96120/96120]

$bunzip2 patch-3.0.4.bz2
$ls patch-3.0.4
patch-3.0.4
$


UNIX Explanation

bzip2 compresses files using the Burrows-Wheeler block
sorting text compression algorithm, and Huffman coding.
Compression is generally considerably better than that
achieved by more conventional LZ77/LZ78-based
compressors, and approaches the performance of the PPM
family of statistical compressors.
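
To make the block-sorting idea concrete, here is a minimal sketch of a naive Burrows-Wheeler transform. It is not how bzip2 implements its sort (bzip2 uses far more efficient sorting); the names bwt_naive and cmp_rotations are made up for this illustration. It sorts every rotation of the input and keeps the last column, which tends to group identical bytes together so that later stages such as Huffman coding have more to work with.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shared state for the qsort comparator. This keeps the sketch short,
 * but it also makes it non-reentrant and not thread-safe. */
static const unsigned char *bwt_text;
static size_t bwt_len;

/* Compare two rotations of bwt_text, identified by their start offsets. */
static int cmp_rotations(const void *a, const void *b)
{
    size_t i = *(const size_t *)a;
    size_t j = *(const size_t *)b;
    for (size_t k = 0; k < bwt_len; k++) {
        unsigned char ci = bwt_text[(i + k) % bwt_len];
        unsigned char cj = bwt_text[(j + k) % bwt_len];
        if (ci != cj)
            return (ci < cj) ? -1 : 1;
    }
    return 0; /* identical rotations (periodic input) */
}

/* Naive Burrows-Wheeler transform (hypothetical helper, O(n^2 log n)).
 * Writes the last column of the sorted rotation matrix into out
 * (n bytes, no terminator added). Returns the row at which the original
 * string appears, which an inverse transform would need, or -1 if the
 * allocation fails. */
long bwt_naive(const unsigned char *in, size_t n, unsigned char *out)
{
    assert(n > 0);
    size_t *rot = malloc(n * sizeof *rot);
    if (rot == NULL)
        return -1;

    for (size_t i = 0; i < n; i++)
        rot[i] = i;

    bwt_text = in;
    bwt_len = n;
    qsort(rot, n, sizeof *rot, cmp_rotations);

    long primary = -1;
    for (size_t i = 0; i < n; i++) {
        /* Last byte of the rotation that starts at rot[i]. */
        out[i] = in[(rot[i] + n - 1) % n];
        if (rot[i] == 0)
            primary = (long)i;
    }
    free(rot);
    return primary;
}
```

Calling bwt_naive on the six bytes of "banana" yields the last column "nnbaaa" and row index 3: the three a's and the two n's end up adjacent, which is the kind of clustering the later coding stages exploit.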

Related Source Code Exposition


for (t = 0; t < nGroups; t++) {
   minLen = 32;
   maxLen = 0;
   for (i = 0; i < alphaSize; i++) {
      if (s->len[t][i] > maxLen) maxLen = s->len[t][i];
      if (s->len[t][i] < minLen) minLen = s->len[t][i];
   }
   BZ2_hbCreateDecodeTables (
      &(s->limit[t][0]),
      &(s->base[t][0]),
      &(s->perm[t][0]),
      &(s->len[t][0]),
      minLen, maxLen, alphaSize
   );
   s->minLens[t] = minLen;
}

Source Code Highlight

Create the Huffman decoding tables
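
As a rough illustration of what "decoding tables" built from code lengths can look like, here is a simplified canonical Huffman sketch. It is not the body of BZ2_hbCreateDecodeTables, and its table layout differs from bzip2's limit/base/perm arrays. DecodeTable, build_decode_table, decode_symbol, MAX_LEN and MAX_ALPHA are names and limits assumed for this example only, and the bit stream is represented as a string of '0' and '1' characters to keep the sketch readable.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEN   20   /* assumed longest code length for this sketch */
#define MAX_ALPHA 258  /* assumed largest alphabet for this sketch */

typedef struct {
    int min_len, max_len;
    int count[MAX_LEN + 1];       /* number of symbols per code length */
    int first_code[MAX_LEN + 1];  /* canonical code of the first symbol of each length */
    int first_index[MAX_LEN + 1]; /* position in perm of that first symbol */
    int perm[MAX_ALPHA];          /* symbols ordered by (length, symbol value) */
} DecodeTable;

/* Build the tables from per-symbol code lengths (each in 1..MAX_LEN). */
void build_decode_table(DecodeTable *t, const unsigned char *len, int alpha_size)
{
    assert(alpha_size > 0 && alpha_size <= MAX_ALPHA);

    /* Same min/max scan as the loop in the excerpt above. */
    t->min_len = MAX_LEN;
    t->max_len = 0;
    for (int l = 0; l <= MAX_LEN; l++)
        t->count[l] = 0;
    for (int i = 0; i < alpha_size; i++) {
        assert(len[i] >= 1 && len[i] <= MAX_LEN);
        if (len[i] > t->max_len) t->max_len = len[i];
        if (len[i] < t->min_len) t->min_len = len[i];
        t->count[len[i]]++;
    }

    /* Order symbols by increasing code length, ties by symbol value. */
    int pp = 0;
    for (int l = t->min_len; l <= t->max_len; l++)
        for (int i = 0; i < alpha_size; i++)
            if (len[i] == l)
                t->perm[pp++] = i;

    /* Assign canonical codes: consecutive values within one length,
     * shifted left by one bit when moving to the next length. */
    int code = 0, index = 0;
    for (int l = t->min_len; l <= t->max_len; l++) {
        t->first_code[l] = code;
        t->first_index[l] = index;
        code = (code + t->count[l]) << 1;
        index += t->count[l];
    }
}

/* Decode one symbol starting at bits[*pos]; advances *pos on success.
 * Returns the symbol, or -1 if no complete code could be read. */
int decode_symbol(const DecodeTable *t, const char *bits, size_t *pos)
{
    int code = 0;
    int l = 0;
    while (l < t->max_len && bits[*pos + l] != '\0') {
        code = (code << 1) | (bits[*pos + l] - '0');
        l++;
        if (l >= t->min_len) {
            int offset = code - t->first_code[l];
            if (offset >= 0 && offset < t->count[l]) {
                *pos += (size_t)l;
                return t->perm[t->first_index[l] + offset];
            }
        }
    }
    return -1;
}
```

With code lengths {2, 1, 3, 3} for symbols 0 to 3, the canonical codes come out as 10, 0, 110 and 111, so the bit string "0101101110" decodes to the symbols 1, 0, 2, 3, 1. Knowing the minimum length lets the decoder skip lengths that cannot match, which is presumably why the excerpt stores minLen per table.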

Related Knowledge

bzip2 expects a list of file names to accompany the
command-line flags. Each file is replaced by a
compressed version of itself, with the name
"original_name.bz2". Each compressed file has the same
modification date, permissions, and, when possible,
ownership as the corresponding original, so that these
properties can be correctly restored at decompression
time. File name handling is naive in the sense that
there is no mechanism for preserving original file names,
permissions, ownerships or dates in filesystems which
lack these concepts, or have serious file name length
restrictions, such as MS-DOS.
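
The naming convention can be sketched in a few lines of C. This is not bzip2's own code: compressed_name and decompressed_name are hypothetical helpers, and the decompression suffix rules (.bz2 and .bz are stripped, .tbz2 and .tbz become .tar, anything else gets .out appended) are taken from the same manual page rather than from the passage quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Name of the compressed file: "original_name.bz2".
 * Returns 0 on success, -1 if the result does not fit in out. */
int compressed_name(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL);
    int n = snprintf(out, out_size, "%s.bz2", name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Guess the decompressed file name from the compressed one.
 * Returns 0 on success, -1 if the result does not fit in out. */
int decompressed_name(const char *name, char *out, size_t out_size)
{
    static const struct {
        const char *suffix;      /* suffix to look for */
        const char *replacement; /* what to put in its place */
    } rules[] = {
        { ".bz2",  ""     },
        { ".bz",   ""     },
        { ".tbz2", ".tar" },
        { ".tbz",  ".tar" },
    };
    assert(name != NULL && out != NULL);

    size_t len = strlen(name);
    int n;
    for (size_t r = 0; r < sizeof rules / sizeof rules[0]; r++) {
        size_t slen = strlen(rules[r].suffix);
        if (len > slen && strcmp(name + len - slen, rules[r].suffix) == 0) {
            /* Copy everything before the suffix, then the replacement. */
            n = snprintf(out, out_size, "%.*s%s",
                         (int)(len - slen), name, rules[r].replacement);
            return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
        }
    }
    /* No known suffix: append ".out" rather than guess. */
    n = snprintf(out, out_size, "%s.out", name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Applied to the session at the top of this page, decompressed_name maps "patch-3.0.4.bz2" to "patch-3.0.4", matching what bunzip2 left on disk. The original name is recovered purely from the suffix, which is the sense in which the handling is "naive": nothing about the original file name is stored inside the compressed file.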

Source: Debian manual pages for bzip2
