Universal lossy compression of analog sources - software
This webpage describes the software
used in our work on universal lossy compression of analog sources,
described in the following papers,
The software was implemented
based on an implementation by Shirin Jalali
of her joint work,
published in the following paper,
Below is a brief description of files used in our implementation.
The first files you should look at when trying to use this code
are those whose names start with "main"; in particular, start with
the simplified profiling wrapper described below.
Any comments will be appreciated.
Dror Baron, July 2011
Feel free to also browse through other
software packages developed by our group.
Dror, October 2011
count_update_all.m: this file evaluates all possible symbols and computes the
Gibbs distribution; it then generates a new symbol and updates the data
structures accordingly. Instead of working directly on the context-counts
data structure, only the locations of changes are saved; this
improves performance when the context count matrix is big.
This file accounts for most of the runtime of the lossy compressor,
and it was therefore heavily optimized. The code has
inevitably become more complicated, so special care was taken
to document it thoroughly.
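The resampling step described above can be sketched as follows. This is an illustrative Python sketch (the original code is MATLAB), and the zero "coding cost" term is a toy stand-in for the entropy change that the real code computes from the context counts; all names are hypothetical:

```python
import math
import random

def gibbs_resample(z, i, x, alphabet, beta, slope):
    """Resample z[i] from a Gibbs distribution over candidate symbols.

    Energy of candidate a = coding-cost proxy + slope * squared distortion
    against the source sample x[i]. The coding-cost term is a placeholder
    for the conditional-entropy change computed from context counts.
    """
    energies = []
    for a in alphabet:
        dist = (x[i] - a) ** 2   # distortion if z[i] were replaced by a
        cost = 0.0               # placeholder for the entropy term
        energies.append(cost + slope * dist)
    # Gibbs distribution: P(a) proportional to exp(-beta * E(a));
    # subtract the minimum energy for numerical stability.
    m = min(energies)
    weights = [math.exp(-beta * (e - m)) for e in energies]
    total = sum(weights)
    r = random.random() * total
    acc = 0.0
    for a, w in zip(alphabet, weights):
        acc += w
        if r <= acc:
            z[i] = a
            return z[i]
    z[i] = alphabet[-1]          # guard against floating-point round-off
    return z[i]
```

As beta grows, the sampler concentrates on the minimum-energy symbol, which matches the annealing behavior of this family of MCMC compressors.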
encoder_MCMC_03102011.m: this is the actual lossy compressor. It sets up data
structures for contexts, then runs d_update and count_update_all in each
iteration to update symbol counts, moments, entropy, and distortion.
I incorporated sanity checks that verify the integrity of various data
structures and computations. This implementation uses
count_update_all.m (above), which includes several accelerations.
main_Laplace04082011.m, main_AR04082011.m: these two files run the compressor
using multiple alphabet sizes and multiple RD slopes for Laplace and
autoregressive (AR) sources. For the most part they are wrapper files around
encoder_MCMC.m, which is the actual compressor. Each file generates a data
file and a plot that compares the rate distortion function, simulation
results, entropy coding, etc.
this is a wrapper file similar to main_Laplace04082011.m and
main_AR04082011.m (details above) but simplified. It generates
an input, compresses it, and invokes Matlab's profiler. This is the
first file you should look at when trying to use this package.
Other Matlab Scripts
describes the implementation.
alph2int.m: converts a string over a finite alphabet into an integer.
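A natural reading of such a conversion is base-|alphabet| positional notation. The sketch below is a hypothetical Python equivalent; the actual symbol ordering and indexing conventions in the MATLAB code may differ:

```python
def alph2int(symbols, alphabet_size):
    """Map a sequence of symbols in {0, ..., alphabet_size - 1} to a single
    integer via base-alphabet_size positional notation, first symbol most
    significant."""
    value = 0
    for s in symbols:
        if not 0 <= s < alphabet_size:
            raise ValueError("symbol out of alphabet range")
        value = value * alphabet_size + s
    return value
```

For example, with alphabet size 4 the string [1, 0, 2] maps to 1*16 + 0*4 + 2 = 18. This kind of mapping lets each context index a row of the context count matrix directly.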
bad_comparison.m: prints extensive diagnostics when the database loses its
integrity (useful for debugging).
count.m: creates depth-k context counts for a finite alphabet.
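The depth-k counting can be sketched in Python for illustration; note that the MATLAB code works with a context count matrix rather than the dictionary used here, so this is only a conceptual sketch:

```python
from collections import defaultdict

def context_counts(seq, k, alphabet_size):
    """Count how often each symbol follows each length-k context.

    seq: sequence of integer symbols in {0, ..., alphabet_size - 1}.
    Returns a dict mapping each context tuple to a list of per-symbol counts.
    """
    counts = defaultdict(lambda: [0] * alphabet_size)
    for i in range(k, len(seq)):
        ctx = tuple(seq[i - k:i])   # the k symbols preceding position i
        counts[ctx][seq[i]] += 1
    return counts
```

These counts are exactly the statistics from which the conditional empirical entropy is computed.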
d_update.m: computes the possible changes in distortion for different
candidate new symbols; this runs before count_update_all.m, but the two
could be merged. The code is highly vectorized yet short,
and thus (hopefully) not difficult to understand.
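The per-position computation can be illustrated with a small Python sketch, assuming squared-error distortion (the function name and the distortion measure are assumptions for illustration):

```python
def distortion_deltas(x_i, z_i, alphabet):
    """For each candidate replacement of the current reproduction symbol z_i,
    return the change in squared-error distortion at position i."""
    current = (x_i - z_i) ** 2
    return [(x_i - a) ** 2 - current for a in alphabet]
```

Precomputing these deltas for every candidate symbol is what lets the Gibbs step evaluate the whole alphabet at once.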
distortion.m: computes the distortion between sequences. Used in the encoder
at initialization and to verify the integrity of data structures.
entropy.m: computes per-context entropy. The redundancy for unknown
conditional empirical statistics is not accounted for, so this
is a slight underestimate of the actual coding length.
graphics_journal.m: this file creates plots for the paper.
H_m.m: computes the conditional empirical entropy for the entire matrix m of
context counts (i.e., over the entire sequence).
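The quantity computed by H_m.m can be sketched in Python from a table of context counts; the dictionary representation is illustrative (the MATLAB code uses a count matrix):

```python
import math

def conditional_empirical_entropy(counts):
    """Conditional empirical entropy in bits per symbol.

    counts: dict mapping each context to a list of per-symbol counts.
    Computes sum over contexts of (n_ctx / n) * H(empirical distribution
    of symbols within that context).
    """
    n = sum(sum(c) for c in counts.values())
    h = 0.0
    for c in counts.values():
        n_ctx = sum(c)
        for cnt in c:
            if cnt > 0:
                h += cnt * math.log2(n_ctx / cnt)   # cnt * log2(1/p)
    return h / n
```

As noted above for entropy.m, this ignores the redundancy of describing the empirical statistics themselves, so it slightly underestimates the true coding length.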
initz.m: initializes z sequence from x based on some heuristic.
moments.m: computes moments of the form X^m_alpha that appear
in the DCC 2010 paper; used for fast update of the distortion despite
a change in the mapping that affects many symbols.
A great deal of data was generated by the actual simulations, main_Laplace
and main_AR; these data are not being made available at this time.
laplace_RD_load.mat: contains data about the rate distortion function
of the Laplace distribution. This data is used in main_Laplace_04082011.m to plot
the RD function.
contains data about the rate distortion function
of the Gaussian distribution. This data is used in graphics_journal.m to plot
the RD function.