Distributed Source/Video Coding
Definition and application scenarios
Distributed source coding (DSC) refers to the compression of multiple
statistically correlated yet physically separated sources. It is distributed in
the sense that the sources do not communicate with each other (distributed
encoding), and the burden of exploiting the correlation among the sources is
shifted to the decoder (joint decoding). The theoretical foundation of DSC was
established in the 1970s, yet most of the research has appeared only recently,
strongly spurred by the emergence of wireless sensor networks (WSNs). WSNs
provide suitable application scenarios for DSC:
1. Sensor nodes have very limited power supplies, so lightweight distributed
encoding is desirable;
2. Sensor nodes make similar observations, so joint decoding (at a base
station) is possible.
Another class of DSC applications is distributed video coding (DVC). Video data
is highly correlated across frames. Conventional inter-frame coding schemes use
motion-compensated prediction (MCP) for decorrelation; however, motion
estimation (ME) places a heavy computational load on the encoder. Building on
DSC, DVC provides a useful alternative to conventional video coding by shifting
the burden of ME to the decoder. It is suitable for so-called "uplink" video
transmission, in which the video encoders are assumed to be low-cost and
power-constrained.
Other applications
include using DSC/DVC for error-resilient multimedia communication, content
authentication, or blind compression of encrypted data.
History and state-of-the-art
DSC can be either lossless or lossy. Lossless DSC was first discussed in
Slepian and Wolf's 1973 paper [4] and is therefore also known as Slepian-Wolf
coding (SWC). Lossy DSC is a rate-distortion problem. One of its sub-problems,
source coding with side information (SI) at the decoder, was first addressed in
Wyner and Ziv's 1976 paper [5] and is also known as Wyner-Ziv coding (WZC). The
theoretical results show that SWC incurs no rate loss compared with
conventional (joint) lossless coding, and that WZC incurs no rate loss relative
to coding with the SI available at both encoder and decoder if the sources are
jointly Gaussian and the distortion metric is the mean squared error (MSE).
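For two sources X and Y, these results can be written compactly. The rate symbols R_X and R_Y, the distortion level D, and the rate-distortion functions below are notation introduced here for illustration; they do not appear in the original text.

```latex
% Slepian-Wolf region: rates at which separate encoding with joint
% decoding still allows lossless recovery of both sources.
\begin{align}
  R_X       &\ge H(X \mid Y), \\
  R_Y       &\ge H(Y \mid X), \\
  R_X + R_Y &\ge H(X, Y).
\end{align}

% Wyner-Ziv: R_WZ(D) is the rate with SI at the decoder only,
% R_{X|Y}(D) the rate with SI at both encoder and decoder.
\[
  R_{\mathrm{WZ}}(D) \ge R_{X \mid Y}(D),
\]
% with equality when X and Y are jointly Gaussian and the distortion is MSE.
```

The sum-rate bound H(X, Y) is the same as for joint encoding, which is the sense in which SWC has no rate loss.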
State-of-the-art SWC uses near-capacity channel codes such as turbo codes and
LDPC codes, and its performance is very close to the theoretical bounds.
Practical WZC based on lattice quantization followed by SWC also approaches the
theoretical bounds. Readers are referred to [6] for a comprehensive survey.
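The idea of using a channel code for SWC can be illustrated with a toy example. The sketch below is not one of the turbo or LDPC schemes mentioned above: it swaps in a (7,4) Hamming code and assumes that the source word x and the side information y are 7-bit words differing in at most one position. The function names are hypothetical.

```python
def syndrome(bits):
    """Return the 3-bit syndrome of a 7-bit word as an integer in 0..7.

    Column i (1-indexed) of the Hamming parity-check matrix is the binary
    representation of i, so the syndrome is the XOR of the positions set to 1.
    """
    s = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            s ^= position
    return s


def sw_encode(x):
    """Encoder: send only the syndrome of x (3 bits instead of 7)."""
    return syndrome(x)


def sw_decode(s_x, y):
    """Decoder: recover x from its syndrome and the side information y.

    The XOR of the two syndromes is the syndrome of x XOR y, which is 0 when
    x == y and otherwise the 1-indexed position of the single differing bit.
    """
    diff = s_x ^ syndrome(y)
    x_hat = list(y)
    if diff:
        x_hat[diff - 1] ^= 1
    return x_hat


x = [1, 0, 1, 1, 0, 0, 1]
y = list(x)
y[4] ^= 1  # side information differs from x in one position
recovered = sw_decode(sw_encode(x), y)
```

The encoder never sees y, yet 3 bits suffice because the 8 possible difference patterns (no flip or one of 7 flips) are exactly what a 3-bit syndrome can distinguish.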
For DVC (a.k.a. Wyner-Ziv video coding, or WZVC), however, the performance gap
relative to conventional inter-frame coding is still large. The difficulty lies
in generating the SI at the decoder: unlike encoder-side ME, decoder-side ME
does not have access to the current frame. Inefficient SI generation is the
major factor that degrades the overall coding efficiency of WZVC. For a more
detailed review of DVC, readers are referred to [7].
Our works and publications
1. Power-efficient rate allocation for networked SWC
2. Hierarchical side-information generation for WZVC
3. Non-binary SWC design using Gray codes
In the first work, we consider the rate allocation (RA) problem for SWC over a
WSN. In a real network, it is reasonable to assume that the physical conditions
of the links (e.g., noise level, fading factor) differ, so the cost of
transmitting a bit from each sensor to the sink node also varies. Since the
multiple encoders work collaboratively, it is reasonable to shift more of the
transmission burden to sensors with good link conditions to increase power
efficiency. The goal of this work is to find the optimal rate point that allows
lossless reconstruction of the sources while minimizing the overall
transmission power consumption of the WSN.
We use the model proposed in [8], which assumes that the overall transmission
power is a weighted sum of exp(rate) over the M sources. We propose to find the
optimal rate point recursively, based on a novel water-filling model. The
feasibility and optimality of the proposed solution are analyzed mathematically
and verified experimentally. Compared with the conventional
Lagrangian-multiplier approach, our algorithm achieves a dramatic reduction in
computational complexity. The details of this work can be found in our
conference paper [1].
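To make the problem concrete, the following sketch solves the special case of M = 2 sources in closed form. It is not the recursive water-filling algorithm of [1]; it only illustrates the objective, a weighted sum of exp(rate), under the Slepian-Wolf constraints. The function names, the weights w1 and w2, and the entropy values used in the example are all assumptions made for illustration.

```python
import math


def total_power(r1, r2, w1, w2):
    """Weighted sum of exp(rate) for two sources."""
    return w1 * math.exp(r1) + w2 * math.exp(r2)


def allocate_two_source_rates(h1_given_2, h2_given_1, h_joint, w1, w2):
    """Minimise w1*exp(r1) + w2*exp(r2) over the two-source Slepian-Wolf region.

    The cost increases with either rate, so the optimum lies on the sum-rate
    line r1 + r2 = h_joint. Along that line the cost is convex in r1, so the
    stationary point is clipped to the segment between the two corner points.
    """
    # Stationary point of w1*exp(r1) + w2*exp(h_joint - r1).
    r1_free = 0.5 * (h_joint + math.log(w2 / w1))
    r1_min = h1_given_2            # corner where source 1 sends least
    r1_max = h_joint - h2_given_1  # corner where source 2 sends least
    r1 = min(max(r1_free, r1_min), r1_max)
    return r1, h_joint - r1


# Hypothetical entropies (in nats): H(X1|X2) = H(X2|X1) = 0.5, H(X1,X2) = 1.5.
equal_links = allocate_two_source_rates(0.5, 0.5, 1.5, w1=1.0, w2=1.0)
costly_link_2 = allocate_two_source_rates(0.5, 0.5, 1.5, w1=1.0, w2=100.0)
```

With equal weights the rate is split evenly; when the second link is far more expensive, the second source is pushed to its minimum rate H(X2|X1) and the first source carries the rest.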
Future work
includes optimum quantizer design for a lossy DSC problem, aiming at a practical lossy DSC system that achieves the best
power-rate-distortion tradeoff.
In the second work, we address decoder-side ME, which, as mentioned above, is
one of the most distinctive components of a WZVC system. Not surprisingly, its
performance significantly affects the overall coding efficiency of WZVC. Most
existing WZVC schemes apply motion interpolation or extrapolation to estimate
the motion vectors (MVs) at the decoder. These approaches essentially assume
that objects move at a constant speed and that the MVs reflect true motion,
which oversimplifies reality. As a result, the generated SI is usually not a
good approximation of the current frame. This has been one of the most
significant limitations on the performance of existing WZVC schemes.
In our work, we consider the case in which the decoder performs ME and WZ
decoding in a hierarchical fashion. That is, we assume the decoder has partial
information about the current frame, which could be a version of the frame at
lower quality or resolution. The decoder uses this information to refine the
motion field and obtain a better estimate of the SI, which in turn helps decode
the frame at finer resolution or higher quality. This process iterates until
the whole frame is decoded. Experimental results show that the quality of the
SI is greatly improved when there is less temporal correlation among the motion
fields (i.e., when the motion is less predictable along the time axis). Another
advantage is that the energy of the prediction residual is much more stable
across frames, which enables reliable encoder-side rate allocation. For more
details of the scheme and a theoretical analysis of the performance gain,
please refer to our conference paper [2].
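One refinement step of this kind can be sketched with plain block matching. This is a simplified illustration, not the scheme of [2]: it assumes grayscale frames stored as lists of rows, frame dimensions that are multiples of the block size, and an exhaustive search over a small window. The names refine_side_information, coarse_cur, size, and search are hypothetical.

```python
def block_sad(ref, cur, by, bx, dy, dx, size):
    """Sum of absolute differences between a block of cur and a displaced block of ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def refine_side_information(ref, coarse_cur, size=4, search=2):
    """Estimate motion from a coarse current frame and motion-compensate ref.

    ref is a previously decoded frame; coarse_cur is the partial information
    the decoder already holds about the current frame. Returns the side
    information frame and the motion vector chosen for each block.
    """
    height, width = len(ref), len(ref[0])
    side_info = [[0] * width for _ in range(height)]
    motion = {}
    for by in range(0, height, size):
        for bx in range(0, width, size):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    inside = (0 <= by + dy and by + dy + size <= height
                              and 0 <= bx + dx and bx + dx + size <= width)
                    if not inside:
                        continue  # displaced block would leave the frame
                    cost = block_sad(ref, coarse_cur, by, bx, dy, dx, size)
                    if best is None or cost < best[0]:
                        best = (cost, dy, dx)
            _, dy, dx = best  # zero motion is always a valid candidate
            motion[(by, bx)] = (dy, dx)
            for y in range(size):
                for x in range(size):
                    side_info[by + y][bx + x] = ref[by + y + dy][bx + x + dx]
    return side_info, motion
```

In a hierarchical decoder, the frame decoded with this side information would become the coarse_cur of the next, finer layer, and the step would be repeated until the full-quality frame is available.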
The third work improves on the non-binary SWC scheme in [9]. We propose to use
Gray codes for the binary representation of the symbols. Better performance is
obtained when the channel between the two sources is additive and the noise is
small (which is typical in most multimedia coding scenarios). In our conference
paper [3], we show the near-capacity coding efficiency and good SNR scalability
of our scheme.
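The benefit of a Gray mapping under small additive noise can be seen in a few lines. The sketch below uses the standard binary-reflected Gray code; it is an illustration of the mapping only, not the coding scheme of [3], and the helper names are hypothetical.

```python
def gray_encode(n):
    """Map a non-negative integer symbol to its binary-reflected Gray code."""
    return n ^ (n >> 1)


def gray_decode(g):
    """Invert gray_encode by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bit_difference(a, b):
    """Number of bit positions in which a and b differ (Hamming distance)."""
    return bin(a ^ b).count("1")


# Neighbouring symbols 3 and 4: natural binary 011 vs 100, Gray 010 vs 110.
natural_flips = bit_difference(3, 4)
gray_flips = bit_difference(gray_encode(3), gray_encode(4))
```

When the two sources differ by a small additive offset, their symbols are usually neighbours, so a Gray mapping changes only one bit where natural binary may change several. The binary code then has fewer bit differences to correct.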
Papers
[1] W. Liu, L. Dong and W. Zeng, "Power-Efficient Rate Allocation for
Slepian-Wolf Coding over Wireless Sensor Networks," (submitted to) IEEE
International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
Las Vegas, Mar. 2008.
[2] W. Liu, L. Dong and
[3] W. Liu and W. Zeng, "Non-binary distributed source coding using gray
codes," IEEE International Workshop on Multimedia Signal Processing,
Other References
[4] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information
sources," IEEE Transactions on Information Theory, vol. IT-19, pp. 471-480,
July 1973.
[5] A. D. Wyner, "Recent Results in the Shannon Theory," IEEE Transactions on
Information Theory, vol. 20, no. 1, pp. 2-10, Jan. 1974.
[6] Z. Xiong, A. Liveris, and S. Cheng, "Distributed source coding for sensor
networks," IEEE Signal Processing Magazine, vol. 21, pp. 80-94, September 2004.
[7] B. Girod, A. Aaron, S. Rane and D. Rebollo-Monedero, "Distributed video
coding," Proceedings of the IEEE, Special Issue on Video Coding and Delivery,
vol. 93, no. 1, pp. 71-83, January 2005.
[8] R. Cristescu, B. Beferull-Lozano and M. Vetterli, "Networked Slepian-Wolf:
theory, algorithms, and scaling laws," IEEE Trans. on Info. Theory, vol. 51,
no. 12, Dec. 2005.
[9] Y. Zhao and J. Garcia-Frias, "Joint estimation and data compression of
correlated non-binary sources using punctured turbo codes," in Proc. Conference
on Information Sciences and Systems, Princeton, NJ, Mar. 2002.