SENDER-BASED RATE-DISTORTION OPTIMIZED STREAMING
OF 3-D WAVELET VIDEO WITH LOW LATENCY
Chuo-Ling Chang, Sangeun Han and Bernd Girod
Information Systems Laboratory, Department of Electrical Engineering, Stanford University
We propose a sender-based rate-distortion optimized framework to stream scalable bitstreams of 3-D wavelet video stored at the sender to a remote receiver. Based on the requests and feedback from the receiver, the source rate-distortion profiles, the desired playout latency and transmission rate, and the network characteristics, the sender optimizes the responses sent to the receiver throughout the video playout session in order to minimize the distortion in the reconstructed frames. Rate-distortion optimized response is formulated as a convex optimization problem, and an offline-computation approach is proposed to further reduce the complexity at the sender. Experimental results show that the proposed sender-based approach outperforms the receiver-based approach, and that the offline-computation approach very closely approximates the fully-optimized approach.
Many attempts have been made to incorporate motion compensation into the 3-D wavelet video coding framework. Earlier works are somewhat unsatisfactory in terms of rate-distortion coding performance because the motion vector field is severely restricted and the temporal transform is usually limited to the two-tap Haar wavelet. Recently, motion-compensated lifting has been proposed, which successfully incorporates unrestricted motion compensation into 3-D wavelet coding and achieves compression efficiency approaching that of state-of-the-art predictive video coding schemes. However, despite the increasing interest in 3-D wavelet video, efficient streaming of such data sets that exploits the rate-distortion performance as well as the inherent support for scalability is seldom addressed.
Chou and Miao proposed a framework for rate-distortion optimized packet scheduling of video and audio data. In general, it can be applied to streaming of the 3-D wavelet video data set. However, in their framework, the data set has to be assembled into packets before the optimization for packet scheduling takes place. Therefore, the packet content cannot dynamically adapt to the network characteristics or to the state of the data previously transmitted to the receiver. Additionally, the optimization process is formulated as a combinatorial problem, which is computationally expensive to solve.
This work was supported by Grant No. ECS-0225315 of the National Science Foundation, Philips Corporation, and the Max Planck Institute.

We have previously proposed a receiver-based rate-distortion optimized framework for streaming of 3-D wavelet video with low latency. The source rate-distortion profiles, the desired playout latency and transmission rate, and the network characteristics are all taken into account to optimize the streaming strategy. Additionally, due to the fine scalability property of embedded wavelet coefficient coding, the optimization process is approximated as a convex optimization problem, which can be efficiently solved by standard optimization techniques.
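To illustrate why convexity makes rate allocation tractable, the following is a minimal sketch and not the formulation used in this work. It assumes each subband has a convex rate-distortion profile sampled at a few truncation points, and it spends a rate budget greedily on the steepest remaining distortion-rate slope. The function name `allocate_rate` and the profile layout are hypothetical.

```python
import heapq

def allocate_rate(profiles, budget):
    """Greedy rate allocation over convex R-D profiles (illustrative only).

    profiles: list of curves, each a list of (rate, distortion) points with
              increasing rate, decreasing distortion, and diminishing returns.
    budget:   total rate available.
    Returns (choice, spent): the chosen truncation index per curve and the
    total rate actually used.
    """
    choice = [0] * len(profiles)
    spent = sum(p[0][0] for p in profiles)
    heap = []

    def push_next_step(i):
        # Queue the next truncation step of curve i, keyed by its slope.
        j = choice[i]
        if j + 1 < len(profiles[i]):
            (r0, d0), (r1, d1) = profiles[i][j], profiles[i][j + 1]
            slope = (d0 - d1) / (r1 - r0)  # distortion reduction per unit rate
            heapq.heappush(heap, (-slope, i))

    for i in range(len(profiles)):
        push_next_step(i)

    while heap:
        _, i = heapq.heappop(heap)
        j = choice[i]
        extra = profiles[i][j + 1][0] - profiles[i][j][0]
        if spent + extra > budget:
            break  # the steepest remaining step no longer fits
        spent += extra
        choice[i] = j + 1
        push_next_step(i)
    return choice, spent

if __name__ == "__main__":
    subband_a = [(0, 100), (10, 40), (20, 20)]
    subband_b = [(0, 50), (10, 20), (20, 15)]
    print(allocate_rate([subband_a, subband_b], budget=30))
```

Because every curve has diminishing returns, the step with the steepest slope is always the best next use of rate. That property is what lets a simple greedy pass replace a combinatorial search in this toy setting.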
In the receiver-based framework, it is the responsibility of the receiver to efficiently select the data to be retrieved from the sender. Therefore, the rate-distortion profile of the coded data has to be available at the receiver ahead of the video playout session. In addition, the losses and excess delay in the backward channel experienced by the requests issued from the receiver could block the transmission. In this paper, we describe a sender-based framework where the sender selects the data to be transmitted. To reduce the computational complexity at the sender, an offline-computation approach is proposed.
The remainder of the paper is organized as follows: In Section 2, we briefly describe the structure of the 3-D wavelet video coding scheme adopted in this work. In Section 3, we present the proposed sender-based framework for streaming the 3-D wavelet video over networks. The offline-computation approach is discussed in Section 4. Finally, experimental results are presented in Section 5.
2. MOTION-COMPENSATED 3-D WAVELET
In this work, a 3-D wavelet coder using motion-compensated lifting is adopted to encode the video sequence. A multi-level temporal discrete wavelet transform (DWT) implemented using motion-compensated lifting is first applied across the video frames to decompose them into temporal subbands, followed by a multi-level 2-D spatial DWT decomposing the temporal subbands into wavelet coefficients. The SPIHT (Set Partitioning in Hierarchical Trees) algorithm is finally applied to encode the wavelet coefficients of each subband into a scalable bitstream. The SPIHT algorithm provides a scalable representation so that different reconstruction qualities of the video frames can be obtained by truncating the coded bitstreams at different lengths.
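The temporal decomposition step can be sketched as one level of Haar lifting over frame pairs. This is a simplified illustration and not the coder used in this work: frames are flat lists of pixel values, the motion-compensation operators are replaced by an identity placeholder (`warp_identity`, a hypothetical name), and the spatial DWT and SPIHT stages are omitted.

```python
def warp_identity(frame):
    """Placeholder for motion compensation: returns the frame unchanged."""
    return list(frame)

def haar_lifting_forward(even, odd, warp=warp_identity, unwarp=warp_identity):
    """One Haar lifting step on a pair of frames.

    Predict: high = odd - warp(even)
    Update:  low  = even + 0.5 * unwarp(high)
    """
    predicted = warp(even)
    high = [o - p for o, p in zip(odd, predicted)]
    update = unwarp(high)
    low = [e + 0.5 * u for e, u in zip(even, update)]
    return low, high

def temporal_dwt_one_level(frames):
    """Split a frame sequence into low-pass and high-pass temporal subbands.

    A trailing unpaired frame (odd-length input) is ignored in this sketch.
    """
    lows, highs = [], []
    for k in range(0, len(frames) - 1, 2):
        low, high = haar_lifting_forward(frames[k], frames[k + 1])
        lows.append(low)
        highs.append(high)
    return lows, highs

if __name__ == "__main__":
    frames = [[2.0, 4.0], [4.0, 8.0]]
    lows, highs = temporal_dwt_one_level(frames)
    print(lows, highs)
```

Applying `temporal_dwt_one_level` again to the low-pass output would give the next level of a multi-level decomposition. With the identity placeholder, the low-pass frame is simply the average of the two input frames.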
To decode a particular video frame, only a few subbands relevant to synthesizing the frame need to be reconstructed. The truncated bitstreams of these subbands available at the decoder are decoded into reconstructed wavelet coefficients by the inverse SPIHT algorithm. Then the inverse 2-D spatial DWT is applied to reconstruct the temporal subbands. Finally, the inverse motion-compensated lifting procedure is performed to carry out the inverse temporal DWT in order to reconstruct the video frame.
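The inverse temporal step can be illustrated with a toy example showing that a lifting structure is inverted by undoing the update and predict steps in reverse order, whatever warping operator is used. This is a sketch under stated assumptions: frames are 1-D lists, motion compensation is replaced by a cyclic shift of `d` samples, and the names `shift`, `forward` and `inverse` are hypothetical. Coefficient quantization, which the real decoder must contend with, is ignored.

```python
def shift(frame, d):
    """Cyclically shift a 1-D frame by d samples (toy motion compensation)."""
    d %= len(frame)
    return frame[-d:] + frame[:-d] if d else list(frame)

def forward(even, odd, d):
    """Haar lifting analysis with a shift standing in for motion compensation."""
    high = [o - p for o, p in zip(odd, shift(even, d))]
    low = [e + 0.5 * u for e, u in zip(even, shift(high, -d))]
    return low, high

def inverse(low, high, d):
    """Haar lifting synthesis: undo the update step, then the predict step."""
    even = [l - 0.5 * u for l, u in zip(low, shift(high, -d))]
    odd = [h + p for h, p in zip(high, shift(even, d))]
    return even, odd

if __name__ == "__main__":
    even = [1.0, 2.0, 3.0, 4.0]
    odd = [2.0, 3.0, 4.0, 6.0]
    low, high = forward(even, odd, d=1)
    print(inverse(low, high, d=1))
```

Each inverse step subtracts or adds exactly the quantity that the forward step added or subtracted, so reconstruction does not require the warping operator itself to be invertible.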